Solving Wordle and similar games with recursion and entropy

1. Introduction

The goal of the wordle game is to ask questions to identify a secret word chosen by an adversary out of a large (but known) set of words. Per experiment we get feedback about how `near’ our question, i.e., our selected word, is to the secret. Now I am not particularly interested in playing this game, but I do find it interesting to see how to let the computer find the secret word in an efficient way. There were some sites on the internet that (kind of) `explained’ how to do this; these sites mention entropy and say that word frequencies can be taken into account to improve the guessing strategy. That in itself was not new to me, but I hoped to learn how these concepts connect in detail, as I find entropy an intriguing concept. However, I found these sites very unclear, so, I decided to work on it myself, and start from scratch.

I first discuss my approach to coming to understand the ideas, then I present the mathematical framework that summarizes these ideas. This framework helps to show the commonality of many problems and clarifies several other relating issues such as the selection of a good pivot for quicksort, why certain other types of games are harder than others, and how to develop efficient heuristics to deal with these hard games.

2. Approach

The book `Probability And Information’ by Yaglom and Yaglom provides a nice discussion on information theory and entropy. For this, they consider a game in which one has to identify a deficit (heavier) coin among \(n\) coins with an expected minimum of weightings with a balance with two pans. For nine coins the optimal strategy is easy: put two sets of three coins on either pan of the balance. If the balance remains level, the deficit coin is in the third set. If it tilts to the right, the heavier coin is in the right set, otherwise in the left. Thus, using the information gained by a weighting, i.e., an experiment, we can reduce the nine-coin problem to a three-coin problem. In other words, we can apply recursion to find the heavier coin. With \(n\) coins, the idea works roughly the same. In general terms, with every weighting we can reduce the size of the set containing the deficit coin by a factor three. Hence, the minimum of weightings for \(n\) coins must be about \(\log_3 n\).

A similar problem is to identify a secret word in a set of \(n\) words chosen by an `adversary’ who only responds with `yes’ or `no’ to any question posed. For instance, we can ask a question whether some word is larger than the secret. Realize that we now use the question to split the space of words into two sets, and ask whether the secret is an element of one of the two sets. This explains why this algorithm is called `binary search’. Since we get two possible answers, we need about \(\log_{2} n\) experiments to find the secret word.

In the coin and number-finding problems it is clear (after reading the book of the Yaglom brothers) how entropy helps to find the strategy that is best in the sence of minimizing the number of experiments. In fact in both games we use a question that makes the subsets that are compatible with the feedback as equal in size as possible, because this maximizes the entropy when all coins have an equal probability of being the deficit coin.

Below I make a general framework to analyze these two games, and with this framework we can also deal efficiently with the wordle game. I first built a computer program for the coin problem and binary search and thought about common concepts. Then I tried to make a mathematical formulation of it. Based on that, I refined my programs. After this, I went for a walk, and tried to explain in yet more general terms the essential ingredients. I also tried to tackle more general problems like how to find a coin that can be heavier or lighter among \(n\) coins, or two heavier coins out of \(n\), or \(m\) out of \(n\). These generalizations revealed to me that, actually, all these weighing games can be solved generically, in an efficient, but not optimal, way: just sort the coins from light to heavy with a balance. The most general of these problems is to sort \(n\) coins, and we already know from this page that this takes \(n \log_{2} n\) weightings.

A bit of further research on the web showed that finding the optimal policy for the \(2\) out of \(n\) coin problem is very hard.¹¹Richard Bellman even wrote a paper on this So I dropped this task from my list, but it raised two interesting problems. How to explain, in terms of my general framework, where exactly lies the additional complexity for \(m\) out of \(n\) coin game (with \(m>1\))? And second, what would be good heuristics to efficiently solve this, and similar, problems. As it turns out, this second problem relates directly to the selection of a good pivot in the quicksort algorithm; we will address it below.

3. Developing the mathematical framework

Before trying to develop a general mathematical formulation, it is helpful to consider two concrete examples and develop useful terminology at the same time.

In the wordle game, a player tries to find a secret word in a given list of words, and for this, the player can consult an oracle in two different ways. First, the player can ask feedback in terms of "If the word `raise’ is the secret, i.e., my guess for the secret, and I would ask the question whether the word `abate’ is the secret, what feedback would you give me?’. Clearly, asking such feedback is free: the oracle does not have to reveal any information about the secret word because the oracle just uses the player’s guess of the secret. Second, the player can perform an actual experiment, in which the player asks to give feedback about a specific word, and the oracle uses the secret to give the feedback. A player can get unlimited feedback about the quality of a word for a guess (of the secret) provided by the player, but an experiment counts as a try. In the wordle game, the player has 6 tries to identify the secret.

As a second example, consider 4 coins of which one is heavier, with probabilities \(\alpha, \beta, \gamma, \delta\) that the first, second, \(\ldots\), coin is the deficit one. We use a balance to compare the weight of groups of coins. Suppose we would put coins \(1, 2\) on the left pan, and \(3, 4\) on the right pan. The probability that the left pan tilts downwards is \(\alpha + \beta\), because this is the probability that the heavier coin lies in the set \(\{1, 2\}\), and the probability that the right pan tips downward is \(\gamma + \delta\). It cannot happen that the balance stays level. In more detail, our question is the grouping of the coins into sets \(\{1, 2\}\) and \(\{3, 4\}\). Then, if we guess that coin 1 is heavier, the feedback given by the oracle, i.e., the balance, will be that the left pan tips downwards. If coin 2 is heavier, we get the same feedback, while if coins 3 (or 4) is the heavier, the right pan tips. So, given the question, the probability to get as feedback that the left pan tips, is the probability that the heavier coin lies in the set \(\{1, 2\}\). The other probabilities can be computed likewise.

Now we are ready to make a formalized description of such stochastic optimization problems (here games). There is a state space \(S\) from which an oracle selects a secret word \(s\in S\). In wordle, \(S\) is the set of words; for the coins, \(S\) is the set of coins. A player can ask a question \(q\in Q\) for a set of possible questions \(Q\). For wordle, the set of questions is the same as the set of words; for finding a deficit coin, a question is the placement of sets of coins on the left and right pan of the balance, hence, the question set is the set of all possible groupings of coins in three piles, one is placed at the left pan, the other at the right, and the rest is not used in the weighting. The player can ask the oracle to provide feedback in the form of function \(f_{q}(g)\) for question \(q\) and guess \(g\) for the secret. In wordle, the feedback consists of a set of colors for letters; in the coin game, the feedback is \(l, eq\), or \(r\), i.e., the left pan tips, the balance stays equal, or the right pan tips. Finally, the player can perform an experiment (which counts as a try) by asking the oracle a question \(q\) and receiving feedback \(f_{q}(s)\). So, in an experiment, the oracle substitutes the secret \(s\) in the feedback function \(f_{q}\) to give the feedback \(f_{q}(s)\), but does not reveal the secret.

The player can progress like this. For question \(q\), define the set of possible feedbacks compatible with \(q\) as

\begin{align*} F_q = \{f_q(g) : g \in S\}. \end{align*}

Then, for each feedback \(f\in F_{q}\), the player collects the states with the same feedback \(f\) into subsets of \(S\):

\begin{equation} \label{org194cf40} W_q(f) = \{g \in S : f_q(g) = f\}. \end{equation}

Clearly, \(\cup_{f\in F_q} W_q(f) = S\), and \(W_q(f) \cap W_{q}(f')=\varnothing\) if \(f\neq f'\). Hence the question \(q\) induces by means of the set of feedbacks \(F_q\) a unique partition of \(S\):

\begin{equation} \label{org0043671} \mathscr{P}_q = \{W_{q}(f) : f \in F_q\}. \end{equation}

The player’s problem is to identify the question in \(Q\) that minimizes the expected number of experiments, and for this, the player can use the entropy function. Suppose we are given the probability \(p_g\) that element \(g\in S\) is the secret. Then \(P(A) = \sum_{g\in A} p_g\) is the probability that \(s \in A \subset S\). For instance, when \(p_{g}\) is uniform over \(S\), \(\P{A} = |A|/|S|\), i.e., counting measure. Once the probability function (the measure) \(\P{\cdot}\) is known, we can define the entropy of a partition \(\mathcal P\) of \(S\) as

\begin{align*} H(\mathcal P) = - \sum_{A\in \mathcal P} \P{A} \log \P{A}. \end{align*}

Clearly, by \eqref{org0043671}, a question \(q\) induces the partition \(\mathcal{P}_{q}\), hence, we define the entropy induced by \(q\) as

\begin{align*} H_q &= - \sum_{A\in \mathcal P_{q}} \P{A} \log \P{A} \\ &=- \sum_{f\in F_{q}} \P{W_q(f)} \log \P{W_q(f)}. \end{align*}

The best question \(q^{*}\) is the solution of the maximization problem²²Why this is best is explained very well at many places. Besides the book by Yaglom and Yaglom, I recommend Information Theory, Inference, and Learning Algorithms by MacKay.

\begin{equation} \label{orga55bdd4} q^{*} = \argmax_{q\in Q}\, H_{q}. \end{equation}

To complete the procedure, if the player has \(q^*\) and the associated optimal partition \(W^*_{q^*}\) , s/he does an experiment by asking the oracle for

\begin{equation} \label{org1520f12} f^* = f_{q^*}(s), \end{equation}

where \(s\) is the oracle’s secret. With this answer, the player starts a new round of the game, but now with the strictly smaller state space

\begin{equation} \label{orgff52af2} S^{*} = W_{q^{*}}(f^{*}). \end{equation}

Since this set is a strict subset of \(S\), the player can find the secret \(s\) by means of recursion until the final state space contains just one element, and this must be the secret.

Note that any question \(q\) such that the set of feedbacks \(F_{q}\) contains at least 2 elements is effective in solving the problem, because this question will result in a non-trivial partitioning of the space \(S\). In other words, even when simple questions are not optimal, they do help to work towards identifying the secret.

4. Oracle, Player, and Answer Classes

It’s time now to convert the maths into code. Let me repeat: I did not find the maths or the code in one pass. I first used a bit of math, then built it to see where the ideas would break. This lead to a new cycle of thinking, maths, and coding.

4.1. The oracle super class

The oracle super class is quite simple. It needs a secret \(s\), and the space \(S\) to check that \(s\in S\). The feedback function depends on the actual game, hence the feedback methods needs to be implemented in a class derived from Oracle. The player is free to consult the oracle for feedback on what-if scenarios: `If this is my question and this is my guess for the secret, then what feedback would you give?’. Finally, in an experiment the oracle gives feedback based on the secret, but the secret remains hidden, of course.

The only reason to include the test on whether the secret is in the space or not is to protect against typos in the secret and/or the space.

from abc import ABC, abstractmethod


class Oracle(ABC):
    def __init__(self, secret, space):
        self.n_tries = 0
        if secret not in space:
            raise ValueError(
                f"The secret '{secret}' is not in the solution space."
            )
        self.secret = secret

    @abstractmethod
    def feedback(self, question, secret):
        raise NotImplementedError

    def experiment(self, question):
        self.n_tries += 1
        return self.feedback(question, self.secret)

4.2. Computing the entropy

We store the computation of the entropy and some related function in a separate module. The implementations speak for themselves (I believe).

from collections.abc import Iterable

import numpy as np


def counting_measure(d: dict[object, set]) -> dict[object, int]:
    "Map keys of d to the number of elements in d[key]."
    return {k: len(v) for k, v in d.items()}


def normalize_probs(p: dict):
    "Normalize the probabilities stored as values of p."
    tot = sum(p.values())
    return {k: v / tot for k, v in p.items()}


def uniform_probs(d: dict[object, set]) -> dict[object, float]:
    "Map keys of d to  uniform probabilities."
    return normalize_probs(counting_measure(d))


def entropy(pmf: Iterable) -> float:
    "The entropy of the probability mass function pmf."
    return -sum(p * np.log(p) for p in pmf)

4.3. The player super class

We need textwrap to print somewhat larger spaces, and probability_functions.py for the entropy function. Below we assume that the secret is chosen uniformly over the space, so therefore we use uniform probabilities.

The player needs the oracle to get feedback, and s/he needs the state space to generate the question space \(Q\). The method make_partition partitions the state space according to the function \(W\) defined in \eqref{org194cf40}. The rest of the functions should be clear, except for solve.

We find for the optimal guess by means of recursion. Given a state space \(S\), we partition \(S\) according to the question. While searching for the best question we only maintain a reference to the best question and partition so far.³³It’s pointless to keep worse questions and the partitions they induce. Once we have the question that optimally partitions the space, we ask the oracle to give an answer for the best question. This provides the player also with the best subspace to continue the search for the secret. We then call solve with this smaller space.

import textwrap
from abc import ABC, abstractmethod
from collections import defaultdict

from probablity_functions import entropy, uniform_probs


class Player(ABC):
    def __init__(self, space, oracle):
        self.space = space
        self.oracle = oracle

    @abstractmethod
    def make_question_space(self, space):
        pass

    def play(self):
        return self.solve(self.space)

    def make_partition(self, question, space):
        "Map feedback for a (question, guess) pair to a subset of the space."
        W = defaultdict(set)
        for guess in space:
            f = self.oracle.feedback(question, guess)
            W[f].add(guess)
        return W

    def find_optimal_question(self, questions, space):
        q_star, P_star, h_star = None, {}, -float('inf')
        for question in questions:
            W = self.make_partition(question, space)
            h = entropy(uniform_probs(W).values())
            if h > h_star:
                q_star, P_star, h_star = question, W, h
        return q_star, P_star

    def solve(self, space):
        if len(space) == 1:
            return space
        questions = self.make_question_space(space)
        q_star, P_star = self.find_optimal_question(questions, space)
        answer = self.oracle.experiment(q_star)
        space = P_star[answer]
        self.print_step(q_star, answer, space)
        return self.solve(space)

    def print_step(self, q_star, answer, space):
        print(f"{q_star=}, {answer=}, n_tries={self.oracle.n_tries}")
        if len(space) == 1:
            secret = list(space)[0]
            print(f"The secret is {secret}")
        else:
            print("New state space:")
            text = " ".join(str(e) for e in sorted(space))
            print(textwrap.fill(text, width=80) + "\n")

4.4. Answer class

The oracles for the different games can give different answers, for instance, for the coin games whether the balance tilts to right. I find it convenient to put the different types of answers in a separate file.

from enum import IntEnum


class Answer(IntEnum):
    def __repr__(self):
        return self.name


class BinaryAnswer(Answer):
    LEFT = -1
    RIGHT = 1


class TernaryAnswer(Answer):
    LEFT = -1
    MIDDLE = 0
    RIGHT = 1


class ColorAnswer(Answer):
    "Used for the wordle game."

    GREEN = 2
    YELLOW = 1
    GRAY = 0

5. Two simple games

Let us try what we developed up to now to two simple problems whose solutions we can check by hand.

5.1. Binary search

The task of binary search is to identify a selected element (the secret) in a state space \(S\). The oracle gives as feedback whether a guess (for the secret) lies in a left or a right set that form a question of the player. The player makes a question list by splitting the space into left and right sets. Of course, it is evident that when the secret is chosen with uniform probability it is optimal to split the space into a left and right set of equal size. However, that is not the aim here. We want to check that among many different questions, the player selects the best one by using the algorithm described in Section Developing the mathematical framework.

from answers import BinaryAnswer as Answer
from oracle_base import Oracle
from player_base import Player


class Binary_search_oracle(Oracle):
    def feedback(self, question, guess):
        left, right = question
        if guess in left:
            return Answer.LEFT
        return Answer.RIGHT


class Binary_search_player(Player):
    def make_question_space(self, space):
        Q = []
        space = list(space)
        for i in range(1, len(space)):
            left, right = space[:i], space[i:]
            Q.append([left, right])
        return Q


def main():
    space = [
        "aahed",
        "aalii",
        "aargh",
        "abaca",
        "abaci",
        "aback",
        "abaft",
        "abaka",
        "abamp",
    ]

    oracle = Binary_search_oracle(secret="abaft", space=space)
    player = Binary_search_player(oracle=oracle, space=space)

    player.play()


if __name__ == "__main__":
    main()

exec(open("binary.py").read())

q_star=[['aahed', 'aalii', 'aargh', 'abaca'], ['abaci', 'aback', 'abaft', 'abaka', 'abamp']], answer=RIGHT, n_tries=1
New state space:
abaci aback abaft abaka abamp

q_star=[['abaft', 'aback'], ['abaci', 'abaka', 'abamp']], answer=LEFT, n_tries=2
New state space:
aback abaft

q_star=[['aback'], ['abaft']], answer=RIGHT, n_tries=3
The secret is abaft

The initial space contains 9 words, so the player would need maximally 4 tries under optimal play.

5.2. One heavy coin

For this problem we have to find one heavy coin among \(n\) in a minimum of weightings with a balance that has three possible outcomes: it tilts to the left or right or stays level. Like this, the balance offers a ternary search instead of a binary search. The state space \(S\) is the set of numbers \(\{1, 2, \ldots, n\}\) where \(i\in S\) means that \(i\) is the deficit coin. A question consists of two sets with an equal number of coins that we put on either pan of the balance; the other coins remain in the middle and are not used during the weighting.

The question space of the player consists of sets of coins such that the left and right sets have equal size, for instance, \(\{1, 2\}; \{3, 4\}\), where the semicolon denotes the split between the left and right set. Assuming we don’t have any specific information about any coin, the deficit coin is selected uniformly from \(S\). By symmetry, the question \(\{1, 2,\}; \{3, 4\}\) will provide the same information as the question \(\{1, 4\}; \{2, 3\}\); only the number of coins in the sets matter. Thus, we do not have to consider all possible equally sized subsets of \(S\) to form the question space.

In the next code, suppose we deal with \(9\) coins. Then we can form 4 questions: \(\{1\}; \{9\}\), \(\{1, 2\}; \{8, 9\}\), \(\{1, 2, 3\}; \{7, 8, 9\}\), \(\{1, 2, 3, 4\}; \{6, 7, 8, 9\}\). Thus, the size of the left and right set takes as values \(1, 2, 3\) and \(4\), which we achieve with range(1, 9 // 2 + 1).

from answers import TernaryAnswer as Answer
from oracle_base import Oracle
from player_base import Player


class One_coin_oracle(Oracle):
    def feedback(self, question, guess):
        left, middle, right = question[:]
        if guess in left:
            return Answer.LEFT
        if guess in middle:
            return Answer.MIDDLE
        return Answer.RIGHT


class One_coin_player(Player):
    def make_question_space(self, space):
        space = list(space)
        Q = []
        for i in range(1, len(space) // 2 + 1):
            left = space[:i]
            middle = space[i:-i]
            right = space[-i:]
            Q.append([left, middle, right])
        return Q


def main():
    space = range(13)
    oracle = One_coin_oracle(secret=11, space=space)
    player = One_coin_player(oracle=oracle, space=space)
    player.play()


if __name__ == '__main__':
    main()

Let us try it for 13 coins.

exec(open("./one_coin.py").read())

q_star=[[0, 1, 2, 3], [4, 5, 6, 7, 8], [9, 10, 11, 12]], answer=RIGHT, n_tries=1
New state space:
9 10 11 12

q_star=[[9], [10, 11], [12]], answer=MIDDLE, n_tries=2
New state space:
10 11

q_star=[[10], [], [11]], answer=RIGHT, n_tries=3
The secret is 11

6. Wordle (with sampling)

Above we have tackled two simple games. Now it’s time to see how to deal with Wordle. This game is about identifying a secret word, for instance ’aahed’ in a set \(S\) of five letter words. The player formulates a question in the form of a word, for instance ’adopt’. The feedback consists of a string of five colors, for each letter position one color. Green at some position means that the letter of the question is correct at that position; Yellow means that the letter of the question is also in the secret, but not at that position; Gray means that the letter at the position of the question is not in the secret. Thus, the feedback for the question ’adopt’ would be Green, Yellow, Gray, Gray, Gray. Interestingly, feedback is not symmetric, as can be seen when `adopt’ would be the secret and `aahed’ the question.

While the player is simple in essence, we introduce a new idea to demonstrate how to reduce numerical work. To find the secret word in a minimal number of experiments, we have to search \(S\) entirely for the guess that gives the highest entropy. This can be very time consuming when \(S\) is large. To prevent doing lots of useless numerical work (or if we like to have an answer before we die), we can take a sample of n_samples of \(S\) to make the question space \(Q\). Surely this is not optimal, but it still gives exponentially fast convergence (with very large probability).

The feedback of the oracle is somewhat subtle to deal with some easy-to-miss corner cases. I used the implementation from this link.

import random
from collections import Counter

from answers import ColorAnswer as Answer
from oracle_base import Oracle
from player_base import Player

random.seed(3)


def read_words(num=None):
    with open("./valid-wordle-words.txt", "r") as fp:
        words = [w.strip() for w in list(fp)[:num]]
    return words


class Wordle_Player(Player):
    def __init__(self, space, oracle, n_samples=float('inf')):
        self.n_samples = n_samples
        super().__init__(space, oracle)

    def make_question_space(self, space):
        if self.n_samples >= len(space):
            return space
        return random.sample(list(space), self.n_samples)


class Wordle_oracle(Oracle):
    def feedback(self, question, guess):
        pool = Counter(s for g, s in zip(question, guess) if g != s)
        score = []
        for g, s in zip(question, guess):
            if g == s:
                pool[g] -= 1
                score.append(Answer.GREEN)
            elif pool[g] > 0:
                pool[g] -= 1
                score.append(Answer.YELLOW)
            else:
                score.append(Answer.GRAY)
        return tuple(score)


def main():
    space = read_words(num=1000)
    oracle = Wordle_oracle(secret="abase", space=space)
    player = Wordle_Player(oracle=oracle, space=space)
    player.play()


if __name__ == '__main__':
    main()

To keep the output limited, I just read the first 1000 words from this list of valid Wordle words. Here is the output when we don’t use sampling. The actual computation takes a few second because of all the computations.

exec(open("./wordle.py").read())

q_star='arles', answer=(GREEN, GRAY, GRAY, YELLOW, YELLOW), n_tries=1
New state space:
abase abuse amuse anise ansae aside aspie avise

q_star='avise', answer=(GREEN, GRAY, GRAY, GREEN, GREEN), n_tries=2
New state space:
abase abuse amuse

q_star='abase', answer=(GREEN, GREEN, GREEN, GREEN, GREEN), n_tries=3
The secret is abase

Let us now use a sample of 10 words of the space as question words, and read all \(\approx 15K\) words. The search runs real fast, even though we now consider all more than 10 K words. For this change, we modify the main function to this.

def main():
    space = read_words()
    oracle = Wordle_oracle(secret="abase", space=space)
    player = Wordle_Player(oracle=oracle, space=space, n_samples=10)
    player.play()

Let’s run it.

exec(open("./wordle.py").read())

q_star='polts', answer=(GRAY, GRAY, GRAY, GRAY, YELLOW), n_tries=1
New state space:
abase abash abask absey abuse abysm adsum aesir agism agush aksed amuse anise
ansae arise arish arsed arsey asada asana asdic ashed ashen aside askar asked
asker askew assai assam assay assed assez asura asway aswim avise awash bagsy
baisa basan based basen baser basha basic basij basin basse bassi bassy beisa
besaw besee birse birsy brash brise brisk brush brusk bursa burse busby bused
bushy busky bussu byssi caese carse cased caser casky causa cause cease cense
cesse chase chasm chuse cissy crash crise cruse crush crusy cuish curse cursi
cusec cushy cusum daisy danse dashi dashy deash deism dense desex deshi desse
disci dishy disme druse drusy dunsh dusky eased easer ecash eensy ensew ensky
ensue erase escar eskar esker essay farse fasci fease feese fesse fishy fresh
frise frisk frush fubsy funsy fused fusee fussy gashy gassy gawsy geasa geese
gesse girsh gnash grese grise grisy guise gursh gushy gussy guyse hansa hanse
harsh hashy hause hawse herse hissy hushy husky hussy imshi imshy insee isnae
issei issue jasey jesse jessy karsy kasha kasme kesar kiasu kisan kissy knish
maise manse marse marsh mased maser masha mashy massa masse massy masur mausy
mease meism mensa mense mensh merse mersk mesad mesca mesem meshy mesia mesic
mesne messy meuse miasm mimsy minse misch miser misky missa missy msasa mumsy
musar musca mused musee muser musha mushy music musky musse mussy mysid mysie
nashi neese neski newsy nisei nisin nisse nsima nurse nused nyssa nyuse quash
quasi qursh raise raksi ramse ramsh ranse rasam rased raser rasse resam resaw
resay resee resew resid resin resue reuse rinse risen riser rishi risky rushy
rusky rusma russe sabed saber sabha sabin sabir sabji sabra sabre sabzi sacra
sacre saddy sadhe sadhu sadic sadza safed safer sagar sager saggy sagum sahab
saheb sahib saice saick saiga saine sakai saker sakia saman samba samek samen
samey samfi samfu sammy sanad sandy saned saner sanga sangh sansa saran sared
saree sarge sarin sarir sarky saser sasin sasse sassy sauba sauce sauch saucy
saugh sauna saunf saury sauve saved saver savey savin savvy sawah sawed sawer
sayed sayee sayer sayid sayne scaff scand scare scarf scary scaud scaur scena
scend scene schav schif schwa scifi scind scire scrab scrae scrag scram scran
scraw scray scree screw scrim scrub scrum scuba scudi scuff scurf scuse scuzz
sdayn sdein seame seamy seare sease seaze sebum sedan seder sedge sedgy sedum
seedy sefer segar segni segue sehri seine seise seism seiza seize semee semen
semie senex sengi senna sensa sense sensi sensu senvy senza serac serai sered
serer serge seria seric serif serin serir serra serre serry serum serve sesey
sessa sevak seven sever sevir sewan sewar sewed sewen sewer sewin sexed sexer
seyen shack shade shady shaka shake shaky shama shame shand shank shard share
shark sharn shash shave shawm shawn shaya shchi sheaf shear sheen sheer sheik
shend sheng sherd shere sheva shewn shiai shied shier shine shiny shire shirk
shirr shish shiur shiva shive shmek shred shrew shrub shrug shuba shuck shura
shush shyer sibia sicky sided sider sidey sidha sidhe siege sieur sieve sigma
signa sigri siker simar simba since sined sinew singe sinky sinsi sired siree
siren sirih sirra sissy siver sixer sizar sized sizer skank skarn skean skear
skeed skeef skeen skeer skeev skeez skegg skein skene skerm skied skier skiey
skiff skink skirr skive skivy skran skrik skunk skyed skyer skyey skyre smaak
smack smaik smarm smash smaze smear smeek smeik smeke smerk smick smirk smirr
smize smush snack snafu snake snaky snare snarf snark snary snash snead sneak
sneck sneed sneer snick snide snied sniff snive snuck snuff snush squab squad
squaw squee squeg squib squid squiz suave subah subak subby suber subha succi
sucky sucre sudan sudsy suede sugan sugar suhur suing sujee sukuk sumac summa
sunna sunny surah sured surer surfy surge surgy surra sused sushi swack swage
swain swami swamy swang swank sward sware swarf swarm swash swear swede sweed
sweer sweir swerf swine swing swink swire swish swive swizz swung sybbe sycee
syker symar synch syned syrah syren syver ukase unsaw unsay unsee unsew unsex
unsub urase ursae ursid usage usher using usnea usnic usque usure usury versa
verse vised visie visna visne washi washy weise wersh whish whisk wised wiser
wisha wushu wussy yeesh ysame zhush

q_star='raise', answer=(GRAY, YELLOW, GRAY, GREEN, GREEN), n_tries=2
New state space:
abase abuse amuse cease chase fease mease sease ukase

q_star='amuse', answer=(GREEN, GRAY, GRAY, GREEN, GREEN), n_tries=3
The secret is abase

It’s impressive to see how fast this algorithm find the solution.

7. Harder problems with coins.

The above problems are actually quite simple, mainly because the guess space can be taken as equal to the state space. In the next two problems this is no longer the case.

7.1. Two heavy coins

Our next problem is to find two heavy coins out of \(n\) coins; we assume that the two deficit coins are equaly heavy.

The state space conists of all selections of 2 coins out of \(n\). The oracle specifies the secret as an array of two coins c1 and c2. For the feedback, when coin c1 is in the left pan and coin c2 is in the middle (i.e., on neither pan), the left pan tilts. The other conditions follow like wise.

The player has to generate the question space \(Q\). Since it only makes sense to put the same number of coins in both pans, When we put \(k\leq n/2\) coins in either pan, the number of combinations of ways to distribute the coins over the pans is

\begin{equation} \frac{n!}{k!k!(n-2k)!}. \end{equation}

The algorithm works like this. First we find out the ids of the coins that form the state space. Once we have the coins out of which we can select, we select \(k\) coins for the left set, and \(k\) for the right set, the rest of the coins are put in the middle (hence not used in the weighting). We vary \(k\) from \(1\) to half the size of the state space to make all questions. Note that we overcount the number of questions by at least a factor 2, because equivalent questions like \(\{1\}, \{2, 3, 4\}, \{5\}\) and \(\{5\}, \{2, 3, 4\}, \{1\}\) are both stored. We can remove these doubles, but this comes at the expense of more difficult code, so we don’t do this here.

from itertools import combinations

from answers import TernaryAnswer as Answer
from oracle_base import Oracle
from player_base import Player


class Deficit_coint_oracle(Oracle):
    def feedback(self, question, guess):
        left, middle, right = question[:]
        if abs(guess) in middle:
            return Answer.MIDDLE
        elif abs(guess) in left:
            return Answer.RIGHT if guess < 0 else Answer.LEFT
        return Answer.RIGHT if guess > 0 else Answer.LEFT


class Deficit_coin_player(Player):
    def __init__(self, oracle, space):
        super().__init__(space, oracle)
        self.question_space = None

    def make_question_space(self, space):
        if not self.question_space:
            coins = set(abs(s) for s in space)
            Q = []
            for i in range(1, len(coins) // 2 + 1):
                for c1 in combinations(coins, i):
                    left = set(c1)
                    for c2 in combinations(coins - left, i):
                        right = set(c2)
                        middle = set(coins - left - right)
                        Q.append([left, middle, right])
            self.question_space = Q
        return self.question_space


def main():
    n = 12
    space = set(range(-n, 0)) | set(range(1, n + 1))
    oracle = Deficit_coint_oracle(secret=12, space=space)
    player = Deficit_coin_player(oracle=oracle, space=space)
    player.play()


if __name__ == "__main__":
    main()

exec(open("./one_deficit_coin.py").read())

q_star=[{0, 1, 2}, {8, 9, 6, 7}, {3, 4, 5}], answer=LEFT, n_tries=1
New state space:
(0, 1) (0, 2) (0, 6) (0, 7) (0, 8) (0, 9) (1, 2) (1, 6) (1, 7) (1, 8) (1, 9) (2,
6) (2, 7) (2, 8) (2, 9)

q_star=[{0}, {2, 6, 7, 8, 9}, {1}], answer=RIGHT, n_tries=2
New state space:
(1, 2) (1, 6) (1, 7) (1, 8) (1, 9)

q_star=[{2, 6}, {1, 7}, {8, 9}], answer=LEFT, n_tries=3
New state space:
(1, 2) (1, 6)

q_star=[{1}, {6}, {2}], answer=MIDDLE, n_tries=4
The secret is (1, 2)

The optimal strategy is not easy to understand, I think.

While developing the above mathematical framework and code, I considered how to approach the problem in which \(2\) out of \(n\) coins are heavier. We can now finally see why this problem is harder than the one-coin problem: the question space is no longer equal to the state space, but much larger. In particular, when during an experiment the balance remains level, the two heavier coins can be in the middle, or they are at either plate of the balance. It is not easy (for humans) to see how to use this feedback to efficiently reduce the state space or the question space.

7.2. One deficit coin

In this example one coin out of \(n\) is either heavier or lighter. The state space \(\{-n, \ldots, -1\} \cup \{1, 2, \ldots, n\}\): when the secret is \(+3\), say, then the third coin is heavier, when it is \(-3\) the third coin is lighter.

The feedback of oracle is as follows. The index of the deficit coin is the absolute value of the secret. Then, depending on the sign of the secret, the left or the right pan should tilt when the secret is not in the middle.

The player generates the question space in much the same way as the two coin player. However, this problem is trickier than the problems above. If \(n=2\), then necessarily both coins are different. In other words, to be able to decide that, for instance, coin 1 is not only different, but also heavier, we need at least 2 more coins. Consequently, when in the recursion the state space has reduced to just 2 coins, we need one of the other coins to determine whether one of the remaining coins is lighter or heavier. For this reason, we generate the question space just once, and keep using that set so that we always have a third coin at our disposal.

from itertools import combinations

from answers import TernaryAnswer as Answer
from oracle_base import Oracle
from player_base import Player


class Deficit_coint_oracle(Oracle):
    def feedback(self, question, guess):
        left, middle, right = question[:]
        if abs(guess) in middle:
            return Answer.MIDDLE
        elif abs(guess) in left:
            return Answer.RIGHT if guess < 0 else Answer.LEFT
        return Answer.RIGHT if guess > 0 else Answer.LEFT


class Deficit_coin_player(Player):
    def __init__(self, oracle, space):
        super().__init__(space, oracle)
        self.question_space = None

    def make_question_space(self, space):
        if not self.question_space:
            coins = set(abs(s) for s in space)
            Q = []
            for i in range(1, len(coins) // 2 + 1):
                for c1 in combinations(coins, i):
                    left = set(c1)
                    for c2 in combinations(coins - left, i):
                        right = set(c2)
                        middle = set(coins - left - right)
                        Q.append([left, middle, right])
            self.question_space = Q
        return self.question_space


def main():
    n = 12
    space = set(range(-n, 0)) | set(range(1, n + 1))
    oracle = Deficit_coint_oracle(secret=12, space=space)
    player = Deficit_coin_player(oracle=oracle, space=space)
    player.play()


if __name__ == "__main__":
    main()

exec(open("./one_deficit_coin.py").read())

q_star=[{1, 2, 3, 4}, {9, 10, 11, 12}, {8, 5, 6, 7}], answer=MIDDLE, n_tries=1
New state space:
-12 -11 -10 -9 9 10 11 12

q_star=[{1, 9}, {2, 3, 4, 5, 6, 7, 8, 12}, {10, 11}], answer=MIDDLE, n_tries=2
New state space:
-12 12

q_star=[{1}, {2, 3, 4, 5, 6, 7, 8, 9, 10, 11}, {12}], answer=RIGHT, n_tries=3
The secret is 12

7.3. General coins; link with Quicksort

Suppose we consider the yet more complicated problem to find \(k\) heavier or lighter coins out of \(n\) coins, and that the heavier and lighter coins are not equaly heavier and lighter, there comes a point at which it is simpler to just sort the coins and forget about the optimal policy. To see this, take an arbitrary coin and use that as pivot. Place the pivot coin on the left plate, select an arbitrary coin and put that on the right plate. Depending on the outcome of the balance, put the selected coin on a left, middle, or right stack of coins. Then pick another coin and compare that with the pivot, and so on. Finally, once all coins are weighted, put the pivot on the middle stack. Clearly, this is just the Quicksort algorithm.

Out of interest, let us discuss Quicksort in a bit more detail. Suppose we would only have available a balance that provides binary feedback, then we can use the following code to sort the coins.

L, M, R = {}, {}, {}
pivot = space.pop()
while space:
    w = space.pop()
    if w < p:
        L.add(w)
    elif w == p:
        M.add(w)
    else:
        R.add(w)
M.add(pivot)

This gives us about \(\log_2 n\) efficiency if most of the coins are different from the pivot.

Clearly, a good pivot is capable to split the state space in sets of about the same size. Thus, the best pivot is the median of the entire set, rather than an arbitrary element. However, finding the median is quite wasteful just for this step of the sorting process, see this page. We can better follow the idea of the Wordle game in which we selected a few samples of the word space, and then use the best question in this smaller set. Likewise, for Quicksort , we take the median of three arbitrary elements of \(S\). With a bit of probability theory, it’s evident that this gives a big performance gain for very little work.

8. Mastermind

Yet another nice game is Mastermind, the rules of which you can find at Wikipedia.

The number of positions in which the guess and the secret have the same color are the number of black pegs. The number of hits, both black and white, is given by the formula

\begin{align*} \min\{n_1, n_1'\} + \min\{n_2, n_2'\} + \cdots \min\{n_6, n_6'\}, \end{align*}

where \(n_i\) is the number of times color \(i\) occurs in the question, and \(n_i'\) the number of times it occurs in the secret. See this paper of Knuth. The number of white pegs is the number of hits minus the number of black pegs.

I only have to compute once the number of occurrences of the colors in the secret, hence I do this in the __init__.

Note that we minimize the expected number of experiments, we don’t minimize the worst case scenario. As Knuth proves, the guaranteed minimal number of iterations is 5. I think we do better often, but not always.

from collections import Counter
from itertools import product

from oracle_base import Oracle
from player_base import Player


class Mastermind_player(Player):
    def make_question_space(self, space):
        return space


class Mastermind_oracle(Oracle):
    def __init__(self, space, secret):
        super().__init__(space=space, secret=secret)
        self.s_cnt = Counter(c for c in secret)

    def feedback(self, question, guess):
        black = sum(s == g for s, g in zip(guess, question))
        g_cnt = Counter(c for c in question)
        hits = sum(min(self.s_cnt[g], g_cnt[g]) for g in range(6))
        white = hits - black
        return (white, black)


def main():
    space = set(product(range(6), repeat=4))
    oracle = Mastermind_oracle(secret=(1, 2, 3, 4), space=space)
    player = Mastermind_player(oracle=oracle, space=space)
    player.play()


if __name__ == '__main__':
    main()

exec(open("./mastermind.py").read())

q_star=(4, 2, 0, 5), answer=(1, 1), n_tries=1
New state space:
(0, 0, 0, 0) (0, 0, 0, 1) (0, 0, 0, 2) (0, 0, 0, 3) (0, 0, 0, 4) (0, 0, 1, 5)
(0, 0, 2, 5) (0, 0, 3, 5) (0, 0, 4, 5) (0, 0, 5, 5) (0, 1, 0, 0) (0, 1, 0, 1)
(0, 1, 0, 2) (0, 1, 0, 3) (0, 1, 0, 4) (0, 1, 1, 5) (0, 1, 2, 5) (0, 1, 3, 5)
(0, 1, 4, 5) (0, 1, 5, 5) (0, 2, 1, 0) (0, 2, 1, 1) (0, 2, 1, 2) (0, 2, 1, 3)
(0, 2, 1, 4) (0, 2, 2, 0) (0, 2, 2, 1) (0, 2, 2, 2) (0, 2, 2, 3) (0, 2, 2, 4)
(0, 2, 3, 0) (0, 2, 3, 1) (0, 2, 3, 2) (0, 2, 3, 3) (0, 2, 3, 4) (0, 2, 4, 0)
(0, 2, 4, 1) (0, 2, 4, 2) (0, 2, 4, 3) (0, 2, 4, 4) (0, 2, 5, 0) (0, 2, 5, 1)
(0, 2, 5, 2) (0, 2, 5, 3) (0, 2, 5, 4) (0, 3, 0, 0) (0, 3, 0, 1) (0, 3, 0, 2)
(0, 3, 0, 3) (0, 3, 0, 4) (0, 3, 1, 5) (0, 3, 2, 5) (0, 3, 3, 5) (0, 3, 4, 5)
(0, 3, 5, 5) (0, 4, 0, 0) (0, 4, 0, 1) (0, 4, 0, 2) (0, 4, 0, 3) (0, 4, 0, 4)
(0, 4, 1, 5) (0, 4, 2, 5) (0, 4, 3, 5) (0, 4, 4, 5) (0, 4, 5, 5) (0, 5, 0, 0)
(0, 5, 0, 1) (0, 5, 0, 2) (0, 5, 0, 3) (0, 5, 0, 4) (0, 5, 1, 5) (0, 5, 2, 5)
(0, 5, 3, 5) (0, 5, 4, 5) (0, 5, 5, 5) (1, 0, 0, 0) (1, 0, 0, 1) (1, 0, 0, 2)
(1, 0, 0, 3) (1, 0, 0, 4) (1, 0, 1, 5) (1, 0, 2, 5) (1, 0, 3, 5) (1, 0, 4, 5)
(1, 0, 5, 5) (1, 1, 0, 0) (1, 1, 0, 1) (1, 1, 0, 2) (1, 1, 0, 3) (1, 1, 0, 4)
(1, 1, 1, 5) (1, 1, 2, 5) (1, 1, 3, 5) (1, 1, 4, 5) (1, 1, 5, 5) (1, 2, 1, 0)
(1, 2, 1, 1) (1, 2, 1, 2) (1, 2, 1, 3) (1, 2, 1, 4) (1, 2, 2, 0) (1, 2, 2, 1)
(1, 2, 2, 2) (1, 2, 2, 3) (1, 2, 2, 4) (1, 2, 3, 0) (1, 2, 3, 1) (1, 2, 3, 2)
(1, 2, 3, 3) (1, 2, 3, 4) (1, 2, 4, 0) (1, 2, 4, 1) (1, 2, 4, 2) (1, 2, 4, 3)
(1, 2, 4, 4) (1, 2, 5, 0) (1, 2, 5, 1) (1, 2, 5, 2) (1, 2, 5, 3) (1, 2, 5, 4)
(1, 3, 0, 0) (1, 3, 0, 1) (1, 3, 0, 2) (1, 3, 0, 3) (1, 3, 0, 4) (1, 3, 1, 5)
(1, 3, 2, 5) (1, 3, 3, 5) (1, 3, 4, 5) (1, 3, 5, 5) (1, 4, 0, 0) (1, 4, 0, 1)
(1, 4, 0, 2) (1, 4, 0, 3) (1, 4, 0, 4) (1, 4, 1, 5) (1, 4, 2, 5) (1, 4, 3, 5)
(1, 4, 4, 5) (1, 4, 5, 5) (1, 5, 0, 0) (1, 5, 0, 1) (1, 5, 0, 2) (1, 5, 0, 3)
(1, 5, 0, 4) (1, 5, 1, 5) (1, 5, 2, 5) (1, 5, 3, 5) (1, 5, 4, 5) (1, 5, 5, 5)
(2, 0, 0, 0) (2, 0, 0, 1) (2, 0, 0, 2) (2, 0, 0, 3) (2, 0, 0, 4) (2, 0, 1, 5)
(2, 0, 2, 5) (2, 0, 3, 5) (2, 0, 4, 5) (2, 0, 5, 5) (2, 1, 0, 0) (2, 1, 0, 1)
(2, 1, 0, 2) (2, 1, 0, 3) (2, 1, 0, 4) (2, 1, 1, 5) (2, 1, 2, 5) (2, 1, 3, 5)
(2, 1, 4, 5) (2, 1, 5, 5) (2, 2, 1, 0) (2, 2, 1, 1) (2, 2, 1, 2) (2, 2, 1, 3)
(2, 2, 1, 4) (2, 2, 2, 0) (2, 2, 2, 1) (2, 2, 2, 2) (2, 2, 2, 3) (2, 2, 2, 4)
(2, 2, 3, 0) (2, 2, 3, 1) (2, 2, 3, 2) (2, 2, 3, 3) (2, 2, 3, 4) (2, 2, 4, 0)
(2, 2, 4, 1) (2, 2, 4, 2) (2, 2, 4, 3) (2, 2, 4, 4) (2, 2, 5, 0) (2, 2, 5, 1)
(2, 2, 5, 2) (2, 2, 5, 3) (2, 2, 5, 4) (2, 3, 0, 0) (2, 3, 0, 1) (2, 3, 0, 2)
(2, 3, 0, 3) (2, 3, 0, 4) (2, 3, 1, 5) (2, 3, 2, 5) (2, 3, 3, 5) (2, 3, 4, 5)
(2, 3, 5, 5) (2, 4, 0, 0) (2, 4, 0, 1) (2, 4, 0, 2) (2, 4, 0, 3) (2, 4, 0, 4)
(2, 4, 1, 5) (2, 4, 2, 5) (2, 4, 3, 5) (2, 4, 4, 5) (2, 4, 5, 5) (2, 5, 0, 0)
(2, 5, 0, 1) (2, 5, 0, 2) (2, 5, 0, 3) (2, 5, 0, 4) (2, 5, 1, 5) (2, 5, 2, 5)
(2, 5, 3, 5) (2, 5, 4, 5) (2, 5, 5, 5) (3, 0, 0, 0) (3, 0, 0, 1) (3, 0, 0, 2)
(3, 0, 0, 3) (3, 0, 0, 4) (3, 0, 1, 5) (3, 0, 2, 5) (3, 0, 3, 5) (3, 0, 4, 5)
(3, 0, 5, 5) (3, 1, 0, 0) (3, 1, 0, 1) (3, 1, 0, 2) (3, 1, 0, 3) (3, 1, 0, 4)
(3, 1, 1, 5) (3, 1, 2, 5) (3, 1, 3, 5) (3, 1, 4, 5) (3, 1, 5, 5) (3, 2, 1, 0)
(3, 2, 1, 1) (3, 2, 1, 2) (3, 2, 1, 3) (3, 2, 1, 4) (3, 2, 2, 0) (3, 2, 2, 1)
(3, 2, 2, 2) (3, 2, 2, 3) (3, 2, 2, 4) (3, 2, 3, 0) (3, 2, 3, 1) (3, 2, 3, 2)
(3, 2, 3, 3) (3, 2, 3, 4) (3, 2, 4, 0) (3, 2, 4, 1) (3, 2, 4, 2) (3, 2, 4, 3)
(3, 2, 4, 4) (3, 2, 5, 0) (3, 2, 5, 1) (3, 2, 5, 2) (3, 2, 5, 3) (3, 2, 5, 4)
(3, 3, 0, 0) (3, 3, 0, 1) (3, 3, 0, 2) (3, 3, 0, 3) (3, 3, 0, 4) (3, 3, 1, 5)
(3, 3, 2, 5) (3, 3, 3, 5) (3, 3, 4, 5) (3, 3, 5, 5) (3, 4, 0, 0) (3, 4, 0, 1)
(3, 4, 0, 2) (3, 4, 0, 3) (3, 4, 0, 4) (3, 4, 1, 5) (3, 4, 2, 5) (3, 4, 3, 5)
(3, 4, 4, 5) (3, 4, 5, 5) (3, 5, 0, 0) (3, 5, 0, 1) (3, 5, 0, 2) (3, 5, 0, 3)
(3, 5, 0, 4) (3, 5, 1, 5) (3, 5, 2, 5) (3, 5, 3, 5) (3, 5, 4, 5) (3, 5, 5, 5)
(4, 0, 1, 0) (4, 0, 1, 1) (4, 0, 1, 2) (4, 0, 1, 3) (4, 0, 1, 4) (4, 0, 2, 0)
(4, 0, 2, 1) (4, 0, 2, 2) (4, 0, 2, 3) (4, 0, 2, 4) (4, 0, 3, 0) (4, 0, 3, 1)
(4, 0, 3, 2) (4, 0, 3, 3) (4, 0, 3, 4) (4, 0, 4, 0) (4, 0, 4, 1) (4, 0, 4, 2)
(4, 0, 4, 3) (4, 0, 4, 4) (4, 0, 5, 0) (4, 0, 5, 1) (4, 0, 5, 2) (4, 0, 5, 3)
(4, 0, 5, 4) (4, 1, 1, 0) (4, 1, 1, 1) (4, 1, 1, 2) (4, 1, 1, 3) (4, 1, 1, 4)
(4, 1, 2, 0) (4, 1, 2, 1) (4, 1, 2, 2) (4, 1, 2, 3) (4, 1, 2, 4) (4, 1, 3, 0)
(4, 1, 3, 1) (4, 1, 3, 2) (4, 1, 3, 3) (4, 1, 3, 4) (4, 1, 4, 0) (4, 1, 4, 1)
(4, 1, 4, 2) (4, 1, 4, 3) (4, 1, 4, 4) (4, 1, 5, 0) (4, 1, 5, 1) (4, 1, 5, 2)
(4, 1, 5, 3) (4, 1, 5, 4) (4, 3, 1, 0) (4, 3, 1, 1) (4, 3, 1, 2) (4, 3, 1, 3)
(4, 3, 1, 4) (4, 3, 2, 0) (4, 3, 2, 1) (4, 3, 2, 2) (4, 3, 2, 3) (4, 3, 2, 4)
(4, 3, 3, 0) (4, 3, 3, 1) (4, 3, 3, 2) (4, 3, 3, 3) (4, 3, 3, 4) (4, 3, 4, 0)
(4, 3, 4, 1) (4, 3, 4, 2) (4, 3, 4, 3) (4, 3, 4, 4) (4, 3, 5, 0) (4, 3, 5, 1)
(4, 3, 5, 2) (4, 3, 5, 3) (4, 3, 5, 4) (4, 4, 1, 0) (4, 4, 1, 1) (4, 4, 1, 2)
(4, 4, 1, 3) (4, 4, 1, 4) (4, 4, 2, 0) (4, 4, 2, 1) (4, 4, 2, 2) (4, 4, 2, 3)
(4, 4, 2, 4) (4, 4, 3, 0) (4, 4, 3, 1) (4, 4, 3, 2) (4, 4, 3, 3) (4, 4, 3, 4)
(4, 4, 4, 0) (4, 4, 4, 1) (4, 4, 4, 2) (4, 4, 4, 3) (4, 4, 4, 4) (4, 4, 5, 0)
(4, 4, 5, 1) (4, 4, 5, 2) (4, 4, 5, 3) (4, 4, 5, 4) (4, 5, 1, 0) (4, 5, 1, 1)
(4, 5, 1, 2) (4, 5, 1, 3) (4, 5, 1, 4) (4, 5, 2, 0) (4, 5, 2, 1) (4, 5, 2, 2)
(4, 5, 2, 3) (4, 5, 2, 4) (4, 5, 3, 0) (4, 5, 3, 1) (4, 5, 3, 2) (4, 5, 3, 3)
(4, 5, 3, 4) (4, 5, 4, 0) (4, 5, 4, 1) (4, 5, 4, 2) (4, 5, 4, 3) (4, 5, 4, 4)
(4, 5, 5, 0) (4, 5, 5, 1) (4, 5, 5, 2) (4, 5, 5, 3) (4, 5, 5, 4) (5, 0, 0, 0)
(5, 0, 0, 1) (5, 0, 0, 2) (5, 0, 0, 3) (5, 0, 0, 4) (5, 0, 1, 5) (5, 0, 2, 5)
(5, 0, 3, 5) (5, 0, 4, 5) (5, 0, 5, 5) (5, 1, 0, 0) (5, 1, 0, 1) (5, 1, 0, 2)
(5, 1, 0, 3) (5, 1, 0, 4) (5, 1, 1, 5) (5, 1, 2, 5) (5, 1, 3, 5) (5, 1, 4, 5)
(5, 1, 5, 5) (5, 2, 1, 0) (5, 2, 1, 1) (5, 2, 1, 2) (5, 2, 1, 3) (5, 2, 1, 4)
(5, 2, 2, 0) (5, 2, 2, 1) (5, 2, 2, 2) (5, 2, 2, 3) (5, 2, 2, 4) (5, 2, 3, 0)
(5, 2, 3, 1) (5, 2, 3, 2) (5, 2, 3, 3) (5, 2, 3, 4) (5, 2, 4, 0) (5, 2, 4, 1)
(5, 2, 4, 2) (5, 2, 4, 3) (5, 2, 4, 4) (5, 2, 5, 0) (5, 2, 5, 1) (5, 2, 5, 2)
(5, 2, 5, 3) (5, 2, 5, 4) (5, 3, 0, 0) (5, 3, 0, 1) (5, 3, 0, 2) (5, 3, 0, 3)
(5, 3, 0, 4) (5, 3, 1, 5) (5, 3, 2, 5) (5, 3, 3, 5) (5, 3, 4, 5) (5, 3, 5, 5)
(5, 4, 0, 0) (5, 4, 0, 1) (5, 4, 0, 2) (5, 4, 0, 3) (5, 4, 0, 4) (5, 4, 1, 5)
(5, 4, 2, 5) (5, 4, 3, 5) (5, 4, 4, 5) (5, 4, 5, 5) (5, 5, 0, 0) (5, 5, 0, 1)
(5, 5, 0, 2) (5, 5, 0, 3) (5, 5, 0, 4) (5, 5, 1, 5) (5, 5, 2, 5) (5, 5, 3, 5)
(5, 5, 4, 5) (5, 5, 5, 5)

q_star=(2, 5, 1, 5), answer=(2, 0), n_tries=2
New state space:
(0, 0, 0, 0) (0, 0, 0, 1) (0, 0, 0, 2) (0, 0, 0, 3) (0, 0, 0, 4) (0, 1, 0, 0)
(0, 1, 0, 1) (0, 1, 0, 2) (0, 1, 0, 3) (0, 1, 0, 4) (0, 2, 2, 0) (0, 2, 2, 1)
(0, 2, 2, 2) (0, 2, 2, 3) (0, 2, 2, 4) (0, 2, 3, 0) (0, 2, 3, 1) (0, 2, 3, 2)
(0, 2, 3, 3) (0, 2, 3, 4) (0, 2, 4, 0) (0, 2, 4, 1) (0, 2, 4, 2) (0, 2, 4, 3)
(0, 2, 4, 4) (0, 2, 5, 0) (0, 2, 5, 1) (0, 2, 5, 2) (0, 2, 5, 3) (0, 2, 5, 4)
(0, 3, 0, 0) (0, 3, 0, 1) (0, 3, 0, 2) (0, 3, 0, 3) (0, 3, 0, 4) (0, 4, 0, 0)
(0, 4, 0, 1) (0, 4, 0, 2) (0, 4, 0, 3) (0, 4, 0, 4) (1, 0, 0, 0) (1, 0, 0, 1)
(1, 0, 0, 2) (1, 0, 0, 3) (1, 0, 0, 4) (1, 1, 0, 0) (1, 1, 0, 1) (1, 1, 0, 2)
(1, 1, 0, 3) (1, 1, 0, 4) (1, 2, 2, 0) (1, 2, 2, 1) (1, 2, 2, 2) (1, 2, 2, 3)
(1, 2, 2, 4) (1, 2, 3, 0) (1, 2, 3, 1) (1, 2, 3, 2) (1, 2, 3, 3) (1, 2, 3, 4)
(1, 2, 4, 0) (1, 2, 4, 1) (1, 2, 4, 2) (1, 2, 4, 3) (1, 2, 4, 4) (1, 2, 5, 0)
(1, 2, 5, 1) (1, 2, 5, 2) (1, 2, 5, 3) (1, 2, 5, 4) (1, 3, 0, 0) (1, 3, 0, 1)
(1, 3, 0, 2) (1, 3, 0, 3) (1, 3, 0, 4) (1, 4, 0, 0) (1, 4, 0, 1) (1, 4, 0, 2)
(1, 4, 0, 3) (1, 4, 0, 4) (3, 0, 0, 0) (3, 0, 0, 1) (3, 0, 0, 2) (3, 0, 0, 3)
(3, 0, 0, 4) (3, 1, 0, 0) (3, 1, 0, 1) (3, 1, 0, 2) (3, 1, 0, 3) (3, 1, 0, 4)
(3, 2, 2, 0) (3, 2, 2, 1) (3, 2, 2, 2) (3, 2, 2, 3) (3, 2, 2, 4) (3, 2, 3, 0)
(3, 2, 3, 1) (3, 2, 3, 2) (3, 2, 3, 3) (3, 2, 3, 4) (3, 2, 4, 0) (3, 2, 4, 1)
(3, 2, 4, 2) (3, 2, 4, 3) (3, 2, 4, 4) (3, 2, 5, 0) (3, 2, 5, 1) (3, 2, 5, 2)
(3, 2, 5, 3) (3, 2, 5, 4) (3, 3, 0, 0) (3, 3, 0, 1) (3, 3, 0, 2) (3, 3, 0, 3)
(3, 3, 0, 4) (3, 4, 0, 0) (3, 4, 0, 1) (3, 4, 0, 2) (3, 4, 0, 3) (3, 4, 0, 4)
(4, 0, 2, 0) (4, 0, 2, 1) (4, 0, 2, 2) (4, 0, 2, 3) (4, 0, 2, 4) (4, 0, 3, 0)
(4, 0, 3, 1) (4, 0, 3, 2) (4, 0, 3, 3) (4, 0, 3, 4) (4, 0, 4, 0) (4, 0, 4, 1)
(4, 0, 4, 2) (4, 0, 4, 3) (4, 0, 4, 4) (4, 0, 5, 0) (4, 0, 5, 1) (4, 0, 5, 2)
(4, 0, 5, 3) (4, 0, 5, 4) (4, 1, 2, 0) (4, 1, 2, 1) (4, 1, 2, 2) (4, 1, 2, 3)
(4, 1, 2, 4) (4, 1, 3, 0) (4, 1, 3, 1) (4, 1, 3, 2) (4, 1, 3, 3) (4, 1, 3, 4)
(4, 1, 4, 0) (4, 1, 4, 1) (4, 1, 4, 2) (4, 1, 4, 3) (4, 1, 4, 4) (4, 1, 5, 0)
(4, 1, 5, 1) (4, 1, 5, 2) (4, 1, 5, 3) (4, 1, 5, 4) (4, 3, 2, 0) (4, 3, 2, 1)
(4, 3, 2, 2) (4, 3, 2, 3) (4, 3, 2, 4) (4, 3, 3, 0) (4, 3, 3, 1) (4, 3, 3, 2)
(4, 3, 3, 3) (4, 3, 3, 4) (4, 3, 4, 0) (4, 3, 4, 1) (4, 3, 4, 2) (4, 3, 4, 3)
(4, 3, 4, 4) (4, 3, 5, 0) (4, 3, 5, 1) (4, 3, 5, 2) (4, 3, 5, 3) (4, 3, 5, 4)
(4, 4, 2, 0) (4, 4, 2, 1) (4, 4, 2, 2) (4, 4, 2, 3) (4, 4, 2, 4) (4, 4, 3, 0)
(4, 4, 3, 1) (4, 4, 3, 2) (4, 4, 3, 3) (4, 4, 3, 4) (4, 4, 4, 0) (4, 4, 4, 1)
(4, 4, 4, 2) (4, 4, 4, 3) (4, 4, 4, 4) (4, 4, 5, 0) (4, 4, 5, 1) (4, 4, 5, 2)
(4, 4, 5, 3) (4, 4, 5, 4) (5, 0, 0, 0) (5, 0, 0, 1) (5, 0, 0, 2) (5, 0, 0, 3)
(5, 0, 0, 4) (5, 1, 0, 0) (5, 1, 0, 1) (5, 1, 0, 2) (5, 1, 0, 3) (5, 1, 0, 4)
(5, 2, 2, 0) (5, 2, 2, 1) (5, 2, 2, 2) (5, 2, 2, 3) (5, 2, 2, 4) (5, 2, 3, 0)
(5, 2, 3, 1) (5, 2, 3, 2) (5, 2, 3, 3) (5, 2, 3, 4) (5, 2, 4, 0) (5, 2, 4, 1)
(5, 2, 4, 2) (5, 2, 4, 3) (5, 2, 4, 4) (5, 2, 5, 0) (5, 2, 5, 1) (5, 2, 5, 2)
(5, 2, 5, 3) (5, 2, 5, 4) (5, 3, 0, 0) (5, 3, 0, 1) (5, 3, 0, 2) (5, 3, 0, 3)
(5, 3, 0, 4) (5, 4, 0, 0) (5, 4, 0, 1) (5, 4, 0, 2) (5, 4, 0, 3) (5, 4, 0, 4)

q_star=(4, 1, 2, 4), answer=(2, 1), n_tries=3
New state space:
(0, 0, 0, 4) (0, 1, 0, 0) (0, 1, 0, 1) (0, 1, 0, 2) (0, 1, 0, 3) (0, 2, 2, 0)
(0, 2, 2, 1) (0, 2, 2, 2) (0, 2, 2, 3) (0, 2, 3, 4) (0, 2, 4, 4) (0, 2, 5, 4)
(0, 3, 0, 4) (0, 4, 0, 4) (1, 0, 0, 4) (1, 1, 0, 0) (1, 1, 0, 1) (1, 1, 0, 2)
(1, 1, 0, 3) (1, 2, 2, 0) (1, 2, 2, 1) (1, 2, 2, 2) (1, 2, 2, 3) (1, 2, 3, 4)
(1, 2, 4, 4) (1, 2, 5, 4) (1, 3, 0, 4) (1, 4, 0, 4) (3, 0, 0, 4) (3, 1, 0, 0)
(3, 1, 0, 1) (3, 1, 0, 2) (3, 1, 0, 3) (3, 2, 2, 0) (3, 2, 2, 1) (3, 2, 2, 2)
(3, 2, 2, 3) (3, 2, 3, 4) (3, 2, 4, 4) (3, 2, 5, 4) (3, 3, 0, 4) (3, 4, 0, 4)
(4, 0, 3, 0) (4, 0, 3, 1) (4, 0, 3, 2) (4, 0, 3, 3) (4, 0, 4, 0) (4, 0, 4, 1)
(4, 0, 4, 2) (4, 0, 4, 3) (4, 0, 5, 0) (4, 0, 5, 1) (4, 0, 5, 2) (4, 0, 5, 3)
(4, 3, 3, 0) (4, 3, 3, 1) (4, 3, 3, 2) (4, 3, 3, 3) (4, 3, 4, 0) (4, 3, 4, 1)
(4, 3, 4, 2) (4, 3, 4, 3) (4, 3, 5, 0) (4, 3, 5, 1) (4, 3, 5, 2) (4, 3, 5, 3)
(4, 4, 3, 0) (4, 4, 3, 1) (4, 4, 3, 2) (4, 4, 3, 3) (4, 4, 4, 0) (4, 4, 4, 1)
(4, 4, 4, 2) (4, 4, 4, 3) (4, 4, 5, 0) (4, 4, 5, 1) (4, 4, 5, 2) (4, 4, 5, 3)
(5, 0, 0, 4) (5, 1, 0, 0) (5, 1, 0, 1) (5, 1, 0, 2) (5, 1, 0, 3) (5, 2, 2, 0)
(5, 2, 2, 1) (5, 2, 2, 2) (5, 2, 2, 3) (5, 2, 3, 4) (5, 2, 4, 4) (5, 2, 5, 4)
(5, 3, 0, 4) (5, 4, 0, 4)

q_star=(4, 0, 3, 0), answer=(1, 1), n_tries=4
New state space:
(0, 0, 0, 4) (0, 1, 0, 0) (0, 2, 2, 0) (0, 2, 3, 4) (1, 0, 0, 4) (1, 1, 0, 0)
(1, 2, 2, 0) (1, 2, 3, 4) (3, 0, 0, 4) (3, 1, 0, 0) (3, 2, 2, 0) (3, 2, 3, 4)
(4, 3, 4, 1) (4, 3, 4, 2) (4, 3, 4, 3) (4, 3, 5, 1) (4, 3, 5, 2) (4, 3, 5, 3)
(4, 4, 4, 1) (4, 4, 4, 2) (4, 4, 4, 3) (4, 4, 5, 1) (4, 4, 5, 2) (4, 4, 5, 3)
(5, 0, 0, 4) (5, 1, 0, 0) (5, 2, 2, 0) (5, 2, 3, 4)

q_star=(0, 2, 2, 0), answer=(0, 1), n_tries=5
New state space:
(0, 0, 0, 4) (1, 1, 0, 0) (1, 2, 3, 4) (3, 1, 0, 0) (3, 2, 3, 4) (5, 1, 0, 0)
(5, 2, 3, 4)

q_star=(3, 2, 3, 4), answer=(0, 3), n_tries=6
New state space:
(1, 2, 3, 4) (5, 2, 3, 4)

q_star=(1, 2, 3, 4), answer=(0, 4), n_tries=7
The secret is (1, 2, 3, 4)