How does Akinator work?
In case you haven’t come across it, Akinator is a computer game and mobile application developed by the French company Elokence.
Akinator’s aim is to guess the identity of a real or fictional character. To figure out which character the player is thinking of, Akinator asks a series of questions that the player can answer with “Yes,” “No,” “Don’t know,” “Probably,” or “Probably not”; after each answer, the program decides which question is the most appropriate to ask next.
For every answer, Akinator computes the best question to ask and eventually offers a guess at who the player is thinking of. If the first guess is wrong, Akinator continues asking questions, making up to three guesses in total; the first guess usually comes after about 15 questions. If the guesses are still wrong, the player is asked to enter the character’s name into the database.
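To get a feel for this loop, here is a toy sketch (entirely my own illustration, with invented characters and yes/no questions; Akinator’s real algorithm is private and far more sophisticated) that always asks the question splitting the remaining candidates most evenly:

```python
# Toy guessing loop: pick the question whose yes/no split is most balanced,
# filter the candidates by the answer, and guess when one candidate remains.
# The characters and traits below are made up for illustration.
characters = {
    "Batman":   {"real": False, "singer": False},
    "Beyonce":  {"real": True,  "singer": True},
    "Einstein": {"real": True,  "singer": False},
}

def best_question(candidates):
    # The most balanced split eliminates the most candidates on average.
    questions = next(iter(candidates.values())).keys()
    return min(questions,
               key=lambda q: abs(2 * sum(t[q] for t in candidates.values())
                                 - len(candidates)))

def play(answers):
    candidates = dict(characters)
    while len(candidates) > 1:
        q = best_question(candidates)
        candidates = {name: traits for name, traits in candidates.items()
                      if traits[q] == answers[q]}
    return next(iter(candidates))

print(play({"real": True, "singer": False}))  # Einstein
```

A real system would also have to handle “Don’t know” and “Probably” answers, which is where the probabilistic machinery discussed below comes in.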
The algorithm used to choose the questions was created by Elokence and has been kept private. Still, it is easy to find articles that explain how such an algorithm can be built and how it might be used in Akinator. In this post, I’ll give you a quick and enjoyable way to understand it.
Some articles state that Akinator employs decision trees as well as probabilistic methods or reinforcement learning. This article will concentrate on two algorithms that are central to decision trees: Iterative Dichotomiser 3 (ID3) and ID4.
Iterative Dichotomiser 3 (ID3)
The principle behind the ID3 method is to build the decision tree top-down, using a greedy search of the given dataset to test every attribute at every node of the tree.
To classify a training set efficiently, it is important to minimize the number of questions asked (i.e., to minimize the depth of the tree). We therefore need a function that measures which questions produce the most balanced splits; the Information Gain metric is such a function. Information Gain is the difference between the impurity measure of the initial set (i.e., before it is split) and the weighted average of the impurity measure after the set is split. (In the previous post, “Tree Models Fundamental Concepts,” we saw that both Gini and Entropy are measures of impurity):

Gain(S, X) = Entropy(S) - Entropy(S, X)

where Entropy(S) is the impurity value before splitting the data, and Entropy(S, X) is the impurity after the split.
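To make these quantities concrete, here is a minimal, self-contained sketch in Python (my own illustration, not Akinator’s code) of Entropy and Information Gain for a toy split:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy(S) minus the weighted entropy of the split groups."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted

# Toy example: four characters, split by the answer to one question.
labels = ["hero", "hero", "villain", "villain"]
groups = [["hero", "hero"], ["villain", "villain"]]  # a perfect split
print(information_gain(labels, groups))  # 1.0
```

A perfect split drives the weighted entropy to zero, so the gain equals the entropy of the original set; a useless split leaves the entropy unchanged and yields a gain of zero.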
When using Information Gain, two major operations occur during tree construction:

Evaluation of the splits for each attribute and selection of the best split.
Creation of the partitions using the best split.
One important point is that the challenge lies in determining the best split for every attribute; as above, we can compute Information Gain from either the Entropy or the Gini formula.
Thus, using Information Gain, the ID3 tree algorithm is as follows:
If all instances belong to the same class, the tree is an answer node containing that class’s name. Otherwise:
(a) Define a(best) as the attribute (or feature) with the lowest E-score (i.e., the highest Information Gain).
(b) For each value V(best, i) of a(best), grow a branch from a(best) to a recursive decision tree built from all the instances that have value V(best, i) of attribute a(best).
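The two steps above can be sketched as a small recursive builder. This is a toy illustration under my own assumptions (the characters and attributes are invented, and ties are broken arbitrarily), not Elokence’s implementation:

```python
import math
from collections import Counter

def entropy(rows, target):
    """Shannon entropy of the target column over a list of dict rows."""
    total = len(rows)
    counts = Counter(r[target] for r in rows)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def id3(rows, attributes, target):
    classes = {r[target] for r in rows}
    # 1. All instances share one class: the tree is an answer node.
    if len(classes) == 1:
        return classes.pop()
    if not attributes:  # no questions left: answer with the majority class
        return Counter(r[target] for r in rows).most_common(1)[0][0]

    # 2a. Choose a_best, the attribute with the highest Information Gain
    #     (equivalently, the lowest expected entropy after the split).
    def gain(attr):
        values = Counter(r[attr] for r in rows)
        remainder = sum((n / len(rows)) *
                        entropy([r for r in rows if r[attr] == v], target)
                        for v, n in values.items())
        return entropy(rows, target) - remainder

    best = max(attributes, key=gain)
    # 2b. Grow one branch per value of a_best, recursing on the
    #     instances that carry that value.
    return {best: {v: id3([r for r in rows if r[best] == v],
                          [a for a in attributes if a != best],
                          target)
                   for v in {r[best] for r in rows}}}

# Invented toy data: guess a character from two yes/no attributes.
rows = [
    {"real": "yes", "singer": "yes", "who": "Beyonce"},
    {"real": "yes", "singer": "no",  "who": "Einstein"},
    {"real": "no",  "singer": "no",  "who": "Batman"},
]
tree = id3(rows, ["real", "singer"], "who")
```

On these three toy rows the builder first splits on “real” and then on “singer,” yielding a nested dict that plays the role of the question tree.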
Another key method is ID4. Its authors argue that ID4 accepts a new training instance and updates the decision tree in place, so the tree does not need to be rebuilt from scratch, because the required global information is preserved in the tree itself.
The basic ID4 tree-update procedure is described below.
Input: a decision tree and one instance
Output: a decision tree
For each possible test attribute at the current node, increment the count of positive or negative instances for the value of that attribute in the training instance.
If all the instances seen at the current node are positive (negative), the decision tree at the current node is an answer node containing a “+” (“-”) to indicate a positive (negative) instance.
Otherwise:
(a) If the current node is an answer node, change it to a decision node containing an attribute test with the lowest E-score.
(b) Otherwise, if the current decision node contains an attribute test that does not have the lowest E-score:
Change the attribute test to one with the lowest E-score.
Discard all existing subtrees below the decision node.
(c) Recursively update the decision tree below the current decision node along the branch corresponding to the value of the current test attribute that appears in the instance description. Grow the branch if necessary.
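A heavily simplified sketch of this update procedure follows. It is my own toy rendering of the steps above, with invented attributes, and it uses multi-class labels where the original paper uses “+”/“-” classes:

```python
import math
from collections import Counter, defaultdict

class ID4Node:
    """One decision node that learns incrementally (simplified sketch)."""

    def __init__(self, attributes):
        self.attributes = attributes
        self.class_counts = Counter()   # classes seen at this node
        # attribute -> attribute value -> per-class counts
        self.counts = defaultdict(lambda: defaultdict(Counter))
        self.test = None                # attribute currently tested here
        self.children = {}              # attribute value -> ID4Node

    def e_score(self, attr):
        """Expected entropy after splitting on attr (lower is better)."""
        total = sum(self.class_counts.values())
        score = 0.0
        for class_counts in self.counts[attr].values():
            n = sum(class_counts.values())
            ent = -sum((c / n) * math.log2(c / n)
                       for c in class_counts.values())
            score += (n / total) * ent
        return score

    def update(self, instance, label):
        # 1. Increment the counts for every possible test attribute.
        self.class_counts[label] += 1
        for a in self.attributes:
            self.counts[a][instance[a]][label] += 1
        # 2. If all instances seen here share one class, stay an answer node.
        if len(self.class_counts) == 1 or not self.attributes:
            return
        # 3a/3b. Ensure the node tests the attribute with the lowest
        #        E-score; if the test changes, discard the old subtrees.
        best = min(self.attributes, key=self.e_score)
        if self.test != best:
            self.test, self.children = best, {}
        # 3c. Recursively update along the branch matching the instance,
        #     growing the branch if necessary.
        child = self.children.setdefault(
            instance[self.test],
            ID4Node([a for a in self.attributes if a != self.test]))
        child.update(instance, label)

root = ID4Node(["real", "singer"])
root.update({"real": "yes", "singer": "yes"}, "Beyonce")
root.update({"real": "yes", "singer": "no"},  "Einstein")
root.update({"real": "no",  "singer": "no"},  "Batman")
```

Note how step 3b can throw away entire subtrees whenever the best attribute changes; this “thrashing” is a known weakness of ID4 that later incremental algorithms such as ID5R were designed to avoid.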
Akinator’s Database
Akinator has its own exclusive database, available only to Akinator, which users can query without worrying about overload. Furthermore, Akinator keeps the database up to date by learning from its users. It asks up to 20 questions while trying to guess the character; if it fails to figure out the name, it asks for it directly. Although Akinator does fail from time to time, it rarely misses the character you have in mind. Akinator can be enjoyed in two ways, as entertainment and as something to learn from, and it succeeds at both.