The paper is actually pretty interesting - it's a pity the headline is misleading.
The Reddit comment[1] is a good summary: More accurately: "The Tsetlin Machine - a new approach to ML - outperforms single layer Neural Networks, SVMs, Random Forests, the Naive Bayes Classifier and Logistic Regression on four carefully selected contrived datasets"
Notably, on their (weird!) "Binary Iris" dataset, the NN appears to be undertrained, and it is unclear what the headline "mean" accuracy figure is actually a mean of.
However, once we get over that, it's quite a different approach, and I can imagine places where it could be useful. Notably, as an anomaly detector it would seem to have interesting properties like interpretable results similar to a random forest.
They say that their results are interpretable because they are logical formulas but, later, show a toy problem in which their formula has 10,000 clauses. By that definition, deep learning is probably interpretable too.
However I do think that it is an interesting concept that would probably be worth testing if your inputs and outputs have a natural mapping to booleans.
The Tsetlin machine is based on Tsetlin's automata, the so-called finite automata with linear tactics. These are basically counters of rewards (up) and penalties (down) with various strategies of integration, proportionization and differentiation. They have the minimum necessary for digital emulation of perception functionality. They stem not only from Tsetlin but also from Krinsky, Krylov, and Varshavsky; Mikhail Tsetlin is the most prominent creator. This is work from the USSR in the 60s. So not new, but very elegant!
A Tsetlin automaton always returns true or always returns false depending on its internal state (which is basically a counter set during the learning process).
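To make that concrete, here is a minimal sketch of such a two-action automaton (my own illustration, not code from the linked repo; the split of 2N states into an "exclude" half and an "include" half follows the paper's setup, but the names are mine):

```python
class TsetlinAutomaton:
    """A two-action automaton with 2*n states: states 1..n choose
    "exclude" (False), states n+1..2n choose "include" (True)."""

    def __init__(self, n_states=100):
        self.n = n_states
        self.state = n_states  # start at the boundary, on the "exclude" side

    def action(self):
        # The automaton's answer is fully determined by its counter.
        return self.state > self.n

    def reward(self):
        # Reinforce the current action: move deeper into its half.
        if self.action():
            self.state = min(self.state + 1, 2 * self.n)
        else:
            self.state = max(self.state - 1, 1)

    def penalize(self):
        # Weaken the current action: move toward the opposite half.
        if self.action():
            self.state -= 1
        else:
            self.state += 1
```

The key point is that the decision only flips after enough penalties accumulate to push the counter across the boundary, which is what gives the automaton its memory.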
Using one automaton per literal per clause, you can learn any logical formula in normalized form (the automaton tells you whether to include that literal or not).
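A rough sketch of how those per-literal decisions define a clause (illustrative only; the function name and argument layout are mine). Each input bit gets two automata, one for the literal and one for its negation, and the clause is the AND of every included literal:

```python
def eval_clause(x, include, include_neg):
    """x: list of 0/1 bits. include/include_neg: one boolean per bit,
    each taken from the current action of the corresponding automaton."""
    for xi, inc, inc_n in zip(x, include, include_neg):
        if inc and xi == 0:      # literal x_i is included but false
            return 0
        if inc_n and xi == 1:    # literal NOT x_i is included but false
            return 0
    return 1                     # all included literals are satisfied
```

For example, including `x0` and `NOT x1` gives the clause `x0 AND NOT x1`.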
That section made the paper worth reading for me.
A Tsetlin machine is a refinement of that formula-learning algorithm with better learning properties; once trained, you get a (long) formula (not precisely a logical formula, since it uses a sum and a threshold instead of OR, to improve robustness).
It is important to note that it manipulates boolean inputs and outputs and, thus, is not a general alternative to neural networks.
As far as I can tell from the code that implements it, and ignoring training for the moment, the model itself takes a binary vector, sums (the characteristic function of) ANDs of elements of that vector, and returns whether the sum exceeds a threshold.
So an example model with an input vector `x` might sum a few ANDed terms and compare against a threshold.
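For instance (an illustrative sketch; the original comment's snippet is missing here, so the particular clauses and threshold are made up):

```python
def model(x, threshold=2):
    """x: binary vector [x0, x1, x2]. Returns 1 iff the number of
    satisfied clauses reaches the threshold."""
    x0, x1, x2 = x
    terms = [
        x0 & x1,        # clause 1: x0 AND x1
        x1 & (1 - x2),  # clause 2: x1 AND NOT x2
        x0,             # clause 3: a single literal
    ]
    return int(sum(terms) >= threshold)
```

So `model([1, 1, 0])` satisfies all three clauses and fires, while `model([0, 1, 1])` satisfies none and doesn't.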
EDIT: okay, so that's the effective evaluation model, but for training purposes it is stored internally as a series of numbers, one per potential coefficient, and if a number exceeds the "number of states", that coefficient is considered active. So the model above would be stored in that counter form.
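A rough sketch of that internal representation (illustrative; the counter values and names here are mine, not taken from the repo):

```python
N_STATES = 100  # a literal is active only when its counter exceeds this

# counters[c][l]: state of the automaton for literal l in clause c
counters = [
    [180, 20, 150],  # clause 0: literals 0 and 2 active
    [90, 130, 10],   # clause 1: only literal 1 active
]

def active_literals(counters, n_states=N_STATES):
    """Collapse the trained counters into the boolean include/exclude
    decisions that the evaluation model actually uses."""
    return [[c > n_states for c in clause] for clause in counters]
```

Training nudges the counters up and down, and only the thresholded boolean pattern matters at evaluation time.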
For a detailed understanding, you need the paper. [0]
But, loosely speaking, it is a classifier that works best with less information, utilising a series of formulas. It only really does bitwise ops for recognition, and is easy to implement and fast.
If I were to describe it in more flowery terms, hashing and finite automata had a baby.
Why can't I just get a reasonable definition of what a Tsetlin machine is?
As far as I can tell, it's a function that takes an input (a vector of real numbers in 0..1?), a state (in the form of an integer) and produces an output (in the form of... something?).
None of these approaches are prima facie "unreasonable." But if you get the wrong level it won't feel useful.
I don't think it's crazy to have open teaching discussions on HN, I think you lay out a fair hope that someone can break something down here. But I'm just not sure which level you're looking for.
Precision and simplicity are often competing constraints.
Let me try getting my point across with an insult instead.
Some dude in a marginal university in Norway has managed to convince a few people, a couple of students and a writer for a glittery university magazine, that he is a genius.
His approach is to do something trivial but bury it in a completely obscure terminology. This technically sorta works on a toy example but not really on more challenging tests. But it is very convincing for people who are not trying to understand what is going on but rather are impressed by technical words and academic rank.
That is why it is getting negative feedback here and on other technical forums.
And that is why noone can answer a simple technical question.
>And that is why noone can answer a simple technical question.
Who is "noone"? Some people in a small thread on HN that first read about this today and most of which do not have anything to do with NN or Tsetlin's automata (e.g. startup founders, JS programmers, embedded programmers, and so on)?
Note also that you haven't asked any "simple technical question" -- not a specific one, that is. You just asked for it to be explained, and then rudely complained, when someone did, that it wasn't "reasonably" explained.
You also seem impatient waiting for one (considering that you posted 4 hours ago, and most of the US is still asleep); it has to be pronto for you, it seems...
Here's also a simple explanation from another member:
"The Tsetlin machine is based on Tsetlin's automata, the so-called finite automata with linear tactics. These are basically counters of rewards (up) and penalties (down) with various strategies of integration, proportionization and differentiation. They have the minimum necessary for digital emulation of perception functionality. They stem not only from Tsetlin but also from Krinsky, Krylov, and Varshavsky; Mikhail Tsetlin is the most prominent creator. This is work from the USSR in the 60s. So not new, but very elegant!"
>His approach is to do something trivial but bury it in a completely obscure terminology. This technically sorta works on a toy example but not really on more challenging tests. But it is very convincing for people who are not trying to understand what is going on but rather are impressed by technical words and academic rank. That is why it is getting negative feedback here and on other technical forums.
Possibly. It's also possible that the negative feedback comes from people who don't understand the math either, and/or are too invested in traditional NNs -- the usual resistance to new ideas.
Apparently you don't have the means to classify it to one or the other case, but you do want to be rude and make a judgement nonetheless.
Your problem is your own inability or unwillingness to set aside a few moments to read the - in my opinion reasonably clear - definitions in the linked articles. Don't take it out on others.
You are not entitled to an easily grokkable explanation, tailor made for your exact level of knowledge.
The Tsetlin Machine being a specialized method, it makes sense that it beats neural networks in its home domain (decisions from potentially noisy binary inputs).
Paper: https://arxiv.org/abs/1804.01508
Code: https://github.com/cair/TsetlinMachine