## Machine Learning – So, Explain Yourself

We humans are not giving up without a fight. Though computers are clearly outthinking us (see AlphaGo, for example), we aren’t listening. Cynthia Rudin, my latest hero, describes the issue in this video. Humans need reasons. In the paper “A Bayesian Framework for Learning Rule Sets for Interpretable Classification” (Journal of Machine Learning Research, 2017), Tong Wang, Cynthia Rudin, Finale Doshi-Velez, Yimin Liu, Erica Klampfl, and Perry MacNeille devised a machine learning method that produces compact, understandable rules defining a specified kind of success. This post is a report on my experiments with the accompanying program, which was posted here on GitHub.

I first used their tic-tac-toe example to get all the Python parts running. The output, as I found out, was a set of rules defining what a win looked like. Then I decided to see if the method could generate the rules of the game of Set in the same way. Set is played with a deck of cards. Each card has a symbol or symbols with four characteristics (number, shape, color, and fill type), each with three different possibilities. A Set is three cards that, for each of the four characteristics, are either all the same or all different. To make the resulting rules from the program a little easier to understand, I chose to do the simulation with just three characteristics called a, b and c, with three possibilities 0, 1 or 2 for each.
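To make the simplified game concrete, here is a minimal sketch of the three-characteristic deck and the Set test. The names `DECK` and `is_set` are my own for illustration and are not taken from the authors’ program; the sketch only assumes the encoding described above (cards as triples of 0, 1 or 2).

```python
from itertools import combinations, product

# Hypothetical names for illustration; not from the authors' repository.
VALUES = (0, 1, 2)

# Each card is a tuple (a, b, c), so the simplified deck has 3**3 = 27 cards.
DECK = list(product(VALUES, repeat=3))


def is_set(deal):
    """Return True if each characteristic is all the same or all different.

    Across three cards, a characteristic has 1 distinct value (all same),
    3 distinct values (all different), or 2 (neither).
    """
    return all(len(set(values)) in (1, 3) for values in zip(*deal))


# Every unordered deal of three distinct cards.
deals = list(combinations(DECK, 3))
n_sets = sum(is_set(deal) for deal in deals)

print(len(DECK), len(deals), n_sets, f"{n_sets / len(deals):.1%}")
```

Under these assumptions there are 2,925 possible deals, of which 117 are Sets, so Sets make up 4.0% of all deals.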

After messing around with random deals for a while, I found that the proportion of natural hits (about 4%) was too low for the program to handle. In my struggle, since nothing seemed to be working, I added logical summaries to the input data columns. In essence, I preprocessed the information. Each deal of three cards was now accompanied by six columns describing whether each of the characteristics was all the same or all different. This felt like cheating: since I knew what I was looking for, I set up columns that summarized those exact traits.

For a while, I experimented with just this summary information. I started by using all permutations of the deals, then switched to all combinations, since the program did just as well without the redundancy. The four percent success rate was still not enough for the program to learn the distinctions, so I turned the problem on its head and defined hits as non-Sets. This led quickly to a succinct set of rules. There were three rules for non-Sets, to quote: “(‘cAllDifferent_neg’, ‘cAllSame_neg’) (‘aAllDifferent_neg’, ‘aAllSame_neg’) (‘bAllDifferent_neg’, ‘bAllSame_neg’)”. This translates as: characteristic c is neither all different nor all the same, or characteristic a is neither all different nor all the same, or characteristic b is neither all different nor all the same. In other words, a deal is a non-Set as soon as any one characteristic fails to be all the same or all different. This is exactly how a human might describe the rules of Set.

Next I added columns for the specific characteristics on each card. The program properly ignored the more detailed information and fell back on the summary columns, since they gave a more compact set of rules. Finally, I eliminated the summary columns. The resulting list of rules was longer than it needed to be and had evident overlap. It was tricky to parse, i.e., to explain to myself.

Anyway, my learning curve has flattened out. Now I want to explore real data. This is in the works.
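As a sanity check on the three quoted rules, the sketch below rebuilds the six summary columns and their `_neg` companions for every deal, then tests whether “any rule fully satisfied” matches the non-Sets exactly. The column names come from the quoted output; everything else (`summary_columns`, `predicted_non_set`, and the way I construct the negated columns) is my own assumption about how such columns could be built, not the authors’ code.

```python
from itertools import combinations, product

# Hypothetical helpers for illustration; not from the authors' repository.
NAMES = ("a", "b", "c")
DECK = list(product((0, 1, 2), repeat=3))


def is_set(deal):
    """True if each characteristic is all the same or all different."""
    return all(len(set(values)) in (1, 3) for values in zip(*deal))


def summary_columns(deal):
    """Build the six summary columns plus a negated copy of each."""
    row = {}
    for name, values in zip(NAMES, zip(*deal)):
        distinct = len(set(values))
        row[f"{name}AllSame"] = distinct == 1
        row[f"{name}AllDifferent"] = distinct == 3
    # Assumed construction of the '_neg' columns: the logical negation.
    row.update({f"{col}_neg": not val for col, val in list(row.items())})
    return row


# The three rules quoted in the text. A deal is predicted to be a non-Set
# if ANY rule has ALL of its conditions true.
RULES = [
    ("cAllDifferent_neg", "cAllSame_neg"),
    ("aAllDifferent_neg", "aAllSame_neg"),
    ("bAllDifferent_neg", "bAllSame_neg"),
]


def predicted_non_set(row):
    return any(all(row[col] for col in rule) for rule in RULES)


deals = list(combinations(DECK, 3))
# A mismatch is a deal where the rules and the true definition disagree,
# i.e. predicted non-Set but really a Set, or the reverse.
mismatches = sum(
    predicted_non_set(summary_columns(deal)) == is_set(deal) for deal in deals
)
print(mismatches)
```

With this encoding the rules disagree with the true definition on zero deals, which is consistent with the reading that the three rules together are the complement of the definition of a Set.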

What have I learned? 1) Adding negation columns saves time and gives fewer, shorter rules. 2) Looking for rules that define non-hits is sometimes more efficient. 3) It is sometimes useful to preprocess the data into new logical columns that partially summarize it. 4) Using someone else’s program without understanding every line leaves a residue of uncertainty.

Also, a shout-out to the Jupyter notebook system, which made working with Python much easier and more organized. This type of notebook is particularly useful for the kind of casual experimenting that I did. Each time, I copied portions of the program’s run results and pasted them with some comments into a new HTML cell as documentation. The result was a crude narrative, which I drew on for the above.