Machine Learning – So, Explain Yourself

We humans are not giving up without a fight. Though computers are clearly outthinking us (see AlphaGo, for example), we aren’t listening. Cynthia Rudin, my latest hero, describes the issue in this video. Humans need reasons. In the paper “A Bayesian Framework for Learning Rule Sets for Interpretable Classification” (Tong Wang, Cynthia Rudin, Finale Doshi-Velez, Yimin Liu, Erica Klampfl, and Perry MacNeille, Journal of Machine Learning Research, 2017), she and her co-authors devised a machine learning method that produces a compact, understandable set of rules defining a specified notion of success. This post is a report on my experiments with the accompanying program posted here on GitHub.

I first used their tic-tac-toe example to get all the Python parts running. The output, as I found out, was a set of rules defining what a win looked like. Then I decided to see whether the method could generate the rules of the game of Set in the same way. Set is played with a deck of cards. Each card shows symbols with four characteristics – number, shape, color, and fill type – each of which has three possible values. A set is three cards in which, for every one of the four characteristics, the values are either all the same or all different. To make the program’s resulting rules a little easier to understand, I ran the simulation with just three characteristics, called a, b and c, each with three possible values 0, 1 or 2.
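To make the setup concrete, here is a minimal sketch (my own code, not part of the authors’ program) of the simplified game: a card is a triple of characteristic values, and three cards form a set when every characteristic is all the same or all different.

```python
from itertools import product
import random

# A simplified "card" is a triple (a, b, c), each characteristic taking
# the value 0, 1, or 2.  The full simplified deck has 3**3 = 27 cards.
DECK = list(product(range(3), repeat=3))

def is_set(cards):
    """Three cards form a set if, for every characteristic, the three
    values are either all the same or all different."""
    return all(len(set(values)) in (1, 3) for values in zip(*cards))

# Deal three random cards and check them.
deal = random.sample(DECK, 3)
print(deal, is_set(deal))
```

Checking every three-card combination of this 27-card deck gives 117 sets out of 2,925 deals, which is the roughly 4% rate of natural hits mentioned below.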

After messing around with random deals for a while, I found that the proportion of natural hits (about 4%) was too low for the program to handle. In my struggle, since nothing seemed to be working, I added logical summaries to the input data columns. In essence, I preprocessed the information. Each deal of three cards was now accompanied by six columns recording whether each characteristic was all the same or all different. This felt like cheating: since I knew what I was looking for, I had set up columns that summarized exactly those traits.

For a while, I experimented with just this summary information. I had started by using all permutations of the deals, then switched to all combinations, since the program did just as well without the redundancy. The four percent success rate was still not enough for the program to learn the distinctions, so I turned the problem on its head and defined hits as non-sets. This led to a succinct set of rules, arrived at quickly. There were three rules for non-sets, to quote: “(‘cAllDifferent_neg’, ‘cAllSame_neg’) (‘aAllDifferent_neg’, ‘aAllSame_neg’) (‘bAllDifferent_neg’, ‘bAllSame_neg’)”. This translates as: characteristic c is neither all different nor all the same, or characteristic a is neither all different nor all the same, or characteristic b is neither all different nor all the same. This is exactly how a human might describe the rules of Set.

Next I added columns for the specific characteristic values on each card. The program properly ignored the more detailed information and kept to the summary columns, since they gave a more compact set of rules. Finally, I eliminated the summary columns. The resulting list of rules was longer than it needed to be and had evident overlap; it was tricky to parse, i.e., to explain to myself. Anyway, my learning curve has flattened out. Now I want to explore real data. That is in the works.
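For reference, here is roughly what that preprocessing looked like – a sketch only, with column names of my own choosing (the aAllSame/aAllDifferent style mirrors the quoted rules, but the exact input format the Bayesian rule-set program expects may differ).

```python
import pandas as pd
from itertools import combinations, product

DECK = list(product(range(3), repeat=3))   # the simplified 27-card deck

rows = []
for cards in combinations(DECK, 3):        # all combinations, no redundant permutations
    row = {}
    for name, values in zip("abc", zip(*cards)):
        distinct = len(set(values))
        # the six summary columns: is this characteristic all the same / all different?
        row[name + "AllSame"] = int(distinct == 1)
        row[name + "AllDifferent"] = int(distinct == 3)
    # the problem turned on its head: a "hit" (label 1) is a NON-set
    row["label"] = int(not all(row[n + "AllSame"] or row[n + "AllDifferent"]
                               for n in "abc"))
    rows.append(row)

data = pd.DataFrame(rows)
print(data["label"].mean())   # about 0.96 – non-sets are the common case
```

The _neg names that appear in the quoted rules refer to negated copies of these summary columns; there is a sketch of that after the lessons below.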

What have I learned? 1) Adding negation columns saves time and gives fewer, shorter rules. 2) Looking for rules that define non-hits is sometimes more efficient. 3) It is sometimes useful to preprocess the data into new columns that logically summarize part of it. 4) Using someone else’s program without understanding every line leaves a residue of uncertainty.
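To illustrate the first lesson, this is the kind of thing I mean by negation columns – a hypothetical sketch that simply adds a negated copy of each binary feature. The _neg naming matches the quoted rules; whether you add these columns yourself or the program does it for you is exactly the sort of detail lesson 4 is about.

```python
# Hypothetical sketch: add a negated copy of every binary feature column,
# so that learned rules can express "not X" conditions directly.
# (Assumes the DataFrame `data` built in the earlier sketch.)
feature_cols = [c for c in data.columns if c != "label"]
for col in feature_cols:
    data[col + "_neg"] = 1 - data[col]
```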

Also, a shout out to the Jupyter notebook system which made working with Python much easier and more organized. This type of notebook is particularly useful for the kind of casual experimenting that I did. Each time, I copied portions of the program’s run results and pasted them with some comments into a new HTML cell as documentation. The result was a crude narrative which I drew on for the above.

