In previous posts I have been discussing how to treat rubric data as sample data. The next post along these lines is intended to show how to apply various Bayesian methods to such data. My plan was to contrast these methods using some type of graph. Thus the following digression.
Rubric data is commonly shown with a bar graph like this.
Standard Bar Plot for Rubrics
Note the gaps. I prefer a histogram-like bar graph like this.
Bar Graph with No Gaps
I included the normal curve to make the following point. If the scale on the horizontal axis has ordered meaning, then the bars of the histogram, if they are the same width, represent parts of a whole: think of placing the bars end to end. This is what we are used to seeing in histograms, where area = probability. The shape can be deceptive (see this post) but the idea gets across.
Yet overall I prefer a stacked horizontal bar graph like so.
Stacked Horizontal Bar Plot
Here the redder the “badder” and the greener the “gooder” or, for the color blind, “lefter” is “badder” and “righter” is “gooder.” One loses the ability to read frequencies directly, but their relative sizes are easily seen. Comparisons work smoothly, as can be seen in this graph from a forthcoming assessment committee report.
Multiple Stacked Horizontal Bars for Comparison
The problem with any of these choices is how to show prediction intervals (PIs). This is a clunky graph from a previous post.
Vertical Bars with Errors on Each Bar
Now the rest of this saga.
The beauty of the Bayesian method, as I gleaned from Statistical Rethinking by Richard McElreath, is that one can get a distribution for each rubric category by repeatedly sampling from the posterior. Means, PIs, HPDIs (prediction intervals with minimum width), and densities all come naturally from those samples. I was proud of this graph for 15 minutes.
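To make the sampling idea concrete, here is a minimal sketch using only the Python standard library. It is not the model behind the graphs in this post: it assumes a flat Dirichlet prior over four rubric categories, and the counts (5, 12, 15, 7) are made up for illustration. The function names `sample_dirichlet`, `percentile_interval`, `hpdi`, and `summarize` are my own, not from Statistical Rethinking.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def percentile_interval(values, prob=0.95):
    """Central interval with equal tail mass on each side."""
    s = sorted(values)
    tail = (1.0 - prob) / 2.0
    lo = s[int(tail * (len(s) - 1))]
    hi = s[int((1.0 - tail) * (len(s) - 1))]
    return lo, hi


def hpdi(values, prob=0.95):
    """Narrowest interval that contains a fraction `prob` of the samples."""
    s = sorted(values)
    k = int(math.ceil(prob * len(s)))  # number of samples inside the interval
    start = min(range(len(s) - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]


def summarize(counts, n_samples=4000, prob=0.95, seed=1):
    """Posterior mean, PI, and HPDI for each category proportion."""
    rng = random.Random(seed)
    alpha = [c + 1 for c in counts]  # assumed flat prior: add 1 to each count
    draws = [sample_dirichlet(alpha, rng) for _ in range(n_samples)]
    summary = []
    for j in range(len(counts)):
        column = [d[j] for d in draws]
        summary.append({
            "mean": sum(column) / n_samples,
            "pi": percentile_interval(column, prob),
            "hpdi": hpdi(column, prob),
        })
    return summary


# Hypothetical rubric counts for four categories (n = 39).
for j, row in enumerate(summarize([5, 12, 15, 7])):
    print(j, round(row["mean"], 3), row["pi"], row["hpdi"])
```

Each category gets its own column of posterior draws, so any summary (mean, interval, density) is just a computation on that column.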
Density Plots for Each Category
So proud I showed it to my wife. It is easy to read from afar, with nice thick lines and color coding. But it is deceptive (and mistitled). The overlap of the density plots has no meaning. The four graphs are just mushed together. The overlaps will be large or small depending on the sharpness of the density plots and also on their position. For instance, two categories may have the same estimate for the mean, in which case their curves sit nearly on top of each other.
I was also proud of this graph for a while.
Rubric Prediction Interval Plots
This is similar in style to those in Statistical Rethinking except for larger plotted points and no shading extending between the bars for the PIs. I particularly liked the contrast between the posterior sampling mean and the data, shown with large open and filled disks. Yet the information seems to float in space and there is no sense that this is frequency data.
Maybe I could modify the horizontal stacked bar graph. I tried several ways of showing the “fuzz,” the uncertainty between the categories, using transparent coloring and shades of gray. Not much success. Here is an example with the grey sections representing uncertainty (95%).
Horizontal Bars with Grey Errors
The grey sections dominate the graph since the sample size was so small. They are also a bit deceptive, since a smaller percentage for one category will necessarily cause a larger value in another category.
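The trade-off between categories can be checked numerically. The sketch below again assumes a flat Dirichlet prior and made-up counts (5, 12, 15, 7), neither of which comes from the graphs in this post. It draws posterior samples and computes the correlation between every pair of category proportions. Because the proportions must sum to one, each pair comes out negatively correlated, which is exactly what independent grey bands fail to show. The helper names are hypothetical.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def category_correlations(counts, n_samples=4000, seed=1):
    """Correlation between posterior draws for each pair of categories."""
    rng = random.Random(seed)
    alpha = [c + 1 for c in counts]  # assumed flat prior
    draws = [sample_dirichlet(alpha, rng) for _ in range(n_samples)]
    columns = [[d[j] for d in draws] for j in range(len(counts))]
    result = {}
    for i in range(len(counts)):
        for j in range(i + 1, len(counts)):
            result[(i, j)] = pearson(columns[i], columns[j])
    return result


# Hypothetical counts; every printed correlation should be negative.
for pair, r in category_correlations([5, 12, 15, 7]).items():
    print(pair, round(r, 3))
```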
A cool feature of the Bayesian tools in Statistical Rethinking is the ability to sample from the posterior distribution. I decided to plot four thousand samples on one graph. I used forty narrow stacked bars, one above the other, and iterated one hundred times with decreasing transparency (40 × 100 = 4,000 samples). This gave a fair picture of the fuzz but no ability to visually quantify the sampling error. The graphs below show how the uncertainty decreases as the sample size of the original data grows.
Stacked Bar Errors by Sampling: Original Data n=39
Stacked Bar Errors by Sampling: Original Data n=390
Stacked Bar Errors by Sampling: Original Data n=3900
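The layout behind these sampling graphs can be sketched as follows. This is a reconstruction from the description above, not the code that produced the figures. It assumes a flat Dirichlet prior, made-up counts, and an opacity schedule that rises from layer to layer (my reading of "decreasing transparency"). The `draw` function expects a Matplotlib `Axes` object and uses its `barh` method; everything else is standard library. All function names are hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def stacked_segments(probs):
    """Turn one probability vector into (left, width) pairs for a stacked bar."""
    segments, left = [], 0.0
    for p in probs:
        segments.append((left, p))
        left += p
    return segments


def build_layers(counts, n_rows=40, n_layers=100, seed=1):
    """n_layers overlays, each holding n_rows stacked bars (one draw per bar)."""
    rng = random.Random(seed)
    alpha = [c + 1 for c in counts]  # assumed flat prior
    return [
        [stacked_segments(sample_dirichlet(alpha, rng)) for _ in range(n_rows)]
        for _ in range(n_layers)
    ]


def draw(layers, colors, ax):
    """Overlay the layers on a Matplotlib Axes, later layers more opaque."""
    n_layers = len(layers)
    for k, layer in enumerate(layers):
        # Assumed schedule: opacity rises linearly from 0.02 to 0.10.
        opacity = 0.02 + 0.08 * k / max(n_layers - 1, 1)
        rows = list(range(len(layer)))
        for j, color in enumerate(colors):
            lefts = [segments[j][0] for segments in layer]
            widths = [segments[j][1] for segments in layer]
            ax.barh(rows, widths, left=lefts, height=1.0,
                    color=color, alpha=opacity, linewidth=0)


# Usage, if Matplotlib is installed (counts and colors are illustrative):
#   import matplotlib.pyplot as plt
#   fig, ax = plt.subplots()
#   draw(build_layers([5, 12, 15, 7]), ["red", "orange", "yellowgreen", "green"], ax)
#   plt.show()
```

Multiplying the counts by 10 or 100 before calling `build_layers` mimics the larger sample sizes: the segment boundaries vary less from draw to draw, so the fuzzy bands between colors narrow.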
Gradually changing the transparency one hundred times is overkill, but I am out of ideas. So, at this point I will use the sampling graphs above to show fuzz and revert to a table to compare the various methods.