So what is this? The boxes span the 25th to 75th percentiles of the data in each time bin, and the red line is the median (the middle value in the data, not the average). The “whiskers” extend to the most extreme readings that are not classed as outliers (under the usual box-plot convention, that means anything within 1.5 times the box height beyond the box edges), and the red “+” symbols are the outliers that fall beyond that range. Outliers still figure into the median and average; they are just unusually far from the bulk of the readings. I’ve added a couple of extras to the regular box plot: the first is a red-and-green dashed line that shows the average of all readings, while the curve with the “x” markers shows the average of each bin.
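For anyone who wants to see the numbers behind one box, here is a minimal sketch in Python. It assumes the common 1.5 × IQR rule for whiskers and outliers, which may not be exactly what my plotting tool does, and the function names (`box_stats`, `bin_by_hour`) and the readings are made up purely for illustration.

```python
from collections import defaultdict
from statistics import mean, median, quantiles


def box_stats(values):
    """Summarize one bin the way a conventional box plot draws it.

    Assumes the common 1.5 * IQR rule for whiskers and outliers.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if lo_fence <= v <= hi_fence]
    return {
        "q1": q1,
        "median": median(values),
        "q3": q3,
        # Whiskers stop at the most extreme readings inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        # Everything outside the fences is drawn as a "+" marker.
        "outliers": sorted(v for v in values if v < lo_fence or v > hi_fence),
        # The per-bin average (the "x" curve in the plot).
        "mean": mean(values),
    }


def bin_by_hour(readings):
    """Group (hour_of_day, bg) pairs into one list of BG values per hour."""
    bins = defaultdict(list)
    for hour, bg in readings:
        bins[hour].append(bg)
    return dict(bins)


# Made-up readings as (hour of day, BG value); not real data.
readings = [(12, 90), (12, 95), (12, 100), (12, 105), (12, 110), (12, 250)]
stats = box_stats(bin_by_hour(readings)[12])
print(stats)
```

In this made-up bin the single high reading ends up as an outlier, and it drags the bin's average above its median, which is the same pattern as medians sitting below the overall average line.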
What does this say to me? Most of the medians are below the daily average, which means that most of the time my BG is pretty good, but I have a lot of “outlier” highs pulling the average up, and those are what I should work on. Also, the time right around lunch is the low time of my day, while the late afternoon (from about 3:00 to 6:00 PM) tends to be my highest. Statistically speaking, I seem to be most consistent in the evenings.
What next? While I was walking off a hypo during today’s long run, I started thinking about how to extract “interesting” things from the data I’ve aggregated. I need to figure out how to programmatically ask for things like “show me all of the data from the five hours before a long run in the afternoon” or “show me what happened on days when I bolused two to three hours before starting exercise.” I can think of “good” ways that are quite complicated, but I need to consider whether there’s a way to do some natural-language processing so that I can actually write something “human” instead of forcing people (myself included) to chain lots of conditionals and requests together. (Of course, the complicated way might be the first step.)
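As a rough idea of what the “complicated” first step might look like, here is a minimal Python sketch of the first query. Everything in it is an assumption for illustration: the function name `readings_before`, the event dictionaries with `kind` and `start` keys, the definition of “afternoon” as noon to 6:00 PM, and the sample data, which is invented.

```python
from datetime import datetime, timedelta


def readings_before(readings, events, kind, hours,
                    earliest_hour=12, latest_hour=18):
    """Map each matching event's start time to the readings taken in the
    `hours` before it.

    An event matches when its kind equals `kind` and it starts at or after
    `earliest_hour` and before `latest_hour` (a stand-in for "afternoon").
    """
    results = {}
    for event in events:
        start = event["start"]
        if event["kind"] != kind:
            continue
        if not (earliest_hour <= start.hour < latest_hour):
            continue
        window_start = start - timedelta(hours=hours)
        results[start] = [
            (t, bg) for t, bg in readings if window_start <= t < start
        ]
    return results


# Invented sample data: (timestamp, BG value) pairs and logged events.
readings = [
    (datetime(2024, 1, 6, 9, 0), 120),
    (datetime(2024, 1, 6, 11, 30), 105),
    (datetime(2024, 1, 6, 13, 0), 140),
    (datetime(2024, 1, 6, 15, 30), 80),
]
events = [
    {"kind": "long_run", "start": datetime(2024, 1, 6, 15, 0)},
    {"kind": "long_run", "start": datetime(2024, 1, 7, 8, 0)},  # morning run
]

# "Show me all of the data from the five hours before a long run in the afternoon."
result = readings_before(readings, events, kind="long_run", hours=5)
for start, window in result.items():
    print(start, [bg for _, bg in window])
```

Each extra condition (bolus timing, day of week, and so on) would be another filter chained on like this, which is exactly the kind of thing a more “human” query layer would have to hide.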
Figuring that out is going to take a while. In the meantime, I need to get to work on adding the currently unrecorded data to the mix and figuring out how to display just a day or so of data.