Making simple box plots

Problem

How do I make a box plot in MATLAB?

Solution

A box plot is a useful non-parametric statistical plot. In other words, it summarises the data (median, quartiles, range and outliers) without making any assumptions about the underlying distribution. Box plots are especially useful when data are not normally distributed. Making a box plot in MATLAB is easy.

data=randn(10,5);
data(:,3)=data(:,3)+2;
boxplot(data)

Above, we generated a 10 by 5 matrix and shifted the 3rd column up by 2. MATLAB treats the matrix as 5 groups of data (the columns), each of which contains 10 observations (the rows). The boxplot command therefore creates 5 box plots. Let's say, however, that we want to customise our box plot. The plot is built up of individual line elements. If you call the command with one output argument, it returns the handles of the plot elements.

>> H=boxplot(data)
H =
  175.0013  176.0013  177.0013  178.0013  179.0013
  180.0013  181.0013  182.0013  183.0013  184.0013
  185.0013  186.0013  187.0013  188.0013  189.0013
  190.0013  191.0013  192.0013  193.0013  194.0013
  195.0013  196.0013  197.0013  198.0013  199.0013
  200.0013  201.0013  202.0013  203.0013  204.0013
  205.0013  206.0013  207.0013  208.0013  209.0013

Ouch! What do we do with all those handles? Well, there are 5 columns' worth of handles, so each column evidently relates to one box plot (we have 5 of those, too). To figure out what the handles in a column actually relate to, we can do:

>> get(H(:,3),'tag')
ans =
    'Upper Whisker'
    'Lower Whisker'
    'Upper Adjacent Value'
    'Lower Adjacent Value'
    'Box'
    'Median'
    'Outliers'

Ah... Now it's all making sense. We have used the "Tag" property of each handle, retrieved with the get command, to tell us what each handle is for. 'Median' is the 6th entry in that list, so the median bars live in row 6 of H. Let's try it and see if it works by changing the median bar of the 3rd box plot to a thick green line:

set(H(6,3),'color','g','linewidth',2)
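Hard-coding row 6 relies on the tag order staying the same. As a rough sketch of an alternative, the elements can be looked up by their tag with findobj. The variable names medians and boxes are only illustrative, and this assumes the default plot style, in which the 'Median' and 'Box' tags listed above are present.

```matlab
% Sketch: locate plot elements by tag instead of by row index.
data = randn(10,5);
data(:,3) = data(:,3) + 2;
figure;
boxplot(data);

% findobj searches the current axes for objects whose Tag matches.
medians = findobj(gca, 'Tag', 'Median');   % one handle per box
boxes   = findobj(gca, 'Tag', 'Box');      % one handle per box

% Restyle every median bar and every box outline in one call each.
set(medians, 'Color', 'g', 'LineWidth', 2);
set(boxes,   'Color', 'k');
```

The order in which findobj returns the handles should not be assumed to match the column order, so this approach suits changes applied to all boxes at once. To restyle a single box, indexing into H as shown earlier is the safer route.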

You now know both how to create and modify box plots in MATLAB!
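The matrix form used in this recipe needs every group to have the same number of observations. When group sizes differ, boxplot also accepts a data vector together with a grouping variable of the same length. The following is a minimal sketch with made-up groups a, b and c (hypothetical names and sizes, not from the recipe):

```matlab
% Three hypothetical groups with different numbers of observations.
a = randn(8,1);
b = randn(12,1) + 2;
c = randn(5,1) - 1;

% Stack all observations into one column vector ...
x = [a; b; c];
% ... and record which group each observation belongs to.
g = [ones(8,1); 2*ones(12,1); 3*ones(5,1)];

figure;
H = boxplot(x, g);   % one box per distinct value in g
```

As before, H has one column of handles per box, so the same get and set tricks apply.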

Discussion

Although bar charts are often used in place of box plots, possibly because they look less cluttered and are easier to take in quickly, box plots provide more information. For instance, they reveal features such as skewness, which a conventional bar chart will not show.
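To illustrate the point about skewness, here is a small sketch that plots the same two samples both ways. The data are invented for the illustration: one sample is roughly symmetric around 1, the other is a squared normal variate, which is right-skewed but also has a mean of about 1.

```matlab
n = 200;
symmetric = randn(n,1) + 1;    % roughly symmetric around 1
skewed    = randn(n,1).^2;     % right-skewed, with a mean of about 1

figure;
subplot(1,2,1);
bar(mean([symmetric skewed])); % bar chart shows only the two means
title('Bar chart of means');

subplot(1,2,2);
boxplot([symmetric skewed]);   % box plot shows spread, asymmetry and outliers
title('Box plot');
```

The two bars should come out at a similar height, whereas the second box should have its median well below the centre of the box and a long upper tail.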
