Plotting simple bar charts

Problem

How do I plot a simple bar chart?

Solution

This is a Statistics Recipe because bar charts are the dominant way of presenting categorical data (at least in the life sciences). Thus, bar charts are typically the only graphical presentation to accompany formal significance tests such as t-tests or ANOVA.

Here we're going to use the bar command to make some simple charts. Let's say we've obtained data from 5 groups. Each has a different sample size, so we're storing them in a structure with one field per group. Let's calculate the means and plot them.

%Define a structure called "data" containing 5 groups, one per field.
%Each is composed of random numbers drawn from the normal distribution.
%We add different offsets to each group.
data.bob=randn(1,12)+0.66; 
data.alice=randn(1,15)+1.2; 
data.rufus=randn(1,8)-0.8; 
data.uma=randn(1,21)+1.4;
data.bozo=randn(1,10)+5;

%Let's loop through the fields of the structure and calculate the mean
%of each, then plot these. There's an elegant way to do this:
f=fieldnames(data); %makes a cell array of strings containing field names

mu=zeros(1,length(f)); %pre-allocate so values from earlier runs can't linger
for ii=1:length(f)
	mu(ii)=mean( data.(f{ii}) ); %note the parentheses around the string
end

%plot!
bar(mu)
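
The loop above can also be written in one line. The following is a minimal sketch, not part of the original recipe: it assumes structfun is available in your MATLAB version, and the structure "example" and its values are made up purely for illustration.

```matlab
%Small example structure (hypothetical values, for illustration only)
example.bob=[1 2 3];
example.alice=[4 5 6 7];
example.rufus=[-1 1];

%structfun applies a function to every field and returns a column vector,
%so we transpose it to get a row vector like the one the loop produces
muExample=structfun(@mean,example)';
```

Calling bar(muExample) would then plot the means exactly as before. The explicit loop is arguably easier to adapt when you need several statistics per field, which is why the rest of this recipe sticks with it.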

Ok, nice. But let's make the thing a little nicer: give it some labels and get rid of the default colour scheme.

H=bar(mu); %make the plot and keep the handle

set(H,'FaceColor',[1,1,1]*0.5,'LineWidth',2) %Fill bars in gray

%Add labels on x axis. We can use the "f" variable made above
set(gca,'XTickLabel',f)

Good, this is starting to look much better. Still one thing missing, though: error bars. Often people plot the standard deviation, so let's go with tradition and overlay that.

%Loop through our data structure and extract the SD:
sd=zeros(1,length(f)); %pre-allocate, as we did for the means
for ii=1:length(f)
	sd(ii)=std( data.(f{ii}) ); %note the parentheses around the string
end

%We know the means, so we have enough information to add the error bars
hold on %without this the current plot will be wiped when we start plotting

for ii=1:length(f)
   plot([ii,ii],[mu(ii)-sd(ii),mu(ii)+sd(ii)],'-k','LineWidth',2)
end

hold off

ylabel('Truffles per cubit') %Y-axis label

Very nice! Now it really looks like we know what we're doing. We have pretty bars, error bars, labels, the works.
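
If you'd rather not draw each error bar by hand, the built-in errorbar command can do the overlay in one call. This is a hedged sketch rather than the recipe's own method: the variables muExample and sdExample hold made-up numbers standing in for mu and sd, and the styling is set through the returned handle on the assumption that your version exposes the LineStyle, Color and LineWidth properties on it.

```matlab
%Hypothetical means and SDs, standing in for mu and sd from the recipe
muExample=[0.7 1.2 -0.8 1.4 5.0];
sdExample=[1.0 0.9 1.1 1.0 0.8];
x=1:length(muExample);

H=bar(x,muExample); %make the plot and keep the handle
set(H,'FaceColor',[1,1,1]*0.5,'LineWidth',2) %Fill bars in gray

hold on
E=errorbar(x,muExample,sdExample); %symmetric bars of +/- one SD
set(E,'LineStyle','none','Color','k','LineWidth',2) %hide the connecting line
hold off
```

Unlike the plain vertical lines drawn with plot, errorbar adds small caps at the ends of each bar, so choose whichever look you prefer.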

Discussion

As you can see, it doesn't take many lines of code to produce a respectable bar chart. However, there is more to do! We might want to overlay the results of a statistical test. We might want to use a different error interval (such as the standard error of the mean). We might even want to be daring and overlay the raw data. Look at the next recipe to see how to pimp your bar chart yet further.
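
As a taste of the standard error option, here is a minimal sketch of how it could be calculated. It uses the usual definition (the sample SD divided by the square root of the sample size). The structure "example" and its values are hypothetical and only there to keep the snippet self-contained.

```matlab
%Hypothetical data with unequal sample sizes
example.a=[1 2 3 4 5];
example.b=[2 4 6 8];

f=fieldnames(example); %cell array of strings containing field names
sem=zeros(1,length(f));

for ii=1:length(f)
	x=example.(f{ii});
	sem(ii)=std(x)/sqrt(length(x)); %SD divided by the square root of n
end
```

To plot standard errors instead of standard deviations, use sem in place of sd when drawing the error bars. Because each group has its own sample size, n must be taken per field rather than assumed constant.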

 
