Day 28: Accentuate your data using best fit lines and shading
I used to agonize over adding best fit lines and shaded error plots to my data. There were occasionally custom .m files passed between people online or in the lab, but these would quickly break or fail to apply to my particular use case. By learning about two simple functions and a third less simple one, you can readily craft data visualizations to meet your unique needs.
Best fit lines the easy way
You could go through the hassle of using polyfit and polyval, leaving a handful of variables in their wake, as I did for years, but there’s no need if you are mainly concerned with displaying the line. Simply render your graph and create a new graphics object using the lsline function (least-squares line); alternatively, you can add a second line with a custom slope and intercept using refline.
Let’s get on to creating some artificial data.
Npts = 25; % Number of data points for each "x"
mydata = arrayfun( @(x) normrnd(5*x,x,Npts,1), [1:10], 'UniformOutput', false );
mydata = cell2mat( mydata ); % Semicolon keeps the 25-by-10 matrix from printing
This data has a linear trend built in: for each x, normrnd draws values with mean 5*x and standard deviation x, so both the center and the spread of the data grow with x. The output, mydata, has 25 rows (Npts) and 10 columns; each column is an “x” value with its corresponding data in the rows.
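A quick sanity check on the shape and the trend can be reassuring before plotting. The sketch below is an assumption-laden stand-in, not the code above: it builds similar data with randn instead of normrnd (so it runs without the Statistics and Machine Learning Toolbox), and the fixed seed and the names mu and sigma are mine.

```matlab
% Sketch: rebuild comparable data with randn and check its layout.
rng(1);                                   % fixed seed, only for repeatability (assumption)
Npts = 25;                                % number of data points for each "x"
mu    = repmat(5*(1:10), Npts, 1);        % mean of each column: 5*x
sigma = repmat(1:10,     Npts, 1);        % standard deviation of each column: x
mydata = mu + sigma .* randn(Npts, 10);   % same idea as normrnd(5*x, x, Npts, 1) per column

disp(size(mydata));                       % rows = Npts, columns = number of x values
colMeans = mean(mydata, 1);               % one average per "x" column
disp(colMeans);                           % should climb roughly in steps of 5
```

The column means will not land exactly on 5, 10, 15 and so on, since each is an average of only 25 noisy draws.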
We can display the data as a scatter plot. Note that since each “x” value has 25 entries, we make a custom “x” variable.
x = repmat([1:10],Npts,1);
s = scatter( x(:), mydata(:), 'k', 'filled', 'jitter', 'on');
The easy part is actually adding the line of best fit. Simply enter:
best_fit = lsline;
The variable best_fit is a graphics object, which has settings that you can modify using either set(best_fit,'Param',value) or using dot notation:
best_fit.LineWidth = 4;
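The two styles are interchangeable, which a small sketch on an ordinary line object can show. The handle name h and the plotted values are made up for illustration; dot notation on graphics handles assumes a reasonably recent MATLAB release (R2014b or later, as far as I know).

```matlab
% Sketch: set() and dot notation change the same properties.
figure;
h = plot(1:10, 5*(1:10));        % any line object works; h is a hypothetical name
set(h, 'LineWidth', 4);          % name-value style
h.LineStyle = '--';              % dot-notation style
widthNow = get(h, 'LineWidth');  % reading a value back with get()
styleNow = h.LineStyle;          % reading a value back with dot notation
```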
We can add another line with alternative slope and intercept using the refline function.
otherline = refline(4.5,.5);
% First input is slope (4.5); second is intercept (0.5)
% otherline.LineWidth = 4
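If you do need the numbers behind the line and not just the drawing, polyfit and polyval are still the tools for the job. Here is a minimal sketch under a few assumptions: the data is regenerated with randn so the block stands alone, and the names xs, ys, coeffs and yfit are mine.

```matlab
% Sketch: recover slope and intercept numerically, then draw the fitted line by hand.
rng(1);                                   % fixed seed, only for repeatability (assumption)
Npts = 25;
xs = repmat(1:10, Npts, 1);               % one column of identical x values per group
ys = 5*xs + xs .* randn(Npts, 10);        % mean 5*x, standard deviation x

coeffs = polyfit(xs(:), ys(:), 1);        % degree-1 fit: coeffs(1) = slope, coeffs(2) = intercept
yfit   = polyval(coeffs, [1 10]);         % fitted y at both ends of the x range

figure;
scatter(xs(:), ys(:), 'k', 'filled');
hold on;
plot([1 10], yfit, 'r', 'LineWidth', 2);  % the same kind of line lsline would draw
hold off;
```

The slope in coeffs(1) should come out near 5 for this data, though noise will nudge it a little on every run with a different seed.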
It’s already starting to come together.
The patch function is one of the least user-friendly functions in MATLAB. It is versatile: it can produce colorful lines, as we saw in Day x, or it can produce filled polygons (patches). Maybe in a future version of MATLAB, these functionalities will be divided into two separate functions. For now, we’ll look at how to input values to get the results you want.
Here’s how to create the input to patch if you want to generate a polygon bounded by two lines, such as one standard deviation below the average of each x’s dataset (y1) and one standard deviation above it (y2). The trick is to reverse the order of the second line (y2), which can be done using fliplr (see below), so that the vertices trace the outline of the polygon in one continuous loop.
You may want to study the diagram below to see exactly what this means.
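The vertex ordering is easier to see on a tiny example. The numbers below are made up purely for illustration, and the names xs3, lower, upper, px and py are mine.

```matlab
% Sketch: three x positions, a lower line and an upper line (made-up values).
xs3   = 1:3;
lower = [1 2 3];
upper = [3 4 5];

% Walk left to right along the lower line, then right to left along the upper line.
px = [xs3,   fliplr(xs3)];
py = [lower, fliplr(upper)];
disp([px; py]);                  % each column is one vertex, in drawing order

figure;
patch(px, py, 'b', 'FaceAlpha', 0.1, 'LineStyle', 'none');
```

Without the fliplr on the upper line, the outline would jump from the right end of the lower line back to the left end of the upper line and the polygon would cross over itself. As far as I know, patch closes the polygon on its own, so repeating the first vertex at the end is optional.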
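x = [1:10,10:-1:1,1]; % Left to right, then right to left, then back to the start
y1 = mean(mydata,1)-std(mydata,0,1); % Lower edge: mean minus one standard deviation
y2 = mean(mydata,1)+std(mydata,0,1); % Upper edge: mean plus one standard deviation
y = [ y1, fliplr(y2), y1(1)]; % Lower edge, reversed upper edge, closing vertex
hold on;
p = patch(x',y',repmat(10,21,1),'facealpha', 0.1,'linestyle','none' );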
If you’ve done this correctly, you should see something like:
You could also get rid of the scatter at this point, if you so choose! Simply enter:
delete(s)
Hope this post helped patch up some holes in your knowledge about plotting and shading.