Day 25: Use sort to restore order to your data visualizations
In MATLAB, the outputs of sort and unique go beyond what their names suggest. They are both valuable tools for data analysis.
Ordering and ranking
Sometimes you want the order of your data rather than the sorted values themselves, for example to sort another data set in the same way. Other times you want to convert your data into ranks. The optional second output of sort helps with both.
The second output of sort (y below) holds the sort indices, and it can be used to reorder another vector of the same size.
input = [10,1,35,2,10];
other_data = [0,1,0,1,0];
[x,y] = sort( input );
sorted_other = other_data(y);
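To make the outputs concrete, here is a small sketch that prints what the sort indices look like for the same numbers. The variable names (vals, labels, idx) are illustrative only and are chosen to avoid reusing the name input, which is also a built-in MATLAB function.

```matlab
% Same numbers as in the example above, under illustrative names.
vals   = [10,1,35,2,10];
labels = [0,1,0,1,0];

% The second output of sort holds the positions of the sorted values
% in the original vector. Tied values keep their original order.
[sorted_vals, idx] = sort(vals);

disp(idx)                              % 2 4 1 5 3
disp(isequal(vals(idx), sorted_vals))  % 1: indexing with idx reproduces the sort
disp(labels(idx))                      % 1 1 0 0 0
```

Indexing any equally sized vector with idx puts it in the same order as the sorted values.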
Why would I ever want to do this?
The simplest case is when you have a series of trials (or observations) over time (or across a set of different variables). You may want to sort these trials by their mean value. Here’s the simple way to do it with sort.
First, start off with some artificial data. This (highly artificial) data simulates something like a set of neurons having activity around a certain time point.
rng(5)
Nevents = 100;
y1 = accumarray( [randi(100,Nevents,1),50+randi(10,Nevents,1)], 0.1*randperm(Nevents), [100,100] ) + rand(100,100);
We’ll want to sort by the mean of each row.
mean_y1 = mean(y1,2);
[~,sorted] = sort(mean_y1);
y2 = y1(sorted, :);
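The same row-sorting pattern is easier to check by hand on a tiny matrix. The following is a minimal sketch with a made-up 3-by-4 matrix M (not part of the simulated data above), where each row stands in for one trial.

```matlab
% Hypothetical 3-by-4 matrix; each row stands in for one trial.
M = [5 6 7 8;
     1 1 1 1;
     3 2 3 4];

row_means = mean(M, 2);          % 6.5, 1, 3
[~, order] = sort(row_means);    % order = [2; 3; 1]
M_sorted = M(order, :);          % rows reordered by ascending mean

disp(order.')                        % 2 3 1
disp(issorted(mean(M_sorted, 2)))    % 1: row means are now ascending
```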
Next, let’s show the before and after:
figure('color','w');
subplot(1,2,1); imagesc(y1); title('Unsorted');
subplot(1,2,2); imagesc(y2); title('Sorted');
You can also use unique to rank your data: its third output (call it z) gives the rank of each element, with tied values sharing the same rank. Don’t use these values to reorder your data the way you would use the second output of sort; they are ranks, not sort indices.
If your data has no ties, the third output of unique is exactly the rank of each element in your vector. If you have ties, tied values share a rank, so if you need every element to have its own rank you need another solution.
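As a minimal sketch of the third output of unique described above, the snippet below ranks the same five numbers used earlier. The names vals, u, and z are illustrative; the reshaping with (:).' is only there so the result prints as a row regardless of the orientation unique returns.

```matlab
% Same numbers as the earlier example, under an illustrative name.
vals = [10,1,35,2,10];

% u holds the sorted unique values; z maps each element of vals
% to its position in u, which acts as a rank with shared ties.
[u, ~, z] = unique(vals);
z = z(:).';                   % force a row vector for display

disp(u)                       % 1 2 10 35
disp(z)                       % 3 1 4 2 3
disp(isequal(u(z), vals))     % 1: z indexes into u, not into vals
```

Both 10s receive rank 3 and the largest value receives rank 4 rather than 5, which is why this approach only gives one rank per element when there are no ties.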
Can I get a ranking of the data without ties?
Yes, somewhat awkwardly, using sort twice.
input = [10,1,35,2,10];
[x,y] = sort( input );
[~,rank_no_ties] = sort( y );
The variable rank_no_ties will contain the rank of each element in input, so that the smallest value gets 1 and the largest value gets 5. The two tied 10s get the ranks 3 and 4, in the order in which they appear.
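The second call to sort works because it inverts the permutation returned by the first one. As a sketch, the snippet below compares the double sort with an alternative that builds the inverse by direct assignment; the names vals, idx, rank_a, and rank_b are illustrative, and the assignment form is a swapped-in equivalent rather than the method used in this post.

```matlab
vals = [10,1,35,2,10];

[~, idx] = sort(vals);          % idx = [2 4 1 5 3]

% Approach from the text: sort the sort indices.
[~, rank_a] = sort(idx);

% Alternative: place rank k at the position of the k-th smallest element.
rank_b = zeros(size(idx));
rank_b(idx) = 1:numel(idx);

disp(rank_a)                    % 3 1 5 2 4
disp(isequal(rank_a, rank_b))   % 1: both give the same ranks
```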
For more information about the concept of ranking and its advantages, check out the post from Day 20.