When you’re working with machine learning, deciding which algorithm performs best across multiple datasets can be quite challenging. Simply comparing performance metrics might not be enough; you need statistical methods to be confident. That’s where the Friedman Test and Critical Difference (CD) Diagrams come in.
My classmates and I faced this challenge firsthand when preparing a project presentation. We struggled to find a clear way to generate the diagram, so after finally figuring it out, I decided to share this guide to save others time.
In this article, you’ll find a Python script that performs this evaluation and visualization. You can also access the complete code in my GitHub gist.
I’ll also show you how to modify the code to use accuracy instead of the error rate. The script has been tested on Python 3.8 and above.
The Python script does three main things:
- Performs the Friedman Test to statistically evaluate performance differences.
- Creates a ranking table comparing the algorithm scores.
- Generates and saves a PNG image of the Critical Difference Diagram and the ranking table.
Critical Difference Diagram generated
In the diagram, algorithms connected by a horizontal bar are not significantly different from each other according to the statistical test. Algorithms with lower average ranks (positioned further right) performed better. The ranking is the same whether you use error rate or accuracy as the performance metric.
Ranking table generated
This table shows the error rates of each algorithm across all datasets. Each cell contains the error rate along with its ranking in parentheses, where 1 is the best and the worst possible rank equals the number of algorithms compared (15 in my case).
At the bottom, you’ll find the rank sums and average rankings for each algorithm, for better overall comparison.
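To make the rank sums and average ranks concrete, here is a minimal sketch with made-up error values (the dataset and algorithm names are placeholders). It uses the same `rank(axis=1, method='min')` call as the full script below, where 1 is the best rank within each dataset:

```python
import pandas as pd

# Hypothetical error rates: rows are datasets, columns are algorithms
df = pd.DataFrame(
    {'A': [0.10, 0.12], 'B': [0.15, 0.11], 'C': [0.30, 0.25]},
    index=['dataset1', 'dataset2'],
)

# Rank within each dataset row: 1 = lowest error = best
ranks = df.rank(axis=1, method='min')
print(ranks)
print("Rank sums:")
print(ranks.sum())   # totals per algorithm
print("Average ranks:")
print(ranks.mean())  # these averages are what the CD diagram is built from
```

Here algorithm C always has the highest error, so it gets rank 3 on both datasets and the worst average rank.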
I edited the original image and removed some columns for better readability here.
Why Use the Friedman Test and Critical Difference Diagram?
The Friedman test is a non-parametric statistical test designed to detect differences between multiple algorithms across various datasets. It ranks algorithms based on their performance, helping you understand if differences in performance are genuinely significant or just due to chance.
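As a quick standalone illustration (with invented numbers, before the full implementation below), `scipy.stats.friedmanchisquare` takes one sample per algorithm, each containing that algorithm's measurements across all datasets:

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical error rates: 4 datasets (rows) x 3 algorithms (columns)
errors = np.array([
    [0.10, 0.15, 0.30],
    [0.12, 0.11, 0.25],
    [0.08, 0.14, 0.28],
    [0.09, 0.13, 0.27],
])

# Transpose so each argument is one algorithm's errors across the datasets
stat, p = friedmanchisquare(*errors.T)
print(f"statistic={stat:.3f}, p-value={p:.4f}")
if p < 0.05:
    print("At least one algorithm's performance differs significantly.")
```

A small p-value only tells you that *some* difference exists; the CD diagram is what shows *which* algorithms differ.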
The Critical Difference Diagram visually presents these rankings. It clearly shows which algorithms perform similarly and which are significantly better or worse, making it easy to interpret results at a glance. This diagram is particularly useful when comparing numerous algorithms across multiple datasets.
Preparing Your Data
Now for the implementation, you’ll need your data structured like this:
- `Datasets`: names of your datasets (e.g., MNIST, Fashion-MNIST).
- `Algorithms`: names of the algorithms you’re evaluating. Keep these ordered consistently.
- `Performance (Error)`: one list of error rates per dataset, aligned with the `Algorithms` list.
For example:
```python
data = {
    'Datasets': ['MNIST', 'Fashion-MNIST', ...],
    'Algorithms': ['NaiveBayes', 'IBk', ..., 'RandomForest'],
    'Performance (Error)': [
        ['30.34%', '3.09%', ..., '88.65%'],  # MNIST
        ['36.72%', '14.35%', ..., '90%'],    # Fashion-MNIST
        # Other datasets...
    ]
}
```
Make sure the error rates are listed in the same order as their corresponding algorithms. For example, if NaiveBayes is the first algorithm in the list, its performance values should always appear first in each dataset’s row.
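Because `zip()` silently stops at the shorter list, a misaligned row would corrupt the rankings without any error message. A simple pre-flight check you might add (shown here with shortened, hypothetical lists) catches this early:

```python
# Hypothetical sanity check: each dataset row must hold exactly one value per algorithm
algorithms = ['NaiveBayes', 'IBk', 'RandomForest']
performance = [
    ['30.34%', '3.09%', '3.51%'],    # MNIST
    ['36.72%', '14.35%', '11.92%'],  # Fashion-MNIST
]
for i, row in enumerate(performance):
    if len(row) != len(algorithms):
        raise ValueError(f"Row {i} has {len(row)} values, expected {len(algorithms)}")
print("All rows aligned")
```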
If you prefer to use accuracy instead of error rates, you can either replace the values in the Performance field with accuracy scores or simply subtract the error rates from 1. I’ll also demonstrate how to do this right after the implementation.
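As a quick preview of that conversion, with a few hypothetical error strings:

```python
import pandas as pd

# Hypothetical example: turn error-rate strings into accuracy fractions
errors = pd.Series(['30.34%', '3.09%', '0%'])
error_frac = errors.str.rstrip('%').astype(float) / 100
accuracy = 1 - error_frac
print(accuracy.round(4).tolist())  # [0.6966, 0.9691, 1.0]
```

When plotting accuracies instead of errors, remember to call `plot_critical_difference` with `lower_better=False`, since higher accuracy is better.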
Python Implementation
Here’s the Python code, also available in this gist on GitHub:
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import friedmanchisquare
from aeon.visualisation import plot_critical_difference

data = {
    'Datasets': [
        'MNIST', 'Fashion-MNIST', 'e1-Spiral', 'e1-Android',
        'e2-andorinhas', 'e2-chinese', 'e3-user', 'e3-ecommerce',
        'e4-wine', 'e4-heart', 'e5-mamiferos', 'e5-titanic'
    ],
    'Algorithms': [
        'NaiveBayes', 'IBk', 'J48', 'RandomForest', 'LMT',
        'XGBoost', 'SVM', 'LGBM', 'Bagging', 'AdaBoost',
        'KStar', 'M5P', 'MLP', 'HC', 'E-M'
    ],
    'Performance (Error)': [
        [  # MNIST
            '30.34%', '3.09%', '10.67%', '3.51%', '5.70%',
            '2.05%', '2.61%', '2.26%', '5.07%', '11.74%',
            '89.80%', '47.31%', '0%', '44.96%', '88.65%'
        ],
        [  # Fashion-MNIST
            '36.72%', '14.35%', '18.27%', '11.92%', '13.62%',
            '8.58%', '9.47%', '9.50%', '12.20%', '19.00%',
            '90%', '46.78%', '0.52%', '51.45%', '90%'
        ],
        [  # e1-Spiral
            '29.125%', '0.38%', '2.25%', '1.75%', '2.375%',
            '3.12%', '1.88%', '3.12%', '0%', '4.37%',
            '5.51%', '1.43%', '0%', '49.75%', '72.50%'
        ],
        [  # e1-Android
            '8.1317%', '7.7285%', '4.7491%', '4.5475%', '4.3683%',
            '4.03%', '4.37%', '3.47%', '6.38%', '5.60%',
            '8.94%', '7.61%', '1.67%', '49.98%', '38.95%'
        ],
        [  # e2-andorinhas
            '8.1%', '6.65%', '5.60%', '4.90%', '4.60%',
            '4.00%', '3.75%', '4.25%', '3.50%', '4.75%',
            '4.36%', '2.85%', '3.92%', '48.60%', '49.25%'
        ],
        [  # e2-chinese
            '27.1589%', '12.8911%', '34.9186%', '7.5094%', '10.6383%',
            '7.50%', '6.25%', '5.63%', '16.25%', '34.38%',
            '0%', '1.21%', '0%', '87.36%', '78.22%'
        ],
        [  # e3-user
            '0%', '4.8571%', '0%', '0%', '0%', '0.1429%',
            '2.14%', '0%', '0%', '0%', '0%', '0%',
            '0%', '0.39%', '0%', '79.14%', '4.57%'
        ],
        [  # e3-ecommerce
            '11.37%', '11.15%', '2.39%', '2.07%', '2.42%',
            '0.90%', '8.80%', '0.70%', '10.35%', '2.85%',
            '0.02%', '7.56%', '3.96%', '22.11%', '41.49%'
        ],
        [  # e4-wine
            '44.96%', '35.21%', '38.59%', '29.89%', '39.65%',
            '48.95%', '56.56%', '46.85%', '43.94%', '50.99%',
            '39.23%', '50.82%', '36.51%', '57.34%', '77.98%'
        ],
        [  # e4-heart
            '43.51%', '46.61%', '35.82%', '37.20%', '35.88%',
            '45.71%', '34.51%', '45.73%', '44.16%', '46.1%',
            '46.15%', '64.18%', '49.22%', '88.1962%', '69.94%'
        ],
        [  # e5-mamiferos
            '0%', '0%', '0%', '0%', '0%', '0%',
            '0%', '0%', '0%', '0%', '0%', '0%',
            '1.57%', '0%', '0.20%', '31.20%', '44.80%'
        ],
        [  # e5-titanic
            '21.3244%', '22.7834%', '22.5589%', '28.3951%', '19.7531%',
            '16.76%', '27.56%', '14.85%', '7.48%', '10.79%',
            '27.18%', '61.62%', '26.76%', '38.16%', '38.50%'
        ]
    ]
}

# Convert the data into a DataFrame
datasets = data['Datasets']
algorithms = data['Algorithms']
performance_data = data['Performance (Error)']

# Create one dictionary per dataset, pairing each algorithm with its error
rows = []
for dataset, performance in zip(datasets, performance_data):
    row = {'Dataset': dataset}
    row.update({alg: perf for alg, perf in zip(algorithms, performance)})
    rows.append(row)

df = pd.DataFrame(rows)

# Convert string percentages (including comma decimals) to floats
for alg in algorithms:
    df[alg] = df[alg].str.replace(',', '.').str.rstrip('%').astype(float) / 100

# Calculate the ranking of each algorithm for each dataset (1 = lowest error)
rankings_matrix = df[algorithms].rank(axis=1, method='min', ascending=True)

# Format each cell as "error (rank)"
formatted_results = df[algorithms].copy()
for col in formatted_results.columns:
    formatted_results[col] = (
        formatted_results[col].round(3).astype(str)
        + " (" + rankings_matrix[col].astype(int).astype(str) + ")"
    )

# Add rows for the sum and average of ranks
sum_ranks = rankings_matrix.sum().round(3).rename('Sum Ranks')
average_ranks = rankings_matrix.mean().round(3).rename('Average Ranks')
formatted_results = pd.concat(
    [formatted_results, sum_ranks.to_frame().T, average_ranks.to_frame().T]
)

# Add the 'Dataset' column to the formatted DataFrame
formatted_results.insert(
    0, 'Dataset', df['Dataset'].tolist() + ['Sum Ranks', 'Average Ranks']
)

# Display the table
print("Error Table (%) with Ranking:")
print(formatted_results)

# Save the formatted table as an image
fig, ax = plt.subplots(figsize=(14, 8))
ax.axis('tight')
ax.axis('off')
table = ax.table(cellText=formatted_results.values,
                 colLabels=formatted_results.columns,
                 cellLoc='center', loc='center')
table.auto_set_font_size(False)
table.set_fontsize(12)
table.scale(2.5, 2.5)
plt.subplots_adjust(left=0.2, bottom=0.2, right=0.8, top=1, wspace=0.2, hspace=0.2)
plt.savefig('table_with_rankings.png', format="png", bbox_inches="tight", dpi=300)
plt.show()
print("Table saved as 'table_with_rankings.png'")

# Perform the Friedman Test (one sample per algorithm, across datasets)
friedman_stat, p_value = friedmanchisquare(*rankings_matrix.T.values)
print(f"Friedman test statistic: {friedman_stat}, p-value = {p_value}")

# Convert the error matrix into a NumPy array for the critical difference diagram
scores = df[algorithms].values
classifiers = df[algorithms].columns.tolist()
print("Algorithms:", classifiers)
print("Errors:", scores)

# Set the figure size before plotting
plt.figure(figsize=(16, 12))  # adjust as needed

# Generate the critical difference diagram
plot_critical_difference(
    scores,
    classifiers,
    lower_better=True,   # errors: lower is better
    test='wilcoxon',     # or 'nemenyi'
    correction='holm',   # or 'bonferroni' or None
)

# Adjust font size and rotation of the x-axis labels
ax = plt.gca()
for label in ax.get_xticklabels():
    label.set_fontsize(14)
    label.set_rotation(45)
    label.set_horizontalalignment('right')

# Increase padding between labels and axis, and leave room below the plot
ax.tick_params(axis='x', which='major', pad=20)
plt.subplots_adjust(bottom=0.35)
ax.tick_params(axis='y', labelsize=12)

# Save and display the plot
plt.savefig('critical_difference_diagram.png', format="png",
            bbox_inches="tight", dpi=300)
plt.show()
```
<span>'</span><span>7.7285%</span><span>'</span><span>,</span> <span>'</span><span>4.7491%</span><span>'</span><span>,</span> <span>'</span><span>4.5475%</span><span>'</span><span>,</span> <span>'</span><span>4.3683%</span><span>'</span><span>,</span> <span>'</span><span>4.03%</span><span>'</span><span>,</span> <span>'</span><span>4.37%</span><span>'</span><span>,</span> <span>'</span><span>3.47%</span><span>'</span><span>,</span> <span>'</span><span>6.38%</span><span>'</span><span>,</span> <span>'</span><span>5.60%</span><span>'</span><span>,</span> <span>'</span><span>8.94%</span><span>'</span><span>,</span> <span>'</span><span>7.61%</span><span>'</span><span>,</span> <span>'</span><span>1.67%</span><span>'</span><span>,</span> <span>'</span><span>49.98%</span><span>'</span><span>,</span> <span>'</span><span>38.95%</span><span>'</span> <span>],</span> <span>[</span> <span>'</span><span>8.1%</span><span>'</span><span>,</span> <span>'</span><span>6.65%</span><span>'</span><span>,</span> <span>'</span><span>5.60%</span><span>'</span><span>,</span> <span>'</span><span>4.90%</span><span>'</span><span>,</span> <span>'</span><span>4.60%</span><span>'</span><span>,</span> <span>'</span><span>4.00%</span><span>'</span><span>,</span> <span>'</span><span>3.75%</span><span>'</span><span>,</span> <span>'</span><span>4.25%</span><span>'</span><span>,</span> <span>'</span><span>3.50%</span><span>'</span><span>,</span> <span>'</span><span>4.75%</span><span>'</span><span>,</span> <span>'</span><span>4.36%</span><span>'</span><span>,</span> <span>'</span><span>2.85%</span><span>'</span><span>,</span> <span>'</span><span>3.92%</span><span>'</span><span>,</span> <span>'</span><span>48.60%</span><span>'</span><span>,</span> <span>'</span><span>49.25%</span><span>'</span> <span>],</span> <span>[</span> <span># e2-chinese -ok </span> <span>'</span><span>27.1589%</span><span>'</span><span>,</span> <span>'</span><span>12.8911%</span><span>'</span><span>,</span> 
<span>'</span><span>34.9186%</span><span>'</span><span>,</span> <span>'</span><span>7.5094%</span><span>'</span><span>,</span> <span>'</span><span>10.6383%</span><span>'</span><span>,</span> <span>'</span><span>7.50%</span><span>'</span><span>,</span> <span>'</span><span>6.25%</span><span>'</span><span>,</span> <span>'</span><span>5.63%</span><span>'</span><span>,</span> <span>'</span><span>16.25%</span><span>'</span><span>,</span> <span>'</span><span>34.38%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>1.21%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>87.36%</span><span>'</span><span>,</span> <span>'</span><span>78.22%</span><span>'</span> <span>],</span> <span>[</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>4.8571%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0.1429%</span><span>'</span><span>,</span> <span>'</span><span>2.14%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0.39%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>79.14%</span><span>'</span><span>,</span> <span>'</span><span>4.57%</span><span>'</span> <span>],</span> <span>[</span> <span># e3-ecommerce -ok </span> <span>'</span><span>11.37%</span><span>'</span><span>,</span> <span>'</span><span>11.15%</span><span>'</span><span>,</span> 
<span>'</span><span>2.39%</span><span>'</span><span>,</span> <span>'</span><span>2.07%</span><span>'</span><span>,</span> <span>'</span><span>2.42%</span><span>'</span><span>,</span> <span>'</span><span>0.90%</span><span>'</span><span>,</span> <span>'</span><span>8.80%</span><span>'</span><span>,</span> <span>'</span><span>0.70%</span><span>'</span><span>,</span> <span>'</span><span>10.35%</span><span>'</span><span>,</span> <span>'</span><span>2.85%</span><span>'</span><span>,</span> <span>'</span><span>0.02%</span><span>'</span><span>,</span> <span>'</span><span>7.56%</span><span>'</span><span>,</span> <span>'</span><span>3.96%</span><span>'</span><span>,</span> <span>'</span><span>22.11%</span><span>'</span><span>,</span> <span>'</span><span>41.49%</span><span>'</span> <span>],</span> <span>[</span> <span># e4-wine -ok </span> <span>'</span><span>44.96%</span><span>'</span><span>,</span> <span>'</span><span>35.21%</span><span>'</span><span>,</span> <span>'</span><span>38.59%</span><span>'</span><span>,</span> <span>'</span><span>29.89%</span><span>'</span><span>,</span> <span>'</span><span>39.65%</span><span>'</span><span>,</span> <span>'</span><span>48.95%</span><span>'</span><span>,</span> <span>'</span><span>56.56%</span><span>'</span><span>,</span> <span>'</span><span>46.85%</span><span>'</span><span>,</span> <span>'</span><span>43.94%</span><span>'</span><span>,</span> <span>'</span><span>50.99%</span><span>'</span><span>,</span> <span>'</span><span>39.23%</span><span>'</span><span>,</span> <span>'</span><span>50.82%</span><span>'</span><span>,</span> <span>'</span><span>36.51%</span><span>'</span><span>,</span> <span>'</span><span>57.34%</span><span>'</span><span>,</span> <span>'</span><span>77.98%</span><span>'</span> <span>],</span> <span>[</span> <span># e4-heart -ok </span> <span>'</span><span>43.51%</span><span>'</span><span>,</span> <span>'</span><span>46.61%</span><span>'</span><span>,</span> 
<span>'</span><span>35.82%</span><span>'</span><span>,</span> <span>'</span><span>37.20%</span><span>'</span><span>,</span> <span>'</span><span>35.88%</span><span>'</span><span>,</span> <span>'</span><span>45.71%</span><span>'</span><span>,</span> <span>'</span><span>34.51%</span><span>'</span><span>,</span> <span>'</span><span>45.73%</span><span>'</span><span>,</span> <span>'</span><span>44.16%</span><span>'</span><span>,</span> <span>'</span><span>46.1%</span><span>'</span><span>,</span> <span>'</span><span>46.15%</span><span>'</span><span>,</span> <span>'</span><span>64.18%</span><span>'</span><span>,</span> <span>'</span><span>49.22%</span><span>'</span><span>,</span> <span>'</span><span>88.1962%</span><span>'</span><span>,</span> <span>'</span><span>69.94%</span><span>'</span> <span>],</span> <span>[</span> <span># e5-mamiferos -ok </span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>1.57%</span><span>'</span><span>,</span> <span>'</span><span>0%</span><span>'</span><span>,</span> <span>'</span><span>0.20%</span><span>'</span><span>,</span> <span>'</span><span>31.20%</span><span>'</span><span>,</span> <span>'</span><span>44.80%</span><span>'</span> <span>],</span> <span>[</span> <span># e5-titanic -ok </span> <span>'</span><span>21.3244%</span><span>'</span><span>,</span> 
<span>'</span><span>22.7834%</span><span>'</span><span>,</span> <span>'</span><span>22.5589%</span><span>'</span><span>,</span> <span>'</span><span>28.3951%</span><span>'</span><span>,</span> <span>'</span><span>19.7531%</span><span>'</span><span>,</span> <span>'</span><span>16.76%</span><span>'</span><span>,</span> <span>'</span><span>27.56%</span><span>'</span><span>,</span> <span>'</span><span>14.85%</span><span>'</span><span>,</span> <span>'</span><span>7.48%</span><span>'</span><span>,</span> <span>'</span><span>10.79%</span><span>'</span><span>,</span> <span>'</span><span>27.18%</span><span>'</span><span>,</span> <span>'</span><span>61.62%</span><span>'</span><span>,</span> <span>'</span><span>26.76%</span><span>'</span><span>,</span> <span>'</span><span>38.16%</span><span>'</span><span>,</span> <span>'</span><span>38.50%</span><span>'</span> <span>]</span> <span>]</span> <span>}</span> <span># Convert the data into a DataFrame </span><span>datasets</span> <span>=</span> <span>data</span><span>[</span><span>'</span><span>Datasets</span><span>'</span><span>]</span> <span>algorithms</span> <span>=</span> <span>data</span><span>[</span><span>'</span><span>Algorithms</span><span>'</span><span>]</span> <span>performance_data</span> <span>=</span> <span>data</span><span>[</span><span>'</span><span>Performance (Error)</span><span>'</span><span>]</span> <span># Create a list of dictionaries for each dataset </span><span>rows</span> <span>=</span> <span>[]</span> <span>for</span> <span>dataset</span><span>,</span> <span>performance</span> <span>in</span> <span>zip</span><span>(</span><span>datasets</span><span>,</span> <span>performance_data</span><span>):</span> <span>row</span> <span>=</span> <span>{</span><span>'</span><span>Dataset</span><span>'</span><span>:</span> <span>dataset</span><span>}</span> <span>row</span><span>.</span><span>update</span><span>({</span><span>alg</span><span>:</span> <span>perf</span> <span>for</span> <span>alg</span><span>,</span> 
<span>perf</span> <span>in</span> <span>zip</span><span>(</span><span>algorithms</span><span>,</span> <span>performance</span><span>)})</span> <span>rows</span><span>.</span><span>append</span><span>(</span><span>row</span><span>)</span> <span># Create the DataFrame </span><span>df</span> <span>=</span> <span>pd</span><span>.</span><span>DataFrame</span><span>(</span><span>rows</span><span>)</span> <span># Convert string percentages to floats </span><span>for</span> <span>alg</span> <span>in</span> <span>algorithms</span><span>:</span> <span>df</span><span>[</span><span>alg</span><span>]</span> <span>=</span> <span>df</span><span>[</span><span>alg</span><span>].</span><span>str</span><span>.</span><span>replace</span><span>(</span><span>'</span><span>,</span><span>'</span><span>,</span> <span>'</span><span>.</span><span>'</span><span>).</span><span>str</span><span>.</span><span>rstrip</span><span>(</span><span>'</span><span>%</span><span>'</span><span>).</span><span>astype</span><span>(</span><span>float</span><span>)</span> <span>/</span> <span>100</span> <span># Calculate the ranking of each algorithm for each dataset </span><span>rankings_matrix</span> <span>=</span> <span>df</span><span>[</span><span>algorithms</span><span>].</span><span>rank</span><span>(</span><span>axis</span><span>=</span><span>1</span><span>,</span> <span>method</span><span>=</span><span>'</span><span>min</span><span>'</span><span>,</span> <span>ascending</span><span>=</span><span>True</span><span>)</span> <span># Format the results </span><span>formatted_results</span> <span>=</span> <span>df</span><span>[</span><span>algorithms</span><span>].</span><span>copy</span><span>()</span> <span>for</span> <span>col</span> <span>in</span> <span>formatted_results</span><span>.</span><span>columns</span><span>:</span> <span>formatted_results</span><span>[</span><span>col</span><span>]</span> <span>=</span> 
<span>formatted_results</span><span>[</span><span>col</span><span>].</span><span>round</span><span>(</span><span>3</span><span>).</span><span>astype</span><span>(</span><span>str</span><span>)</span> <span>+</span> <span>"</span><span> (</span><span>"</span> <span>+</span> <span>rankings_matrix</span><span>[</span><span>col</span><span>].</span><span>astype</span><span>(</span><span>int</span><span>).</span><span>astype</span><span>(</span><span>str</span><span>)</span> <span>+</span> <span>"</span><span>)</span><span>"</span> <span># Add a row for the sum of ranks and average of ranks </span><span>sum_ranks</span> <span>=</span> <span>rankings_matrix</span><span>.</span><span>sum</span><span>().</span><span>round</span><span>(</span><span>3</span><span>).</span><span>rename</span><span>(</span><span>'</span><span>Sum Ranks</span><span>'</span><span>)</span> <span>average_ranks</span> <span>=</span> <span>rankings_matrix</span><span>.</span><span>mean</span><span>().</span><span>round</span><span>(</span><span>3</span><span>).</span><span>rename</span><span>(</span><span>'</span><span>Average Ranks</span><span>'</span><span>)</span> <span># Add the rows to the formatted DataFrame using concat </span><span>formatted_results</span> <span>=</span> <span>pd</span><span>.</span><span>concat</span><span>([</span><span>formatted_results</span><span>,</span> <span>sum_ranks</span><span>.</span><span>to_frame</span><span>().</span><span>T</span><span>,</span> <span>average_ranks</span><span>.</span><span>to_frame</span><span>().</span><span>T</span><span>])</span> <span># Add the 'Dataset' column to the formatted DataFrame </span><span>formatted_results</span><span>.</span><span>insert</span><span>(</span><span>0</span><span>,</span> <span>'</span><span>Dataset</span><span>'</span><span>,</span> <span>df</span><span>[</span><span>'</span><span>Dataset</span><span>'</span><span>].</span><span>tolist</span><span>()</span> <span>+</span> <span>[</span><span>'</span><span>Sum 
Ranks</span><span>'</span><span>,</span> <span>'</span><span>Average Ranks</span><span>'</span><span>])</span> <span># Display the table </span><span>print</span><span>(</span><span>"</span><span>Error Table (%) with Ranking:</span><span>"</span><span>)</span> <span>print</span><span>(</span><span>formatted_results</span><span>)</span> <span># Save the formatted table as an image </span><span>fig</span><span>,</span> <span>ax</span> <span>=</span> <span>plt</span><span>.</span><span>subplots</span><span>(</span><span>figsize</span><span>=</span><span>(</span><span>14</span><span>,</span> <span>8</span><span>))</span> <span>ax</span><span>.</span><span>axis</span><span>(</span><span>'</span><span>tight</span><span>'</span><span>)</span> <span>ax</span><span>.</span><span>axis</span><span>(</span><span>'</span><span>off</span><span>'</span><span>)</span> <span>table</span> <span>=</span> <span>ax</span><span>.</span><span>table</span><span>(</span><span>cellText</span><span>=</span><span>formatted_results</span><span>.</span><span>values</span><span>,</span> <span>colLabels</span><span>=</span><span>formatted_results</span><span>.</span><span>columns</span><span>,</span> <span>cellLoc</span><span>=</span><span>'</span><span>center</span><span>'</span><span>,</span> <span>loc</span><span>=</span><span>'</span><span>center</span><span>'</span><span>)</span> <span>table</span><span>.</span><span>auto_set_font_size</span><span>(</span><span>False</span><span>)</span> <span>table</span><span>.</span><span>set_fontsize</span><span>(</span><span>12</span><span>)</span> <span>table</span><span>.</span><span>scale</span><span>(</span><span>2.5</span><span>,</span> <span>2.5</span><span>)</span> <span>plt</span><span>.</span><span>subplots_adjust</span><span>(</span><span>left</span><span>=</span><span>0.2</span><span>,</span> <span>bottom</span><span>=</span><span>0.2</span><span>,</span> <span>right</span><span>=</span><span>0.8</span><span>,</span> 
<span>top</span><span>=</span><span>1</span><span>,</span> <span>wspace</span><span>=</span><span>0.2</span><span>,</span> <span>hspace</span><span>=</span><span>0.2</span><span>)</span> <span>plt</span><span>.</span><span>savefig</span><span>(</span><span>'</span><span>table_with_rankings.png</span><span>'</span><span>,</span> <span>format</span><span>=</span><span>"</span><span>png</span><span>"</span><span>,</span> <span>bbox_inches</span><span>=</span><span>"</span><span>tight</span><span>"</span><span>,</span> <span>dpi</span><span>=</span><span>300</span><span>)</span> <span>plt</span><span>.</span><span>show</span><span>()</span> <span>print</span><span>(</span><span>"</span><span>Table saved as </span><span>'</span><span>table_with_rankings.png</span><span>'"</span><span>)</span> <span># Perform the Friedman Test </span><span>friedman_stat</span><span>,</span> <span>p_value</span> <span>=</span> <span>friedmanchisquare</span><span>(</span><span>*</span><span>rankings_matrix</span><span>.</span><span>T</span><span>.</span><span>values</span><span>)</span> <span>print</span><span>(</span><span>f</span><span>"</span><span>Friedman test statistic: </span><span>{</span><span>friedman_stat</span><span>}</span><span>, p-value = </span><span>{</span><span>p_value</span><span>}</span><span>"</span><span>)</span> <span># Convert the accuracy matrix into a NumPy array for the critical difference diagram </span><span>scores</span> <span>=</span> <span>df</span><span>[</span><span>algorithms</span><span>].</span><span>values</span> <span>classifiers</span> <span>=</span> <span>df</span><span>[</span><span>algorithms</span><span>].</span><span>columns</span><span>.</span><span>tolist</span><span>()</span> <span>print</span><span>(</span><span>"</span><span>Algorithms:</span><span>"</span><span>,</span> <span>classifiers</span><span>)</span> <span>print</span><span>(</span><span>"</span><span>Errors:</span><span>"</span><span>,</span> <span>scores</span><span>)</span> 
<span># Set the figure size before plotting </span><span>plt</span><span>.</span><span>figure</span><span>(</span><span>figsize</span><span>=</span><span>(</span><span>16</span><span>,</span> <span>12</span><span>))</span> <span># Adjust the figure size as needed </span> <span># Generate the critical difference diagram </span><span>plot_critical_difference</span><span>(</span> <span>scores</span><span>,</span> <span>classifiers</span><span>,</span> <span>lower_better</span><span>=</span><span>True</span><span>,</span> <span>test</span><span>=</span><span>'</span><span>wilcoxon</span><span>'</span><span>,</span> <span># or nemenyi </span> <span>correction</span><span>=</span><span>'</span><span>holm</span><span>'</span><span>,</span> <span># or bonferroni or none </span><span>)</span> <span># Get the current axes </span><span>ax</span> <span>=</span> <span>plt</span><span>.</span><span>gca</span><span>()</span> <span># Adjust font size and rotation of x-axis labels </span><span>for</span> <span>label</span> <span>in</span> <span>ax</span><span>.</span><span>get_xticklabels</span><span>():</span> <span>label</span><span>.</span><span>set_fontsize</span><span>(</span><span>14</span><span>)</span> <span>label</span><span>.</span><span>set_rotation</span><span>(</span><span>45</span><span>)</span> <span>label</span><span>.</span><span>set_horizontalalignment</span><span>(</span><span>'</span><span>right</span><span>'</span><span>)</span> <span># Increase padding between labels and axis </span><span>ax</span><span>.</span><span>tick_params</span><span>(</span><span>axis</span><span>=</span><span>'</span><span>x</span><span>'</span><span>,</span> <span>which</span><span>=</span><span>'</span><span>major</span><span>'</span><span>,</span> <span>pad</span><span>=</span><span>20</span><span>)</span> <span># Adjust margins to provide more space for labels 
</span><span>plt</span><span>.</span><span>subplots_adjust</span><span>(</span><span>bottom</span><span>=</span><span>0.35</span><span>)</span> <span># Optionally adjust y-axis label font size </span><span>ax</span><span>.</span><span>tick_params</span><span>(</span><span>axis</span><span>=</span><span>'</span><span>y</span><span>'</span><span>,</span> <span>labelsize</span><span>=</span><span>12</span><span>)</span> <span># Save and display the plot </span><span>plt</span><span>.</span><span>savefig</span><span>(</span><span>'</span><span>critical_difference_diagram.png</span><span>'</span><span>,</span> <span>format</span><span>=</span><span>"</span><span>png</span><span>"</span><span>,</span> <span>bbox_inches</span><span>=</span><span>"</span><span>tight</span><span>"</span><span>,</span> <span>dpi</span><span>=</span><span>300</span><span>)</span> <span>plt</span><span>.</span><span>show</span><span>()</span>import pandas as pd import numpy as np import matplotlib.pyplot as plt from scipy.stats import friedmanchisquare from aeon.visualisation import plot_critical_difference data = { 'Datasets': [ 'MNIST', 'Fashion-MNIST', 'e1-Spiral', 'e1-Android', 'e2-andorinhas', 'e2-chinese', 'e3-user', 'e3-ecommerce', 'e4-wine', 'e4-heart', 'e5-mamiferos', 'e5-titanic' ], 'Algorithms': [ 'NaiveBayes', 'IBk', 'J48', 'RandomForest', 'LMT', 'XGBoost', 'SVM', 'LGBM', 'Bagging', 'AdaBoost', 'KStar', 'M5P', 'MLP', 'HC', 'E-M' ], 'Performance (Error)': [ [ # MNIST -ok '30.34%', '3.09%', '10.67%', '3.51%', '5.70%', '2.05%', '2.61%', '2.26%', '5.07%', '11.74%', '89.80%', '47.31%', '0%', '44.96%', '88.65%' ], [ # Fashion-MNIST -ok '36.72%', '14.35%', '18.27%', '11.92%', '13.62%', '8.58%', '9.47%', '9.50%', '12.20%', '19.00%', '90%', '46.78%', '0.52%', '51.45%', '90%' ], [ # e1-Spiral -ok '29.125%', '0.38%', '2.25%', '1.75%', '2.375%', '3.12%', '1.88%', '3.12%', '0%', '4.37%', '5.51%', '1.43%', '0%', '49.75%', '72.50%' ], [ # e1-Android -ok '8.1317%', '7.7285%', '4.7491%', '4.5475%', 
'4.3683%', '4.03%', '4.37%', '3.47%', '6.38%', '5.60%', '8.94%', '7.61%', '1.67%', '49.98%', '38.95%' ], [ '8.1%', '6.65%', '5.60%', '4.90%', '4.60%', '4.00%', '3.75%', '4.25%', '3.50%', '4.75%', '4.36%', '2.85%', '3.92%', '48.60%', '49.25%' ], [ # e2-chinese -ok '27.1589%', '12.8911%', '34.9186%', '7.5094%', '10.6383%', '7.50%', '6.25%', '5.63%', '16.25%', '34.38%', '0%', '1.21%', '0%', '87.36%', '78.22%' ], [ '0%', '4.8571%', '0%', '0%', '0%', '0.1429%', '2.14%', '0%', '0%', '0%', '0%', '0%', '0%', '0.39%', '0%', '79.14%', '4.57%' ], [ # e3-ecommerce -ok '11.37%', '11.15%', '2.39%', '2.07%', '2.42%', '0.90%', '8.80%', '0.70%', '10.35%', '2.85%', '0.02%', '7.56%', '3.96%', '22.11%', '41.49%' ], [ # e4-wine -ok '44.96%', '35.21%', '38.59%', '29.89%', '39.65%', '48.95%', '56.56%', '46.85%', '43.94%', '50.99%', '39.23%', '50.82%', '36.51%', '57.34%', '77.98%' ], [ # e4-heart -ok '43.51%', '46.61%', '35.82%', '37.20%', '35.88%', '45.71%', '34.51%', '45.73%', '44.16%', '46.1%', '46.15%', '64.18%', '49.22%', '88.1962%', '69.94%' ], [ # e5-mamiferos -ok '0%', '0%', '0%', '0%', '0%', '0%', '0%', '0%', '0%', '0%', '0%', '0%', '1.57%', '0%', '0.20%', '31.20%', '44.80%' ], [ # e5-titanic -ok '21.3244%', '22.7834%', '22.5589%', '28.3951%', '19.7531%', '16.76%', '27.56%', '14.85%', '7.48%', '10.79%', '27.18%', '61.62%', '26.76%', '38.16%', '38.50%' ] ] } # Convert the data into a DataFrame datasets = data['Datasets'] algorithms = data['Algorithms'] performance_data = data['Performance (Error)'] # Create a list of dictionaries for each dataset rows = [] for dataset, performance in zip(datasets, performance_data): row = {'Dataset': dataset} row.update({alg: perf for alg, perf in zip(algorithms, performance)}) rows.append(row) # Create the DataFrame df = pd.DataFrame(rows) # Convert string percentages to floats for alg in algorithms: df[alg] = df[alg].str.replace(',', '.').str.rstrip('%').astype(float) / 100 # Calculate the ranking of each algorithm for each dataset 
rankings_matrix = df[algorithms].rank(axis=1, method='min', ascending=True) # Format the results formatted_results = df[algorithms].copy() for col in formatted_results.columns: formatted_results[col] = formatted_results[col].round(3).astype(str) + " (" + rankings_matrix[col].astype(int).astype(str) + ")" # Add a row for the sum of ranks and average of ranks sum_ranks = rankings_matrix.sum().round(3).rename('Sum Ranks') average_ranks = rankings_matrix.mean().round(3).rename('Average Ranks') # Add the rows to the formatted DataFrame using concat formatted_results = pd.concat([formatted_results, sum_ranks.to_frame().T, average_ranks.to_frame().T]) # Add the 'Dataset' column to the formatted DataFrame formatted_results.insert(0, 'Dataset', df['Dataset'].tolist() + ['Sum Ranks', 'Average Ranks']) # Display the table print("Error Table (%) with Ranking:") print(formatted_results) # Save the formatted table as an image fig, ax = plt.subplots(figsize=(14, 8)) ax.axis('tight') ax.axis('off') table = ax.table(cellText=formatted_results.values, colLabels=formatted_results.columns, cellLoc='center', loc='center') table.auto_set_font_size(False) table.set_fontsize(12) table.scale(2.5, 2.5) plt.subplots_adjust(left=0.2, bottom=0.2, right=0.8, top=1, wspace=0.2, hspace=0.2) plt.savefig('table_with_rankings.png', format="png", bbox_inches="tight", dpi=300) plt.show() print("Table saved as 'table_with_rankings.png'") # Perform the Friedman Test friedman_stat, p_value = friedmanchisquare(*rankings_matrix.T.values) print(f"Friedman test statistic: {friedman_stat}, p-value = {p_value}") # Convert the accuracy matrix into a NumPy array for the critical difference diagram scores = df[algorithms].values classifiers = df[algorithms].columns.tolist() print("Algorithms:", classifiers) print("Errors:", scores) # Set the figure size before plotting plt.figure(figsize=(16, 12)) # Adjust the figure size as needed # Generate the critical difference diagram plot_critical_difference( scores, 
classifiers, lower_better=True, test='wilcoxon', # or nemenyi correction='holm', # or bonferroni or none ) # Get the current axes ax = plt.gca() # Adjust font size and rotation of x-axis labels for label in ax.get_xticklabels(): label.set_fontsize(14) label.set_rotation(45) label.set_horizontalalignment('right') # Increase padding between labels and axis ax.tick_params(axis='x', which='major', pad=20) # Adjust margins to provide more space for labels plt.subplots_adjust(bottom=0.35) # Optionally adjust y-axis label font size ax.tick_params(axis='y', labelsize=12) # Save and display the plot plt.savefig('critical_difference_diagram.png', format="png", bbox_inches="tight", dpi=300) plt.show()
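If you want to build intuition for what `friedmanchisquare` reports before running the full script, here is a minimal sketch on a made-up error matrix (3 hypothetical algorithms across 4 datasets; the numbers are invented for illustration, not taken from the tables above):

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical error rates: rows = datasets, columns = algorithms A, B, C.
# A is consistently best, C consistently worst.
errors = np.array([
    [0.10, 0.20, 0.30],
    [0.12, 0.25, 0.28],
    [0.08, 0.22, 0.35],
    [0.11, 0.19, 0.33],
])

# friedmanchisquare takes one sample per algorithm (its measurements across datasets)
stat, p = friedmanchisquare(*errors.T)
print(f"statistic={stat:.3f}, p-value={p:.4f}")
```

Because the per-dataset rankings are perfectly consistent here, the statistic is at its maximum for this shape (8.0) and the p-value falls below 0.05, so we would reject the hypothesis that all three algorithms perform the same.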
Using accuracy instead of error rate
The code gist already includes a full working script that uses accuracy as the performance metric.
If you’re already using error percentages in the `data` variable, you can add this last line inside the conversion loop to turn the values into accuracy:
```python
# Convert string percentages to floats
for alg in algorithms:
    df[alg] = df[alg].str.replace(",", ".").str.rstrip("%").astype(float) / 100
    # Convert error rate to accuracy
    df[alg] = 1 - df[alg]
```
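As a quick sanity check, here is that same conversion applied to a tiny made-up DataFrame (the algorithm names and values are purely illustrative):

```python
import pandas as pd

# Hypothetical toy data: error rates stored as strings with comma decimal separators
df = pd.DataFrame({"SVM": ["12,5%", "8,0%"], "RF": ["10,0%", "9,5%"]})
algorithms = ["SVM", "RF"]

for alg in algorithms:
    # "12,5%" -> "12.5" -> 0.125
    df[alg] = df[alg].str.replace(",", ".").str.rstrip("%").astype(float) / 100
    # Error rate -> accuracy
    df[alg] = 1 - df[alg]

print(df)
```

An error rate of 12,5% becomes an accuracy of 0.875, and so on for the other cells.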
Then change the rankings_matrix calculation to set rank’s ascending value to False:

```python
# Calculate the ranking of each algorithm for each dataset (higher accuracy = better rank)
rankings_matrix = df[algorithms].rank(axis=1, method="min", ascending=False)
```
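To see the effect of ascending=False, here is the ranking on a small made-up accuracy table (names and numbers are illustrative only): the highest accuracy in each row receives rank 1.

```python
import pandas as pd

# Hypothetical accuracies: one row per dataset, one column per algorithm
df = pd.DataFrame({"SVM": [0.90, 0.85], "RF": [0.88, 0.91], "kNN": [0.80, 0.79]})
algorithms = ["SVM", "RF", "kNN"]

# ascending=False: the highest accuracy in each row gets rank 1
rankings_matrix = df[algorithms].rank(axis=1, method="min", ascending=False)
print(rankings_matrix)
```

With ascending=True (the default used for error rates), the same call would instead give rank 1 to the lowest value in each row.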
Finally, when plotting the Critical Difference diagram, change the lower_better value to False:

```python
plot_critical_difference(
    scores,
    classifiers,
    lower_better=False,  # False for accuracy (higher is better)
    test="wilcoxon",
    correction="holm",
)
```
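If you want to run the Friedman test itself before plotting, SciPy’s friedmanchisquare takes one sequence of scores per algorithm. The values below are made up for illustration:

```python
from scipy.stats import friedmanchisquare

# Hypothetical error rates of three algorithms on the same five datasets
alg_a = [0.10, 0.12, 0.08, 0.15, 0.11]
alg_b = [0.20, 0.25, 0.18, 0.22, 0.19]
alg_c = [0.15, 0.16, 0.14, 0.18, 0.17]

stat, p_value = friedmanchisquare(alg_a, alg_b, alg_c)
print(f"statistic={stat:.3f}, p-value={p_value:.4f}")
```

A p-value below 0.05 suggests that at least one algorithm performs significantly differently, which is when the Critical Difference diagram becomes meaningful.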
Conclusion
I hope this guide helps someone out 🙂 If you have any suggestions or questions, feel free to leave a comment or reach out, and I’ll do my best to get back to you!
Original article: Comparing Machine Learning Algorithms Using Friedman Test and Critical Difference Diagrams in Python