Comparing Machine Learning Algorithms Using Friedman Test and Critical Difference Diagrams in Python

When you’re working with machine learning, deciding which algorithm performs best across multiple datasets can be quite challenging. Raw performance metrics alone may not be enough; you need statistical methods to be confident the differences are real. That’s where the Friedman Test and Critical Difference (CD) Diagrams can help.

My classmates and I faced this challenge firsthand when preparing a project presentation. We struggled to find a clear way to generate the diagram, so after finally figuring it out, I decided to share this guide to save others time.

In this article, you’ll find Python code that performs this evaluation and visualization. You can also access the complete code on my GitHub gist.

I’ll also show you how to modify the code to use accuracy instead of the error rate. The script has been tested on Python 3.8 and above.

The Python script does three main things:

  • Performs the Friedman Test to statistically evaluate performance differences.
  • Creates a ranking table comparing the algorithm scores.
  • Generates and saves a PNG image of the Critical Difference Diagram and the ranking table.

Critical Difference Diagram generated

In the diagram, algorithms connected by a horizontal bar are not significantly different from each other according to the statistical test. Algorithms with lower average ranks (positioned further right) performed better. The ranking is the same whether you use error rate or accuracy as the performance metric.

Ranking table generated

This table shows the error rates of each algorithm across all datasets. Each cell contains the error rate along with its rank in parentheses, where 1 is the best; ranks run from 1 up to the number of algorithms being compared on each dataset.

At the bottom, you’ll find the rank sums and average rankings for each algorithm, for better overall comparison.
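The per-dataset ranks, rank sums, and average ranks can all be derived with pandas' `rank` method. Here is a minimal sketch using made-up error rates for three hypothetical algorithms (`A`, `B`, `C`) on three hypothetical datasets:

```python
import pandas as pd

# Hypothetical error rates: rows = datasets, columns = algorithms
errors = pd.DataFrame(
    {'A': [0.10, 0.20, 0.15], 'B': [0.05, 0.25, 0.10], 'C': [0.20, 0.15, 0.30]},
    index=['ds1', 'ds2', 'ds3']
)

# Rank within each dataset (row); lower error -> rank 1
ranks = errors.rank(axis=1, method='min', ascending=True)
print(ranks)
print("Sum of ranks:", ranks.sum().to_dict())
print("Average ranks:", ranks.mean().to_dict())
```

The algorithm with the smallest average rank is the overall front-runner; the Friedman test then tells you whether the rank differences are statistically meaningful.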

I edited the original image and removed some columns for better readability here.

Why Use the Friedman Test and Critical Difference Diagram?

The Friedman test is a non-parametric statistical test designed to detect differences between multiple algorithms across various datasets. It ranks algorithms based on their performance, helping you understand if differences in performance are genuinely significant or just due to chance.
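As a quick illustration (with toy numbers, not the article's data), SciPy's `friedmanchisquare` takes one sequence of measurements per algorithm, where entry i of every sequence comes from the same dataset:

```python
from scipy.stats import friedmanchisquare

# Toy error rates for three hypothetical algorithms on the same five datasets
alg_a = [0.10, 0.12, 0.30, 0.05, 0.20]
alg_b = [0.08, 0.10, 0.25, 0.04, 0.18]
alg_c = [0.15, 0.20, 0.35, 0.10, 0.25]

stat, p = friedmanchisquare(alg_a, alg_b, alg_c)
print(f"statistic={stat:.3f}, p-value={p:.4f}")
# A small p-value (e.g. < 0.05) indicates at least one algorithm differs
```

The test only tells you that *some* difference exists; the post-hoc comparisons behind the CD diagram identify *which* pairs differ.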

The Critical Difference Diagram visually presents these rankings. It clearly shows which algorithms perform similarly and which are significantly better or worse, making it easy to interpret results at a glance. This diagram is particularly useful when comparing numerous algorithms across multiple datasets.

Preparing Your Data

Now for the implementation, you’ll need your data structured like this:

Datasets: Names of your datasets (e.g., MNIST, Fashion-MNIST).
Algorithms: Names of the algorithms you’re evaluating. Keep these ordered consistently.
Performance (Error): Lists of error rates for each algorithm per dataset, aligned with your Algorithms list.

For example:

```python
data = {
    'Datasets': ['MNIST', 'Fashion-MNIST', ...],
    'Algorithms': ['NaiveBayes', 'IBk', ..., 'RandomForest'],
    'Performance (Error)': [
        ['30.34%', '3.09%', ..., '88.65%'],   # MNIST
        ['36.72%', '14.35%', ..., '90%'],     # Fashion-MNIST
        # Other datasets...
    ]
}
```


Make sure the error rates are listed in the same order as their corresponding algorithms. For example, if NaiveBayes is the first algorithm in the list, its performance values should always appear first in each dataset’s row.
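Misaligned rows are an easy mistake to make, so it can be worth adding a quick sanity check that every performance row has exactly one entry per algorithm. A sketch, using a small hypothetical `data` dict in the structure shown above:

```python
# Hypothetical data in the structure described above
data = {
    'Datasets': ['ds1', 'ds2'],
    'Algorithms': ['NaiveBayes', 'IBk', 'RandomForest'],
    'Performance (Error)': [
        ['30.34%', '3.09%', '3.51%'],    # ds1
        ['36.72%', '14.35%', '11.92%'],  # ds2
    ],
}

# Each dataset row must contain one value per algorithm, in the same order
n_alg = len(data['Algorithms'])
for name, row in zip(data['Datasets'], data['Performance (Error)']):
    assert len(row) == n_alg, f"{name}: expected {n_alg} values, got {len(row)}"
print("All rows aligned.")
```

Rows with too many or too few values would otherwise be silently truncated or mismatched when the dict is zipped into a DataFrame.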

If you prefer to use accuracy instead of error rates, you can either replace the values in the Performance (Error) field with accuracy scores or simply subtract the error rates from 1. I’ll also demonstrate how to do this right after the implementation.
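The conversion itself is just `1 - error` once the percent strings are parsed. A minimal sketch with a few example values:

```python
import pandas as pd

# Percent strings as they appear in the data dict
errors = pd.Series(['30.34%', '3.09%', '11.92%'])

error_frac = errors.str.rstrip('%').astype(float) / 100  # e.g. 0.3034
accuracy = 1 - error_frac                                # e.g. 0.6966
print(accuracy.round(4).tolist())
```

If you switch the scores to accuracy, remember that higher is now better, so flip `lower_better` to `False` when calling `plot_critical_difference` later on.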

Python Implementation

Here’s the Python code, also available in this gist on GitHub:

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import friedmanchisquare
from aeon.visualisation import plot_critical_difference

data = {
    'Datasets': [
        'MNIST', 'Fashion-MNIST',
        'e1-Spiral', 'e1-Android',
        'e2-andorinhas', 'e2-chinese',
        'e3-user', 'e3-ecommerce',
        'e4-wine', 'e4-heart',
        'e5-mamiferos', 'e5-titanic'
    ],
    'Algorithms': [
        'NaiveBayes', 'IBk', 'J48', 'RandomForest', 'LMT',
        'XGBoost', 'SVM', 'LGBM', 'Bagging', 'AdaBoost',
        'KStar', 'M5P', 'MLP', 'HC', 'E-M'
    ],
    'Performance (Error)': [
        [  # MNIST
            '30.34%', '3.09%', '10.67%', '3.51%', '5.70%',
            '2.05%', '2.61%', '2.26%', '5.07%', '11.74%',
            '89.80%', '47.31%', '0%', '44.96%', '88.65%'
        ],
        [  # Fashion-MNIST
            '36.72%', '14.35%', '18.27%', '11.92%', '13.62%',
            '8.58%', '9.47%', '9.50%', '12.20%', '19.00%',
            '90%', '46.78%', '0.52%', '51.45%', '90%'
        ],
        [  # e1-Spiral
            '29.125%', '0.38%', '2.25%', '1.75%', '2.375%',
            '3.12%', '1.88%', '3.12%', '0%', '4.37%',
            '5.51%', '1.43%', '0%', '49.75%', '72.50%'
        ],
        [  # e1-Android
            '8.1317%', '7.7285%', '4.7491%', '4.5475%', '4.3683%',
            '4.03%', '4.37%', '3.47%', '6.38%', '5.60%',
            '8.94%', '7.61%', '1.67%', '49.98%', '38.95%'
        ],
        [  # e2-andorinhas
            '8.1%', '6.65%', '5.60%', '4.90%', '4.60%',
            '4.00%', '3.75%', '4.25%', '3.50%', '4.75%',
            '4.36%', '2.85%', '3.92%', '48.60%', '49.25%'
        ],
        [  # e2-chinese
            '27.1589%', '12.8911%', '34.9186%', '7.5094%', '10.6383%',
            '7.50%', '6.25%', '5.63%', '16.25%', '34.38%',
            '0%', '1.21%', '0%', '87.36%', '78.22%'
        ],
        [  # e3-user
            '0%', '4.8571%', '0%', '0%', '0%', '0.1429%',
            '2.14%', '0%', '0%', '0%', '0%', '0%',
            '0%', '0.39%', '0%', '79.14%', '4.57%'
        ],
        [  # e3-ecommerce
            '11.37%', '11.15%', '2.39%', '2.07%', '2.42%',
            '0.90%', '8.80%', '0.70%', '10.35%', '2.85%',
            '0.02%', '7.56%', '3.96%', '22.11%', '41.49%'
        ],
        [  # e4-wine
            '44.96%', '35.21%', '38.59%', '29.89%', '39.65%',
            '48.95%', '56.56%', '46.85%', '43.94%', '50.99%',
            '39.23%', '50.82%', '36.51%', '57.34%', '77.98%'
        ],
        [  # e4-heart
            '43.51%', '46.61%', '35.82%', '37.20%', '35.88%',
            '45.71%', '34.51%', '45.73%', '44.16%', '46.1%',
            '46.15%', '64.18%', '49.22%', '88.1962%', '69.94%'
        ],
        [  # e5-mamiferos
            '0%', '0%', '0%', '0%', '0%', '0%',
            '0%', '0%', '0%', '0%', '0%', '0%',
            '1.57%', '0%', '0.20%', '31.20%', '44.80%'
        ],
        [  # e5-titanic
            '21.3244%', '22.7834%', '22.5589%', '28.3951%', '19.7531%',
            '16.76%', '27.56%', '14.85%', '7.48%', '10.79%',
            '27.18%', '61.62%', '26.76%', '38.16%', '38.50%'
        ]
    ]
}

# Convert the data into a DataFrame
datasets = data['Datasets']
algorithms = data['Algorithms']
performance_data = data['Performance (Error)']

# Create a list of dictionaries, one per dataset
rows = []
for dataset, performance in zip(datasets, performance_data):
    row = {'Dataset': dataset}
    row.update({alg: perf for alg, perf in zip(algorithms, performance)})
    rows.append(row)

# Create the DataFrame
df = pd.DataFrame(rows)

# Convert string percentages to floats
for alg in algorithms:
    df[alg] = df[alg].str.replace(',', '.').str.rstrip('%').astype(float) / 100

# Calculate the ranking of each algorithm for each dataset
rankings_matrix = df[algorithms].rank(axis=1, method='min', ascending=True)

# Format the results: "error (rank)" per cell
formatted_results = df[algorithms].copy()
for col in formatted_results.columns:
    formatted_results[col] = (
        formatted_results[col].round(3).astype(str)
        + " (" + rankings_matrix[col].astype(int).astype(str) + ")"
    )

# Add rows for the sum of ranks and the average of ranks
sum_ranks = rankings_matrix.sum().round(3).rename('Sum Ranks')
average_ranks = rankings_matrix.mean().round(3).rename('Average Ranks')

# Append the summary rows to the formatted DataFrame using concat
formatted_results = pd.concat(
    [formatted_results, sum_ranks.to_frame().T, average_ranks.to_frame().T]
)

# Add the 'Dataset' column to the formatted DataFrame
formatted_results.insert(
    0, 'Dataset', df['Dataset'].tolist() + ['Sum Ranks', 'Average Ranks']
)

# Display the table
print("Error Table (%) with Ranking:")
print(formatted_results)

# Save the formatted table as an image
fig, ax = plt.subplots(figsize=(14, 8))
ax.axis('tight')
ax.axis('off')
table = ax.table(cellText=formatted_results.values,
                 colLabels=formatted_results.columns,
                 cellLoc='center', loc='center')
table.auto_set_font_size(False)
table.set_fontsize(12)
table.scale(2.5, 2.5)
plt.subplots_adjust(left=0.2, bottom=0.2, right=0.8, top=1, wspace=0.2, hspace=0.2)
plt.savefig('table_with_rankings.png', format="png", bbox_inches="tight", dpi=300)
plt.show()
print("Table saved as 'table_with_rankings.png'")

# Perform the Friedman Test (one sample of ranks per algorithm)
friedman_stat, p_value = friedmanchisquare(*rankings_matrix.T.values)
print(f"Friedman test statistic: {friedman_stat}, p-value = {p_value}")

# Convert the error matrix into a NumPy array for the critical difference diagram
scores = df[algorithms].values
classifiers = df[algorithms].columns.tolist()
print("Algorithms:", classifiers)
print("Errors:", scores)

# Set the figure size before plotting; adjust as needed
plt.figure(figsize=(16, 12))

# Generate the critical difference diagram
plot_critical_difference(
    scores,
    classifiers,
    lower_better=True,   # error rates: lower is better
    test='wilcoxon',     # or 'nemenyi'
    correction='holm',   # or 'bonferroni' or None
)

# Get the current axes
ax = plt.gca()

# Adjust font size and rotation of x-axis labels
for label in ax.get_xticklabels():
    label.set_fontsize(14)
    label.set_rotation(45)
    label.set_horizontalalignment('right')

# Increase padding between labels and axis
ax.tick_params(axis='x', which='major', pad=20)

# Adjust margins to provide more space for labels
plt.subplots_adjust(bottom=0.35)

# Optionally adjust y-axis label font size
ax.tick_params(axis='y', labelsize=12)

# Save and display the plot
plt.savefig('critical_difference_diagram.png', format="png", bbox_inches="tight", dpi=300)
plt.show()
```
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import friedmanchisquare
from aeon.visualisation import plot_critical_difference

data = {
    'Datasets': [
        'MNIST', 'Fashion-MNIST',
        'e1-Spiral', 'e1-Android',
        'e2-andorinhas', 'e2-chinese',
        'e3-user', 'e3-ecommerce',
        'e4-wine', 'e4-heart',
        'e5-mamiferos', 'e5-titanic'
    ],
    'Algorithms': [
        'NaiveBayes', 'IBk', 'J48', 'RandomForest', 'LMT',
        'XGBoost', 'SVM', 'LGBM', 'Bagging', 'AdaBoost',
        'KStar', 'M5P', 'MLP', 'HC', 'E-M'
    ],
    'Performance (Error)': [
        [   # MNIST -ok
            '30.34%', '3.09%', '10.67%', '3.51%', '5.70%',
            '2.05%', '2.61%', '2.26%', '5.07%', '11.74%',
            '89.80%', '47.31%', '0%', '44.96%', '88.65%'
        ],
        [   # Fashion-MNIST -ok
            '36.72%', '14.35%', '18.27%', '11.92%', '13.62%',
            '8.58%', '9.47%', '9.50%', '12.20%', '19.00%',
            '90%', '46.78%', '0.52%', '51.45%', '90%'
        ],
        [   # e1-Spiral -ok
            '29.125%', '0.38%', '2.25%', '1.75%', '2.375%',
            '3.12%', '1.88%', '3.12%', '0%', '4.37%',
            '5.51%', '1.43%', '0%', '49.75%', '72.50%'
        ],
        [   # e1-Android -ok
            '8.1317%', '7.7285%', '4.7491%', '4.5475%', '4.3683%',
            '4.03%', '4.37%', '3.47%', '6.38%', '5.60%',
            '8.94%', '7.61%', '1.67%', '49.98%', '38.95%'
        ],
        [   # e2-andorinhas
            '8.1%', '6.65%', '5.60%', '4.90%', '4.60%',
            '4.00%', '3.75%', '4.25%', '3.50%', '4.75%',
            '4.36%', '2.85%', '3.92%', '48.60%', '49.25%'
        ],
        [   # e2-chinese -ok
            '27.1589%', '12.8911%', '34.9186%', '7.5094%', '10.6383%',
            '7.50%', '6.25%', '5.63%', '16.25%', '34.38%',
            '0%', '1.21%', '0%', '87.36%', '78.22%'
        ],
        [   # e3-user
            '0%', '4.8571%', '0%', '0%', '0%', '0.1429%',
            '2.14%', '0%', '0%', '0%', '0%', '0%',
            '0%', '0.39%', '0%', '79.14%', '4.57%'
        ],
        [   # e3-ecommerce -ok
            '11.37%', '11.15%', '2.39%', '2.07%', '2.42%',
            '0.90%', '8.80%', '0.70%', '10.35%', '2.85%',
            '0.02%', '7.56%', '3.96%', '22.11%', '41.49%'
        ],
        [   # e4-wine -ok
            '44.96%', '35.21%', '38.59%', '29.89%', '39.65%',
            '48.95%', '56.56%', '46.85%', '43.94%', '50.99%',
            '39.23%', '50.82%', '36.51%', '57.34%', '77.98%'
        ],
        [   # e4-heart -ok
            '43.51%', '46.61%', '35.82%', '37.20%', '35.88%',
            '45.71%', '34.51%', '45.73%', '44.16%', '46.1%',
            '46.15%', '64.18%', '49.22%', '88.1962%', '69.94%'
        ],
        [   # e5-mamiferos -ok
            '0%', '0%', '0%', '0%', '0%', '0%',
            '0%', '0%', '0%', '0%', '0%', '0%',
            '1.57%', '0%', '0.20%', '31.20%', '44.80%'
        ],
        [   # e5-titanic -ok
            '21.3244%', '22.7834%', '22.5589%', '28.3951%', '19.7531%',
            '16.76%', '27.56%', '14.85%', '7.48%', '10.79%',
            '27.18%', '61.62%', '26.76%', '38.16%', '38.50%'
        ]
    ]
}

# Convert the data into a DataFrame
datasets = data['Datasets']
algorithms = data['Algorithms']
performance_data = data['Performance (Error)']

# Create a list of dictionaries for each dataset
rows = []
for dataset, performance in zip(datasets, performance_data):
    row = {'Dataset': dataset}
    row.update({alg: perf for alg, perf in zip(algorithms, performance)})
    rows.append(row)

# Create the DataFrame
df = pd.DataFrame(rows)

# Convert string percentages to floats
for alg in algorithms:
    df[alg] = df[alg].str.replace(',', '.').str.rstrip('%').astype(float) / 100

# Calculate the ranking of each algorithm for each dataset
# (error rates: lower is better, so rank ascending)
rankings_matrix = df[algorithms].rank(axis=1, method='min', ascending=True)

# Format the results as "error (rank)"
formatted_results = df[algorithms].copy()
for col in formatted_results.columns:
    formatted_results[col] = formatted_results[col].round(3).astype(str) + " (" + rankings_matrix[col].astype(int).astype(str) + ")"

# Compute the sum of ranks and the average of ranks
sum_ranks = rankings_matrix.sum().round(3).rename('Sum Ranks')
average_ranks = rankings_matrix.mean().round(3).rename('Average Ranks')

# Add the rows to the formatted DataFrame using concat
formatted_results = pd.concat([formatted_results, sum_ranks.to_frame().T, average_ranks.to_frame().T])

# Add the 'Dataset' column to the formatted DataFrame
formatted_results.insert(0, 'Dataset', df['Dataset'].tolist() + ['Sum Ranks', 'Average Ranks'])

# Display the table
print("Error Table (%) with Ranking:")
print(formatted_results)

# Save the formatted table as an image
fig, ax = plt.subplots(figsize=(14, 8))
ax.axis('tight')
ax.axis('off')
table = ax.table(cellText=formatted_results.values, colLabels=formatted_results.columns, cellLoc='center', loc='center')
table.auto_set_font_size(False)
table.set_fontsize(12)
table.scale(2.5, 2.5)
plt.subplots_adjust(left=0.2, bottom=0.2, right=0.8, top=1, wspace=0.2, hspace=0.2)
plt.savefig('table_with_rankings.png', format="png", bbox_inches="tight", dpi=300)
plt.show()

print("Table saved as 'table_with_rankings.png'")

# Perform the Friedman Test (one sample per algorithm, across all datasets)
friedman_stat, p_value = friedmanchisquare(*rankings_matrix.T.values)
print(f"Friedman test statistic: {friedman_stat}, p-value = {p_value}")

# Convert the error matrix into a NumPy array for the critical difference diagram
scores = df[algorithms].values
classifiers = df[algorithms].columns.tolist()

print("Algorithms:", classifiers)
print("Errors:", scores)

# Set the figure size before plotting
plt.figure(figsize=(16, 12))  # Adjust the figure size as needed

# Generate the critical difference diagram
plot_critical_difference(
    scores,
    classifiers,
    lower_better=True,
    test='wilcoxon',     # or 'nemenyi'
    correction='holm',   # or 'bonferroni' or None
)

# Get the current axes
ax = plt.gca()

# Adjust font size and rotation of x-axis labels
for label in ax.get_xticklabels():
    label.set_fontsize(14)
    label.set_rotation(45)
    label.set_horizontalalignment('right')

# Increase padding between labels and axis
ax.tick_params(axis='x', which='major', pad=20)

# Adjust margins to provide more space for labels
plt.subplots_adjust(bottom=0.35)

# Optionally adjust y-axis label font size
ax.tick_params(axis='y', labelsize=12)

# Save and display the plot
plt.savefig('critical_difference_diagram.png', format="png", bbox_inches="tight", dpi=300)
plt.show()


Changing to accuracy instead of error

The code gist already includes a full working script that uses accuracy as the performance metric.

If you’re already using the error percentage in the data variable, you can add one extra line inside the conversion loop to turn each error value into accuracy:

```python
# Convert string percentages to floats
for alg in algorithms:
    df[alg] = df[alg].str.replace(",", ".").str.rstrip("%").astype(float) / 100
    # Convert to accuracy
    df[alg] = 1 - df[alg]
```


Then change the rankings_matrix computation so that rank is called with ascending=False, since a higher accuracy should now earn a better (lower) rank:

```python
# Calculate the ranking of each algorithm for each dataset (higher accuracy = better rank)
rankings_matrix = df[algorithms].rank(axis=1, method="min", ascending=False)
```
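To see what flipping ascending does, here is a tiny toy example (the column names and accuracy values are made up):

```python
import pandas as pd

# Two datasets (rows) evaluated on two hypothetical algorithms A and B
acc = pd.DataFrame({"A": [0.90, 0.80], "B": [0.85, 0.95]})

# With accuracy, the HIGHEST value in each row should get rank 1
ranks = acc.rank(axis=1, method="min", ascending=False)
print(ranks)
```

In the first row A wins (rank 1); in the second row B wins, so the ranks flip.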


Finally, when plotting the Critical Difference diagram, set the lower_better argument to False.

```python
plot_critical_difference(
    scores,
    classifiers,
    lower_better=False,  # False for accuracy (higher is better)
    test="wilcoxon",
    correction="holm",
)
```
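As noted earlier, the ranking is identical whichever metric you use. A quick way to convince yourself of this, with toy numbers chosen purely for illustration:

```python
import pandas as pd

# Hypothetical error rates for two algorithms on two datasets
err = pd.DataFrame({"A": [0.10, 0.30], "B": [0.20, 0.15]})
acc = 1 - err  # the corresponding accuracies

r_err = err.rank(axis=1, method="min", ascending=True)   # lower error is better
r_acc = acc.rank(axis=1, method="min", ascending=False)  # higher accuracy is better
print(r_err.equals(r_acc))  # the two rank matrices are identical
```

Because accuracy is just 1 minus the error rate, reversing the sort direction reproduces exactly the same ranks, so the Friedman statistic and the CD diagram are unchanged.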


Conclusion

I hope this guide helps someone out 🙂 If you have any suggestions or questions, feel free to leave a comment or reach out, and I’ll do my best to get back to you!

Original article: Comparing Machine Learning Algorithms Using Friedman Test and Critical Difference Diagrams in Python
