Mel spectrogram and MFCC

Algorithm flow

  • Read the 220 Hz sample audio
    import audioflux as af

    audio_path = af.utils.sample_path('220')
    audio_arr, sr = af.read(audio_path)

  • Extract the mel spectrogram and convert it to dB
    low_fre = 0
    spec_arr, fre_band_arr = af.mel_spectrogram(audio_arr, samplate=sr, low_fre=low_fre)
    spec_dB_arr = af.utils.power_to_db(spec_arr)
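
To make the dB step less of a black box, here is a minimal NumPy sketch of a power-to-decibel conversion. It is not audioflux's implementation: the function name `power_to_db_sketch`, the reference value `ref`, and the floor `amin` are assumptions chosen for illustration, and `af.utils.power_to_db` may use different defaults.

```python
import numpy as np

def power_to_db_sketch(power_arr, ref=1.0, amin=1e-10):
    """Convert power values to decibels: 10 * log10(power / ref).

    Hypothetical helper for illustration; not part of audioflux.
    """
    power_arr = np.asarray(power_arr, dtype=np.float64)
    # Clamp tiny values so log10 never receives zero.
    clamped = np.maximum(power_arr, amin)
    return 10.0 * np.log10(clamped / ref)

# Tiny hand-made "spectrogram" (2 bands x 2 frames), not real audio.
demo = np.array([[1.0, 10.0],
                 [100.0, 0.0]])
demo_db = power_to_db_sketch(demo)
```

Each tenfold increase in power adds 10 dB, and the zero entry is clamped to the floor instead of producing negative infinity.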

  • Show mel spectrogram plot
    import matplotlib.pyplot as plt
    from audioflux.display import fill_spec
    import numpy as np

    # calculate x/y-coords
    audio_len = audio_arr.shape[0]
    x_coords = np.linspace(0, audio_len/sr, spec_arr.shape[1] + 1)
    y_coords = np.insert(fre_band_arr, 0, low_fre)

    fig, ax = plt.subplots()
    img = fill_spec(spec_dB_arr, axes=ax,
                    x_coords=x_coords,
                    y_coords=y_coords,
                    x_axis='time', y_axis='log',
                    title='Mel Spectrogram')
    fig.colorbar(img, ax=ax, format="%+2.0f dB")
    plt.show()  # needed when running as a script rather than in a notebook
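
The `+ 1` in `x_coords` and the `np.insert` in `y_coords` both follow from the same idea: a grid of N cells has N + 1 boundaries. The sketch below shows this with made-up numbers. The sample rate, signal length, frame count, and band values are hypothetical, and it assumes that `fill_spec` expects cell edges rather than cell centers, which the original code implies but does not state.

```python
import numpy as np

# Hypothetical values for illustration only.
sr = 32000            # sample rate in Hz
audio_len = 64000     # number of samples (2 seconds)
n_frames = 8          # stands in for spec_arr.shape[1]

# n_frames columns need n_frames + 1 time boundaries.
x_edges = np.linspace(0, audio_len / sr, n_frames + 1)
frame_width = np.diff(x_edges)

# Prepending the lowest frequency gives one more edge than there are bands.
low_fre = 0
bands = np.array([100.0, 200.0, 400.0])   # stands in for fre_band_arr
y_edges = np.insert(bands, 0, low_fre)
```

Here `x_edges` has nine entries spanning 0 to 2 seconds, and `y_edges` has four entries for three bands.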

  • Extract MFCC data
    cc_arr, _ = af.mfcc(audio_arr, samplate=sr)
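
For intuition about how MFCCs relate to the mel spectrogram, the textbook recipe is to take the log of the mel-band energies and apply a discrete cosine transform (DCT-II) across the bands. The NumPy sketch below follows that recipe on synthetic data. The helper names, the 13-coefficient default, the dB-style log scaling, and the unnormalised DCT are all assumptions; `af.mfcc` may differ in each of these details.

```python
import numpy as np

def dct2_matrix(n_out, n_in):
    """Unnormalised DCT-II basis; each row is a cosine pattern over n_in bands."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))

def mfcc_sketch(mel_power, n_coef=13, amin=1e-10):
    """Map (n_mels, n_frames) mel power to (n_coef, n_frames) coefficients.

    Hypothetical helper for illustration; not audioflux's algorithm.
    """
    log_mel = 10.0 * np.log10(np.maximum(mel_power, amin))
    return dct2_matrix(n_coef, log_mel.shape[0]) @ log_mel

# Synthetic stand-in for a mel spectrogram (40 bands x 5 frames), not real audio.
rng = np.random.default_rng(0)
fake_mel = rng.random((40, 5)) + 1e-3
fake_cc = mfcc_sketch(fake_mel)

# A perfectly flat spectrum has energy only in coefficient 0.
flat_cc = mfcc_sketch(np.full((40, 3), 10.0))
```

Coefficient 0 tracks the overall log energy, while higher coefficients describe progressively finer ripples in the spectral envelope, which is why a flat spectrum leaves them at zero.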

  • Show MFCC plot
    # calculate x-coords
    audio_len = audio_arr.shape[0]
    x_coords = np.linspace(0, audio_len/sr, cc_arr.shape[1] + 1)

    fig, ax = plt.subplots()
    img = fill_spec(cc_arr, axes=ax,
                    x_coords=x_coords, x_axis='time',
                    title='MFCC')
    fig.colorbar(img, ax=ax)
    plt.show()  # needed when running as a script rather than in a notebook

