Algorithm flow
- Read the 220 Hz sample audio data
    import audioflux as af

    audio_path = af.utils.sample_path('220')
    audio_arr, sr = af.read(audio_path)
- Extract the mel spectrogram and convert it to dB
    low_fre = 0
    spec_arr, fre_band_arr = af.mel_spectrogram(audio_arr, samplate=sr, low_fre=low_fre)
    spec_dB_arr = af.utils.power_to_db(spec_arr)
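To make the dB step concrete, here is a minimal NumPy sketch of the usual power-to-decibel convention, 10 * log10(power / ref). The function name `power_to_db_sketch` and the `ref` and `amin` parameters are assumptions made for illustration; how `af.utils.power_to_db` handles the reference level or clipping is not stated here and may differ.

```python
import numpy as np

def power_to_db_sketch(power, ref=1.0, amin=1e-10):
    """Hypothetical helper: convert power values to decibels."""
    power = np.asarray(power, dtype=float)
    # Floor very small values so log10 never receives zero.
    floored = np.maximum(power, amin)
    return 10.0 * np.log10(floored / ref)

# Small hand-made "spectrogram" (not real audio data).
demo = np.array([[1.0, 10.0], [100.0, 0.0]])
demo_db = power_to_db_sketch(demo)
print(demo_db)
```

Each factor of 10 in power adds 10 dB, and the zero entry is floored to `amin` rather than becoming negative infinity.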
- Show mel spectrogram plot
    import matplotlib.pyplot as plt
    from audioflux.display import fill_spec
    import numpy as np

    # calculate x/y-coords
    audio_len = audio_arr.shape[0]
    x_coords = np.linspace(0, audio_len/sr, spec_arr.shape[1] + 1)
    y_coords = np.insert(fre_band_arr, 0, low_fre)

    fig, ax = plt.subplots()
    img = fill_spec(spec_dB_arr, axes=ax,
                    x_coords=x_coords,
                    y_coords=y_coords,
                    x_axis='time', y_axis='log',
                    title='Mel Spectrogram')
    fig.colorbar(img, ax=ax, format="%+2.0f dB")
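The `+ 1` in the `np.linspace` call above is easy to miss. A plausible reading, stated here as an assumption, is that the coordinates are cell edges rather than cell centres, so a spectrogram with N time frames needs N + 1 x-values (the same reason `low_fre` is prepended to the band frequencies for the y-axis). The sketch below uses made-up numbers, with names ending in `_demo`, to show the arithmetic.

```python
import numpy as np

# Hypothetical values chosen for easy arithmetic, not taken from the sample file.
sr_demo = 32000          # samples per second
audio_len_demo = 64000   # 2 seconds of audio
n_frames_demo = 8        # number of spectrogram columns

# N frames are bounded by N + 1 edges spanning the full duration.
edges = np.linspace(0, audio_len_demo / sr_demo, n_frames_demo + 1)
widths = np.diff(edges)  # duration covered by each frame cell

print(edges)
print(widths)
```

Each cell here covers 0.25 s, and the last edge coincides with the total duration of 2 s.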
- Extract MFCC data
    cc_arr, _ = af.mfcc(audio_arr, samplate=sr)
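For intuition about what the coefficients represent: MFCCs are conventionally obtained by applying a discrete cosine transform (DCT-II) to log mel-band energies and keeping the first few coefficients. The sketch below shows only that last step in plain NumPy on random data. The helper `dct2_sketch`, the 40-band input, and the choice of 13 coefficients are illustrative assumptions, and the scaling or defaults used inside `af.mfcc` may well differ.

```python
import numpy as np

def dct2_sketch(log_mel, n_coeffs):
    """Hypothetical helper: unnormalised DCT-II along the band axis (axis 0)."""
    n_bands = log_mel.shape[0]
    n = np.arange(n_bands)               # band index
    k = np.arange(n_coeffs)[:, None]     # coefficient index, as a column
    # Cosine basis of shape (n_coeffs, n_bands).
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_bands))
    return basis @ log_mel

# Random stand-in for a log mel spectrogram: 40 bands x 5 frames.
rng = np.random.default_rng(0)
log_mel_demo = rng.normal(size=(40, 5))
cc_demo = dct2_sketch(log_mel_demo, 13)

# A spectrally flat frame has energy only in coefficient 0.
flat_demo = dct2_sketch(np.ones((40, 1)), 13)
print(cc_demo.shape)
```

Coefficient 0 is the sum of the log energies in a frame, while the higher coefficients describe progressively finer detail of the spectral envelope.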
- Show MFCC plot
    # calculate x-coords
    audio_len = audio_arr.shape[0]
    x_coords = np.linspace(0, audio_len/sr, cc_arr.shape[1] + 1)

    fig, ax = plt.subplots()
    img = fill_spec(cc_arr, axes=ax,
                    x_coords=x_coords, x_axis='time',
                    title='MFCC')
    fig.colorbar(img, ax=ax)