The Assignment: To record, plot, edit, and process the human voice. The variety of
plotting methods and processing techniques involved made this lab quite different from the previous ones.
Plot the time-waveform of your own voice speaking your name.
When it came to this part of the project, plotting the time-waveform of the
audio signal was trivial; recording the audio was the harder step to accomplish. With
some help from online sources, though, it came together fairly easily.
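A minimal sketch of this step in Python rather than MATLAB (the filename "myname.wav" and the function name are illustrative, not from the original code): read the recorded clip and plot its samples against a time axis in seconds.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt
from scipy.io import wavfile

def plot_waveform(path, out_png="waveform.png"):
    fs, x = wavfile.read(path)      # sample rate and samples
    if x.ndim > 1:                  # fold stereo to mono
        x = x.mean(axis=1)
    t = np.arange(len(x)) / fs      # time axis in seconds
    plt.figure()
    plt.plot(t, x)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Time-waveform of recorded name")
    plt.savefig(out_png)
    return t, x

# plot_waveform("myname.wav")  # hypothetical recording
```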
Plot the Wideband Spectrogram of the waveform from Part 1a.
The wideband spectrogram of a signal has higher time-resolution, making it more accurate
with regard to exposing the formant structure of a voice signal. Because each frame spans only a short
stretch of the signal (a shorter window), however, the harmonic content is smeared.
Plot the Narrowband Spectrogram of the waveform from Part 1a.
Inverse to the wideband spectrogram, the narrowband spectrogram has higher
frequency-resolution, making it more accurate with regard to the harmonic content of a voice signal. Because each
frame spans a longer stretch of the signal (a larger window), the articulation of the speech (i.e., the formants) is not as clearly resolved.
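The wideband/narrowband trade-off above can be sketched in Python (scipy standing in for MATLAB's spectrogram; the two-tone test signal is a stand-in for the voice recording): the same signal analyzed with a short (~5 ms) and a long (~40 ms) window.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 8000
t = np.arange(fs) / fs  # 1 s test signal with two harmonics
x = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 600 * t)

# Wideband: short window -> many time frames, coarse frequency bins
f_wb, t_wb, S_wb = spectrogram(x, fs, nperseg=int(0.005 * fs))

# Narrowband: long window -> fine frequency bins, fewer time frames
f_nb, t_nb, S_nb = spectrogram(x, fs, nperseg=int(0.040 * fs))
```

The only knob that changes is the window length (`nperseg`); the frequency-bin spacing is fs divided by that length, which is why the two views trade off as described.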
Select two different 30 ms vowel segments in the recorded utterance and plot their magnitude
(dB vs. Hz) and phase (radians vs. Hz) spectra, from 0 Hz to fs/2 Hz.
Vowel #1 ('aeh')
Becoming accustomed to scanning through spectrograms looking for particular
phonemes took some time, but after a bit of familiarizing with the gamut of different parts of human speech,
finding a vowel from my name became simple.
Vowel #2 ('ih')
The same method for extracting the first vowel was used to obtain the second one.
Generate a 5 sec chirp consisting of a tone swept from 20 Hz to
20 kHz at a constant rate and constant amplitude.
The configuration of the chirp is about as exciting as the graph plotting its
amplitude over time. The chirp command was used to generate the frequency sweep, and it
only presented an issue before I realized that the sampling rate plays a vital role in the highest
frequency a tone can reproduce. Once I chose a sample rate whose Nyquist frequency was higher than the
highest tone to be played, the chirp command worked perfectly.
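The sweep and the Nyquist fix above look like this in Python (scipy.signal.chirp standing in for MATLAB's chirp): with a 20 kHz top frequency, the sample rate must exceed 40 kHz, so fs = 44100 works.

```python
import numpy as np
from scipy.signal import chirp

fs = 44100                    # Nyquist = 22050 Hz > 20 kHz target
t = np.arange(5 * fs) / fs    # exactly 5 s of samples
x = chirp(t, f0=20, f1=20000, t1=5, method="linear")
```

Had fs been, say, 16000, everything above 8 kHz would have aliased rather than sounded, which is the issue described above.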
Plot the envelope required to maintain constant loudness level for this chirp.
Obtaining this curve was made possible thanks to Jeff Tackett; his release of the
iso226.m file featured via
MATLAB Central (through MathWorks, Inc.) provides a
29-point vector which closely follows the Fletcher-Munson curve, or equal-loudness contour. Interpolating this
vector to match the length of the frequency-sweep signal allowed its use as an envelope to modify the audio clip.
Because this signal happened to be a linear sweep of frequency through time, the interpolated
loudness curve could be applied as a coefficient to the signal directly in the time domain.
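The interpolation step can be sketched as follows (a Python stand-in: the coarse contour table below is hypothetical, NOT the real iso226.m data, and only illustrates the mechanics). Because the sweep is linear, the time axis maps directly onto the frequency axis, so interpolating the contour over the instantaneous frequency yields a per-sample envelope.

```python
import numpy as np

fs, dur = 44100, 5.0
n = int(fs * dur)

# Hypothetical (freq_Hz, required_SPL_dB) points -- NOT real ISO 226 data
contour_f = np.array([20, 100, 1000, 4000, 10000, 20000], float)
contour_db = np.array([75, 40, 20, 15, 25, 45], float)

# Linear sweep: instantaneous frequency grows linearly with time
inst_f = np.linspace(20, 20000, n)
env_db = np.interp(inst_f, contour_f, contour_db)

# Convert dB to a linear gain, normalized so the peak gain is 1
gain = 10 ** (env_db / 20)
gain /= gain.max()
```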
Part 2c (Time Domain)
Generate the equal loudness version of this chirp.
Creating the equal-loudness version of the chirp, as stated in the analysis of the
previous part, was entirely possible within the time domain. A point-by-point multiplication scaled
the frequency sweep so that it would be equally loud across all frequencies.
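The time-domain version reduces to a sample-by-sample product of the sweep with the interpolated envelope; a short Python sketch (the linear `env` here is an illustrative stand-in for the interpolated loudness curve):

```python
import numpy as np
from scipy.signal import chirp

fs = 44100
t = np.arange(5 * fs) / fs
sweep = chirp(t, f0=20, f1=20000, t1=5)

# Stand-in envelope: any gain curve of the same length works here
env = np.linspace(1.0, 0.2, len(sweep))

equalized = sweep * env   # point-by-point multiplication
```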
Part 2c (Frequency Domain)
Frequency-domain conversion was completed, but for some reason the sequence of filtering operations used yielded
a very noisy output signal. The method used was a windowing scheme with a zero-padded, 512-point FFT.
Each FFT was multiplied by an interpolated extraction of the equal-loudness curve from the iso226.m file, and then
returned to the time domain by taking the real part of the ifft of the modified frame. See the code
for more details on what was attempted.
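One common cause of this kind of frame-boundary noise is reassembling the processed frames without overlap-add and window normalization; whether that was the culprit here is uncertain. As a hedged reconstruction in Python (not the original MATLAB code; the function name and the per-bin `gain` argument are illustrative), an overlap-add version of the same scheme looks like this:

```python
import numpy as np

def filter_frames(x, gain, nfft=512, hop=128, frame=256):
    """Window, zero-padded 512-pt FFT, per-bin gain, overlap-add."""
    win = np.hanning(frame)
    y = np.zeros(len(x) + nfft)
    norm = np.zeros(len(x) + nfft)
    for start in range(0, len(x) - frame, hop):
        seg = x[start:start + frame] * win
        X = np.fft.rfft(seg, nfft)        # zero-padded 512-pt FFT
        X *= gain                         # apply loudness gain per bin
        y[start:start + nfft] += np.fft.irfft(X, nfft)
        norm[start:start + frame] += win  # track window overlap
    norm[norm < 1e-8] = 1.0               # avoid divide-by-zero at edges
    return y[:len(x)] / norm[:len(x)]
```

With a unity gain this scheme reconstructs the input (away from the edges), which is a useful sanity check before applying the real loudness curve.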
This is the plot of the audio clip (brazil.wav) from Bela Fleck. Equal loudness was
applied to this clip in the frequency domain; however, the conversion was not properly executed. Despite all
of my experimenting, I could not get the output of my code to play without the terrible noise floor that is
audible in each of the clips converted in the frequency domain. The methods used within the code seem to be sound,
yet there are still notable artifacts within the output clips. They do, however, seem to be equally loud.