Slim Eddie
The Assignment: To record, plot, edit, and process the human voice. The use of different plotting methods and processing techniques made this lab quite different from the previous ones.




Voice Processing

Part 1a

Plot the time-waveform of your own voice speaking your name.

Plot 1a



Analysis:

When it came to this part of the project, plotting the time-waveform of the audio signal was trivial. It was recording the audio that was a bit difficult to accomplish. With some help from online sources, though, it came fairly easily.

Part 1b

Plot the Wideband Spectrogram of the waveform from Part 1a.

Plot 1b



Analysis:

The wideband spectrogram uses a short analysis window, which gives it higher time resolution but coarser frequency resolution. This makes it well suited to exposing the formant structure of a voice signal, while the individual harmonics are smeared together.

Part 1c

Plot the Narrowband Spectrogram of the waveform from Part 1a.

Plot 1c



Analysis:

In contrast to the wideband spectrogram, the narrowband spectrogram uses a longer analysis window, which gives it higher frequency resolution and makes it more accurate with regard to the harmonic content of a voice signal. Because each window spans more time, rapid changes in articulation are blurred, so the formants are not as pronounced.
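The trade-off between the wideband and narrowband views comes down to window length. The sketch below, in Python rather than the MATLAB used for the lab, estimates the time span and approximate frequency resolution of a window. The 16 kHz sample rate and the 5 ms and 40 ms window lengths are assumed values for illustration, not the settings used for the plots above.

```python
def window_resolution(window_ms, fs):
    """Return (samples, time span in ms, approximate frequency resolution in Hz)."""
    n = round(window_ms * fs / 1000)  # window length in samples
    freq_res = fs / n                 # roughly 1 / window duration
    return n, 1000 * n / fs, freq_res


fs = 16000  # assumed sample rate in Hz
for label, ms in (("wideband", 5), ("narrowband", 40)):
    n, span_ms, df = window_resolution(ms, fs)
    print(f"{label}: {n} samples, {span_ms:.1f} ms, about {df:.0f} Hz")
```

With these assumed numbers the short window resolves about 200 Hz, too coarse to separate voice harmonics spaced roughly 100 to 200 Hz apart, while the long window resolves about 25 Hz at the cost of averaging over 40 ms of speech.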

Part 1d

Select two different 30 ms vowel segments in the recorded utterance and plot their magnitude (dB vs. Hz) and phase (radians vs. Hz) spectra, from 0 Hz to fs/2 Hz.

Vowel #1 ('aeh')

Plot 1d



Analysis:

Becoming accustomed to scanning through spectrograms for particular phonemes took some time, but after familiarizing myself with the gamut of sounds in human speech, finding a vowel in my name became simple.

Vowel #2 ('ih')

Plot 1d



Analysis:

The same method for extracting the first vowel was used to obtain the second one.
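As a rough illustration of the magnitude and phase spectra asked for in Part 1d, the Python sketch below computes both for a single 30 ms segment, from 0 Hz to fs/2. It is not the MATLAB code used for the plots. The 8 kHz sample rate, the Hann window, and the 500 Hz sine standing in for a vowel segment are all assumptions.

```python
import cmath
import math


def spectrum(segment, fs):
    """Return (freqs_hz, magnitude_db, phase_rad) for bins 0..fs/2."""
    n = len(segment)
    # A Hann window reduces leakage from cutting the segment out of the utterance.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(segment)]
    freqs, mag_db, phase = [], [], []
    for k in range(n // 2 + 1):
        # Direct DFT of bin k; slow but adequate for a 30 ms segment.
        xk = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                 for i, x in enumerate(windowed))
        freqs.append(k * fs / n)
        mag_db.append(20 * math.log10(abs(xk) + 1e-12))  # offset avoids log(0)
        phase.append(cmath.phase(xk))                    # radians in [-pi, pi]
    return freqs, mag_db, phase


fs = 8000                # assumed sample rate in Hz
n = round(0.030 * fs)    # 30 ms segment
# Stand-in for a vowel segment: a 500 Hz sine (hypothetical test signal).
segment = [math.sin(2 * math.pi * 500 * i / fs) for i in range(n)]
freqs, mag_db, phase = spectrum(segment, fs)
peak = mag_db.index(max(mag_db))
print(f"strongest bin at {freqs[peak]:.0f} Hz")
```

For a real vowel the magnitude plot would show several peaks, one per harmonic, shaped by the formants.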

Part 2a

Generate a 5 sec chirp consisting of a tone swept from 20 Hz to 20 kHz at constant rate and constant amplitude.

Plot 2a



Analysis:

The configuration of the 'chirp' is about as exciting as the graph plotting its amplitude over time. The chirp command was used to generate the frequency sweep, and it only presented an issue until I realized that the sampling rate limits the highest frequency that can be produced. Once I set the sample rate so that its Nyquist frequency (half the sample rate) was above the highest tone in the sweep, the chirp command worked perfectly.
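The Nyquist constraint described above can be sketched as follows. This is a plain Python stand-in for a linear sweep, not the chirp command itself. The function name `linear_chirp` and the 44.1 kHz sample rate are assumptions.

```python
import math


def linear_chirp(f0, f1, duration, fs):
    """Constant-amplitude tone swept linearly from f0 to f1 Hz over duration seconds."""
    if max(f0, f1) >= fs / 2:
        raise ValueError("sweep exceeds the Nyquist frequency fs / 2")
    n = round(duration * fs)
    rate = (f1 - f0) / duration  # sweep rate in Hz per second
    # The phase is the integral of the instantaneous frequency f0 + rate * t.
    return [math.cos(2 * math.pi * (f0 * t + 0.5 * rate * t * t))
            for t in (i / fs for i in range(n))]


fs = 44100  # assumed sample rate; its Nyquist frequency is 22050 Hz
sweep = linear_chirp(20, 20000, 5, fs)
print(len(sweep), "samples")
```

Calling `linear_chirp(20, 20000, 5, 32000)` raises an error, because 32 kHz sampling can only represent tones below 16 kHz.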

Part 2b

Plot the envelope required to maintain constant loudness level for this chirp.

Plot 2b



Analysis:

Obtaining this curve was made possible thanks to Jeff Tackett; his iso226.m file, released via Matlab Central (through The MathWorks Inc.), provides a 29-point vector which accurately mimics the Fletcher-Munson curve, or equal loudness contour. Interpolating this vector to match the size of the frequency sweep signal allowed for its use as an envelope to modify the audio clip. Because this signal is a linear sweep of frequency through time, each instant corresponds to exactly one frequency, so the interpolated loudness curve could be applied as a sample-by-sample coefficient on the signal in the time domain.
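The interpolation step can be sketched in Python as below. The four curve points are HYPOTHETICAL placeholders chosen only to make the example run. They are not ISO 226 values and not the 29-point vector from iso226.m. The sketch also assumes the curve is already expressed as linear gain. The function names are illustrative.

```python
import math
from bisect import bisect_right

# HYPOTHETICAL placeholder points (frequency in Hz, linear gain).
# These are NOT ISO 226 values; the lab used the 29-point curve from iso226.m.
CURVE = [(20.0, 1.00), (1000.0, 0.30), (4000.0, 0.20), (20000.0, 0.80)]


def gain_at(freq_hz, curve=CURVE):
    """Linearly interpolate the curve at freq_hz, clamping at both ends."""
    freqs = [f for f, _ in curve]
    if freq_hz <= freqs[0]:
        return curve[0][1]
    if freq_hz >= freqs[-1]:
        return curve[-1][1]
    j = bisect_right(freqs, freq_hz)
    (fa, ga), (fb, gb) = curve[j - 1], curve[j]
    return ga + (gb - ga) * (freq_hz - fa) / (fb - fa)


def sweep_envelope(f0, f1, duration, fs):
    """Per-sample gain for a linear sweep whose frequency at time t is
    f0 + (f1 - f0) * t / duration."""
    n = round(duration * fs)
    return [gain_at(f0 + (f1 - f0) * (i / fs) / duration) for i in range(n)]


fs, f0, f1, duration = 44100, 20.0, 20000.0, 5.0  # assumed parameters
envelope = sweep_envelope(f0, f1, duration, fs)
rate = (f1 - f0) / duration
sweep = [math.cos(2 * math.pi * (f0 * (i / fs) + 0.5 * rate * (i / fs) ** 2))
         for i in range(len(envelope))]
# Point-by-point multiplication applies the envelope in the time domain.
shaped = [g * x for g, x in zip(envelope, sweep)]
```

This shortcut only works because the sweep is linear. For ordinary audio, where many frequencies are present at once, the gain has to be applied per frequency instead.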

Part 2c (Time Domain)

Generate the equal loudness version of this chirp.

Plot 2c



Analysis:

Creating the equal loudness version of the chirp, as stated in the analysis from the previous part, was possible entirely within the time domain. A point-by-point multiplication of the sweep by the interpolated envelope was carried out to scale the frequency sweep to be equally loud across all frequencies.

Part 2c (Frequency Domain)

Part 2c - Sweep



Analysis:

Frequency domain conversion was completed, but for some reason, the sequence of filtering processes used yielded a very noisy output signal. The method used was a windowing scheme with a zero-padded, 512-point FFT. This FFT was multiplied by an interpolated extraction of the Equal Loudness Curve from the iso226.m file, and then returned to the time domain by taking the real portion of the ifft of the modified spectrum. See the code for more details on what was attempted.
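For comparison, here is a minimal Python sketch of block-wise frequency-domain gain using Hann-windowed frames with 50% overlap-add. It is not the code used in this lab. The 256-sample frame, the overlap scheme, and the mirroring of gains onto the negative-frequency bins are assumptions. Two common sources of noise in this kind of processing are frames stitched together without overlap and a gain curve that is not symmetric about fs/2. Whether either applies to the original code is not known.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. Inverse is unscaled."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + tw, even[k] - tw
    return out


def filter_overlap_add(signal, gain_for_bin, frame=256, nfft=512):
    """Apply a real per-bin gain using Hann frames with 50% overlap-add."""
    hop = frame // 2
    # Periodic Hann window: copies shifted by hop sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame) for i in range(frame)]
    # Mirror the gains onto negative-frequency bins so each frame stays real.
    gains = [gain_for_bin(min(k, nfft - k)) for k in range(nfft)]
    out = [0.0] * (len(signal) + nfft)
    for start in range(0, len(signal) - frame + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame)]
        spec = fft(chunk + [0.0] * (nfft - frame))  # zero-pad to nfft points
        spec = [s * g for s, g in zip(spec, gains)]
        block = fft(spec, inverse=True)
        for i in range(nfft):
            out[start + i] += block[i].real / nfft  # scale the inverse FFT
    return out[:len(signal)]


# Hypothetical test tone; a gain of 1.0 on every bin should return the input
# unchanged wherever two frames overlap (everything except the first and last hop).
tone = [math.sin(2 * math.pi * 5 * i / 2048) for i in range(2048)]
unchanged = filter_overlap_add(tone, lambda k: 1.0)
```

Passing a unity gain first is a useful check. If the output differs from the input in the overlapped region, the framing itself is adding artifacts before any loudness curve is involved.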

Part 2c - Song

This is the plot of the audio clip (brazil.wav) from Bela Fleck. Equal Loudness was applied to this clip in the frequency domain, but the conversion was not properly executed. Despite all of my experimenting, I could not get the output of my code to play without the terrible noise floor that is audible in each of the clips converted in the frequency domain. The methods used within the code seem to be sound, yet there are still notable artifacts within the output clips. They do, however, seem to be equally loud!

Sam Drazin © 2017
