| http://shay.ecn.purdue.edu/~ee649 | |
| Class 3; Credit 3 | |
| Communications and Signal Processing Area |
| Professor: | Leah Jamieson | |
| Room: MSEE 242 | Login: lhj@purdue.edu | |
| Phone: 765-494-3653 | FAX: 765-494-3371 | |
| Office hours: | T 10:30-12:00, Th 2:00-3:00 | |
| TA: | Wen Wang | |
| Room: MSEE 292 | Login: wang28@purdue.edu | |
| Phone: 765-494-0434 | ||
| Office hours: | W 10:30-noon, F 1:00-2:30 | |
| Class: | TTh 9:00-10:15, Room EE 226 |
Prerequisites:
An undergraduate or graduate course in digital signal processing
(e.g., Purdue EE 538, Digital Signal Processing I) and
programming experience in C, C++, Matlab, or Fortran.
Course Description: The course covers the main aspects
of speech processing by computer. Topics include: models of
the vocal tract; identification and extraction of speech features;
speech compression;
the recognition of speech and speakers by computer;
and control of speech synthesizers.
In the required projects, students will implement speech analysis
software and build a small speech recognition or compression system.
Text:
Additional References:
| Grading: | 3 tests (20% each) | 60% |
| Projects: | speech characteristics | 5% |
| | speech analysis techniques | 5% |
| | speech compression techniques | 10% |
| | speech recognition or coding | 20% |
Notes on Projects: Computer projects may be written in
Matlab, C, C++, or Fortran. You are welcome to use software packages
(e.g., Matlab, ESPS, Mathematica, SoundEdit) if you wish, but
these are not needed for the projects. Speech data and Fortran
and C source code for some signal processing subroutines will
be provided on ECN machines.
If you work on a PC, the data files will require between
50 and 150 MB of disk space. You will also need audio
playback capability and internet access to download the speech data
files.
| Outline: | ||
| 1. | Speech production and representation: | |
| articulation, hearing, classification of phonetic units, | ||
| digital representations of speech, | ||
| short-time Fourier analysis | ||
| 2. | Speech analysis: | |
| linear predictive coding, | ||
| cepstrum analysis, | ||
| distortion measures, | ||
| vector quantization, | ||
| pitch determination and excitation identification | ||
| 3. | Speech compression/coding: | |
| code-excited linear prediction (CELP), | ||
| MPEG coding, | ||
| wavelet-based coding | ||
| 4. | Automatic recognition of speech: | |
| dynamic time warping, | ||
| hidden Markov models | ||
| 5. | Speech synthesis: speech synthesizers, text-to-speech systems | |
| Tests | ||