EE 649: SPEECH PROCESSING BY COMPUTER - Spring 2002
http://shay.ecn.purdue.edu/~ee649
Class 3; Credit 3
Communications and Signal Processing Area

Professor: Leah Jamieson
Room: MSEE 242 Login: lhj@purdue.edu
Phone: 765-494-3653 FAX: 765-494-3371
Office hours: T 10:30-12:00, Th 2:00-3:00
TA: Wen Wang
Room: MSEE 292 Login: wang28@purdue.edu
Phone: 765-494-0434
Office hours: W 10:30-noon, F 1:00-2:30
Class: TTh 9:00-10:15, Room EE 226


Prerequisites: Undergraduate or graduate course in digital signal processing - e.g., Purdue EE 538 (Digital Signal Processing I); programming experience in C, C++, Matlab, or Fortran.

Course Description: The course covers the main aspects of speech processing by computer. Topics include: models of the vocal tract; identification and extraction of speech features; speech compression; the recognition of speech and speakers by computer; and control of speech synthesizers. In the required projects, students will implement speech analysis software and build a small speech recognition or compression system.

Text:

  1. L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice Hall, 1995, ISBN 0-13-015157-2

  2. Article reprints

Additional References:

  1. L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, 1978, ISBN 0-13-213603-1.

  2. J. L. Flanagan, Speech Analysis, Synthesis and Perception, second edition, Springer-Verlag, 1972.

    Grading:
      3 tests (20% each)                     60%
      Projects:
        speech characteristics                5%
        speech analysis techniques            5%
        speech compression techniques        10%
        speech recognition or coding         20%

    Homework and solutions will be distributed; homework will not be graded.

    Notes on Projects: Computer projects may be written in Matlab, C, C++, or Fortran. You are welcome to use software packages (e.g., Matlab, ESPS, Mathematica, SoundEdit) if you wish, but these are not needed for the projects. Speech data and Fortran and C source code for some signal processing subroutines will be provided on ECN machines. On a PC, the data files will require between 50 and 150 MB of disk space. Also required: audio playback capability and internet access to download the speech data files.






    Outline:                                                          Weeks
    1. Speech production and representation:                          2
       articulation, hearing, classification of phonetic units,
       digital representations of speech,
       short-time Fourier analysis
    2. Speech analysis:                                               5
       linear predictive coding,
       cepstrum analysis,
       distortion measures,
       vector quantization,
       pitch determination and excitation identification
    3. Speech compression/coding:                                     2
       code-excited linear prediction (CELP),
       MPEG coding,
       wavelet-based coding
    4. Automatic recognition of speech:                               4
       dynamic time warping,
       hidden Markov models
    5. Speech synthesis: speech synthesizers, text-to-speech systems  1
    Tests                                                             1