The Wavelet Digest
   -> Volume 3, Issue 7

Question: Wavelets in Speech Recognition
Tilo Schuerer

Posted: Mon Dec 02, 2002 1:01 pm
Subject: Question: Wavelets in Speech Recognition

First test: WAVELETS in Speech Recognition

Recently I have read a lot about the use of wavelets in signal compression.
My idea is that if a signal can be compressed, then the coefficients kept
after compression should also be usable for classification, because they must
represent the essential information in the original signal.

I ran some tests in the field of speech recognition in order to
verify that. These are my results after 2 weeks of work:

o Following the paper by Neil Getz

"A Fast Discrete Periodic Wavelet Transform"
Memorandum No. UCB/ERL M92/138

I ran some tests on the compression of speech signals. I tried to find out
which levels of the detail coefficients computed by the FPWT
are necessary when compressing and decompressing speech data.
With a window of 256 points and no overlap, the 2nd, 3rd and 4th
level detail coefficients (56 values) are enough to reconstruct the
speech data so that what was said can still be understood well.
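
The level selection described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses a plain Haar wavelet as a stand-in, not the FPWT from Getz's memorandum, and it assumes the levels are counted from zero starting at the finest level, which is the indexing under which levels 2 to 4 of a 256-point window hold 32 + 16 + 8 = 56 values. The function names and the synthetic test frame are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(x):
    """Full Haar decomposition of a power-of-two length sequence.
    Returns (approx, details); details[0] is the finest level."""
    approx = list(x)
    details = []
    while len(approx) > 1:
        half = len(approx) // 2
        a = [(approx[2 * i] + approx[2 * i + 1]) / SQRT2 for i in range(half)]
        d = [(approx[2 * i] - approx[2 * i + 1]) / SQRT2 for i in range(half)]
        details.append(d)
        approx = a
    return approx, details

def haar_idwt(approx, details):
    """Inverse of haar_dwt: rebuild the signal from coarse to fine."""
    a = list(approx)
    for d in reversed(details):
        nxt = []
        for ai, di in zip(a, d):
            nxt.append((ai + di) / SQRT2)
            nxt.append((ai - di) / SQRT2)
        a = nxt
    return a

def keep_levels(approx, details, keep):
    """Zero the approximation and every detail level not listed in keep."""
    kept = [d if lvl in keep else [0.0] * len(d)
            for lvl, d in enumerate(details)]
    return [0.0] * len(approx), kept

# Synthetic 256-point frame standing in for one window of speech (assumption).
frame = [math.sin(2 * math.pi * 5 * n / 256)
         + 0.5 * math.sin(2 * math.pi * 20 * n / 256) for n in range(256)]

approx, details = haar_dwt(frame)
kept_approx, kept_details = keep_levels(approx, details, keep={2, 3, 4})
reconstructed = haar_idwt(kept_approx, kept_details)

n_kept = sum(len(details[lvl]) for lvl in (2, 3, 4))
print(n_kept)  # 56
```

Listening to `reconstructed` for real speech frames, with different `keep` sets, would be one way to repeat the intelligibility test with another wavelet.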

o My first test was in the field of isolated word recognition over telephone
lines. I recorded around 200 calls of people speaking German digits
and some control words (approx. 3500 words total). With conventional
methods (PLP, RastaPLP, MFCC and LPC) together with VQ and an MLP I
get a recognition rate of around 89% over all data.
I then replaced only the feature extraction and used the detail
coefficients of the FPWT from level 2 to level 4 (56 values total).
But I reached only a 35% recognition rate on the unseen test data,
while 80% of the training data was recognized. I think it is clear that
I will not do better than the conventional methods, but I never thought
I would do so badly!
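
The substituted feature extraction could look roughly like the sketch below. It is an assumption-laden illustration: a Haar transform again stands in for the FPWT, levels are counted from zero at the finest level, and `haar_details` and `frame_features` are hypothetical names. The VQ and MLP stages are not shown.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_details(frame):
    """Return the Haar detail coefficients of a frame, finest level first."""
    approx = list(frame)
    details = []
    while len(approx) > 1:
        half = len(approx) // 2
        d = [(approx[2 * i] - approx[2 * i + 1]) / SQRT2 for i in range(half)]
        approx = [(approx[2 * i] + approx[2 * i + 1]) / SQRT2
                  for i in range(half)]
        details.append(d)
    return details

def frame_features(signal, frame_len=256, levels=(2, 3, 4)):
    """Cut the signal into non-overlapping frames and return one feature
    vector per frame, made of the detail coefficients of the given levels."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        details = haar_details(signal[start:start + frame_len])
        vec = []
        for lvl in levels:
            vec.extend(details[lvl])
        features.append(vec)
    return features

# Synthetic signal standing in for a recorded word (assumption).
signal = [math.sin(0.05 * n) for n in range(1000)]
feats = frame_features(signal)
print(len(feats), len(feats[0]))  # 3 56
```

Each 56-value vector would then take the place of a PLP or MFCC vector as input to the classifier.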

Therefore my questions:

o Has anybody run tests using wavelets in speech recognition?
o Does it really make sense to use the detail coefficients of the FPWT?
o I have read a lot about scalograms, which show the energy of the wavelet
coefficients in the time-scale plane. This would be very similar
to spectrograms derived from the FFT. How can I compute the scalogram?
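
As a point of reference for the last question: a scalogram is commonly taken to be the squared magnitude of a continuous wavelet transform, evaluated over a grid of times and scales. The sketch below is one possible way to compute it, assuming a complex Morlet wavelet with centre frequency `w0 = 6`, direct summation instead of an FFT, and zero padding at the edges. None of these choices come from the FPWT memorandum, and all names are hypothetical.

```python
import cmath
import math

def morlet(t, w0=6.0):
    """Complex Morlet wavelet (small admissibility correction omitted)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)

def scalogram(x, scales, w0=6.0):
    """Return rows[i][b] = |W(scales[i], b)|^2 for the sampled signal x."""
    n = len(x)
    rows = []
    for a in scales:
        half = int(math.ceil(5 * a))  # envelope is negligible beyond 5*a
        kernel = [morlet(k / a, w0).conjugate() / math.sqrt(a)
                  for k in range(-half, half + 1)]
        row = []
        for b in range(n):
            acc = 0j
            for j, psi in enumerate(kernel):
                idx = b + j - half
                if 0 <= idx < n:  # samples outside the signal count as zero
                    acc += x[idx] * psi
            row.append(abs(acc) ** 2)
        rows.append(row)
    return rows

# Test tone at f cycles per sample; its energy should concentrate near
# the scale a = w0 / (2 * pi * f).
f = 0.05
w0 = 6.0
target = w0 / (2 * math.pi * f)
scales = [target / 4, target / 2, target, target * 2]
x = [math.cos(2 * math.pi * f * n) for n in range(256)]

S = scalogram(x, scales, w0)
mid = len(x) // 2
peak = max(range(len(scales)), key=lambda i: S[i][mid])
print(peak)  # 2
```

Plotting `S` with time on the horizontal axis and scale on the vertical axis gives a picture comparable to a spectrogram, with small scales corresponding to high frequencies.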

Thanks in advance for your help, and best regards,

Tilo Schuerer
All times are GMT + 1 Hour