17 years ago
I would like to detect an echo delay in the following scenario: two
microphones record the same voice. Their distance is arbitrary (not more
than several meters), but known. Both soundtracks are synchronized
and overlaid. I would like to recover the delay time between the
two recordings. This task seems to be pretty simple with cepstrum
analysis: I do the following (e.g. in Matlab):
where x is the vector containing the two overlaid recordings from both
mics. I just take the first/most obvious peak in the resulting cepstrum
as the time delay in samples.
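
The snippet itself is not reproduced above. As a rough sketch of the kind of computation meant (in Python/NumPy rather than Matlab), the real cepstrum and the peak pick could look like the code below. The function names, the white-noise test signal, the 8 kHz sample rate and the 3 m microphone spacing are all assumptions made for illustration, not the original code.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)
    return np.real(np.fft.ifft(log_mag))

def estimate_echo_delay(x, min_lag, max_lag):
    """Quefrency (in samples) of the largest cepstral value in [min_lag, max_lag]."""
    c = real_cepstrum(x)
    return min_lag + int(np.argmax(c[min_lag:max_lag + 1]))

# Synthetic stand-in for the overlaid recording (assumed test data):
# white noise plus an attenuated copy of itself delayed by 37 samples.
rng = np.random.default_rng(0)
n, true_delay, gain = 4096, 37, 0.6
s = rng.standard_normal(n)
x = s.copy()
x[true_delay:] += gain * s[:-true_delay]

# The known microphone distance bounds the search range:
# delay <= distance / speed_of_sound * sample_rate.
sample_rate = 8000    # Hz (assumed)
mic_distance = 3.0    # m (assumed)
max_lag = int(np.ceil(mic_distance / 343.0 * sample_rate))

# Skip the lowest quefrencies, which mostly describe the spectral envelope.
estimated_delay = estimate_echo_delay(x, min_lag=5, max_lag=max_lag)
print(estimated_delay)
```

Restricting the search to lags that are physically possible for the given microphone spacing is a design choice of this sketch. With real speech the low-quefrency region and the pitch period add their own peaks, so the result is less clean than with noise.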
My question: what about environments where several people speak at the
same time, but at different locations? I would still like to isolate a
single person by their distinct time delay. But I cannot find
appropriate peaks in the cepstrum. There are many peaks, but they don't
seem to relate to the actual time delays I would expect.
I need to have a better understanding of cepstrum analysis and how to
read a cepstrum.
Is there anyone who could help me/point me to literature about echo
Are there alternatives to cepstrum analysis?