Echo delay detection with cepstrum analysis?

Hi!
I would like to detect an echo delay in the following scenario: two microphones record the same voice. The distance between them is arbitrary (no more than several meters), but well known. Both soundtracks are synchronized and overlaid. I would like to recover the delay between the two recordings. This task seems fairly simple with cepstrum analysis. I do the following (e.g. in Matlab):
    fft(log(fft(x)))
where x is the vector of the overlaid recordings from both mics. I just take the first/most obvious peak in the resulting cepstrum as the time delay in samples.
My question: what about environments where several people speak at the same time, but at different locations? I would still like to isolate a single person by their distinct time delay. But I cannot find appropriate peaks in the cepstrum. There are many peaks, but they don't seem to relate to the time delays I would expect.
I need to have a better understanding of cepstrum analysis and how to read a cepstrum.
Is there anyone who could help me/point me to literature about echo delay identification?
Are there alternatives to cepstrum analysis?
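
For reference, the usual real cepstrum takes the logarithm of the magnitude spectrum, abs(fft(x)), rather than of the complex spectrum, and then applies an inverse transform. Below is a minimal Python sketch of that idea on a synthetic signal, assuming a single echo of white noise. The names dft, real_cepstrum, echo_delay and make_echo_signal, and the min_lag parameter, are illustrative and not from the original post; the naive DFT is only meant for short test signals.

```python
import cmath
import math
import random


def dft(values, inverse=False):
    """Naive O(n^2) DFT; adequate for short demo signals only."""
    n = len(values)
    sign = 1 if inverse else -1
    # Precompute the n-th roots of unity once.
    twiddle = [cmath.exp(sign * 2j * math.pi * m / n) for m in range(n)]
    out = []
    for k in range(n):
        acc = sum(v * twiddle[(k * t) % n] for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(x):
    """Inverse DFT of the log magnitude spectrum."""
    spectrum = dft(x)
    # Small floor avoids log(0) for exactly zero bins.
    log_mag = [math.log(abs(s) + 1e-12) for s in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]


def echo_delay(x, min_lag=1):
    """Quefrency (in samples) of the largest cepstral peak at or above min_lag."""
    ceps = real_cepstrum(x)
    half = len(ceps) // 2  # the real cepstrum is symmetric, so search one half
    return max(range(min_lag, half), key=lambda q: ceps[q])


def make_echo_signal(n, delay, gain, seed=0):
    """Synthetic test input: white noise plus one delayed, attenuated copy."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n + delay)]
    # Sample i of the direct path is source[i + delay]; the echo is source[i].
    return [source[i + delay] + gain * source[i] for i in range(n)]


x = make_echo_signal(n=512, delay=20, gain=0.8)
print("estimated delay in samples:", echo_delay(x))
```

An echo at delay d shows up as a peak at quefrency d, with weaker repeats at 2d, 3d and so on. With real speech the low quefrencies are dominated by the vocal tract and the pitch, so min_lag may need to be raised. This sketch does not address several simultaneous talkers.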

Look for the peak of the cross-correlation.
Scott
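
The cross-correlation suggestion can be sketched as follows in Python, assuming the two microphone tracks are still available separately (not only as the overlaid sum). The function names, the synthetic noise signals and the 343 m/s speed of sound are assumptions made for illustration. Since the microphone distance is known, it bounds the largest physically possible lag, which keeps the search short.

```python
import math
import random


def max_lag_from_distance(distance_m, sample_rate_hz, speed_of_sound=343.0):
    """Largest possible delay in samples for a given microphone spacing."""
    return math.ceil(distance_m / speed_of_sound * sample_rate_hz)


def cross_correlation_lag(a, b, max_lag):
    """Lag in samples that maximizes sum(a[n] * b[n + lag]).

    A positive result means b is a delayed copy of a.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, a_n in enumerate(a):
            m = n + lag
            if 0 <= m < len(b):  # skip samples that fall outside b
                score += a_n * b[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic check: the same noise source reaches mic_b 12 samples after mic_a.
rng = random.Random(1)
true_delay = 12
source = [rng.gauss(0.0, 1.0) for _ in range(400 + true_delay)]
mic_a = source[true_delay:]              # closer microphone
mic_b = [0.7 * s for s in source[:400]]  # farther microphone: delayed, attenuated
lag = cross_correlation_lag(mic_a, mic_b, max_lag=50)
print("estimated delay in samples:", lag)
```

With several talkers at different positions, the cross-correlation would be expected to show one peak per talker delay, so inspecting the strongest few peaks rather than only the maximum may be a reasonable next step.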

Polytechforum.com is a website by engineers for engineers. It is not affiliated with any of manufacturers or vendors discussed here. All logos and trade names are the property of their respective owners.