Hi folks.
Assume you are given some 20,000-100,000 points of raw (x,y,z)
navigation data, previously logged on some platform, and are asked
to clean up the data.
My first, naive idea is to use some Kalman filter. So far so good.
However, the data might contain glitches (jumps) or other
inconsistencies
that need to be sorted out before feeding them to the Kalman filter.
How would you go about detecting glitches, and maybe even doing a
first-pass correction of such data? A manual check of 100,000 data
points doesn't seem like a very tempting prospect...
Rune

"Rune Allnor" kirjoitti
viestissä: snipped-for-privacy@m73g2000cwd.googlegroups.com...
I don't know about systems to exclude irregularities; I doubt there
are any that work without some presumptions, e.g. a maximum distance between
two consecutive points. I will stay tuned to hear about such methods.
However, if, after getting rid of the exceptions (errors in the data), you
want to reduce the number of points in your navigation data, you could apply
my freeware GeoConv. With GeoConv you can reduce the number of points in
your navigation data while keeping the shape of the track as close to the
original as possible.
GeoConv:

Anything that can be said to be a "glitch" or an "irregular feature".
Some of the tracks seem to "jump" sideways on the order of 2Dx
to 3Dx, where Dx is the along-track distance between measurements.
Then they get back "on-track" within a variable number of samples,
say 5 to 20 samples.
I have a couple of ideas about how to *detect* these things, for
manual correction. But then, if there already exists a wheel, why
re-invent it...
Rune

Sounds like you are witnessing innovation outliers in a low-order AR
process (since the signal takes some time to get back on track after an
outlier). You can try an m-estimate (robust) linear predictor, and
threshold the prediction error for outlier detection. Are the outliers
correlated across the three dimensions, e.g. if you have one in the
x-axis, do you also have one in the y-axis?
Alternatively, if you have to smooth the data afterwards in any case,
you might be interested in the procedure described here:
G. Doblinger: "Adaptive Kalman smoothing of AR signals disturbed by
impulses and colored noise", Proc. 1998 IEEE Symp. on Advances in Dig.
Filt. and Sig. Proc., June 5-6, 1998.
The paper is available on the net.
AR modeling can also be useful for automatic correction
(interpolation).
Regards,
Andor

It depends on what you call a glitch.
If you are talking about several non-plausible points, try checking
against physical constraints like maximum velocity (1st derivative) or
maximum acceleration/deceleration (2nd derivative). However, in any case
you need the time: either the delta-t between two points must always be
the same, or you need tuples of {x,y,z,t}.
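Marcel's velocity check can be sketched in a few lines. This is a minimal example assuming {x,y,z,t} tuples; the function name and the speed limit v_max are made up for the sketch and would have to come from the platform (walker, ship, aircraft...):

```python
# Flag points whose implied speed exceeds a physical limit v_max.
import math

def flag_speed_outliers(points, v_max):
    """points: list of (x, y, z, t) tuples; returns indices of suspect points."""
    bad = []
    for i in range(1, len(points)):
        x0, y0, z0, t0 = points[i - 1]
        x1, y1, z1, t1 = points[i]
        dt = t1 - t0
        if dt <= 0:
            bad.append(i)          # non-increasing timestamps are suspect too
            continue
        dist = math.sqrt((x1 - x0)**2 + (y1 - y0)**2 + (z1 - z0)**2)
        if dist / dt > v_max:
            bad.append(i)
    return bad

# One glitch at index 3; note that index 4 is flagged as well, because the
# track "jumps back" at a high implied speed.
track = [(0, 0, 0, 0), (1, 0, 0, 1), (2, 0, 0, 2), (50, 0, 0, 3), (4, 0, 0, 4)]
suspect = flag_speed_outliers(track, v_max=5.0)
```

The point after a jump also gets flagged (the return leg is fast too), so flagged indices still need a second look rather than blind deletion.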
Marcel
(f'up)

We'll see. My idea is a bit more simplistic/naive: to check deviations
in the forward direction between time steps. Yours is probably more
computationally efficient.
I don't know. I haven't had a very close look at the data yet.
Thanks! I guess there are one or two journal papers based on this
one, if the idea turned out to be good.
I was thinking along the lines of Wiener filters for data smoothing,
but I guess we are in the same ball park.
Rune

The ideas are the same - differencing is the simplest form of linear
prediction (where you predict the future value to be equal to the
current value). The first difference filter can be interpreted as the
prediction error filter. In general, predictors can be made data
adaptive. However, differencing is a good first guess for a linear
predictor. As Marcel writes, you can even threshold the prediction
error based on physical criteria.
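As a sketch of the simplest case: differencing as the prediction-error filter, with a fixed threshold (the threshold value below is arbitrary, in practice it would come from a physical criterion as Marcel suggests):

```python
# First-difference prediction: predict x[n] = x[n-1], so the prediction
# error is just the first difference.  Threshold it to flag glitches.
def diff_outliers(x, threshold):
    return [n for n in range(1, len(x)) if abs(x[n] - x[n - 1]) > threshold]

# The jump at index 3 triggers two large differences: into and out of it.
x = [0.0, 0.1, 0.2, 5.0, 0.4, 0.5]
glitches = diff_outliers(x, threshold=1.0)
```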
No, mine is more costly - you have to compute the data adaptive
predictor (usually a new one every couple of samples).
Would be interesting to know.
Yup. Have fun in your Waterworld :-)!
Regards,
Andor

This may not be of immediate help to you but there's a book titled
"Detection of Abrupt Changes" online at

formatting link

.
Regarding correction, your problem sounds similar to that of click
removal in audio which has been treated here before, I think, and is
commonly attacked with LPC extrapolation.
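A toy version of the LPC-extrapolation idea: fit an AR model to the clean samples before a gap, then run the model forward across it. The hand-rolled AR(2) least-squares fit below is only an illustration; a real click-removal scheme would use a higher model order and something like Levinson-Durbin:

```python
# Toy LPC-style gap filling: fit AR(2) coefficients by least squares on the
# samples before a gap, then extrapolate across the gap.
def ar2_fit(x):
    # Solve for (a1, a2) minimising sum (x[n] - a1*x[n-1] - a2*x[n-2])^2
    # via the 2x2 normal equations.
    s11 = s12 = s22 = b1 = b2 = 0.0
    for n in range(2, len(x)):
        s11 += x[n - 1] * x[n - 1]
        s12 += x[n - 1] * x[n - 2]
        s22 += x[n - 2] * x[n - 2]
        b1 += x[n] * x[n - 1]
        b2 += x[n] * x[n - 2]
    det = s11 * s22 - s12 * s12
    return ((b1 * s22 - b2 * s12) / det, (s11 * b2 - s12 * b1) / det)

def extrapolate(x, steps):
    a1, a2 = ar2_fit(x)
    y = list(x)
    for _ in range(steps):
        y.append(a1 * y[-1] + a2 * y[-2])
    return y[len(x):]

# A straight line x[n] = n satisfies x[n] = 2*x[n-1] - x[n-2] exactly,
# so the fit recovers (2, -1) and the extrapolation continues the line.
history = [float(n) for n in range(10)]
filled = extrapolate(history, 3)
```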
Martin

Thanks. I'll download the book and look... eh... listen into clicks.
Ehum... I am totally aware that I reveal my ignorance for all the world to
see, but what does that mean...?
The first four words are Latin, but "altum viditur" could be Icelandic
for "knows [it] all"...
Rune

Eliminating "stray" waypoints requires knowledge of the situation.
If you are travelling by foot and taking trackpoints every minute, then
a point that is more than 250 metres off the general direction of travel
is an exception. ( running at 15km/h for one minute) If it is an olympic
runner, hen you need to increase that value.
If travelling on a bicycle, you need to know the conditions. If on a main
road, a trackpoint doesn't need to be that far off before it is an
exception. But if travelling up a switchback on a mountain pass road in
the Alps, then a lot of your points will be "off course" but still very
valid to map your track.
If travelling by car on the same road, your trackpoints may not be taken
at sufficiently close intervals to log some of the switchbacks, and may
log just a mostly straight road up the mountain with minor deviations
between each point.
If you have 3 points, A, B and C, B can be considered an exception if the
distance between B and the axis A-C is greater than some amount (the
Aviation Formulary has the equations to calculate the distance between a
point and an axis).
Now, if you have 4 points, A, B, C and D with B and C being odd, the above
logic won't work, because B will be within the limits of the A-C axis.
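For short baselines, a planar approximation of that point-to-axis test is enough (the Aviation Formulary versions work on great circles instead). A sketch, with a made-up 2.0-unit threshold:

```python
# Distance from point B to the line through A and C, planar 2-D case.
import math

def dist_point_to_line(p, a, c):
    """Perpendicular distance from point p to the infinite line through a and c."""
    ax, ay = a; cx, cy = c; px, py = p
    dx, dy = cx - ax, cy - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)   # A and C coincide
    # |cross product| is twice the triangle area; dividing by the base
    # length gives the height, i.e. the perpendicular distance.
    return abs(dx * (py - ay) - dy * (px - ax)) / length

A, B, C = (0.0, 0.0), (5.0, 4.0), (10.0, 0.0)
offset = dist_point_to_line(B, A, C)
is_stray = offset > 2.0   # threshold depends on the situation, as above
```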

Hey Rune:
One thing you could do would be to compare the data with the Kalman
filter output, and ignore it whenever there's too much difference. This
causes problems if your system truly moves outside of your 'known good'
band.
Another thing that works is to discount the data when it's glitchy, but
not completely. You could have two state evolution rules, one for when
the data is 'bad' and another for when it's 'good'. This would result in
the filter being influenced by genuinely bad data, but it would also
result in the filter being able to find its way home if it were way off
track.
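The first suggestion, gating measurements against the filter's own prediction, might look like this in one dimension. The process noise q, measurement noise r and gate width k are illustrative tuning values, not anything from the thread:

```python
# 1-D random-walk Kalman filter with a simple innovation gate: a measurement
# whose innovation exceeds k standard deviations of the predicted innovation
# variance S is ignored and the filter just coasts.
def gated_kalman(z, q=0.01, r=1.0, k=3.0):
    x, p = z[0], r            # initialise state from the first measurement
    out = [x]
    for meas in z[1:]:
        p += q                # predict (state model: position stays + noise)
        s = p + r             # innovation variance
        nu = meas - x         # innovation
        if nu * nu <= (k * k) * s:
            g = p / s         # Kalman gain
            x += g * nu
            p *= (1 - g)
        out.append(x)         # on a rejected measurement, x is unchanged
    return out

# The 25.0 glitch fails the gate and is ignored; the estimate stays near 0.
z = [0.0, 0.1, 0.0, 25.0, 0.1, -0.1]
smoothed = gated_kalman(z)
```

As noted above, this causes problems if the system genuinely leaves the 'known good' band, since the gate then rejects valid data too.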
My understanding of Kalman filters is somewhat casual*, but I do know
that the classic construction assumes linear systems excited by Gaussian
noise, and quadratic cost functions. If your underlying systems aren't
linear, if your noise isn't Gaussian or if your cost functions aren't
quadratic then the optimal solution isn't linear anymore, so a plain ol'
Kalman filter is ruled out -- search around on the term 'extended
Kalman' for insight in that case.
* I've got to get a good book on this.

Rune,
this problem is the classical application for so-called "robust statistical
methods".
1) Decide for a window length of n data points. This window will be used to
scan through your data for outlier and glitch detection.
2) Get the first n points of data (in the case of your three-dimensional
problem, treat each dimension on its own).
3) Compute the MEDIAN of the data (= the 50% PERCENTILE). If in doubt how to
compute the MEDIAN, check the MATLAB help. The MEDIAN is a very good
approximation to the mean value of your data. However, the MEDIAN is
completely insensitive to outliers and glitches as long as the number of
outliers within your window is smaller than n/2. Note that you now have an
outlier-insensitive measure for the mean of your data. The MEDIAN is the
robust counterpart to the MEAN.
4) For all data within your window compute the absolute value of the
difference between the data and the median and put the results of this
operation into a buffer array.
5) Multiply all values in your buffer by a scaling constant 1/0.6745.
6) Compute the MEDIAN of the data in your buffer. The MEDIAN of the
buffer values is the so-called "MAD" (median absolute deviation) of the
original values within your window. The MAD is the outlier-insensitive,
robust counterpart to the normal standard deviation, which is highly
sensitive to outliers. The scaling factor of 1/0.6745 makes the MAD directly
comparable to the standard deviation for normally distributed values.
7) A rule of thumb in classical statistics is: if a data point is farther
from the mean than 5 times the standard deviation, it is most probably an
outlier. Now that you have robust values available, apply this rule to them:
if a data point is more than 5 times the MAD away from the MEDIAN, it is
most probably an outlier.
8) If an outlier is detected, be sure to remove it from all dimensions.
9) Shift your window by one and go to step 3
10) Repeat for all data
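The ten steps above can be sketched directly. The window length n and the 5-MAD threshold are the tuning knobs; this scans one dimension at a time, as in step 2:

```python
# Sliding-window MEDIAN/MAD outlier detector, following the steps above.
# The scale factor 1/0.6745 makes the MAD comparable to the standard
# deviation for normally distributed data.
import statistics

def mad_outliers(x, n=7, k=5.0):
    bad = set()
    for start in range(len(x) - n + 1):
        window = x[start:start + n]
        med = statistics.median(window)            # step 3
        dev = [abs(v - med) for v in window]       # step 4
        mad = statistics.median(dev) / 0.6745      # steps 5-6
        if mad == 0:
            continue                               # degenerate flat window
        for j, v in enumerate(window):             # step 7
            if abs(v - med) > k * mad:
                bad.add(start + j)
    return sorted(bad)

x = [0.0, 0.1, -0.1, 0.2, 30.0, 0.1, 0.0, -0.2, 0.1, 0.0]
outliers = mad_outliers(x)
```

Python's `statistics.median` does the sorting internally; as noted below, a fast sort matters when this runs twice per window shift over 100,000 points.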
It sounds as if a lot of computing power is necessary for the algorithm,
and yes, indeed it is. Note that computing the MEDIAN involves ordering the
data within your window in ascending order, and you need to compute a MEDIAN
twice per window shift. Use a fast sorting algorithm!
I use the above method as my standard outlier detection routine wherever I
expect to be confronted with outliers, and it works very well. Note that the
length of the window is a compromise between processing time and having
enough values in the window to make significant statistics with. Note also
that this algorithm detects outliers fairly well, but it is not well suited
if your data contains inconsistencies like sudden steps (offsets that
stay).
Regards
Ulrich
"Rune Allnor" schrieb im Newsbeitrag
news: snipped-for-privacy@m73g2000cwd.googlegroups.com...

Right... things get messy when all the nice idealizations and
assumptions turn out to be a bit too ideal... been there.
Let me know if you find one. All I have in my library (which I don't
have access to for another couple of weeks) is an early 90s
edition of Brown and Kwang(?).
Rune

You didn't say what kind of navigational system this is. I think that
might affect the nature of what you see quite a bit.
I'm no nav system expert. I've simply had cause to use the output of nav
systems for various purposes. Understanding the qualities of the nav
data has been a somewhat low priority compared to my other tasks. That
said.....
My experience with taking data from inertial navigation systems is that
at first sight they look real glitchy. However, if you apply almost any
simple LPF - say, a single pole LPF - with a suitable cut off frequency
the data quietens down wonderfully. What appear at first sight to be
glitches actually appear to be unbiased noise that simply has a very
spiky quality.
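For reference, a single-pole LPF is one line of state. The coefficient a below is an arbitrary illustration, not a recommended cut-off; in practice it would be chosen from the desired cut-off frequency and the sample rate:

```python
# Single-pole IIR low-pass: y[n] = a*y[n-1] + (1-a)*x[n].
# a closer to 1 means heavier smoothing (lower cut-off frequency).
def one_pole_lpf(x, a=0.9):
    y = [x[0]]
    for v in x[1:]:
        y.append(a * y[-1] + (1 - a) * v)
    return y

# Spiky, zero-mean noise quietens down considerably.
noisy = [0.0, 10.0, -10.0, 10.0, -10.0, 0.0]
smooth = one_pole_lpf(noisy)
```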
Steve

No. The ideas were given to me here (sci.geo.satellite-nav) years ago
while I was post-processing some track data I had accumulated during a
bike trip through Australia. I built logic to reduce the number of
points (remove those that do not define a change in direction, and
remove those that are "stray").
Another thing to consider: if you are a large cargo ship operating in
the St. Lawrence Seaway, a stray point may be really easy to spot, because
a ship simply cannot turn on a dime. But if you are a canoe on a rapidly
flowing river, you may catch an eddy current that quickly pushes you off
course.
But for an aircraft, you need to really define the requirements and the
handling of any glitch. Are you allowed to filter out data which will
then not get logged onto the flight recorder? The advantage of aviation is
that you can correlate GPS with the primary instruments (INS and
barometric altimeter). So if you see a glitch on the GPS but no glitch
on the other systems, then the GPS probably experienced a glitch. But
if it is visible on all 3 systems, then it is probably turbulence, etc.
