Subject: Visual recognition and the SIFT algorithm
Posted on: February 23, 2006, 2:31 pm
Last night at the HomeBrew Robotics Club (http://www.hbrobotics.org)
here in Silicon Valley, four of our members put on an amazing
demonstration of visual recognition using Lowe's SIFT algorithm.
How they were able to put on the demonstration in the first place was
amazing in itself. They first searched the web for any code that would
help them, and when that failed, they plowed through the mathematical
paper on the algorithm and then implemented it. That was a very tedious
task.
It was a great presentation. First, one of the fellows showed his
computer (with webcam attached) training on objects and then
recognizing them by speaking what they were: a doll, a dollar bill,
etc. What was amazing was that these objects could be rotated,
partially obscured, distorted, or held at varied distances and would
still be recognized using this algorithm. For example, the computer
was trained on both a one dollar bill and a twenty dollar bill. The
bills were then shown to the webcam rotated, crumpled, and with a
finger overlaying the bill, at varying distances, and the computer said
what the object was each and every time. The kicker is that the
computer can be trained on many, many objects and recognize any or all
of them. The training data is stored in a database, so how fast objects
are recognized really comes down to how fast the database can be
searched.
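To make the database-search point concrete, here is a minimal sketch of a brute-force lookup. It assumes each trained feature is stored as a (label, descriptor) pair; the function name, the toy two-number descriptors, and the 0.8 ratio cutoff are illustrative assumptions, not details from the demonstration.

```python
import math

def nearest_label(query, database, max_ratio=0.8):
    """Return the label of the closest stored descriptor, or None if ambiguous.

    database is a list of (label, descriptor) pairs. Every descriptor is a
    list of floats with the same length as query. (Hypothetical layout.)
    """
    best_distance, best_label = math.inf, None
    second_distance = math.inf
    # Linear scan: every query is compared against every stored descriptor,
    # so lookup time grows in step with the size of the database.
    for label, descriptor in database:
        d = math.dist(query, descriptor)
        if d < best_distance:
            second_distance = best_distance
            best_distance, best_label = d, label
        elif d < second_distance:
            second_distance = d
    # Reject the match when the runner-up is almost as close as the winner.
    if best_label is None or best_distance > max_ratio * second_distance:
        return None
    return best_label


database = [("one dollar bill", [0.0, 0.0]), ("twenty dollar bill", [10.0, 10.0])]
print(nearest_label([1.0, 0.5], database))   # clearly closer to the first entry
print(nearest_label([5.0, 5.0], database))   # equally close to both, so rejected
```

A real system would presumably replace the linear scan with an approximate nearest-neighbor index, which is where the speed of the database search comes in.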
Next, there was a gentle introduction to the algorithm. The presenter
showed a few slides and then a program he had written in Visual Basic
that took you visually through the steps of the algorithm and explained
what was going on in each step.
Next, there was a slightly more technical explanation.
But what really blew the crowd away (not that the crowd of about 50
members wasn't already) was this: the next presenter had taken a CMUcam
and attached it to a cheap FPGA in which he had implemented some of the
algorithm. It was not yet recognizing objects (he is taking development
in steps and he has a real job with Xilinx). He pointed the CMUcam at
us, and at 60 FPS, there the crowd was in outlined form. A guy in the
back of the crowd started throwing and spinning a hat in the air. No
problem. The crowd started moving to see how robust this was. No
problem. Truly amazing.
There are now plans to finish implementing Lowe's SIFT algorithm in an
FPGA, attach a camera lens to it, and sell the units to HBRC members to
play with (read: debug). The expected cost was somewhere under $50.
That's right. I'll type it again. $50. But I would think it worth
twice that much or more.
Re: Visual recognition and the SIFT algorithm
that the SIFT algorithm is really useful for extracting a feature vector
from the image at hand. Do you know what type of algorithm they used for
classification?
Cheers
Padu
Re: Visual recognition and the SIFT algorithm
would weigh in on SIFT. I took a computer vision class last spring and
while we didn't implement SIFT (we did implement Hough), we spent quite
some time studying it. It's not for the faint of heart. If anybody is
interested in it, however, I can probably find some notes and post them
here. The technology is currently being used in some photostitching
software which takes a collection of photos and automatically stitches
them together (instead of manually creating a mapping between photos).
It works *really* well, and I'd be shocked if the inventor isn't a
millionaire some day.
Re: Visual recognition and the SIFT algorithm
I agree with Padu, really interesting stuff.
I googled with the subject line,
Lowe's SIFT algorithm
to get more information.
--
JC
Re: Visual recognition and the SIFT algorithm
David G. Lowe has US patent 6,711,293, which describes a visual apparatus
and method whereby images are blurred and then subtracted from the original
image. Individual pixels that are maximal or minimal, called extrema in the
patent, are picked out from the image, and then a calculation involving
radial zones and the summation of difference vectors in those zones is
performed for each area around those extrema. Once these numbers are
determined, Lowe uses a generalized Hough transform to correlate objects
with a pre-trained set of objects.
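As a rough illustration of the blur, subtract, and pick-extrema steps described above, here is a minimal pure-Python sketch. It works at a single blur level and compares each pixel only with its 8 neighbors in the same image; as I understand it, the full algorithm also compares across several blur levels and image sizes. The function names and the sigma value are assumptions made for the example.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights covering about 3 sigma on each side."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(image, sigma):
    """Separable Gaussian blur; pixels past the border repeat the edge value."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Horizontal pass, then vertical pass over the horizontal result.
    rows = [[sum(kernel[k + radius] * image[y][clamp(x + k, width)]
                 for k in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]
    return [[sum(kernel[k + radius] * rows[clamp(y + k, height)][x]
                 for k in range(-radius, radius + 1))
             for x in range(width)] for y in range(height)]

def difference_image(image, sigma=1.0):
    """Original image minus its blurred copy."""
    blurred = blur(image, sigma)
    return [[image[y][x] - blurred[y][x] for x in range(len(image[0]))]
            for y in range(len(image))]

def local_extrema(diff):
    """(x, y) of interior pixels strictly above or below all 8 neighbors."""
    found = []
    for y in range(1, len(diff) - 1):
        for x in range(1, len(diff[0]) - 1):
            neighbours = [diff[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            value = diff[y][x]
            if value > max(neighbours) or value < min(neighbours):
                found.append((x, y))
    return found


# A 7x7 black image with one bright pixel in the middle.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 100.0
print(local_extrema(difference_image(image)))
```

The bright pixel should show up as an extremum because blurring spreads its brightness into the neighbors, leaving a positive peak at the center of the difference image.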
Hough has an early 1962 US patent 3,069,654 which describes a method of
classifying images by charting the slopes and intercepts of line segments
found in images and then finding correlations in those patterns.
Interesting reading, especially for the not so mathematically inclined,
since the entire apparatus is designed without a computer.
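The slope-and-intercept idea can be sketched in a few lines of Python. This is only an illustration of the voting principle, not the apparatus in the patent: each point votes for every (slope, intercept) pair of a line that could pass through it, and the pair with the most votes is the line shared by the most points. The function name, the candidate slope list, and the intercept bin size are assumptions made for the example.

```python
from collections import Counter

def hough_lines(points, slopes, intercept_step=1.0):
    """Vote in (slope, intercept) space for lines y = m*x + b through points."""
    votes = Counter()
    for x, y in points:
        for m in slopes:
            b = y - m * x                      # intercept of the line with slope m through (x, y)
            b_bin = round(b / intercept_step) * intercept_step
            votes[(m, b_bin)] += 1
    return votes


# Four points on y = 2x + 1 plus one stray point.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (5, 0)]
votes = hough_lines(points, slopes=[-2, -1, 0, 1, 2])
print(votes.most_common(1))   # → [((2, 1.0), 4)]
```

One known weakness of this parameterization is that vertical lines have unbounded slope, which is why later versions of the transform usually vote over an angle and a distance instead.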
Since then, I know that many researchers have adopted this technique and
developed what is called a Generalized Hough Transform. And I know that some
researchers have used this around pixel extrema. I haven't deciphered what
Lowe is doing so far in his patent that makes it novel. Perhaps it is the
incorporation of scanning different image sizes that makes it novel, since
this accommodates the recognition of an object at varying distances.
I believe this is the same technology marketed by Evolution in its ViPR
technology.
Please tell me if I am wrong because this is the first time I heard the name
SIFT. At RoboNexus I picked up a demo copy of ViPR and it seems to work
reasonably well. I didn't train it on 10,000 images, as it claims to be able
to accommodate.
Who gave the talk at Homebrew? Wish I had been there.
Thanks,
Brad Smallridge
aivision dot com
Re: Visual recognition and the SIFT algorithm
I think it is the same one that Evolution passed out at RoboNexus. At
least they had the Evolution disk there with them last night. However,
in their demo they did not use Evolution's stuff. They used their own
software/firmware/hardware.
There were 4 persons giving the talk and demonstration. They were Dave,
Ingolf, John, and Brandon. Dave and John gave talks, Ingolf showed the
implementation he had done in Visual Basic and Brandon showed the
implementation he had done with the FPGA (which was not fully
implemented yet).
Re: Visual recognition and the SIFT algorithm
Hi Brad. If you look at Lowe's page, you'll see he is on the advisory
committee of ER.
http://www.cs.ubc.ca/~lowe/
Regarding what makes this, or any other, patent unique, that is always a
moot discrimination.
Certainly, the first part of this is old-hat ... "... images are
blurred and then subtracted ...", called lowpass/bandpass filtering,
inverse filtering, etc. And as you point out, some of the other aspects
are also old-hat or covered by other patents. Some of us think that
people with money file patents simply to keep others from "using" the
"obvious", after the obvious has been "pointed out". :-)
Re: Visual recognition and the SIFT algorithm
I hope I didn't say that the patent wasn't novel. I meant to say that I
haven't identified what was novel and that is because I haven't studied
the claims well enough, and I don't have a good knowledge of correlation
algorithms. I was hoping that someone might fill in the blanks.