Low cost vision system, thoughts

- mlw
posted
18 years ago

Wed, Feb 1, 2006 1:57 PM

I was thinking about this last night. I've had an idea for a low-cost vision system in the back of my mind. What is required is a good, inexpensive camera: 320x240 is OK, 640x480 would be better, and anything much larger would make the processing problem harder.

First, is there a camera with 320x240 or 640x480 resolution that has the equivalent of a fast shutter speed, i.e. one where the image isn't blurred when the robot moves?

USB2, maybe FireWire, but that's typically more expensive.

Second, I'm not sure of the specifics just yet, but an infrared cross hair laser.

I'm still thinking out the details, but if done right it could be a very efficient vision system with reduced processing and, if the parts are cheap enough, a pretty inexpensive one.

The idea is that a laser pointed at a known angle to a video camera lets you work out the distance between the object and the camera from the height of the dot in the image, right? The horizontal line of the cross hair works the same way, but across the whole field of view.
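A minimal sketch of the dot-height idea in C, assuming a simple pinhole camera with the laser mounted a known distance directly below the lens. The function name, the parameters, and the example numbers are all illustrative assumptions, not measurements from a real rig. A point on the beam at depth Z sits at height -baseline + slope*Z, so it images at row center_row + focal_px*(baseline/Z - slope); solving for Z gives the expression in the code.

```c
#include <assert.h>

/* Distance to the laser dot along the optical axis, in metres.
 * All parameters are hypothetical calibration values:
 *   focal_px    focal length expressed in pixels
 *   baseline_m  how far the laser sits below the camera lens, in metres
 *   laser_slope tangent of the beam's upward tilt (0.0 = parallel to the axis)
 *   center_row  image row where the optical axis lands
 *   dot_row     row where the dot was found (rows count downward)
 * Returns -1.0 when the geometry gives no valid intersection. */
double laser_dot_distance(double focal_px, double baseline_m,
                          double laser_slope, double center_row,
                          double dot_row)
{
    assert(focal_px > 0.0 && baseline_m > 0.0);

    double denom = focal_px * laser_slope + (dot_row - center_row);
    if (denom <= 0.0) {
        return -1.0; /* dot at or beyond the row the beam converges to */
    }
    return focal_px * baseline_m / denom;
}
```

With a 500-pixel focal length, a 10 cm baseline, a beam parallel to the optical axis, and the dot found 50 rows below center, `laser_dot_distance(500.0, 0.10, 0.0, 120.0, 170.0)` comes out at about 1 m. The rows bunch up at long range, so distance resolution gets worse the farther away the object is.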

The general shape of the horizontal line across the field of view might be usable to build a map of known positions.

Any comments?

- Padu

Wed, Feb 1, 2006 6:16 PM

"mlw"

Not exactly cheap, but cheapest firewire camera I could find:

formatting link

I bought the DFK 21AF04, and its shutter speed goes as fast as 1/10,000 s. I also bought a Prosilica (1/3" sensor instead of 1/4"), but I haven't yet compared the image quality of the two.

I'm not sure, but take a look at what I call the coffee makers (SICK ladars); it was my understanding that they use a similar technique. By the way, everybody at the DARPA GC was using lots of them. Actually, from what I saw, there were far more SICKs than traditional vision systems at the GC.

Cheers

Padu

- JGCASEY

Wed, Feb 1, 2006 7:08 PM

This is also a problem for biological eyes. When we move our head and body, the eyeballs move to compensate. While the eyeballs are moving, the data is ignored.

If you watch a bird of prey hovering, its head is still relative to the earth while its body moves about. Hold a chook and move it sideways and the head stays fixed.

Wave your hand back and forth in front of your face and it is "blurred" but we seem to cope with that ok.

You might find it practical to simply stop the robot, take an image, process it, and make the next move. I think you will also find that the blurring problem arises mostly when the camera moves sideways (such as when the robot rotates), not when it is looking in the direction the robot is moving.

Get some real data. Take some images while your robot is moving about. See how bad the blur is when it is moving in a straight line at the speed your robot moves.
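Before measuring, a rough estimate of rotational blur can be sketched in C. This assumes the image has roughly the same number of pixels per degree across the whole field of view (a small-angle approximation), and the function name and the example turn rate and field of view are made-up values for illustration only.

```c
#include <assert.h>

/* Approximate smear, in pixels, caused by the camera rotating
 * during one exposure. Hypothetical parameters:
 *   turn_rate_deg_s  how fast the robot is turning
 *   exposure_s       effective shutter time in seconds
 *   image_width_px   horizontal resolution
 *   hfov_deg         horizontal field of view of the lens */
double rotation_blur_px(double turn_rate_deg_s, double exposure_s,
                        int image_width_px, double hfov_deg)
{
    assert(hfov_deg > 0.0 && image_width_px > 0);

    double px_per_deg = image_width_px / hfov_deg;
    return turn_rate_deg_s * exposure_s * px_per_deg;
}
```

For a 320-pixel-wide image with a 45 degree field of view, turning at 90 degrees per second smears the image by about 21 pixels at a 1/30 s exposure, but by well under a tenth of a pixel at 1/10,000 s.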

I presume here you are thinking about vision for navigation, (positional recognition?) and obstacle avoidance?

I have not used IR laser light, and I am not even sure a web cam can "see" IR light. I think some have IR filters so that they are not swamped by IR light. Others are designed to see only the IR light from hot objects such as cars and people.

You might even consider white light using a lens to focus it. Although it expands with distance the resolution of your camera also decreases to match. White light also has the advantage of returning the color of the object.

But this is not armchair thinking stuff. It requires getting real data, looking at the images, and thinking about how you can extract the information you need.

In reference to Padu's post about the LMS units from SICK that helped a vehicle map the terrain: are they cheap? It says they "helped", not that they were sufficient by themselves, and they used five of them.

I think you need to get some real data and think about how you are going to process it so you are not trying to solve an imaginary problem or failing to solve a real problem you didn't know existed.

-- JC

- mlw

Wed, Feb 1, 2006 7:52 PM

To a point. If you watch someone's eyes when they move, they stay focused on one object as they move, then jump and refocus. Try it yourself; it takes effort not to do this.

Obviously objects that are moving fast will be blurred, but things moving at a reasonable speed need not be. The current crop of inexpensive cameras is not "fast enough." If I may use a photographic analogy: say the eyes have an ISO speed of 50; a lot of web cams have a speed of, let's say, 1 or 10. (And of course, the effective shutter speed follows.)

I have no benchmarks to test this, but this is what I see when I process images. There is no reason a slow moving object should be blurred.

As a start, yes.

You should get a web cam and point your IR remote control at it. It comes through clear as day as "white" light.

Actually, the advantage of an IR LED is that color is not returned.

True, but I have looked at enough images taken from my stupid Logitech QuickCam to theorize about what I want to try next.

I have done a bit of thinking (and investigating with my QuickCam) about it, and the first problems to be solved are navigation and obstacle avoidance. The IR laser should help with this. It would be sweet if we could get an IR laser grid projector, but I digress.

- Gordon McComb

Wed, Feb 1, 2006 8:24 PM

The concept you're talking about is called structured light, and it's been around for, oh, decades. There's plenty of data out there, searchable via Google, and a number of patents, which are available at the uspto.gov site. Variations include scanning a laser beam with a mirror or polygonal mirror, splitting it with a diffraction grating (the basis of CD focusing, research on Foucault), using an anamorphic lens, and several others. A number of factory automation systems use this approach for stereoscopy, and it's a method that has applications in medical imaging. There are commercially available DirectShow filters that rely on coded light structures of one shape or another. Somewhere there's probably a page on how someone did their own filter.

Do a search on Google for structured light vision and you'll find thousands of hits. Most texts on machine vision will discuss the many approaches as well.

-- Gordon

- MR Robot

Wed, Feb 1, 2006 9:35 PM

Most webcams have a filter for infrared light; a small amount of hacking can remove it.

- JGCASEY

Wed, Feb 1, 2006 9:44 PM

mlw wrote:

...

Well, blow me down, you are right. I had never actually tried that before :)

Why? Color is invariant to position and rotation and many things can be identified by color alone.

Or do you mean the advantage is they don't absorb color the way a red laser does?

Is IR evenly reflected despite the color?

...

If you can get a cheap IR laser maybe the lens used in the red laser pens for making a line would work? That would be a start. Have you thought about what you would do with the line data and how you would use it to recognize a position?

-- JC

- JGCASEY

Wed, Feb 1, 2006 9:45 PM

Gordon McComb wrote: ...

I actually don't like the structured light idea for aesthetic reasons :-)

As for DirectShow, I would most likely need to get up to speed with VC++ .NET and find some good tutorials on using it.

I just googled DirectShow tutorials and went to the MSDN site, but it was unclear what to do with the code snippets and vague descriptions.

However it would make any software Windoze dependent.

I have been looking at learning Java, which seems to have a lot of media support, but is it fast enough for my own processing code?

-- JC

- JGCASEY

Wed, Feb 1, 2006 9:47 PM

My old monochrome CCD-based Connectix camera used an IR filter so as not to be swamped by the IR outdoors.

-- JC

- john.orlando

Wed, Feb 1, 2006 10:09 PM

If you're looking for a low-cost, fast embedded vision system to experiment with, I would suggest the AVRcam...check out

formatting link

It is $99 for a basic kit, which includes everything you need (camera, embedded board, software that runs on your Windoze/Linux PC or Mac, and cabling). The resolution is lower than what you asked for (it's only 88 x 144), but it may be sufficient. It processes and tracks color blobs at 30 frames/second, so real-time processing should be no problem.

Oh yeah...and it's all open-source, so you can dive into the code and fiddle to your heart's content:

formatting link

Standard disclaimer: I developed the AVRcam but have had quite a bit of positive feedback regarding its performance. At the very least you may get some ideas of spinning your own system from scratch. I'm also interested in hearing about other people's projects in the low-cost embedded vision arena, so keep us posted!

Regards, John Orlando

formatting link

- Gordon McComb

Wed, Feb 1, 2006 10:12 PM

There is little to choose from. There is one (older) book on DirectShow, but that's about it. The market is VERY small for this, so it's not the kind of thing book publishers will fall over themselves for. There are some FAQs and mini guides at some of the Microsoft MVP sites -- The March Hare has one -- you can find it by going through the DS-related Microsoft groups.

If you're a book learner, DirectShow can be very frustrating. It's mostly an architecture for people with infinite patience and a knack for experimenting. I have one of those qualities...

-- Gordon

- Gordon McComb

Wed, Feb 1, 2006 10:28 PM

More for image rendition, likely. Skin looks REALLY bad under IR light. Veins show through, and skin looks mottled. With a color imager, the color tone of skin can look pasty.

For reasons I won't go into here, more and more cameras now have an integrated filter as part of the imager. It cannot be removed. The very cheapest IR security cameras are, by design, IR-sensitive and don't have the filter. But their image quality tends to be really, really bad. I've tried about a half dozen of them, and they're all useless for vision.

For anyone considering a stationary IR laser on a robot: it's a dumb idea, not legal (in the US) if you demonstrate it in public, and (usually) not allowed if you enter the robot in competitions. By federal law, the output of an IR laser, especially one that is not scanning, must not be exposed where it could enter a person's eyes. The danger here is obvious. There are plenty of sites and FAQs that talk about laser safety.

A visible-light laser is safer, and just as good, if not better, because under IR all the pixels in an imager will pick up at least some light, regardless of the color filter in front of them. For a color camera, IR light is not "colorless."

-- Gordon

- D Herring

Thu, Feb 2, 2006 12:38 AM

Double-posted from the "$75 vision system" thread:

Here's my current setup for vision on a PC.

- Use a webcam - Logitech 4000Pro's are a good model near the specified price point.

- Use Intel's OpenCV to import the pictures on a PC.

formatting link

- Use CMVision to quickly find colored objects

formatting link

- Use Qt for a nice GUI interface.

formatting link

- Use MinGW as the compiler on MS Windows.

formatting link

For a robot, the steps are similar. We have some custom camera to DSP interface boards at my school. Reading in the images doesn't require any fancy library. I will probably port CMVision or something similar to C for the DSP this semester.

Later, Daniel

- mlw

Thu, Feb 2, 2006 2:18 PM

What's really fun is using a bunch of high-output IR LEDs as illumination for a "night vision" webcam.

Well, actually, very little color info is returned.

Using color to identify things is tricky if the lighting varies: illumination by incandescent light, fluorescent light, or even candle light, the position of the sun, and the weather all affect color. (I used to do a lot of photography and contracted at Polaroid on their electronic camera.) The issue of color mapping is huge in imaging; you can't rely on color at all unless you have control over the illumination and the color response of the imaging device.

To use color reliably, you would have to do an amount of image processing on the image just to normalize the RGB data into some known color space.

I haven't done any spectral analysis, but it looks almost monochrome in that the light colors are reflected lightly (more), and dark colors are reflected darkly (less).

I don't see why they wouldn't. "IR" is just light that our eyes can't see.

Say we have a 320x240 camera: we would have 320 Y references, one for each X (horizontal pixel) position. Take this data and normalize it to zero, or treat it as a series of differentials, something along those lines.
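One way the "320 Y references" could be pulled out of a frame, sketched in C. It assumes an 8-bit grayscale image stored row by row and that the laser line is simply the brightest pixel in each column; both are assumptions made for illustration, and the function names are made up. "Normalize to zero" is read here as subtracting the first column's value, which is only one possible interpretation.

```c
#include <assert.h>
#include <stddef.h>

/* For each column x, store in rows_out[x] the row of the brightest pixel. */
void line_profile(const unsigned char *gray, int width, int height,
                  int *rows_out)
{
    assert(width > 0 && height > 0);
    for (int x = 0; x < width; x++) {
        int best_row = 0;
        unsigned char best = gray[x];
        for (int y = 1; y < height; y++) {
            unsigned char v = gray[(size_t)y * (size_t)width + (size_t)x];
            if (v > best) {
                best = v;
                best_row = y;
            }
        }
        rows_out[x] = best_row;
    }
}

/* Differentials: diff_out[x] = rows[x + 1] - rows[x], width - 1 entries. */
void profile_differences(const int *rows, int width, int *diff_out)
{
    for (int x = 0; x + 1 < width; x++) {
        diff_out[x] = rows[x + 1] - rows[x];
    }
}

/* One reading of "normalize to zero": make the profile start at 0. */
void profile_normalize(int *rows, int width)
{
    int base = rows[0];
    for (int x = 0; x < width; x++) {
        rows[x] -= base;
    }
}
```

Either form removes the absolute height of the line, so only its shape is left for comparison. A real image would need something more robust than "brightest pixel" to reject glare and other light sources.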

The rest is all vague and conjecture:

It should be possible to hash this data into a much-reduced, quick-to-search format that matches closely, with some probability of error.

Ideally, if the hash algorithm is good, the same view from acceptably different distances and/or perspectives should be the same. (That's the hard part)
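Since this part is admittedly conjecture, the following C sketch is only one guess at what such a hash might look like, not a worked-out method. It assumes the profile is an array of row values, splits it into a few buckets, and keeps just the direction of the line in each bucket (flat, moving down the image, or moving up), packed two bits per bucket. The function name, the bucket count, and the dead band are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the coarse shape of a line profile into a small integer key.
 * rows       row of the laser line for each column
 * width      number of columns (at least 2)
 * buckets    number of segments to summarize (1 to 16)
 * dead_band  rise smaller than this counts as "flat" */
uint32_t profile_signature(const int *rows, int width, int buckets,
                           int dead_band)
{
    assert(width >= 2 && buckets >= 1 && buckets <= 16);

    uint32_t key = 0;
    for (int b = 0; b < buckets; b++) {
        int start = b * (width - 1) / buckets;
        int end = (b + 1) * (width - 1) / buckets;
        int rise = rows[end] - rows[start]; /* net change across bucket */

        uint32_t symbol = 0;                /* 0 = flat */
        if (rise > dead_band) {
            symbol = 1;                     /* line moves down the image */
        } else if (rise < -dead_band) {
            symbol = 2;                     /* line moves up the image */
        }
        key = (key << 2) | symbol;
    }
    return key;
}
```

Because only the sign of the change survives, a profile that is shifted up or down, or stretched vertically as the robot gets closer, maps to the same key. Changes in viewing angle that move features from one bucket to another would still break it, which is the hard part mentioned above.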

- jboothbee

Thu, Feb 2, 2006 9:46 PM

why would this be difficult?

in greyscale image processing, you use a threshold to reduce to monochrome

16 grey levels to monochrome:

threshold = 8
for y = 1 to height
  for x = 1 to width
    if pixel(x,y) >= threshold then
      pixel(x,y) = 1
    else
      pixel(x,y) = 0
    end if
  next x
next y

if you have 256 colors, and want to "normalize" colors, why not:

color red = 0 to 16
color green = 17 to 32
color blue = 33 to 48
color yellow = 49 to 64
color orange = 65 to 80

etc... depends on how the color's of your image format work...

for y = 1 to height
  for x = 1 to width
    p = pixel(x,y)
    if p <= 16 then pixel(x,y) = red
    else if p <= 32 then pixel(x,y) = green
    else if p <= 48 then pixel(x,y) = blue
    else if p <= 64 then pixel(x,y) = yellow
  next x
next y

(here red, green, blue and yellow on the right-hand side each stand for one representative value of that band)

so different levels of red = RED different levels of green = GREEN

Jen

- mlw

Thu, Feb 2, 2006 10:40 PM

Well, it *isn't* difficult if you know what you are converting from. As mentioned, lighting changes color dramatically, so you would need some way to calibrate color at that instant.

The hard part is figuring out the calibration.

Yeah, that's fine but really slow. REALLY REALLY slow. Your best bet is to use lookup tables:

for (int y = 0; y < MAX_Y; y++) {
    for (int x = 0; x < MAX_X; x++) {
        /* one table lookup per channel, no per-pixel branches */
        red_ref[y][x]   = translate_red[red[y][x]];
        green_ref[y][x] = translate_green[green[y][x]];
        blue_ref[y][x]  = translate_blue[blue[y][x]];
    }
}

This way, you avoid a conditional expression for each color in each pixel. Conditionals usually take longer because the processor has to predict which way the branch goes, and a wrong guess stalls the pipeline; non-branching instructions go faster because they can be pipelined. If you have 640x480 pixels and each conditional takes about 0.0000001 seconds, think about it: 640*480*3*0.0000001 is about 1/10 of a second per frame. Then again, I'm talking C/C++, not Java or Perl.
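For illustration, here is one way a translate table like the ones above could be filled in, sketched in C. It assumes 8-bit channel values and reuses the fixed-width color bands from the earlier post; the function names and the choice of the band midpoint as the output value are assumptions. The table is built once, and after that each pixel costs a single array lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Fill a 256-entry table that maps every 8-bit value to the middle
 * of its band, e.g. with band_width 16 the values 0..15 all map to 8. */
void build_band_table(unsigned char table[256], int band_width)
{
    assert(band_width > 0 && band_width <= 256);
    for (int v = 0; v < 256; v++) {
        int band_start = (v / band_width) * band_width;
        int mid = band_start + band_width / 2;
        table[v] = (unsigned char)(mid > 255 ? 255 : mid);
    }
}

/* Translate one channel in place: one lookup per pixel, no branches. */
void apply_table(unsigned char *channel, size_t count,
                 const unsigned char table[256])
{
    for (size_t i = 0; i < count; i++) {
        channel[i] = table[channel[i]];
    }
}
```

The conditionals have not disappeared; they have moved into the 256-iteration setup loop instead of running 640*480*3 times per frame. Picking table contents that match the actual lighting is still the open calibration problem.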

The brute force of translating the pixels is a challenge, but the *real* challenge is figuring out what those translation values are.

What happens when a light red reflects fluorescent light vs. incandescent? Fluorescent light is typically very "white" while incandescent is typically fairly yellow (YMMV). The weather makes a difference too: cloudy vs. sunny conditions change color as well.

In photography you usually use blueish flash, and the flash overrides the ambient light and provides a more consistent lighting.

Color is a huge problem, especially if you want to do something with it. Comparative color may be something to look at, but right now I will settle for monochrome.

- JGCASEY

Thu, Feb 2, 2006 10:57 PM

...

I am aware of the problems you mention above.

The human vision system of course uses neighbouring colors to decide what color something is, so yellow (in RGB values) can look brown. Shade is also a relative value, not the absolute pixel value.

Yet color can be used; see D Herring's post above, "Use CMVision to quickly find colored objects"

formatting link

In an experiment I did I used the R:G:B ratio to locate (blob and thus outline) balls of different colors. The light source was constant and I guess it might not have worked in sunlight or another type of light source without a different set of ratios.
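The post does not say how the ratios were computed, so the following C sketch is only one plausible way to compare R:G:B proportions. It assumes 8-bit RGB pixels and expresses each channel as its share of R+G+B in parts per thousand, which cancels overall brightness but not a change in the color of the light. The function names and the tolerance are made up for illustration.

```c
#include <assert.h>

static int abs_diff(int a, int b)
{
    return a > b ? a - b : b - a;
}

/* Share of one channel in the pixel total, in parts per thousand. */
static int share_permille(int channel, int total)
{
    return channel * 1000 / total;
}

/* Returns 1 if the pixel (r, g, b) has roughly the same R:G:B
 * proportions as the target color (tr, tg, tb), otherwise 0.
 * Black pixels have no usable ratio and never match. */
int ratio_matches(unsigned char r, unsigned char g, unsigned char b,
                  unsigned char tr, unsigned char tg, unsigned char tb,
                  int tol_permille)
{
    int sum = r + g + b;
    int tsum = tr + tg + tb;

    assert(tol_permille >= 0);
    if (sum == 0 || tsum == 0) {
        return 0;
    }
    return abs_diff(share_permille(r, sum), share_permille(tr, tsum)) <= tol_permille
        && abs_diff(share_permille(g, sum), share_permille(tg, tsum)) <= tol_permille
        && abs_diff(share_permille(b, sum), share_permille(tb, tsum)) <= tol_permille;
}
```

A red ball sampled as (200, 40, 40) still matches in shadow at (100, 20, 20), because the proportions are unchanged. Under a different light source the proportions themselves shift, so the target values would need to be sampled again.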

-- JC

- mlw

Thu, Feb 2, 2006 11:39 PM

That is sort of the problem: variable light sources create different color ratios. If you have more yellow light, your objects look yellow. If you had some sort of reference in the camera view at all times, then you could do it.
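A minimal sketch of the in-view reference idea in C, assuming the reference is a patch known to be neutral (white or gray) and that its average color in the current frame has already been measured. The type and function names are hypothetical. Each channel is scaled so that the reference patch comes out with equal R, G and B, and the same scaling is applied to every other pixel.

```c
#include <assert.h>

typedef struct {
    unsigned char r, g, b;
} rgb_t;

/* Scale one channel by gray/ref with rounding, clamped to 0..255. */
static unsigned char scale_channel(int value, int gray, int ref)
{
    if (ref == 0) {
        return (unsigned char)value; /* no reference signal, leave as-is */
    }
    int v = (value * gray + ref / 2) / ref;
    return (unsigned char)(v > 255 ? 255 : v);
}

/* Correct one pixel using the measured color of a neutral reference. */
rgb_t correct_with_reference(rgb_t pixel, rgb_t ref)
{
    int gray = (ref.r + ref.g + ref.b) / 3; /* brightness to preserve */
    rgb_t out;

    assert(gray >= 0 && gray <= 255);
    out.r = scale_channel(pixel.r, gray, ref.r);
    out.g = scale_channel(pixel.g, gray, ref.g);
    out.b = scale_channel(pixel.b, gray, ref.b);
    return out;
}
```

Under yellowish light where the reference reads (240, 200, 160), a pixel at (120, 100, 80) is corrected to (100, 100, 100), i.e. neutral gray. This only handles a single uniform light source; mixed lighting or a reference patch in shadow would need more than one set of gains.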

- aiiadict

Fri, Feb 3, 2006 12:47 AM

we don't need no stinkin structured light. If a human sees a GREEN apple under intense RED light, it is a RED apple! we can pick the apple up and examine it under the red light... it is still red.

when we turn off the red light, and switch to white light instead, we have a green apple!

You could have a "reference" from the objects in the field of view perhaps.. IF pixels = VERY BRIGHT then object= light source

If lightsource present, other colors in view = other colors + light source color

otherwise the objects are the color they appear to be, under the non-white light.

Rich

- JGCASEY

Fri, Feb 3, 2006 1:05 AM

...

A white reference card to view?

-- JC