What do you want your robot to do?

I have done enough visual processing of images to have some idea what *I* can and cannot do with a video camera. We are not talking "human vision" any more than someone using ultrasonics is talking dolphin or bat "vision".

The only problem I have had, mainly because I am not a professional programmer, is getting the code to grab images from a video camera fast enough to be useful for a robot operating in real time.

[...]

Computer vision is also being used to read car number plates and road signs. Something that might be useful for a robot?

-- John

Reply to
JGCASEY

I'm not so sure about this - don't underestimate the processing required to analyze the sense of touch. Touch can determine a huge amount of information about the environment including temperature, surface type, texture, complex shapes, etc. Just think about what it means to come to the conclusion that a surface is rough or smooth, #60 grit vs #220 grit vs paper smooth vs brushed metal vs polished. Our hands can sense all of these with very good accuracy. That's just texture, and only a highly limited sample. Think about holding a handful of sand, marbles, etc. The amount of sensory input from skin, and the processing to analyze it, is quite large.

A fairly deep "understanding" of the surface can be determined as well - texture is a big factor, but thermal conductivity and density are taken into account too. We can easily differentiate a steel ball bearing from a wooden ball not only by weight, but because the steel draws heat from our skin, so the object feels cold.

Additionally, we can assess pliability - is the surface hard or squishy? Or liquid? Is the surface hard, but has some give? How about combinations like wet sand?

When you were a kid, did you ever walk through mud barefoot where the mud squishes up between your toes? What a sensation!

These are all pretty easy for us, but I don't think it is because the processing is easy. It's easy in the sense that vision is easy for us. But for a computer, it's still a very difficult problem.

Even so, I think such an artificial skin would also present some serious complexities in analysis, as well as data bandwidth issues, just like we have with camera input today. Maybe not to the level of vision, but I certainly don't think it can be said to be easy, or even significantly easier.

It's relatively easy for a human to differentiate a puff of wind on our arm vs a spider crawling on our arm vs a leaf brushing up against our arm from the tree we just walked past. If an artificial skin that could provide the level of input of human skin were available right now, and assuming we had a high bandwidth interface to get the data into the computer for analysis, would these signals be all that much easier to analyze than vision data in order to make these precise differentiations?

I guess this is a demonstration of where the power of the human brain architecture begins to show how weak our naive and single-threaded if-then-else type of logic is at analysing these raw data streams. Raw computing power aside, we haven't been able to really come close to what our biological built-in wet-ware processes are able to achieve in terms of analysing our environment, let alone sense it.

Certainly our ability to truly make sense of the world based on our crude sensory input devices and crude analyses is very limited.

-Brian

Reply to
Brian Dean

I didn't really imply that replicating human touch would be easy, but "using the information" from these sensors is relatively straightforward. You can judge the size of an object by determining how many cells are activated; shape by the arrangement of the activated cells; weight by the relative pressure on the cells; "smooth" or "rough" by the variety of readings from adjacent cells; and so on.
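
To illustrate the kind of bookkeeping I mean, here is a rough sketch in Python. The 8x8 grid size, the contact threshold, and the "roughness" measure are all made up for the example, not taken from any real sensor:

import numpy as np

# Hypothetical 8x8 tactile array, values 0.0-1.0 (normalized pressure).
readings = np.random.rand(8, 8)

CONTACT_THRESHOLD = 0.3           # made-up activation level
active = readings > CONTACT_THRESHOLD

size = int(active.sum())          # "size": how many cells are activated
weight = readings[active].sum()   # "weight": total pressure on those cells

# "Rough" vs. "smooth": how much adjacent cells disagree with each other.
row_diff = np.abs(np.diff(readings, axis=0)).mean()
col_diff = np.abs(np.diff(readings, axis=1)).mean()
roughness = row_diff + col_diff

print(size, round(float(weight), 2), round(float(roughness), 2))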

There are ongoing experiments with fiber optic "fabrics" and organic transistors (grown on a sheet) that, while not yet duplicating the number of sensors per square inch of human skin, show promise. But we're not yet even close to the point of perfecting "artificial skin," so much of this is academic.

By comparison, we now have cameras that provide a higher resolution than the human eye (though not the dynamic range). But what to do with that image...that's the problem!

Look here for an example of some ongoing haptic research. One mm spatial resolution isn't great, but I don't think you can say they are having any trouble analyzing the data that comes from the sensor:

formatting link

-- Gordon

Reply to
Gordon McComb

These are also limitations in the analysis of the data, which is what I was talking about. An ultrasonic sensor is actually far simpler than the echolocation in bats. But in any case, it's the little bat's brain that synthesizes this information to deduce whether there's a wall ahead of it, or a juicy praying mantis.

The human brain works mostly through recognition, not measurement, so most of the machine vision systems in use today are not remotely like how the brain works. We might design a robot eye to look for an orange ball, because orange is a specific color we can determine by looking at the RGB values of pixels. Balls have the same profile from any direction, so we can readily judge the distance from the size of the ball, though this assumes we know the diameter of the ball. Neat stuff, but as I said, far more crude than what a megapixel full color image might otherwise lend itself to.
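
A minimal sketch of that orange-ball idea, in Python. The color thresholds, the focal length in pixels, and the 65 mm ball diameter are assumed example values, not figures from any particular camera:

import numpy as np

def find_orange_ball(rgb, focal_px=600.0, ball_diameter_m=0.065):
    """Find an orange blob by RGB thresholding and estimate its range.

    rgb: HxWx3 uint8 image. focal_px (camera focal length in pixels) and
    ball_diameter_m (known ball size) are assumed example values.
    """
    r = rgb[:, :, 0].astype(int)
    g = rgb[:, :, 1].astype(int)
    b = rgb[:, :, 2].astype(int)

    # Crude "orange" test: strong red, moderate green, little blue.
    mask = (r > 150) & (g > 60) & (g < 160) & (b < 80)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None

    # Apparent diameter in pixels from the blob's horizontal extent.
    diameter_px = xs.max() - xs.min() + 1
    # Pinhole model: range = focal length * true size / apparent size.
    range_m = focal_px * ball_diameter_m / diameter_px
    return (float(xs.mean()), float(ys.mean())), float(range_m)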

Possibly, but this is more like optical character recognition (in fact, it is exactly like optical character recognition). It works because the system is looking for known two-dimensional shapes, in fairly well established orientations, against high contrast backgrounds. In other words, these things are made to be highly visible, by man (or machine). It would be great if everything were so clearly delineated.
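
For what it's worth, the core of that kind of system can be as simple as comparing a binarized patch against stored templates of the characters it expects, which is exactly why it only works for shapes designed to be seen. A rough Python sketch (the 0.8 acceptance score is an arbitrary placeholder):

import numpy as np

def match_glyph(patch, templates):
    """Compare a binarized image patch against known character templates.

    patch: 2-D 0/1 array, already cropped and scaled to the template size.
    templates: dict mapping a character to a 0/1 array of the same shape.
    Returns the best-matching character, or None if nothing scores well.
    """
    best_char, best_score = None, 0.0
    for char, tmpl in templates.items():
        score = (patch == tmpl).mean()   # fraction of agreeing pixels
        if score > best_score:
            best_char, best_score = char, score
    return best_char if best_score > 0.8 else None   # assumed threshold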

-- Gordon

Reply to
Gordon McComb

I would say brains measure also, in order to recognize.

Certainly a modern video camera can have a lot more data than any machine vision system will use. But that's good. It makes it easier to decide what to throw away until all that is left is the bit you want, such as your orange ball. With high resolution that ball will still be "round" at greater distances and not be reduced to, say, 4 square pixels.

Another example is stereo vision where the higher the resolution (particularly the horizontal resolution) the better.
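
The reason horizontal resolution matters so much falls straight out of the usual disparity formula. A quick sketch with made-up numbers for the baseline and focal lengths:

# depth = focal_length_px * baseline_m / disparity_px
# For a feature at a fixed 1.5 m, doubling the horizontal resolution doubles
# both the focal length in pixels and the disparity, but halves the depth
# error caused by being off by a single pixel.
baseline_m = 0.10            # assumed 10 cm between the two cameras
true_depth_m = 1.5

for width_px, focal_px in [(320, 300.0), (640, 600.0)]:   # assumed focal lengths
    disparity_px = focal_px * baseline_m / true_depth_m   # 20 px, then 40 px
    one_pixel_error_m = focal_px * baseline_m / (disparity_px - 1) - true_depth_m
    print(width_px, round(one_pixel_error_m, 3))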

The frog, for example, has fairly good eyes but not much of a brain. The higher the resolution, the better it can "see" those flying bugs, even though I read its vision is essentially binary in nature.

Perhaps one last example for high resolution is texture. Different surfaces may have different textures and therefore that becomes a possible "feature" for recognition and delineation.
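
A crude way to turn texture into a usable "feature" is just local variance over small blocks - smooth regions score low, heavily textured regions score high. A small sketch; the 8x8 block size is an arbitrary choice:

import numpy as np

def texture_map(gray, win=8):
    """Return per-block variance of a grayscale image as a texture feature.

    gray: 2-D float array. win: block size in pixels (arbitrary choice).
    """
    h, w = gray.shape
    h, w = h - h % win, w - w % win            # trim to a multiple of win
    blocks = gray[:h, :w].reshape(h // win, win, w // win, win)
    return blocks.var(axis=(1, 3))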

And just as we use reflectors, white (yellow/red) lines, and high contrast signs to help the motorist, there is no reason we can't do the same for our nearly blind robots.

-- John

Reply to
JGCASEY

Comparative, not quantitative: look at "forced perspective" effects, where something can be closer and appear bigger when the cues to its distance are obscured.

Reply to
mlw

--Robotic rat catcher. I'm working on something using a Stamp to catch 'em but it would be better if it could "attack" rather than just "defend".

Reply to
steamer

I was talking with a guy one day, and he asked me what I did for a hobby. I told him I enjoy making robots...

He had an idea for a bio-powered robot.

Put a cage on wheels. Put cat food in the cage. Make the robot roam around looking for cats. Find a cat, sit still. The cat walks into the cage to get the food. The door closes, and spikes in the top of the cage come down on the cat. A kitty-cat eating robot.

I guess that's what he wanted HIS robot to do.

Rich

Reply to
aiiadict

Indeed, it is all comparative, and the same must be true for machine vision if it is to make sense of the 2D data collected by a camera and construct a visual "representation" of "what's out there".

Machine vision involves the same data as human vision. It can suffer the same illusions of size as a human visual system if it uses the same kinds of assumptions in estimating size or any other visual value such as shade, color or 3D shapes from 2D data.

The 2D data has many interpretations as to size and shape. Is it an oval or a circle seen at an angle? An absolute pixel value has many interpretations. Is that "gray" pixel value of 128 part of a white square or a black square? Will we classify that RGB value as yellow or brown?

Vision is an "ill-posed problem". Machine vision would have to make use of assumptions as to how to interpret those absolute measurements, just as we do. Illusions unmask those assumptions.
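
To make that gray-pixel example concrete, here is a little Python sketch: the same absolute value of 128 gets labeled "light" or "dark" once you apply the usual assumption that brightness is judged relative to the local surround. The window size and the labels are just for illustration:

import numpy as np

def classify_relative(gray, y, x, win=15):
    """Label a pixel 'light' or 'dark' relative to its local neighborhood.

    The same absolute value (e.g. 128) can come out either way depending
    on the surround, which is the assumption behind many brightness illusions.
    """
    h, w = gray.shape
    y0, y1 = max(0, y - win), min(h, y + win + 1)
    x0, x1 = max(0, x - win), min(w, x + win + 1)
    surround = gray[y0:y1, x0:x1].mean()
    return "light" if gray[y, x] > surround else "dark"

# A 128 patch in a dark field reads "light"; the same 128 in a bright
# field reads "dark".
dark_field = np.full((50, 50), 40.0)
dark_field[20:30, 20:30] = 128.0
bright_field = np.full((50, 50), 220.0)
bright_field[20:30, 20:30] = 128.0
print(classify_relative(dark_field, 25, 25), classify_relative(bright_field, 25, 25))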

-- John

Reply to
JGCASEY

Agreed. However, using the information to the degree and effect that a human makes use of it is not at all straightforward, and approaches the complexity of vision, IMHO. I'm referring to the more subtle uses of touch that we make use of and rely on every day - not just simply answering the question "is there something there?". We use touch in far more sophisticated ways than that, and in combination with our other senses.

To be honest, what I see in this press release is the equivalent of snapping a photo with a camera, except with a grid of pressure sensors instead of a CCD. I do not see any analysis of that "image", just the image itself. Am I missing something?

This is a great advancement to be sure, and it should take the mechanical sense of touch to the next level. But I really think it is quite a ways from duplicating what real human skin and its associated nerves can sense; for example, this appears to sense switch closure only, with no temperature. Also, it's not clear whether this particular device reports any gradients of pressure or just on/off data.

But I do agree that, at a crude level, this "touch" image is more immediately useful than a visual image for computer analysis since there is less possible ambiguity in the information it represents.

-Brian

Reply to
Brian Dean

Carnivorous robot:

formatting link
formatting link
Mitch

Reply to
Mitch Berkson
