Voice Controlled Robosapien

Yesterday I released the first version of Robosapien Dance Machine with Voice Control (Version Alpha - 2.0.1). You can now control a Robosapien with
just your voice. The software still has all the original powerful scripting abilities for creating Robosapien movies, dances, and performances.
What's really great is that the voice recognition is being provided by the wonderful CMU Sphinx 3.5 speech recognition engine.
You can see a short movie of my Robosapien responding to my voice commands here:
http://www.robodance.com/
You can find the Robosapien Dance Machine files here:
http://sourceforge.net/projects/robodance
Technical support for the program is available here:
http://www.roboburp.com/phpBB2/
It's an Alpha version, so it most likely still has some wrinkles in it. I'd appreciate a bug report if you find any problems. Currently it requires the superb USB UIRT from Jon Rhees:
http://www.usbuirt.com/
I'm going to devote the month of April to supporting more infrared transmitters, especially some cheaper ones, as many of you have requested.
Thanks,
Robert Oschler
Robosapien Dance Machine
http://www.robodance.com/
That looks fake. You must have the remote where the camera doesn't see it while you act like it is voice recognition.
Make a video with the remote next to him and then do your speech recognition.
wrote:

That would certainly be proof positive for a google poster! ;-)

Simpler proof. It's an open source project. Download the program and try it. You don't need an infrared transceiver or even a Robosapien to test the voice recognition (it will just print the recognized text on the bottom of the screen).
Thanks,
Robert Oschler
http://www.robodance.com/
Robert Oschler wrote:

Come on, Robert, admit it! It's clearly a fake. Nothing in robotics works this well!! <g>
(Seriously, this is awesome. VERY nice job on this.)
-- Gordon

Gordon!
How are you?
Thanks for the compliment. Just wait till you see the next version coming out on May 1st. It's got even more.
Check out my thread on the HandyCam vision stuff. I keep waiting for your book on machine vision done on a budget. :)
Robert,
--
Thanks,
Robert
Detailed hardware interfacing and vision software description, please!
I have some great articles on vision that I put together, with tons of drawings, charts, diagrams, and pseudo-code so it's easily translatable to any language.
I did the same for the navigation program. If only I could find a publisher....
Rich


Rich,
Do you have a web site? I'd like to see some of those items.
Thanks, Robert
--
Robert Oschler
http://www.robotsrule.com/phpBB2/
Robert Oschler wrote:

The hardware for machine vision is fairly cheap. No problem there. It's all in the software!
I am currently working with commercially-available DirectShow filters that are intended for machine vision in factory automation-type applications. Beats reinventing the wheel, and I can apply all of my time in working with the data itself. Obviously not a Linux or Mac solution, though.
My immediate application isn't actually for robotics, but for doing certain image analysis of motion pictures, in realtime or better (preferably faster than 24 fps). But many of the same techniques can be used for robotics. Curiously, most people who have applied machine vision ideas to video/film stopped at shot change detection, or limit their systems to highly controlled studio environments for motion tracking (CG stuff). There's a whole lot more out there.
-- Gordon
Robert Oschler wrote:

Gordon McComb wrote:

And those that know how to do it aren't telling.
What we need is ROBOT BASIC (or ROBOT C). This could be written in C++ using DirectX etc. and compiled for Windows and Linux, but allow those without a degree in programming to tailor the hardware to their own ideas.
Actually I think Linux would be the most suitable from what I have read. A robotic interface to the Linux kernel?
Essentially, provide a simple means of reading USB ports and grabbing images from webcams at a reasonable speed.
In other words, make it as easy to program a MB as it is to program a PIC using BASIC or C.
-- John
JGCASEY wrote:

For native Linux, I imagine the typical DirectShow filter could be revised, as it's pretty much just standard C, but it's the idea of building "graphs" out of multiple filters that makes DirectShow so useful for this application (and the fact that DirectShow will automatically interpose necessary colorspace and decompression filters as needed, saving you the hassle of hand-building each graph from scratch). Does Linux offer a similar architecture?
Actually, DirectShow is kinda kludgy, and hard to use in VB or even C# (it's built on COM, but its interfaces aren't Automation-compatible), but I hear Longhorn will use a new architecture, and will rely on managed code through .NET. Performance issues and OS dogma notwithstanding, this ought to bring the world of machine vision closer to mere mortals, but OS hooks through existing DirectShow filters will have to be revised. I figure a 3-5 year timeline.

Personally I feel a high frame rate is very useful. Low frame rates, especially at slow shutter speeds, just create blur. Hard to do anything meaningful with that. The work that I'm doing relies on reasonably high resolution (but still standard def) video, at full 24fps or 30fps speeds. The limiting factor I'm up against is that the bulk of the video processing is on film that's been transferred to video. For most scenes, a film camera takes a fairly long exposure for each frame, so motion blur is common. Adds a layer of complexity in trying to figure out what's happening. OTOH, you can sometimes use the length and direction of the motion blur to determine velocity, etc., given the right circumstances.
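The blur-to-velocity idea can be sketched roughly as below. This is only an illustration, not anything from Gordon's software: it assumes the streak length (in pixels) and its angle have already been measured by some other means, that the exposure time is known, and that a scale factor at the object's depth is available. The function name and parameters are hypothetical.

```python
import math

def velocity_from_blur(blur_length_px, blur_angle_deg, exposure_s, units_per_px=1.0):
    """Estimate image-plane velocity from a motion-blur streak.

    blur_length_px: measured streak length in pixels (assumed given)
    blur_angle_deg: streak orientation, measured from the +x axis
    exposure_s:     exposure time of the frame in seconds
    units_per_px:   hypothetical scale factor (1.0 keeps the result in px/s)
    Returns (vx, vy). A streak has no arrowhead, so the true velocity may
    be the negative of this vector.
    """
    if exposure_s <= 0:
        raise ValueError("exposure time must be positive")
    # The object travelled the streak length during one exposure.
    speed = blur_length_px * units_per_px / exposure_s
    angle = math.radians(blur_angle_deg)
    return speed * math.cos(angle), speed * math.sin(angle)

# Example: 24 fps film with a 180-degree shutter exposes each frame for 1/48 s.
vx, vy = velocity_from_blur(12, 0, 1 / 48)
```

With a 12-pixel horizontal streak and a 1/48 s exposure this works out to roughly 576 pixels per second. The estimate only holds when the blur comes from the object's motion alone, not from camera shake or defocus.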
Webcams are okay, but I think a better approach is a fairly good camera with an excellent lens, and full video speeds, over USB 2.0 or Firewire. Some of the mini-ITX boards have built-in video inputs, as well. Some interesting things can be done with a high resolution BW medical camera. (A really cool arrangement might be two cameras pointing at the same scene, or through a beam splitter. One could be color, and the other high res BW.)
-- Gordon
Gordon McComb wrote:

The direction I am coming from is, "What is possible as regards using vision in a hobby robot?"
This would require something like a mini-ITX board, even if the i/o board uses its own uCs, and the web cam is the cheapest easily available option. It is sufficient for the needs of a simple robot.
So the next question is, "What is the cheapest setup that would be accessible to the widest range of hobbyists?"
Keep in mind that a hobbyist may not have a degree in computer programming and it is not reasonable that they should have one. The only choice I see is VB, despite its cost, or Java. Even with these languages they need to have access to routines to grab images from a webcam and use the USB port to communicate with the i/o card **and be shown how to use these routines** in their own programs. Even those who have played with VB may only have been able to reach a certain level with the "How to Learn Visual Basic in 21 days".
When I read "teach yourself Visual C++ 5 in 21 days" it didn't really explain anything. Just a set of recipes to follow to do a limited number of things using the Wizard things.
Of course there is no reason why a professional programmer could not provide a C++ shell in which a C programmer could access the camera and USB port. I have been using such a shell with VC++, but it is not suitable for general use, as it limits you to Windows and requires you to buy VC++. It is also very slow...
-- John
Gordon McComb wrote:

Video for Linux is basically a typical Unix device interface, based on ioctl() calls to set various parameters, and read() to get frames. This makes it several orders of magnitude easier to use than DirectShow, but it doesn't offer the directed graph architecture that DirectShow does.
Really, I never cared much for the DirectShow architecture -- I usually end up doing any required color mapping to some kind of useful YUV or RGB format close to the source, and any processing happens subsequently. If I want a piping architecture, it's easy enough to do it myself, which has the added benefit of making the code significantly more portable. It also doesn't lock me in to all the indirection that the DirectShow API requires. Basically, I usually have a single FrameProducer class, an optionally ref-counted Frame class, and I just run with it from there.
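A hand-rolled pipeline of that kind might look something like the sketch below. It is an assumption-laden illustration rather than the poster's actual code: the names `Frame`, `FrameProducer`, `invert`, `threshold`, and `run_pipeline` are hypothetical, the producer just replays frames from memory instead of wrapping a capture device, and Python's garbage collector stands in for the optional ref-counting.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List

@dataclass
class Frame:
    """One grayscale image; pixels are 0-255 values in row-major order."""
    width: int
    height: int
    pixels: List[int]

class FrameProducer:
    """Hypothetical frame source. A real one would read from a capture
    device; this one replays frames it was given."""
    def __init__(self, frames: Iterable[Frame]):
        self._frames = list(frames)

    def __iter__(self) -> Iterator[Frame]:
        return iter(self._frames)

# A stage takes a stream of frames and yields a stream of frames.
Stage = Callable[[Iterator[Frame]], Iterator[Frame]]

def invert(frames: Iterator[Frame]) -> Iterator[Frame]:
    """Stage that produces the negative of each frame."""
    for f in frames:
        yield Frame(f.width, f.height, [255 - p for p in f.pixels])

def threshold(level: int) -> Stage:
    """Build a stage that maps each pixel to 0 or 255 around `level`."""
    def stage(frames: Iterator[Frame]) -> Iterator[Frame]:
        for f in frames:
            yield Frame(f.width, f.height,
                        [255 if p >= level else 0 for p in f.pixels])
    return stage

def run_pipeline(producer: Iterable[Frame], stages: List[Stage]) -> Iterator[Frame]:
    """Chain the stages in order; frames are pulled lazily from the producer."""
    stream = iter(producer)
    for stage in stages:
        stream = stage(stream)
    return stream

source = FrameProducer([Frame(2, 2, [0, 100, 200, 255])])
result = list(run_pipeline(source, [invert, threshold(128)]))
```

Because each stage is an ordinary generator, nothing here depends on a particular capture API, which is the portability benefit described above.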
One advantage that DirectShow has over V4L is that if you want a configuration UI for a device, you can just invoke the dialogs associated with the device. V4L doesn't offer this, since it generally adheres to the "interface not bound to implementation" philosophy (which is overall a good thing).
The only significant drawback to video capture on Linux really has to do with setting up devices; there's support for a lot of capture/webcam hardware for Linux, but the hardware usually tends to be older, since very little ships with Linux drivers. You need to take some care when choosing a capture card or webcam, and be wary of newer hardware. It's generally worth spending some quality time with Google before making a purchase.
Cheers - m
--
(Replies: cleanse my address of the Mark of the Beast!)

Teleoperate a roving mobile robot from the web:
Come on, guys, this is fake. I could tell by his vocal cords that it is fake.
Also, Gordon McComb, you of all people! I would not have thought that you would think this is real.
Come on now.
Robo1 wrote:

Have you downloaded the code and/or precompiled executable yet from Sourceforge?
The speech recognition engine is Microsoft's -- comes free with Windows -- and IR transceivers aren't exactly new technology. I'm not sure what it is you're thinking is faked. I just like how Robert has put everything together.
-- Gordon
Gordon McComb wrote:

Sorry..my mistake! CMU's Sphinx. (So that's why it loaded so fast! <g>)
-- Gordon
That does not mean anything. Just because there is source code doesn't mean he did it. Come on now.
If he has the Sapien with VR, why doesn't he just make another video showing the Sapien, pan the camera 360 degrees to show that no one else is around, then point the camera at an angle that shows only him and the Sapien, and do the VR thing again?
Also, Gordon McComb, I am the plastic guy who called today about the color of the plastic.
wrote:

Didn't you say the below would be proof for you??? ;-)

Robo1 wrote:

I'm still not sure what it is you're doubting. Are you doubting that today's speech recognition (it's not voice recognition) technology can accurately recognize a collection of words from a single speaker? Or that the speech can trigger macro commands through an IR interface? What's so hard to believe, other than that someone would take the time to want to share this with others?
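To make the second point concrete, here is a minimal sketch of recognized text triggering macro commands. It is not taken from Robosapien Dance Machine: the phrase table, the symbolic command names, and the `send_ir` callback are all hypothetical stand-ins for whatever the recognizer outputs and the IR transmitter accepts.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical phrase-to-macro table. The command names are placeholders,
# not real Robosapien IR codes.
MACROS: Dict[str, List[str]] = {
    "walk forward": ["FORWARD"],
    "turn left": ["LEFT"],
    "dance": ["LEFT", "RIGHT", "FORWARD"],
}

def dispatch(recognized_text: str,
             send_ir: Callable[[str], None],
             macros: Optional[Dict[str, List[str]]] = None) -> bool:
    """Look up a recognized phrase and send each command in its macro.

    send_ir is whatever function drives the transmitter (assumed here).
    Returns True if the phrase matched a macro, False otherwise.
    """
    table = MACROS if macros is None else macros
    # Normalize case and whitespace so "  Turn  LEFT " matches "turn left".
    phrase = " ".join(recognized_text.lower().split())
    commands = table.get(phrase)
    if commands is None:
        return False
    for command in commands:
        send_ir(command)
    return True

# Stand-in transmitter that just records what would have been sent.
sent: List[str] = []
dispatch("  Turn  LEFT ", sent.append)
```

Restricting the recognizer to a small fixed vocabulary like this is what makes single-speaker command recognition reliable; unmatched phrases are simply ignored.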

I dunno...he has better things to do with his time? Or maybe he's camera shy? (Have you ever seen a picture of me?)
I plan on playing with this over the weekend, and I'll let you know.
-- Gordon

Polytechforum.com is a website by engineers for engineers. It is not affiliated with any of manufacturers or vendors discussed here. All logos and trade names are the property of their respective owners.