Voice Controlled Robosapien

Yesterday I released the first version of Robosapien Dance Machine with
Voice Control (Version Alpha - 2.0.1). You can now control a Robosapien with
just your voice. The software still has all the original powerful scripting
abilities for creating Robosapien movies, dances, and performances.

What's really great is that the voice recognition is being provided by the
wonderful CMU Sphinx 3.5 speech recognition engine.

You can see a short movie of my Robosapien responding to my voice commands
here:

http://www.robodance.com/

You can find the Robosapien Dance Machine files here:

http://sourceforge.net/projects/robodance

Technical support for the program is available here:

http://www.roboburp.com/phpBB2/

It's an Alpha version, so it most likely still has some wrinkles in it. I'd
appreciate a bug report if you find any problems. Currently it requires the
superb USB UIRT from Jon Rhees:

http://www.usbuirt.com/

I'm going to devote the month of April to supporting more infrared
transmitters, especially some cheaper ones, as many of you have requested.

Thanks,
Robert Oschler
Robosapien Dance Machine
http://www.robodance.com/





Re: Voice Controlled Robosapien

That looks fake. You must have the remote where the camera doesn't see
it, and you act like it is voice recognition.

Make a video with the remote by him and then do your speech recognition.


Re: Voice Controlled Robosapien



That would certainly be proof positive for a google poster! ;-)

Re: Voice Controlled Robosapien



Simpler proof.  It's an open source project.  Download the program and try
it.  You don't need an infrared transceiver or even a Robosapien to test the
voice recognition (it will just print the recognized text on the bottom of
the screen).

Thanks,
Robert Oschler
http://www.robodance.com/



Re: Voice Controlled Robosapien


Come on, Robert, admit it! It's clearly a fake. Nothing in robotics
works this well!! <g>

(Seriously, this is awesome. VERY nice job on this.)

-- Gordon

Re: Voice Controlled Robosapien



Gordon!

How are you?

Thanks for the compliment. Just wait till you see the next version coming
out on May 1st.  It's got even more.

Check out my thread on the HandyCam vision stuff.  I keep waiting for your
book on machine vision done on a budget. :)

Robert,

--
Thanks,
Robert
http://www.evosapien.com/
Robosapien Hacks & Tricks



Re: Voice Controlled Robosapien

Detailed hardware interfacing and vision software
description, please!

I have some great articles on vision that I put
together, with tons of drawings, charts,
diagrams, and pseudo-code so it's easily
translatable to any language.

I did the same for the navigation program.
If only I could find a publisher....

Rich



Re: Voice Controlled Robosapien



Do you have a web site?  I'd like to see some of those items.

Thanks,
Robert


--
Robert Oschler
http://www.robotsrule.com/phpBB2/
Robot & Android Discussion Forum



Re: Voice Controlled Robosapien


The hardware for machine vision is fairly cheap. No problem there. It's
all in the software!

I am currently working with commercially-available DirectShow filters
that are intended for machine vision in factory automation-type
applications. Beats reinventing the wheel, and I can apply all of my
time in working with the data itself. Obviously not a Linux or Mac
solution, though.

My immediate application isn't actually for robotics, but for doing
certain image analysis of motion pictures, in realtime or better
(preferably faster than 24 fps). But many of the same techniques can be
used for robotics. Curiously, most people who have applied machine
vision ideas to video/film stop at shot change detection, or limit
their systems to highly controlled studio environments for motion
tracking (CG stuff). There's a whole lot more out there.

-- Gordon

Re: Voice Controlled Robosapien



Robert Oschler wrote:

Gordon McComb wrote:

And those that know how to do it aren't telling.

What we need is ROBOT BASIC (or ROBOT C). This could
be written in C++ using DirectX etc and compile for
Windows and Linux but allow those without a degree in
programming to tailor the hardware to their own ideas.

Actually I think Linux would be the most suitable
from what I have read. A robotic interface to the
Linux kernel?

Essentially provide a simple means of reading USB
ports and grabbing images from webcams at a reasonable
speed.

In other words make it as easy to program a MB as it
is to program a PIC using BASIC or C.

-- John


Re: Voice Controlled Robosapien


For native Linux, I imagine the typical DirectShow filter could be
revised, as it's pretty much just standard C, but it's the idea of
building "graphs" out of multiple filters that makes DirectShow so
useful for this application (and the fact that DirectShow will
automatically interpose necessary colorspace and decompression filters
as needed, saving you the hassle of hand-building each graph from
scratch). Does Linux offer a similar architecture?
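
The graph-building idea described above can be illustrated with a toy
model. The sketch below is not the DirectShow API; the Filter class, the
format names, and the CONVERTERS table are all hypothetical, chosen only
to show how a converter might be interposed automatically when two
neighbouring filters disagree on a format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Filter:
    """Hypothetical filter: a name plus the pixel formats it accepts and produces."""
    name: str
    accepts: Optional[str]   # None for a source (no input pin)
    produces: Optional[str]  # None for a sink (no output pin)


# Hypothetical table of available format converters, keyed by (from, to).
CONVERTERS = {
    ("YUY2", "RGB24"): "yuy2_to_rgb24",
    ("RGB24", "GRAY8"): "rgb24_to_gray8",
}


def build_graph(filters: List[Filter]) -> List[str]:
    """Chain filters in order, inserting a converter wherever formats differ."""
    graph = [filters[0]]
    for nxt in filters[1:]:
        prev = graph[-1]
        if prev.produces != nxt.accepts:
            key = (prev.produces, nxt.accepts)
            if key not in CONVERTERS:
                raise ValueError(f"no converter from {key[0]} to {key[1]}")
            # Interpose the converter so the caller never has to wire it by hand.
            graph.append(Filter(CONVERTERS[key], key[0], key[1]))
        graph.append(nxt)
    return [f.name for f in graph]


if __name__ == "__main__":
    chain = [
        Filter("webcam", None, "YUY2"),
        Filter("edge_detect", "RGB24", "RGB24"),
        Filter("renderer", "RGB24", None),
    ]
    print(build_graph(chain))
```

The caller lists only the filters it cares about; the colorspace
conversion step is added by the builder, which is the convenience being
credited to DirectShow here.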

Actually, DirectShow is kinda kludgy, and hard to use in VB or even C#
(its COM interfaces aren't Automation-compatible), but I hear Longhorn will use a new
architecture, and will rely on managed code through .NET. Performance
issues and OS dogma notwithstanding, this ought to bring the world of
machine vision closer to mere mortals, but OS hooks through existing
DirectShow filters will have to be revised. I figure a 3-5 year
timeline.


Personally I feel a high frame rate is very useful. Low frame rates,
especially at low shutter speeds, just create blurs. Hard to do
anything meaningful with these. The work that I'm doing relies on
reasonably high resolution (but still standard def) video, at full 24fps
or 30fps speeds. The limiting factor I'm up against is that the bulk of
the video processing is on film that's been transferred to video. For
most scenes, a film camera takes a fairly long exposure for each frame,
so motion blur is common. Adds a layer of complexity in trying to figure
out what's happening. OTOH, you can sometimes use the length and
direction of the motion blur to determine velocity, etc, given the right
circumstances.
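
As a rough worked example of the blur-to-velocity idea: if a streak's
length and angle can be measured and the exposure time is known, apparent
speed is just streak length divided by exposure time. The function name
and the 1/48 s exposure (24 fps with a 180-degree shutter) below are
assumptions for illustration, and a streak alone cannot tell you which
way along its axis the object moved.

```python
import math


def blur_velocity(blur_px, blur_angle_deg, exposure_s):
    """Estimate apparent image-plane velocity (pixels/second) from a blur streak.

    blur_px        -- measured streak length in pixels
    blur_angle_deg -- streak angle, measured from the image x axis
    exposure_s     -- exposure time for the frame, in seconds
    Returns (vx, vy). The sign is ambiguous: the streak gives an axis,
    not a direction of travel.
    """
    speed = blur_px / exposure_s
    angle = math.radians(blur_angle_deg)
    return speed * math.cos(angle), speed * math.sin(angle)


if __name__ == "__main__":
    # Assumed exposure: 24 fps film with a 180-degree shutter, i.e. 1/48 s.
    vx, vy = blur_velocity(blur_px=12, blur_angle_deg=0, exposure_s=1 / 48)
    print(round(vx), round(vy))
```

Converting pixels per second into real-world speed would additionally
need the scene scale (distance and lens), which is part of what "given
the right circumstances" has to cover.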

Webcams are okay, but I think a better approach is a fairly good camera
with an excellent lens, and full video speeds, over USB 2.0 or Firewire.
Some of the mini-ITX boards have built-in video inputs, as well. Some
interesting things can be done with a high resolution BW medical camera.
(A really cool arrangement might be two cameras pointing at the same
scene, or through a beam splitter. One could be color, and the other
high res BW.)

-- Gordon

Re: Voice Controlled Robosapien



The direction I am coming from is, "what is possible
as regards using vision in a hobby robot"?

This would require something like a mini-ITX board,
even if the i/o board uses its own uCs, and the web
cam is the cheapest easily available option. It is
sufficient for the needs of a simple robot.

So the next question is "what is the cheapest setup that
would be accessible to the widest range of hobbyists?"

Keep in mind that a hobbyist may not have a degree in
computer programming and it is not reasonable that they
should have one. The only choice I see is VB, despite its
cost, or Java. Even with these languages they need to have
access to routines to grab images from a webcam and use
the USB port to communicate with the i/o card **and be
shown how to use these routines** in their own programs.
Even those who have played with VB may only have been
able to reach a certain level with the "How to Learn
Visual Basic in 21 days".

When I read "teach yourself Visual C++ 5 in 21 days"
it didn't really explain anything. Just a set of recipes
to follow to do a limited number of things using the
Wizard things.

Of course there is no reason why a professional programmer
could not provide a C++ shell in which a C programmer could
access the camera and USB port. I have been using such a
shell with VC++ but it is not suitable for general use, as
it limits you to Windows and requires you to buy VC++. It is
also very slow...


-- John


Re: Voice Controlled Robosapien


Video for Linux is basically a typical unix device interface, based on
ioctl() calls to set various parameters, and read() to get frames. This
makes it several orders of magnitude easier to use than DirectShow, but
it doesn't offer the directed graph architecture that DirectShow does.

Really, I never cared much for the DirectShow architecture -- I usually
end up doing any required  color mapping to some kind of useful YUV or
RGB format close to the source, and any processing happens subsequently.
If I want a piping architecture, it's easy enough to do it myself, which
has the added benefit of making the code significantly more portable. It
also doesn't lock me in to all the indirection that the DirectShow API
requires. Basically, I usually have a single FrameProducer class, an
optionally ref-counted Frame class, and I just run with it from there.
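
A minimal sketch of that kind of do-it-yourself pipeline, in Python
rather than the poster's own code. The class names follow the post, but
everything inside them is assumed: the producer here generates synthetic
solid-color RGB frames instead of reading a capture device, and Python's
own reference counting stands in for the optional ref-counted Frame.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class Frame:
    index: int
    width: int
    height: int
    pixels: bytes  # packed pixel data; layout depends on the stage that made it


class FrameProducer:
    """Stand-in source: yields synthetic packed-RGB frames (3 bytes per pixel)."""

    def __init__(self, width: int, height: int, count: int):
        self.width, self.height, self.count = width, height, count

    def __iter__(self) -> Iterator[Frame]:
        for i in range(self.count):
            value = i % 256  # every channel of every pixel gets the same value
            data = bytes([value]) * (self.width * self.height * 3)
            yield Frame(i, self.width, self.height, data)


def to_gray(frame: Frame) -> Frame:
    """Convert packed RGB to 1 byte per pixel by averaging the three channels."""
    px = frame.pixels
    gray = bytes((px[i] + px[i + 1] + px[i + 2]) // 3 for i in range(0, len(px), 3))
    return Frame(frame.index, frame.width, frame.height, gray)


def run_pipeline(producer: Iterable[Frame],
                 stages: List[Callable[[Frame], Frame]]) -> Iterator[Frame]:
    """Pass each frame through the stages in order, like a hand-built pipe."""
    for frame in producer:
        for stage in stages:
            frame = stage(frame)
        yield frame


if __name__ == "__main__":
    for out in run_pipeline(FrameProducer(4, 2, 3), [to_gray]):
        print(out.index, len(out.pixels))
```

Because the stages are plain functions over a Frame, swapping the
synthetic producer for a real capture source would not change anything
downstream, which is the portability argument made above.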

One advantage that DirectShow has over V4L is that if you want a
configuration UI for a device, you can just invoke the dialogs
associated with the device. V4L doesn't offer this, since it generally
adheres to the "interface not bound to implementation" philosophy (which
is overall a good thing).

The only significant drawback to video capture on Linux really has to do
with setting up devices; there's support for a lot of capture/webcam
hardware for Linux, but the hardware usually tends to be older, since
very little ships with Linux drivers. You need to take some care when
choosing a capture card or webcam, and be wary of newer hardware. It's
generally worth spending some quality time with Google before making a
purchase.

Cheers - m
--
(Replies: cleanse my address of the Mark of the Beast!)

Teleoperate a roving mobile robot from the web:
http://www.swampgas.com/robotics/rover.html

Coauthor with Dennis Clark of "Building Robot Drive Trains".
Buy several copies today!

Re: Voice Controlled Robosapien

Come on, guys, this is fake.
I could tell by his vocal cords that it is fake.

Also, Gordon McComb, you of all people: I would not have thought that you
would think that this is real.

Come on now?


Re: Voice Controlled Robosapien


Have you downloaded the code and/or precompiled executable yet from
Sourceforge?

The speech recognition engine is Microsoft's -- comes free with Windows
-- and IR transceivers aren't exactly new technology. I'm not sure what
it is you're thinking is faked. I just like how Robert has put
everything together.

-- Gordon

Re: Voice Controlled Robosapien


Sorry..my mistake! CMU's Sphinx. (So that's why it loaded so fast! <g>)

-- Gordon

Re: Voice Controlled Robosapien

That does not mean anything.
Just because the source code is there doesn't mean he did it. Come on
now.

If he has the sapien with VR, why doesn't he just make another video
showing the sapien, panning the camera 360 degrees to show that
no one else is around, then pointing the camera at an angle that only
shows him and the sapien, and then doing the VR thing again?


Re: Voice Controlled Robosapien

Also, Gordon McComb, I am the plastic guy who called today about the
color of the plastic.


Re: Voice Controlled Robosapien



Didn't you say the below would be proof for you???  ;-)



Re: Voice Controlled Robosapien


I'm still not sure what it is you're doubting. Are you doubting that
today's speech recognition (it's not voice recognition) technology can
accurately recognize a collection of words from a single speaker? Or that
the speech can trigger macro commands through an IR interface? What's so
hard to believe, other than that someone would take the time to want to
share this with others?
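
The dispatch step being described is small. Here is a sketch, assuming
the recognizer hands back plain text; the phrase table, the command
names, and the send_ir callback are hypothetical and are not taken from
Robosapien Dance Machine's source.

```python
# Hypothetical phrase-to-macro table; each macro is a list of IR command names.
MACROS = {
    "walk forward": ["FORWARD"],
    "stop": ["STOP"],
    "dance": ["LEFT_ARM_UP", "RIGHT_ARM_UP", "BURP"],
}


def dispatch(recognized_text, send_ir):
    """Look up a recognized phrase and send its IR commands in order.

    send_ir is any callable taking one command name; in a real setup it
    would drive the IR transmitter. Returns True if the phrase was known.
    """
    # Normalize case and whitespace so "  Dance " matches "dance".
    key = " ".join(recognized_text.lower().split())
    commands = MACROS.get(key)
    if commands is None:
        return False
    for command in commands:
        send_ir(command)
    return True


if __name__ == "__main__":
    dispatch("Walk  Forward", print)
```

With a small fixed vocabulary like this, the recognizer only has to tell
a handful of phrases apart for one speaker, which is a far easier problem
than open dictation.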


I dunno...he has better things to do with his time? Or maybe he's camera
shy? (Have you ever seen a picture of me?)

I plan on playing with this over the weekend, and I'll let you know.

-- Gordon
