::scr speech recognition?

simon wistow scr@thegestalt.org
Thu, 15 Nov 2001 12:21:39 +0000


> would anyone here have inspiration on how she could interface this to
> vector graphics, with speech input/output in something resembling
> realtime, without too much work from the c/java whiz who might be
> available?

Very simply you could run the speech through a winamp/xmms plugin - most of
them are open source and you could probably code in a hook that would take the
guessed speech and pump it into the visualisation.
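
A rough sketch of the sort of hook I mean, in Python purely for illustration
(a real winamp/xmms plugin would be C). Everything here is made up for the
example: on_guess() stands in for whatever callback the recogniser gives you,
and drain() for whatever the plugin's render loop calls once per frame.

```python
import queue

# Hypothetical glue between recogniser and visualisation: the recogniser
# pushes guessed words in, the render loop pulls them out once per frame.
words = queue.Queue()

def on_guess(text):
    """Assumed recogniser callback: receives each guessed phrase as a string."""
    for word in text.split():
        words.put(word)

def drain(max_words=10):
    """Assumed to be called by the render loop; returns at most max_words
    pending words and never blocks if nothing has been said."""
    out = []
    while len(out) < max_words:
        try:
            out.append(words.get_nowait())
        except queue.Empty:
            break
    return out
```

The queue is only there so that a slow recogniser can't stall the drawing
code, which matters if you want something resembling realtime.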

Most of the plugins are of the psychedelic swirly pulsy kind which might be
pedestrian for this sort of thing but there are some interesting ones out
there and AFAIK plugins aren't that hard to write either. For certain values
of hard. Freeamp is even easier.

For example there are various dancer plugins - have people on screen dance to
your voice. http://www.na.linux.hr/projects/xplsisnjasp/ lets you plug in
hardware light shows and stuff.

The other thing to do might be to play around with something that celia and I
saw at the Whitney in NY - take the text, do a google for that phrase and
then either project it on a wall or generate a web page or an image using
something like webcollage (http://www.jwz.org/webcollage/). This would be easy
to do using Imlib or ImageMagick or GD and using something like Evas
(http://www.enlightenment.org) you could do animation by fading or blurring
out previous layers or scrolling them off screen. Java could also do this. The
other thing might be to look at OpenGL - it's actually fairly easy to program
and texturemapping text and images onto flat polygons and then spinning,
stretching, scrolling and generally distorting them wouldn't be hard at all.
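
To show what I mean by fading out previous layers, here is a toy version in
plain Python. It is not Evas or Imlib code: frames are just lists of rows of
0-255 greyscale values, and fade(), composite() and step() are names invented
for the sketch. A real version would use one of the libraries above.

```python
def fade(frame, factor=0.8):
    """Dim every pixel of a greyscale frame (list of rows of 0-255 ints)."""
    return [[int(p * factor) for p in row] for row in frame]

def composite(frame, layer, x, y):
    """Paste a smaller layer onto a copy of frame at (x, y), keeping the
    brighter pixel. Parts of the layer that fall off the frame are dropped."""
    out = [row[:] for row in frame]
    for dy, row in enumerate(layer):
        for dx, p in enumerate(row):
            if 0 <= y + dy < len(out) and 0 <= x + dx < len(out[0]):
                out[y + dy][x + dx] = max(out[y + dy][x + dx], p)
    return out

def step(frame, layer, x, y, factor=0.8):
    """One animation step: fade what is already there, then add the new layer."""
    return composite(fade(frame, factor), layer, x, y)
```

Call step() once per frame with each new image or bit of rendered text and the
older ones dim away on their own; scrolling is the same idea with x and y
changing between calls.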

If she needs any help then I'm looking for something neat to work on :)


As for speech recognition - I've seen Kevin do his demo at TPC and it was
amazing. Sphinx has matured a lot over the last year or so but yes, getting it
to work means wrestling with shitty *nix drivers, but it does work under NT as
well.

http://mambo.ucsc.edu/psl/speech.html has lots of links.

http://freespeech.sourceforge.net/ is GPL'd

It might actually be worth buying one of the commercial ones - they're not
*that* expensive and Dragon Dictate and Viavoice were quite good last time I
tried them.

</brain dump>




-- 
:  everything after here is irrelevant