::scr speech recognition?

jo walsh scr@thegestalt.org
Thu, 15 Nov 2001 11:52:29 +0000 (GMT)


this is partly desktop dipsy 2; a designer friend of mine is looking at
doing a friendly, graphical/aural version of a dadadodo/eliza-style inane
agent. i'm pushing festival at her for reasonable speech synthesis,
but she's been told the voice recognition aspect of her idea is "(almost)
impossible".
 
on a quick google for open source speech recognition i found sphinx:
http://www.speech.cs.cmu.edu/sphinx/ - speech.cs.cmu.edu is kevin lenzo's
patch, isn't it? yes. sadly enough, the next hit is a piece of ms/mundie
propaganda:

"There is an equally important tradition of commercial companies
 having the opportunity to benefit from and apply this public
 knowledge, including by developing commercial products that are
 protected by IP rights. There are many examples of this, including
 the many products that grew from research in the space program
 and the advances in speech recognition technology that followed
 work done at pre-eminent institutions such as Carnegie Mellon."[0]

poor lenzo. this is a bad side-track, though.

i recall alex playing, utterly fruitlessly, with a speech-to-text program
that comes free with SuSE - presumably sphinx-based? unrecall. i'd have a
play but my microphone battery is flat and the sound drivers on my laptop
are shot to fuck.

as liz's idea would be more desktop dadadodo than dipsy, focused on art
rather than utility, the speech recognition wouldn't have to be perfect, as
long as it worked in some vague sense.

would anyone here have inspiration on how she could interface this to
vector graphics, with speech input/output in something resembling
realtime, without too much work from the c/java whiz who might be
available?

z

[0] http://www.microsoft.com/presspass/exec/craig/05-03sharedsource.asp
--
<robin> they're taking the elephants away from London zoo
<robin> so my electric ones are needed more than ever