If set to "true" allows SAPI4 Recognizers to "guess" the current SpeakerProfile, so that
subsequent calls to SpeakerManager.getCurrentSpeaker will return the profile that the engine is
currently using.
Basic audio format converter, which is guaranteed to
convert between any two javax.sound.sampled.AudioFormats - unlike the
javax.sound.sampled.AudioSystem conversion,
which sometimes cannot provide the needed conversion.
Basic audio format converter, which is guaranteed to
convert between any two AudioFormats provided both are
16-bit - unlike the javax.sound.sampled.AudioSystem conversion,
which sometimes cannot provide the needed conversion.
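A minimal sketch, using only the standard javax.sound.sampled API, of the kind of conversion that AudioSystem may refuse and that these converter classes are intended to cover (the formats chosen here are purely illustrative):

    import javax.sound.sampled.AudioFormat;
    import javax.sound.sampled.AudioSystem;

    public class ConversionCheck {
        public static void main(String[] args) {
            // 8 kHz u-law (typical telephony capture) and 22.05 kHz 16-bit PCM (typical engine input)
            AudioFormat ulaw = new AudioFormat(AudioFormat.Encoding.ULAW,
                    8000f, 8, 1, 1, 8000f, false);
            AudioFormat pcm = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                    22050f, 16, 1, 2, 22050f, false);

            // AudioSystem may report that this direct conversion is unsupported;
            // the converters described above are meant to fill such gaps.
            System.out.println("AudioSystem supports it: "
                    + AudioSystem.isConversionSupported(pcm, ulaw));
        }
    }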
AudioSink wrapper for a javax.media playback device, enabling audio data to be
played to speakers, etc. (avoiding the bugs encountered by
javax.sound.sampled playback devices when a TargetDataLine is running at
the same time).
AudioSource wrapper for a javax.media capture device, enabling audio data to be
captured from a microphone, etc. (avoiding the bugs encountered by
javax.sound.sampled capture devices when a SourceDataLine is running at
the same time).
Writes audio data to a URL using the JMF packages, with the content type defined by
one of the FileTypeDescriptor String fields, e.g. MPEG_AUDIO, QUICKTIME, etc.
A class with an internal buffer, used to store data read from
its source (set using the setSource method) before it is sent to its
sink (set using the setSink method).
A class designed to allow a single AudioSource to supply the same audio
data to multiple AudioSinks (useful, say, to broadcast speech to multiple
remote clients).
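A minimal sketch of wiring one source to several sinks through such a splitter; the package name, the addSink method, and the constructors shown here are assumptions based only on the class names and the setSource/setSink/startSending contract described in these summaries:

    // Sketch only - the constructors, addSink and the package name are assumptions.
    import com.cloudgarden.audio.*;

    public class SplitterSketch {
        public static void main(String[] args) throws Exception {
            AudioSource mic = new AudioLineSource();          // hypothetical capture source
            AudioSplitter splitter = new AudioSplitter();
            splitter.setSource(mic);                          // one source feeds the splitter

            splitter.addSink(new AudioLineSink());            // local playback (hypothetical)
            splitter.addSink(new AudioFileSink("copy.wav"));  // file copy (hypothetical)

            mic.startSending();   // source keeps writing until END_OF_DATA or stopSending
        }
    }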
Returns false if the AudioFormat cannot be set - e.g. some Recognizers
may not be able to change their input format, so the CGAudioManager
(which is an AudioSink) for such a Recognizer will return false here.
This class opens two RTP streams - one receives audio data from the url "rtp://:12346/audio"
and plays it to the local output device, and the other sends audio data (captured from the
local audio capture device) to the url "rtp://:12344/audio".
This class provides the standard methods of the EngineCentral interface,
but also allows synchronization with the AWT EventQueue to be turned on
or off.
This class provides the standard methods of the EngineProperties interface,
but also allows UI components specific to the Microsoft speech API to be
displayed.
Implements both the FinalRuleResult and FinalDictationResult interfaces,
as described in the JSAPI documentation, but adds two methods allowing
the user to obtain the recognizer's confidence in its estimate of the current result.
Returns false unless you have selected a Profile and/or Voice, depending
on the mode of this panel - i.e., if the mode is SHOW_SYNTHESIZERS | SHOW_RECOGNIZERS,
both a SpeakerProfile and a Voice must be selected for this method to return true.
Used by a server to receive notification that a client has attached itself to, or removed
itself from, an AudioSocketSource, and also to be informed of the bytes transferred to or from a
specific client.
This package provides classes originally intended for redirecting output
from a Synthesizer to a File, a SourceDataLine, a remote client or a custom
AudioSink, and for providing audio data to a Recognizer from a File, a TargetDataLine,
a remote client or a custom AudioSource.
The com.cloudgarden.speech package provides public access
to a few classes which implement
interfaces in the javax.speech packages but add some additional functionality.
The com.cloudgarden.speech.userinterface package provides classes for
drawing a mouth shape as defined by a CGSpeakableEvent and for
displaying speech engines in an extension of a JTree (a SpeechEngineTree),
with the SpeechEngineChooser class providing various dialogs for selecting speech
engines, SpeakerProfiles and Voices.
Reads audio data from a file, and uses an AudioSplitter to send the
data to a recognizer as well as to another file, saving the audio data
in the same format used by the recognizer, which may differ from that of the
input file.
Tests out basic dictation from the default audio device (usually the microphone) -
also demonstrates getting the list of speaker profiles and setting the current speaker.
Tests whether a duplex sound card is being used - keep saying one of the
five commands ("Nice day", "Hello", "How are you", etc.) while the computer
is replying - the computer should hear what you said while it was still talking
and reply when it has finished its current reply.
Demonstrates network transmission of audio data in compressed (GSM) and
uncompressed (RAW) formats - for ease of demonstration it incorporates both
server and client, since both run on the localhost.
Returns a modal SpeechEngineChooser initialized to display all available Recognizers
that match the reqRec parameter and all Synthesizers that match the reqSyn parameter.
Returns a modal SpeechEngineChooser initialized to display all available Recognizers
that match the reqRec parameter and all Synthesizers that match the reqSyn parameter.
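A minimal usage sketch; SpeechEngineChooser comes from this package, but the getDialog factory name and its exact signature are assumptions, and only RecognizerModeDesc and SynthesizerModeDesc are standard JSAPI classes:

    import javax.speech.recognition.RecognizerModeDesc;
    import javax.speech.synthesis.SynthesizerModeDesc;
    import com.cloudgarden.speech.userinterface.SpeechEngineChooser;

    public class ChooserSketch {
        public static void main(String[] args) {
            RecognizerModeDesc reqRec = new RecognizerModeDesc();    // match any recognizer
            SynthesizerModeDesc reqSyn = new SynthesizerModeDesc();  // match any synthesizer

            // Hypothetical factory name for the method described above.
            SpeechEngineChooser chooser = SpeechEngineChooser.getDialog(reqRec, reqSyn);
            chooser.show();   // modal: returns once the user has chosen engines
            // The selected engines, SpeakerProfile and Voice would then be queried
            // from the chooser (accessor names not shown here).
        }
    }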
Returns a confidence value from 0 to 100, indicating the degree of
confidence the recognizer has in a certain alternate
result (0 for the set of best tokens).
Returns a confidence value in a range determined
by the engine, indicating the degree of
confidence the recognizer has in a certain alternate
result (0 for the set of best tokens).
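A sketch of how these confidence accessors might be used from a standard JSAPI result listener; the cast target CGResult and the getConfidence(int) name are hypothetical stand-ins for the result class and the two methods described above, so they are left commented out:

    import javax.speech.recognition.Result;
    import javax.speech.recognition.ResultAdapter;
    import javax.speech.recognition.ResultEvent;

    public class ConfidenceListener extends ResultAdapter {
        public void resultAccepted(ResultEvent e) {
            Result result = (Result) e.getSource();
            // Hypothetical: cast to the CloudGarden final-result implementation and ask
            // for the confidence of the best-token set (index 0), in the 0..100 range.
            // com.cloudgarden.speech.CGResult cg = (com.cloudgarden.speech.CGResult) result;
            // int confidence = cg.getConfidence(0);
            // System.out.println("Confidence: " + confidence);
        }
    }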
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the openness of the jaw.
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the horizontal tension of the lips.
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the total height of the mouth (from min height to max height).
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the total upturn of the mouth (from max downturn to max upturn).
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the total width of the mouth (from min width to max width).
If from a SAPI4 engine, returns the IPA phoneme corresponding to this event;
otherwise, if from a SAPI5 engine, returns the Microsoft PhoneID for this
sound - you will need to translate from the PhoneID to a Unicode value.
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the visibility of the lower teeth.
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the visibility of the upper teeth.
If this is a MOUTH_SHAPE event (or a VISEME event after convertToMouthShapeEvent has been called),
returns a number from 0 to 255 specifying the position of the tongue (0 being lowest, 255 being highest).
Returns the polygon of points in this Component outlining the upper lip.
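A sketch of feeding these mouth-shape values to the Mouth component; CGSpeakableEvent, convertToMouthShapeEvent and the Mouth class are named in these summaries, while the no-arg constructor, the individual getter names and the handler wiring below are assumptions and are therefore commented out:

    import com.cloudgarden.speech.userinterface.Mouth;

    public class MouthShapeSketch {
        private final Mouth mouth = new Mouth();   // Swing component drawing the mouth shape (no-arg constructor assumed)

        // Hypothetical handler, called however the engine delivers viseme events:
        // void handle(com.cloudgarden.speech.CGSpeakableEvent e) {
        //     e.convertToMouthShapeEvent();       // turn a VISEME event into a MOUTH_SHAPE event
        //     int jaw    = e.getJawOpen();        // 0..255, openness of the jaw
        //     int height = e.getMouthHeight();    // 0..255, total height of the mouth
        //     int width  = e.getMouthWidth();     // 0..255, total width of the mouth
        //     // ...pass the values on to the Mouth component for repainting.
        // }
    }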
Tests activating and deactivating rules and grammars in response to spoken commands.
One grammar contains the words "alpha", "bravo", "charlie" and "delta", and the other
contains "one", "two", "three" and "four"; in each grammar only one rule is active at a time,
and only one of the two grammars is active at a time. Switch between
grammars with the command "switch" and activate the next rule in each grammar
with the command "next".
Demonstrates loading a grammar file, "grammars.helloWorld", which
imports another grammar, "grammars.numbers", and recognition of commands
from the helloWorld grammar.
Creates multiple recognizers and synthesizers at once - note that for the Dragon
engine only one recognizer can be created, so this test will fail when trying to allocate
the second recognizer.
Informs the AudioServerSink that this client is initiating a connection,
causing the AudioServer to send a CLIENT_ADDED signal. This
is called automatically when the AudioClientSource is created, but should be called
after closeConnection if you wish to inform the server that this client is starting
a new session.
Informs the AudioServerSource that this client is initiating a connection,
causing the AudioServer to send a CLIENT_ADDED signal. This
is called automatically when the AudioClientSink is created, but should be called
after closeConnection if you wish to inform the server that this client is starting
a new session.
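A small sketch of the session lifecycle described above; AudioClientSource, closeConnection and the CLIENT_ADDED behaviour come from these summaries, while the method name openConnection, the package name and the constructor arguments are assumptions:

    import com.cloudgarden.audio.AudioClientSource;   // package name assumed

    public class ReconnectSketch {
        public static void main(String[] args) throws Exception {
            // Constructor arguments (host, port) are assumptions.
            AudioClientSource client = new AudioClientSource("server.example.com", 7111);
            // The connection was opened automatically by the constructor,
            // so the server has already received a CLIENT_ADDED signal.

            client.closeConnection();   // end the first session
            client.openConnection();    // start a new session: CLIENT_ADDED is sent again
        }
    }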
Demonstrates simple speech synthesis, as well as
adding a word (with pronunciation)
to the VocabManager, using JSML tags in the speech string, and
some oddities of the SAPI4 engines.
Used to read data from this source. Called by the AudioSink to which this
source is connected (if its startGetting method is used), so it need
not be called explicitly by an application.
A JPanel used to set Recognizer properties, add/remove words from
the Recognizer's VocabManager, test dictation and a minimal set
of commands, and display
the native interfaces for the Recognizer.
This example demonstrates sending audio data from a client machine to
a server machine using the com.cloudgarden.speech.audio package - it
does not use a speech engine.
This example demonstrates using an AudioServerSource object
to provide a CGAudioManager attached to a Recognizer
with audio data received from a remote client.
Creates a Recognizer which listens to an incoming RTP stream and replies
with synthesized speech on a separate RTP stream - to send the
RTP stream to the recognizer and hear its response, start up the CaptureAndPlay class, then
this class, then say "what date/time is it", or "goodbye computer".
Creates a Recognizer which listens to an incoming RTP stream and replies
with synthesized speech on a separate RTP stream - to send the
RTP stream to the recognizer and hear its response, start up the CaptureAndPlay class,
then speak into the microphone - the recognizer should respond
with what it thinks you said.
Allows the audio data from a recognized sample of speech to be saved
to a file - the fileName parameter should be the name (new or already
existing) of a WAVE file.
If realTime is true, causes this DataSource to delay inside calls to read(Buffer)
so that the data is released no faster than real time (otherwise it may be released
much faster than real time, which may cause problems when sending the data to
an RTP client).
Sets the sink for this source - this method should also call the
setSource method on the sink object (making sure to avoid an endless
loop) to ensure that source and sink are connected to each other
(and not to different sinks/sources).
Sets the source for this sink - this method also calls the
setSink method on the source object (while avoiding an endless
loop) to ensure that sink and source are connected to each other
(and not to different sinks/sources).
Sets the source for this sink - this method should also call the
setSink method on the source object (while avoiding an endless
loop) to ensure that sink and source are connected to each other
(and not to different sinks/sources).
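A self-contained sketch of this mutual setSource/setSink contract, showing one way to keep both ends consistent without recursing endlessly; the two stand-in interfaces below are simplifications, not the real com.cloudgarden audio interfaces:

    // Minimal stand-in interfaces; the real AudioSource/AudioSink have more members.
    interface Source { void setSink(Sink s); }
    interface Sink   { void setSource(Source s); }

    class SimpleSource implements Source {
        private Sink sink;
        public void setSink(Sink s) {
            if (sink == s) return;            // already connected: avoid an endless loop
            sink = s;
            if (s != null) s.setSource(this); // keep the sink pointing back at this source
        }
    }

    class SimpleSink implements Sink {
        private Source source;
        public void setSource(Source s) {
            if (source == s) return;          // same guard on the sink side
            source = s;
            if (s != null) s.setSink(this);
        }
    }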
An example which splits synthesized speech to three AudioLineSinks
and starts them at staggered intervals, so you can hear the three identical
but separate streams.
Creates a new SpeechControlPanel form which can display Recognizers and/or
Synthesizers (depending on the value of "mode"), and only
those Recognizers/Synthesizers selected by the given RecognizerModeDesc
and SynthesizerModeDesc (if these are null, then all Synthesizers/Recognizers are shown).
Creates and initializes a SpeechEngineTree which will contain Recognizers and/or Synthesizers
depending on the value of mode (which can be a combination of SHOW_SYNTHESIZERS and SHOW_RECOGNIZERS).
Displays all available engines of either type.
Creates and initializes a SpeechEngineTree which will contain Recognizers and/or Synthesizers
depending on the value of mode (which can be a combination of SHOW_SYNTHESIZERS and SHOW_RECOGNIZERS),
but only those Recognizers and/or Synthesizers matching requiredRec and requiredSyn.
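A usage sketch; SpeechEngineTree and the SHOW_* flags are named in these summaries, the mode descriptors are standard JSAPI, and the exact constructor signature is an assumption:

    import java.util.Locale;
    import javax.speech.recognition.RecognizerModeDesc;
    import javax.speech.synthesis.SynthesizerModeDesc;
    import javax.swing.JFrame;
    import javax.swing.JScrollPane;
    import com.cloudgarden.speech.userinterface.SpeechEngineTree;

    public class EngineTreeSketch {
        public static void main(String[] args) {
            int mode = SpeechEngineTree.SHOW_SYNTHESIZERS | SpeechEngineTree.SHOW_RECOGNIZERS;

            // Restrict the tree to US-English engines (standard JSAPI descriptors).
            RecognizerModeDesc requiredRec = new RecognizerModeDesc(Locale.US, null);
            SynthesizerModeDesc requiredSyn = new SynthesizerModeDesc(Locale.US);

            // Constructor signature assumed from the summary above.
            SpeechEngineTree tree = new SpeechEngineTree(mode, requiredRec, requiredSyn);

            JFrame frame = new JFrame("Available speech engines");
            frame.getContentPane().add(new JScrollPane(tree));   // the tree extends JTree
            frame.pack();
            frame.setVisible(true);
        }
    }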
If CGEngineProperties.setEventsInNewThread has been used to enable each speech event
to be started in a new thread, then the threads in which they are started are all
SpeechEventThreads.
This method should start a thread which repeatedly writes data to
the AudioSink object to which this object is connected, until it
writes data with length END_OF_DATA, or until the stopSending
method is called.
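A self-contained sketch of the startSending contract described here: a background thread keeps writing to the connected sink until END_OF_DATA is written or stopSending is called. The helper methods and the END_OF_DATA value are placeholders, not the real API:

    class SendingLoopSketch {
        private static final int END_OF_DATA = -1;   // placeholder sentinel value
        private volatile boolean stopped;

        public void startSending() {
            new Thread(() -> {
                byte[] buf = new byte[4096];
                while (!stopped) {
                    int n = readFromDevice(buf);   // placeholder for reading the source data
                    writeToSink(buf, 0, n);        // placeholder for writing to the connected sink
                    if (n == END_OF_DATA) break;   // tell the sink the stream has ended
                }
            }).start();
        }

        public void stopSending() { stopped = true; }

        // Placeholders so the sketch compiles on its own:
        private int readFromDevice(byte[] b) { return END_OF_DATA; }
        private void writeToSink(byte[] b, int off, int len) { }
    }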
A JPanel used to set Synthesizer properties, add/remove words from
the Synthesizer's VocabManager, speak arbitrary words using any of
the Synthesizer's voices (selected from a drop-down list) and display
the native interfaces for the Synthesizer.
Creates a TestResultListener which will deallocate the given Recognizer
after "nRecs" accepted recognitions, and will replay the recorded audio
if "playAudio" is true (and audio is being saved using
RecognizerProperties.setResultAudioProvided(true)).
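A sketch of preparing a recognizer for such a listener; Central, allocate and setResultAudioProvided are standard JSAPI, while the TestResultListener constructor shown in the comment is an assumption based on the summary above:

    import javax.speech.Central;
    import javax.speech.recognition.Recognizer;

    public class ResultAudioSketch {
        public static void main(String[] args) throws Exception {
            Recognizer rec = Central.createRecognizer(null);   // default recognizer
            rec.allocate();

            // Ask the engine to keep the audio for each result so it can be replayed.
            rec.getRecognizerProperties().setResultAudioProvided(true);

            // Deallocate after 3 accepted results and replay the recorded audio
            // (hypothetical constructor, per the summary above):
            // rec.addResultListener(new TestResultListener(rec, 3, true));
        }
    }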
A non-standard method allowing synthesizers to work around a bug regarding wordReached
events in the Infovox 330 synthesizers (and possibly other versions of these engines).
Sets the use of smoothing in conversion when going from
a higher frequency to a lower frequency - sometimes gives better quality
than not smoothing, but is slower.
A sample Swing-based application which uses the Mouth component and a
JEditorPane to display a text document and highlight words currently being spoken.
Used to write data to this sink. Called by the AudioSource to which this
sink is connected (if its startSending method is used), so it need
not be called explicitly by an application.
This method will write audio data to an internal buffer until it is called with
END_OF_DATA as its length parameter, at which point it will
write all the stored data to the file.
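A self-contained sketch of this buffer-then-flush behaviour using a plain ByteArrayOutputStream; the real sink writes a proper WAVE file and defines its own END_OF_DATA constant, so the value below is only a placeholder:

    import java.io.ByteArrayOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;

    class BufferingFileSinkSketch {
        private static final int END_OF_DATA = -1;   // placeholder sentinel value
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        private final String fileName;

        BufferingFileSinkSketch(String fileName) { this.fileName = fileName; }

        // Buffers data until called with END_OF_DATA as the length, then flushes to the file.
        void write(byte[] data, int offset, int length) throws IOException {
            if (length == END_OF_DATA) {
                try (FileOutputStream out = new FileOutputStream(fileName)) {
                    buffer.writeTo(out);   // write all the stored data at once
                }
            } else {
                buffer.write(data, offset, length);
            }
        }
    }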