# Majordom

## General idea

- The STT "pocketsphinx_continuous" continuously records the microphone and searches for sentences matching a given language
- Then it interprets utterances according to "natural command" concepts (see ./testing_natural_command/*)
- Commands (more or less hardcoded) are sent to the xAAL bus
- The result is spoken by the "espeak" TTS (just for fun)

## Notes

- The code here is based on pocketsphinx_continuous release "5prealpha"

        URL: svn://svn.code.sf.net/p/cmusphinx/code/trunk/pocketsphinx/src/programs/continuous.c
        Last Changed Rev: 12846
        Last Changed Date: 2015-02-12 00:10:29 +0100 (Thu, 12 Feb 2015)
        Text Last Updated: 2015-04-29 20:58:37 +0200 (Wed, 29 Apr 2015)

  You can use continuous.patch to rebuild the continuous.c file used here.
- However, this may also work with the last stable release (libpocketsphinx-dev 0.8-x)

## Results

- Tests are performed on the French language
- The acoustic model is lium_french_f0.tar.gz
  http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/French%20F0%20Broadcast%20News%20Acoustic%20Model/
- The proposed lm-grammar and dictionary are built by the scripts in ./tools/
- Sentences to pronounce: ./tools/corpus.txt

## Documentation

- http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx
- http://cmusphinx.sourceforge.net/wiki/faq#qcan_pocketsphinx_reject_out-of-grammar_words_and_noises

## Build

- Get the latest cmu-sphinx release, compile it, install it:

        $ svn checkout svn://svn.code.sf.net/p/cmusphinx/code/trunk cmusphinx-code
        $ cd cmusphinx-code
        $ for i in sphinxbase pocketsphinx cmuclmtk ; do
        >   cd $i
        >   ./autogen.sh && ./configure && make
        >   sudo make install
        >   cd ..
        > done

  Note that sphinxbase detects by itself, at configure time, whether your system uses PulseAudio, JACK, ALSA or OSS (in this order)
- Dependencies:
  . libsphinxbase libpocketsphinx (see above)
  . libxaal + packages uuid-dev libjson-c-dev libsodium-dev
  . package libfstrcmp-dev
  . packages libespeak-dev mbrola mbrola-fr1 (just for fun)

## Discussions

- About sphinx
  . The classical lm-grammar search is fine, but it may produce false positives. The "keyword spotting search" under development seems very promising.
- About natural command
  . The algorithm is more or less a kind of maximum-likelihood estimation over a list of keywords associated with the expected commands. Grammar is not taken into account.
  . Pro: the algorithm is naive but very simple ;-)
  . Cons: the associated xAAL commands are more or less hardcoded.

## How to customize the Majordom

- The 'Speech To Text' part
  . Choose an acoustic model (from audio to phonemes)
  . Choose a dictionary (from phonemes to words)
  . Choose an lm-grammar (expected sentences, in terms of the probability of getting single words, sequences of two words, sequences of three words)
  . Look at the scripts in ./tools/
  . Look at CMU Sphinx for details
- The 'Natural Command' part, i.e. from the world of sentences in natural language to the world of xAAL commands
  . Edit 'translate.doc': for each target word of the xAAL world (devTypes, methods to call, tags of devices), it gives the corresponding names in your natural language (the source words)
  . If needed, edit 'cmd_lib.c' (and recompile): it contains the code of the xAAL actions, and their signatures in terms of the target words recognized by the previous steps
- The 'Text To Speech' part
  . The Majordom selects French or English output according to your locale, so set your locale beforehand (the original text names an LC_LANG shell variable; on most systems the locale is set through LANG or LC_ALL)
  . The parameters of the espeak+mbrola engines are hardcoded in tts.c, so edit it and recompile if needed
  . Sentences to pronounce and their translations (in French) are in .mo files. So customize the .po files and rebuild the .mo files if needed

## Phone your Majordom

The idea is to connect the Majordom to an asterisk IPBX via the alsa audio loopback driver.

- Load the alsa drivers at boot time. Edit /etc/modules and add

        snd-aloop

- Let the loopback driver be the default audio device. Create a file named /etc/modprobe.d/sound.conf and add

        alias snd-card-0 snd-aloop

- Configure asterisk to load its alsa channel plugin. Edit /etc/asterisk/modules.conf with

        load => chan_alsa.so
        noload => chan_oss.so

- Configure the alsa plugin for asterisk. Edit /etc/asterisk/alsa.conf

        [general]
        autoanswer=yes
        context=default
        extension=s
        input_device=plughw:0,1
        output_device=plughw:0,1

- Edit /etc/asterisk/extensions.conf

        [default]
        ../..
        include => majordome
        ../..

        [majordome]
        exten => 6000,1,Answer
        exten => 6000,2,Ringing
        exten => 6000,3,Wait(2)
        exten => 6000,4,eSpeak("Bonjour. Je suis le majordome de Experiment'HAAL. Comment puis-je vous aider ?")
        exten => 6000,5,Dial(console/alsa)
        exten => 6000,6,Hangup

Then, call 6000!
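
As a footnote to the "About natural command" discussion: the keyword-based matching described there can be pictured with the minimal sketch below. It is not the code of cmd_lib.c. The command table, the names `has_word`, `score` and `best_command`, the scoring rule (fraction of a command's keywords found in the utterance) and the threshold parameter are all assumptions made for illustration. Exact word matching is used here for simplicity, whereas the real code may rely on fuzzy comparison (libfstrcmp is listed in the dependencies).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_KEYWORDS 4

/* Hypothetical command table: names and keywords are illustrative only. */
struct command {
    const char *name;                   /* label of the command */
    const char *keywords[MAX_KEYWORDS]; /* NULL-terminated unless full */
};

static const struct command commands[] = {
    { "lamp_on",    { "turn", "on",  "lamp", NULL } },
    { "lamp_off",   { "turn", "off", "lamp", NULL } },
    { "shutter_up", { "open", "shutter", NULL } },
};
#define N_COMMANDS (sizeof commands / sizeof commands[0])

/* Return 1 if `word` appears as a whole space-separated token. */
static int has_word(const char *utterance, const char *word)
{
    size_t len = strlen(word);
    const char *p = utterance;

    while ((p = strstr(p, word)) != NULL) {
        int starts = (p == utterance) || (p[-1] == ' ');
        int ends = (p[len] == '\0') || (p[len] == ' ');
        if (starts && ends)
            return 1;
        p += 1; /* keep searching after this partial match */
    }
    return 0;
}

/* Score = fraction of the command's keywords found in the utterance. */
static double score(const struct command *cmd, const char *utterance)
{
    int total = 0, found = 0;

    for (int i = 0; i < MAX_KEYWORDS && cmd->keywords[i] != NULL; i++) {
        total++;
        found += has_word(utterance, cmd->keywords[i]);
    }
    return total ? (double)found / total : 0.0;
}

/* Return the name of the best-scoring command, or NULL when no
 * command reaches `threshold` (word order and grammar are ignored). */
const char *best_command(const char *utterance, double threshold)
{
    const char *best = NULL;
    double best_score = 0.0;

    assert(utterance != NULL);
    for (size_t i = 0; i < N_COMMANDS; i++) {
        double s = score(&commands[i], utterance);
        if (s > best_score) {
            best_score = s;
            best = commands[i].name;
        }
    }
    return best_score >= threshold ? best : NULL;
}
```

With this table, `best_command("please turn on the lamp", 0.6)` selects "lamp_on" (all three keywords match, against two out of three for "lamp_off"), while an utterance matching too few keywords yields NULL. A rejection threshold of this kind is one simple way to limit the false positives mentioned in the discussion.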