PocketSphinx is a service which performs continuous speech recognition, CSR (also referred to as automatic speech recognition, ASR, or voice-to-text).
It was ported to Android by the cmusphinx team at SourceForge.
This project aims to use PocketSphinx to meet the following requirements:
- Offline speech recognition
- Process pre-recorded audio files
  - Preferred format: mp3
- Eyes-free ASR
  - Transcribe dictation / create transcripts for podcasts / create subtitles for video
  - Audio file to text (a map of time periods to an array of hypothesized text)
  - Boolean to run it on the device, or to send it to a Sphinx server elsewhere
- General eyes-free speech recognition
  - Register PocketSphinx as a service which responds to android.speech.RecognizerIntent so that users can make it the default in the preferences (i.e. if they have no data connection on their Android, or are generally not online); a minimal caller sketch follows this list
  - Create an Open Intent for other developers to call PocketSphinx
  - Function very similarly to com.google.android.voicesearch, except allow a boolean to control whether it stops "listening" on silence or on user action (back button, screen tap, gesture, top-to-bottom swipe, etc.)
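For reference, here is a minimal sketch of how a client would invoke whatever recognizer is registered for the standard speech intent (which would include PocketSphinx once it responds to android.speech.RecognizerIntent). The activity name and request code are hypothetical; the intent extras are the stock android.speech API.

```java
import java.util.ArrayList;

import android.app.Activity;
import android.content.Intent;
import android.speech.RecognizerIntent;

public class RecognizeDemo extends Activity {
    private static final int VOICE_RECOGNITION_REQUEST = 1234; // arbitrary request code

    /** Call from e.g. a button handler to start the registered recognizer. */
    private void startRecognition() {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 5);
        startActivityForResult(intent, VOICE_RECOGNITION_REQUEST);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        if (requestCode == VOICE_RECOGNITION_REQUEST && resultCode == RESULT_OK) {
            // Each entry is one hypothesized transcription, best first.
            ArrayList<String> hypotheses =
                    data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
        }
        super.onActivityResult(requestCode, resultCode, data);
    }
}
```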
Audio file processing should allow for a boolean splitOnSilence:
- Pre-process by creating an annotation file for the audio (a sketch follows this list)
  - Format: .srt or WebVTT
  - Reasoning: if the time annotation is provided in .srt format it will allow the code to be re-used for other developers' purposes, including:
    - Displaying subtitles, and re-syncing subtitles in a video player if they are out of sync
    - Displaying transcripts of podcasts while playing in a music player
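To make the annotation format concrete, the "map of time periods to hypothesized text" could be serialized as a minimal SubRip (.srt) file along these lines. SrtWriter and its parallel-list representation of the segments are hypothetical, not part of PocketSphinx:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

class SrtWriter {
    /** Format milliseconds as the SubRip timestamp HH:MM:SS,mmm. */
    static String srtTime(long millis) {
        return String.format("%02d:%02d:%02d,%03d",
                millis / 3600000, (millis / 60000) % 60,
                (millis / 1000) % 60, millis % 1000);
    }

    /** times holds {startMillis, endMillis} pairs; texts holds the best hypothesis per segment. */
    static void write(File out, List<long[]> times, List<String> texts) throws IOException {
        PrintWriter w = new PrintWriter(new FileWriter(out));
        for (int i = 0; i < times.size(); i++) {
            w.println(i + 1);                                               // cue number
            w.println(srtTime(times.get(i)[0]) + " --> " + srtTime(times.get(i)[1]));
            w.println(texts.get(i));                                        // hypothesized text
            w.println();                                                    // blank line ends the cue
        }
        w.close();
    }
}
```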
- Dependencies:
  - Encoding from MP3 to another format
  - Detecting silence
    - Sphinx considerations: this sort of preprocessing step is probably already implemented in PocketSphinx; it's just a question of finding it.
    - This step is also implemented somewhere in the LIUM tools.
    - The MARF project has some libraries for audio analysis (how complete it is and which goals have been realized is unclear). MARF is an open-source research platform and a collection of voice/sound/speech/text and natural language processing (NLP) algorithms written in Java, arranged into a modular and extensible framework that facilitates the addition of new algorithms.
    - If the silence detection is not easy to find, consider implementing a lightweight solution using the Java Sound API (a naive version is sketched below).
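For illustration, a naive energy-based detector over 16-bit mono PCM frames might look like the following. The frame size and RMS threshold are assumptions that would need tuning, and on Android the samples would come from AudioRecord or a decoded file rather than the desktop Java Sound API:

```java
class SilenceDetector {
    static final int FRAME_SAMPLES = 1600;   // 100 ms of 16 kHz mono audio
    static final double SILENCE_RMS = 500.0; // assumed threshold; tune empirically

    /** Marks each frame of 16-bit PCM as silent when its RMS energy is below the threshold. */
    static boolean[] detect(short[] pcm) {
        int frames = pcm.length / FRAME_SAMPLES;
        boolean[] silent = new boolean[frames];
        for (int f = 0; f < frames; f++) {
            long sumSquares = 0;
            for (int i = 0; i < FRAME_SAMPLES; i++) {
                long s = pcm[f * FRAME_SAMPLES + i];
                sumSquares += s * s;
            }
            silent[f] = Math.sqrt((double) sumSquares / FRAME_SAMPLES) < SILENCE_RMS;
        }
        return silent;
    }
}
```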
- The developer can check if WiFi is active, and only transfer data when on WiFi (a sketch follows this list)
- The developer can choose to schedule the service overnight, for example
- The developer can provide the logic to record via the handset microphone, a paired headset microphone, or Bluetooth
- Audio is saved to the device so that the user doesn't "lose" their thoughts and can re-listen to their audio to correct the transcription
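A minimal sketch of the WiFi check mentioned above, using the stock ConnectivityManager API (the app needs the ACCESS_NETWORK_STATE permission; the NetworkCheck class name is hypothetical):

```java
import android.content.Context;
import android.net.ConnectivityManager;
import android.net.NetworkInfo;

class NetworkCheck {
    /** Returns true only when WiFi is connected, so uploads can be deferred otherwise. */
    static boolean onWifi(Context context) {
        ConnectivityManager cm = (ConnectivityManager)
                context.getSystemService(Context.CONNECTIVITY_SERVICE);
        NetworkInfo wifi = cm.getNetworkInfo(ConnectivityManager.TYPE_WIFI);
        return wifi != null && wifi.isConnected();
    }
}
```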
- http://stackoverflow.com/questions/5690850/extract-and-analyse-sound-from-mp3-files
- Requests for eyes-free ASR on Google Code
- Another feature request on Android's issue tracker
- http://stackoverflow.com/questions/5613167/source-code-for-the-googles-voice-search-activity
- http://stackoverflow.com/questions/2319735/voice-recognition-on-android-with-recorded-sound-clip