Recently, I had a project where I needed to convert some audio to text. Finding working code took more googling than I was used to, so I went ahead and whipped up a project that demonstrates its usage, so other people can find it more easily.
This C# code uses the .NET System.Speech namespace and demonstrates how to transcribe audio from either a microphone or a previously recorded .wav file.
The code can be divided into two main parts:
- configuring the SpeechRecognitionEngine object (and its required elements)
- handling the SpeechRecognized and SpeechHypothesized events.
Step 1: Configuring the SpeechRecognitionEngine
_speechRecognitionEngine = new SpeechRecognitionEngine();
_speechRecognitionEngine.SetInputToDefaultAudioDevice();
_dictationGrammar = new DictationGrammar();
_speechRecognitionEngine.LoadGrammar(_dictationGrammar);
_speechRecognitionEngine.RecognizeAsync(RecognizeMode.Multiple);
At this point, your object is ready to start transcribing audio from the microphone. You still need to handle a couple of events, though, in order to actually get access to the results.
Step 2: Handling the SpeechRecognitionEngine Events
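// Unsubscribe first so the same handler is never attached twice.
_speechRecognitionEngine.SpeechRecognized -= new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized -= new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);
_speechRecognitionEngine.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(SpeechRecognized);
_speechRecognitionEngine.SpeechHypothesized += new EventHandler<SpeechHypothesizedEventArgs>(SpeechHypothesizing);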
private void SpeechHypothesizing(object sender, SpeechHypothesizedEventArgs e)
{
    // real-time (interim) results from the engine
    string realTimeResults = e.Result.Text;
}
private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
    // final answer from the engine
    string finalAnswer = e.Result.Text;
}
That’s it. If you want to use a pre-recorded .wav file instead of a microphone, you would use _speechRecognitionEngine.SetInputToWaveFile(pathToTargetWavFile); instead of _speechRecognitionEngine.SetInputToDefaultAudioDevice();.
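For the .wav case, the two steps can be put together into one small console program. This is only a minimal sketch, not the code from the downloadable project: it assumes a Windows machine with a reference to the System.Speech assembly, the class name WavTranscriber, the helper ResolveWavPath, and the fallback file name sample.wav are made up for illustration, and it relies on the RecognizeCompleted event to tell when the end of the file has been reached.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

public class WavTranscriber
{
    // Hypothetical helper: use the first command-line argument as the path,
    // or fall back to an assumed file name.
    public static string ResolveWavPath(string[] args)
    {
        return args.Length > 0 ? args[0] : "sample.wav";
    }

    public static void Main(string[] args)
    {
        string pathToTargetWavFile = ResolveWavPath(args);

        using (var finished = new ManualResetEvent(false))
        using (var engine = new SpeechRecognitionEngine())
        {
            // Step 1: configure the engine, reading from a file instead of the microphone.
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile(pathToTargetWavFile);

            // Step 2: handle the events that carry the results.
            engine.SpeechHypothesized += (sender, e) =>
                Console.WriteLine("partial: " + e.Result.Text);
            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("final:   " + e.Result.Text);

            // Raised once the asynchronous recognition operation ends,
            // which for a file happens when the input runs out.
            engine.RecognizeCompleted += (sender, e) =>
            {
                if (e.Error != null)
                {
                    Console.WriteLine("error: " + e.Error.Message);
                }
                finished.Set();
            };

            engine.RecognizeAsync(RecognizeMode.Multiple);

            // Keep the process alive until the whole file has been processed.
            finished.WaitOne();
        }
    }
}
```

Attaching the handlers before calling RecognizeAsync means no early result can be missed, and waiting on the event keeps the console application from exiting before the engine has finished.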
There are a bunch of different options in these classes and they are worth exploring in more detail. This covers the bare essentials for a prototype.
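As one example of those options, here is a rough sketch that sets the engine's silence and babble timeouts, filters results by the Confidence value on each result, and listens for the SpeechRecognitionRejected event. The timeout values, the 0.5 threshold, and the names RecognizerOptionsSketch and IsConfident are arbitrary assumptions for illustration, not recommended settings.

```csharp
using System;
using System.Speech.Recognition;

public class RecognizerOptionsSketch
{
    // Arbitrary cut-off chosen for this sketch; tune it for your own audio.
    public const float MinConfidence = 0.5f;

    // Hypothetical helper: decide whether a result is worth keeping.
    public static bool IsConfident(float confidence)
    {
        return confidence >= MinConfidence;
    }

    public static void Main()
    {
        using (var engine = new SpeechRecognitionEngine())
        {
            // How long the engine tolerates silence or background noise
            // before it finalizes a recognition operation.
            engine.InitialSilenceTimeout = TimeSpan.FromSeconds(5);
            engine.BabbleTimeout = TimeSpan.FromSeconds(3);
            engine.EndSilenceTimeout = TimeSpan.FromSeconds(1);

            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToDefaultAudioDevice();

            engine.SpeechRecognized += (sender, e) =>
            {
                // Confidence is a value between 0 and 1 assigned by the engine.
                if (IsConfident(e.Result.Confidence))
                {
                    Console.WriteLine(e.Result.Text + " (" + e.Result.Confidence + ")");
                }
            };

            // Raised when the engine heard speech but could not match it well enough.
            engine.SpeechRecognitionRejected += (sender, e) =>
                Console.WriteLine("rejected");

            engine.RecognizeAsync(RecognizeMode.Multiple);
            Console.WriteLine("Listening; press Enter to stop.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }
}
```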
Awesome! I was looking for exactly this and having the code project with it made it even sweeter! Have you planned on doing anything more with the speech recognition? I’m just starting out using it and C# for that matter (coming from Java) for home automation. Would be interested in any other projects related to speech recognition. Thanks again!
Hey Nicholas,
I really do not have any more plans for working with speech recognition. I found the Microsoft solution and a Google solution. I would like to get my hands on the Dragon NaturallySpeaking SDK and see how good that is, but I have no plans in the immediate future for working on it.
Good luck with your home automation and thanks for responding to my blog!
Hi Micheal, thanks for the help. I am currently working on a speech recognition project in which I have to convert video files into text. I tried to convert an audio file (.mp3) into text with your code, but it is not working properly, so could you please share the code so I can proceed with my work? Please reply as soon as possible.
I am working on a speech recognition project in which I have to convert video or audio files into text. If you have done this work, please send me the code or project.
Email: kanlalmuhammad@yahoo.com
Hey Maham, can you please do Hindi speech recognition that reads Hindi audio into a Hindi text file? If you can do it, please send me the code at sudarshanjha11781@gmail.com
Hey admin, can you please do speech recognition in Hindi that writes the result as a text file in Hindi? Please do it if you can.
Hey, can you please send me the code? I have also been looking for this audio-to-text conversion in C# in Visual Studio. I tried the code above, but I am getting a lot of errors, so please help. I will be very grateful to you.
Hey, can you guide me through building and debugging this project?
Hi,
Any ideas on how to handle wildcard/garbage in an XML grammar?
http://stackoverflow.com/questions/12101120/matching-wildcard-dictation-in-microsoft-speech-grammar/12535235
What was the Google solution?
I like the article. Do you know if it’s possible to have a voice recognition system that identifies a certain word or words when someone speaks them and then converts them to text, along with the time at which each word was spoken? Can it be incorporated into an iPad or tablet? If I have a salesperson and I want them to emphasise the brand name, for example, it would display as text every time they mentioned the brand name.
Hey, this is awesome, but there are some problems with this code:
1. It does not convert long audio files (more than 1 min) properly.
2. The converted text does not match the audio.
Please give me some solutions as soon as possible.
Thanks
Where could I read about how to actually build a command-line application in Visual Studio 2010 that takes microphone input or a .wav file and gives me a cout of the text?
Thanks
I would like more explanation. Is there a way I could get in touch with you?
I want to get involved in building apps with TTS and STT functionality, because such an app is needed so that deaf people and hearing people can communicate with each other. Sometimes I thought text alone could solve the problem, but the trend is toward apps that understand sound or text and convert it into the counterpart format of the message, i.e. text or sound. If a hearing person wants to speak, the app should adapt accordingly; likewise, if a deaf person wants to send a text message, the app should do so. All of this depends on the trends of this era. Anyway, I would like some help with the compiler and SDK. I need detailed links and a clear explanation, as I am a novice in the STT/TTS area. God bless you all!!! What if I download from this link? visualstudio
hi,
I wish to know about building the grammar. How many words can be added to the grammar so that it still works well? Also, the accuracy of the recognized words is close to 30%! Is there a way to increase this accuracy?
Any help is appreciated.
Thank you
CAN YOU PLEASE TELL ME IF THIS CODE ENABLES THE SPEECH OR AUDIO TO BE CONVERTED INTO TEXT IN ANY TEXT FIELD OR JUST FOR A TEXT BOX IN THE IDE??
CAN YOU PLEASE TELL ME IF THIS CODE IS ABLE TO CONVERT THE VOICE (SPEECH OR SOUND) INTO TEXT IN ANY TEXT FIELD, SUCH AS MS WORD, MS POWERPOINT, NOTEPAD ETC., OR DO I HAVE TO CREATE A TEXT FIELD LIKE A TEXT BOX??
Hi..
Can you give a sample of an existing program for better reference?
I am working on a project in Visual Studio with C# where I need to convert speech into text. I can find code for Windows-based applications, but I need it for a web-based application. Kindly help.
What about http://www.speech.cs.cmu.edu/ ?
Is it better than your solution?
Yeah, you could get more accurate results with that, but it would be a lot more complicated.
Hi, I’m working on a project and need to convert audio to text in real time, i.e. capture the RTP packets, convert them to WAV, and run them through a tool. Thinking at the scale of a sales floor, i.e. 50 calls, can this be done using your example?