PowerShell: speech and encoding

If you have some text in a file and push it through SAPI to get speech out, be aware that the output is sensitive to the encoding of the file. The problem happens when the text is read, not when it is spoken: the speech is just a victim of the text.

As an example, look at this text:

[Image: SpeechEncoding01]

Pass this through Get-Content, and see what has been stored:

[Image: SpeechEncoding02]

If you then pass that through SAPI, then, quite reasonably, what you see there is what you will hear, including “… Euro Symbol, Trade Mark Symbol…”.
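To see where a spoken “Euro Symbol, Trade Mark Symbol” can come from, here is a minimal sketch. It assumes the file was saved as UTF-8 and that the default encoding used to read it was Windows-1252; the post does not state the laptop's code page, so treat that as an assumption. The variable names are made up for the illustration. It was written with Windows PowerShell 5.1 in mind; on newer versions it relies on code page 1252 being available.

```powershell
# A right single quotation mark (U+2019), built from its code point so that
# this snippet does not depend on the encoding of its own script file.
$apostrophe = [string][char]0x2019

# The three bytes a UTF-8 file stores for that one character.
$bytes = [System.Text.Encoding]::UTF8.GetBytes($apostrophe)
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '

# Decode the same bytes as Windows-1252 (ASSUMED to be the system default here).
$misread = [System.Text.Encoding]::GetEncoding(1252).GetString($bytes)

# List the code point of each character the synthesizer would now be handed.
$codePoints = $misread.ToCharArray() | ForEach-Object { 'U+{0:X4}' -f [int]$_ }

"UTF-8 bytes : $hex"
"Misread as  : $($codePoints -join ', ')"
```

Under those assumptions the one apostrophe becomes three characters: U+00E2 (â), U+20AC (the euro sign) and U+2122 (the trade mark sign), which matches what the voice reads out.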

That is because, on my laptop, the default encoding that Get-Content uses is not the same as the encoding of the source file. If I look at the menu in Notepad++, I see this:

[Image: SpeechEncoding03]

So if I now add the correct -Encoding parameter to Get-Content, we get this…

[Image: SpeechEncoding04]
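The difference can be reproduced without the original file. The sketch below writes a throwaway UTF-8 file (the file name and sample string are hypothetical, chosen only for this demonstration) and reads it back with and without -Encoding UTF8. What the first read returns depends on the PowerShell version and the system code page, so no output is claimed for it.

```powershell
# Hypothetical scratch file in the temp folder, used only for this demonstration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-demo.txt'

# "Véronique", with the accented letter built from its code point.
$sample = 'V' + [char]0x00E9 + 'ronique'

# Save the sample as UTF-8 without a byte order mark.
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($path, $sample, $utf8NoBom)

# Read with whatever default this PowerShell version applies.
$asDefault = Get-Content -Path $path

# Read again, stating the encoding explicitly.
$asUtf8 = Get-Content -Path $path -Encoding UTF8

"Default read matches the original : $($asDefault -eq $sample)"
"UTF8 read matches the original    : $($asUtf8 -eq $sample)"

# Tidy up the scratch file.
Remove-Item -Path $path
```

Stating the encoding explicitly makes the script behave the same way regardless of the machine's default.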

Doing an end-to-end run now gives no stray spoken symbols, although if you listen to the SoundCloud clip below, you will hear that it is still not ideal. For that we would probably need the Speech Synthesis Markup Language (SSML).

[Image: SpeechEncoding05]


PS>Add-Type -AssemblyName system.speech
PS>$speech = New-Object -TypeName system.speech.synthesis.speechsynthesizer
PS>$speech.SelectVoice("Microsoft Hazel Desktop")
PS>$text = Get-Content -Path C:\temp\voicesnip01.txt -Encoding UTF8
PS>$text
‘It’s Jeremy’s sledge.’.
Véronique was extremely angry.
The Place Vendôme was unusually quiet.
PS>$speech.Speak($text)

[Image: SpeechEncoding06]

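# Send the synthesizer's output to a WAV file instead of the speakers
$speech.SetOutputToWaveFile("c:\temp\snippy.wav")
$speech.Speak($text)
# Release the synthesizer and its hold on the WAV file
$speech.Dispose()
$speech = $null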