PowerShell: speech and encoding

If you have text in a file and push it through SAPI (the Microsoft Speech API) to get speech out, be aware that the output is sensitive to the file's encoding. The problem arises when the text is read, not when it is spoken: the speech is just a victim of the text.

As an example, look at this text:


Pass this through Get-Content, and see what has been stored:


If you then pass that through SAPI, you quite reasonably get what you see: the synthesizer reads the stray characters aloud, including “… Euro Symbol, Trade Mark Symbol…”.
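
The “Euro Symbol, Trade Mark Symbol” wording is what you would expect if the UTF-8 bytes of a curly apostrophe were decoded with a single-byte ANSI code page. Here is a minimal sketch of that effect, assuming the file is UTF-8 and the mismatched default is Windows-1252 (the code page on any given machine may differ). The variable names are illustrative only:

```powershell
# Assumption: the source file is UTF-8 and the reader falls back to Windows-1252.
# U+2019 is the right single quotation mark used as a curly apostrophe.
$apostrophe = [string][char]0x2019
$original   = "It${apostrophe}s Jeremy${apostrophe}s sledge."

# The bytes as they would sit on disk in a UTF-8 file (U+2019 becomes E2 80 99).
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($original)

# Decode the same bytes with the wrong encoding.
$cp1252  = [System.Text.Encoding]::GetEncoding(1252)
$misread = $cp1252.GetString($utf8Bytes)

# Each apostrophe is now three characters: a-circumflex, euro sign, trade mark sign.
$misread
```

Read aloud, the euro and trade mark signs account for the stray symbol names. The accented é and ô are affected in the same way, each turning into two characters.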

That happens because, on my laptop, the default encoding is not the same as the encoding of the source file. If I look at the menu in Notepad++, I see this:


So if I now add the correct encoding switch to Get-Content, I get this:
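
To reproduce the difference without the original file, here is a minimal sketch that writes one accented line to a scratch file as UTF-8 without a byte order mark, then reads it back both ways. The file name is hypothetical, and Windows-1252 stands in for whatever the local default happens to be:

```powershell
# Hypothetical scratch file in the temp folder, not the file from this post.
$path   = Join-Path ([System.IO.Path]::GetTempPath()) 'voicesnip-demo.txt'
$sample = "V" + [char]0xE9 + "ronique was extremely angry."

# Save as UTF-8 without a byte order mark, so nothing in the file announces its encoding.
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($path, $sample, $utf8NoBom)

# Reading with the explicit switch recovers the text intact.
$good = Get-Content -Path $path -Encoding UTF8

# Reading the same bytes as Windows-1252 (an assumed default) mangles the accent.
$bad = [System.IO.File]::ReadAllText($path, [System.Text.Encoding]::GetEncoding(1252))

$good -eq $sample   # True when the round trip is clean
$bad                # the accented letter shows up as two characters

Remove-Item -Path $path
```

Windows PowerShell 5.1 reads files without a byte order mark using the system's ANSI code page, whereas PowerShell 6 and later default to UTF-8, so the mismatch may not show up on newer versions.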


An end-to-end run now produces no odd speech artefacts, although if you listen to the SoundCloud recording below, you will hear that the delivery is still not ideal. Fixing that would probably need Speech Synthesis Markup Language (SSML).



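PS># Load the .NET speech assembly and create a synthesizer
PS>Add-Type -AssemblyName System.Speech
PS>$speech = New-Object -TypeName System.Speech.Synthesis.SpeechSynthesizer
PS>$speech.SelectVoice("Microsoft Hazel Desktop")
PS># Read the file as UTF-8 so the quotes and accents survive
PS>$text = Get-Content -Path C:\temp\voicesnip01.txt -Encoding UTF8
PS>$text
‘It’s Jeremy’s sledge.’.
Véronique was extremely angry.
The Place Vendôme was unusually quiet.
PS>$text | ForEach-Object { $speech.Speak($_) }
PS>$speech.Dispose()
PS>$speech = $null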
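
As for the markup language, here is a minimal sketch of how the same lines might be wrapped in SSML, with a pause after each sentence. New-SsmlDocument is a hypothetical helper, and the en-GB language tag and 400ms pause are assumed values rather than anything tested against the recording:

```powershell
# Hypothetical helper: wraps each sentence in <s> and adds a pause after it.
function New-SsmlDocument {
    param(
        [string[]]$Sentence,
        [string]$Language = 'en-GB',   # assumed language tag
        [string]$Pause    = '400ms'    # assumed pause length
    )
    $body = foreach ($s in $Sentence) {
        # Escape &, <, > and quotes so the text cannot break the XML.
        $escaped = [System.Security.SecurityElement]::Escape($s)
        "<s>$escaped</s><break time=`"$Pause`"/>"
    }
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="' +
        $Language + '">' + ($body -join '') + '</speak>'
}

$lines = @(
    "V" + [char]0xE9 + "ronique was extremely angry.",
    "The Place Vend" + [char]0xF4 + "me was unusually quiet."
)
$ssml = New-SsmlDocument -Sentence $lines

# Casting to [xml] throws if the markup is not well-formed.
[xml]$check = $ssml

# On Windows, with System.Speech loaded and a synthesizer in $speech:
# $speech.SpeakSsml($ssml)
```

The SpeakSsml call is left commented out so the sketch can be checked on a machine without the speech assembly. Whether it improves the delivery would need listening tests.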