Windows 10… N – don’t use it for app development.

At least, not if the app in question uses speech. The background is that I routinely use Azure VMs for development… so let’s extend that to writing apps. The ready-made VMs that also include Visual Studio are the “N” variants, so I used one, as usual. I then wasted a number of hours failing to get to the bottom of this error, thrown when executing a pretty standard block of text-to-speech code (at the time I was not questioning whether “N” was OK to use):

MigratingWinPhoneApp15

Class not registered

at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
 at HW.MainPage.d__1.MoveNext()} System.Runtime.InteropServices.COMException

I spent a lot of time after that trying to debug, googling up the wrong tree, reinstalling various releases of Visual Studio, all with the same, bad, result. In fact I should have just gone to bed, because as ever a tiny light bulb came on when I thought about… N.

So this morning I googled issues around Visual Studio and Windows 10 N, and almost immediately found this:

NotWin10N_05

So I tried installing the media feature pack, but got a message along the lines of “does not apply to this installation”. Yes, I could have persevered, but I decided to trash the N instance, create a non-N instance, and manually install Visual Studio and this sample. And then it was all fine.

NotWin10N01

I had also thought there might be some issue with doing speech on a VM, but that was all fine: it came through loud and clear on the speakers of my host PC. QED.
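The “Class not registered” message in the trace above corresponds to the COM error REGDB_E_CLASSNOTREG (HRESULT 0x80040154). Here is a minimal sketch, assuming you want to turn that exception into a more useful hint. The class and method names are hypothetical and not part of the original app:

```csharp
using System;
using System.Runtime.InteropServices;

public static class SpeechErrorHints
{
    // HRESULT for REGDB_E_CLASSNOTREG ("Class not registered").
    private const int ClassNotRegistered = unchecked((int)0x80040154);

    // Returns true when the exception is the COM "Class not registered" error.
    public static bool IsClassNotRegistered(Exception ex)
    {
        return ex is COMException && ex.HResult == ClassNotRegistered;
    }

    // Hypothetical helper: maps the error to a hint about the Windows edition,
    // and otherwise passes the original message through unchanged.
    public static string Describe(Exception ex)
    {
        return IsClassNotRegistered(ex)
            ? "A media component is not registered. Is this an N edition of Windows without the media feature pack?"
            : ex.Message;
    }
}
```

In the app, a `catch (Exception ex)` around the awaited synthesis call could pass `ex` to `SpeechErrorHints.Describe` and show the result in a dialog.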

 


Windows 10 Speech: speaking and storing as audio

This both speaks the passed text and stores it as audio (wav):


 

The code underpinning that:


 

using Windows.UI.Xaml.Controls;
using TextToSpeech;

namespace App1
{
    public sealed partial class MainPage : Page
    {
        public MainPage()
        {
            InitializeComponent();
            var si = new SpeakIt();
            var textToSpeak = " I have a high respect for your nerves";
            // Both calls are fire-and-forget: one speaks the text, the other stores it as a .wav file.
            SpeakIt.ReadText(textToSpeak);
            si.StoreText(textToSpeak);
        }
    }
}

 

 

using System;
using System.Linq;
using System.Threading.Tasks;
using Windows.Media.SpeechSynthesis;
using Windows.Storage;
using Windows.Storage.Streams;
using Windows.UI.Xaml.Controls;

namespace TextToSpeech
{
    public class SpeakIt
    {
        private const string PreferredVoice = "Susan";
        private const int BufferSize = 4096;
        private readonly SpeechSynthesizer _synthesizer = new SpeechSynthesizer();

        public SpeakIt()
        {
            _synthesizer.Voice = FindPreferredVoice();
        }

        // Speaks the text through a MediaElement (requires the Windows.UI.Xaml.Controls namespace).
        public static async void ReadText(string mytext)
        {
            var mediaPlayer = new MediaElement();
            using (var speech = new SpeechSynthesizer())
            {
                speech.Voice = FindPreferredVoice();
                var stream = await speech.SynthesizeTextToStreamAsync(mytext);
                mediaPlayer.SetSource(stream, stream.ContentType);
                mediaPlayer.Play();
            }
        }

        // Synthesizes the text and saves it as <guid>.wav in the app's local folder.
        public async void StoreText(string myText)
        {
            var synthesisStream = await _synthesizer.SynthesizeTextToStreamAsync(myText);
            var sf = await CreateLocalFile($"{Guid.NewGuid()}.wav");
            await SaveSpeechStreamToStorageFile(synthesisStream, sf);
        }

        private static async Task<StorageFile> CreateLocalFile(string fileName)
        {
            // https://msdn.microsoft.com/en-gb/library/windows/apps/br227251
            var sfo = ApplicationData.Current.LocalFolder;
            return await sfo.CreateFileAsync(fileName);
        }

        private static async Task SaveSpeechStreamToStorageFile(SpeechSynthesisStream synthesisStream, StorageFile sf)
        {
            using (var writeStream = await sf.OpenAsync(FileAccessMode.ReadWrite))
            using (var outputStream = writeStream.GetOutputStreamAt(0))
            {
                var dataWriter = new DataWriter(outputStream);
                var buffer = new Windows.Storage.Streams.Buffer(BufferSize);
                while (synthesisStream.Position < synthesisStream.Size)
                {
                    // Write the buffer returned by ReadAsync: it holds only the bytes actually read.
                    var read = await synthesisStream.ReadAsync(buffer, BufferSize, InputStreamOptions.None);
                    dataWriter.WriteBuffer(read);
                }
                // Await rather than block with Wait(), which risks hanging the UI thread.
                await dataWriter.StoreAsync();
                await outputStream.FlushAsync();
                dataWriter.DetachStream();
            }
        }

        // Falls back to the default voice if the preferred one is not installed.
        private static VoiceInformation FindPreferredVoice()
        {
            return SpeechSynthesizer.AllVoices.FirstOrDefault(voice => voice.Id.Contains(PreferredVoice))
                   ?? SpeechSynthesizer.DefaultVoice;
        }
    }
}

					

PowerShell: splitting an input file and saving to wav format in chunks

On 4 out of 5 days, I have a car journey of between 0.75 and 1.25 hours. I want to be able to take a free (e.g. Project Gutenberg) book, or at least a DRM-free book, split it into sections, and create an audio file from each section.
Let’s say that the following is the entirety of my book:

Guten01

I want to read/hear in sections: lines 1 and 2 (section 1), lines 3 and 4 (section 2), lines 5 and 6 (section 3), line 7 (section 4), giving this:

Guten02

This PowerShell is one way to do that (although I write the split text back out to disk and then read it back in, that step could be removed).

Guten03

function Get-FileName($extension = "txt") {
    # Relies on $outputRootDir, $outputFileNamePrefix and $chunk from the calling Split-File scope
    "{0}{1}_{2}.{3}" -f ($outputRootDir, $outputFileNamePrefix, $chunk, $extension)
}

function Write-WavFile() {
    $speech = New-Object -TypeName System.Speech.Synthesis.SpeechSynthesizer
    try {
        $speech.SelectVoice("Microsoft Hazel Desktop")
        # -Raw reads the chunk as a single string rather than an array of lines
        $textToSpeak = Get-Content -Path $(Get-FileName) -Encoding UTF8 -Raw
        $speech.SetOutputToWaveFile($(Get-FileName "wav"))
        $speech.Speak($textToSpeak)
    } finally {
        $speech.Dispose()
    }
}

function Split-File (
    $fileToSplit = 'C:\Temp\pandp.txt',
    $splitMarker = "SPLITHERE",
    $outputFileNamePrefix = "TheseLinesAudio",
    $outputRootDir = "c:\temp\"
) {
    Add-Type -AssemblyName System.Speech
    $reader = New-Object -TypeName System.IO.StreamReader($fileToSplit)
    $chunk = 1
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line -match $splitMarker) {
                # Marker reached: convert the chunk written so far, then start the next one
                Write-WavFile
                $chunk++
            } else {
                Add-Content -Path $(Get-FileName "txt") -Value $line -Encoding UTF8
            }
        }
        # Convert the final chunk, which has no trailing marker
        Write-WavFile
    } finally {
        $reader.Dispose()
    }
}

# entry point...
Split-File

Microsoft Speech: Hazel and Susan

These are both GB voices. To my ears, the more recent Susan voice is of higher quality than the Hazel voice.

Programmatically, the Hazel voice can be got at easily:

SpeechPs01

Add-Type -AssemblyName system.speech
$x = New-Object -TypeName System.Speech.Synthesis.SpeechSynthesizer
$x.GetInstalledVoices() | % { $_.voiceinfo}

SpeechPs02

$x.SelectVoice("Microsoft Hazel Desktop")

And from there, we can get some speech out:

SpeechPs03

$x.Speak("East Fife, 4. Forfar, 5")

If you then run a Get-Member over the object, you see its methods:

$x | gm -MemberType Method
WavFiless01

So we can do this:

$x.SetOutputToWaveFile("c:\temp\test.wav")
$x.Speak("East Fife, 4. Forfar, 5")
$x.Dispose()
$x = $null

drm02

Obviously you’ll have to run that yourself to hear the evidence, but you now have a valid wav file speaking in the Hazel voice.

But going back to the list of installed voices, even though I am on Windows 10, and the Susan voice appears in Time and Language/Speech, I cannot get it to surface easily. Well, at all, right now.

SpeechPs04
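For comparison, the Universal app code elsewhere in these posts finds Susan through `Windows.Media.SpeechSynthesis` rather than `System.Speech`. Here is a minimal sketch of listing what that API exposes, assuming it runs inside a Universal (UWP) app, since it will not compile in a plain desktop project. `VoiceLister` and `DescribeVoices` are hypothetical names:

```csharp
using System.Collections.Generic;
using System.Linq;
using Windows.Media.SpeechSynthesis;

public static class VoiceLister
{
    // Returns one line per voice visible to the UWP speech API:
    // display name, language tag, gender and the voice Id.
    public static IEnumerable<string> DescribeVoices()
    {
        return SpeechSynthesizer.AllVoices
            .Select(v => $"{v.DisplayName} | {v.Language} | {v.Gender} | {v.Id}");
    }
}
```

Writing each returned line to the debug output gives something to set against the `GetInstalledVoices()` list above.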

I’ve been ploughing through the Registry, and from there I find where the artefacts are held both for the Hazel and the Susan voices (in fact I used George in the end as the Susan equivalent, because it does not occur so often as Susan in the Registry). For example:

George03

My hope is that the only difference between the two types is location, and that once I can coerce the new voices into the same place as the old voices, SAPI will just discover them. That may well be naive. We shall see. Finally for tonight, having taken shedloads of Registry screenshots in the hope that some of them will give me strong clues in the next pass, I’m dumping them here:

 

 

Windows 10 Speech: very basic code

No error handling, very dirty, just wanted to get something that produces sound.

MainPage.xaml

<Page
    x:Class="App1.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:App1"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d">
 
 
 
    <Grid Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
        <RelativePanel>
            <MediaElement x:Name="media" AutoPlay="False"/>
            <TextBox x:Name="textBox1" Text="My Dear Text" Margin="5"/>
            <Button x:Name="blueButton" Margin="5" Background="LightBlue" Content="ButtonRight" RelativePanel.RightOf="textBox1"/>
            <Button x:Name="orangeButton" Click="orangeButton_Click" Margin="5" Background="Orange" Content="ButtonBelow"
                    RelativePanel.RightOf="textBox1" RelativePanel.Below="blueButton"/>
            
        </RelativePanel>
    </Grid>
</Page>


		
		



using System;
using Windows.Media.SpeechSynthesis;
using Windows.UI.Xaml.Controls;
using Windows.UI.Xaml.Media;

// The Blank Page item template is documented at http://go.microsoft.com/fwlink/?LinkId=402352&clcid=0x409

namespace App1 {
    /// <summary>
    /// An empty page that can be used on its own or navigated to within a Frame.
    /// </summary>
    public sealed partial class MainPage : Page {

        private SpeechSynthesizer synthesizer;

        public MainPage() {
            this.InitializeComponent();
            synthesizer = new SpeechSynthesizer();
        }

        private async void orangeButton_Click(object sender, Windows.UI.Xaml.RoutedEventArgs e) {

            // A click while audio is playing stops it; otherwise speak the text box contents.
            if (media.CurrentState == MediaElementState.Playing) {
                media.Stop();
            }
            else {
                string text = textBox1.Text;
                if (!String.IsNullOrEmpty(text)) {
                    try {
                        // Create a stream from the text. This will be played using a media element.
                        SpeechSynthesisStream synthesisStream = await synthesizer.SynthesizeTextToStreamAsync(text);

                        // Set the source and start playing the synthesized audio stream.
                        media.AutoPlay = true;
                        media.SetSource(synthesisStream, synthesisStream.ContentType);
                        media.Play();
                    }
                    catch (System.IO.FileNotFoundException) {
                        // If media player components are unavailable (e.g. using an N SKU of Windows), we won't
                        // be able to start media playback. Handle this gracefully.
                        var messageDialog = new Windows.UI.Popups.MessageDialog("Media player components unavailable");
                        await messageDialog.ShowAsync();
                    }
                    catch (Exception) {
                        // If the text cannot be synthesized, show an error message to the user.
                        media.AutoPlay = false;
                        var messageDialog = new Windows.UI.Popups.MessageDialog("Unable to synthesize text");
                        await messageDialog.ShowAsync();
                    }
                }
            }
        }
    }
}


Ref SSML... this worked... and the difference between loud and soft is perceptible:
string Ssml =
     @"<speak version='1.0' " +
     "xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-GB'>" +
     "<prosody volume='x-loud'> This is extra loud volume. </prosody>" +
     "</speak>";


This worked:
string Ssml =
               @"<speak version='1.0' " +
               "xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-GB'>" +
               "Hello <prosody contour='(0%,+80Hz) (10%,+80%) (40%,+80Hz)'>World</prosody> " +
               "<break time='500ms' />" +
               "Goodbye <prosody rate='slow' contour='(0%,+20Hz) (10%,+30%) (40%,+10Hz)'>World</prosody>" +
               "</speak>";
https://msdn.microsoft.com/en-us/library/windows.media.speechsynthesis.speechsynthesizer.aspx

ref ssml:
https://msdn.microsoft.com/en-us/library/jj127898.aspx
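
Hand-concatenated SSML strings like the ones above are easy to leave unbalanced. Here is a minimal sketch, assuming `System.Xml.Linq` is available to the app, that builds the same kind of `<speak>`/`<prosody>` document from elements so the result is always well-formed and the text is escaped. `SsmlSketch` and `BuildProsodySsml` are hypothetical names:

```csharp
using System;
using System.Xml.Linq;

public static class SsmlSketch
{
    // Hypothetical helper: wraps the text in a <speak> root containing a single
    // <prosody volume='...'> element, using the SSML 1.0 namespace.
    public static string BuildProsodySsml(string text, string volume, string lang = "en-GB")
    {
        XNamespace ns = "http://www.w3.org/2001/10/synthesis";
        var speak = new XElement(ns + "speak",
            new XAttribute("version", "1.0"),
            new XAttribute(XNamespace.Xml + "lang", lang),
            new XElement(ns + "prosody",
                new XAttribute("volume", volume),
                text));
        // Serialize on one line; special characters in the text are escaped for us.
        return speak.ToString(SaveOptions.DisableFormatting);
    }
}
```

The resulting string can be passed to `SpeechSynthesizer.SynthesizeSsmlToStreamAsync` in place of the `SynthesizeTextToStreamAsync` call used in the click handler.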

speechy10
Googling the above, I see a lot of complaints about this. When I have time I will try this out:

I installed fresh Windows 10 and Visual Studio Community 2015, and the designer failed to load (for MainPage.xaml etc). I had to:

  1. enable developer mode in system settings (update section) as suggested in info dialog
  2. (re)install Visual C++ redistributable for VS 2015

But I don’t know which one exactly resolved the problem… Now the designer loads as expected. (I tried only C# universal app yet)

 

… and generally tidy up the post.

 


					

Windows 10 on Windows Phone, with Speech Apps

WP_20160207_21_07_49_ProWP_20160207_21_06_22_Pro

And in the end it wasn’t SO hard. By the end of the weekend, I have this:

  • Lumia 635 with no micro-SD upgraded to Windows 10
  • Windows 10 speech app (Universal) building and running ok in Visual Studio 2015. It has both synthesis and recognition, so using that as a template, I should now be able to build whatever I need.
  • The same app running on the above Lumia 635 (this is from Microsoft and GitHub, to be clear)

Don’t know why, but right now I cannot take screenshots on this phone, so these are photos of the Lumia 635 taken from my Lumia 735: