PowerShell: fixed width data to CSV

This of course needs a spec to say how to map the fixed width data.
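The original code for this post is not reproduced here, so the following is only a minimal sketch of what such a spec and conversion could look like. The column names, the offsets, the sample lines and the function name Convert-FixedWidthLine are all hypothetical, chosen for illustration.

```powershell
# Hypothetical spec: one entry per column, with a zero-based start and a length.
$spec = @(
    @{ Name = 'Name';   Start = 0;  Length = 19 },
    @{ Name = 'Gender'; Start = 19; Length = 10 },
    @{ Name = 'Human';  Start = 29; Length = 1 }
)

# Hypothetical helper: cuts one fixed width line into an object with one property per column.
function Convert-FixedWidthLine {
    param([string] $Line, [object[]] $Spec)
    $record = [ordered]@{}
    foreach ($column in $Spec) {
        # Pad short lines so Substring never runs past the end of the string.
        $padded = $Line.PadRight($column.Start + $column.Length)
        $record[$column.Name] = $padded.Substring($column.Start, $column.Length).Trim()
    }
    [pscustomobject] $record
}

# Sample fixed width lines, built with PadRight so the offsets are exact.
$lines = @(
    ('Jan'.PadRight(19) + 'F'.PadRight(10) + 'Y'),
    ('Bonzo'.PadRight(19) + 'M'.PadRight(10) + 'N')
)

$csv = $lines |
    ForEach-Object { Convert-FixedWidthLine -Line $_ -Spec $spec } |
    ConvertTo-Csv -NoTypeInformation
$csv
```

Keeping the spec as data rather than hard-coding the offsets means a different file layout only needs a different $spec.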

A basic test in Pester at the end of the script file, broken then fixed:

 


PowerShell: pattern matching

Here, I am defining a hex character as one of A-F (in either case) or one of 0-9. The syntax is no different from any other common regex/pattern matching implementation:

Then, we want to look for the pattern anywhere within the passed line (\b..\b marks a word boundary at each end). Still in that $guidPattern string, we are saying that the pattern starts with exactly 8 of the hex characters we defined earlier, followed by a literal '-', then three groups of exactly 4 hex characters, each followed by a literal '-', ending with exactly 12 hex characters and a word boundary.
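As a quick check, the two definitions can be tried against a couple of lines before they go into a script. The sample lines below are hypothetical; the two pattern variables are the ones used in the full script.

```powershell
$hexChar = "[A-Fa-f0-9]"
$guidPattern = "\b$hexChar{8}-$hexChar{4}-$hexChar{4}-$hexChar{4}-$hexChar{12}\b"

# Hypothetical sample lines, for illustration only.
$withGuid    = '<item id="3F2504E0-4F89-11D3-9A0C-0305E82C3301" />'
$withoutGuid = '<item id="not-a-guid" />'

$hasGuid   = $withGuid -match $guidPattern        # True: the line contains a GUID
$lacksGuid = $withoutGuid -notmatch $guidPattern  # True: no GUID in this line
$hasGuid
$lacksGuid
```

Inside the double-quoted string, $hexChar is expanded and the {8}, {4} and {12} quantifiers are left as literal text, so the final pattern is an ordinary regex.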

And finally the whole thing in a PowerShell script:


<#
Given a set of files, for each file, if a line does not contain the pattern, then
write the line to name-copy-plus-suffix of the file, else do not write the line.
#>
$root = "C:\Sandbox"
Set-Location -Path $root
$fileSet = Get-ChildItem -Path .\*.xml -Name
$fileSet    # list the files that will be processed

$hexChar = "[A-Fa-f0-9]"

$guidPattern = "\b$hexChar{8}-$hexChar{4}-$hexChar{4}-$hexChar{4}-$hexChar{12}\b"

$fileSet | ForEach-Object {
    $fileName = Join-Path -Path $root -ChildPath $_
    $fileToSave = "$fileName.sansGuid"
    Write-Host "Processing $fileName"
    $reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $fileName
    $writer = New-Object -TypeName System.IO.StreamWriter -ArgumentList $fileToSave
    try {
        # Copy every line that does not contain a GUID; skip the lines that do.
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line -notmatch $guidPattern) {
                $writer.WriteLine($line)
            }
        }
    } finally {
        # Release both file handles even if a read or write fails.
        $writer.Close()
        $reader.Close()
    }
}

PowerShell: splitting an input file and saving to wav format in chunks

On 4 out of 5 days, I have a car journey that is between 0.75 and 1.25 hours. I want to be able to take a free (e.g. Project Gutenberg) book, or at least a DRM free book, split it into sections, and create an audio file from each section.
Let’s say that the following is the entirety of my book:

[Image: Guten01]

I want to read/hear in sections: lines 1 and 2 (section 1), lines 3 and 4 (section 2), lines 5 and 6 (section 3), line 7 (section 4), giving this:

[Image: Guten02]

This PowerShell is one way to do that (although I write the split text back out to disk and then read it back in, that step could be removed).

[Image: Guten03]

# Builds the path of the current chunk, e.g. c:\temp\TheseLinesAudio_1.txt.
# $outputRootDir, $outputFileNamePrefix and $chunk come from the caller (Split-File) via dynamic scoping.
function Get-FileName($extension = "txt") {
    "{0}{1}_{2}.{3}" -f ($outputRootDir, $outputFileNamePrefix, $chunk, $extension)
}
# Reads the text of the current chunk and speaks it into a .wav file of the same name.
function Write-WavFile() {
    $speech = New-Object -TypeName System.Speech.Synthesis.SpeechSynthesizer
    try {
        $speech.SelectVoice("Microsoft Hazel Desktop")
        # -Raw returns the chunk as a single string rather than an array of lines.
        $textToSpeak = Get-Content -Path $(Get-FileName) -Encoding UTF8 -Raw
        $speech.SetOutputToWaveFile($(Get-FileName "wav"))
        $speech.Speak($textToSpeak)
    } finally {
        $speech.Dispose()
    }
}
function Split-File (
    $fileToSplit = 'C:\Temp\pandp.txt',
    $splitMarker = "SPLITHERE",
    $outputFileNamePrefix = "TheseLinesAudio",
    $outputRootDir = "c:\temp\"
) {
    Add-Type -AssemblyName System.Speech
    $reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $fileToSplit
    $chunk = 1
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line -match $splitMarker) {
                # Marker found: turn the chunk written so far into audio, then start the next chunk.
                Write-WavFile
                $chunk++
            } else {
                Add-Content -Path $(Get-FileName "txt") -Value $line -Encoding utf8
            }
        }
        # The text after the last marker forms the final chunk.
        Write-WavFile
    } finally {
        $reader.Dispose()
    }
}
#entry point...
Split-File
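The round trip through .txt files could be avoided by collecting each section in memory. What follows is only a sketch of that idea, not the script above: the function name Split-TextIntoChunks and the sample lines are hypothetical, and the speech calls are left out so that it runs without System.Speech.

```powershell
# Hypothetical helper: returns one string per section instead of writing .txt files.
function Split-TextIntoChunks {
    param(
        [string[]] $Lines,
        [string] $SplitMarker = "SPLITHERE"
    )
    $buffer = New-Object -TypeName System.Text.StringBuilder
    foreach ($line in $Lines) {
        if ($line -match $SplitMarker) {
            # Marker reached: emit the section collected so far and start a new one.
            $buffer.ToString()
            [void] $buffer.Clear()
        } else {
            [void] $buffer.AppendLine($line)
        }
    }
    # Emit whatever follows the last marker.
    if ($buffer.Length -gt 0) { $buffer.ToString() }
}

# Hypothetical sample text standing in for the book.
$book = @("Line 1", "Line 2", "SPLITHERE", "Line 3", "Line 4", "SPLITHERE", "Line 5")
$chunks = @(Split-TextIntoChunks -Lines $book)
$chunks.Count
```

Each element of $chunks could then be handed to $speech.Speak() after a call to $speech.SetOutputToWaveFile(), in the same way Write-WavFile does with the text it reads from disk.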

PowerShell: splitting files

Specific requirements will differ: for me it is to split a file based on a literal occurring at points in a larger file.

The text I used for testing was the Pride and Prejudice extract referenced elsewhere in this blog.

The Add-Content line needs an encoding switch; the version below uses -Encoding utf8, as in the WAV script above.

function Get-FileName {
    "{0}{1}_{2}.txt" -f ($outputRootDir, $outputFileNamePrefix, $chunk)
}
# This expects a $splitMarker in a source file, to denote the string where 1 file is to end and another is to start
# All the output files have the same name, differing only by the incrementing counter $chunk
function Split-File (
    $fileToSplit = "C:\Temp\PandP.txt",
    $splitMarker = "SPLITHERE",
    $outputFileNamePrefix = "smallish",
    $outputRootDir = "c:\temp\"
) {
    $reader = New-Object -TypeName System.IO.StreamReader -ArgumentList $fileToSplit
    $chunk = 1
    try {
        while ($null -ne ($line = $reader.ReadLine())) {
            if ($line -match $splitMarker) {
                # Marker line: drop it and switch to the next output file.
                $chunk++
            } else {
                Add-Content -Path $(Get-FileName) -Value $line -Encoding utf8
            }
        }
    } finally {
        $reader.Dispose()
    }
}
#entry point...
Split-File

					

PowerShell: writing to a file

You might be puzzled that if you use Write-Host to record the session output in a file, the file is empty, thus:
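The original screenshot is not available, so here is a small sketch that reproduces the behaviour. The file name and the message are hypothetical, and the temp directory is used so that it can be run anywhere.

```powershell
# Hypothetical log file in the temp directory.
$hostLog = Join-Path ([System.IO.Path]::GetTempPath()) 'session-write-host.txt'

# The message is shown on the console, but nothing travels down the pipeline to Out-File.
Write-Host "Processing complete" -ForegroundColor Green | Out-File -FilePath $hostLog

# The file exists but contains no lines.
$hostLogLines = @(Get-Content -Path $hostLog)
$hostLogLines.Count
```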

Write-Output is what you need, although you do lose those useful colour differences:
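A minimal sketch of the Write-Output version, again with a hypothetical file name and message:

```powershell
# Hypothetical log file in the temp directory.
$outputLog = Join-Path ([System.IO.Path]::GetTempPath()) 'session-write-output.txt'

# Write-Output sends the string down the pipeline, so Out-File receives it.
Write-Output "Processing complete" | Out-File -FilePath $outputLog

$outputLogLines = @(Get-Content -Path $outputLog)
$outputLogLines
```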

Redirect operators achieve the same end:
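For example (a sketch with a hypothetical file name), > creates or overwrites the file and >> appends to it:

```powershell
# Hypothetical log file in the temp directory.
$redirectLog = Join-Path ([System.IO.Path]::GetTempPath()) 'session-redirect.txt'

Write-Output "First line"  > $redirectLog    # > creates or overwrites the file
Write-Output "Second line" >> $redirectLog   # >> appends to the file

$redirectLogLines = @(Get-Content -Path $redirectLog)
$redirectLogLines
```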

From a DOS prompt, you can call PowerShell.exe and have it invoke the wrapped command:

In fact my own preference is to execute Write-Host in an ISE window, then copy and paste the output into a Word doc or Outlook, because that preserves the colour. That applies only if you are not dealing with large volumes (you judge what large is).

Oh yeah, sorry, why this difference in behaviour? Because in essence, Write-Host writes straight to the host (the console) and sends nothing down the pipeline, whereas Write-Output carries its objects forward to the next action. See Jeffrey Snover’s post here. The point is: be aware that Write-Host is largely considered bad practice… I’m just a sucker for its easy colour highlighting. Perhaps there are other ways to achieve that goal as simply, but I don’t know them.

PowerShell: deleting file records based on a condition

Given a file like this, where the second column is [Gender] and the third column is [Human], give me only rows which are [F] (the 20th character, which is index 19 because indexing starts at zero and not 1) and [Human] (the 30th character, index 29). Write the result to a file [HumanFemales.txt].

Dennis             M         Y
Jan                F         Y
Emma               F         Y
Bonzo              M         N

[Image: Names01]

This reads the source file content into a variable, counts the records, filters the records based on the condition into an array, and writes the array to a file:

$data = Get-Content -Path C:\temp\names.txt
$data.Count
$filteredData = @()
# Keep only the rows with F at index 19 (gender) and Y at index 29 (human)
$data | ForEach-Object { if ($_[19] -eq "F" -and $_[29] -eq "Y") { $filteredData += $_ } }
$filteredData.Count
$filteredData | Out-File -FilePath C:\temp\HumanFemales.txt
Get-Content -Path C:\temp\HumanFemales.txt

[Image: WriteFile01]

[Image: WriteFile02]

For a file of about 50,000 records on a good i7 box, the records take about 3 minutes to write to the array, the process uses (depending on the record length) about 1GB of memory, and the array takes about 5 seconds to write to disk.
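Much of that time probably goes into the += on the array: PowerShell arrays are fixed size, so each += builds a new array and copies every existing element into it. The sketch below filters in the pipeline and writes straight to the output file instead. It has not been timed against the 50,000 record file, and the temp directory paths and the generated sample data are hypothetical stand-ins for the real files.

```powershell
# Hypothetical paths in the temp directory so the sketch can be run anywhere.
$source = Join-Path ([System.IO.Path]::GetTempPath()) 'names.txt'
$target = Join-Path ([System.IO.Path]::GetTempPath()) 'HumanFemales.txt'

# Small sample file: name from index 0, gender at index 19, human flag at index 29.
@(
    ('Dennis'.PadRight(19) + 'M'.PadRight(10) + 'Y'),
    ('Jan'.PadRight(19) + 'F'.PadRight(10) + 'Y'),
    ('Emma'.PadRight(19) + 'F'.PadRight(10) + 'Y'),
    ('Bonzo'.PadRight(19) + 'M'.PadRight(10) + 'N')
) | Set-Content -Path $source

# Stream each line through the filter; no intermediate array is built.
Get-Content -Path $source |
    Where-Object { $_.Length -ge 30 -and $_[19] -eq 'F' -and $_[29] -eq 'Y' } |
    Set-Content -Path $target

$humanFemales = @(Get-Content -Path $target)
$humanFemales
```

The length check simply skips lines that are too short to have a character at index 29.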