Find and Replace Text with PowerShell

I’ve just finished up a series of tweets with a follower who had a question about finding and replacing a bit of data in a text file. In his case it was a web.config file, but it really could be any text file that you can view in PowerShell. In PowerShell 3.0 and later this isn’t too difficult, although I haven’t tested with really large files to determine if there are any limitations. Before you try anything I’m going to show you, please have a backup copy of the original file.

Let’s say I have a file like this:

I need to change the IP source address in the connection string. As you probably guessed, Get-Content is going to come into play. Normally when you use Get-Content you get an array of strings, which adds a little complexity. But starting in v3, Get-Content has a -Raw parameter which returns the entire file as a single string. If we have a single string then we can easily use -Replace.
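As a minimal sketch of this approach, assuming a small sample file with made-up addresses (the file name and connection string are hypothetical, not the follower's actual web.config):

```powershell
# Hypothetical sample file; the path and addresses are for illustration only
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-web.config'
Set-Content -Path $path -Value @'
<configuration>
  <connectionStrings>
    <add name="AppDb" connectionString="Data Source=10.10.10.10;Initial Catalog=Demo" />
  </connectionStrings>
</configuration>
'@

# -Raw (v3 and later) returns the whole file as one string
$text = Get-Content -Path $path -Raw

# -replace treats the search text as a regular expression, so escape the dots
$new = $text -replace '10\.10\.10\.10', '192.168.1.50'
$new
```

Nothing is written back to disk here; the revised text only lives in $new until you save it.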

The search pattern is treated as a regular expression, so it could be something more elaborate than literal text. This could even be done as a one-liner:
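One possible form of such a one-liner, as a sketch with a hypothetical file and pattern. The parentheses matter: they let Get-Content finish reading before Set-Content writes to the same file.

```powershell
# Hypothetical sample file for illustration
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-oneliner.config'
Set-Content -Path $path -Value '<add connectionString="Data Source=10.10.10.10;Initial Catalog=Demo" />'

# Match any IPv4-looking address that follows "Data Source=" and swap in a new one
(Get-Content -Path $path -Raw) -replace '(?<=Data Source=)\d{1,3}(\.\d{1,3}){3}', '192.168.1.50' | Set-Content -Path $path

Get-Content -Path $path
```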

For many of you, especially with unstructured text files, this technique should get the job done. But in this case, because the text file is also an XML document we could tackle the problem a different way. This is especially useful if you have several changes you want to make to the file.

First, create an XML document from the file.

We can navigate the document until we find the entry we need to change.


We can assign a new value with the revised connection string:

All that remains is to save the file back to disk.
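The steps above can be sketched end to end as follows. The sample file and its element names are assumptions based on the usual web.config layout, so adjust the navigation to match your own document.

```powershell
# Hypothetical sample file; element names follow the usual web.config layout
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'sample-xml.config'
Set-Content -Path $path -Value @'
<configuration>
  <connectionStrings>
    <add name="AppDb" connectionString="Data Source=10.10.10.10;Initial Catalog=Demo" />
  </connectionStrings>
</configuration>
'@

# Create an XML document from the file
[xml]$config = Get-Content -Path $path

# Navigate to the entry and assign a revised connection string
$entry = $config.configuration.connectionStrings.add
$entry.connectionString = $entry.connectionString -replace '10\.10\.10\.10', '192.168.1.50'

# Save() should be given a full path rather than one relative to the PowerShell location
$config.Save($path)
```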

Or, if you didn’t necessarily know the structure of your document, you could use an XPath query with Select-XML to find the node and then change the value.

The node properties will vary depending on your XML file so you will need to look at the object.
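A sketch of the XPath variation, again using made-up XML. The query below is only one way to locate the node; yours will depend on the document.

```powershell
# Hypothetical XML content for illustration
[xml]$config = @'
<configuration>
  <connectionStrings>
    <add name="AppDb" connectionString="Data Source=10.10.10.10;Initial Catalog=Demo" />
  </connectionStrings>
</configuration>
'@

# Find any <add> element with a connectionString attribute, wherever it lives
$result = Select-Xml -Xml $config -XPath '//add[@connectionString]'

# Select-Xml returns SelectXmlInfo objects; the matching element is in the Node property
$node = $result.Node
$node.connectionString = $node.connectionString -replace '10\.10\.10\.10', '192.168.1.50'

# The change is made in the original document object, ready for $config.Save()
$config.configuration.connectionStrings.add.connectionString
```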

The XML approach is a bit more involved but is predictable and you don’t have to save the file until you have the XML just the way you want it.

One thing I will say about both techniques is that the more you understand how to use regular expressions in PowerShell the more you can accomplish. If you are feeling a little wobbly on the topic, there is an entire chapter on regular expressions in PowerShell in Depth.

Convert Text to Object with PowerShell and Regular Expressions

A few weeks ago I was getting more familiar with named captures in regular expressions. With a named capture, you can give your matches meaningful names, which makes it easier to access specific captures. The capture is done by prefixing the pattern inside a group with a name, using the syntax (?<name>pattern).

When you know the name, you can get the value from $matches.
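For instance, with a made-up line of text and capture names chosen purely for illustration:

```powershell
# Sample text for illustration
$line = 'Server01 192.168.1.50 Online'

# (?<name>...) defines a named capture
$pattern = '(?<Computer>\w+)\s+(?<IP>(\d{1,3}\.){3}\d{1,3})\s+(?<Status>\w+)'

if ($line -match $pattern) {
    $matches.Computer   # Server01
    $matches.IP         # 192.168.1.50
    $matches.Status     # Online
}
```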

This also works, and even a bit better, using a REGEX object.

With the REGEX object you can get the names.

Because the names include index numbers, I usually filter them out. Once I know the names, I can use them to extract the relevant matches.
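A short sketch of that filtering, using a sample pattern and input that are assumptions for illustration:

```powershell
# Sample pattern and text for illustration
[regex]$rx = '(?<Computer>\w+)\s+(?<Status>\w+)'
$m = $rx.Match('Server01 Online')

# GetGroupNames() also returns "0" for the whole match, so drop numeric names
$names = $rx.GetGroupNames() | Where-Object { $_ -notmatch '^\d+$' }

# Use each remaining name to pull the matching value
foreach ($name in $names) {
    '{0} = {1}' -f $name, $m.Groups[$name].Value
}
```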

Then I realized it wouldn’t take much to take this to the next step in PowerShell. I have a name and a value, why not create an object? It isn’t too difficult to create a hashtable on the fly and use that to create a custom object. Eventually I came up with ConvertFrom-Text.

The function requires a regular expression pattern that uses named captures. With the pattern you can either specify the path to a log file, or you can pipe structured text to the function. By “structured text” I mean something like a log file with a predictable pattern. Or even output from a command line tool that has a consistent layout. The important part is that you can come up with a regular expression pattern to analyze the data. I also wanted to be able to pipe in text in the event I only wanted to process part of a large log file.

Here’s an example using the ARP command.

In this particular example, I’m trimming the ARP output to remove any leading or trailing spaces from each line and then converting each line to an object, using the regular expression pattern.
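The underlying idea can be sketched with a small helper. To be clear, ConvertTo-SampleObject below is a hypothetical stand-in written for this illustration, not the ConvertFrom-Text function itself, and the input lines are made up to resemble trimmed ARP output.

```powershell
# Hypothetical helper, NOT the author's ConvertFrom-Text function
function ConvertTo-SampleObject {
    param(
        [Parameter(Mandatory)][regex]$Pattern,
        [Parameter(Mandatory, ValueFromPipeline)][string]$InputObject
    )
    begin {
        # Keep only the named captures, skipping numeric group names
        $names = $Pattern.GetGroupNames() | Where-Object { $_ -notmatch '^\d+$' }
    }
    process {
        $m = $Pattern.Match($InputObject)
        if ($m.Success) {
            # Build a hashtable on the fly, then turn it into an object
            $hash = [ordered]@{}
            foreach ($name in $names) { $hash[$name] = $m.Groups[$name].Value }
            [pscustomobject]$hash
        }
    }
}

# Made-up lines shaped like trimmed ARP output
$sample = '192.168.1.1 00-11-22-33-44-55 dynamic', '192.168.1.255 ff-ff-ff-ff-ff-ff static'
$arpPattern = '(?<IP>(\d{1,3}\.){3}\d{1,3})\s+(?<MAC>(\w{2}-){5}\w{2})\s+(?<Type>\w+)'
$objects = $sample | ConvertTo-SampleObject -Pattern $arpPattern

# Once the text is objects, ordinary filtering works
$objects | Where-Object { $_.Type -eq 'dynamic' }
```

Lines that don't match the pattern, such as column headers, are simply skipped in this sketch.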


If you haven’t already guessed why this command is useful, it is that once I have objects I can easily filter, sort, group, export, or do just about anything else. By converting a log file into a collection of objects I can do tasks like this:


I hope some of you will try this out and let me know what you think. What works? What is missing? What problem did this solve? Inquiring minds, well at least mine, want to know. Enjoy.

Friday Fun with REST, Regex and Replacements

I just love using the web cmdlets that were introduced in PowerShell 3.0. One of the cmdlets I use the most is Invoke-RestMethod. This isn’t because I’m dealing with sites that offer REST-ful services, but rather I like that the cmdlet does all the heavy lifting for me when I point it to an RSS feed. I thought I’d share some techniques for working with the results incorporating regular expressions and replacements.

First, I need something to work with.

Here’s an example of what I get back:

The cmdlet, Invoke-RestMethod, has gone ahead and created XML elements. I didn’t have to do anything. Next, I’d like to re-format the results and make it pretty. For example, the pubDate, while readable, won’t sort as a true [datetime] because it is being represented as a string. And some properties, like Description, are buried a bit further.

I can start with something like this:

I’ve grabbed the description text and treated the pubDate as a [datetime] object.
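Here is a sketch of that kind of Select-Object statement. To keep it runnable without a network connection it uses a made-up feed item cast to XML in place of real Invoke-RestMethod output; the property names Published and Description are my own choices for illustration.

```powershell
# Made-up feed item standing in for what Invoke-RestMethod returns from an RSS feed
[xml]$feed = @'
<rss><channel><item>
  <title>Sample Post</title>
  <link>https://example.com/sample-post</link>
  <pubDate>Fri, 10 Jan 2014 15:00:00 +0000</pubDate>
  <description><![CDATA[<p>Hello &#8211; world</p>]]></description>
</item></channel></rss>
'@

# Calculated properties convert pubDate to [datetime] and dig out the description text
$items = $feed.rss.channel.item | Select-Object Title, Link,
    @{Name = 'Published';   Expression = { $_.pubDate -as [datetime] }},
    @{Name = 'Description'; Expression = { $_.SelectSingleNode('description').InnerText }}

$items
```

With Published as a true [datetime], piping to Sort-Object Published sorts chronologically rather than alphabetically.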


One of the first things I see looking at this in more detail is that the description is full of HTML code.

To make this easier to read I want to strip out the HTML tags and convert character entities like “&#8211;” (an en dash) into something I can understand. Here’s where regular expressions come into play.

With a little online research I came up with a regular expression pattern to find HTML tags: <(.|\n)+?> which I can use like this:
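For instance, with a made-up snippet of HTML standing in for the description text:

```powershell
# Sample description text for illustration
$text = '<p>PowerShell is <strong>fun</strong>.</p>'

# The pattern from the text: "<", then as few characters as possible, then ">"
[regex]$rx = '<(.|\n)+?>'
$clean = $rx.Replace($text, '')
$clean   # PowerShell is fun.
```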

The Replace() method took all matches and replaced them with “”, effectively removing them from the text. Because I have a number of items to potentially replace, I defined a hashtable.

Then in my Select-Object statement I can reformat the description text.
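A sketch of the hashtable-driven replacement on its own, outside of Select-Object. The entries in the table and the sample text are assumptions for illustration.

```powershell
# Hypothetical replacement table: text to find = text to use instead
$decode = @{
    '&#8211;' = '-'
    '&#8217;' = "'"
    '&amp;'   = '&'
}

$text = 'Tips &amp; tricks &#8211; it&#8217;s Friday'

# Walk the keys and apply each replacement in turn
foreach ($key in $decode.Keys) {
    $text = $text.Replace($key, $decode[$key])
}
$text   # Tips & tricks - it's Friday
```

A hashtable does not guarantee the order of its keys, so this works best when the replacements don't depend on one another.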

PowerShell goes through the hashtable keys and replaces each key found in the text with its corresponding value. The end result is that $text is now clean. Here is my final code:

The other addition I made was to get the text from the Category XML element and join the array of strings into a single line separated by commas. Here’s the final result:


But wait, there’s more! I have taken the output from Invoke-RestMethod and written new objects to the pipeline. One thing I could do is pipe the result to Out-GridView and use it as an object-picker.

From Out-GridView I can select one or more items, click OK and have each item open in my web browser. Or how about this:

Here I have a couple of replacements going on. After running the script, I re-select the properties and customize the Link property to turn it into an HTML anchor link. I’m doing this so that when I convert to HTML I will get a table with a clickable link. Well, almost. You see, when ConvertTo-HTML gets the text for the Link property it escapes the < and > as &lt; and &gt;, which means I need to turn those back into < and > before saving the results to a file. Notice I’m taking advantage of the pipeline in the ForEach script block.

The Replace() method writes the string object to the pipeline after replacing &lt;, and then a second Replace() handles &gt;. The net result is two replacements with one command.
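A sketch of the chained replacement, using a made-up object. The href is left unquoted on purpose: depending on the PowerShell version, quote characters may be escaped as well and would need their own Replace() call.

```powershell
# Made-up object whose Link property already holds an HTML anchor
$item = [pscustomobject]@{
    Title = 'Sample Post'
    Link  = '<a href=https://example.com/sample-post>Sample Post</a>'
}

# ConvertTo-Html escapes the angle brackets, so undo that line by line.
# Replace() returns a string, so a second Replace() can be chained onto it.
$html = $item | ConvertTo-Html -Fragment | ForEach-Object {
    $_.Replace('&lt;', '<').Replace('&gt;', '>')
}
$html
```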

I hope you had some fun with this and maybe learned a trick or two.

PowerShell Scripting with [ValidatePattern]

I’ve been writing about a number of parameter attributes you can include in your PowerShell scripting to validate parameter values. Today I want to cover using a regular expression pattern to validate a parameter value. I’m going to assume you have a rudimentary knowledge of how to use regular expressions in PowerShell. If not, there is an entire chapter devoted to the topic in Windows PowerShell 2.0: TFM.

The parameter attribute is [ValidatePattern()]. Inside the parentheses you place the regular expression pattern as a quoted string. For example, in PowerShell we might write a command like this to verify that something is a number of 1 to 3 digits:


$x -match "^\d{1,3}$"

To use that pattern in a [ValidatePattern()] attribute, you would write it like this:


[ValidatePattern('^\d{1,3}$')]

There is no need to use the -match operator or $_. Sure, I suppose you could write a validation script to achieve the same effect, but this is just as easy. I recommend testing your pattern from the PowerShell prompt, especially testing for failures. Here’s a more complete example.

Param (
    [Parameter(Position=0,Mandatory=$True,HelpMessage="Enter a UNC path like \\server\share")]
    [ValidatePattern('^\\\\\S*\\\S*$')]
    [ValidateScript({Test-Path -Path $_ })]
    [string]$Path
)

Write-Host "Getting top level folder size for $Path" -ForegroundColor Magenta
dir $path | measure-object -Property Length -sum

For you regular expression gurus, don’t get hung up on my pattern. It works for my purposes of illustration. Your pattern can be as simple or as complex as you need it to be. In this short script I’m expecting a path value like \\file01\public. If the value is not in this format, the pattern validation will fail, PowerShell will throw an exception and the script will fail.
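Testing the pattern at the prompt, including a value that should fail, might look like this sketch. Test-UncParameter is a hypothetical function written only to show the attribute rejecting a bad value; it leaves out the Test-Path check so it runs without a real share.

```powershell
# Try the UNC pattern at the prompt first, including values that should fail
$pattern = '^\\\\\S*\\\S*$'
'\\file01\public' -match $pattern   # True
'C:\public'       -match $pattern   # False

# Hypothetical function using the same pattern as a validation attribute
function Test-UncParameter {
    param(
        [ValidatePattern('^\\\\\S*\\\S*$')]
        [string]$Path
    )
    "Accepted $Path"
}

Test-UncParameter -Path '\\file01\public'

# A value that does not match raises a parameter binding error
try {
    Test-UncParameter -Path 'C:\public'
}
catch {
    "Rejected: $($_.Exception.Message)"
}
```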

Notice I’m also using a second parameter validation attribute, [ValidateScript()]. It is possible for a value to match the pattern but still be invalid, for example a path that doesn’t exist, so I combine both validation tests.


PS C:\> S:\Demo-ValidatePattern.ps1 '\\file01\temp'
C:\scripts\Demo-ValidatePattern.ps1 : Cannot validate argument on parameter 'Path'. The "Test-Path -Path $_ " validation script for the argument with value "\\file01\temp" did not return true. Determine why the validation script failed and then try the command again.
At line:1 char:28
+ S:\Demo-ValidatePattern.ps1 <<<<  '\\file01\temp'
    + CategoryInfo          : InvalidData: (:) [Demo-ValidatePattern.ps1], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationError,Demo-ValidatePattern.ps1

If you'd like to try out my sample script, you can download it here.

Friday Fun Get Content Words

Recently I was tracking down a bug in a script for a client. The problem turned out to be a simple typo. I could have easily avoided that by using Set-StrictMode, which I do now, but that’s not what this is about. What I realized I wanted was a way to look at all the “words” in a script. If I could look at them sorted, then typos would jump out. At least in theory.

My plan was to get the content of a text file or script, use a regular expression pattern to identify all the “words” and then get a sorted and unique list. Here’s what I came up with.


Function Get-ContentWords {

    [cmdletbinding()]

    Param (
        [Parameter(Position=0,Mandatory=$True,
        HelpMessage="Enter the filename for your text file",
        ValueFromPipeline=$True)]
        [string]$Path
    )

    Begin {
        Set-StrictMode -Version 2.0

        Write-Verbose "Starting $($myinvocation.mycommand)"

        #define a regular expression pattern to detect "words"
        [regex]$word="\b\S+\b"
    }

    Process {

        if ($path.gettype().Name -eq "FileInfo") {
            #$Path is a file object
            Write-Verbose "Getting content from $($Path.Fullname)"
            $content=Get-Content -Path $path.Fullname
        }
        else {
            #$Path is a string
            Write-Verbose "Getting content from $path"
            $content=get-content -Path $Path
        }

        #add a little information
        $stats=$content | Measure-Object -Word
        Write-Verbose "Found approximately $($stats.words) words"

        #write sorted unique values
        $word.Matches($content) | select Value -unique | sort Value
    }

    End {
        Write-Verbose "Ending $($myinvocation.mycommand)"
    }

} #close function

The function uses Get-Content to retrieve the content (what else?!) of the specified file. At the beginning of the function I defined a regular expression object to find “words”.


#define a regular expression pattern to detect "words"
[regex]$word="\b\S+\b"

This is an intentionally broad pattern that searches for anything not a space. The \b element indicates a word boundary. Because this is a REGEX object, I can do a bit more than using a basic -match operator. Instead I’ll use the Matches() method which will return a collection of match objects. I can pipe these to Select-Object retrieving just the Value property. I also use the -Unique parameter to filter out duplicates. Finally the values are sorted.


$word.Matches($content) | select Value -unique | sort Value

The unique filtering is case-sensitive (the sort is not), which is what I want here. With the list I can see where I might have used write-host instead of Write-Host and go back to clean up my code. Let me show you how this works. Here’s a demo script.


#Requires -version 2.0

$comp = Read-Host "Enter a computer name"

write-host "Querying services on $comp" -fore Cyan
$svc = get-service -comp $comp

$msg = "I found {0} services on $comp" -f $svc.count
Write-Host "Results" -fore Green
Write-Host $mgs -fore Green

The script has some case inconsistencies as well as a typo. I’ve dot sourced the function in my PowerShell session. Here’s what I end up with.

For best results, you need to make sure there are spaces around commands that use the = sign. But now I can scan through the list and pick out potential problems. Sure, Set-StrictMode would help with variable typos but if I had errors in say comment based help, that wouldn’t help. Maybe you’ll find this useful in your scripting work, maybe not. But I hope you learned a few things about working with REGEX objects and unique properties.
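To see the idea on a small scale, here is a sketch that applies the same pattern directly to a few made-up lines, without the function. The sample text is an assumption that mimics the case inconsistency and variable typo from the demo script.

```powershell
# Made-up script text with a case inconsistency and a variable typo
$content = @'
$msg = "I found some services"
write-host "Results"
Write-Host $mgs
'@

[regex]$word = '\b\S+\b'

# Select-Object -Unique compares case-sensitively; Sort-Object does not by default
$words = $word.Matches($content) | ForEach-Object { $_.Value } |
    Select-Object -Unique | Sort-Object
$words
```

Both spellings of Write-Host survive the unique filter, and mgs sits near msg in the sorted list, which is what makes the typo easy to spot.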

Download Get-ContentWords and enjoy.