A few days ago I posted a quick PowerShell puzzle as part of an appearance announcement.
Using the about_Splatting help file, what word is used the most frequently? If you can show the top 5 even better. Ideally, your code will treat words like “for,” and “for” as the same. For extra points, create a simple custom object that shows the help topic name, the total number of words, “,” is not a word, the top word and the top word count. Also, try to skip “the” which is almost always a common word.
A few people posted links to their solutions. And if you want to try your hand at the problem, stop reading and come back later to compare your work with mine.
My Solution
First of all, I will be the first to tell you that my answer is not the only solution or even the best solution. But it gets the job done so it is "good enough."
My solution is written as a PowerShell script so that I could easily test it with other about_* help topics. The script is nothing more than two PowerShell expressions.
Param([string]$topic = "About_splatting")
#I'm filtering out words 'the', 'a' and 'an' before
#selecting the most-used word.
$results = ($(Get-Help $topic).split()) -replace "\W|_","" |
    Where-Object {$_} -OutVariable all | Group-Object |
    Where-Object {$_.name -notmatch "\b(the|a|an)\b"} |
    Sort-Object -property count -Descending |
    Select-Object -first 5
#create a custom object
[PSCustomObject]@{
    Topic        = $topic
    TotalWords   = $all.Count
    TopWord      = $results[0].Name
    TopWordCount = $results[0].count
    PSVersion    = $PSVersionTable.PSVersion
}
Here's how it works. Get-Help writes a string object to the pipeline. I'm splitting the string on the default whitespace and then replacing each non-word character (\W) or underscore with nothing. This strips punctuation, so "for," is treated as "for". It also removes the underscore, so a token such as "about_Splatting" becomes "aboutSplatting". These results are sent to Where-Object, which filters out the empty strings left behind. I could have included this kind of regex parsing in the first part of the command. This leaves a list of words. Notice that I'm using -OutVariable to save the output of this particular command to a variable, $all. I'll use this later.
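To make that first stage concrete, here is a minimal sketch that runs the same split, replace, and filter steps on a short sample string instead of a help topic. The $sample text and the $words variable are my own stand-ins for illustration, not part of the script.

```powershell
# Sample text standing in for the Get-Help output (an assumption for illustration)
$sample = 'Use splatting for parameters, for example with about_Splatting.'

# Split on whitespace, strip non-word characters and underscores,
# then drop any empty strings left behind
$words = ($sample.Split()) -replace '\W|_', '' | Where-Object { $_ }

$words
```

The trailing comma and period are removed, so both occurrences of "for" end up identical and would land in the same group later on.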
The words are then grouped and filtered again to ignore the common words "the", "a", and "an". The filtered groups are then sorted on the Count property in descending order, and I select the first 5, although I really only needed the first one for the custom object.
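The grouping stage can be tried on its own with a small list. This is only a sketch: the $words list is hypothetical, and I'm selecting the top 2 rather than the top 5 to keep the output short. Group-Object compares strings case-insensitively by default, which is why "The" and "the" fall into one group.

```powershell
# Hypothetical word list standing in for the cleaned-up help text
$words = 'the', 'a', 'parameter', 'The', 'command', 'parameter',
         'an', 'hash', 'Parameter', 'command'

# Group identical words, drop the articles, and keep the most frequent groups
$top = $words | Group-Object |
    Where-Object { $_.Name -notmatch '\b(the|a|an)\b' } |
    Sort-Object -Property Count -Descending |
    Select-Object -First 2

$top | Select-Object Name, Count
```

Without the \b anchors the filter would discard every word in this list, because "parameter", "command", and "hash" all contain an "a".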
The last part of the script creates a custom object from the first item in $results. I could have created a custom object for each item in $results. Anyway, this is what I end up with.
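If you did want one object per item in $results, a rough sketch might look like the following. The $results here is a hypothetical stand-in built from a small word list, and the Rank, Word, and Count property names are my own choices for illustration rather than anything the script defines.

```powershell
$topic = 'about_Splatting'

# Hypothetical stand-in for the sorted group results from the script
$results = 'hash', 'table', 'hash', 'splat', 'hash', 'table' |
    Group-Object | Sort-Object -Property Count -Descending

# Emit one custom object per group, numbering them in sorted order
$rank = 0
$objects = foreach ($item in $results) {
    $rank++
    [PSCustomObject]@{
        Topic = $topic
        Rank  = $rank
        Word  = $item.Name
        Count = $item.Count
    }
}

$objects
```

This gives you a table of the top words instead of a single summary object, which is closer to the "top 5" part of the puzzle.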
Much of your PowerShell work involves working with objects in the pipeline. If you can visualize the process, or at least verbally describe it, I think you'll find PowerShell easier to use and write. The goal of the problem isn't the end result, which is of no practical value, but rather building up your PowerShell muscles.
The solution is wrong in that `Where-Object {$_.name -notmatch 'the|a|an'}` filters out all words containing an “a”, like "command" or "parameter". It should read `Where-Object {$_.name -notmatch '^(the|a|an)$'}`. Or even better, avoid regex altogether: `Where-Object {$_.name -notin ('the', 'a', 'an')}`
Good catch. In my haste, I didn’t properly test the regex pattern. This would work better: "\b(the|a|an)\b". I’ll update the code.
Jeff,
Thank you for sharing your time and expertise with us. I always learn from these challenges. I enjoy the way you break down each step in your solution.
Cheers!
Eric O