Using Types with Imported CSV Data in PowerShell

The Import-CSV cmdlet in PowerShell is incredibly useful. You can take any CSV file and pump objects to the pipeline. The cmdlet uses the CSV header as properties for the custom object.


PS S:\> import-csv .\testdata.csv

Date : 1/18/2012 6:45:30 AM
Name : Data_1
Service : ALG
Key : 1
Size : 25

Date : 1/18/2012 2:17:30 AM
Name : Data_2
Service : AppIDSvc
Key : 2
Size : -30
...

But there is a downside: all of the properties are strings.


PS S:\> import-csv .\testdata.csv | get-member

TypeName: System.Management.Automation.PSCustomObject

Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Date NoteProperty System.String Date=1/18/2012 6:45:30 AM
Key NoteProperty System.String Key=1
Name NoteProperty System.String Name=Data_1
Service NoteProperty System.String Service=ALG
Size NoteProperty System.String Size=25

The means some tasks such sorting or filtering will fail. But there are ways to get around this limitation. One way is to use an expression to cast a property to a different type. For example, I want to sort my test data on the Date property, but it needs to be a [DateTime] object to sort properly. Here’s how:


PS S:\> import-csv testdata.csv | sort @{expression={$_.date -as [datetime]}} | Select Date,Name,Size

Date Name Size
---- ---- ----
1/9/2012 6:28:30 PM Data_25 26
1/11/2012 11:13:30 AM Data_20 44
1/11/2012 6:28:30 PM Data_23 33
1/13/2012 12:13:30 AM Data_16 42
1/13/2012 4:45:30 PM Data_24 47
...

My output object properties are all still strings. All I did was cast the Date property in the Sort expression. Here’s an example using filtering.


PS S:\> import-csv testdata.csv | where {($_.date -as [datetime]) -le ("1/12/2012" -as [datetime])} | Select Date,Name,Size

Date Name Size
---- ---- ----
1/11/2012 11:13:30 AM Data_20 44
1/11/2012 6:28:30 PM Data_23 33
1/9/2012 6:28:30 PM Data_25 26

These examples are only producing results. More likely I want to import the CSV file as typed objects. Assuming you know in advance the property names and what types you want to use, here’s how you could achieve this.


PS S:\> $data=import-csv testdata.csv | Select @{Name="Date";Expression={[datetime]$_.Date}}, Name,Service,@{Name="Key";Expression={[int32]$_.Key}},@{Name="Size";Expression={[int32]$_.Size}}

I imported my CSV file and piped it to Select-Object, using hash tables to redefine the properties with appropriate types. Import-CSV writes a PSCustomObject to the pipeline anyway so using Select-Object has no effect other than giving me typed properties.


PS S:\> $data | get-member

TypeName: Selected.System.Management.Automation.PSCustomObject

Name MemberType Definition
---- ---------- ----------
Equals Method bool Equals(System.Object obj)
GetHashCode Method int GetHashCode()
GetType Method type GetType()
ToString Method string ToString()
Date NoteProperty System.DateTime Date=1/18/2012 6:45:30 AM
Key NoteProperty System.Int32 Key=1
Name NoteProperty System.String Name=Data_1
Service NoteProperty System.String Service=ALG
Size NoteProperty System.Int32 Size=25

Now I can use $data objects anyway I want.


PS S:\> $data | where {$_.size -ge 40 -AND $_.key -le 10}

Date : 1/17/2012 11:57:30 PM
Name : Data_3
Service : Appinfo
Key : 3
Size : 42

I’m working on something that takes this idea to the next level but it isn’t quite ready for prime time. But I hope this will help manage imported objects a bit more efficiently and let you really take advantage of the PowerShell pipeline.

Simple Where Filters

The comment about how awkward it is in PowerShell to filter out folders with Get-ChidlItem, or its alias dir, came up the other day on Twitter. I’ll be the first to admit that running a DIR command and wanting to skip folders, or perhaps you only want top level folders, is more cumbersome than we would like in PowerShell v2. In PowerrShell v3 this has been addressed but for now we’re stuck with expressions like this:

[cc lang=”PowerShell”]
dir c:\work -rec | where {!$_.PSIsContainer} | group extension | sort count
[/cc]

If you find yourself frequently performing this type of filtering in the shell, here’s a tip that can save a little typing. Create these two functions:

[cc lang=”PowerShell”]
function IsDir {process {$_ | Where {$_.PSIsContainer}}}
function NotDir {process {$_ | Where {!$_.PSIsContainer}}}
[/cc]

These are so simple I’m not even going to include a text download. You can call them whatever you want. The functions are written as filtering functions, notice the use of Process script block. In short, these are wrappers for the standard Where-Object filter we’re used to using. But now I can do this:

[cc lang=”PowerShell”]
dir c:\work -rec | notdir | group extension | sort count
[/cc]

Or perhaps this:

[cc lang=”PowerShell”]
dir c:\work | isdir | Select FullName,@{Name=”Size”;Expression={(dir $_.fullname -rec | notdir | measure length -sum).sum}} | format-table -auto
[/cc]

If you find yourself needing this every day, put the function definitions in your profile. You can use this same technique for other common filters you tend to use often. What else can you come up with?

Filtering Empty Values in PowerShell

I saw this tip today and wanted to leave a comment but couldn’t see how. So I thought I’d post my comments here. This is actually a question I see often and there are better ways to write this kind of code.

The posted tip used an example where you wanted to find processes where the company name is defined. The way suggested in the tip, and a technique I see often goes something like this:

[cc lang=”PowerShell”]
PS C:\> get-process | where {$_.Company -ne $Null} | Sort Company| Select Name,ID,Company
[/cc]

While it mostly works, this is a better PowerShell approach, in my opinion.

[cc lang=”PowerShell”]
PS C:\> get-process | where {$_.Company} | Sort Company| Select Name,ID,Company”
[/cc]

When I run the first technique, I still got a blank company name. The tip offers a work around for this situation like this:

[cc lang=”PowerShell”]
PS C:\> get-process | where {$_.Company -ne $Null -AND $_.company -ne ”} | Sort Company| Select Name,ID,Company
[/cc]

This gives the same result as my suggested approach. My approach uses Where-Object to say, if the Company property exists, pass on the object. If you wanted to find processes without a company name, then use the -NOT operator.

[cc lang=”PowerShell”]
PS C:\> get-process | where {-Not $_.Company}
[/cc]

I use a similar technique to filter out blank lines in text files.

[cc lang=”PowerShell”]
get-content computers.txt | where {$_} …
[/cc]

While we’re on the subject, a related filtering technique I often see involves boolean properties. You don’t have to do this:

[cc lang=”PowerShell”]
PS C:\> dir | where {$_.PsIsContainer -eq $True}
[/cc]

PsIsContainer is a boolean value, so let Where-Object simply evaluate it:

[cc lang=”PowerShell”]
PS C:\> dir | where {$_.PsIsContainer}
[/cc]

As above, use -Not to get the inverse. Don’t feel you need to explicitly evaluate properties in a Where-Object expression. I see this is a VBScript transition symptom that I hope you can break.

Get Properties with Values

One of my nuisance issues when using WMI with Windows PowerShell, is that when looking at all properties I have to wade though many that have no value. I’d prefer to only view properties that have a populated value. Here’s one way. Continue reading