If there's one task I've never stopped doing, it is finding files. I am constantly creating new ways to organize files and display them in a meaningful format. Naturally, PowerShell is a great tool for this task. Get-ChildItem is obviously the proper starting point. The cmdlet works fine for getting only files from a folder, and I can do basic early filtering by name and extension with wildcards. But my latest task was organizing files by date, and because of the way Get-ChildItem works under the hood, I need to resort to late filtering with Where-Object. There's nothing wrong with that. But if this is a task I'm likely to repeat, then a PowerShell function is on the drawing board. My goal is to create a function that will display files grouped into aging buckets such as one week or six months. Even though I'm primarily concerned with age based on the last write time, I (or you) might need to base aging on the creation time. Let's code.
Building a List
My function will take pipeline input from Get-ChildItem. However, because I'm going to be sorting and grouping files, I can't do that until all of the files from Get-ChildItem have been processed by my function. I need to temporarily store them as they come through the pipeline. I could use an array object and add each pipelined object to it. My code might look something like this.
Begin {
    $files = @()
}
Process {
    $files += $inputobject
}
End {
    $files | Sort-Object -property LastWriteTime ...
}
There's technically nothing wrong with this approach and I have used it for years. But lately, I've been turning to a generic list, specifically a [System.Collections.Generic.List[T]] object. The reason is that arrays have a fixed size, so each time you add an element with +=, PowerShell creates a new, larger array and copies all the existing elements into it. For a small number of items, that is no problem. But I could be processing thousands of files. The array syntax is easy to use, but that ease comes with a performance cost. I added almost 9000 files to an array in just over 3 seconds. Using the generic list took less than a second.
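If you want to see the difference for yourself, here is a minimal timing sketch. It is not the test I ran: it uses plain integers instead of file objects, and the item count of 9000 is only a stand-in for "almost 9000 files". Results will vary with your PowerShell version and hardware.

```powershell
# Assumption: 9000 integers stand in for the file objects
$count = 9000

# Array: every += allocates a new array and copies the existing elements
$arrayTime = Measure-Command {
    $files = @()
    foreach ($i in 1..$count) { $files += $i }
}

# Generic list: Add() appends to the existing collection
$listTime = Measure-Command {
    $list = [System.Collections.Generic.List[int]]::new()
    foreach ($i in 1..$count) { $list.Add($i) }
}

"Array: {0:n0} ms" -f $arrayTime.TotalMilliseconds
"List : {0:n0} ms" -f $listTime.TotalMilliseconds
```

The full type name is spelled out here so the snippet can be pasted into a console without any setup.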
In the Begin block of my function, I could define the collection object like this:
$list = [system.Collections.Generic.list[system.io.fileinfo]]::new()
But, I'm going to use a scripting technique to simplify that syntax. In the .ps1 file that defines my function, before the function I'm going to insert a Using statement.
Using namespace System.Collections.Generic
Now in the Begin block, my definition is much simpler:
$list = [List[System.IO.FileInfo]]::new()
You generally tell PowerShell what type of object is going into your list. In my case, I know it will be System.IO.FileInfo objects. If you are unsure or will have a mix of objects, you can simply use 'object'.
The list object has a number of methods that I won't get into here. In the Process block, I'll add each file to the list.
$list.Add($FilePath)
When I get to the End block, $list will contain all the files piped into my function. Now I can do my bucket magic.
Age Grouping
I had to decide what kind of aging buckets I wanted. I decided by year, month, and number of days. I'll control this with a parameter.
[Parameter(HelpMessage = "How do you want the files organized? The default is Month.")]
[ValidateNotNullOrEmpty()]
[ValidateSet("Month","Year","Days")]
[string]$GroupBy = "Month"
I also needed to control which property the aging decisions are based on.
[Parameter(HelpMessage = "Specify if grouping by the file's CreationTime or LastWriteTime property. The default is LastWriteTime.")]
[ValidateSet("CreationTime","LastWriteTime")]
[ValidateNotNullOrEmpty()]
[string]$Property = "LastWriteTime",
For each file in my list, I need to add a property that indicates its age group. I know I'm going to add a property to the file object.
$file | Add-Member -MemberType NoteProperty -Name AgeGroup -Value $value
The value will be calculated based on the grouping bucket. If I want to group on the Year, for each file I will run this code:
$value = $file.$property.year
$sort = "AgeGroup"
The $sort variable will be used at the end of the function to help format the output. For month grouping, I needed the month and the year. I also needed to turn the value into a DateTime object so that it would sort properly. $Property will be either CreationTime or LastWriteTime.
$value = "$($file.$property.month)/$($file.$property.year)"
$sort = {$_.AgeGroup -as [datetime]}
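To illustrate why the DateTime conversion matters, here is a small sketch using made-up AgeGroup values in the same month/year format. It assumes the strings convert to dates on your system the same way they do in the function.

```powershell
# Made-up AgeGroup values in the month/year format
$groups = "10/2021", "9/2021", "11/2020", "2/2021"

# Sorting the raw strings compares them as text, not as dates
$asText = $groups | Sort-Object

# Sorting on a scriptblock that converts each value to a DateTime
# puts the values in chronological order
$asDate = $groups | Sort-Object -Property { $_ -as [datetime] }

"Text sort: $($asText -join ', ')"
"Date sort: $($asDate -join ', ')"
```

The text sort puts 10/2021 ahead of 2/2021 because it compares character by character. The scriptblock sort is the same technique the function stores in $sort.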
Custom Property Custom Values
Grouping on the number of days was the trickiest step. I decided to define a set of bucket names like OneWeek and SixMonth. However, this meant sorting would be on the string value which wouldn't be accurate. My solution was to define an Enum in my function file.
Enum FileAge {
    YearPlus
    Year
    NineMonth
    SixMonth
    ThreeMonth
    OneMonth
    OneWeek
    OneDay
}
Why? Because in an Enum, each value actually has a numeric value starting at 0. Even though I'm using strings to indicate the aging bucket, when I sort, it will be on the implicit numeric value. With this enum, I can assign a value based on the total days.
$age = New-TimeSpan -start $file.$property -end $now
$value = switch ($age.totaldays -as [int]) {
    {$_ -ge 365} {[FileAge]::YearPlus; break}
    {$_ -gt 270 -AND $_ -lt 365} {[FileAge]::Year; break}
    {$_ -gt 180 -AND $_ -le 270} {[FileAge]::NineMonth; break}
    {$_ -gt 90 -AND $_ -le 180} {[FileAge]::SixMonth; break}
    {$_ -gt 30 -AND $_ -le 90} {[FileAge]::ThreeMonth; break}
    {$_ -gt 7 -AND $_ -le 30} {[FileAge]::OneMonth; break}
    {$_ -gt 1 -AND $_ -le 7} {[FileAge]::OneWeek; break}
    {$_ -le 1} {[FileAge]::OneDay; break}
}
$sort = "AgeGroup"
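Here is a self-contained sketch of the enum and the switch working together. The Get-AgeBucket helper is hypothetical and exists only for this illustration (the real function does this work inline), and the ages are made-up values.

```powershell
# Same enumeration as above; members are numbered 0 through 7 in order
Enum FileAge {
    YearPlus
    Year
    NineMonth
    SixMonth
    ThreeMonth
    OneMonth
    OneWeek
    OneDay
}

# Hypothetical helper that wraps the switch statement for testing
function Get-AgeBucket {
    param([double]$TotalDays)
    switch ($TotalDays -as [int]) {
        { $_ -ge 365 } { [FileAge]::YearPlus; break }
        { $_ -gt 270 -and $_ -lt 365 } { [FileAge]::Year; break }
        { $_ -gt 180 -and $_ -le 270 } { [FileAge]::NineMonth; break }
        { $_ -gt 90 -and $_ -le 180 } { [FileAge]::SixMonth; break }
        { $_ -gt 30 -and $_ -le 90 } { [FileAge]::ThreeMonth; break }
        { $_ -gt 7 -and $_ -le 30 } { [FileAge]::OneMonth; break }
        { $_ -gt 1 -and $_ -le 7 } { [FileAge]::OneWeek; break }
        { $_ -le 1 } { [FileAge]::OneDay; break }
    }
}

# Made-up ages in days
$buckets = 3, 400, 120, 0.5 | ForEach-Object { Get-AgeBucket -TotalDays $_ }

# Sorting the enum values uses the underlying numbers
$byValue = $buckets | Sort-Object

# Sorting the names as strings is alphabetical instead
$byName = $buckets | Sort-Object -Property { $_.ToString() }

"Enum sort  : $($byValue -join ', ')"
"String sort: $($byName -join ', ')"
"YearPlus = $([int][FileAge]::YearPlus), OneDay = $([int][FileAge]::OneDay)"
```

The enum sort runs from the oldest bucket to the newest, which is the order the enum members were declared in. The string sort is merely alphabetical.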
I could have created a custom object, even using a PowerShell class. But I'm happy to piggyback on the System.IO.FileInfo type. However, the default formatting doesn't know anything about my custom property. I know I'll want to create my own formatting so I need to insert a new type name.
$file.psobject.TypeNames.insert(0,"FileAgingInfo")
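If you want to experiment with this technique outside of the function, here is a minimal sketch that uses a throwaway temporary file. The AgeGroup value is made up for the demonstration.

```powershell
# Create a throwaway file to experiment with; New-TemporaryFile returns a FileInfo object
$file = New-TemporaryFile

# Add the custom property and insert the custom type name at the top of the list
$file | Add-Member -MemberType NoteProperty -Name AgeGroup -Value "9/2021"
$file.psobject.TypeNames.Insert(0, "FileAgingInfo")

# The first type name is what the formatting system matches on
$file.psobject.TypeNames[0]

# The object is still a FileInfo, so all the original members remain
$file -is [System.IO.FileInfo]
$file.AgeGroup

# Clean up the temporary file
Remove-Item -Path $file.FullName
```

Inserting the name at index 0 matters because PowerShell checks the type names in order when it looks for a matching format view.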
Planning ahead, I know I will be using grouping in my custom formatting, which means the objects need to be sorted. Normally, I'd leave sorting out of a function, but in this case, I need it.
$list | Sort-Object -Property $sort,DirectoryName,$property,Name
I am sorting first on my $sort variable, then the directory name, then the CreationTime or LastWriteTime property, and finally the file name.
Here's the complete function:
#requires -version 5.1
#declaring a namespace here makes the code simpler later on
Using namespace System.Collections.Generic
#define an enumeration that will be used in a new custom property
Enum FileAge {
    YearPlus
    Year
    NineMonth
    SixMonth
    ThreeMonth
    OneMonth
    OneWeek
    OneDay
}
Function Get-FileAgeGroup {
    [cmdletbinding()]
    [alias("gfa")]
    [OutputType("FileAgingInfo")]
    Param(
        [Parameter(Mandatory,Position=0,ValueFromPipeline,HelpMessage = "The path to a file. If you pipe from a Get-ChildItem command, be sure to use the -File parameter.")]
        [ValidateNotNullOrEmpty()]
        [System.IO.FileInfo]$FilePath,
        [Parameter(HelpMessage = "Specify if grouping by the file's CreationTime or LastWriteTime property. The default is LastWriteTime.")]
        [ValidateSet("CreationTime","LastWriteTime")]
        [ValidateNotNullOrEmpty()]
        [string]$Property = "LastWriteTime",
        [Parameter(HelpMessage = "How do you want the files organized? The default is Month.")]
        [ValidateNotNullOrEmpty()]
        [ValidateSet("Month","Year","Days")]
        [string]$GroupBy = "Month"
    )
    Begin {
        Write-Verbose "[$((Get-Date).TimeofDay) BEGIN ] Starting $($myinvocation.mycommand)"
        Write-Verbose "[$((Get-Date).TimeofDay) BEGIN ] Grouping by $GroupBy on the $Property property"
        #initialize a list to contain all piped in files
        #this code is shorter because of the Using statement at the beginning of this file
        $list = [List[System.IO.FileInfo]]::new()
        #get a static value for now
        $now = Get-Date
    } #begin
    Process {
        Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Adding $FilePath"
        #add each file to the list
        $list.Add($FilePath)
    } #process
    End {
        #now process all the files and add aging properties
        Write-Verbose "[$((Get-Date).TimeofDay) END ] Sorting $($list.count) files"
        #add custom properties based on the age grouping
        foreach ($file in $list) {
            switch ($GroupBy) {
                "Month" {
                    $value = "$($file.$property.month)/$($file.$property.year)"
                    $sort = {$_.AgeGroup -as [datetime]}
                }
                "Year" {
                    $value = $file.$property.year
                    $sort = "AgeGroup"
                }
                "Days" {
                    $age = New-TimeSpan -start $file.$property -end $now
                    $value = switch ($age.totaldays -as [int]) {
                        {$_ -ge 365} {[FileAge]::YearPlus; break}
                        {$_ -gt 270 -AND $_ -lt 365} {[FileAge]::Year; break}
                        {$_ -gt 180 -AND $_ -le 270} {[FileAge]::NineMonth; break}
                        {$_ -gt 90 -AND $_ -le 180} {[FileAge]::SixMonth; break}
                        {$_ -gt 30 -AND $_ -le 90} {[FileAge]::ThreeMonth; break}
                        {$_ -gt 7 -AND $_ -le 30} {[FileAge]::OneMonth; break}
                        {$_ -gt 1 -AND $_ -le 7} {[FileAge]::OneWeek; break}
                        {$_ -le 1} {[FileAge]::OneDay; break}
                    }
                    $sort = "AgeGroup"
                }
            } #switch
            #add a custom property to each file object
            $file | Add-Member -MemberType NoteProperty -Name AgeGroup -Value $value
            #insert a custom type name which will be used by the custom format file
            $file.psobject.TypeNames.insert(0,"FileAgingInfo")
        } #foreach file
        #write the results to the pipeline. Sorting results so that the default
        #formatting will be displayed properly
        $list | Sort-Object -Property $sort,DirectoryName,$property,Name
        Write-Verbose "[$((Get-Date).TimeofDay) END ] Ending $($myinvocation.mycommand)"
    } #end
} #close Get-FileAgeGroup
Formatted Output
The whole point of this work is to make it easy for me to see files grouped by age.
This is why I needed a custom typename and pre-sorted output. I used New-PSFormatXML to create a custom format ps1xml file.
<!--
Format type data generated 09/17/2021 11:54:13 by PROSPERO\Jeff
This file was created using the New-PSFormatXML command that is part
of the PSScriptTools module.
https://github.com/jdhitsolutions/PSScriptTools
-->
<Configuration>
<ViewDefinitions>
<View>
<!--Created 09/17/2021 11:54:13 by PROSPERO\Jeff-->
<Name>default</Name>
<ViewSelectedBy>
<TypeName>FileAgingInfo</TypeName>
</ViewSelectedBy>
<GroupBy>
<!--
You can also use a scriptblock to define a custom property name.
You must have a Label tag.
<ScriptBlock>$_.machinename.toUpper()</ScriptBlock>
<Label>Computername</Label>
Use <Label> to set the displayed value.
-->
<PropertyName>AgeGroup</PropertyName>
<Label>AgeGroup</Label>
</GroupBy>
<TableControl>
<!--Delete the AutoSize node if you want to use the defined widths.-->
<AutoSize />
<TableHeaders>
<TableColumnHeader>
<Label>Directory</Label>
<Width>16</Width>
<Alignment>left</Alignment>
</TableColumnHeader>
<TableColumnHeader>
<Label>Created</Label>
<Width>24</Width>
<Alignment>left</Alignment>
</TableColumnHeader>
<TableColumnHeader>
<Label>Modified</Label>
<Width>23</Width>
<Alignment>left</Alignment>
</TableColumnHeader>
<TableColumnHeader>
<Label>Length</Label>
<Width>11</Width>
<Alignment>right</Alignment>
</TableColumnHeader>
<TableColumnHeader>
<Label>Name</Label>
<Width>10</Width>
<Alignment>left</Alignment>
</TableColumnHeader>
</TableHeaders>
<TableRowEntries>
<TableRowEntry>
<TableColumnItems>
<!--
By default the entries use property names, but you can replace them with scriptblocks.
<ScriptBlock>$_.foo /1mb -as [int]</ScriptBlock>
-->
<TableColumnItem>
<PropertyName>DirectoryName</PropertyName>
</TableColumnItem>
<TableColumnItem>
<PropertyName>CreationTime</PropertyName>
</TableColumnItem>
<TableColumnItem>
<PropertyName>LastWriteTime</PropertyName>
</TableColumnItem>
<TableColumnItem>
<PropertyName>Length</PropertyName>
</TableColumnItem>
<TableColumnItem>
<PropertyName>Name</PropertyName>
</TableColumnItem>
</TableColumnItems>
</TableRowEntry>
</TableRowEntries>
</TableControl>
</View>
</ViewDefinitions>
</Configuration>
At the end of my ps1 file, I import the format file into my PowerShell session.
Update-FormatData $psscriptroot\FileAge.format.ps1xml
I only have a single view for now.
Because I'm using the enum, my files are properly sorted and I get meaningful output. I might need to tweak the format file a bit.
Or I can adjust my PowerShell expression.
As with many of my posts, the end result isn't as important as the techniques I used to get there. Comments and questions are welcome.
nice post
some performance suggestions:
Get-ChildItem can recursively search folders but it is very slow, especially in Windows PowerShell. A much faster file enumeration uses [System.IO.DirectoryInfo] and its method GetFiles().
more cool things: https://powershell.one/tricks/filesystem/finding-duplicate-files
You will always get better performance using native .NET code. Cmdlets offer ease of use among many other features, which is a trade-off. I still recommend that people stick to cmdlets unless there is a compelling use case to resort to using .NET. And even then, if you really, really need that kind of performance, you are probably better off building a compiled .NET application. PowerShell scripts will always be slower. Still, I was curious and tested. Running ‘get-childitem c:\scripts -file -recurse | Get-FileAgeGroup’ took 3.7 seconds. Using [System.IO.DirectoryInfo] to get the files and pipe to my function took over 8 seconds. That’s why I also encourage people to test for themselves with Measure-Command.
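A comparison along those lines might be sketched like this. It is not the exact test described above: it only times the file enumeration itself, it is not recursive, and it uses the temp folder as a stand-in for whatever folder you want to measure.

```powershell
# Assumption: the temp folder stands in for the folder you want to test
$path = [System.IO.Path]::GetTempPath()

# Time the cmdlet approach
$cmdletTime = Measure-Command {
    Get-ChildItem -Path $path -File | Out-Null
}

# Time the .NET approach suggested in the comment
$dotNetTime = Measure-Command {
    [System.IO.DirectoryInfo]::new($path).GetFiles() | Out-Null
}

"Get-ChildItem: {0:n0} ms" -f $cmdletTime.TotalMilliseconds
"GetFiles()   : {0:n0} ms" -f $dotNetTime.TotalMilliseconds
```

Keep in mind the two approaches are not identical. For example, Get-ChildItem skips hidden files unless you use -Force, while GetFiles() returns them.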