In the past I've posted a few PowerShell functions that provide all types of file and folder information. The other day I had a reason to revisit one of them and I spent a little time revising and expanding it. This new function, Get-Extension, will search a given folder and create a custom object for each file extension showing the total number of files, the total size, the average size, the smallest size and the largest size. At its core, the function takes output from Get-ChildItem and pipes it to Measure-Object. But I've incorporated features such as filtering and the ability to run the entire function as a background job.
By default, the function searches the top level of your $ENV:Temp folder and returns a custom object for each file type.
PS C:\> get-extension | sort TotalSize -Descending | Select -first 1
Analyzing C:\Users\Jeff\AppData\Local\Temp

Average   : 402240.857142857
Largest   : 2655668
Path      : C:\Users\Jeff\AppData\Local\Temp
TotalSize : 2815686
Extension : log
Count     : 7
Smallest  : 134
Here's how this works.
Function Get-Extension {

[cmdletbinding(DefaultParameterSetName="*")]

Param (
    [Parameter(Position=0)]
    [validateNotNullorEmpty()]
    [string]$Path=$env:temp,
    [Parameter(ParameterSetName="Filter")]
    [string[]]$Include,
    [Parameter(ParameterSetName="Filter")]
    [string[]]$Exclude,
    [Parameter()]
    [switch]$Recurse,
    [switch]$Force,
    [switch]$AsJob
)

Write-Verbose "Starting $($myinvocation.mycommand)"

#Automatically turn on recursion if -Include or -Exclude is used
if ($Include -OR $Exclude) {
    $Recurse=$True
}

#test and see if path is valid
if (Test-Path -Path $path) {
    #Verify we are using a FileSystem path
    if ((Get-Item -Path $path).PSProvider.Name -ne "FileSystem") {
        Write-Warning "$($path.ToUpper()) is not a valid file system path."
        Return
    }

    #define a variable for message to be displayed when the search begins
    $msg="Analyzing $path"

    #build a command string based on parameters
    $cmd="Get-ChildItem -Path $path"
    if ($recurse) {
        $msg+=" recursively"
        $cmd+=" -recurse"
    }
    if ($include) {
        $ofs=","
        $msg+=" include $($Include -as [string])"
        $cmd+=" -include $($Include -as [string])"
    }
    if ($exclude) {
        $ofs=","
        $msg+=" exclude $($Exclude -as [string])"
        $cmd+=" -exclude $($Exclude -as [string])"
    }
    if ($force) {
        $cmd+=" -force"
    }

    #wrap the core commands into a scriptblock so it can be executed directly
    #or used with Start-Job
    $sb={Param([string]$cmd,[string]$Path)
        Write-Host $msg -ForegroundColor Green

        #get all files but exclude folders
        Write-Verbose "Executing: $cmd"
        $files= Invoke-Expression $cmd | where {-NOT $_.PSIsContainer}

        #put files into groups based on extension
        $group=$files | Group-Object -Property Extension

        Write-Verbose "Found $($group.count) file extensions"

        foreach ($extension in ($group | Sort Name)) {
            #calculate statistics for each group
            Write-Verbose "Measuring $($extension.name)"
            $stats=$extension.group | measure-object -Average -Sum -Minimum -Maximum -Property length

            #trim off the period from the extension if it exists
            if ($extension.name -match "\.") {
                $ext=$extension.name.Substring(1)
            }
            else {
                $ext=$extension.name
            }

            #write a custom object to the pipeline
            New-Object -TypeName PSObject -Property @{
                Count=$stats.count
                #trim off the period
                Extension=$ext
                TotalSize=$stats.sum
                Largest=$stats.maximum
                Smallest=$stats.minimum
                Average=$stats.average
                Path=$Path
            }
        } #foreach
    }#$sb

    if ($AsJob) {
        Write-Verbose "Creating background job"
        Start-Job -ScriptBlock $sb -ArgumentList $cmd,$path
    }
    else {
        Invoke-Command -ScriptBlock $sb -ArgumentList $cmd,$path
    }
} #if
else {
    Write-Warning "Failed to find $path"
}

Write-Verbose "Ending $($myinvocation.mycommand)"

} #end function
The function uses a few parameters from Get-ChildItem, like -Include, -Exclude and -Force. If you use one of the filtering parameters, then you also need to use -Recurse. You can specify it, or the function will automatically enable it if it detects -Include or -Exclude.
if ($Include -OR $Exclude) { $Recurse=$True }
Obviously (I hope), this only works on the file system. But I went ahead and added some code to verify that the specified path is from the FileSystem provider.
if ((Get-Item -Path $path).PSProvider.Name -ne "FileSystem") {
    Write-Warning "$($path.ToUpper()) is not a valid file system path."
    Return
}
Normally, I'm not a big fan of Return. But in this situation it is exactly what I want since I want to terminate the pipeline. I could have also thrown an exception here but decided not to get that wild. Assuming the path is valid, the function builds a command string based on the specified parameters.
#build a command string based on parameters
$cmd="Get-ChildItem -Path $path"
if ($recurse) {
    $msg+=" recursively"
    $cmd+=" -recurse"
}
if ($include) {
    $ofs=","
    $msg+=" include $($Include -as [string])"
    $cmd+=" -include $($Include -as [string])"
}
if ($exclude) {
    $ofs=","
    $msg+=" exclude $($Exclude -as [string])"
    $cmd+=" -exclude $($Exclude -as [string])"
}
if ($force) {
    $cmd+=" -force"
}
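One caveat with a command string: the path is inserted unquoted, so a folder with a space in its name (C:\Program Files, for example) would be parsed as two separate arguments when the string is invoked. As an alternative, and not what Get-Extension does, here is a minimal sketch that collects the same parameters in a hashtable and splats them to Get-ChildItem. The function name Get-FileList and the variable $gciParams are made up for illustration.

```powershell
# Hypothetical helper (not part of Get-Extension): gathers the same
# Get-ChildItem parameters in a hashtable and splats them.
function Get-FileList {
    [CmdletBinding()]
    param(
        [string]$Path = $env:TEMP,
        [string[]]$Include,
        [string[]]$Exclude,
        [switch]$Recurse,
        [switch]$Force
    )

    # Start with the parameter that is always present
    $gciParams = @{ Path = $Path }

    # Add optional parameters only when they were supplied.
    # Recursion is switched on for -Include/-Exclude, as in the original.
    if ($Recurse -or $Include -or $Exclude) { $gciParams.Recurse = $true }
    if ($Include) { $gciParams.Include = $Include }
    if ($Exclude) { $gciParams.Exclude = $Exclude }
    if ($Force)   { $gciParams.Force   = $true }

    # Splat the hashtable and drop folders, keeping only files
    Get-ChildItem @gciParams | Where-Object { -not $_.PSIsContainer }
}
```

Something like Get-FileList -Path 'C:\Program Files' -Include *.log should then work without any extra quoting, because the path travels as a single string value rather than as text to be re-parsed. The rest of this walkthrough sticks with the command string the function actually uses.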
The function will invoke this string using Invoke-Expression and filter out any folders since all I care about are files.
#get all files but exclude folders
Write-Verbose "Executing: $cmd"
$files= Invoke-Expression $cmd | where {-NOT $_.PSIsContainer}

#put files into groups based on extension
$group=$files | Group-Object -Property Extension

The results are then grouped using Group-Object. Each extension group is piped to Measure-Object to calculate the statistics based on the file's length property.
foreach ($extension in ($group | Sort Name)) {
    #calculate statistics for each group
    Write-Verbose "Measuring $($extension.name)"
    $stats=$extension.group | measure-object -Average -Sum -Minimum -Maximum -Property length

Lastly, the function creates a custom object representing each file extension using the New-Object cmdlet.
New-Object -TypeName PSObject -Property @{
    Count=$stats.count
    #trim off the period
    Extension=$ext
    TotalSize=$stats.sum
    Largest=$stats.maximum
    Smallest=$stats.minimum
    Average=$stats.average
    Path=$Path
}
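To see the group, measure and build-an-object steps working together without touching the disk, here is a small sketch that runs them against stand-in data. The three sample objects and their sizes are invented for illustration; real input would be the file objects from Get-ChildItem. It assumes PowerShell 3.0 or later for the [pscustomobject] shortcut.

```powershell
# Stand-in data: objects carrying the two properties the function relies on
$files = @(
    [pscustomobject]@{ Name = 'a.log'; Extension = '.log'; Length = 100 }
    [pscustomobject]@{ Name = 'b.log'; Extension = '.log'; Length = 300 }
    [pscustomobject]@{ Name = 'c.txt'; Extension = '.txt'; Length = 50 }
)

$results = foreach ($extension in ($files | Group-Object -Property Extension | Sort-Object Name)) {
    # One Measure-Object call yields count, sum, average, min and max
    $stats = $extension.Group |
        Measure-Object -Property Length -Average -Sum -Minimum -Maximum

    # Trim the leading period, as the function does
    if ($extension.Name -match "\.") {
        $ext = $extension.Name.Substring(1)
    }
    else {
        $ext = $extension.Name
    }

    # Emit one custom object per extension
    New-Object -TypeName PSObject -Property @{
        Extension = $ext
        Count     = $stats.Count
        TotalSize = $stats.Sum
        Largest   = $stats.Maximum
        Smallest  = $stats.Minimum
        Average   = $stats.Average
    }
}

# Ordinary objects, so they sort and select like anything else
$results | Sort-Object TotalSize -Descending | Select-Object Extension, Count, TotalSize, Average
```

With this sample data the log entry comes out with a Count of 2, a TotalSize of 400 and an Average of 200, and the txt entry with a Count of 1 and a TotalSize of 50.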
Because I'm writing an object to the pipeline, you can further sort, filter, export or whatever. This is what makes PowerShell so flexible and valuable to IT Pros.
One thing I quickly realized was that scanning a large folder, such as the Documents folder or a UNC file share, could take a long time. I could use Start-Job with my original function, but it was a bit awkward. So I decided to include -AsJob as a parameter and move the job command into the function itself. This works because I take the entire core command and wrap it in a script block.
$sb={Param([string]$cmd,[string]$Path)
    Write-Host $msg -ForegroundColor Green

    #get all files but exclude folders
    Write-Verbose "Executing: $cmd"
    $files= Invoke-Expression $cmd | where {-NOT $_.PSIsContainer}
    ...
Because of scope, the scriptblock needs parameters so I can pass it my command string and the Path variable, both of which are used within the scriptblock. After $sb has been defined, if -AsJob was specified, the function uses Start-Job to create a background job. Otherwise, it uses Invoke-Command to execute it interactively.
if ($AsJob) {
    Write-Verbose "Creating background job"
    Start-Job -ScriptBlock $sb -ArgumentList $cmd,$path
}
else {
    Invoke-Command -ScriptBlock $sb -ArgumentList $cmd,$path
}
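One detail worth checking if you adapt this: the scriptblock also writes $msg, which is defined in the function but is not one of the scriptblock's parameters. Run locally with Invoke-Command, the scriptblock can still read it from the calling scope, but a background job runs in a separate process, so a variable that isn't passed in should come up empty there. The sketch below illustrates the difference with two throwaway scriptblocks. The names $implicit and $explicit and the message text are made up, and -Wait and -AutoRemoveJob assume PowerShell 3.0 or later.

```powershell
$msg = 'Analyzing C:\Temp'   # illustrative value only

# Relies on the caller's scope to find $msg
$implicit = { "Message: '$msg'" }

# Receives the text as a parameter instead
$explicit = { Param([string]$Message) "Message: '$Message'" }

# Run locally: both scriptblocks can see the text
$localImplicit = Invoke-Command -ScriptBlock $implicit
$localExplicit = Invoke-Command -ScriptBlock $explicit -ArgumentList $msg

# Run as background jobs: only the explicit version receives the text
$jobImplicit = Start-Job -ScriptBlock $implicit |
    Receive-Job -Wait -AutoRemoveJob
$jobExplicit = Start-Job -ScriptBlock $explicit -ArgumentList $msg |
    Receive-Job -Wait -AutoRemoveJob

$localImplicit, $localExplicit, $jobImplicit, $jobExplicit
```

If the same holds in your environment, one possible adjustment is to add a third parameter to $sb and pass $msg alongside $cmd and $path in -ArgumentList.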
Use the normal job cmdlets to get the results and manage the job. But now I can run something like this:
PS C:\> Get-Extension $env:userprofile\documents -include *.doc*,*.ppt*,*.xls* -AsJob

Id   Name    State     HasMoreData   Location    Command
--   ----    -----     -----------   --------    -------
31   Job31   Running   True          localhost   Param([string]$cmd)...
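Getting the results back uses the standard job cmdlets. Here is a minimal sketch of that round trip. It uses a stand-in job that emits two invented objects shaped like Get-Extension's output, so it runs on its own; with the real function you would capture the job from Get-Extension ... -AsJob instead.

```powershell
# Stand-in for: $job = Get-Extension <path> -AsJob
# The extensions and numbers below are made up for illustration.
$job = Start-Job -ScriptBlock {
    New-Object -TypeName PSObject -Property @{ Extension = 'docx'; Count = 12; TotalSize = 480000 }
    New-Object -TypeName PSObject -Property @{ Extension = 'xlsx'; Count = 3;  TotalSize = 90000 }
}

# Block until the job finishes
Wait-Job -Job $job | Out-Null

# Collect the output; -Keep leaves the results in the job
# so they can be received again later
$data = Receive-Job -Job $job -Keep

# The results come back as deserialized objects that sort and filter as usual
$data | Sort-Object TotalSize -Descending | Select-Object Extension, Count, TotalSize

# Remove the job once the results are no longer needed
Remove-Job -Job $job
```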
As always, I hope you'll let me know how this works for you. The complete script has comment-based help and an optional line to uncomment at the end to create an alias for the function.
Download Get-Extension.
Quick tip. If you want to replace . at the start of a string instead of doing this
if ($extension.name -match "\.") {
    $ext=$extension.name.Substring(1)
}
else {
    $ext=$extension.name
}
You can do
$ext = $extension.name -replace "^\.",""
Of course I’d write the bulk of it as one enormous pipe-line, but your way is easier to read 🙂