Since this is Halloween weekend in the United States, I thought I'd offer up a PowerShell solution to a scary task - finding zombie files. Ok, maybe these aren't really living dead files, but rather files with a 0-byte length. It is certainly possible that you may intentionally want a 0 length file. But perhaps they are artifacts or accidents that you'd like to clean up. Here's one approach you might take.
Before we dig into this, let me emphasize something. Many of my posts are intended to be educational in nature. I often take a scenario, like finding 0-byte files, as a starting point to teach concepts and techniques. There are almost always multiple ways to achieve a task. My code is never intended to be run in production as-is. Rather, you should take away how I am doing something and why. The concepts and techniques are more important than solving the ostensible goal.
Using CIM
For this task, I thought I'd turn to Get-CimInstance. You may not be aware of it, but there is a WMI class called CIM_Datafile that you can query to find files on a computer. In most situations, and to be honest I'm not sure of one where this wouldn't be the case, all files are registered in the CIM repository. Here's what one of these objects looks like:
Status : OK
Name : c:\work\samples\w.txt
Caption : c:\work\samples\w.txt
Description : c:\work\samples\w.txt
InstallDate : 10/9/2020 3:35:49 PM
AccessMask : 18809343
Archive : True
Compressed : False
CompressionMethod :
CreationClassName : CIM_LogicalFile
CreationDate : 10/9/2020 3:35:49 PM
CSCreationClassName : Win32_ComputerSystem
CSName : PROSPERO
Drive : c:
EightDotThreeFileName : c:\work\samples\w.txt
Encrypted : False
EncryptionMethod :
Extension : txt
FileName : w
FileSize : 0
FileType : Text Document
FSCreationClassName : Win32_FileSystem
FSName : NTFS
Hidden : False
InUseCount :
LastAccessed : 10/30/2020 11:49:34 AM
LastModified : 10/9/2020 6:56:31 PM
Path : \work\samples\
Readable : True
System : False
Writeable : True
Manufacturer :
Version :
PSComputerName :
Let's say I want to find all files in C:\work\samples that have a file size of 0. I can write a WMI filter like this:
"Drive='c:' AND path = '\\work\\samples\\' AND filesize=0"
When creating WMI/CIM queries, you need to escape backslashes, so the path value \work\samples\ becomes \\work\\samples\\. Also remember that the operators in a WMI/CIM query are the legacy operators, such as = and LIKE, not PowerShell operators like -eq and -like.
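The escaping rule can be wrapped in a small helper. This is a minimal sketch, not part of the original code: the function name ConvertTo-WqlFolderFilter is hypothetical, and it assumes a fully qualified local path that starts with a drive letter.

```powershell
# Hypothetical helper: build a WQL filter for zero-length files in one folder.
# Assumes a fully qualified local path such as C:\work\samples (no UNC paths).
function ConvertTo-WqlFolderFilter {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )
    # The drive is the first two characters, e.g. "C:"
    $drive = $Path.Substring(0, 2)
    # Everything after the drive, normalized to end in exactly one backslash
    $folder = $Path.Substring(2).TrimEnd('\') + '\'
    # Double every backslash for WQL. String.Replace() is literal, not regex.
    $escaped = $folder.Replace('\', '\\')
    "Drive='$drive' AND Path='$escaped' AND FileSize=0"
}

ConvertTo-WqlFolderFilter -Path 'C:\work\samples'
```

The resulting string can be passed to the -Filter parameter of Get-CimInstance. Backslashes need no special treatment inside a PowerShell string; the doubling is purely for the benefit of WQL.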
In PowerShell, we stress the importance of filtering early. If a command offers a way to filter, then use it. Early filtering is almost always better than getting everything and then piping to Where-Object. This is not to imply that using Where-Object is a bad thing. Sometimes you have no choice, or intentionally need to pipe to Where-Object. Just be sure you know why you are using Where-Object.
Here's my simple test.
I know that my basic query works and I get the result I expect. Now, I want to take a step back and find all 0-length files in C:\Work.
Get-CimInstance -ClassName CIM_Datafile -Filter "Drive='c:' AND path = '\\work\\' AND filesize=0" | Select-Object Name,FileSize
Hmmm. This query found the file in the root of C:\Work, but it didn't find the files in the subfolders, because the = operator only matches that exact path. The more specific you can make your WMI queries, typically the faster they run. However, in this case, I need to extend the query a bit by using a wildcard. I want to find files where the path starts with \\work\\.
Get-CimInstance -ClassName CIM_Datafile -Filter "Drive='c:' AND path LIKE '\\work\\%' AND filesize=0" | Select-Object Name,FileSize
In a WMI/CIM query, use % as the wildcard character in place of the traditional *. Also note that I changed the operator to LIKE. The operators are not case-sensitive. I used upper-case so that they would stand out. One significant drawback to this approach is that it is much slower than using "equal to" comparisons. But it should eventually work.
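One thing to watch with LIKE: in WQL the underscore is also a wildcard (it matches any single character), and a literal underscore has to be wrapped in square brackets. The following is a rough sketch of building a LIKE pattern for a folder name that contains underscores. The function name ConvertTo-WqlLikePrefix and the sample folder are hypothetical, and the sketch does not handle other pattern characters such as % or [ in folder names.

```powershell
# Hypothetical helper: turn a folder path (without the drive) into a WQL LIKE
# pattern that matches the folder and everything beneath it.
# Assumes $Folder looks like \work\my_files with no trailing backslash.
function ConvertTo-WqlLikePrefix {
    param(
        [Parameter(Mandatory)]
        [string]$Folder
    )
    # Double the backslashes first, then bracket literal underscores
    $escaped = $Folder.Replace('\', '\\').Replace('_', '[_]')
    # Append the escaped trailing backslash and the % wildcard
    $escaped + '\\%'
}

$pattern = ConvertTo-WqlLikePrefix -Folder '\work\my_files'
"Drive='c:' AND Path LIKE '$pattern' AND FileSize=0"
```

Without the brackets, a pattern for \work\my_files would also match a folder such as \work\my-files, which may or may not matter for a cleanup task.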
Caveats and Tips
When using WMI/CIM queries there are a few things to keep in mind. First, there's no guarantee that every single object (in this case file) has been registered with the CIM repository. It is theoretically possible that your query will miss files. In this specific use-case, you can always enumerate folders with Get-ChildItem.
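For comparison, here is a minimal sketch of the Get-ChildItem approach mentioned above. The function name Get-ZeroLengthFileFS is made up for this example, and the demo builds a throwaway folder under the temp directory so that it does not depend on any real data.

```powershell
# Hypothetical file-system-based alternative to the CIM query
function Get-ZeroLengthFileFS {
    param(
        [string]$Path = '.',
        [switch]$Recurse
    )
    # -File skips directories; Length is the file size in bytes
    Get-ChildItem -Path $Path -File -Recurse:$Recurse -ErrorAction SilentlyContinue |
        Where-Object { $_.Length -eq 0 }
}

# Demo: create a scratch folder with one empty and one non-empty file
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ("zombie-demo-" + [guid]::NewGuid())
New-Item -Path $demo -ItemType Directory | Out-Null
New-Item -Path (Join-Path $demo 'empty.txt') -ItemType File | Out-Null
Set-Content -Path (Join-Path $demo 'full.txt') -Value 'not a zombie'

$found = Get-ZeroLengthFileFS -Path $demo
$found | Select-Object Name, Length

# Clean up the scratch folder
Remove-Item -Path $demo -Recurse -Force
```

This is a case where Where-Object is the right tool, since Get-ChildItem has no parameter for filtering on file size.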
As I was working on this, I discovered that if the folder is actually using a reparse point, the query takes considerably longer and might not even work as expected. For example, my C:\Scripts folder is actually a link to OneDrive.
PS C:\> get-item c:\scripts | Select Target
Target
------
{D:\OneDrive\Scripts}
I found that if I used C:\Scripts in my query, I got incomplete results. But if I used D:\OneDrive\Scripts I got the results I expected much faster.
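The substitution can be sketched as a small helper that returns the link target when there is one and the original path otherwise. The name Resolve-FolderTarget is hypothetical. The sketch assumes the Target property that PowerShell adds to file system items, and it does not try to handle relative link targets.

```powershell
# Hypothetical helper: prefer the link target of a folder when one exists
function Resolve-FolderTarget {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )
    $item = Get-Item -Path $Path
    if ($item.Target) {
        # Target can be a collection (as in the output above), so take the first entry
        @($item.Target)[0]
    }
    else {
        # Not a link: use the path as-is
        $item.FullName
    }
}

# Demo with an ordinary folder, which should come back unchanged
$plain = Join-Path ([System.IO.Path]::GetTempPath()) ("plain-" + [guid]::NewGuid())
New-Item -Path $plain -ItemType Directory | Out-Null
$resolved = Resolve-FolderTarget -Path $plain
$resolved
Remove-Item -Path $plain -Force
```

Calling Get-Item once and reusing the result avoids querying the file system twice for the same folder.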
Another way to speed up the process is to limit the properties that Get-CimInstance needs to get. This is another type of filtering. The first time I searched C:\Work for 0-byte files it took 8 minutes. But since I only need a few properties, I can refine my command:
Get-CimInstance -ClassName CIM_Datafile -Filter "Drive='c:' AND path LIKE '\\work\\%' AND filesize=0" -property Name,FileSize | Select-Object Name,FileSize
I got the same results, but this time in 39 seconds. I'll admit that large queries can sometimes take a long time and sometimes run faster than you expect. As a rule, though, using LIKE always slows down the query.
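To compare the two forms on your own system, something like the following sketch could be used. It assumes Windows with the CIM cmdlets available and reuses the C:\work filter from above. Timings will vary from run to run, and a second query over the same folder may benefit from caching, so treat the numbers as rough indicators only.

```powershell
# Same filter as above; backslashes are doubled for WQL, not for PowerShell
$filter = "Drive='c:' AND Path LIKE '\\work\\%' AND FileSize=0"

# Splat the shared parameters so both runs use an identical query
$common = @{
    ClassName = 'CIM_DataFile'
    Filter    = $filter
}

# Only run the comparison where Get-CimInstance exists (it is Windows-only)
if (Get-Command -Name Get-CimInstance -ErrorAction SilentlyContinue) {
    $all  = Measure-Command { Get-CimInstance @common }
    $some = Measure-Command { Get-CimInstance @common -Property Name, FileSize }
    "All properties : {0:n1} seconds" -f $all.TotalSeconds
    "Two properties : {0:n1} seconds" -f $some.TotalSeconds
}
```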
Creating a Function
With all of this in mind, I put together a function to find 0-byte files in a given path.
Function Get-ZeroLengthFiles {
    [CmdletBinding()]
    [alias("gzf", "zombie")]
    Param(
        [Parameter(Position = 0)]
        [ValidateScript( { Test-Path -Path $_ })]
        [string]$Path = ".",
        [switch]$Recurse
    )
    Begin {
        Write-Verbose "[$((Get-Date).TimeofDay) BEGIN ] Starting $($myinvocation.mycommand)"
        #select a subset of properties which speeds things up
        $get = "Name", "CreationDate", "LastModified", "FileSize"
        $cimParams = @{
            Classname   = "CIM_DATAFILE"
            Property    = $get
            ErrorAction = "Stop"
            Filter      = ""
        }
    } #begin
    Process {
        Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Using specified path $Path"
        #test if folder is using a link or reparse point
        if ( (Get-Item -path $Path).Target) {
            $target = (Get-Item -path $Path).Target
            Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] A reparse point was detected pointing towards $target"
            #re-define $path to use the target
            $Path = $Target
        }
        #convert the path to a file system path
        Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Converting $Path"
        $cPath = Convert-Path $Path
        Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Converted to $cPath"
        #trim off any trailing \ if cPath is other than a drive root like C:\
        if ($cpath.Length -gt 3 -AND $cpath -match "\\$") {
            $cpath = $cpath -replace "\\$", ""
        }
        #parse out the drive
        $drive = $cpath.Substring(0, 2)
        Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Using Drive $drive"
        #get the folder path from the first \
        $folder = $cpath.Substring($cpath.IndexOf("\")).replace("\", "\\")
        Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Using folder $folder (escaped)"
        if ($folder -match "\w+" -AND $PSBoundParameters.ContainsKey("Recurse")) {
            #create the filter to use the wildcard for recursing
            $filter = "Drive='$drive' AND Path LIKE '$folder\\%' AND FileSize=0"
        }
        elseif ($folder -match "\w+") {
            #create an exact path pattern
            $filter = "Drive='$drive' AND Path='$folder\\' AND FileSize=0"
        }
        else {
            #create a root drive filter for a path like C:\
            $filter = "Drive='$drive' AND Path LIKE '\\%' AND FileSize=0"
        }
        #add the filter to the parameter hashtable
        $cimParams.filter = $filter
        Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Looking for zero length files with filter $filter"
        #initialize a counter to keep track of the number of files found
        $i=0
        Try {
            Write-Host "Searching for zero length files in $cpath. This might take a few minutes..." -ForegroundColor magenta
            #find files matching the query and create a custom object for each
            Get-CimInstance @cimParams | ForEach-Object {
                #increment the counter
                $i++
                #create a custom object
                [PSCustomObject]@{
                    PSTypeName   = "cimZeroLengthFile"
                    Path         = $_.Name
                    Size         = $_.FileSize
                    Created      = $_.CreationDate
                    LastModified = $_.LastModified
                }
            }
        }
        Catch {
            Write-Warning "Failed to run query. $($_.exception.message)"
        }
        if ($i -eq 0) {
            #display a message if no files were found
            Write-Host "No zero length files were found in $cpath." -ForegroundColor yellow
        }
        else {
            Write-Verbose "[$((Get-Date).TimeofDay) PROCESS] Found $i matching files"
        }
    } #process
    End {
        Write-Verbose "[$((Get-Date).TimeofDay) END ] Ending $($myinvocation.mycommand)"
    } #end
}
I hope that the comments explain what I am doing and why. I even added a fun command alias for the season.
In the function I'm writing a custom object to the pipeline so that I can use the results in PowerShell.
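As an illustration of what "using the results" can look like, here is a short sketch that summarizes results by file extension. The objects below are stand-ins shaped like the function's output; the paths and dates are invented purely so the example runs anywhere.

```powershell
# Stand-in objects with the same properties the function emits (sample data only)
$results = @(
    [PSCustomObject]@{ PSTypeName = 'cimZeroLengthFile'; Path = 'c:\work\a.txt'; Size = 0; Created = [datetime]'2020-10-09'; LastModified = [datetime]'2020-10-09' }
    [PSCustomObject]@{ PSTypeName = 'cimZeroLengthFile'; Path = 'c:\work\b.log'; Size = 0; Created = [datetime]'2020-10-12'; LastModified = [datetime]'2020-10-12' }
    [PSCustomObject]@{ PSTypeName = 'cimZeroLengthFile'; Path = 'c:\work\samples\w.txt'; Size = 0; Created = [datetime]'2020-10-09'; LastModified = [datetime]'2020-10-09' }
)

# Count the zero-length files per extension, most common first
$summary = $results |
    Group-Object -Property { [System.IO.Path]::GetExtension($_.Path) } |
    Sort-Object -Property Count -Descending |
    Select-Object -Property Count, Name

$summary
```

With real data, the same pipeline would start from something like Get-ZeroLengthFiles -Path C:\work -Recurse instead of the $results array.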
I ran the function for C:\ recursively and it took about 41 seconds to find 1774 files. That seems pretty efficient to me!
Summary
When approaching any task, always start with a simple command that you can run in the console. If you can't run a command successfully from a prompt, you will struggle to write a function or script around it. Even I didn't write the function first. This is especially true when using the CIM cmdlets and trying to come up with an elegant and efficient query.
If you'd like to learn more about PowerShell scripting, consider grabbing a copy of PowerShell Scripting and Toolmaking. If you have any questions about what I did or why, please feel free to leave a comment.