Tonight I needed to build a function that searches the text files in a very large zip archive for the one containing a specific value. To keep it fast, I wanted to do the work in memory. That scenario may be too specific to be useful to most people, but an example of how to retrieve a file from a zip archive and parse its content might interest a wider audience and, more importantly, serve as a useful reference for myself.
P.S. – I am happy to share the more complex end result if anyone tells me it’s useful to them.
What you will need
While the newer Expand-Archive and Compress-Archive cmdlets are handy for basic zip archive creation and extraction, they are not much help when you need to get down to the item level within a zip archive. For that you need the System.IO.Compression classes. In Windows PowerShell the System.IO.Compression.FileSystem assembly is not loaded by default, so load it explicitly first with Add-Type -AssemblyName System.IO.Compression.FileSystem. For this simple example, these are the key enablers:
ZipFile class, to open the archive and get at its entries
GetEntry method (on the ZipArchive object that OpenRead returns), to retrieve an individual file in the archive
StreamReader class, to read the file into memory so its content can be searched
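As a minimal sketch of the loading step: the snippet below loads the assembly and then checks that the ZipFile type resolves. As far as I know this is needed in Windows PowerShell 5.1 and harmless in PowerShell 7, where the type is normally available already. The variable name $ZipFileLoaded is just for illustration.

```powershell
# Load the assembly that provides [System.IO.Compression.ZipFile].
Add-Type -AssemblyName System.IO.Compression.FileSystem

# -as [type] yields $null when the type name cannot be resolved,
# so the cast to [bool] tells us whether the class is usable.
$ZipFileLoaded = [bool]('System.IO.Compression.ZipFile' -as [type])
$ZipFileLoaded
```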
In this example, we will open a zip archive named scripts.zip on the D:\ drive, retrieve a file named ConnectToAzure.txt from the Scripts folder inside it, and search its contents for a value.
Select your zip archive.
$ZipArchive = "d:\scripts.zip"
Open archive for reading.
$ZipStream = [io.compression.zipfile]::OpenRead($ZipArchive)
Select the item in the archive. Notice how you must reference the folder in the path to the file.
$ZipItem = $ZipStream.GetEntry('Scripts/ConnectToAzure.txt')
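GetEntry returns $null instead of throwing when no entry matches, so a stray space or a wrong folder prefix in the path fails silently. One way to find the exact path is to list the entries first. The helper below is a sketch under that assumption; the function name Get-ZipEntryName is hypothetical and not part of any module.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: emit the full in-archive path of every entry.
function Get-ZipEntryName {
    param([Parameter(Mandatory)][string]$Path)

    $archive = [System.IO.Compression.ZipFile]::OpenRead($Path)
    try {
        # FullName includes any folder prefix, e.g. 'Scripts/ConnectToAzure.txt'.
        $archive.Entries | ForEach-Object { $_.FullName }
    }
    finally {
        # Release the file handle even if enumeration fails.
        $archive.Dispose()
    }
}

# Example: Get-ZipEntryName -Path 'd:\scripts.zip'
```

Copying the path from this output into GetEntry avoids guessing at separators or casing.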
Open the item from the archive.
$ItemReader = New-Object System.IO.StreamReader($ZipItem.Open())
Use the StreamReader to read the whole item into memory. $DocItemSet holds the contents of the file.
$DocItemSet = $ItemReader.ReadToEnd()
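The archive returned by OpenRead keeps the zip file open until it is disposed, so in a reusable function it is worth wrapping the steps above in try/finally. This is a sketch only, and the function name Get-ZipEntryText and its parameter names are hypothetical. Pass a full path, since .NET resolves relative paths against the process working directory, which may differ from the current PowerShell location.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

# Hypothetical helper: read one entry of a zip archive into a string.
function Get-ZipEntryText {
    param(
        [Parameter(Mandatory)][string]$Path,       # full path to the zip archive
        [Parameter(Mandatory)][string]$EntryName   # e.g. 'Scripts/ConnectToAzure.txt'
    )

    $archive = [System.IO.Compression.ZipFile]::OpenRead($Path)
    try {
        $entry = $archive.GetEntry($EntryName)
        if ($null -eq $entry) { throw "Entry '$EntryName' not found in '$Path'." }

        $reader = New-Object System.IO.StreamReader($entry.Open())
        try { $reader.ReadToEnd() }      # emit the entry's text
        finally { $reader.Dispose() }    # closes the entry stream as well
    }
    finally {
        $archive.Dispose()               # releases the handle on the zip file
    }
}

# Example: $DocItemSet = Get-ZipEntryText -Path 'd:\scripts.zip' -EntryName 'Scripts/ConnectToAzure.txt'
```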
Search the file contents for the desired value.
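A minimal sketch of the search step follows. So that it runs on its own, $DocItemSet is filled with invented sample text rather than read from the archive, and $SearchValue, $Found and $Hits are illustrative names. Both checks treat the value as literal text, not as a regular expression.

```powershell
# Stand-in for the text read from the archive; this content is invented for illustration.
$DocItemSet = "Import-Module Az`nConnect-AzAccount -Subscription 'Demo'`nGet-AzVM`n"
$SearchValue = 'connect-azaccount'   # hypothetical search value

# Case-insensitive test: does the value appear anywhere in the file?
$Found = $DocItemSet.IndexOf($SearchValue, [System.StringComparison]::OrdinalIgnoreCase) -ge 0

# Split the text into lines and keep the ones that contain the value.
$Hits = @($DocItemSet -split '\r?\n' | Select-String -SimpleMatch $SearchValue)

$Found
$Hits | ForEach-Object { $_.Line }
```

IndexOf is enough for a yes/no answer across many files, while Select-String is handy when you also want to see the matching lines.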