DEV Community

Mavaddat Javid


Extract Zip/7Zip Archives w/o Disk Write

Sometimes when traveling, you want one thing way at the bottom of your suitcase but you don't want to take everything else out.


Devs often want to do the same with a remote archive: read the archive byte stream from an endpoint and get only the one or two files you need. The same strategy for zip files can save time when processing a large number of archives, since disk I/O is typically much slower than working in memory.

In particular, Magnar Myrtveit on SuperUser.com asked:

Using PowerShell 5.1, how can I download a tar.xz archive and extract it without writing it to disk first?
[…]

Challenge: Download, extract xz archive in PowerShell

I describe two approaches to achieve the objective. First, without writing to disk; second, writing to temporary files. (Bonus ordinary zip instructions at the end.)

Without writing anything to disk


For example, suppose we want to download and extract files from the mingw32-dev.tar.xz XZ archive available at MinGW - Minimalist GNU for Windows.

We can download without writing to disk as follows:

$response = Invoke-WebRequest -Uri 'https://mirrors.gigenet.com/OSDN//mingw/70554/mingwrt-5.2.1-mingw32-dev.tar.xz' -ContentType 'application/octet-stream'

Our target archive is available to the shell as a byte array in $response.Content. How do we extract it?

adoconnection / SevenZipExtractor

C# wrapper for 7z.dll

SevenZipExtractor

C# wrapper for 7z.dll (x86 and x64 included)

  • .NET Standard 2.0
  • .NET Framework 4.5


Every single star makes maintainer happy!

NuGet

Install-Package SevenZipExtractor

Supported formats:

  • 7Zip
  • APM
  • Arj
  • BZip2
  • Cab
  • Chm
  • Compound
  • Cpio
  • CramFS
  • Deb
  • Dll
  • Dmg
  • Exe
  • Fat
  • Flv
  • GZip
  • Hfs
  • Iso
  • Lzh
  • Lzma
  • Lzma86
  • Mach-O
  • Mbr
  • Mub
  • Nsis
  • Ntfs
  • Ppmd
  • Rar
  • Rar5
  • Rpm
  • Split
  • SquashFS
  • Swf
  • Swfc
  • Tar
  • TE
  • Udf
  • UEFIc
  • UEFIs
  • Vhd (?)
  • Wim
  • Xar
  • XZ
  • Z
  • Zip

Examples

Extract all

using (ArchiveFile archiveFile = new ArchiveFile(@"Archive.ARJ"))
{
    archiveFile.Extract("Output"); // extract all
}

Extract to file or stream

using (ArchiveFile archiveFile = new ArchiveFile(@"Archive.ARJ"))
{
    foreach (Entry entry in archiveFile.Entries)
    {
        Console.WriteLine(entry.FileName);
        // extract to file
        entry.Extract(entry.FileName);
    }
}

Using SevenZipExtractor C# wrapper for 7Zip, we extract as follows:

#Download and install from nuget.org
Find-Package -Source NuGet -Name SevenZipExtractor | Install-Package -Scope CurrentUser

#Add the SevenZip assembly to our current PowerShell session
(Get-Item (Join-Path (Split-Path (Get-Package SevenZipExtractor).Source) lib/netstandard*) |
  Sort-Object { [version] ($_.Name -replace '^netstandard') })[-1] |
    Get-ChildItem -Filter *.dll -Recurse |
      ForEach-Object { Write-Host "Adding ``$($_.Name)``"; Add-Type -LiteralPath $_.FullName }


The SevenZipExtractor class includes, inter alia, the following overloaded constructor signatures:

public SevenZipExtractor(Stream archiveStream);
public SevenZipExtractor(Stream archiveStream, string password);
public SevenZipExtractor(Stream archiveStream, SevenZipFormat format);
public SevenZipExtractor(Stream archiveStream, string password, SevenZipFormat format);

Here Stream means the type System.IO.Stream and SevenZipFormat means the type SevenZipExtractor.SevenZipFormat.
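
One way to check which overloads a loaded type actually exposes is to ask .NET reflection. The following is a minimal sketch that uses System.IO.MemoryStream as a stand-in, because it ships with .NET; the assumption is that the same calls work on any type you have loaded with Add-Type.

```powershell
# List the public constructor signatures of a .NET type.
# System.IO.MemoryStream is only a stand-in; substitute the type you loaded.
$type = [System.IO.MemoryStream]

$signatures = foreach ($ctor in $type.GetConstructors()) {
    # Render each parameter as "TypeName parameterName"
    $parameterList = $ctor.GetParameters() |
        ForEach-Object { "$($_.ParameterType.Name) $($_.Name)" }
    "$($type.Name)($($parameterList -join ', '))"
}

$signatures | Sort-Object
```

For a quick interactive look, typing a type's static new member without parentheses, for example [System.IO.MemoryStream]::new, also prints the overload definitions.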

So we can extract from the in-memory stream as follows:

# Sometimes the file server does not respect the earlier ContentType specification,
# and $response.Content arrives as a string instead of a byte array.
# In that case, we convert the string back to bytes using the response encoding.
if ($response.Content -isnot [byte[]])
{
    $encoding = $response.Encoding
    $bytes = $encoding.GetBytes($response.Content)
    $sevenZipStream = [System.IO.MemoryStream]::new($bytes)
}
else
{
    # If $response.Content -is [byte[]], wrapping it in a stream is sufficient
    $sevenZipStream = [System.IO.MemoryStream]::new($response.Content)
}
# To name the format explicitly, pass a SevenZipExtractor.SevenZipFormat value as a second argument
$szExtractor = New-Object -TypeName SevenZipExtractor.ArchiveFile -ArgumentList $sevenZipStream
$szExtractor.Extract("$env:TEMP", $false) # Instead of $env:TEMP, wherever you want the files to go
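
The byte-versus-string branch above can be wrapped in a small helper. This is only a sketch: ConvertTo-MemoryStream is a hypothetical name, not part of any module, and the default of ISO-8859-1 is my assumption, chosen because it maps every byte value to exactly one character, so a round trip through text does not lose data. It runs without a network connection, using the six-byte XZ signature as sample input.

```powershell
function ConvertTo-MemoryStream {
    # Hypothetical helper: wraps byte[] or string content in a MemoryStream.
    param(
        [Parameter(Mandatory)] $Content,
        # Assumption: ISO-8859-1 as the fallback, since it round-trips all 256 byte values
        [System.Text.Encoding] $Encoding = [System.Text.Encoding]::GetEncoding('iso-8859-1')
    )
    if ($Content -is [byte[]]) {
        $bytes = $Content
    }
    else {
        # Content arrived as text, so convert it back to bytes
        $bytes = $Encoding.GetBytes([string]$Content)
    }
    [System.IO.MemoryStream]::new($bytes)
}

# Sample input: the XZ file signature (FD 37 7A 58 5A 00)
$xzMagic = [byte[]](0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00)
$fromBytes = ConvertTo-MemoryStream -Content $xzMagic

# Simulate a server response that was decoded as text
$asText = [System.Text.Encoding]::GetEncoding('iso-8859-1').GetString($xzMagic)
$fromText = ConvertTo-MemoryStream -Content $asText

[System.BitConverter]::ToString($fromText.ToArray())
```

With a multi-byte encoding such as UTF-8, the text round trip is not guaranteed to preserve arbitrary binary data, so prefer the byte array whenever the response provides one.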

Here is a working example I wrote that implements the above approach to download and install ffmpeg.

Writing a temp file to disk

If we write the file to disk, the objective is simpler to achieve using the 7Zip4Powershell module.

Install-Module -Name 7Zip4Powershell
Import-Module -Name 7Zip4Powershell -Global

thoemmi / 7Zip4Powershell

Powershell module for creating and extracting 7-Zip archives

7Zip4Powershell

Powershell module for creating and extracting 7-Zip archives supporting Powershell's WriteProgress API.


Note

Please note that this repository is not maintained anymore. I've created it a couple of years ago to fit my own needs (just compressing a single folder). I love that lots of other users find my package helpful.

I really appreciated if you report issues or suggest new feature. However I don't use this package myself anymore, and I don't have the time to maintain it appropriately. So please don't expect me to fix any bugs. Any Pull Request is welcome though.

Usage

The syntax is simple as this:

Expand-7Zip
    [-ArchiveFileName] <string>
    [-TargetPath] <string>
    [-Password <string>] | [-SecurePassword <securestring>]
    [<CommonParameters>]
Compress-7Zip
    [-ArchiveFileName] <string>
    [-Path] 

However, 7Zip4PowerShell does not implement all the overloaded method signatures of SevenZipExtractor.

Since Windows 10 Preview Build 17063, bsdtar is included with Windows and can be called from PowerShell. To extract a tar.xz file, invoke the tar command with the --extract (-x) option and specify the archive file name after the -f option:

tar -xf .\whatever.tar.xz

tar auto-detects the compression type and extracts the archive. For more verbose output, use the -v option. This option tells tar to display the names of the files being extracted on the terminal.
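
To pull out just one file instead of everything, tar can list the members with -t and extract a named member into a chosen folder with -C. The sketch below builds its own small archive first so it runs offline. All file and folder names are made up, and it uses gzip rather than xz on the assumption that not every tar build ships with xz support.

```powershell
# Work in a throwaway folder under the temp directory (all names are made up)
$work = Join-Path ([System.IO.Path]::GetTempPath()) "tar-demo-$([guid]::NewGuid())"
$src  = New-Item -ItemType Directory -Path (Join-Path $work 'src')
$dest = New-Item -ItemType Directory -Path (Join-Path $work 'out')
Set-Content -Path (Join-Path $src.FullName 'wanted.txt') -Value 'the one file we need'
Set-Content -Path (Join-Path $src.FullName 'other.txt')  -Value 'everything else'

# Create a gzip-compressed tarball; -C changes directory before adding the files
$archive = Join-Path $work 'demo.tar.gz'
tar -czf $archive -C $src.FullName wanted.txt other.txt

# List the members (-t), then extract only one of them into $dest
$members = tar -tf $archive
tar -xf $archive -C $dest.FullName wanted.txt

Get-Content -Path (Join-Path $dest.FullName 'wanted.txt')
```

Only wanted.txt should appear in the output folder; other.txt stays inside the archive.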

Ordinary zip file in memory

As a bonus, let's see how it works with ZIP. Here is an example in which I find aes.h inside the ffmpeg source code zip archive:

Add-Type -AssemblyName System.IO.Compression
$IWRresult = Invoke-WebRequest -Uri "https://ffmpeg.zeranoe.com/builds/win64/dev/ffmpeg-latest-win64-dev.zip" -SslProtocol Tls12 -Method Get
$zipStream = New-Object System.IO.MemoryStream
$zipStream.Write($IWRresult.Content, 0, $IWRresult.Content.Length)
$zipFile = [System.IO.Compression.ZipArchive]::new($zipStream)
# OK, what's in the archive I just downloaded?
# Write the archive contents to the shell output
# I don't care about 'Archive' or 'ExternalAttributes', so I suppress those
$zipFile.Entries | Select-Object -Property * -ExcludeProperty @('Archive','ExternalAttributes') | Format-Table
# Oh, there's my `aes.h` inside `ffmpeg-latest-win64-dev/include/libavutil/`
$entry = $zipFile.GetEntry('ffmpeg-latest-win64-dev/include/libavutil/aes.h')
# Wrap the entry in a StreamReader, and then we can do all the things
# For example, let's output the content to the screen
$reader = [System.IO.StreamReader]::new($entry.Open())
Write-Host $reader.ReadToEnd()

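
The same read path can be tried without a download. The sketch below first builds a tiny zip archive in a MemoryStream and then reads one entry back by name. The entry name include/hello.h and its content are invented for illustration.

```powershell
Add-Type -AssemblyName System.IO.Compression

# Build a small zip archive in memory so the sketch runs without a network connection
$buffer = [System.IO.MemoryStream]::new()
$zipWriter = [System.IO.Compression.ZipArchive]::new(
    $buffer, [System.IO.Compression.ZipArchiveMode]::Create, $true)   # $true = leave $buffer open
$entryWriter = [System.IO.StreamWriter]::new($zipWriter.CreateEntry('include/hello.h').Open())
$entryWriter.Write('#define GREETING "Hello World"')
$entryWriter.Dispose()
$zipWriter.Dispose()   # Finishes the archive by writing its central directory

# Reopen the same bytes for reading and pull out a single entry by name
$archive = [System.IO.Compression.ZipArchive]::new(
    $buffer, [System.IO.Compression.ZipArchiveMode]::Read)
$names = $archive.Entries | ForEach-Object FullName
$reader = [System.IO.StreamReader]::new($archive.GetEntry('include/hello.h').Open())
$text = $reader.ReadToEnd()
$reader.Dispose()
$archive.Dispose()

$names
$text
```

Nothing touches the file system here: the archive lives entirely in $buffer, and only the requested entry is decompressed.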

This is particularly useful if you want to store a small file or a long string as a compressed ZIP embedded as a base64 string within a script (large files or text will flood the script even when compressed). Consider a large PowerShell object that we previously saved as a CLIXML file. We may want to make this object available without also shipping the XML file. Imagine executing the following steps in a PowerShell session:

#Requires -Version 6.0
# Imagine we saved the object to a CLIXML file earlier, like this:
[pscustomobject]@{ Greeting = "Hello World" } | Export-Clixml -Path "$env:TEMP\helloworld.xml"

# We can save this XML file to a zip file using Write-Zip and then encode the bytes as a base64 string
# (Write-Zip is not built in; as far as I know it comes from the PowerShell Community Extensions module, Pscx)
Write-Zip -Path "$env:TEMP\helloworld.xml" -OutputPath "$env:TEMP\helloworld.zip" -Level 9
$helloZipStream = [System.IO.FileStream]::new("$env:TEMP\helloworld.zip", [System.IO.FileMode]::Open)
$helloBytes = [byte[]]::new($helloZipStream.Length)
$helloZipStream.Read($helloBytes, 0, $helloZipStream.Length)  # Returns the number of bytes read; note this
$helloZipStream.Dispose()  # Release the file handle so the file is available to other processes again
($helloBase64str = [System.Convert]::ToBase64String($helloBytes)) | Set-Clipboard  # We will paste this in the next script

This produces a base64-encoded string beginning with UEsDBC0AAAAIAJurslaXhee3 (reproduced in full in the next script), which encodes the 355-byte content of our zip file. Now, in another script, we can load this zip file in memory like this:

# Now we want to read the object back in
if (Test-Path -Path "$env:TEMP\helloworld.xml")
{
    $helloWorld = Import-Clixml -Path "$env:TEMP\helloworld.xml"
}
else
{
    # For those who don't have the CLIXML file on disk, provide the base64-encoded zip file with the XML as content (increases code complexity, but reduces the length of the string)
    # This is a base64 encoded zip file containing the helloworld.xml file
    $helloZip = [System.IO.Compression.ZipArchive]::new([System.IO.MemoryStream]::new([System.Convert]::FromBase64String('UEsDBC0AAAAIAJurslaXhee3//////////8OABQAaGVsbG93b3JsZC54bWwBABAAJwEAAAAAAAC9AAAAAAAAAG2OywrCMBBF94L/EPIBSSqupC2IC3VhFSO6rnW0lSaRzoj6945vUFdz5wznMvF0vUexhAar4BMZqUgZFUlxdrXHRJZEh57WWJTgclSuKpqAYUuqCE4fwom9Eupad4zpatOVabslxK1TzGE73iTSPBDDRfbDbjS1FyRwapL7fAcOPKn+kYLLiR9SMzs4Im/cCAXFevHH/L5xyl5xYj+CFVkihw0AVX4n0xH/HcQqNPUm1vbtPo1Ycy2n+8T0ClBLAQIzAC0AAAAIAJurslaXhee3//////////8OABQAAAAAAAAAAAAAAAAAAABoZWxsb3dvcmxkLnhtbAEAEAAnAQAAAAAAAL0AAAAAAAAAUEsFBgAAAAABAAEAUAAAAP0AAAAAAA=='), 0, 355))
    $helloWorld = [System.Management.Automation.PSSerializer]::Deserialize([System.IO.StreamReader]::new($helloZip.GetEntry('helloworld.xml').Open()).ReadToEnd())
}

# Now we can use the object
$helloWorld | Write-Host -ForegroundColor Cyan
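
The whole round trip can also be done without any file on disk. The following sketch serializes the object with PSSerializer, zips the XML in a MemoryStream using System.IO.Compression instead of Write-Zip (my substitution, so that no extra module is needed), encodes the bytes as base64, and then reverses each step.

```powershell
Add-Type -AssemblyName System.IO.Compression

# Serialize an object to CLIXML text in memory
$original = [pscustomobject]@{ Greeting = 'Hello World' }
$xml = [System.Management.Automation.PSSerializer]::Serialize($original)

# Zip the XML into a MemoryStream, then encode the zip bytes as base64
$zipBuffer = [System.IO.MemoryStream]::new()
$zip = [System.IO.Compression.ZipArchive]::new(
    $zipBuffer, [System.IO.Compression.ZipArchiveMode]::Create, $true)
$entryWriter = [System.IO.StreamWriter]::new($zip.CreateEntry('helloworld.xml').Open())
$entryWriter.Write($xml)
$entryWriter.Dispose()
$zip.Dispose()   # Finishes the archive; $zipBuffer stays usable
$base64 = [System.Convert]::ToBase64String($zipBuffer.ToArray())

# In the consuming script: decode, open the archive, and deserialize the entry
$decoded = [System.IO.MemoryStream]::new([System.Convert]::FromBase64String($base64))
$archive = [System.IO.Compression.ZipArchive]::new($decoded)
$reader = [System.IO.StreamReader]::new($archive.GetEntry('helloworld.xml').Open())
$restored = [System.Management.Automation.PSSerializer]::Deserialize($reader.ReadToEnd())
$reader.Dispose()
$archive.Dispose()

$restored.Greeting
```

In a real script only the second half would ship, with the value of $base64 pasted in as a string literal.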

Another example downloads an archive and runs an EXE from it, all in memory:

Add-Type -AssemblyName System.IO.Compression
$IWRresult = Invoke-WebRequest -Uri 'https://ffmpeg.zeranoe.com/builds/win64/static/ffmpeg-20200424-a501947-win64-static.zip' -Method Get -SslProtocol Tls12
$zipStream = New-Object System.IO.MemoryStream
$zipStream.Write($IWRresult.Content, 0, $IWRresult.Content.Length)
$zipFile = [System.IO.Compression.ZipArchive]::new($zipStream)
# OK, what did I just download?
# Write the contents to the shell output
# I don't care about 'Archive' or 'ExternalAttributes', so I suppress those
$zipFile.Entries | Select-Object -Property * -ExcludeProperty @('Archive','ExternalAttributes') | Format-Table
# I see there is a 'ffmpeg-20200424-a501947-win64-static/bin/ffmpeg.exe' entry
$zipEntry = $zipFile.GetEntry('ffmpeg-20200424-a501947-win64-static/bin/ffmpeg.exe')
$binReader = [System.IO.BinaryReader]::new($zipEntry.Open())
# Needs the external module `PowerShellMafia/PowerSploit` to run an EXE from memory (without writing to disk); see the notes below this code block
# ReadBytes requires a byte count; the entry's Length is its uncompressed size
Invoke-ReflectivePEInjection -PEBytes $binReader.ReadBytes([int]$zipEntry.Length) -ExeArgs "Arg1 Arg2 Arg3 Arg4"

PowerShellMafia / PowerSploit

PowerSploit - A PowerShell Post-Exploitation Framework

This project is no longer supported

PowerSploit is a collection of Microsoft PowerShell modules that can be used to aid penetration testers during all phases of an assessment. PowerSploit is comprised of the following modules and scripts:

CodeExecution

Execute code on a target machine.

Invoke-DllInjection

Injects a Dll into the process ID of your choosing.

Invoke-ReflectivePEInjection

Reflectively loads a Windows PE file (DLL/EXE) in to the powershell process, or reflectively injects a DLL in to a remote process.

Invoke-Shellcode

Injects shellcode into the process ID of your choosing or within PowerShell locally.

Invoke-WmiCommand

Executes a PowerShell ScriptBlock on a target computer and returns its formatted output using WMI as a C2 channel.

ScriptModification

Modify and/or prepare scripts for execution on a compromised machine.

Out-EncodedCommand

Compresses, Base-64 encodes, and generates command-line output for a PowerShell payload script.

Out-CompressedDll

Compresses, Base-64 encodes, and outputs generated code to load a managed dll in…

Please see PowerShellMafia/PowerSploit for more information about how to run an EXE from memory. Find examples here: GitHub Examples
