Some Creativity

Weblog of Siddharth Uppal

How to check if a file is compressed in C#


Problem:

.NET 2.0 introduced the GZipStream class to allow programmatic compression and decompression of files. However, the class doesn’t provide an easy way to determine whether a file is actually compressed, and decompressing the entire file just to find out is wasteful.

Solution:

Many file formats reserve a sequence of bytes at the beginning of the file for a signature that identifies the type of the file. Lists of these “magic numbers” are available online.

Files compressed with GZIP begin with 1F 8B 08 (the first 3 bytes of the file, written in hexadecimal: 1F 8B identifies the format and 08 is the deflate compression method), while PK-ZIP archives normally begin with 50 4B 03 04 (again, the first 4 bytes written in hexadecimal).
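To see a signature for yourself, it helps to dump the first few bytes of a file as hexadecimal. The following is only a minimal sketch: the SignaturePeek class and ReadLeadingBytesAsHex method are names made up for illustration, not part of .NET.

```csharp
using System;
using System.IO;

// Hypothetical helper (illustration only, not a .NET API).
public static class SignaturePeek
{
    // Returns up to `count` leading bytes of the file as hyphen-separated
    // upper-case hex, the format BitConverter.ToString produces.
    public static string ReadLeadingBytesAsHex(string path, int count)
    {
        using (FileStream fs = File.OpenRead(path))
        {
            byte[] buffer = new byte[count];
            int total = 0;

            // Read may return fewer bytes than requested, so keep going
            // until the buffer is full or the file ends.
            while (total < count)
            {
                int read = fs.Read(buffer, total, count - total);
                if (read == 0)
                    break; // end of file reached early
                total += read;
            }

            // Only convert the bytes that were actually read.
            return BitConverter.ToString(buffer, 0, total);
        }
    }
}
```

Calling SignaturePeek.ReadLeadingBytesAsHex("testFile.gz", 3) on a Gzip file should give the hyphen-separated form of the bytes listed above, which is the same string format the examples below pass in as the expected signature.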

Code examples:

Before I show you the code, here are some examples of how it can be used.

If “testFile.gz” is a valid Gzip compressed file, the isGzip variable will be set to “true”:

bool isGzip = FileChecker.CheckSignature("testFile.gz", 3, "1F-8B-08");

If “testFile.zip” is a valid PK-Zip compressed file, the isPKZip variable will be set to “true”:

bool isPKZip = FileChecker.CheckSignature("testFile.zip", 4, "50-4B-03-04");

Caveat:

A file in some other format might happen to start with the same “magic number” as a PK-Zip file, and in that case FileChecker.CheckSignature will still return true (i.e. a false positive). Going the other way, ordinary non-empty PK-Zip files “will” have that magic number at the beginning (an empty archive starts with 50 4B 05 06 instead). So a match only means “probably compressed”, while a mismatch is conclusive, and a more accurate way to phrase the examples would be:

If the isGzip variable is set to “false”, “testFile.gz” is not a valid Gzip compressed file:

bool isGzip = FileChecker.CheckSignature("testFile.gz", 3, "1F-8B-08");

If the isPKZip variable is set to “false”, “testFile.zip” is not a valid (non-empty) PK-Zip compressed file:

bool isPKZip = FileChecker.CheckSignature("testFile.zip", 4, "50-4B-03-04");
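One way to reduce the false positives mentioned in the caveat, for Gzip at least, is to follow a successful signature check with a very small trial decompression. The sketch below is one possible approach rather than the definitive one: GzipProbe and LooksLikeRealGzip are hypothetical names, and it relies on GZipStream throwing InvalidDataException when the header or the first compressed block is malformed. It reads a single byte, so it does not decompress the whole file.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper (illustration only). Intended as a second step
// after the signature check has already passed.
public static class GzipProbe
{
    public static bool LooksLikeRealGzip(string path)
    {
        try
        {
            using (FileStream fs = File.OpenRead(path))
            using (GZipStream gz = new GZipStream(fs, CompressionMode.Decompress))
            {
                // Asking for one byte forces the header to be parsed and the
                // start of the compressed data to be inflated.
                gz.ReadByte();
                return true;
            }
        }
        catch (InvalidDataException)
        {
            // The header or the compressed data was not valid Gzip.
            return false;
        }
    }
}
```

This only proves that the start of the file inflates cleanly. Corruption further into the file would still go unnoticed, so treat the result as a stronger hint, not a guarantee.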

Code:

Anyway, here’s the code:


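using System;
using System.IO;

public static class FileChecker
{
    // Returns true if the file starts with the expected signature, given as
    // hyphen-separated hex (e.g. "1F-8B-08"), the format BitConverter.ToString produces.
    public static bool CheckSignature(string filepath, int signatureSize, string expectedSignature)
    {
        if (String.IsNullOrEmpty(filepath))
            throw new ArgumentException("Must specify a filepath", "filepath");
        if (signatureSize <= 0)
            throw new ArgumentOutOfRangeException("signatureSize", "Signature size must be positive");
        if (String.IsNullOrEmpty(expectedSignature))
            throw new ArgumentException("Must specify a value for the expected file signature", "expectedSignature");

        using (FileStream fs = new FileStream(filepath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            // A file shorter than the signature cannot match.
            if (fs.Length < signatureSize)
                return false;

            byte[] signature = new byte[signatureSize];
            int bytesRequired = signatureSize;
            int index = 0;

            // Read may return fewer bytes than requested, so loop until the buffer is full.
            while (bytesRequired > 0)
            {
                int bytesRead = fs.Read(signature, index, bytesRequired);
                if (bytesRead == 0)
                    return false; // file was truncated by another process after the length check
                bytesRequired -= bytesRead;
                index += bytesRead;
            }

            string actualSignature = BitConverter.ToString(signature);

            // BitConverter emits upper-case hex; ignore case so "1f-8b-08" also matches.
            return String.Equals(actualSignature, expectedSignature, StringComparison.OrdinalIgnoreCase);
        }
    }
}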
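If you need to tell several formats apart, the same idea can be applied with byte arrays instead of strings, reading the header once and comparing it against each known signature. This is a rough sketch under my own naming (FormatSniffer, Sniff and the "unknown" return value are all made up for illustration), covering only the two signatures discussed in this post.

```csharp
using System;
using System.IO;

// Hypothetical helper (illustration only, not a .NET API).
public static class FormatSniffer
{
    private static readonly byte[] GzipMagic = { 0x1F, 0x8B, 0x08 };
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // Returns "gzip", "zip" or "unknown" based on the leading bytes only.
    public static string Sniff(string path)
    {
        byte[] head = new byte[4]; // length of the longest signature above
        int total = 0;

        using (FileStream fs = File.OpenRead(path))
        {
            int read;
            // Keep reading until the buffer is full or the file ends.
            while (total < head.Length &&
                   (read = fs.Read(head, total, head.Length - total)) > 0)
            {
                total += read;
            }
        }

        if (StartsWith(head, total, GzipMagic)) return "gzip";
        if (StartsWith(head, total, ZipMagic)) return "zip";
        return "unknown";
    }

    // True if the first `length` valid bytes of data begin with prefix.
    private static bool StartsWith(byte[] data, int length, byte[] prefix)
    {
        if (length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Comparing bytes directly avoids building a string for every check, and adding another format is a matter of adding another signature array. The same false-positive caveat applies here as well.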

For a complete list of files and the associated magic numbers, please see http://www.garykessler.net/library/file_sigs.html


Written by Sid

April 8, 2008 at 4:47 pm

Posted in .NET, General


2 Responses to 'How to check if a file is compressed in C#'


  1. hi,

    I have changed this to
    ///////////////////////////////////////////////////////////////////
    public string CheckSignature(Stream fs, int signatureSize)
    {
    StreamReader Reader = new StreamReader(fs);
    using (fs)
    {
    if (fs.Length < signatureSize) [...] while (signatureSize > 0)
    {
    int bytesRead = fs.Read(signature, index, signatureSize);
    signatureSize -= bytesRead;
    index += bytesRead;
    }
    fs.Dispose();

    return BitConverter.ToString(signature);
    }
    }
    ///////////////////////////////////////////////////////
    and i call CheckSignature as

    if (CheckSignature(MyUploadedFile.PostedFile.InputStream, 4) == "89-50-4E-47")
    {
    // save MyPostedFile to directory
    }

    when i send a jpeg file
    if (CheckSignature(MyUploadedFile.PostedFile.InputStream, 4) == "89-50-4E-47") is true

    my question is..

    why i can’t save MyPostedFile

    dnzsahin

    16 Sep 08 at 3:39 am

  2. It seems you’re attempting to use this code in an ASP.NET app. Make sure your ASP.NET account has adequate permissions to write to the location where you’re attempting to save the received file.

    Sid

    29 Sep 08 at 4:17 pm
