Part two of deciding the Correct Zip program to use
AND
How to Test and verify it for Forensic usage!
Before reading this article,
please read ZIP-IT Part one.
I recently held a session on testing various software to see if it was forensically sound. The session was a success in that it opened the attendees' eyes to the fact that not all the software often recommended as forensically sound really is. The attendees tested their software against three or four very simple but important forensic requirements, the same ones I would raise if I were the one challenging the evidence. The tests involved basic file hashing, forensic file copying, and zipping/unzipping in a forensic evidentiary environment. These three categories seemed to me to be the basics behind sound forensic analysis, reporting, and evidentiary delivery of any evidence found.
At the end of the session I realized that, because of time constraints, the attendees did not complete the file zip/unzip portion of the software testing. As a result I decided to do two things. First, I asked a number of colleagues to continue with and finish the testing of the zip products. To date, none of the testers has reported a zip program capable of passing all the tests I set up. But I have found one.
Second, I decided to continue with the tests myself. I must admit, I had already tested the programs before the session, and knew how each performed and where it failed my test requirements. However, I decided to redo the tests for this article. The tests I designed and performed were totally non-scientific, but in my opinion they were realistic: the conditions I tested are those you might find every day in the file systems you process in your forensic analysis, and those you might commonly forget to check for.
This short article was written as a result of my performing tests on some of the recognized file zipping software programs. They include programs that are routinely recommended and used by most people processing evidence for retention, attorney discovery, or court adjudication. So as not to point fingers at those failing my tests, I am not going to mention their names here.
The basic parameters were set up as follows. All these parameters and assumptions were designed around a Windows operating system environment. Other operating systems, such as Linux and the various Mac systems, would not have the same requirements. A majority of business operations and forensic analysis is done on Windows platforms, making it an ideal platform for these tests.
First, I personally think that if you are running a secure shop, your operating system should have the last access update of files turned on. (This is done via a registry key.) In most Windows environments it is off by default. When you are working for a large organization and are concerned that someone might copy and walk off with the keys to the kingdom, or you are looking at any other question of who accessed sensitive files and how, this last access date may come into play. In my prior life, I often used last access to see when a file was accessed, which might point to the time it was copied, moved, printed, or stolen. That being said, let's set last access update of files to ON. This was requirement number one.
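As a rough illustration of checking that setting, here is a minimal Python sketch. It assumes the value is NtfsDisableLastAccessUpdate under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem and that an odd value means updates are disabled; both are assumptions to verify on your own Windows build, and the function names are hypothetical.

```python
import sys

# Assumed location of the setting; verify on your own Windows build.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\FileSystem"
VALUE_NAME = "NtfsDisableLastAccessUpdate"


def last_access_enabled(raw_value: int) -> bool:
    """Interpret the registry value: lowest bit set means updates are disabled.

    Assumption: older builds use 0 (enabled) / 1 (disabled); newer builds add
    a high flag bit and the values 2/3 for "system managed", which still
    follow the same even/odd rule.
    """
    return (raw_value & 1) == 0


def read_setting() -> int:
    """Read the raw value from the registry (Windows only)."""
    import winreg  # imported here so the sketch still loads on other systems

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
        value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
    return value


if __name__ == "__main__" and sys.platform == "win32":
    raw = read_setting()
    print(f"{VALUE_NAME} = {raw:#x}; last access updates enabled: "
          f"{last_access_enabled(raw)}")
```

Recording this value in your case notes before each test run documents the state of the machine at the time the test was performed.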
The second parameter was that some of the test data files were located in paths made up of Long Filenames (LFN). For those not familiar with the term, I use it here to mean path/filenames longer than 255 characters. For this reason, the test file system was required to be NTFS. Those of you who routinely use the forensic suites out there know that when exporting files from images, or otherwise, the suites have absolutely no problem exporting long filenames. Windows Explorer, however, has its own problems with LFNs. Often when exporting evidence files you end up with an evidentiary folder path rivaling the length of War and Peace. So the ability to process long filenames was another requirement.
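To know which files in a test set actually exercise the long filename requirement, something like the following Python sketch can list every path over the limit. The 255-character default mirrors the figure used in this article; the function name is hypothetical, and on Windows the walk itself may need long path support enabled (or an extended-length path prefix) to reach the deepest folders.

```python
import os


def find_long_paths(root: str, limit: int = 255) -> list[tuple[int, str]]:
    """Return (length, path) for every file or directory under root whose
    absolute path is longer than `limit` characters, longest first."""
    hits = []
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if len(full) > limit:
                hits.append((len(full), full))
    return sorted(hits, reverse=True)
```

Running it on both the source tree and the extracted tree gives a quick count of how many long paths went in and how many came back out.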
The third and final major parameter requirement was that the process (program) be able to handle Alternate Data Streams (ADS). Quite simply, an ADS is a file that is attached to a primary file in the file structure. I call the primary file the parent, which is what you see when you look at the files in Explorer or any other general file listing process. The ADS is effectively (for lack of a better, non-technical term) a child or hitchhiker. It is attached to the parent, but very few processes display the presence of an ADS. Yet an Alternate Data Stream can hold any kind of data: executable programs, viruses, key loggers, and even the original URL of a downloaded image. Bet you didn't know that. Great for an investigator of porn sites. So let's consider retention of ADS important as well.
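The download URL mentioned above typically lives in a stream named Zone.Identifier, which holds a few lines of INI-style text. The Python sketch below parses that text; the ReferrerUrl and HostUrl keys are written only by some browsers and Windows versions, the sample content is hypothetical, and the function names are my own. Reading the stream by appending ":Zone.Identifier" to the path works only on Windows with an NTFS volume.

```python
import configparser


def parse_zone_identifier(text: str) -> dict[str, str]:
    """Parse the INI-style text of a Zone.Identifier stream into a dict."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key case (HostUrl, not hosturl)
    parser.read_string(text)
    if not parser.has_section("ZoneTransfer"):
        return {}
    return dict(parser["ZoneTransfer"])


def read_zone_identifier(path: str) -> dict[str, str]:
    """Read the stream attached to `path` (Windows + NTFS only)."""
    with open(path + ":Zone.Identifier", "r", encoding="utf-8-sig",
              errors="replace") as stream:
        return parse_zone_identifier(stream.read())


# Hypothetical stream content, for illustration only.
sample = (
    "[ZoneTransfer]\n"
    "ZoneId=3\n"
    "ReferrerUrl=https://example.com/gallery\n"
    "HostUrl=https://example.com/images/photo.jpg\n"
)
info = parse_zone_identifier(sample)
print(info.get("HostUrl"))
```

If a zip program drops the ADS, this is exactly the kind of information that silently disappears from the extracted copy.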
Each of these three requirements seems simple, and in most evidentiary situations they are completely unnecessary. But would you want to be the one explaining to the attorneys why your software missed one or all of the requirements set out above? Or why, a few years down the road, you unzipped the evidence folder and realized that some key evidence relating to one or all of the above parameters was now missing? I know I wouldn't.
Now the tests begin. First I designed a simple set of about 150 test files. Nothing large or complicated, just spread out among about 20 directories with both long filenames and regular filenames (less than 255 characters). Some of the files in both the LFN set and the normal set had one or more ADS associated with them. ALL the files had the MAC dates (c = created, w = last written, a = last accessed) set to:
01/01/2019 12:34:56:789c
01/01/2019 12:34:56:789w
01/01/2019 12:34:56:789a
Just for fun, I managed to set the times to 12:34:56:789.
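For anyone building a similar test set, here is a minimal Python sketch of stamping every file with one known time. It is not the tool I used. It treats the value as UTC (an assumption), and it sets only the last written and last accessed dates, because os.utime cannot set the creation date; on Windows that third date needs a separate tool or API call. The function name is hypothetical.

```python
import calendar
import os

# 01/01/2019 12:34:56.789, treated here as UTC (an assumption).
SECONDS = calendar.timegm((2019, 1, 1, 12, 34, 56))
STAMP_NS = SECONDS * 1_000_000_000 + 789 * 1_000_000


def stamp_tree(root: str, stamp_ns: int = STAMP_NS) -> int:
    """Set last-access and last-write time of every file under root.

    Creation time is not touched: os.utime has no way to set it.
    Returns the number of files stamped.
    """
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            os.utime(os.path.join(dirpath, name), ns=(stamp_ns, stamp_ns))
            count += 1
    return count
```

A distinctive, identical value on every file makes any date that changed during zipping or unzipping stand out immediately.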
Next I took the recognized file zip/unzip programs and tested each against the parameters mentioned before, to determine if it:
1: Altered/updated the last access date of the source during the operation. This would be a major problem. Why did you alter the original evidence date?
2: Restored all three original MAC dates to the destination upon unzipping. Do the restored dates reflect the original values? Hope so.
3: Could find and properly zip/unzip all the LFN (Long Filename) files.
4: Could find and include all the ADS along with their parent files.
If it failed any or all of the above four tests, I considered it unworthy to be called a forensically sound zipping program.
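Tests 1 and 2 above come down to recording the dates, running the program, and comparing. A minimal Python sketch of that comparison follows, with hypothetical names throughout. One caveat is labeled in the code: the first field is the creation time on Windows but the inode change time on Unix-like systems, so only the Windows reading matches the "created" date discussed here.

```python
import os
from typing import NamedTuple


class Times(NamedTuple):
    created_or_changed_ns: int  # creation time on Windows, change time on Unix
    written_ns: int
    accessed_ns: int


def snapshot(root: str) -> dict[str, Times]:
    """Record timestamps for every file under root, keyed by relative path.

    os.stat reads metadata only, so taking the snapshot should not itself
    update the last access date of the files.
    """
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            result[rel] = Times(st.st_ctime_ns, st.st_mtime_ns, st.st_atime_ns)
    return result


def changed(before: dict[str, Times],
            after: dict[str, Times]) -> dict[str, list[str]]:
    """List, per file, which timestamp fields differ; flag missing files."""
    report = {}
    for rel, old in before.items():
        new = after.get(rel)
        if new is None:
            report[rel] = ["missing"]
            continue
        fields = [f for f in Times._fields if getattr(old, f) != getattr(new, f)]
        if fields:
            report[rel] = fields
    return report
```

For test 1, take a snapshot of the source before and after zipping; an empty report means the source dates were left alone. For test 2, compare a snapshot of the source against a snapshot of the extracted folder.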
Some could zip LFNs but couldn't unzip them. Beats the #$%* out of me how a program could zip a file and then fail to unzip it. Others worked fine at finding ADS on files with short names, but failed on the LFNs. Others failed to "reset" the original last access date of the source file, and allowed the last access date of the restored destination to be set to the current date. This means they allowed the operating system to alter the original evidence. Preserving the original last access date of the source is a primary concern, especially when zipping original data from, say, a live server. One had an interface so cumbersome and messed up that I could hardly conduct the tests efficiently. Needless to say, I was very disappointed with all but one of the programs tested. For that reason, I am now going to reveal the name of the one program that passed all the tests I set up.
The program which passed all the tests was WINRAR, and more specifically the command line version called RAR. The command line version is more verbose, batch capable, and easy to set up. After all, if you use it in a batch file, even if you are doing something wrong (and we hope you aren't), at least you can testify that you always do it this way. HA HA!
There were some initial stumbling blocks when dealing with the RAR program. Initially it failed to maintain the original last access date of the files. But after a short email exchange with the programmers, in which I explained the evidentiary reason for needing last access date maintenance, an updated beta version was out within two days. The current version of WINRAR/RAR now has that option available, along with dozens of other options which I will not even attempt to list or mention. I learned which options I need, and routinely use those in a "standard" command line located in (guess what) a batch file, and in a comment.txt file (explained below). As it turns out, only about four options are routinely needed to ensure compliance with my test requirements. Your needs may be different, so review, practice, and use whatever options you find useful.
Another feature currently implemented is the capability of setting default options both in an INI file and in a file called comment.txt. Only a very few forensic options can reside in the comment.txt file, where they take effect upon extraction. This means that not all of my forensic options are available in comment.txt, so the command line of the executable may require additional options to extract all the data perfectly. With a little practice, you can work out a suitable extraction command line, one where all paths, MAC dates, LFNs, and ADS are properly restored. Personally, I formulate the command line, then provide it in a batch script or send it to the recipient in an email. That way the recipient has all the information needed to properly extract a good forensic file structure.
Developing a suitable command line for the executable is advisable so that the recipient can properly extract the data. Sending the .exe out of your control along with a correct command line will hopefully ensure correct extraction options. An executable sent for remote extraction with proper command line options is, in my opinion, the best way to go.
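The idea of one fixed, repeatable command line can be sketched in Python as well as in a batch file. The sketch below is not my actual command line. The switch names are assumptions taken from a reading of the RAR console documentation and must be checked against the rar.txt shipped with your version; the function names and any paths you pass in are hypothetical.

```python
import subprocess

# ASSUMPTION: switch names as read from the RAR console documentation;
# confirm each one against your installed version before relying on it.
ARCHIVE_SWITCHES = [
    "-r",                    # recurse into subdirectories
    "-os",                   # save NTFS alternate data streams
    "-tsm", "-tsc", "-tsa",  # store modification, creation, access times
    "-tsp",                  # preserve the source files' last access time
]
EXTRACT_SWITCHES = ["-tsm", "-tsc", "-tsa"]  # restore all three times


def archive_command(rar_exe: str, archive: str, source: str) -> list[str]:
    """Build the 'add to archive' command as an argument list."""
    return [rar_exe, "a", *ARCHIVE_SWITCHES, archive, source]


def extract_command(rar_exe: str, archive: str, dest: str) -> list[str]:
    """Build the 'extract with full paths' command as an argument list."""
    # RAR is assumed to treat the last argument as a folder only when it
    # ends with a path separator.
    if not dest.endswith(("\\", "/")):
        dest += "\\"
    return [rar_exe, "x", *EXTRACT_SWITCHES, archive, dest]


def run(command: list[str]) -> None:
    """Print the exact command for the case notes, then execute it."""
    print(" ".join(command))
    subprocess.run(command, check=True)
```

Keeping the switches in one place means the same list is used on every case, and the printed command can be pasted straight into the email that goes to the recipient.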
I have also come to use the ability to create encrypted self-extracting files. Self-extraction not only makes the archive self-contained, it also means the recipient doesn't need the WINRAR/RAR program to extract the data. The encryption capability (available in most zipping products nowadays) is an added security step and benefit.
Remember, don't take my word for it. Regardless of which product you use, obtain, test, and retest every program, and set up a routine process that ensures consistency. Know which options work and which don't, because the other side knows how to challenge your process. (Why did you perform this xxx process on my client's data, and not on other persons' data? Are you prejudiced against my client?)
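One piece of such a routine is confirming that every file that went into the archive came back out unchanged. The Python sketch below compares two folders by relative path and SHA-256; the function names are hypothetical. Two cautions apply: reading a file to hash it can itself update its last access date, so run this on a working copy or after the dates have been recorded, and it covers only the primary data of each file, not any ADS.

```python
import hashlib
import os


def hash_tree(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def compare_trees(source: str, extracted: str) -> dict[str, list[str]]:
    """Report files missing from, added to, or altered in the extracted copy."""
    src, dst = hash_tree(source), hash_tree(extracted)
    return {
        "missing": sorted(set(src) - set(dst)),
        "unexpected": sorted(set(dst) - set(src)),
        "different": sorted(p for p in src.keys() & dst.keys()
                            if src[p] != dst[p]),
    }
```

Three empty lists mean the content survived the round trip; anything else is a finding to document before the other side finds it for you.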
Again, in my testing of the recognized zip/unzip software out there, I found that only WINRAR/RAR passed all my forensic evidentiary requirements. My requirements may be stricter for testing purposes, and may not be the same requirements you have. But do you want to leave yourself open to evidentiary challenges? It is possible that RAR may fail when used in a way that you need; however, it passed my test requirements.
Bottom Line: Test your zip/unzip software and make sure it meets or exceeds all your evidentiary and storage requirements. Don't take anyone's word (not even mine) that it works. Test it yourself so you can answer questions about its operations when you are on the witness stand.
Associated articles and programs of interest:
hash program to calculate hash values.
HASH_IT_OUT an article discussing forensic hashing of evidence.
COPY_THAT an article discussing forensic copying of evidence.
ZIP_IT an article regarding use of zipping software for forensics.