Many students ask why we have to learn Computer Science from the ground up. You have repeatedly heard me say that we need to learn to “look under the hood”, because our systems have in some sense become too user-friendly. Security starts with a core understanding, and any gap in that understanding is a security vulnerability.
Hardware Knowledge + Operating Systems Knowledge => Forensics
MS-DOS
File Formats
A file format is a standard way, often specific to an application, of encoding information for storage in a computer file. It defines the file's structure and tells a program how to display or use its contents. It specifies how bits are used to encode information in a digital storage medium. This information comprises data and metadata (headers, control information, EOF, etc.). File formats may be either proprietary or free and may be either unpublished or open. Here is a pdf on General Binary File Format Analysis.
Here is the MS Word binary file format used in the demonstration below and used by developers who need to work with or export to .doc format (e.g. Google Docs, Apple Pages, etc.). Note it is 500+ pages and the pages also contain links to other specifications: [MS-DOC]: Word (.doc) Binary File Format | Microsoft Docs
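To make the idea of header metadata concrete, here is a minimal Python sketch that guesses a file's format from its leading bytes (its "magic number"). The function names and the small signature table are illustrative assumptions, not part of any specification referenced above. A legacy .doc file is stored inside an OLE compound file, so it begins with the compound-file signature rather than anything Word-specific.

```python
# Leading-byte signatures ("magic numbers") for a few common formats.
# This table is a small illustrative sample, not an exhaustive list.
SIGNATURES = {
    b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1": "OLE compound file (legacy .doc/.xls/.ppt)",
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"PK\x03\x04": "ZIP container (also .docx/.xlsx)",
}


def identify(header: bytes) -> str:
    """Guess a format from the leading bytes only (hypothetical helper)."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 8 bytes of a file and guess its format."""
    with open(path, "rb") as f:
        return identify(f.read(8))
```

A signature match is only a hint: the file extension can be changed freely, and the header bytes can be forged, which is why a forensic examiner looks at the actual bytes rather than trusting the name.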
File Systems
A deep understanding of File Systems and their File Allocation methodology is very important. Additional reading, as time permits, is located here: http://en.wikipedia.org/wiki/File_system
Disk Architecture
Recall, if we have a 1GB USB drive and use FAT16, we cannot address the drive at the sector level: 1,000,000,000/512 = 1,953,125 sectors, which is much bigger than 2**16 (65,536). So we have to combine sectors into clusters to fit the address space (here at least 32 sectors per cluster, since 1,953,125/32 ≈ 61,035, which fits). Now, of course, our hard drives are much larger, so under the same constraint the cluster size is even larger.
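The sector-versus-cluster arithmetic can be checked with a short Python sketch. It assumes 512-byte sectors, a full 2**16 address space, and power-of-two cluster sizes. Real FAT16 reserves a few cluster values, so treat this as an approximation; the function name is hypothetical.

```python
SECTOR_SIZE = 512            # bytes per sector (traditional size, assumed)
FAT16_ADDRESSES = 2 ** 16    # upper bound on units addressable with 16 bits


def min_sectors_per_cluster(drive_bytes: int) -> int:
    """Smallest power-of-two cluster size (in sectors) that fits 16-bit addressing."""
    sectors = drive_bytes // SECTOR_SIZE
    per_cluster = 1
    # Double the cluster size until the number of clusters fits the address space.
    while sectors / per_cluster > FAT16_ADDRESSES:
        per_cluster *= 2
    return per_cluster


print(1_000_000_000 // SECTOR_SIZE)            # 1953125 sectors on a 1GB drive
print(min_sectors_per_cluster(1_000_000_000))  # 32 sectors, i.e. 16 KB clusters
```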
RAM, File & Disk Slack
Now having introduced all this, what is RAM Slack and File Slack?
RAM Slack is the space between the file’s end of file (EOF) and the end of the last sector the file occupies (note the newer 4K sector standard).
File Slack is the unused sectors from the end of the file’s last sector to the end of the cluster (block).
A cluster (block) is a group of sectors on a hard disk drive that is addressed as one logical unit by the operating system. In computer file systems, a cluster is the unit of disk space allocation for files and directories.
Disk Slack is the total amount of File Slack in a storage device. This is waste and may or may not be an issue.
http://whereismydata.wordpress.com/2009/04/25/forensics-ram-slack-and-file-slack/
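Using the definitions above, both kinds of slack for a single file can be computed from its size. This is a minimal sketch: the function name and the default of 8 sectors per cluster are assumptions chosen for illustration.

```python
def slack(file_size: int, sector_size: int = 512, sectors_per_cluster: int = 8):
    """Return (ram_slack_bytes, file_slack_bytes) for a file of file_size bytes.

    The defaults (512-byte sectors, 8 sectors per cluster) are illustrative only.
    """
    if file_size == 0:
        return 0, 0  # an empty file occupies no cluster in this simple model
    cluster_size = sector_size * sectors_per_cluster
    # RAM slack: bytes from EOF to the end of the file's last sector.
    ram_slack = (-file_size) % sector_size
    # File slack: whole unused sectors from there to the end of the cluster.
    file_slack = (-(file_size + ram_slack)) % cluster_size
    return ram_slack, file_slack


print(slack(1000))   # (24, 3072): 24 bytes of RAM slack, then 6 unused sectors
print(slack(4096))   # (0, 0): the file fills its cluster exactly
```

Summing the second value over every file on a volume gives the Disk Slack described above.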
Steganography
Steganography is the technique of hiding secret data within an ordinary, non-secret, file or message (e.g. files, pictures, network streams, etc.) in order to avoid detection; the secret data is then extracted at its destination.
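One common way to hide data in an ordinary file is to overwrite the least significant bit (LSB) of each cover byte, for example the raw pixel bytes of an image. The Python sketch below works on plain byte strings to stay self-contained. The function names are hypothetical, and it assumes the receiver already knows how many bytes were hidden.

```python
def hide(cover: bytes, secret: bytes) -> bytes:
    """Store each bit of secret in the lowest bit of successive cover bytes."""
    bits = [(byte >> i) & 1 for byte in secret for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this secret")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit   # clear the LSB, then set it to the secret bit
    return bytes(out)


def reveal(stego: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of hidden data from the lowest bits of stego."""
    out = bytearray()
    for k in range(n_bytes):
        value = 0
        for b in stego[8 * k: 8 * k + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


cover = bytes(range(256))        # stand-in for raw image data
stego = hide(cover, b"hi")
print(reveal(stego, 2))          # b'hi'
```

Each cover byte changes by at most 1, which is why the altered file looks the same to a casual viewer.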
Cryptography
Cryptography is a method of protecting information and communications through the use of codes so that only those for whom the information is intended can read and process it.
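In contrast to steganography, cryptography does not hide that a message exists; it makes the message unreadable without the key. As a toy illustration only, here is a one-time pad in Python: the message is XORed with a random key of the same length, and XORing again with the same key recovers it. The function name is hypothetical, and this sketch is not a substitute for a vetted cryptographic library.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length (one-time pad; never reuse the key)."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
key = secrets.token_bytes(len(message))   # random key shared only with the recipient
ciphertext = xor_bytes(message, key)      # unreadable without the key
recovered = xor_bytes(ciphertext, key)    # the same operation decrypts
print(recovered == message)               # True
```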
Prof JGL Digital Forensics File Formats Demo
Now a real quick “look under the hood”. Recall from the lecture and reading that we have logical and physical perspectives. We have become a nation full of users who only access the top-level logical perspective, so let’s look at some Windows SysAdmin features and files, and at the file system, from the physical perspective.
Here is a nice resource on binary formats and their analysis (i.e. reverse engineering): Binary File Format Analysis