By default, the program uses the configuration files in the directory where The Sleuth Kit was installed. These can be overridden with run-time options. There is a standard configuration file that applies to all file system types, plus a specific one for a given operating system.
The options are as follows:
The program then runs the ’fls’ tool in The Sleuth Kit to identify the files in the file system image. The content of each identified file is read using the ’icat’ tool. If a hash database is given, the hash of the file is calculated and looked up. If it is found in an ’alert’ database, then it is added to a special ’alert.txt’ file. If it is found in the NSRL or ’exclude’ database, then it is ignored as a known good file. Excluded files are recorded in an ’exclude’ file for future reference, but they are not saved in the category files.
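The hash lookup described above can be sketched as follows. This is only an illustration, not the code used by ’sorter’: the function name, the use of MD5, the in-memory sets standing in for the hash databases, and the order of the checks (alert before exclude) are all assumptions.

```python
import hashlib

def classify_by_hash(data, alert, exclude, nsrl):
    """Decide what happens to one file's content based on its hash.

    alert, exclude and nsrl are sets of hex digests standing in for the
    hash databases (an assumption made for this sketch).
    """
    digest = hashlib.md5(data).hexdigest()  # MD5 is assumed here
    if digest in alert:
        return "alert"       # would be recorded in 'alert.txt'
    if digest in exclude or digest in nsrl:
        return "exclude"     # known good: recorded in the 'exclude' file only
    return "categorize"      # continue with the 'file' based category rules
```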
The ’file’ command is then run to identify the file type (based on header information). The configuration file rules are used to identify which category it belongs to. An entry is added to the corresponding category file (in the ’-d dir’ directory). If the ’-s’ flag is given, then a copy of the file is saved in a subdirectory of the same name as the category. If the HTML format is used, then hyper-links will allow one to easily view saved files and view what is in each category.
Files that do not have a category are recorded in the ’unknown’ category or the ’data’ category. ’data’ is for files with a structure that ’file’ does not know, and ’unknown’ is for files with a structure that ’file’ knows about but that no category rule matches. These are saved for future reference, but the unknown category can be ignored by using the ’-U’ flag.
A copy of the files can be saved by using the ’-s’ flag. If so, then the files are saved in a subdirectory that is named with the category name. Each file is named using the file system image name followed by the meta data address and the original file extension. The category index file can be used to translate the actual name to the saved name. The HTML format makes viewing easier as there are links to each file from the category index file.
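As a rough illustration of the naming scheme above, the sketch below builds a saved path from the image name, the meta data address, and the original extension. The function name and the ’-’ separator between the image name and the address are assumptions; the exact format produced by ’sorter’ may differ.

```python
import os

def saved_name(image_path, meta_addr, original_name, category):
    """Build '<category>/<image>-<addr>[.<ext>]' for a saved copy.

    The '-' separator is an assumption made for this sketch.
    """
    image = os.path.basename(image_path)
    # rpartition returns an empty separator when the name has no '.'
    _, dot, ext = os.path.basename(original_name).rpartition(".")
    suffix = "." + ext if dot and ext else ""
    return f"{category}/{image}-{meta_addr}{suffix}"
```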
The program will also consult the rules about the file extension. If the file name has an extension (anything after a ’.’), it will be compared to the rules. If the extension is not found in the rules as a valid extension for the file type, the file will be added to the ’mismatch’ file. If the file does not have an extension, it will not be entered, even if the file type has valid extensions. This check is done even if the file is found in one of the known good hash databases. If it is found in one of those, it will be added to a special file. Files of type ’data’ have no extension checks done by default (as they have an unknown structure).
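The extension check can be pictured with the short sketch below. It is not taken from ’sorter’ itself: the function name and return labels are hypothetical, and the set of valid extensions is assumed to have been looked up from the extension rules already.

```python
import os

def check_extension(filename, file_output, valid_exts):
    """Return 'skipped', 'no-extension', 'ok' or 'mismatch' (labels are hypothetical).

    valid_exts is the set of lower-case extensions that the rules allow
    for this 'file' output.
    """
    if file_output.strip().lower() == "data":
        return "skipped"              # 'data' files are not checked by default
    # "anything after a '.'" in the name counts as the extension
    _, dot, ext = os.path.basename(filename).rpartition(".")
    if not dot or not ext:
        return "no-extension"         # never recorded as a mismatch
    return "ok" if ext.lower() in valid_exts else "mismatch"
```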
The program repeats the above procedures using the output of the ’ils’ command as well. This allows ’sorter’ to examine the contents of unallocated files that still have pointers to the data units (not all file systems will produce data from this step).
The ’default.sort’ file is used by any file system type. It contains entries for common file types. A specific operating system file also exists, which is useful for extensions that are specific to a given OS. By default, the default file and the OS specific one will be used. Using the ’-c’ flag, an additional file can be used. If the ’-C’ flag is used, then only the supplied configuration file is used.
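The selection of configuration files described above could be modeled as below. The function and parameter names are hypothetical, ’windows.sort’ in the usage line is only an assumed name for an OS specific file, and the order in which the files are read is an assumption.

```python
def select_config_files(default_file, os_file, c_file=None, C_file=None):
    """Return the list of configuration files to read.

    c_file models the argument of '-c' (an additional file) and
    C_file models the argument of '-C' (use only this file).
    """
    if C_file is not None:
        return [C_file]
    files = [default_file, os_file]
    if c_file is not None:
        files.append(c_file)
    return files

# Example: only the supplied file is used when '-C' is given.
print(select_config_files("default.sort", "windows.sort", C_file="images.sort"))
```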
There are two rule types in the configuration files. Each rule starts with a header that specifies which rule type it is (category or ext). Both rule types have two additional columns that can be separated by any white space.
The category rule has the category name as the second column and a Perl regular expression in the third column. The category name cannot contain spaces and may consist only of letters and numbers. The regular expression is used to examine the output of ’file’ and is applied case-insensitively. More than one rule can exist for a category, but only one category can exist for a given file output. For example:
This saves all file output with ’image data’ anywhere in it to the ’images’ category:
category images image data
This saves all file output that has ’ASCII’ followed by anything and then ’text’ to the ’text’ category:
category text ASCII(.*?)text
This saves all file output that is just ’data’ to the ’data’ category (the ^ and $ define the boundaries in Perl). The ’data’ value is common in the output of ’file’ for unknown binary data:
category data ^data$
There is a special category of ’ignore’ that is used to skip over files of this type. This is mainly a time and space saver.
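To make the category rules concrete, here is a minimal sketch that parses rules in the format described above and applies them to the output of ’file’. It is written in Python rather than Perl, and it assumes that the first matching rule wins and that unmatched output falls into ’unknown’. The function names are hypothetical.

```python
import re

RULES_TEXT = """\
category images image data
category text   ASCII(.*?)text
category data   ^data$
"""

def parse_category_rules(text):
    """Return a list of (category name, compiled regex) pairs."""
    rules = []
    for line in text.splitlines():
        # Three columns: header, name, regex (the regex may contain spaces).
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] == "category":
            rules.append((parts[1], re.compile(parts[2], re.IGNORECASE)))
    return rules

def categorize(file_output, rules):
    """Return the first matching category, or 'unknown' if none matches."""
    for name, pattern in rules:
        if pattern.search(file_output):
            return name
    return "unknown"

RULES = parse_category_rules(RULES_TEXT)
print(categorize("JPEG image data, JFIF standard 1.01", RULES))
```

A caller would skip any file for which the returned name is ’ignore’.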
The extension rule is similar, except that the second column lists the valid extensions for the file output, separated by commas. Multiple rules can exist for the same file type. The comparison is done case-insensitively. If no extension is valid for the file type, a rule does not need to be made. That is already assumed.
For example, ASCII text is used with several file extensions, so the following rules could exist:
ext txt,log ASCII(.*?)text
ext c,cpp,h,js ASCII(.*?)text
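The two rules above apply to the same file output, so their extension lists effectively add up. The sketch below shows one way to read such rules and collect the valid extensions for a given ’file’ output. The function names are hypothetical and this is not the parser used by ’sorter’.

```python
import re

EXT_RULES_TEXT = """\
ext txt,log      ASCII(.*?)text
ext c,cpp,h,js   ASCII(.*?)text
"""

def parse_ext_rules(text):
    """Return a list of (set of extensions, compiled regex) pairs."""
    rules = []
    for line in text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] == "ext":
            extensions = {e.lower() for e in parts[1].split(",") if e}
            rules.append((extensions, re.compile(parts[2], re.IGNORECASE)))
    return rules

def valid_extensions(file_output, rules):
    """Union of the extensions from every rule whose regex matches."""
    valid = set()
    for extensions, pattern in rules:
        if pattern.search(file_output):
            valid |= extensions
    return valid

EXT_RULES = parse_ext_rules(EXT_RULES_TEXT)
print(sorted(valid_extensions("ASCII C program text", EXT_RULES)))
```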
Please email me any rules that you find useful for standard investigations and I will incorporate them into future releases (carrier at sleuthkit dot org).
To run sorter with no hash databases, any of the following could be used:
# sorter -f ntfs -d data/sorter images/hda1.dd
# sorter -d data/sorter images/hda1.dd
# sorter -i raw -f ntfs -o 63 -d data/sorter images/hda.dd
To include the NSRL, an exclude, and an alert hash database:
# sorter -f ntfs -d data/sorter -a /usr/hash/rootkit.db \
    -x /usr/hash/win2k.db -n /usr/hash/nsrl/NSRLFile.txt \
    images/hda1.dd
To just identify images using the supplied ’images.sort’ file:
# sorter -f ntfs -C /usr/local/sleuthkit/share/sort/images.sort \
    -d data/sorter -h -s images/hda1.dd
Send documentation updates to <doc-updates at sleuthkit dot org>