Before the database can be used by ’hfind’, an index file must be created with the ’-i’ option.
The index is needed for efficiency. Most text-based hash databases do not have fixed-length entries, and some are not sorted. ’hfind’ creates an index file that is sorted and has fixed-length entries. This allows fast lookups with a binary search instead of a linear search such as ’grep’.
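As a rough illustration of why sorted, fixed-length entries matter, the sketch below runs a binary search directly over such a file. It is not hfind's actual code: the record layout (32 hex characters plus a newline), the constant RECORD_LEN, and the function name binary_search are assumptions made for the example.

```python
import io

RECORD_LEN = 33  # assumed layout: 32 hex characters plus a newline


def binary_search(index, target):
    """Return the record number of target in a sorted fixed-length index, or -1."""
    index.seek(0, io.SEEK_END)
    low, high = 0, index.tell() // RECORD_LEN - 1
    while low <= high:
        mid = (low + high) // 2
        # Fixed-length records let us jump straight to record number mid.
        index.seek(mid * RECORD_LEN)
        value = index.read(RECORD_LEN).rstrip(b"\n")
        if value == target:
            return mid
        if value < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1


# Build a small in-memory index of sorted, fixed-length entries.
hashes = sorted(b"%032x" % n for n in (7, 42, 1000, 65535, 123456789))
index = io.BytesIO(b"".join(h + b"\n" for h in hashes))

print(binary_search(index, b"%032x" % 1000))  # record number of a known entry
print(binary_search(index, b"%032x" % 8))     # -1, the value is not indexed
```

Each probe halves the remaining range, so the number of reads grows with the logarithm of the number of entries, whereas a linear scan may have to read every line.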
The resulting index file is named after the database file. The name is the original file name, followed by the hash type (sha1 or md5), followed by ’.idx’. For example, creating an MD5 hash index of the NIST NSRL results in ’NSRLFile.txt-md5.idx’ and creating a SHA-1 index results in ’NSRLFile.txt-sha1.idx’.
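The naming rule can be written out as a small helper. The function index_name is hypothetical and exists only to restate the rule; it is not part of hfind.

```python
def index_name(db_path, hash_type):
    """Build the index file name: database name, hash type, then '.idx'."""
    if hash_type not in ("md5", "sha1"):
        raise ValueError("hash type must be 'md5' or 'sha1'")
    return f"{db_path}-{hash_type}.idx"


print(index_name("NSRLFile.txt", "md5"))
print(index_name("NSRLFile.txt", "sha1"))
```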
The file has two columns. Each entry is sorted by the first column, which is the hash value. The second column has the byte offset of the corresponding entry in the original file. So, when a hash is found in the index, the offset is recorded and then ’hfind’ seeks to the entry in the original database.
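The two-column idea can be sketched as follows. This is a simplified model, not hfind's on-disk format: the index is kept as an in-memory list of (hash, offset) pairs, the database is a two-line sample in md5sum output format, and the names build_index and lookup are made up for the example.

```python
import bisect
import io

# A tiny sample database in md5sum output format (hash, then file name).
database = io.BytesIO(
    b"928682269cd3edb1acdf9a7f7e606ff2  /bin/bash\n"
    b"76b1f4de1522c20b67acc132937cf82e  test.txt\n"
)


def build_index(db):
    """Return (hash, byte offset) pairs sorted by hash."""
    entries = []
    db.seek(0)
    while True:
        offset = db.tell()  # where this entry starts in the database
        line = db.readline()
        if not line:
            break
        entries.append((line.split()[0], offset))
    return sorted(entries)


def lookup(db, index, wanted):
    """Find wanted in the index, then seek to its entry in the database."""
    pos = bisect.bisect_left(index, (wanted, 0))
    if pos == len(index) or index[pos][0] != wanted:
        return None
    db.seek(index[pos][1])  # jump to the recorded byte offset
    return db.readline().rstrip(b"\n")


index = build_index(database)
print(index)
print(lookup(database, index, b"76b1f4de1522c20b67acc132937cf82e"))
print(lookup(database, index, b"00000000000000000000000000000000"))
```

The index holds only the hash and an offset, so it stays small and uniform while the full entry, whatever its length, is read from the original database on demand.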
The following input types are valid. For NSRL, ’nsrl-md5’ and ’nsrl-sha1’ can be used. The difference is which hash value the index is sorted by. The ’md5sum’ value can also be used to sort and index "home made" databases. ’hfind’ can take data in both common formats:
MD5 (test.txt) = 76b1f4de1522c20b67acc132937cf82e
and
76b1f4de1522c20b67acc132937cf82e test.txt
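A parser that accepts both layouts shown above might look like the sketch below. The regular expressions and the function name parse_entry are assumptions for illustration; hfind's own parsing may differ in the details it accepts.

```python
import re

# "MD5 (name) = hash" layout.
PAREN_STYLE = re.compile(r"^MD5 \((?P<name>.+)\) = (?P<hash>[0-9a-fA-F]{32})$")
# "hash name" layout.
PLAIN_STYLE = re.compile(r"^(?P<hash>[0-9a-fA-F]{32})\s+(?P<name>.+)$")


def parse_entry(line):
    """Return (hash, file name) for either layout, or None if neither matches."""
    line = line.rstrip("\n")
    for pattern in (PAREN_STYLE, PLAIN_STYLE):
        match = pattern.match(line)
        if match:
            return match.group("hash").lower(), match.group("name")
    return None


print(parse_entry("MD5 (test.txt) = 76b1f4de1522c20b67acc132937cf82e"))
print(parse_entry("76b1f4de1522c20b67acc132937cf82e test.txt"))
```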
To create an MD5 index file for the NIST NSRL:
# hfind -i nsrl-md5 /usr/local/hash/nsrl/NSRLFile.txt
To look up a value in the NSRL:
# hfind /usr/local/hash/nsrl/NSRLFile.txt 76b1f4de1522c20b67acc132937cf82e
76b1f4de1522c20b67acc132937cf82e Hash Not Found
You can even do both SHA-1 and MD5 if you want:
# hfind -i nsrl-sha1 /usr/local/hash/nsrl/NSRLFile.txt
# hfind /usr/local/hash/nsrl/NSRLFile.txt 76b1f4de1522c20b67acc132937cf82e 80001A80B3F1B80076B297CEE8805AAA04E1B5BA
76b1f4de1522c20b67acc132937cf82e Hash Not Found
80001A80B3F1B80076B297CEE8805AAA04E1B5BA thrdcore.cpp
To make a database of critical binaries of a trusted system, use ’md5sum’:
# md5sum /bin/* /sbin/* /usr/bin/* /usr/sbin/* /usr/local/bin/* /usr/local/sbin/* > system.md5
# hfind -i md5sum system.md5
To look entries up, the following will work:
# hfind system.md5 76b1f4de1522c20b67acc132937cf82e
76b1f4de1522c20b67acc132937cf82e Hash Not Found
or
# md5sum -q /bin/* | hfind system.md5
928682269cd3edb1acdf9a7f7e606ff2 /bin/bash
<...>
or
# md5sum -q /bin/* > bin.md5
# hfind -f bin.md5 system.md5
928682269cd3edb1acdf9a7f7e606ff2 /bin/bash
<...>
The NIST National Software Reference Library (NSRL) can be found at www.nsrl.nist.gov.
Send documentation updates to <doc-updates at sleuthkit dot org>