Autopsy 4.21.0
Graphical digital forensics platform for The Sleuth Kit and other tools.
org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule Class Reference
Inherits org.sleuthkit.autopsy.ingest.FileIngestModule.
Classes | |
| enum | IngestStatus |
| enum | StringsExtractOptions |
Public Member Functions | |
| ProcessResult | process (AbstractFile abstractFile) |
| void | shutDown () |
| void | startUp (IngestJobContext context) throws IngestModuleException |
Private Member Functions | |
| BlackboardAttribute | checkAttribute (BlackboardAttribute.ATTRIBUTE_TYPE attrType, String value) |
| void | cleanup () |
| void | createMetadataArtifact (AbstractFile aFile, Map< String, String > metadata) |
| boolean | extractStringsAndIndex (AbstractFile aFile) |
| boolean | extractTextAndSearch (Optional< TextExtractor > extractorOptional, AbstractFile aFile, Map< String, String > extractedMetadata) throws IngesterException |
| Optional< TextExtractor > | getExtractor (AbstractFile abstractFile) |
| Reader | getTikaOrTextExtractor (Optional< TextExtractor > extractorOptional, AbstractFile aFile, Map< String, String > extractedMetadata) throws TextExtractor.InitReaderException |
| boolean | isLimitedOCRFile (AbstractFile aFile, String mimeType) |
| void | postIndexSummary () |
| void | searchFile (Optional< TextExtractor > extractor, AbstractFile aFile, String mimeType, boolean indexContent) |
| boolean | searchTextFile (AbstractFile aFile) |
Static Private Member Functions | |
| static void | putIngestStatus (long ingestJobId, long fileId, IngestStatus status) |
Private Attributes | |
| IngestJobContext | context |
| FileTypeDetector | fileTypeDetector |
| Ingester | ingester = null |
| boolean | initialized = false |
| int | instanceNum = 0 |
| long | jobId |
| final IngestServices | services = IngestServices.getInstance() |
| final KeywordSearchJobSettings | settings |
| Lookup | stringsExtractionContext |
Static Private Attributes | |
| static final String | IMAGE_MIME_TYPE_PREFIX = "image/" |
| static final Map< Long, Map< Long, IngestStatus > > | ingestStatus = new HashMap<>() |
| static final AtomicInteger | instanceCount = new AtomicInteger(0) |
| static final int | LIMITED_OCR_SIZE_MIN = 100 * 1024 |
| static final Logger | logger = Logger.getLogger(KeywordSearchIngestModule.class.getName()) |
| static final Map< String, Pair< BlackboardAttribute.ATTRIBUTE_TYPE, Integer > > | METADATA_TYPES_MAP |
| static final ImmutableSet< String > | OCR_DOCUMENTS |
| static final IngestModuleReferenceCounter | refCounter = new IngestModuleReferenceCounter() |
A file-level ingest module. It indexes allocated files of Solr-supported types, and performs string extraction and indexing on unallocated files and files of types Solr does not support. The index is committed periodically, at an interval determined by the user-set ingest update interval. The module runs a periodic keyword / regular expression search on the lists currently configured for ingest and writes the results to the blackboard. It reports interesting events to the Inbox and to viewers.
Definition at line 105 of file KeywordSearchIngestModule.java.
| private BlackboardAttribute | checkAttribute (BlackboardAttribute.ATTRIBUTE_TYPE attrType, String value) |
Create a metadata blackboard attribute based on specified content.
| attrType | The attribute type. |
| key | The key for the attribute. |
| value | The value of the attribute. |
Definition at line 752 of file KeywordSearchIngestModule.java.
References org::sleuthkit::datamodel::BlackboardAttribute::TSK_BLACKBOARD_ATTRIBUTE_VALUE_TYPE.DATETIME.
| private void | cleanup () |
Common cleanup code, run when the module stops or the final searcher completes.
Definition at line 510 of file KeywordSearchIngestModule.java.
| private void | createMetadataArtifact (AbstractFile aFile, Map< String, String > metadata) |
Gets the best-matched metadata for each attribute type found in the metadata map, bumping out lower-priority matches. Internally, a map relates each attribute type to a pair of the priority (a lower number is a higher priority) and the string value for the attribute.
Definition at line 686 of file KeywordSearchIngestModule.java.
References org::sleuthkit::datamodel::SleuthkitCase.getBlackboard(), org.sleuthkit.autopsy.casemodule.Case.getCurrentCaseThrows(), org::sleuthkit::datamodel::AbstractContent.getName(), org::sleuthkit::datamodel::AbstractFile.getParentPath(), org.sleuthkit.autopsy.casemodule.Case.getSleuthkitCase(), org::sleuthkit::datamodel::AbstractFile.newDataArtifact(), org::sleuthkit::datamodel::Blackboard.postArtifacts(), and org::sleuthkit::datamodel::BlackboardArtifact::ARTIFACT_TYPE.TSK_METADATA.
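The priority rule described for createMetadataArtifact can be sketched in isolation. This is a minimal, self-contained sketch and not the module's code: the class name, the TypeAndPriority and PrioritizedValue helpers, and the three table entries are hypothetical stand-ins, and plain strings replace BlackboardAttribute.ATTRIBUTE_TYPE. The actual contents of METADATA_TYPES_MAP are not shown on this page.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of "keep the highest-priority value per attribute type". */
final class MetadataPrioritySketch {

    /** Hypothetical stand-in for Pair<ATTRIBUTE_TYPE, Integer>. */
    static final class TypeAndPriority {
        final String attributeType;
        final int priority; // lower number = higher priority

        TypeAndPriority(String attributeType, int priority) {
            this.attributeType = attributeType;
            this.priority = priority;
        }
    }

    /** Hypothetical (priority, value) pair tracked per attribute type. */
    static final class PrioritizedValue {
        final int priority;
        final String value;

        PrioritizedValue(int priority, String value) {
            this.priority = priority;
            this.value = value;
        }
    }

    // Illustrative entries only; not the module's actual table.
    static final Map<String, TypeAndPriority> TYPES = new HashMap<>();
    static {
        TYPES.put("Last-Modified", new TypeAndPriority("MODIFIED_TIME", 1));
        TYPES.put("modified", new TypeAndPriority("MODIFIED_TIME", 2));
        TYPES.put("Author", new TypeAndPriority("OWNER", 1));
    }

    /** Returns attribute type -> best value, ignoring unknown keys and blank values. */
    static Map<String, String> bestMatches(Map<String, String> metadata) {
        Map<String, PrioritizedValue> best = new HashMap<>();
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            TypeAndPriority tp = TYPES.get(entry.getKey());
            String value = entry.getValue();
            if (tp == null || value == null || value.trim().isEmpty()) {
                continue;
            }
            PrioritizedValue current = best.get(tp.attributeType);
            // A new candidate bumps out the current one only if its priority number is lower.
            if (current == null || tp.priority < current.priority) {
                best.put(tp.attributeType, new PrioritizedValue(tp.priority, value));
            }
        }
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, PrioritizedValue> entry : best.entrySet()) {
            result.put(entry.getKey(), entry.getValue().value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put("modified", "2020-01-01");
        metadata.put("Last-Modified", "2021-06-30");
        metadata.put("Author", "alice");
        System.out.println(bestMatches(metadata));
    }
}
```

In this sketch the value under "Last-Modified" wins over the one under "modified" because both keys map to the same attribute type and the former has the lower priority number.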
| private boolean | extractStringsAndIndex (AbstractFile aFile) |
Extract strings using heuristics from the file and add to index.
| aFile | file to extract strings from, divide into chunks and index |
Definition at line 804 of file KeywordSearchIngestModule.java.
References org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.context, org.sleuthkit.autopsy.ingest.IngestJobContext.fileIngestIsCancelled(), org::sleuthkit::datamodel::AbstractContent.getId(), org::sleuthkit::datamodel::AbstractContent.getName(), org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.SKIPPED_ERROR_INDEXING, and org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.STRINGS_INGESTED.
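| private boolean | extractTextAndSearch (Optional< TextExtractor > extractorOptional, AbstractFile aFile, Map< String, String > extractedMetadata) throws IngesterException |
File indexer. Processes and indexes known/allocated files, unknown/unallocated files, and directories accordingly. Extracts text from the file by streaming, using Tika or another text extraction module, then divides the file into chunks and indexes the chunks.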
| extractorOptional | The textExtractor to use with this file or empty. |
| aFile | file to extract strings from, divide into chunks and index |
| extractedMetadata | Map that will be populated with the file's metadata. |
| IngesterException | exception thrown if indexing failed |
Definition at line 631 of file KeywordSearchIngestModule.java.
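The "stream, divide into chunks, index the chunks" step can be illustrated with a small sketch. It is a simplified assumption, not the Autopsy chunker: ChunkingSketch and readChunks are hypothetical names, the split is a plain fixed-size cut with no attention to word or sentence boundaries, and the chunk size is a caller-supplied parameter rather than the value the module uses.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Sketch of reading streamed text into fixed-size chunks. */
final class ChunkingSketch {

    /** Reads the whole reader and returns its text cut into chunks of at most chunkSize chars. */
    static List<String> readChunks(Reader reader, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        char[] buffer = new char[chunkSize];
        int filled = 0;
        try {
            int n;
            // Keep filling the buffer; a single read() may return fewer chars than requested.
            while ((n = reader.read(buffer, filled, chunkSize - filled)) != -1) {
                filled += n;
                if (filled == chunkSize) {
                    chunks.add(new String(buffer, 0, filled));
                    filled = 0;
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        // Emit the final, partially filled chunk if any text remains.
        if (filled > 0) {
            chunks.add(new String(buffer, 0, filled));
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String chunk : readChunks(new StringReader("extracted text from a file"), 8)) {
            // A real indexer would send each chunk to the index here.
            System.out.println("[" + chunk + "]");
        }
    }
}
```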
| private Optional< TextExtractor > | getExtractor (AbstractFile abstractFile) |
Definition at line 600 of file KeywordSearchIngestModule.java.
References org.sleuthkit.autopsy.ingest.IngestJobContext.fileIngestIsCancelled(), org.sleuthkit.autopsy.textextractors.TextExtractorFactory.getExtractor(), org.sleuthkit.autopsy.keywordsearch.KeywordSearchJobSettings.isOCREnabled(), and org.sleuthkit.autopsy.textextractors.configs.ImageConfig.setOCREnabled().
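| private Reader | getTikaOrTextExtractor (Optional< TextExtractor > extractorOptional, AbstractFile aFile, Map< String, String > extractedMetadata) throws TextExtractor.InitReaderException |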
Definition at line 652 of file KeywordSearchIngestModule.java.
References org.sleuthkit.autopsy.textextractors.TextExtractor.getMetadata(), and org.sleuthkit.autopsy.textextractors.TextExtractor.getReader().
| private boolean | isLimitedOCRFile (AbstractFile aFile, String mimeType) |
Returns true if file should have OCR performed on it when limited OCR setting is specified.
| aFile | The abstract file. |
| mimeType | The file mime type. |
Definition at line 525 of file KeywordSearchIngestModule.java.
References org::sleuthkit::datamodel::TskData::TSK_DB_FILES_TYPE_ENUM.DERIVED, org::sleuthkit::datamodel::AbstractFile.getSize(), and org::sleuthkit::datamodel::AbstractFile.getType().
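This page does not show the body of isLimitedOCRFile, but the constants and referenced calls (IMAGE_MIME_TYPE_PREFIX, LIMITED_OCR_SIZE_MIN, OCR_DOCUMENTS, getSize(), getType(), DERIVED) suggest the shape of the check. The sketch below is one plausible reading under stated assumptions: the contents of OCR_DOCUMENTS, the way the conditions combine, and the flattened parameters (size and a derived flag instead of an AbstractFile) are all hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Sketch of a "limited OCR" eligibility check. */
final class LimitedOcrSketch {

    // These two values appear in the attribute list on this page.
    static final String IMAGE_MIME_TYPE_PREFIX = "image/";
    static final int LIMITED_OCR_SIZE_MIN = 100 * 1024;

    // Assumed contents; the real OCR_DOCUMENTS set is not shown on this page.
    static final Set<String> OCR_DOCUMENTS;
    static {
        Set<String> docs = new HashSet<>();
        docs.add("application/pdf");
        OCR_DOCUMENTS = Collections.unmodifiableSet(docs);
    }

    /**
     * Assumed rule: document types in OCR_DOCUMENTS always qualify; images
     * qualify when they exceed the minimum size or are derived files.
     */
    static boolean isLimitedOcrFile(String mimeType, long sizeBytes, boolean derived) {
        if (OCR_DOCUMENTS.contains(mimeType)) {
            return true;
        }
        if (mimeType.startsWith(IMAGE_MIME_TYPE_PREFIX)) {
            return sizeBytes > LIMITED_OCR_SIZE_MIN || derived;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isLimitedOcrFile("image/png", 4 * 1024, false));
        System.out.println(isLimitedOcrFile("image/png", 4 * 1024, true));
    }
}
```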
| private void | postIndexSummary () |
Posts an inbox message with a summary of the indexed files.
Definition at line 541 of file KeywordSearchIngestModule.java.
References org.sleuthkit.autopsy.ingest.IngestMessage.createMessage(), org.sleuthkit.autopsy.coreutils.MessageNotifyUtil.Notify.error(), org.sleuthkit.autopsy.ingest.IngestMessage.MessageType.INFO, org.sleuthkit.autopsy.ingest.IngestServices.postMessage(), and org.sleuthkit.autopsy.coreutils.MessageNotifyUtil.Notify.warn().
| ProcessResult org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.process | ( | AbstractFile | file | ) |
Processes a file. Called between calls to startUp() and shutDown(). Will be called for each file in a data source.
IMPORTANT: In addition to returning ProcessResult.OK or ProcessResult.ERROR, modules should log all errors using methods provided by the org.sleuthkit.autopsy.coreutils.Logger class. Log messages should include the name and object ID of the data being processed and any other information that would be useful for debugging. If an exception has been caught by the module, the exception should be sent to the logger along with the log message so that a stack trace will appear in the application log.
| file | The file to analyze. |
Implements org.sleuthkit.autopsy.ingest.FileIngestModule.
Definition at line 411 of file KeywordSearchIngestModule.java.
References org.sleuthkit.autopsy.ingest.IngestJobContext.fileIngestIsCancelled(), org::sleuthkit::datamodel::AbstractContent.getId(), org::sleuthkit::datamodel::AbstractFile.getKnown(), org.sleuthkit.autopsy.modules.filetypeid.FileTypeDetector.getMIMEType(), org::sleuthkit::datamodel::AbstractContent.getName(), org::sleuthkit::datamodel::AbstractFile.getType(), org.sleuthkit.autopsy.keywordsearch.KeywordSearchJobSettings.isOCREnabled(), org::sleuthkit::datamodel::TskData::FileKnown.KNOWN, org.sleuthkit.autopsy.ingest.IngestModule.ProcessResult.OK, org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.SKIPPED_ERROR_INDEXING, and org::sleuthkit::datamodel::TskData::TSK_DB_FILES_TYPE_ENUM.VIRTUAL_DIR.
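The logging guidance for process() can be sketched as follows. This is an illustration under assumptions, not the module's implementation: java.util.logging.Logger stands in for org.sleuthkit.autopsy.coreutils.Logger, the ProcessResult enum is a local stand-in, and FileWork, describe, and the message format are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Sketch of logging an error with file name, object ID, and the caught exception. */
final class ProcessLoggingSketch {

    /** Local stand-in for IngestModule.ProcessResult. */
    enum ProcessResult { OK, ERROR }

    /** Hypothetical unit of per-file work that may fail. */
    interface FileWork {
        void run() throws Exception;
    }

    private static final Logger logger = Logger.getLogger(ProcessLoggingSketch.class.getName());

    /** Builds the identifying part of a log message: name plus object ID. */
    static String describe(String fileName, long objectId) {
        return String.format("'%s' (objId=%d)", fileName, objectId);
    }

    static ProcessResult process(String fileName, long objectId, FileWork work) {
        try {
            work.run();
            return ProcessResult.OK;
        } catch (Exception ex) {
            // Passing the exception to the logger puts its stack trace in the log.
            logger.log(Level.SEVERE, "Error processing file " + describe(fileName, objectId), ex);
            return ProcessResult.ERROR;
        }
    }

    public static void main(String[] args) {
        ProcessResult result = process("report.docx", 1234L, () -> {
            throw new IllegalStateException("simulated extraction failure");
        });
        System.out.println(result);
    }
}
```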
| private static void | putIngestStatus (long ingestJobId, long fileId, IngestStatus status) |
Records the ingest status for a given file for a given ingest job. Used for final statistics at the end of the job.
| ingestJobId | id of ingest job |
| fileId | id of file |
| status | ingest status of the file |
Definition at line 273 of file KeywordSearchIngestModule.java.
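The nested map behind putIngestStatus (ingest job ID to file ID to status) can be sketched as below. The sketch assumes synchronization on the shared map and adds a hypothetical summarize helper to show how end-of-job statistics could be derived; neither detail is confirmed by this page. The enum lists only the IngestStatus values named on this page, so the real enum may contain more.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

/** Sketch of per-job, per-file ingest status bookkeeping. */
final class IngestStatusSketch {

    /** Only the values that appear on this page. */
    enum IngestStatus {
        TEXT_INGESTED,
        STRINGS_INGESTED,
        METADATA_INGESTED,
        SKIPPED_ERROR_TEXTEXTRACT,
        SKIPPED_ERROR_INDEXING
    }

    // job ID -> (file ID -> latest status)
    private static final Map<Long, Map<Long, IngestStatus>> ingestStatus = new HashMap<>();

    /** Records the status of one file for one job; a later call for the same file overwrites. */
    static void putIngestStatus(long ingestJobId, long fileId, IngestStatus status) {
        synchronized (ingestStatus) { // assumed locking strategy
            Map<Long, IngestStatus> perJob = ingestStatus.get(ingestJobId);
            if (perJob == null) {
                perJob = new HashMap<>();
                ingestStatus.put(ingestJobId, perJob);
            }
            perJob.put(fileId, status);
        }
    }

    /** Hypothetical helper: counts files per status for one job. */
    static Map<IngestStatus, Integer> summarize(long ingestJobId) {
        Map<IngestStatus, Integer> counts = new EnumMap<>(IngestStatus.class);
        synchronized (ingestStatus) {
            Map<Long, IngestStatus> perJob = ingestStatus.get(ingestJobId);
            if (perJob != null) {
                for (IngestStatus status : perJob.values()) {
                    counts.merge(status, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        putIngestStatus(1L, 10L, IngestStatus.TEXT_INGESTED);
        putIngestStatus(1L, 11L, IngestStatus.STRINGS_INGESTED);
        System.out.println(summarize(1L));
    }
}
```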
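| private void | searchFile (Optional< TextExtractor > extractor, AbstractFile aFile, String mimeType, boolean indexContent) |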
Adds the file to the index. Detects file type, calls extractors, etc.
| extractor | The textExtractor to use with this file or empty if no extractor found. |
| aFile | File to analyze. |
| mimeType | The file mime type. |
| indexContent | False if only metadata should be indexed. True if both content and metadata should be indexed. |
Unicode strings are extracted from unallocated and unused blocks and from carved text files. String extraction is performed on these because they may contain multiple encodings, which can cause text to be missed by the more specialized text extractors.
Definition at line 830 of file KeywordSearchIngestModule.java.
References org::sleuthkit::datamodel::TskData::TSK_DB_FILES_TYPE_ENUM.CARVED, org.sleuthkit.autopsy.ingest.IngestJobContext.fileIngestIsCancelled(), org::sleuthkit::datamodel::AbstractContent.getId(), org::sleuthkit::datamodel::AbstractContent.getName(), org::sleuthkit::datamodel::AbstractFile.getNameExtension(), org::sleuthkit::datamodel::AbstractFile.getSize(), org::sleuthkit::datamodel::AbstractFile.getType(), org::sleuthkit::datamodel::AbstractFile.isDir(), org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.METADATA_INGESTED, org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.SKIPPED_ERROR_INDEXING, org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.SKIPPED_ERROR_TEXTEXTRACT, org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.TEXT_INGESTED, org::sleuthkit::datamodel::TskData::TSK_DB_FILES_TYPE_ENUM.UNALLOC_BLOCKS, and org::sleuthkit::datamodel::TskData::TSK_DB_FILES_TYPE_ENUM.UNUSED_BLOCKS.
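The routing that the searchFile description and reference list suggest can be sketched as a pure decision function. This is an inference, not the method body, which this page does not show. The Action enum, the choose method, the reduced FileType enum, and the exact conditions are hypothetical, including the assumption that "carved text files" means carved files with a txt extension and that directories and empty files get metadata-only indexing.

```java
/** Sketch of choosing how a file is indexed. */
final class SearchFileDecisionSketch {

    /** Reduced stand-in for TskData.TSK_DB_FILES_TYPE_ENUM. */
    enum FileType { FS, DERIVED, CARVED, UNALLOC_BLOCKS, UNUSED_BLOCKS }

    /** Hypothetical outcomes of the routing decision. */
    enum Action { STRINGS_EXTRACTION, METADATA_ONLY, TEXT_EXTRACTION }

    static Action choose(FileType type, String extension, boolean isDir, long size, boolean indexContent) {
        boolean carvedText = type == FileType.CARVED && "txt".equalsIgnoreCase(extension);
        // Blocks and carved text may mix encodings, so they go straight to string extraction.
        if (type == FileType.UNALLOC_BLOCKS || type == FileType.UNUSED_BLOCKS || carvedText) {
            return Action.STRINGS_EXTRACTION;
        }
        // Nothing to extract from, or content indexing was not requested.
        if (!indexContent || isDir || size == 0) {
            return Action.METADATA_ONLY;
        }
        return Action.TEXT_EXTRACTION;
    }

    public static void main(String[] args) {
        System.out.println(choose(FileType.UNALLOC_BLOCKS, "", false, 4096L, true));
        System.out.println(choose(FileType.FS, "docx", false, 4096L, true));
    }
}
```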
| private boolean | searchTextFile (AbstractFile aFile) |
Adds the text file to the index given an encoding. Returns true if indexing was successful and false otherwise.
| aFile | Text file to analyze |
Definition at line 944 of file KeywordSearchIngestModule.java.
References org::sleuthkit::datamodel::AbstractContent.getId(), org::sleuthkit::datamodel::AbstractContent.getName(), org.sleuthkit.autopsy.textextractors.TextFileExtractor.getReader(), and org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.IngestStatus.TEXT_INGESTED.
| void org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.shutDown | ( | ) |
After all files are ingested, executes the final index commit and the final search, then cleans up resources, threads, and timers.
Implements org.sleuthkit.autopsy.ingest.IngestModule.
Definition at line 466 of file KeywordSearchIngestModule.java.
References org.sleuthkit.autopsy.ingest.IngestModuleReferenceCounter.decrementAndGet(), org.sleuthkit.autopsy.ingest.IngestJobContext.fileIngestIsCancelled(), org.sleuthkit.autopsy.ingest.IngestJobContext.getJobId(), org.sleuthkit.autopsy.keywordsearch.KeywordSearch.getServer(), org.sleuthkit.autopsy.keywordsearch.Server.queryNumIndexedChunks(), and org.sleuthkit.autopsy.keywordsearch.Server.queryNumIndexedFiles().
| void org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.startUp | ( | IngestJobContext | context | ) | throws IngestModuleException |
Initializes the module for a new ingest run. Sets up threads and timers, and retrieves the settings and the keyword lists to run on.
Implements org.sleuthkit.autopsy.ingest.IngestModule.
Definition at line 302 of file KeywordSearchIngestModule.java.
References org.sleuthkit.autopsy.ingest.IngestMessage.createWarningMessage(), org.sleuthkit.autopsy.casemodule.Case.getCaseDirectory(), org.sleuthkit.autopsy.casemodule.Case.getCaseType(), org.sleuthkit.autopsy.casemodule.Case.getCurrentCaseThrows(), org.sleuthkit.autopsy.ingest.IngestJobContext.getJobId(), org.sleuthkit.autopsy.keywordsearch.Server.getMultiUserServerProperties(), org.sleuthkit.autopsy.keywordsearch.KeywordSearch.getServer(), org.sleuthkit.autopsy.ingest.IngestModuleReferenceCounter.incrementAndGet(), org.sleuthkit.autopsy.casemodule.Case.CaseType.MULTI_USER_CASE, org.sleuthkit.autopsy.ingest.IngestServices.postMessage(), org.sleuthkit.autopsy.keywordsearch.Server.queryNumIndexedDocuments(), org.sleuthkit.autopsy.textextractors.configs.StringsConfig.setExtractUTF16(), org.sleuthkit.autopsy.textextractors.configs.StringsConfig.setExtractUTF8(), org.sleuthkit.autopsy.textextractors.configs.StringsConfig.setLanguageScripts(), and org.sleuthkit.autopsy.keywordsearchservice.KeywordSearchService.tryConnect().
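startUp() references IngestModuleReferenceCounter.incrementAndGet() and shutDown() references decrementAndGet(), which suggests per-job reference counting across module instances. The sketch below illustrates that general pattern only: JobReferenceCounter is a hypothetical stand-in rather than the Autopsy class, and the idea that the first instance performs shared setup and the last performs shared teardown is an assumption about how the counts are used.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of per-job reference counting across module instances. */
final class ReferenceCountingSketch {

    /** Hypothetical stand-in for IngestModuleReferenceCounter. */
    static final class JobReferenceCounter {
        private final Map<Long, Long> counts = new HashMap<>();

        synchronized long incrementAndGet(long jobId) {
            return counts.merge(jobId, 1L, Long::sum);
        }

        synchronized long decrementAndGet(long jobId) {
            Long current = counts.get(jobId);
            if (current == null || current <= 1L) {
                counts.remove(jobId); // last reference released (or never held)
                return 0L;
            }
            counts.put(jobId, current - 1L);
            return current - 1L;
        }
    }

    private static final JobReferenceCounter refCounter = new JobReferenceCounter();

    /** Returns true when this call is the first instance for the job (shared setup would go here). */
    static boolean startUp(long jobId) {
        return refCounter.incrementAndGet(jobId) == 1L;
    }

    /** Returns true when this call is the last instance for the job (shared teardown would go here). */
    static boolean shutDown(long jobId) {
        return refCounter.decrementAndGet(jobId) == 0L;
    }

    public static void main(String[] args) {
        System.out.println("first instance: " + startUp(1L));
        System.out.println("first instance: " + startUp(1L));
        System.out.println("last instance: " + shutDown(1L));
        System.out.println("last instance: " + shutDown(1L));
    }
}
```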
| private IngestJobContext | context |
Definition at line 252 of file KeywordSearchIngestModule.java.
Referenced by org.sleuthkit.autopsy.keywordsearch.KeywordSearchIngestModule.extractStringsAndIndex().
| private FileTypeDetector | fileTypeDetector |
Definition at line 241 of file KeywordSearchIngestModule.java.
| private static final String | IMAGE_MIME_TYPE_PREFIX = "image/" |
Definition at line 217 of file KeywordSearchIngestModule.java.
| private Ingester | ingester = null |
Definition at line 240 of file KeywordSearchIngestModule.java.
| private static final Map< Long, Map< Long, IngestStatus > > | ingestStatus = new HashMap<>() |
Definition at line 263 of file KeywordSearchIngestModule.java.
| private boolean | initialized = false |
Definition at line 247 of file KeywordSearchIngestModule.java.
| private static final AtomicInteger | instanceCount = new AtomicInteger(0) |
Definition at line 249 of file KeywordSearchIngestModule.java.
| private int | instanceNum = 0 |
Definition at line 250 of file KeywordSearchIngestModule.java.
| private long | jobId |
Definition at line 248 of file KeywordSearchIngestModule.java.
| private static final int | LIMITED_OCR_SIZE_MIN = 100 * 1024 |
Definition at line 107 of file KeywordSearchIngestModule.java.
| private static final Logger | logger = Logger.getLogger(KeywordSearchIngestModule.class.getName()) |
Definition at line 238 of file KeywordSearchIngestModule.java.
| private static final Map< String, Pair< BlackboardAttribute.ATTRIBUTE_TYPE, Integer > > | METADATA_TYPES_MAP |
A mapping of the Tika metadata key to the corresponding attribute type and the priority of that key versus other related keys (lower integer value is higher priority).
Definition at line 153 of file KeywordSearchIngestModule.java.
| private static final ImmutableSet< String > | OCR_DOCUMENTS |
Definition at line 220 of file KeywordSearchIngestModule.java.
| private static final IngestModuleReferenceCounter | refCounter = new IngestModuleReferenceCounter() |
Definition at line 251 of file KeywordSearchIngestModule.java.
| private final IngestServices | services = IngestServices.getInstance() |
Definition at line 239 of file KeywordSearchIngestModule.java.
| private final KeywordSearchJobSettings | settings |
Definition at line 246 of file KeywordSearchIngestModule.java.
| private Lookup | stringsExtractionContext |
Definition at line 245 of file KeywordSearchIngestModule.java.
Copyright © 2012-2024 Sleuth Kit Labs. Generated on: Mon Mar 17 2025
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.