By Elizabeth Thede, Special for The Times USA
When most people think of text retrieval, they think of word searching: alphabet soup or banana pie. And that is usually the main focus of text retrieval products like dtSearch, which instantly searches terabytes of Microsoft Office files, PDFs, emails with multilevel attachments, databases, Internet and Intranet data, and the like.
For traditional text retrieval, dtSearch has over 25 different search types: phrase searching, Boolean and/or/not searching, proximity searching, wildcard searching, fuzzy searching to sift through typographical errors, phonic searching, concept or thesaurus searching, metadata searching, multicolor hit-highlighting, etc. For enterprises and developers, dtSearch adds efficient multithreaded searching, faceted or “drill down” searching, granular data classification and other advanced options.
In all cases, dtSearch works by building an index that holds each unique word in the data, and the location of that word in the data. To build the index, just point dtSearch to an entire folder or folder tree or even an entire drive and dtSearch will index anything in that folder or drive. (There is no need to tell dtSearch what is in that folder tree or drive, be it Microsoft Word, PowerPoint or OneNote files, Microsoft Access databases, Microsoft Excel spreadsheets, PDFs, emails with nested attachments, etc. dtSearch will figure all of that out for itself.)
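To make the idea of an index concrete, here is a minimal Python sketch of a word index that records each unique word and where it occurs. It is only an illustration of the general technique, not dtSearch's implementation or API; the function name build_index and the sample documents are made up for the example.

```python
import re
from collections import defaultdict

def build_index(docs):
    """Map each unique lowercase word to a list of (document name, word position) pairs."""
    index = defaultdict(list)
    for name, text in docs.items():
        # Split the text into words and remember where each one occurs.
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((name, position))
    return index

# Hypothetical sample data standing in for real files.
docs = {
    "memo.txt": "Banana pie recipe",
    "notes.txt": "Alphabet soup and banana bread",
}
index = build_index(docs)
print(index["banana"])  # [('memo.txt', 0), ('notes.txt', 3)]
```

Once the index exists, a search is a lookup in the index rather than a scan of every file, which is why searching indexed data can be so fast.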
Once dtSearch has built an index, or even a series of indexes each holding terabytes of data, dtSearch can instantly search across all of that data for any combination of words. But in addition to words, an index can also hold non-word items.
dtSearch can search for emojis. A field of study called sentiment analysis tries to determine, for example, whether a group of employees is happy or sad. And one easy way to make that determination is to search for smiley faces or frowny faces.
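As a rough illustration of the smiley-versus-frowny idea, the Python sketch below tallies a few happy and sad emojis across a set of messages. The emoji sets, the function name emoji_tally and the sample messages are assumptions chosen for the example; this is not how dtSearch itself is invoked.

```python
# Hypothetical emoji sets; a real analysis would likely use a longer list.
HAPPY = {"🙂", "😀", "😊"}
SAD = {"🙁", "😞", "😢"}

def emoji_tally(messages):
    """Count happy and sad emojis across all messages, one character at a time."""
    happy = sum(1 for message in messages for ch in message if ch in HAPPY)
    sad = sum(1 for message in messages for ch in message if ch in SAD)
    return happy, sad

messages = ["Great launch 🙂🙂", "Server down again 🙁", "Lunch?"]
print(emoji_tally(messages))  # (2, 1)
```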
dtSearch can also search for specific numbers. But it can do much more with numeric data. For example, dtSearch can search for numeric ranges, finding any number between 128 and 412. dtSearch can also identify dates in data, regardless of what format they appear in.
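The numeric range example above can be sketched in a few lines of Python. This is a simplified illustration, not dtSearch's query syntax: it assumes whole numbers only, treats both ends of the range as inclusive, and uses a made-up function name, numbers_in_range.

```python
import re

def numbers_in_range(text, low, high):
    """Return the whole numbers in text whose value is between low and high, inclusive."""
    found = []
    for token in re.findall(r"\d+", text):
        value = int(token)
        if low <= value <= high:
            found.append(value)
    return found

sample = "Invoices 97, 128, 300 and 413 were filed; invoice 412 is pending."
print(numbers_in_range(sample, 128, 412))  # [128, 300, 412]
```

The point is that the comparison is made on each number's value rather than on its characters, so 300 matches a search for 128 to 412 even though the text "300" appears nowhere in the query.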
Credit card searching looks for any sequence of numbers that satisfies the criteria for a valid credit card number issued by one of the major credit card issuers, regardless of the pattern of spaces between the numbers. Social Security numbers do not offer the same kind of built-in validity check as credit card numbers. However, dtSearch can use numeric pattern searching to locate Social Security numbers and other numeric patterns.
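One widely used validity criterion for card numbers is the Luhn checksum. The Python sketch below assumes that check, plus a length of 13 to 19 digits, as a stand-in for "the criteria for a valid credit card number"; the article does not say exactly which rules dtSearch applies. The function names and the Social Security number pattern (three digits, two digits, four digits, separated by hyphens) are likewise illustrative assumptions.

```python
import re

def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:       # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text):
    """Find 13 to 19 digit sequences, ignoring spaces and hyphens, that pass Luhn."""
    candidates = re.findall(r"\b\d(?:[ -]*\d){12,18}\b", text)
    results = []
    for candidate in candidates:
        digits = re.sub(r"\D", "", candidate)
        if luhn_valid(digits):
            results.append(digits)
    return results

# Assumed Social Security number layout: a pure pattern match, with no checksum.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

text = "Card 4111 1111 1111 1111 on file, not 4111-1111-1111-1112. SSN 123-45-6789."
print(find_card_numbers(text))    # ['4111111111111111']
print(SSN_PATTERN.findall(text))  # ['123-45-6789']
```

The first card number is a well-known test number that passes the checksum; the second differs by one digit and is rejected, which is the kind of filtering a plain pattern match cannot do for Social Security numbers.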
Also, every document has an effectively unique hash value, which is often relevant in forensics work. dtSearch can generate the hash value for each document as it indexes it, and subsequently search on a specific hash value.
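To show what searching on a hash value might look like in principle, this Python sketch hashes document contents and looks a document up by its hash. SHA-256 is an assumption here, since the article does not name the hash algorithm dtSearch uses, and the function names and sample contents are made up for the example.

```python
import hashlib

def content_hash(data):
    """Return the SHA-256 hex digest of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def build_hash_index(files):
    """Map each hash value to the names of the documents that produce it."""
    hash_index = {}
    for name, data in files.items():
        hash_index.setdefault(content_hash(data), []).append(name)
    return hash_index

# Hypothetical documents; two of them have identical contents.
files = {
    "report.doc": b"quarterly figures",
    "copy_of_report.doc": b"quarterly figures",
    "memo.txt": b"lunch at noon",
}
hash_index = build_hash_index(files)
wanted = content_hash(b"quarterly figures")
print(hash_index[wanted])  # ['report.doc', 'copy_of_report.doc']
```

Because identical contents always produce the same hash, a lookup like this finds exact copies of a known document even when the file has been renamed.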
As a final thought, while enterprises with extremely large data sets (government agencies, 4 out of 5 of the Fortune 500’s largest Aerospace and Defense companies, 3 out of 4 of the “Big 4” accounting firms and the like) use dtSearch to search for words, numbers and emojis, even if you just want to search your own PC, you can download a fully-functional 30-day evaluation version of dtSearch Desktop anytime at dtSearch.com.