Knowledgebase & FAQ
What is the text search estimate? Why does it eliminate results?
RecFind's metadata full-text search is based on a signature file technique: we set bits in an array to identify words, numbers, dates and so on, and then 'point' to the location of each record that contains that data.
The basic principle of this method is that you ALWAYS get everything you ask for, BUT you may also get more than you ask for. This is because the bit patterns are produced by a complex mathematical algorithm and, in some instances, two or more words may produce the same bit pattern.
When the search module returns more than we asked for, the extra records are called 'false drops', and we eliminate them from the results with 'post processing'.
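As a rough illustration of how false drops arise and how post processing removes them, here is a minimal Python sketch. It is not RecFind's actual implementation: the 16-bit signature width, the MD5-based `word_bit` function and the helper names are all assumptions chosen to keep the example small. A real signature file would also avoid scanning every signature, which this sketch does for simplicity.

```python
import hashlib

SIGNATURE_BITS = 16  # assumed width; deliberately tiny so collisions are plausible


def word_bit(word):
    """Map a word to one bit position (hypothetical hashing scheme)."""
    digest = hashlib.md5(word.lower().encode("utf-8")).digest()
    return 1 << (digest[0] % SIGNATURE_BITS)


def signature(text):
    """OR together the bits of every word in the text."""
    sig = 0
    for word in text.split():
        sig |= word_bit(word)
    return sig


def candidates(index, query):
    """Record ids whose signature contains every bit set by the query."""
    q = signature(query)
    return [rec_id for rec_id, sig in index.items() if sig & q == q]


def post_process(records, hits, query):
    """Read each candidate record and keep only the genuine matches."""
    wanted = set(query.lower().split())
    return [r for r in hits if wanted <= set(records[r].lower().split())]


records = {1: "annual budget report", 2: "staff picnic photos"}
index = {rec_id: signature(text) for rec_id, text in records.items()}

hits = candidates(index, "budget")  # always contains record 1; may also contain 2
print(post_process(records, hits, "budget"))  # [1]
```

A genuine match can never be missed, because its signature contains every bit the query sets. A non-matching record can slip through only when its words happen to set the same bits, which is exactly the false drop that `post_process` removes.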
For example, when you perform a text search, the first result will be a message such as "An estimate of 1,344 records match the search criteria". This is an estimate and may include false drops. At this point, the user decides either "This is too many; I need to click Cancel and refine my search criteria to reduce the number of hits" or "This is too few; I need to click Cancel and broaden my search criteria to produce more hits".
The major advantage of this technique is that it is a flat-time search: the search time depends on the number of 'words' in the search criteria, not on the size of your database. We do not need to read the database at this stage; all results are provided by our signature file.
To provide a true count of matching records, we would need to read each 'hit' and check whether it is in fact a 'false drop', and reading the database takes time. It is only after the user says "show me the hits" that we actually read the database (the time-consuming bit) and 'double check' each record to see whether it does in fact meet the search criteria. This is when we eliminate the false drops.
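The two-stage behaviour described above (a cheap estimate first, database reads only on request) can be sketched as follows. This is an illustrative model only: `CountingDatabase`, `TextSearch` and the hard-coded candidate list are hypothetical stand-ins, and the candidate ids are simply assumed to have come from the signature file.

```python
class CountingDatabase:
    """Stand-in for the record store; counts how many records are read."""

    def __init__(self, records):
        self._records = records
        self.reads = 0

    def read(self, rec_id):
        self.reads += 1
        return self._records[rec_id]


class TextSearch:
    def __init__(self, database, candidate_ids, query):
        self.database = database
        self.candidate_ids = candidate_ids  # assumed output of the signature file
        self.query_words = set(query.lower().split())

    def estimate(self):
        """Number of candidates; no database reads, may include false drops."""
        return len(self.candidate_ids)

    def show_hits(self):
        """Read every candidate and drop the ones that do not really match."""
        hits = []
        for rec_id in self.candidate_ids:
            text = self.database.read(rec_id)
            if self.query_words <= set(text.lower().split()):
                hits.append(rec_id)
        return hits


db = CountingDatabase({
    1: "annual budget report",
    2: "budget meeting minutes",
    3: "staff picnic photos",  # a false drop in this made-up example
})
search = TextSearch(db, candidate_ids=[1, 2, 3], query="budget")

print(search.estimate(), db.reads)   # 3 0
print(search.show_hits(), db.reads)  # [1, 2] 3
```

The estimate costs no database reads at all, while the verified list costs one read per candidate, which is why the true count is deferred until the user asks to see the hits.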
There is another reason for a false drop. Although the indexing algorithm writes the bit patterns for new words, dates, phrases, etc., it does not unwrite the bit patterns for deleted words, dates, phrases, etc. Only the "clear index" process, in which we recreate the entire signature file, will remove the entries left behind by deleted data and deleted records.
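To make the stale-index case concrete, the sketch below models an index whose bits are only ever added, never cleared. The `SignatureIndex` class, the 64-bit width, the SHA-256 hashing and the `rebuild` method (a hypothetical equivalent of the "clear index" process) are assumptions for illustration, not the product's real code.

```python
import hashlib

BITS = 64  # assumed signature width


def signature(text):
    """OR together one hashed bit per word (hypothetical scheme)."""
    sig = 0
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        sig |= 1 << (digest[0] % BITS)
    return sig


class SignatureIndex:
    def __init__(self):
        self.signatures = {}  # record id -> accumulated bit pattern

    def index_record(self, rec_id, text):
        # Bits are only ever OR-ed in; nothing is cleared on edit or delete.
        self.signatures[rec_id] = self.signatures.get(rec_id, 0) | signature(text)

    def candidates(self, query):
        q = signature(query)
        return sorted(r for r, s in self.signatures.items() if s & q == q)

    def rebuild(self, database):
        """Throw the old signatures away and re-index the live records."""
        self.signatures = {}
        for rec_id, text in database.items():
            self.index_record(rec_id, text)


database = {1: "contract renewal", 2: "contract termination"}
index = SignatureIndex()
for rec_id, text in database.items():
    index.index_record(rec_id, text)

del database[2]                      # record deleted; the index is untouched
print(index.candidates("contract"))  # [1, 2] -> record 2 is now a false drop
index.rebuild(database)
print(index.candidates("contract"))  # [1]
```

Until the rebuild runs, record 2 still appears in the estimate and has to be eliminated when the hits are read.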
The final reason for eliminating records is that the user doesn't have the appropriate security to view the record, or that additional criteria (e.g. date ranges or department codes) remove the record from the search result.
In summary, the reasons for eliminating a result are:
- the complex mathematical algorithm returned additional results,
- the signature file is out of date and the record no longer exists in the database, or
- the record doesn't meet the action officer security or the selection criteria.
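The three reasons above can be pictured as a single filtering pass over the candidate list. The sketch below is only an assumption about how such a pass might look: the function name, the record layout, the order of the checks and the `can_view` / `extra_criteria` callbacks are all hypothetical.

```python
from collections import Counter


def filter_candidates(candidate_ids, database, query, can_view, extra_criteria):
    """Return (kept ids, Counter of elimination reasons)."""
    wanted = set(query.lower().split())
    kept, reasons = [], Counter()
    for rec_id in candidate_ids:
        record = database.get(rec_id)
        if record is None:
            # Signature file is out of date: the record no longer exists.
            reasons["stale index entry"] += 1
        elif not wanted <= set(record["text"].lower().split()):
            # Bit patterns matched but the words are not really there.
            reasons["false drop"] += 1
        elif not (can_view(record) and extra_criteria(record)):
            reasons["security or selection criteria"] += 1
        else:
            kept.append(rec_id)
    return kept, reasons


database = {
    1: {"text": "budget report", "dept": "FIN", "restricted": False},
    2: {"text": "picnic photos", "dept": "HR", "restricted": False},
    3: {"text": "budget forecast", "dept": "FIN", "restricted": True},
}
kept, reasons = filter_candidates(
    [1, 2, 3, 4],  # candidate 4 has been deleted from the database
    database,
    "budget",
    can_view=lambda r: not r["restricted"],
    extra_criteria=lambda r: r["dept"] == "FIN",
)
print(kept)  # [1]
print(dict(reasons))
```

In this made-up run an estimate of four candidates shrinks to one displayed record, with one elimination for each of the three reasons.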