This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
Old locate(1) programs still work with the new database format. They print
some garbage for 8-bit characters, but do not dump core (except perhaps for
char 30). 7-bit puritans should not notice any difference: same speed, and
the same database size if the database contains only ASCII characters.
Reviewed by: ache
Bigram does not remove the newline at the end of a filename. This
breaks the bigram algorithm in particular, and /var/db/locate.database
grows by 15%.
Bigram does not check for characters outside 32-127.
The bigram output is silly and needs ~1/2 of the CPU time of a
database rebuild.
old:
    locate.bigram < $filelist | sort | uniq -c | sort -nr
                                ^^^^^^^^^^^^^^
                                bigram can easily do this itself
new:
    bigram < $filelist | sort -nr
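
The idea behind the shorter pipeline can be sketched as follows: if the
bigram pass counts pairs itself, the separate "sort | uniq -c" stage is no
longer needed and only the final "sort -nr" remains. This is a minimal
Python illustration, not the actual bigram source. The function names are
hypothetical, and both the non-overlapping pairing and the choice to skip
pairs containing characters outside 32-127 are assumptions.

```python
from collections import Counter


def count_bigrams(lines):
    """Count two-character pairs per path (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # Strip the trailing newline so it never becomes part of a bigram.
        path = line.rstrip("\n")
        # Assumption: pairs are taken non-overlapping, two characters at a time.
        for i in range(0, len(path) - 1, 2):
            pair = path[i:i + 2]
            # Assumption: skip pairs with any character outside 32-127.
            if all(32 <= ord(ch) <= 127 for ch in pair):
                counts[pair] += 1
    return counts


def format_counts(counts):
    """Render 'count bigram' lines, ready to be piped through sort -nr."""
    return [f"{n} {pair}" for pair, n in counts.items()]
```

Because the counts are already aggregated, the output has one line per
distinct bigram rather than one line per occurrence, so far less data
reaches the final sort.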
code
Code does not check for char 31.
Use a lookup array instead of a function. 3x faster.
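
A rough sketch of the lookup-array idea, in Python for illustration only.
It assumes the lookup in question maps a two-character pair to its index in
the list of common bigrams; the log does not say this explicitly. All names
are hypothetical, and the sketch makes no claim about the speedup.

```python
def bigram_index_scan(bigrams, pair):
    """Function-style lookup: linear scan, -1 if the pair is absent."""
    for i, bg in enumerate(bigrams):
        if bg == pair:
            return i
    return -1


def build_lookup(bigrams):
    """Precompute table[c1][c2] -> bigram index, -1 if absent."""
    table = [[-1] * 256 for _ in range(256)]
    for i, bg in enumerate(bigrams):
        # Keep the first index on duplicates, matching the scan above.
        if table[bg[0]][bg[1]] == -1:
            table[bg[0]][bg[1]] = i
    return table
```

With the table built once up front, each lookup is two array indexing
operations instead of a scan: table[pair[0]][pair[1]], where pair is a
2-byte bytes object.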
updatedb
rewritten
sync with bigram changes
read config file /etc/locate.rc if it exists
Submitted by: guido@gvr.win.tue.nl (Guido van Rooij)
concatdb - concatenate locate databases
mklocatedb - build locate database