June 20 2019 04:22:08
Search Engine
Software

I finished my search engine yesterday. It didn't go as well as planned. The String Trie that I had written functioned fine, but as soon as it became loaded (with 40,000+ nodes) it became intolerably slow. I'm blaming this on Java's collection implementations, because I had no such issues when I created the search engine for RuneScape. As a result I have omitted the trie from the implementation, which means that no wildcarded searches can be performed (not a _huge_ loss, but still frustrating). It still supports phrase searches, stray/stemmed searches and "NOT" terms, along with result highlighting. My original implementation for Jagex took about 4 weeks to complete, so I'm quite pleased that it only took me 2 days this time around. Maybe if I feel the need for wildcarded searches I will write my own collection implementations, but I'd rather not deviate from the Java libraries too much!
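For readers unfamiliar with the idea, here is a minimal sketch of how a string trie can answer trailing-wildcard searches such as "shak*". It is not the StringTrie from this post: the class name `TrieSketch` and the methods `insert` and `matchPrefix` are hypothetical, and it assumes wildcards only appear at the end of a term. It uses the standard `java.util` collections for child links.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not the StringTrie described in the post.
class TrieSketch {
    // One node per character; children are kept sorted so results come out in order.
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord; // true if the path from the root to here spells an indexed word
    }

    private final Node root = new Node();

    // Add a word to the trie, creating nodes along the path as needed.
    void insert(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    // Return every indexed word starting with the prefix (i.e. a "prefix*" search).
    List<String> matchPrefix(String prefix) {
        Node node = root;
        for (char c : prefix.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return new ArrayList<>(); // no indexed word has this prefix
            }
        }
        List<String> out = new ArrayList<>();
        collect(node, new StringBuilder(prefix), out);
        return out;
    }

    // Depth-first walk below a node, gathering complete words.
    private void collect(Node node, StringBuilder current, List<String> out) {
        if (node.isWord) {
            out.add(current.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            current.append(e.getKey());
            collect(e.getValue(), current, out);
            current.deleteCharAt(current.length() - 1); // backtrack
        }
    }
}
```

After inserting "shake", "shakespeare" and "shall", a call to `matchPrefix("shak")` walks four nodes and then collects only the subtree below them, so the cost depends on the number of matches rather than the size of the whole index.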

So, what next? I'm not too sure. I think I'll add web-based monitoring to the SearchEngine so I can view which words are indexed, average search times, number of searches, etc. After that I might toy around with a system to send data in UDP packets between programs (kind of like RMI but simpler).


As a shot in the dark, I tried upping the heap space available to the JVM, and this fixed the slowness when indexing with the StringTrie! I've just spent the past hour or so putting the StringTrie back in place and making persistence more efficient. I can do wildcarded searches on the entire works of Shakespeare in 19 milliseconds now! Initial indexing takes a little while (under a minute for all of Shakespeare's work); reloading from a persisted state takes about 9 seconds.
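As a rough illustration of the heap change, the JVM's maximum heap is set at launch with the standard `-Xmx` option and can be checked from inside the program with `Runtime.getRuntime().maxMemory()`. The class name `HeapReport` and the 1 GB figure below are assumptions for the example; the post does not say what value was actually used.

```java
// Hypothetical helper; prints the heap ceiling the JVM was started with.
class HeapReport {
    // Maximum heap the JVM will attempt to use, in mebibytes.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Launch with, for example:  java -Xmx1g HeapReport
        // (1g is an assumed value, not the setting used in the post.)
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

A node-heavy structure like a trie holding tens of thousands of small objects can push a small default heap close to its limit, at which point the garbage collector runs almost constantly. That would be consistent with the slowdown disappearing once more heap was made available, though the post does not confirm the exact cause.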