Linux Desktop Search

Searching in Linux starts with those venerable command-line favorites: find, grep, and locate. These tools are very powerful and can easily be integrated into scripts, but for many users that command-line nature is also their key weakness. These users need a graphical interface to be comfortable with a program. They don't want to remember syntax or drop out of another app to type commands. The new breed of Linux desktop search is all about bringing the functionality of the tried and true originals into the graphical world.

Classics

I'll start with a brief overview for those of you unfamiliar with the three programs I already mentioned. find is a very versatile app that allows you to search for files by name, size, date, and other attributes, and by content when combined with grep. It works well on small directory trees, but can take a long time on large ones because it walks the file system on every run instead of using a pregenerated index. locate addresses that issue by querying a prebuilt database of file names, though it offers only part of find's search functionality. grep is the classic script programmer's tool because it looks within files and returns the lines on which a word, phrase, or expression appears rather than just the file name. All three programs can take regular expressions as their search pattern.
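As a rough illustration of the difference between the three, here is a minimal sketch that builds a throwaway directory and searches it. The directory layout, file names, and variable names are made up for the example, and the locate call is guarded because it may not be installed or may not have a database yet.

```shell
#!/bin/sh
# Build a small throwaway tree so the commands have something to search.
# (The layout and file names are invented for this example.)
demo=$(mktemp -d)
mkdir -p "$demo/docs" "$demo/src"
printf 'desktop search notes\n' > "$demo/docs/notes.txt"
printf 'nothing relevant here\n' > "$demo/docs/todo.txt"
printf 'int main(void) { return 0; }\n' > "$demo/src/main.c"

# find walks the tree on every run and matches on file attributes.
# Here: regular files whose name ends in .txt.
by_name=$(find "$demo" -type f -name '*.txt' | sort)

# find can also match the whole path against a regular expression.
by_regex=$(find "$demo" -type f -regex '.*/[a-z]*\.c')

# ...or filter by age: files modified within the last 24 hours.
recent=$(find "$demo" -type f -mtime -1 | sort)

# grep looks inside files. -r recurses, -n prefixes each hit with file:line.
hits=$(grep -rn 'search' "$demo")

# -l prints only the names of the files that contain a match.
matching_files=$(grep -rl 'search' "$demo")

# locate answers from a prebuilt database of path names, so it is fast
# but only knows about files that existed when the database was last updated.
if command -v locate >/dev/null 2>&1; then
    locate -l 5 '*.desktop' || true
fi

printf 'by name:\n%s\n' "$by_name"
printf 'by regex:\n%s\n' "$by_regex"
printf 'content hits:\n%s\n' "$hits"

rm -rf "$demo"
```

Note that the files created in the temporary directory would not show up in locate until its database is rebuilt, which is exactly the trade-off between the indexed and non-indexed approaches.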

Beagle

One of the most publicized of the new breed of Linux desktop search tools is Beagle. It is being developed as part of the GNOME project using Mono and a port of the Lucene indexer. Beagle also supports a wide variety of data sources. Some key ones are:

In addition to those data sources, a large number of file formats are supported. Check here for details on the latest for both.

Beagle's interface is very simple, starting with a few menus and a Find bar. The menus have options to narrow the types of files to include in the query (e.g. documents, images, media), preferences, and sort criteria. Results are displayed in the rest of the window and are grouped by file type. Only the title and length of each result are shown, with some additional details displayed at the bottom of the window for the highlighted item. Note that I ran Beagle under KDE, so it might appear a little different under GNOME.

GNOME Integration

Deskbar Applet can easily be enabled in GNOME to provide a search bar in the panel at the top of the screen, similar to Apple's Spotlight. Search results pop up just under the panel and are listed in categories, as in Beagle's own interface. Applications can be launched directly from the search results, and it's even possible to run a Google or Yahoo! web search through it.

KDE Integration

If Beagle's interface isn't quite what you are looking for, a Qt-based alternative called Kerry Beagle is the next stop. The basic options are similar to Beagle's, but arranged differently. File types are picked from a drop-down box instead of a menu, and a thumbnail of each file found is displayed along with the file name, the date modified, and an excerpt showing the text in which the match was found.

As readers of my earlier blog entries can attest, I'm a big fan of KIO slaves. Fortunately, quite a few have been written to address search. None are perfect, but together they make a nice set of tools if you'd rather stay within Konqueror.

kio-beagle, kio-locate, and kio-clucene provide the ability to run a search within Konqueror or any KDE dialog box. Typing any search term in the Konqueror address bar will automatically call kio-locate and execute the search. Individual files are shown if there is only one within a directory, otherwise a folder that lists the number of files matched within is shown instead.

kio-beagle is launched by typing beagle:/ followed by a query into the address bar. Unlike kio-locate, it returns all files in a single folder view. One thing I didn't figure out is how to restrict a kio-beagle search to a file type. The Beagle and Kerry GUIs both provide this feature. I also don't know if it is possible to limit searches to a specific directory.

Sometimes saving a common query can be very useful and Konqueror makes it easy. After executing the query all you need to do is select Add Bookmark from the Bookmarks menu or select the tab and drag it to the Bookmarks sidebar.

SearchMonkey

SearchMonkey is a different kind of search tool from the rest. It features a graphical user interface, but also aims to be almost as fully featured as the find and grep commands. This provides much more control in fine-tuning the search results, while still maintaining ease of use. There are two modes, basic and expert, but if you are using this program you really need expert mode to get much more functionality than GNOME's and KDE's built-in find dialogs. Casual users will likely prefer Beagle or Kerry Beagle, but if those don't get the job done, SearchMonkey should be your next step.

Others

Kat is a search engine similar to Beagle. It looked promising, but unfortunately the project web page has been closed for a while now and it's unclear if anything will ever come of it. It's a shame because it was the project I was most excited about and they appeared to be considering an interface that integrated well with Konqueror. Everyone's usage needs differ, but for me having desktop search integrated with my file manager is the best solution, allowing me to browse, manage, search, and preview seamlessly.

One other project of note is Tenor. Tenor is a part of KDE's Appeal project and involves the creation of a Contextual Linkage Engine (CLE). The CLE would have broader impact than the user driven searches of Beagle and the others because it would run as a service that collects metadata, full text indexes, and linkages and contextual relationships between files. Applications could then be built that utilize the engine to provide real-time search results based upon what the user is doing at any given time. Instead of searching for a document, Tenor aims to have it already waiting for you. The project is still in development, but there may be something ready for KDE 4 when it gets released next year. Tenor definitely has promise and I'll be one of the many people eagerly waiting to see how it turns out.

Conclusion

A user-friendly touch in Beagle and Kerry is the set of quick tips that show how to search for phrases, exclude words, and search for specific extensions. I was disappointed that partial word searches did not work with either Beagle or Kerry. I'm not sure if there is a setting for this somewhere, but the behavior was a surprise to me. Neither offers search-as-you-type capability either.

If your search would benefit from narrowing down by file type or category, or from limiting searches to your home directory and a few select others, then Beagle or one of its derivatives is the place to start. Picking an interface is just a matter of personal preference. If you need to go beyond the directories that Beagle is set to index, locate becomes a valuable second option. It doesn't search inside files the way Beagle does, but it is very quick at finding full or partial matches of file and folder names.

Still haven't found what you need? SearchMonkey makes it easier to search by date, regular expression, or other file criteria.

Linux desktop search is still very much in its infancy when it comes to polished interfaces, but the basics are in place. Development continues on each of the projects and I expect we'll see significant advances in the next year.

Other alternatives?

How about a follow-up article comparing Beagle, Tracker, Strigi, Docco, Pinot, Recoll, etc.?

Search follow-up

I'll add that to my to-do list. I'm particularly interested in Strigi and Tracker right now because I've heard talk that Strigi might become the KDE 4 search tool and that Tracker may be the default in the next Ubuntu release.

Sun's comparison

Late last year, a couple of Sun researchers did a comparison of Beagle, JIndex, Tracker, and Strigi. It contains some interesting information.

You can find it here.

kio-beagle filetype restriction

One thing I didn't figure out is how to restrict a kio-beagle search to a file type.

You just have to add ext:[filetype] to your query, so

beagle:something ext:html

would search for "something" in .html files only. Works fine for kio-beagle-0.3.1 and Konqueror-3.5.5.

That makes sense

That makes sense. I figured it was something like that, but I didn't know the syntax. Thanks for the tip.

Partial word searches also

Partial word searches also work: part* would match partial.

Here is a useful search syntax explanation.

(btw: I also like searching via beagle-query when I'm using the shell)
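For anyone who wants to try the tips from this thread in the shell, here is a minimal sketch that assembles the two query forms mentioned above (ext: and the trailing *). build_query is a hypothetical helper written for this example, not part of Beagle, and passing the query to beagle-query as a single argument is an assumption, so that call is guarded and its failure is ignored.

```shell
#!/bin/sh
# build_query is a hypothetical helper (not part of Beagle): it joins a
# search term with an optional ext: restriction, as described in the thread.
build_query() {
    term=$1
    ext=${2:-}
    if [ -n "$ext" ]; then
        printf '%s ext:%s' "$term" "$ext"
    else
        printf '%s' "$term"
    fi
}

# "something" in .html files only.
exact=$(build_query something html)

# Partial word search: part* would match "partial".
prefix=$(build_query 'part*')

# The same query as it would be typed into the Konqueror address bar.
printf 'beagle:%s\n' "$exact"
printf 'beagle:%s\n' "$prefix"

# Assumption: beagle-query accepts the query string as its argument.
# Only attempted if the tool is installed; errors (such as the Beagle
# daemon not running) are ignored.
if command -v beagle-query >/dev/null 2>&1; then
    beagle-query "$exact" || true
fi
```

Quoting 'part*' matters here: without the quotes the shell may expand the asterisk against file names in the current directory before Beagle ever sees it.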