[Hampshire] Why I love Open Source

Author: Peter Salisbury
Date:  
To: Hampshire LUG Discussion List
Subject: [Hampshire] Why I love Open Source
Hi, I thought I'd share this real-life example of why Open Source is
so fantastic. This sort of thing means far more to me than merely
being free. Here's what happened:

I lost a file. I knew it was there in my PC somewhere, but I had no
idea what it was called, only a couple of key words its text would
contain. It might have been a pdf, a doc, an odt, an sxw or a plain
text file.

I used find, but I didn't know enough about the file name to get it.
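
To illustrate what searching by contents rather than by name involves,
here is a minimal Python sketch. It is only an assumption-laden
illustration: the function name files_containing is made up, and it
reads files as plain text, so it would not see inside a pdf, doc, odt
or sxw. Handling those formats is what an indexer with format helpers
is for.

```python
import os

def files_containing(root, keywords):
    """Yield paths under root whose text contains every keyword.

    Hypothetical helper for illustration; matching is case-insensitive
    and files are read as plain text only.
    """
    wanted = [k.lower() for k in keywords]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets us scan files that are not valid UTF-8
                with open(path, "r", encoding="utf-8", errors="ignore") as fh:
                    text = fh.read().lower()
            except OSError:
                continue  # unreadable file: skip it
            if all(k in text for k in wanted):
                yield path
```

Something like list(files_containing("/home/me", ["invoice", "boiler"]))
would then list candidate files, where the path and key words are
placeholders. It rereads every file on every search, which is the cost
a persistent index avoids.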

So I fired up aptitude and found a program for finding files based on
their contents: I chose recoll.

I installed recoll and ran it. It indexed the contents of all my files
and after a couple of quick searches in the recoll GUI I found my lost
file. Yay!!

During the run I noticed that recoll was taking its time looking at
image files even though I didn't have the relevant 'helper' libraries
installed, so it was trying and failing to index each image file. Not
a huge problem, but time-consuming. Also there were no 'recommends'
for the helper libraries, again not a big issue.

I posted a wishlist bug (500690) to the Debian BTS and forgot all about it.

A day or two later the author of recoll asked me to run a couple of
tests to get timings with and without the helper libraries.

I did the tests and posted the results.

I got the following reply:
-------------------
Thanks a lot for running these tests and sending the results.

It's quite reassuring that initial indexing works as expected.

About later indexing passes, I had another look at how Recoll *really*
works (as opposed to how I thought it worked :) ) and in fact, for file
types with missing helper applications, indexing is always retried (so that
it succeeds as soon as the helper is installed). Trying to execute the
filters wastes quite a lot of time.

This explains why the times go down after the helper is installed: the
files get indexed the first time, then nothing further happens if they stay
unchanged.

Recoll 1.11 has been modified to work slightly differently: executing a
missing filter is only tried once per indexing pass. The program then
remembers the failure and doesn't retry.

The files still get indexed at the first indexing pass following helper
installation, and there is almost no performance penalty for missing
helpers, best of both worlds (hopefully).

Thanks again for prompting me to implement this well-needed change.

Regards,
J.F. Dockes
-------------------
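
The behaviour described in the reply (try a missing filter once per
indexing pass, remember the failure, retry on the next pass) can be
sketched roughly as below. This is not Recoll's code, just a minimal
Python illustration under stated assumptions: the names index_pass and
image-helper are hypothetical, and "trying to execute a filter" is
simulated by looking the helper up with shutil.which.

```python
import shutil

def index_pass(files, helpers, find_helper=shutil.which):
    """Run one simulated indexing pass.

    files:   list of (path, mime_type) pairs
    helpers: dict mapping mime_type to a helper command name
    Returns (indexed, skipped, failed_attempts).
    """
    missing = set()        # helpers found to be missing during THIS pass
    indexed, skipped = [], []
    failed_attempts = 0    # how many times we tried a helper and failed
    for path, mime in files:
        helper = helpers.get(mime)
        if helper is None or helper in missing:
            skipped.append(path)   # no helper known, or already failed once
            continue
        if find_helper(helper) is None:
            failed_attempts += 1
            missing.add(helper)    # remember the failure for the rest of the pass
            skipped.append(path)
            continue
        indexed.append(path)       # stand-in for actually running the filter
    return indexed, skipped, failed_attempts
```

Because the set of missing helpers is rebuilt on every call, a helper
installed between passes is picked up on the next pass, while a pass
over thousands of images with no helper pays for only one failed
attempt.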

As you will all know, this is one tiny example, among hundreds of
thousands, showing how responsive and constructive the Open Source
process can be.

Exercise for the reader: try that with a commercial app (after you've
paid for it of course).

ATB, Peter