Article:   Introduction to Text Indexing with Apache Jakarta Lucene
Subject:   performance
Date:   2003-01-15 23:58:01
From:   anonymous2
I have been looking at Lucene for a couple of years, but I haven't tried it in a large-scale setting, so I wonder: how does it perform?

Any clues on how long it will take to search, say, 1000 documents?

How effective is the search algorithm compared to, for instance, Alta Vista?

Thanks

Showing messages 1 through 8 of 8.

  • performance
    2003-01-16 13:34:59  drunk_injun

    Lucene performs extremely well, given the correct configuration. We run a few Lucene indexes, the largest of which contains well over 3 million records, and our site averages close to 2 million daily pageviews.

    Our average query response time (95th percentile) is ~80ms, and we serve thousands of requests per day using a RAM-based servlet configuration in Caucho Resin.

    The search algorithm is a standard TF-IDF algorithm (with some additional gravy), which can easily be extended or replaced because the source code is well designed. I would highly recommend this package, having used it over the past 3 years in a variety of applications.
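
    For readers who have not met the term, here is a minimal sketch of plain TF-IDF scoring. It is not Lucene's scoring code (which, per the post above, adds more on top); the class name `TfIdfSketch`, its method names, and the smoothed IDF formula are assumptions chosen for illustration only.

    ```java
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical illustration of textbook TF-IDF; not Lucene's implementation.
    final class TfIdfSketch {

        // Term frequency: how many times the term occurs in one document.
        static long termFrequency(List<String> doc, String term) {
            return doc.stream().filter(term::equals).count();
        }

        // Inverse document frequency: terms found in fewer documents weigh more.
        // The "1 +" in the denominator and the "+ 1.0" are one common smoothing choice.
        static double inverseDocumentFrequency(List<List<String>> corpus, String term) {
            long docsWithTerm = corpus.stream().filter(d -> d.contains(term)).count();
            return Math.log((double) corpus.size() / (1 + docsWithTerm)) + 1.0;
        }

        // Score of one document for a single-term query.
        static double score(List<List<String>> corpus, List<String> doc, String term) {
            return termFrequency(doc, term) * inverseDocumentFrequency(corpus, term);
        }

        public static void main(String[] args) {
            List<String> d1 = Arrays.asList("lucene", "index", "search");
            List<String> d2 = Arrays.asList("search", "engine");
            List<String> d3 = Arrays.asList("lucene", "lucene", "query");
            List<List<String>> corpus = Arrays.asList(d1, d2, d3);
            for (List<String> doc : corpus) {
                System.out.println(doc + " -> " + score(corpus, doc, "lucene"));
            }
        }
    }
    ```

    A document that repeats the query term scores higher than one that mentions it once, and a term that appears in few documents counts for more than one that appears in most of them.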
    • performance
      2003-01-16 13:36:25  drunk_injun

      Caveat to my last message: the 80ms response time includes our own modifications to Lucene to enable additional sorting and filtering parameters. Raw query times generally average < 25ms.
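
      Since the figures above mix an average with a 95th percentile, here is a rough sketch of how the two differ when computed from the same timings. The class `LatencyStats`, the nearest-rank percentile method, and the sample values in `main` are all made up for illustration; they are not the poster's measurements.

      ```java
      import java.util.Arrays;

      // Hypothetical helper for summarising query timings given in milliseconds.
      final class LatencyStats {

          // Arithmetic mean of the samples (0.0 for an empty array).
          static double mean(long[] millis) {
              return Arrays.stream(millis).average().orElse(0.0);
          }

          // Nearest-rank percentile: the smallest sample such that at least
          // p percent of all samples are less than or equal to it.
          static long percentile(long[] millis, double p) {
              if (millis.length == 0) {
                  throw new IllegalArgumentException("no samples");
              }
              long[] sorted = millis.clone();
              Arrays.sort(sorted);
              int rank = (int) Math.ceil(p / 100.0 * sorted.length);
              return sorted[Math.max(rank, 1) - 1];
          }

          public static void main(String[] args) {
              // Invented timings: mostly fast queries plus one slow outlier.
              long[] samples = {10, 12, 15, 20, 22, 25, 18, 14, 30, 200};
              System.out.println("mean = " + mean(samples) + " ms");
              System.out.println("p95  = " + percentile(samples, 95) + " ms");
          }
      }
      ```

      A single slow query barely moves the median but dominates the 95th percentile, so it helps to say which of the two a quoted response time refers to.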
      • performance
        2003-01-17 05:02:57  anonymous2

        Sounds promising. I'm currently struggling to integrate Alta Vista on a customer's project. AV has very bad documentation and is written in C, making interop pretty difficult. I will see if I can get the time to try out Lucene in this setting.

        thanks for your reply..
      • performance
        2003-01-21 00:35:06  brwnx

        Thanks for your reply. I'm currently struggling to implement Alta Vista on a customer project. It is written in C and has a lousy Java API, which makes interop very difficult. I will see if I can get the time to test out Lucene instead.
    • performance
      2003-04-25 02:02:19  anonymous2

      Hi drunk_injun,

      I am using Lucene for text searches and am trying to make the searches faster.
      Could you share some details about the configuration changes you have made to Lucene, and what hardware configuration you are using?
      I would be very glad to receive your replies at swati_virmani@yahoo.com

      TIA,
      -Swati
      • performance
        2004-10-18 19:01:17  vfinity

        Hi, has anyone encountered a "too many open files" error on Mac OS X when using Lucene?

        I suspect that when too many searches are running at once, the whole machine goes down with a "too many open files" error.

        thanks
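
        One possible cause (an assumption, since the post shows no code) is opening a fresh index reader or searcher for every search and never closing it, so file handles pile up until the OS limit is hit. The sketch below imitates that with a hypothetical `IndexHandle` class standing in for a real searcher; it does not use the Lucene API, and the counter only simulates open file descriptors.

        ```java
        import java.io.Closeable;
        import java.util.concurrent.atomic.AtomicInteger;

        // Hypothetical illustration; IndexHandle is a stand-in, not a Lucene class.
        final class SharedHandleSketch {

            // Simulated count of file descriptors currently held open.
            static final AtomicInteger OPEN = new AtomicInteger();

            static final class IndexHandle implements Closeable {
                private boolean closed;

                IndexHandle() {
                    OPEN.incrementAndGet(); // "opens" the index files
                }

                String search(String query) {
                    return "results for " + query;
                }

                @Override
                public synchronized void close() {
                    if (!closed) {
                        closed = true;
                        OPEN.decrementAndGet(); // "releases" the index files
                    }
                }
            }

            // Anti-pattern: a new handle per search that is never closed.
            static String leakySearch(String query) {
                return new IndexHandle().search(query);
            }

            private static IndexHandle shared;

            // Alternative: create one handle lazily and reuse it for every search.
            static synchronized String sharedSearch(String query) {
                if (shared == null) {
                    shared = new IndexHandle();
                }
                return shared.search(query);
            }

            public static void main(String[] args) {
                for (int i = 0; i < 100; i++) {
                    leakySearch("lucene");
                }
                System.out.println("open after leaky searches:  " + OPEN.get());
                for (int i = 0; i < 100; i++) {
                    sharedSearch("lucene");
                }
                System.out.println("open after shared searches: " + OPEN.get());
            }
        }
        ```

        If handles really are being reused and closed properly, the other usual suspect is a per-process file descriptor limit that is simply too low for the number of index files, which is an OS setting rather than a code change.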
  • performance
    2004-04-30 22:36:27  ravitiru

    I am testing Lucene with 40 million articles. It is taking 4 to 5 days to index the articles, and the search time is around 2 to 5 seconds. I am still working on optimizing the index.
    • performance
      2008-06-04 03:35:39  Rohit Arora

      Hi,

      I am new to Lucene. Can you please help me create multiple indexes in Solr/Lucene?

      with regards
      Rohit Arora