Date: Sat, 1 May 2004 10:44:15 +1000
From: Lachlan Andrew <lha@users.sourceforge.net>
To: Joe R. Jah <jjah@cloud.ccsf.cc.ca.us>
Cc: htdig-dev@lists.sourceforge.net
Subject: Next major speedup

Greetings all,

Thanks for the new timings, Joe.  It's good to see that it is a bit 
faster than before exclude_perform.1 was applied.  (Any idea why it 
was so slow last time?)

It looks like we still have a lot of work to do to get performance 
like 3.1.6!!

I've finally managed to get some profiles that I trust, by linking 
statically (although they still don't match Joe's).  They showed that 
flushing the words in DocumentRef::AddDescription() was taking half 
of the time.  Does anyone know why that flush was added?

I've commented out the offending flush (see attached patch).  The 
resulting database compares identical to the previous one, but the 
dig takes about 1/3 of the CPU time.  I didn't measure wall-clock 
time, but I expect a similar improvement, since the database isn't 
being thrashed so much.
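
To illustrate why I'd expect the output to be unchanged, here is a 
minimal sketch.  It is not the ht://Dig code: BufferedStore and 
IndexDescriptions are hypothetical stand-ins, and the only assumption 
is that buffered words reach the database in the same order whether 
they are flushed after every description or once at the end.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a buffered word store; NOT the ht://Dig API.
class BufferedStore {
public:
    // Queue a word in memory; nothing is written yet.
    void Add(const std::string& word) { pending_.push_back(word); }

    // Move everything queued so far into the "database" and count the trip.
    // Flushing an empty buffer is a no-op and is not counted.
    void Flush() {
        if (pending_.empty()) return;
        stored_.insert(stored_.end(), pending_.begin(), pending_.end());
        pending_.clear();
        ++flushes_;
    }

    const std::vector<std::string>& Stored() const { return stored_; }
    std::size_t FlushCount() const { return flushes_; }

private:
    std::vector<std::string> pending_;
    std::vector<std::string> stored_;
    std::size_t flushes_ = 0;
};

// Adds every description word by word.  Flushes after each description
// only when flush_each is true, and always once at the very end.
BufferedStore IndexDescriptions(
    const std::vector<std::vector<std::string>>& descriptions,
    bool flush_each) {
    BufferedStore store;
    for (const auto& words : descriptions) {
        for (const auto& w : words) store.Add(w);
        if (flush_each) store.Flush();
    }
    store.Flush();
    return store;
}
```

With three descriptions, the flush_each variant makes three trips to 
the store and the deferred variant makes one, yet Stored() is the 
same in both cases.  Whether the real word list behaves this way is 
exactly what I'd like people to check.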

Could people please check over (or test) the patch?

Thanks,
Lachlan

On Wed, 28 Apr 2004 12:10 pm, Joe R. Jah wrote:
>
> I ran dig again today.  Here is the time comparison with other
> versions and/or patches:
>
> htdig-3.2.0b5:
>
>   Total dig time:  01:27:35 == 5255 seconds
>
>  With exclude_perform.0
>
>   Total dig time:  00:58:48 == 3528 seconds, or ~33% less time
>
>  With exclude_perform.1
>
>   Total dig time:  00:57:35 == 3455 seconds, or ~34% less time
>
>  With exclude_perform.1 and store_phrases.0
>   store_phrases: true
>   Total dig time:  01:00:17 == 3617 seconds, or ~31% less time
>   store_phrases: false
>   Total dig time:  00:49:01 == 2941 seconds, or ~44% less time
>
> And since store_phrases.0 is meant to recreate digging speeds
> comparable to 3.1.6:
>
> htdig-3.1.6:
>
>   Total dig time:  00:14:59 == 899 seconds, or ~83% less time
>
> The profiles are in
> ftp://ftp.ccsf.org/htdig-patches/3.2.0b5/0Profiles/
>
>  htdig.gmon.exclude_perform.1-store_phrases.0-false.gz
>  htdig.gmon.exclude_perform.1-store_phrases.0-true.gz
>
> Regards,
>
> Joe
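
For anyone wanting to recheck the figures quoted above, the 
"% less time" values follow from the second counts, taking the 
htdig-3.2.0b5 run (5255 seconds) as the baseline.  A small sketch; 
the helper names ToSeconds and PercentLess are my own, and rounding 
to the nearest whole percent is an assumption about how the figures 
were produced:

```cpp
#include <cmath>

// Convert an hh:mm:ss duration to a number of seconds.
constexpr int ToSeconds(int hours, int minutes, int seconds) {
    return hours * 3600 + minutes * 60 + seconds;
}

// Time saved relative to the baseline, as a whole percentage
// (rounded to nearest; the rounding rule is an assumption).
inline int PercentLess(int baseline_seconds, int seconds) {
    const double saved = baseline_seconds - seconds;
    return static_cast<int>(std::lround(100.0 * saved / baseline_seconds));
}
```

For example, PercentLess(5255, ToSeconds(0, 14, 59)) gives 83 for the 
3.1.6 run, and PercentLess(5255, 2941) gives 44 for store_phrases: 
false.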

-- 
lha@users.sourceforge.net
ht://Dig developer DownUnder  (http://www.htdig.org)

--- ../cvs/htdig/htcommon/DocumentRef.cc	2004-01-17 16:17:49.000000000 +1100
+++ htcommon/DocumentRef.cc	2004-05-01 09:25:43.000000000 +1000
@@ -518,7 +518,7 @@
     }
 
     // And let's flush the words! (nice comment hu :-)
-    words.Flush();
+//    words.Flush();
     
     // Now are we at the max_description limit?
     if (descriptions.Count() >= max_descriptions)
