ircd-hybrid-7.1beta1 vs ircd-ratbox-2.0.8 hash algorithm comparisons

Wed Apr 27 07:19:26 EDT 2005

Well, as I mentioned in the previous mail, the algorithm isn't the best, 
so we're not going to gloss over the thing.  I would expect a maximum 
chain length of something between 20 and 40 when using a table with a 
size of 2^15.
Anyway, we have just introduced the FNV algorithm so you might be 
interested in downloading the latest snapshot.  Btw, I will test Bob 
Jenkins' algorithm, if I have some more time.  It's also a fast 
algorithm but possibly more effective.

-Michael

Rachel Llorenna wrote:
> Hrm, I thought it might be something like that. I'll recompile without
> --enable-small-net and try again. But, I've enabled that on both
> ircd-ratbox and ircd-hybrid; but as AndroSyn mentioned on the
> ircd-ratbox mailing list, it currently does not reduce the table size
> even when that flag is used. I'll resubmit my new results to both
> lists.
> 
> On 4/24/05, Michael Wobst <michael at elite-web.de> wrote:
> 
>>The current algorithm we use is by far not the best one, but it looks
>>like you enabled the small-net feature.  The size of the table is then
>>4096 instead of 32768, that's why you have such long chains.
>>
>>-Michael
>>
>>
>>Rachel Llorenna wrote:
>>
>>>I connected several (150,000) fake clients via siege.pl, a lightweight ircd
>>>stress testing script I wrote. Drones were connected with very similar nicknames
>>>(userXXXXXX where XXXXXX is the client number that ranged between 1 and 150000).
>>>The UID's were pretty similar too, sequentially they went from 0AZAAAAAA to
>>>0AZAAINXF, but this is expected behavior. Although nicknames are not expected to
>>>be too close to each other during normal operation, UID's can reasonably assumed
>>>as such.
>>>
>>>The main hub (hybrid/bin/ircd) ran ircd-hybrid-7.1beta. The test leaves utilized
>>>ircd-hybrid-7.1beta (hybrid2/bin/ircd) and ircd-ratbox-2.0.8 (ratbox/bin/ircd)
>>>
>>>Conclusions:
>>>ircd-hybrid showed the least memory consumption, but was extremely slow in terms
>>>of being able to generate statistics (one server took 351.56 seconds to respond
>>>and the other took 442.25). With such similar UID's, the distribution became
>>>extremely poor, indicating a potential denial of service attack. I don't think
>>>that ircd-hybrid7.1beta is suitable for EFnet at time of writing for this
>>>reason. UID's are similar by their nature, so the creation of such large chains
>>>(the maximum chain length was 237 client pointers), which would have to be
>>>walked when seeking a specific client. Hopefully this will be corrected before
>>>7.1 becomes a release candidate; it's definitely not ready for a stable release
>>>with its current algorithm.
>>>
>>>ircd-ratbox's distribution is much better, at the cost of consuming substantial
>>>amounts of memory compared to ircd-hybrid (about 3.5 times more, in fact). Most
>>>nodes end up having only one or two entries. It should be noted that ircd-ratbox
>>>uses the Fowler/Noll/Vo (FNV) hashing algorithm, which is proven to be fast
>>>while maintaining a low collision rate, even with keys that are as similar as
>>>UID's are. With such short chains, the FNV algorithm seems to be working nicely
>>>for EFnet.
>>>
>>>cerberus used by far the most memory but presumably had almost no collisions
>>>(with 150,000 clients, 113,000 nodes in use meant that most contained one or two
>>>clients). Perl's hashing algorithm (from 5.8 and on) has an extra shift to avoid
>>>collisions, at the expense of much more memory consumption. As well, the number
>>>in %MEM includes other hashes, like nickname hashes and server hashes, too.
>>>
>>>Evidence:
>>>USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
>>>frequen    767  0.0  7.9 74424 22748 ?       Ss   Apr14   4:13 hybrid/bin/ircd
>>>entries: 150010 buckets: 1565 max chain: 237
>>>
>>>frequen    773  0.0  1.1 74280 3400 ?        Ss   Apr14   2:09 hybrid2/bin/ircd
>>>entries: 150007 buckets: 1586 max chain: 212
>>>
>>>frequen  31505  0.0 25.7 82288 73248 ?       Ss   Apr22   0:03 ratbox/bin/ircd
>>>Size: 65536 Empty: 5967 (9.105%)
>>>Average depth: 2.000/2.000 Highest depth: 11
>>>Nodes with 0 entries: 5967
>>>Nodes with 1 entries: 15259
>>>Nodes with 2 entries: 18075
>>>Nodes with 3 entries: 13697
>>>Nodes with 4 entries: 7565
>>>Nodes with 5 entries: 3287
>>>Nodes with 6 entries: 1174
>>>Nodes with 7 entries: 364
>>>Nodes with 8 entries: 118
>>>Nodes with 9 entries: 23
>>>Nodes with 10 entries: 7
>>>
>>>frequen  31532  0.2 40.1 122468 114460 pts/2 S+   Apr22   2:26
>>>cerberus: cerberus.int
>>>113269 of 262144 buckets in use
>>>
>>>Suitability:
>>>ircd-ratbox seems well geared towards TS6-compatible networks such as EFnet,
>>>while ircd-hybrid seems to have several features geared towards smaller IRC
>>>networks, where such a collision rate is acceptable. ircd-hybrid might be
>>>desirable when memory consumption is a concern (public shell providers, etc)
>>>
>>>If you have any comments or anything to add (maybe I just happened to
>>>find a particularly bad test case?), feel free to reply back to the
>>>list. I would especially appreciate commentary from the developers of
>>>ircd-hybrid in case this issue has been fixed for new versions.
>>
> 
>