ircd-hybrid-7.1beta1 vs ircd-ratbox-2.0.8 hash algorithm comparisons

Sun Apr 24 05:50:57 EDT 2005

The current algorithm we use is by far not the best one, but it looks 
like you enabled the small-net feature.  The size of the table is then 
4096 instead of 32768, that's why you have such long chains.

-Michael

Rachel Llorenna wrote:
> I connected several (150,000) fake clients via siege.pl, a lightweight ircd
> stress testing script I wrote. Drones were connected with very similar nicknames
> (userXXXXXX where XXXXXX is the client number that ranged between 1 and 150000).
> The UID's were pretty similar too, sequentially they went from 0AZAAAAAA to
> 0AZAAINXF, but this is expected behavior. Although nicknames are not expected to
> be too close to each other during normal operation, UID's can reasonably assumed
> as such.
> 
> The main hub (hybrid/bin/ircd) ran ircd-hybrid-7.1beta. The test leaves utilized
> ircd-hybrid-7.1beta (hybrid2/bin/ircd) and ircd-ratbox-2.0.8 (ratbox/bin/ircd)
> 
> Conclusions:
> ircd-hybrid showed the least memory consumption, but was extremely slow in terms
> of being able to generate statistics (one server took 351.56 seconds to respond
> and the other took 442.25). With such similar UID's, the distribution became
> extremely poor, indicating a potential denial of service attack. I don't think
> that ircd-hybrid7.1beta is suitable for EFnet at time of writing for this
> reason. UID's are similar by their nature, so the creation of such large chains
> (the maximum chain length was 237 client pointers), which would have to be
> walked when seeking a specific client. Hopefully this will be corrected before
> 7.1 becomes a release candidate; it's definitely not ready for a stable release
> with its current algorithm.
> 
> ircd-ratbox's distribution is much better, at the cost of consuming substantial
> amounts of memory compared to ircd-hybrid (about 3.5 times more, in fact). Most
> nodes end up having only one or two entries. It should be noted that ircd-ratbox
> uses the Fowler/Noll/Vo (FNV) hashing algorithm, which is proven to be fast
> while maintaining a low collision rate, even with keys that are as similar as
> UID's are. With such short chains, the FNV algorithm seems to be working nicely
> for EFnet.
> 
> cerberus used by far the most memory but presumably had almost no collisions
> (with 150,000 clients, 113,000 nodes in use meant that most contained one or two
> clients). Perl's hashing algorithm (from 5.8 and on) has an extra shift to avoid
> collisions, at the expense of much more memory consumption. As well, the number
> in %MEM includes other hashes, like nickname hashes and server hashes, too.
> 
> Evidence:
> USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
> frequen    767  0.0  7.9 74424 22748 ?       Ss   Apr14   4:13 hybrid/bin/ircd
> entries: 150010 buckets: 1565 max chain: 237
> 
> frequen    773  0.0  1.1 74280 3400 ?        Ss   Apr14   2:09 hybrid2/bin/ircd
> entries: 150007 buckets: 1586 max chain: 212
> 
> frequen  31505  0.0 25.7 82288 73248 ?       Ss   Apr22   0:03 ratbox/bin/ircd
> Size: 65536 Empty: 5967 (9.105%)
> Average depth: 2.000/2.000 Highest depth: 11
> Nodes with 0 entries: 5967
> Nodes with 1 entries: 15259
> Nodes with 2 entries: 18075
> Nodes with 3 entries: 13697
> Nodes with 4 entries: 7565
> Nodes with 5 entries: 3287
> Nodes with 6 entries: 1174
> Nodes with 7 entries: 364
> Nodes with 8 entries: 118
> Nodes with 9 entries: 23
> Nodes with 10 entries: 7
> 
> frequen  31532  0.2 40.1 122468 114460 pts/2 S+   Apr22   2:26
> cerberus: cerberus.int
> 113269 of 262144 buckets in use
> 
> Suitability:
> ircd-ratbox seems well geared towards TS6-compatible networks such as EFnet,
> while ircd-hybrid seems to have several features geared towards smaller IRC
> networks, where such a collision rate is acceptable. ircd-hybrid might be
> desirable when memory consumption is a concern (public shell providers, etc)
> 
> If you have any comments or anything to add (maybe I just happened to
> find a particularly bad test case?), feel free to reply back to the
> list. I would especially appreciate commentary from the developers of
> ircd-hybrid in case this issue has been fixed for new versions.