danudey at gmail.com
Tue Feb 22 20:07:41 EST 2005
With case-sensitive parsing and only capital letters in use, you only
have to check each character against [A-Z] (26 possibilities per
character), as opposed to [A-Za-z] (52 possibilities per character).
If it's not A-Z, throw it out and move on, instead of looking it up to see if it's
Basically, yes, there are more possible combinations in total, but
there are fewer possibilities per character, which speeds up looping
through the characters.
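
As a rough illustration of the per-character check described above, here
is a minimal Python sketch. The names is_valid_id, UPPER_ONLY and
MIXED_CASE are made up for this example and do not come from any ircd; a
real server would do this in C, and the sketch makes no claim about how
large the speed difference is in practice.

```python
import string

# Hypothetical alphabets, based on the character classes discussed in this thread.
UPPER_ONLY = frozenset(string.ascii_uppercase + string.digits)  # [A-Z0-9]
MIXED_CASE = frozenset(string.ascii_letters + string.digits)    # [a-zA-Z0-9]


def is_valid_id(candidate, alphabet, length):
    """Check that candidate is exactly `length` characters long, starts with
    a letter from `alphabet`, and otherwise uses only characters from it."""
    if length < 1 or len(candidate) != length:
        return False
    first, rest = candidate[0], candidate[1:]
    # The first character must be a letter, never a digit.
    if first not in alphabet or first in string.digits:
        return False
    # Reject as soon as any remaining character falls outside the alphabet.
    return all(ch in alphabet for ch in rest)


print(is_valid_id("AAAAAB", UPPER_ONLY, 6))  # accepted by the upper-case scheme
print(is_valid_id("AaAAAB", UPPER_ONLY, 6))  # rejected: lower-case letter
print(is_valid_id("aB3z", MIXED_CASE, 4))    # accepted by the mixed-case scheme
```

The only difference between the two schemes here is the size of the set
each character is tested against: 36 entries versus 62.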
Your idea wouldn't be faster by virtue of being case-sensitive,
because the server already parses case-sensitively. For a string of
the same length it would actually be slower, because there are more
possibilities to compare each character against. I'm not sure what the
practical performance hit of this would be.
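
For reference, the ID-space figures quoted below can be reproduced with a
few lines of Python. This is only a sketch: id_space is a hypothetical
helper, and the current SID layout of [0-9][A-Z0-9][A-Z0-9] is an
assumption inferred from the 12960 figure in Bill's reply.

```python
def id_space(first_choices, other_choices, length):
    """Number of distinct IDs when the first character has `first_choices`
    options and each of the remaining characters has `other_choices`."""
    return first_choices * other_choices ** (length - 1)


# Current scheme: [A-Z] followed by five [A-Z0-9] characters.
current_ids = id_space(26, 36, 6)    # 1572120576
# Proposed scheme: [a-zA-Z] followed by three [a-zA-Z0-9] characters.
proposed_ids = id_space(52, 62, 4)   # 12393056

# SIDs: a digit followed by alphanumerics (current layout is assumed).
current_sids = id_space(10, 36, 3)   # 12960
proposed_sids = id_space(10, 62, 2)  # 620

print(current_sids * current_ids)    # 20374682664960 users per network
print(proposed_sids * proposed_ids)  # 7683694720 users per network
```

So the proposed scheme shrinks the per-network space by a factor of a few
thousand while saving three bytes per UID.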
On Tue, 22 Feb 2005 19:57:32 -0500, Rachel Llorenna <rachies at gmail.com> wrote:
> Based on what you just noted, the current system is pretty wasteful
> then, since there are obviously so many more combinations than
> necessary. The amount of bandwidth that you save is minimal in
> comparison to proper compression techniques, but that was merely
> stated as a side effect. You say that case-sensitive is faster, which
> means that my idea is therefore faster. But according to what you said
> afterwards, I assume you meant case-insensitive. That makes me wonder,
> is it because of the tries in use for message parsing?
> On Tue, 22 Feb 2005 14:23:39 -1000, Bill Bierman <bill at mu.org> wrote:
> > Rachel Llorenna wrote:
> > >I'm curious as to why the TS6 documentation
> > >(http://www.leeh.co.uk/ircd/TS6.txt) chooses to use only capital
> > >letters and numbers for ID's, effectively making them case
> > >insensitive. It would greatly increase the number of available unique
> > >ID's to use both upper- and lowercase, would it not? Then, you could
> > >reduce the number of characters and maybe save a few bytes of
> > >bandwidth per user.
> > >
> > >Current ID implementation (6 bytes):
> > >[A-Z][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9][A-Z0-9]
> > >26*36*36*36*36*36
> > >1572120576 users per server
> > >(This of course assumes that I understand ID's correctly, in that ID's
> > >need to be unique to servers only, since they will always be unique to
> > >the network when prefixed with the SID)
> > >
> > >My proposed idea (4 bytes):
> > >[a-zA-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]
> > >52*62*62*62
> > >12393056 users per server
> > >This is a huge amount and should be sufficient on a per-server basis
> > >for many, many years to come. Also, remember that these are only ID's,
> > >full UID's will be 3 characters longer since they have the SID
> > >prefixed.
> > >
> > >We could also modify the SID's to use only two characters:
> > >[0-9][a-zA-Z0-9]
> > >10*62
> > >620 servers (It's unrealistic to have more than this on a production network..)
> > >
> > >[0-9][a-zA-Z0-9][a-zA-Z][a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]
> > >10*62*52*62*62*62
> > >7683694720 users per network
> > >
> > >Perhaps, though, this has something to do with the way in which the ID
> > >hashes are set up. Still, I somehow doubt that we would ever need
> > >enough ID's for 1572120576 users per server.
> > >
> > >
> > Currently, there are 12960 possibilities for SIDs. This is certainly
> > more than enough.
> > There are also 1572120576 possibilities for users. This gives a
> > possible user capacity of 20374682664960 users. (Roughly 3200 times the
> > population of the world). Certainly this is plenty.
> > The reason for all upper case (if I remember correctly) is actually two
> > reasons. First, consistency. The original RFC and historic IRC Daemons
> > accepted commands only in upper case. Secondly, in terms of parsing,
> > case sensitive is much faster. If you need not deal with both A-Z and
> > a-z, and distinguish between them (which you would have to do, under
> > what you're suggesting), things will go much faster.
> > If you are concerned with bandwidth, I would suggest using compression.
> > Bill
> Rachel Llorenna (frequency)