Comparison based sorting needs Ω(lg*n*!)=Ω(*n* lg*n*)
comparisons because each comparison yields one bit of information
and there are *n*! permutations. If you are sorting integers then
you can do better, and not only asymptotically. Radix sort works
as follows:

**int**> bucket[

**2**][

**256**];

**void**rsort(vector<

**int**>& v) {

**int**i, j, k;

**for**(i =

**0**; i < v.size(); ++i)

bucket[

**0**][v[i]&

**0x000000ff**].push_back(v[i]);

**for**(i =

**0**; i <

**256**; ++i) {

**while**(!bucket[

**0**][i].empty()) {

**int**&e = bucket[

**0**][i].front();

bucket[

**1**][(e&

**0x0000ff00**)>>

**8**].push_back(e);

bucket[

**0**][i].pop_front();

}

}

**for**(i =

**0**; i <

**256**; ++i) {

**while**(!bucket[

**1**][i].empty()) {

**int**&e = bucket[

**1**][i].front();

bucket[

**0**][(e&

**0x00ff0000**)>>

**16**].push_back(e);

bucket[

**1**][i].pop_front();

}

}

**for**(i =

**0**; i <

**256**; ++i) {

**while**(!bucket[

**0**][i].empty()) {

**int**&e = bucket[

**0**][i].front();

bucket[

**1**][(e&

**0xff000000**)>>

**8**].push_back(e);

bucket[

**0**][i].pop_front();

}

}

k =

**0**;

**for**(i =

**0**; i <

**256**; ++i) {

**while**(!bucket[

**1**][i].empty()) {

v[k++] = bucket[

**1**][i].front();

bucket[

**1**][i].pop_front();

}

}

}

This straightforward (and somewhat repetitive) code is faster than
*std::sort* if you sort more than one thousand integers.

Whenever someone says that radix sort works in linear time there
is someone that feels compelled to point out that saying “linear
time” is cheating. I really don’t care whether it “works in linear
time” or not, but I can tell you that it is about two to three times
faster than *std::sort*. We can still analyze the runtime and
get a better idea of just how fast it is, but we need to be a bit
more precise.

Let’s say we work on a computer with 2^{k} bit words. Typical computers
nowadays have *k*=5 or *k*=6, and the code above assumes *k*=5. We
group the bits of a word in *digits*, each with 2^{w} bits.
In the example above *w*=3. We have *b*=2^{2w} *buckets*
and *d*=2^{k−w} digits. The number of operations of the code above
is proportional to *n*+(*b*+*n*)*d*=*n*+(2^{2w}+*n*)2^{k−w}. We can’t
change how many numbers we must sort (*n*) or the machine on which
the program runs (*k*). But we can change *w*. What is the optimum
value? If we let the derivative with respect to *w* equal 0 we get
2^{2w+w}ln2=2^{w}+*wn*. That looks horrible. But it has one interesting
feature: It doesn’t contain *k*, which means that *the optimum
number of bits per digit does not depend on the machine on which
we do the sorting*. It can be a 8 bit machine, a 32 bit machine,
a 64 bit machine, or even a 8192 bit machine, we should still
have the same number of bits per digit for a certain size of the
list that we are sorting.

We now turn to approximations. It’s probably safe to assume 2^{w}≫ *w*
and *wn*≫2^{w}. We are left with 2^{2w}ln2=*wn*, which is still
horrible. Since the left side varies *much* faster than the
right one we can replace the *w* on the right with a (yet unknown)
suitable constant *w*′. We get that 2^{w} varies as *O*(lg*n*).
(Note that this is an *upper* bound.) If we plug this back
into the running time formula we finally get *O*(*n* 2^{k}/lg*n*).
Here 2^{k} is how many bits the machine has per word and lg*n*
is how many bits we need to write down the length of the list we
are sorting. So radix sort should be quite good when we have to sort
lots of numbers. And indeed, as I said, it clearly beats *std:sort*.

Now, if you have to sort *really* lots of numbers, then you
should probably look at *funnelsort*.

## No comments:

## Post a Comment

Note: (1) You need to have third-party cookies enabled in order to comment on Blogger. (2) Better to copy your comment before hitting publish/preview. Blogger sometimes eats comments on the first try, but the second works. Crazy Blogger.