QuickSort Using Linked Lists
By Bill McDaniel
The literature abounds with various versions of the 'Quicksort' algorithm
first described by C.A.R. Hoare [1][2]. The algorithm is an 'in place' sort
and is faster than most other methods. A simple version, that follows, is in
two basic parts. Part 1 is the PutFirst algorithm which accepts a group of
numbers and places that first (leftmost) number, which is usually called the
'pivot', into a position such that all of the numbers to the left are less
than the pivot and all of the numbers to the right are greater than the pivot.
The algorithm, in Pascal, is as follows:
1 procedure PutFirst (first,last : integer; var pos : integer);
2 var
3 p,l,r : integer;
4 begin
5 l := first;
6 r := last;
7 p := l;
8 while l < r do
9 begin
10 while (l < r) and (Ar[p] <= Ar[r]) do r := r - 1;
11 swap(Ar[r],Ar[p]);
12 p := r;
13 while (l < r) and (Ar[p] >= Ar[l]) do l := l + 1;
14 swap(Ar[l],Ar[p]);
15 p := l;
16 end;
17 pos := p
18 end;
(Note: the line numbers are not part of pascal, but have been included for
clarity.) The logic is straightforward. Line 10 searches for a number
to the right of pivot that is less than pivot. When that number is found,
the number is swapped with the pivot. Line 13 searches for a number to the
left of pivot that is greater than pivot. When that number is found, the
number is swapped with the pivot. When there are no more numbers 'on the
wrong side' of the pivot, the routine is done, and the output variable 'Pos'
is assigned the position of the pivot.
When control returns to the calling routine, pos will point to a number
that is in it's final position (i.e. it will never have to be moved again).
The group of numbers to the left of the pivot can then be sorted, and the
group of numbers to the right of the pivot can also be sorted, after which the
entire group of numbers will be sorted. The code of the calling program is as
follows:
19 procedure quick_sort(first,last : integer);
20 var
21 pos : integer;
22 begin
23 put_first(first,last,pos);
24 if pos-first > opt then quick_sort(first,pos-1);
25 if last-pos > opt then quick_sort(pos+1,last)
26 end;
An inductive proof suffices to show that any group of numbers can be sorted
using this method.
One of the properties of PutFirst is that all of the elements of array
Ar have to be directly addressable (see lines 10,11,13 and 14) and that the
maximum number of numbers to be sorted have to be known, a priori. Sometimes
this is not easily accomplished. If the items to be sorted are large in
number then it might not be possible to load all of the items into memory at
the same time. Also, due to the proliferation of micro-computers, especially,
those that are based on the 8086 chip, and due to the addressing constraints
imposed by that hardware, it may not be possible for a programmer to declare
a very large array in his/her program. Many languages limit the amount of
declared storage (in the users workspace) to be approximately 65535 bytes.
And yet, a typical PC has from 256K (K = 1024) bytes to 640K bytes available.
Accessability to this extra storage greatly expands the capability of a user
program. Typically, the user will select a language that supports dynamic
storage allocation using pointer varibles. Several popular languages are
currently available that, indeed, have this support, among them are Pascal,
Ada, and C.
This paper describes an algorithm that will accomplish the equivalent
of a QuickSort by using dynamic storage allocation techniques instead of
a fixed size array.
The technique described uses Circular Linked Lists with tail pointers.
A record called 'node', which is indirectly recursively defined, contains
an information portion and a link portion. The link points to another
instance of this record type. Each linked list will be made up of zero or
more instances of these records.
27 type
28 ptr = ^ node;
29 node =
30 record
31 info : integer;
32 link : ptr
33 end;
Two linked list support routines are included: append and remove.
The former will append a list (of zero or more records) onto the end of
another list (of zero or more records), and the latter will remove a single
record from the front of a list of one or more records.
These two routines support a queue.
34 procedure append (var p,q : ptr);
35 { append q onto the end of p - set q to nil }
36 var
37 t : ptr;
38 begin
39 if p = nil then
40 begin
41 p := q;
42 q := nil
43 end
44 else
45 begin
46 if q = nil then { p is already nil - do nothing }
47 else
48 begin
49 t := q^.link;
50 q^.link := p^.link;
51 p^.link := t;
52 p := q;
53 q := nil
54 end
55 end
56 end;
57 procedure remove(var r,s : ptr);
58 { in : r^list;
59 out : r^list; s^node }
60 var
61 t : ptr;
62 begin
63 if r = nil then s := nil
64 else
65 begin
66 if r^.link = r then
67 begin
68 s := r;
69 r := nil
70 end
71 else
72 begin
73 s := r^.link;
74 r^.link := r^.link^.link;
75 s^.link := s
76 end
77 end
78 end;
With the availability of these two routines, the logic of the sort
procedure is as follows:
procedure sort ( var r : ptr )
get the pivot (the first item in the list pointed to by r)
split on the pivot (r1,r2)
if r1 is not empty then sort(r1)
if r2 is not empty then sort(r2)
combine r1,pivot,r2
r = r1
The crux of the above algorithm is the line that contains the phrase 'split
on the pivot'. That one line of code expands to a while loop that removes,
in turn, each item from the list pointed to by r, and places that item into
one of two linked lists depending on the relationship between that item and
the pivot. If the number is less than the pivot then the number goes into the
first list, if the number is greater than the pivot then the number goes into
the second list.
while r <> nil do
begin
remove(r,t);
b := t^.info > pivot^.info;
count[b] := count[b] + 1;
append(tail[b],t);
end;
When the while loop is finished, the numbers will have been partitioned into
two groups. All of the numbers in the first group are less than then pivot,
and all of the numbers in the second group are greater than the pivot. The
two groups are then sorted, in turn, then they are merged together along with
the pivot in the following order: all of the items in group one are followed
by the pivot which is then followed by all of the items in group two. Since
group one is sorted (and all of the numbers are less than the pivot), and
since group two is sorted (and all of those numbers are greater than the
pivot), then the entire group is sorted. The entire sorting procedure
follows:
79 procedure sort ( var r : ptr);
80 var
81 t,
82 pivot : ptr;
83 b : boolean;
84 tail : array [false..true] of ptr;
85 count : array [false..true] of integer;
86 begin
87 remove(r,pivot);
88 for b := false to true do
89 begin
90 tail[b] := nil;
91 count[b] := 0
92 end;
93 while r <> nil do
94 begin
95 remove(r,t);
96 b := t^.info > pivot^.info;
97 count[b] := count[b] + 1;
98 append(tail[b],t);
99 end;
100 for b := false to true do
101 if count[b] > 1 then
102 begin
103 if tail[b] <> nil then sort(tail[b]);
104 end;
105 append(tail[false],pivot);
106 append(tail[false],tail[true]);
107 r := tail[false]
108 end;
Line 87 removes the pivot, lines 88-92 do some initialization, lines 93-99
do the split, lines 100-104 sort the two split groups (if necessary), and
lines 105-106 tie everything together into one (1) linked list.
The order of the above algorithm is comparable with quicksort. The
worst case behavior is n*log2(n). If the numbers are 'almost' sorted upon
entry, then the order is n*n (the same as quicksort). However, the
performance can be greatly improved by checking to see if the current number
(in the current output list) is greater than the previous number (in the
current output list). If if this relation holds for all the numbers in any
one list, then the sublist does not have to be sorted (at line 103). This
modification capitalizes on sublists that are already in order.
A variation of this algorithm can be used to sort records on a disk.
Rather than use linked lists, one can read records to a file and send records
to two output files. The time requirement is greatly increased (because
of the slowness of the disk media) but the number of records to be sorted
is limited only by the size of the disk.
Conclusions
This paper describes a sorting algorithm based on the Quicksort, but
implemented with linked lists. The order of the algorithm is the same as
Quicksort but the storage requirements are different. More storage must
be used to hold the link pointers; however, this is offset by the general
usefulness both on a PC and on any computer using this method to sort records
on a disk.
B i b l i o g r a p h y
[1] Hoare, C.A.R. "Quicksort", Computer Journal, April 1962, pp. 10-15.
This is the earliest reference to Quicksort. Nearly any book on
Data Structures would contain a description of this algorithm.
[2] Knuth, D.E. The Art of Computer Programming. Volume 3: Sorting and
Searching. Addison-Wesley, 1973. This book was the first hardcover
publication of a variety of sorting algorithms of which the Quicksort
is referred to as 'The Partition Exchange Sort'.