Handout 31

Sorting II*


Shell Sort

The idea of Shellsort is the following:

  1. arrange the data sequence in a two-dimensional array
  2. sort the columns of the array

The effect is that the data sequence is partially sorted. The process above is repeated, but each time with a narrower array, i.e. with a smaller number of columns. In the last step, the array consists of only one column. In each step, the sortedness of the sequence is increased, until in the last step it is completely sorted. However, the number of sorting operations necessary in each step is limited, due to the presortedness of the sequence obtained in the preceding steps.

Example: Let  3 7 9 0 5 1 6 8 4 2 0 6 1 5 7 3 4 9 8 2  be the data sequence to be sorted. First, it is arranged in an array with 7 columns (left), then the columns are sorted (right):

3
7
9
0
5
1
6
8
4
2
0
6
1
5
7
3
4
9
8
2
 

   

 

 arrow 

   

3
3
2
0
5
1
5
7
4
4
0
6
1
6
8
7
9
9
8
2
 

Data elements 8 and 9 have now already come to the end of the sequence, but a small element (2) is also still there. In the next step, the sequence is arranged in 3 columns, which are again sorted:

3
3
2
0
5
1
5
7
4
4
0
6
1
6
8
7
9
9
8
2
 

   

 

 

 

 arrow 

   

0
0
1
1
2
2
3
3
4
4
5
6
5
6
8
7
7
9
8
9
 

Now the sequence is almost completely sorted. When arranging it in one column in the last step, it is only a 6, an 8 and a 9 that have to move a little bit to their correct position.

Shell sort requires asymptotically fewer than O(n²) comparisons and exchanges in the worst case. Although it is easy to develop an intuitive sense of how this algorithm works, it is very difficult to analyze its execution time, but estimates range from O(nlog2 n) to O(n1.5) depending on implementation details.
Another Shell Sort Example

Binary Tree Sort

A binary search tree (BST) is a binary tree which has the following properties:


A binary search tree can be used to implement a simple but inefficient sorting algorithm. Similar to insertion sort, we insert all the values we wish to sort into a new ordered data structure, in this case a binary search tree, then traverse it in order, building our result.
The worst-case time of build_binary_tree is O(n2) � if you feed it a sorted list of values, it chains them into a linked list with no left subtrees. For example, build_binary_tree([1, 2, 3, 4, 5]) yields the tree (None, 1, (None, 2, (None, 3, (None, 4, (None, 5, None))))).

Merge Sort

Conceptually, merge sort works as follows:

Mergesort is a O(nlog(n)) sorting algorithm. It is easy to implement merge sort such that it is stable, meaning it preserves the input order of equal elements in the sorted output. It is an example of the divide and conquer algorithmic paradigm.

Heapsort

A heap is a specialized tree-based data structure that satisfies the heap property. Its base datatype (used for node keys) must be an ordered set.

If B is a child node of A, then the heap property is that:

key(A) >= key(B)

This implies that the greatest element is always in the root node, and such a heap is sometimes called a max heap. (Alternatively, if the comparison is reversed, the smallest element is always in the root node, which results in a min heap.) This is why heaps are used to implement priority queues. The efficiency of heap operations is crucial in several graph algorithms.

Example of a complete binary max heap


Although somewhat slower in practice on most machines than a good implementation of quicksort, it has the advantages of worst-case O(n log n) runtime and being an in-place algorithm. Heapsort is not a stable sort.

Heapsort example

QuickSort

Quicksort sorts by employing a divide and conquer strategy to divide a list into two sub-lists.

The steps are:

Quicksort on average, makes O(n log n) comparisons to sort n items. However, in the worst case, it makes O(n2) comparisons. Typically, quicksort is significantly faster in practice than other O(n log n) algorithms, because its inner loop can be efficiently implemented on most architectures, and in most real-world data it is possible to make design choices which minimize the possibility of requiring quadratic time. Quicksort is a comparison sort and, in efficient implementations, is not a stable sort.

A Comparison of Sorts

In this table, n is the number of records to be sorted and k is the average length of the keys. The columns "Best", "Average", and "Worst" give the time complexity in each case; estimates that do not use k assume k to be constant. "Memory" denotes the amount of auxiliary storage needed beyond that used by the list itself. These are all comparison sorts.

Name

Best

Average

Worst

Memory

Stable

Method

Shell sort

---

---

O(n1.5)

O(1)

No

Insertion

Binary tree sort

O(nlog(n))

O(nlog(n))

O(nlog(n))

O(n)

Yes

Insertion

Merge sort

O(nlog(n))

O(nlog(n))

O(nlog(n))

O(n)

Yes

Merging

Heapsort

O(nlog(n))

O(nlog(n))

O(nlog(n))

O(1)

No

Selection

Quicksort

O(nlog(n))

O(nlog(n))

O(n2)

O(log n)

No

Partitioning

Stability

Stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). That is, a sorting algorithm is stable if whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list.


* portions from here.