06 SortingB MergeSort

Uploaded by

lukiluki
CS2040S

Data Structures and Algorithms

Welcome!
Sorting, Part I
Sorting algorithms
o BubbleSort
o SelectionSort
o InsertionSort
o MergeSort

Properties
o Running time
o Space usage
o Stability
Admin
Tutorials and Recitations start this week
o Find materials on Coursemology
o Find links on Coursemology
o Do work on the questions before class
Chinese New Year
Next week
o Happy new year!
o University holiday: Feb 1/2 (Tuesday, Wednesday)
o Monday: class as usual
o Wednesday: no class
o Tutorials: rescheduled (talk to tutor)
Admin
Covid Issues
o If you test positive and are well, can attend a Zoom
session. (And can swap to a Zoom session.)
o If you are unwell, please rest and recover!
o For F2F, must take FET/ART (and have green pass) as
per NUS rules. And must use NUS attendance system.
o Please do not attend F2F if positive or any symptoms.

Let your tutor know if you cannot make it, if you are
missing lecture, etc. They will handle any issues.
Admin
For tutors and TAs:
• If they test positive, we will adapt accordingly.
• Already, we will have one replacement TA this week…
Admin
Video of the Week
o Random video posted
each week
o Selected by the tutor
team as something “fun”
o Sometimes related to
class, sometimes a little
bit different
o Not just another lecture…

(Nominate videos to your tutor!)


Sorting
Problem definition:
Input: array A[1..n] of words / numbers
Output: array B[1..n] that is a permutation of A
such that:
B[1] ≤ B[2] ≤ … ≤ B[n]

Example:
A = [9, 3, 6, 6, 6, 4] → [3, 4, 6, 6, 6, 9]
MergeSort
Divide-and-Conquer
1. Divide problem into smaller sub-problems.

2. Recursively solve sub-problems.

3. Combine solutions.
MergeSort
Divide-and-Conquer Sorting
1. Divide: split array into two halves.

2. Recurse: sort the two halves.

3. Combine: merge the two sorted halves.


Advice:

When thinking about recursion, do not “unroll” the recursion.


Treat the recursive call as a magic black box.
(But don’t forget the base case.)
MergeSort
Step 1: Divide array into two pieces.
Step 2: Recursively sort the two halves.
Step 3: Merge the two halves into one sorted array.

MergeSort(A, n)
    if (n = 1) then return A;                  Base case
    else:
        X ← MergeSort(A[1..n/2], n/2);         Recursive “conquer” step
        Y ← MergeSort(A[n/2+1..n], n/2);       Recursive “conquer” step
        return Merge(X, Y, n/2);               Combine solutions

The only “interesting” part is merging!
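The pseudocode above can be sketched in Java roughly as follows. This is a minimal illustration, not the course's reference implementation: the class and method names are assumptions, the array is 0-indexed, and the split uses integer division so that n does not have to be a power of two.

```java
// Illustrative sketch only; TopDownMergeSort, mergeSort and merge are
// hypothetical names, not taken from the slides.
final class TopDownMergeSort {

    // Returns a new sorted array; the input array is left unchanged.
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) {
            return a.clone();                        // base case
        }
        int mid = a.length / 2;                      // divide
        int[] x = mergeSort(java.util.Arrays.copyOfRange(a, 0, mid));
        int[] y = mergeSort(java.util.Arrays.copyOfRange(a, mid, a.length));
        return merge(x, y);                          // combine
    }

    // Merges two sorted arrays into one new sorted array.
    static int[] merge(int[] x, int[] y) {
        int[] out = new int[x.length + y.length];
        int i = 0, j = 0, k = 0;
        while (i < x.length && j < y.length) {
            out[k++] = (x[i] <= y[j]) ? x[i++] : y[j++];
        }
        while (i < x.length) out[k++] = x[i++];      // leftovers from x
        while (j < y.length) out[k++] = y[j++];      // leftovers from y
        return out;
    }
}
```

Calling `TopDownMergeSort.mergeSort` on the array 7 3 9 5 7 1 6 2 returns a new array holding the same values in non-decreasing order.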
Divide-and-Conquer: Merging

7 | 3 | 9 | 5 | 7 | 1 | 6 | 2

3 7 | 5 9 | 1 7 | 2 6

3 5 7 9 | 1 2 6 7

1 2 3 5 6 7 7 9
Source: Wikipedia
Merging Two Sorted Lists

Key subroutine: Merge
– How to merge?
– How fast can we merge?

Example (each list is sorted from smallest to biggest):
List 1: 2 7 13 20
List 2: 1 9 11 12

Repeatedly compare the smallest remaining item of each list and move the smaller one to the output:
Output: 1
Output: 1 2
Output: 1 2 7
Output: 1 2 7 9
…
Output: 1 2 7 9 11 12 13 20
Merge: Running Time

Given two lists:
– A of size n/2
– B of size n/2

Total running time: O(n), i.e. at most cn for some constant c

– In each iteration, move one element to final list.
– After n iterations, all the items are in the final list.
– Each iteration takes O(1) time to compare two elements and copy one.
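The argument above can be made concrete with a small sketch that counts loop iterations. The names `MergeSteps` and `merge` are hypothetical; the only point illustrated is that each iteration moves exactly one element, so merging a total of n elements takes n iterations.

```java
// Illustrative sketch only; names are assumptions, not from the slides.
final class MergeSteps {

    // Merges sorted arrays a and b into out (out.length must equal
    // a.length + b.length). Returns the number of loop iterations.
    static int merge(int[] a, int[] b, int[] out) {
        int i = 0, j = 0, steps = 0;
        for (int k = 0; k < out.length; k++) {
            // Take from a if b is exhausted, or if a's head is no larger
            // than b's head; otherwise take from b.
            if (j == b.length || (i < a.length && a[i] <= b[j])) {
                out[k] = a[i++];
            } else {
                out[k] = b[j++];
            }
            steps++;   // one element moved per iteration
        }
        return steps;
    }
}
```

For the two example lists 2 7 13 20 and 1 9 11 12, the loop runs 8 times, once per output element.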
MergeSort Analysis
Let T(n) be the worst-case running time for an array of n elements.

MergeSort(A, n)
    if (n = 1) then return;            Θ(1)
    else:
        X ← MergeSort(…);              T(n/2)
        Y ← MergeSort(…);              T(n/2)
        return Merge(X, Y, n/2);       Θ(n)

T(n) = Θ(1)             if (n = 1)
     = 2T(n/2) + cn     if (n > 1)
Techniques for Solving Recurrences

1. Guess and verify (via induction).

2. Draw the recursion tree.

3. Use the Master Theorem (see CS3230) or the Akra–Bazzi Method, or other advanced techniques.
MergeSort: Recurse “downwards”

7 3 9 5 7 1 6 2

7 3 9 5 | 7 1 6 2

7 3 | 9 5 | 7 1 | 6 2

7 | 3 | 9 | 5 | 7 | 1 | 6 | 2
MergeSort Analysis
T(n) = 2T(n/2) + cn

Recursion tree: each node costs cn for the merge plus two recursive sorts of size n/2.

level 0:  cn                                                 = cn
level 1:  cn/2  cn/2                                         = cn
level 2:  cn/4  cn/4  cn/4  cn/4                             = cn
level 3:  cn/8  cn/8  cn/8  cn/8  cn/8  cn/8  cn/8  cn/8     = cn
…
Base case

Key question: how many levels?

level    number of nodes
0        1
1        2
2        4
3        8
4        16
…        …
h        n

number = 2^level, so n = 2^h and log n = h.

Total: cn log n
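The level-by-level sum can be written out as a worked equation. This is a sketch under two assumptions that the slides use only implicitly: n is a power of two, and the base case costs T(1) = c, so the bottom level (level log n) also contributes cn.

```latex
T(n) \;=\; \sum_{i=0}^{\log_2 n} 2^{i}\cdot\frac{cn}{2^{i}}
     \;=\; \sum_{i=0}^{\log_2 n} cn
     \;=\; cn\,(\log_2 n + 1)
     \;=\; cn\log_2 n + cn
     \;=\; O(n \log n)
```

The extra cn term comes from counting the bottom level; it is lower order and does not change the O(n log n) bound.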
MergeSort Analysis
T(n) = O(n log n)

MergeSort(A, n)
    if (n = 1) then return;
    else:
        X ← MergeSort(…);
        Y ← MergeSort(…);
        return Merge(X, Y, n/2);
Guess and verify

Guess: T(n) = O(n log n)
More precise guess: T(n) = c∙n log n (fix constant c; log is base 2).

Recurrence being analyzed:
T(n) = 2T(n/2) + c∙n
T(1) = c

Induction:
Base case: T(1) = c
Assume true for all smaller values: T(x) = c∙x log x for all x < n.
Prove for n:

T(n) = 2T(n/2) + cn
     = 2(c(n/2) log(n/2)) + cn
     = cn log(n/2) + cn
     = cn log(n) − cn log(2) + cn
     = cn log(n)

It works!

(Strictly, the base case T(1) = c adds a lower-order term: the exact solution is T(n) = cn log n + cn, which is still O(n log n).)
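The recurrence can also be checked numerically. The sketch below evaluates T(n) = 2T(n/2) + c∙n with T(1) = c directly and compares it with the closed form c∙n log n + c∙n. The class and method names are hypothetical, and n is assumed to be a power of two.

```java
// Illustrative sketch only; names are assumptions, not from the slides.
final class RecurrenceCheck {

    // Evaluates T(n) = 2T(n/2) + c*n with T(1) = c, for n a power of two.
    static long t(long n, long c) {
        return (n == 1) ? c : 2 * t(n / 2, c) + c * n;
    }

    // Closed form c*n*log2(n) + c*n, valid when n is a power of two.
    static long closedForm(long n, long c) {
        long log2 = 63 - Long.numberOfLeadingZeros(n);   // exact for powers of two
        return c * n * log2 + c * n;
    }
}
```

For every power of two tried, the two methods agree, which is consistent with T(n) = O(n log n).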
Top-Down vs. …
Step 1: Divide array into two pieces.
Step 2: Recursively sort the two halves.
Step 3: Merge the two halves into one sorted array.

MergeSort(A, n)
    if (n = 1) then return A;
    else:
        X ← MergeSort(A[1..n/2], n/2);
        Y ← MergeSort(A[n/2+1..n], n/2);
        return Merge(X, Y, n/2);
Source: Wikipedia
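MergeSort, Bottom Up

(Read from the bottom row up: each row merges adjacent sorted runs of the row below.)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

2 4 6 7 9 12 13 15 | 1 3 5 8 10 11 14 16

2 7 9 15 | 4 6 12 13 | 1 5 8 10 | 3 11 14 16

7 15 | 2 9 | 6 12 | 4 13 | 1 8 | 5 10 | 3 14 | 11 16

15 7 9 2 6 12 13 4 1 8 10 5 3 14 11 16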
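A bottom-up variant like the one pictured can be sketched without recursion: merge runs of width 1, then 2, then 4, and so on. The code below is one possible way to write it, not necessarily what the lecture had in mind. The names are hypothetical, and it alternates between the input array and one scratch array on each pass.

```java
// Illustrative sketch only; names are assumptions, not from the slides.
final class BottomUpMergeSort {

    // Sorts a in place using one scratch array of the same length.
    static void sort(int[] a) {
        int n = a.length;
        int[] src = a;
        int[] dst = new int[n];
        for (int width = 1; width < n; width *= 2) {
            // Merge each pair of adjacent runs of length `width`.
            for (int lo = 0; lo < n; lo += 2 * width) {
                int mid = Math.min(lo + width, n);
                int hi = Math.min(lo + 2 * width, n);
                merge(src, dst, lo, mid, hi);
            }
            int[] tmp = src; src = dst; dst = tmp;   // swap roles for the next pass
        }
        if (src != a) {
            System.arraycopy(src, 0, a, 0, n);       // result ended up in the scratch array
        }
    }

    // Merges src[lo..mid) and src[mid..hi) into dst[lo..hi).
    private static void merge(int[] src, int[] dst, int lo, int mid, int hi) {
        int i = lo, j = mid;
        for (int k = lo; k < hi; k++) {
            if (j == hi || (i < mid && src[i] <= src[j])) {
                dst[k] = src[i++];
            } else {
                dst[k] = src[j++];
            }
        }
    }
}
```

Running it on the 16 unsorted values in the bottom row of the figure takes four passes (widths 1, 2, 4, 8), one per row of the figure.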
How much does it matter?
Comparing words in two files:
Version      Change                               Running Time
Version 1                                         4,311.00s
Version 2    Better file handling                 676.50s
Version 3    MergeSort replaces SelectionSort     6.59s
Version 4    Hashing replaces sorting             2.35s

Algorithm:
1. Read all text in both files.
2. Sort words.
3. Count how many times each word appears in each file.
Real-world performance

Source: http://www.cs.toronto.edu/~jepson/csc148/2007F/notes/sorting.html
When is it better to use InsertionSort
instead of MergeSort?
A. When there is limited space?
B. When there are a lot of items to sort?
C. When there is a large memory cache?
D. When there are a small number of items?
E. When the list is mostly sorted?
F. Always
G. Never
MergeSort

When the list is mostly sorted:
– InsertionSort is fast!
– MergeSort is O(n log n)

How “close to sorted” should a list be for InsertionSort to be faster?

How would you check?
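One possible way to quantify “close to sorted” (an assumption here; the slide leaves the question open) is the number of inversions, i.e. pairs of elements that are out of order. InsertionSort shifts one element per inversion, so its running time is proportional to n plus the number of inversions. The sketch below, with hypothetical names, sorts an array and reports the number of shifts.

```java
// Illustrative sketch only; names are assumptions, not from the slides.
final class InsertionSortShifts {

    // Sorts a in place and returns the number of element shifts performed,
    // which equals the number of inversions in the input.
    static long sort(int[] a) {
        long shifts = 0;
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];   // shift one larger element to the right
                j--;
                shifts++;
            }
            a[j + 1] = key;
        }
        return shifts;
    }
}
```

An already sorted array needs no shifts, while a reversed array of n elements needs n(n-1)/2. Timing this against MergeSort on inputs with a varying number of inversions is one way to run the check.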


MergeSort

Small number of items to sort:
– MergeSort is slow!
– Caching performance, branch prediction, etc.
– Use InsertionSort for n < 1024, say.

Base case of recursion:
– Use slower sort.

Run an experiment and post on the forum what the best switch-over point is for your machine.
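A hybrid along these lines might look like the sketch below, which switches to InsertionSort once a subarray has at most `cutoff` elements. The names and the `cutoff` parameter are illustrative assumptions. The best value depends on the machine, which is what the suggested experiment would measure.

```java
// Illustrative sketch only; names and the cutoff parameter are assumptions.
final class HybridMergeSort {

    // Sorts a in place; subarrays of at most `cutoff` elements use InsertionSort.
    static void sort(int[] a, int cutoff) {
        sort(a, 0, a.length, Math.max(1, cutoff));   // cutoff below 1 would never terminate
    }

    // Sorts the half-open range a[lo..hi).
    private static void sort(int[] a, int lo, int hi, int cutoff) {
        if (hi - lo <= cutoff) {
            insertionSort(a, lo, hi);                // base case: the "slower" sort
            return;
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid, cutoff);
        sort(a, mid, hi, cutoff);
        // Merge: copy both halves out, then merge them back into a[lo..hi).
        int[] left = java.util.Arrays.copyOfRange(a, lo, mid);
        int[] right = java.util.Arrays.copyOfRange(a, mid, hi);
        int i = 0, j = 0;
        for (int k = lo; k < hi; k++) {
            if (j == right.length || (i < left.length && left[i] <= right[j])) {
                a[k] = left[i++];
            } else {
                a[k] = right[j++];
            }
        }
    }

    private static void insertionSort(int[] a, int lo, int hi) {
        for (int i = lo + 1; i < hi; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= lo && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }
}
```

Timing `HybridMergeSort.sort(data, cutoff)` for several cutoff values on the same input gives a rough estimate of the switch-over point.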
MergeSort

Space usage…
– Need extra space to do merge.
– Merge copies data to new array.
Space Complexity

Question:
How much space is allocated during a call to MergeSort?

Note:
Measure total allocated space. We will not model garbage collection or other Java details.

Key subroutine: Merge
Merging two sorted lists with n elements in total needs a temporary array of size n.
Space Analysis
Let S(n) be the worst-case space allocated for an array of n elements.

MergeSort(A, n)
    if (n = 1) then return;            Θ(1)
    else:
        X ← MergeSort(…);              S(n/2)
        Y ← MergeSort(…);              S(n/2)
        return Merge(X, Y, n/2);       n

S(n) = 2S(n/2) + n
S(n) = ?

A. O(log n)
B. O(n)
C. O(n log n)
D. O(n^2)
E. O(n^2 log n)
F. O(2^n)

Answer:

S(n) = Θ(1)             if (n = 1)
     = 2S(n/2) + n      if (n > 1)
     = O(n log n)
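MergeSort

(Read from the bottom row up: each row merges adjacent sorted runs of the row below.)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

2 4 6 7 9 12 13 15 | 1 3 5 8 10 11 14 16

2 7 9 15 | 4 6 12 13 | 1 5 8 10 | 3 11 14 16

7 15 | 2 9 | 6 12 | 4 13 | 1 8 | 5 10 | 3 14 | 11 16

15 7 9 2 6 12 13 4 1 8 10 5 3 14 11 16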
Challenge of the Day:

Design a version of MergeSort that minimizes the amount of extra space needed.

Hint: Do not allocate any new space during the recursive calls!
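One possible direction, following the hint, is to allocate a single scratch array up front and pass it down the recursion, so that the total allocation is n elements rather than O(n log n). The sketch below is only one way to do this and is not necessarily the intended solution; all names are hypothetical.

```java
// Illustrative sketch only; names are assumptions, not from the slides.
final class BufferedMergeSort {

    // Sorts a in place. The only allocation is one scratch array of length n.
    static void sort(int[] a) {
        int[] tmp = new int[a.length];
        sort(a, tmp, 0, a.length);
    }

    // Sorts the half-open range a[lo..hi), reusing tmp[lo..hi) as scratch space.
    private static void sort(int[] a, int[] tmp, int lo, int hi) {
        if (hi - lo <= 1) {
            return;                                   // base case
        }
        int mid = lo + (hi - lo) / 2;
        sort(a, tmp, lo, mid);
        sort(a, tmp, mid, hi);
        // Copy the two sorted halves to the scratch array, then merge back into a.
        System.arraycopy(a, lo, tmp, lo, hi - lo);
        int i = lo, j = mid;
        for (int k = lo; k < hi; k++) {
            if (j == hi || (i < mid && tmp[i] <= tmp[j])) {
                a[k] = tmp[i++];
            } else {
                a[k] = tmp[j++];
            }
        }
    }
}
```

This version still copies each range once per level, so the running time stays O(n log n); only the amount of allocated space changes.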
Stability

Is MergeSort stable?

MergeSort
Stability:
– MergeSort is stable if “merge” is stable.
– Merge is stable if carefully implemented.
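What “carefully implemented” means can be illustrated with a small sketch: when the two heads are equal, the merge must take the item from the left list first. In the hypothetical code below each item is a pair {key, tag}, compared by key only, and the tag records the original position.

```java
// Illustrative sketch only; names and the {key, tag} encoding are assumptions.
final class StableMerge {

    // Merges two lists sorted by key (element 0 of each pair).
    // Ties go to the left list, so equal keys keep their original order.
    static int[][] merge(int[][] left, int[][] right) {
        int[][] out = new int[left.length + right.length][];
        int i = 0, j = 0;
        for (int k = 0; k < out.length; k++) {
            // "<=" (not "<") is what makes the merge stable.
            if (j == right.length || (i < left.length && left[i][0] <= right[j][0])) {
                out[k] = left[i++];
            } else {
                out[k] = right[j++];
            }
        }
        return out;
    }
}
```

Replacing `<=` with `<` would still produce a sorted result, but an item with key 6 from the right list could then end up before an equal item from the left list, which breaks stability.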
Sorting Analysis
Summary:
BubbleSort: O(n^2)
SelectionSort: O(n^2)
InsertionSort: O(n^2)
MergeSort: O(n log n)

Also:
The power of divide-and-conquer!
How to solve recurrences…

Properties: time, space, stability

Slowest Sorting Algorithm?
Step 1:
– Generate all the permutations of the input.

Step 2:
– Sort the permutations (by number of inversions).
  o Use BogoSort! Roughly: O((n!)!)
  o Or recurse! The recursive instance is larger than the original!
  o After n! recursions, use QuickSort for the “base case”.

Step 3:
– Return the first element in the sorted list of permutations.

This is the Ingrassia-Kurtz Sort.
Sorting, Part II
QuickSort
– Divide-and-Conquer
– Paranoid QuickSort
– Randomized Analysis

(Warning: PS3 opens today and depends on QuickSort, but you can get started without that.)
Summary

Name             Best Case     Average Case   Worst Case    Extra Memory   Stable?
Bubble Sort      O(n)          O(n^2)         O(n^2)        O(1)           Yes
Selection Sort   O(n^2)        O(n^2)         O(n^2)        O(1)           No
Insertion Sort   O(n)          O(n^2)         O(n^2)        O(1)           Yes
Merge Sort       O(n log n)    O(n log n)     O(n log n)    O(n log n)     Yes
