Your Progress

9/30 completed

30% complete. Continue learning to master DSA!

Merge Sort Algorithm

Merge Sort is an efficient, comparison-based, divide-and-conquer sorting algorithm. With a consistent O(n log n) time complexity, it's one of the most widely used sorting algorithms for large data sets.

💡 Key Takeaways

Merge Sort has a time complexity of O(n log n) in all cases
It follows the divide-and-conquer paradigm by splitting the array, sorting, and merging
Merge Sort is stable, preserving the relative order of equal elements
It requires O(n) extra space, making it less memory-efficient than in-place algorithms
Merge Sort is excellent for large data sets and linked lists

How Merge Sort Works

Merge Sort divides the input array into smaller sub-arrays, sorts them, and then merges them together. The algorithm can be broken down into three main steps:

Divide: Split the array into two equal halves recursively until each sub-array contains a single element.
Conquer: An array with a single element is considered sorted.
Combine: Merge the sorted sub-arrays recursively to produce the final sorted array.

Divide and Conquer

Merge Sort is a classic example of the divide-and-conquer algorithm design paradigm, breaking down a problem into smaller, more manageable sub-problems, solving them independently, and then combining the solutions.

Visual Representation

Initial Array

Divide

Further Division

Merge Sub-arrays

Merge Larger Sub-arrays

Final Sorted Array

Visual representation of the Merge Sort algorithm step by step

Note: The merging step is the most critical part of the algorithm. It involves comparing elements from two sorted sub-arrays and placing them in the correct order in a temporary array, which is then copied back to the original array.

Merge Sort Implementation

Below are implementations of the Merge Sort algorithm in multiple programming languages:

1#include <stdio.h>
2#include <stdlib.h>
3
4// Merge two subarrays of arr[]
5// First subarray is arr[left..mid]
6// Second subarray is arr[mid+1..right]
7void merge(int arr[], int left, int mid, int right) {
8    int i, j, k;
9    int n1 = mid - left + 1;
10    int n2 = right - mid;
11    
12    // Create temporary arrays
13    int* L = (int*)malloc(n1 * sizeof(int));
14    int* R = (int*)malloc(n2 * sizeof(int));
15    
16    // Copy data to temporary arrays L[] and R[]
17    for (i = 0; i < n1; i++)
18        L[i] = arr[left + i];
19    for (j = 0; j < n2; j++)
20        R[j] = arr[mid + 1 + j];
21    
22    // Merge the temporary arrays back into arr[left..right]
23    i = 0; // Initial index of first subarray
24    j = 0; // Initial index of second subarray
25    k = left; // Initial index of merged subarray
26    
27    while (i < n1 && j < n2) {
28        if (L[i] <= R[j]) {
29            arr[k] = L[i];
30            i++;
31        } else {
32            arr[k] = R[j];
33            j++;
34        }
35        k++;
36    }
37    
38    // Copy the remaining elements of L[], if any
39    while (i < n1) {
40        arr[k] = L[i];
41        i++;
42        k++;
43    }
44    
45    // Copy the remaining elements of R[], if any
46    while (j < n2) {
47        arr[k] = R[j];
48        j++;
49        k++;
50    }
51    
52    // Free the allocated memory
53    free(L);
54    free(R);
55}
56
57// Main function that sorts arr[left..right] using merge()
58void mergeSort(int arr[], int left, int right) {
59    if (left < right) {
60        // Same as (left+right)/2, but avoids overflow for large left and right
61        int mid = left + (right - left) / 2;
62        
63        // Sort first and second halves
64        mergeSort(arr, left, mid);
65        mergeSort(arr, mid + 1, right);
66        
67        // Merge the sorted halves
68        merge(arr, left, mid, right);
69    }
70}
71
72// Function to print an array
73void printArray(int arr[], int size) {
74    for (int i = 0; i < size; i++)
75        printf("%d ", arr[i]);
76    printf("\n");
77}
78
79int main() {
80    int arr[] = {38, 27, 43, 3, 9, 82, 10};
81    int n = sizeof(arr) / sizeof(arr[0]);
82    
83    printf("Original array: \n");
84    printArray(arr, n);
85    
86    mergeSort(arr, 0, n - 1);
87    
88    printf("Sorted array: \n");
89    printArray(arr, n);
90    
91    return 0;
92}

Time and Space Complexity Analysis

Complexity	Best Case	Average Case	Worst Case
Time Complexity	O(n log n)	O(n log n)	O(n log n)
Space Complexity	O(n)	O(n)	O(n)

Time Complexity Explanation

The time complexity of Merge Sort is O(n log n) in all cases (best, average, and worst), where n is the number of elements in the array. This is because:

Dividing the array: The array is divided into two halves at each step, resulting in log n levels (the height of the recursion tree).
Merging the subarrays: At each level, merging all the subarrays takes O(n) time in total (each element is compared and placed exactly once).
Total complexity: O(n) operations for each of the log n levels results in O(n log n) time complexity.

Recursion Tree for Merge Sort

[38, 27, 43, 3, 9, 82, 10]

[38, 27, 43, 3]

[9, 82, 10]

[38, 27]

[43, 3]

[9, 82]

[10]

The recursion tree has log n levels, and each level involves n operations in total.

Space Complexity Explanation

The space complexity of Merge Sort is O(n) because:

Temporary arrays: During the merge step, we need to create temporary arrays to store the elements of the two subarrays being merged. These temporary arrays take O(n) space.
Recursion stack: The recursion stack also takes O(log n) space due to the recursive calls. However, the O(n) space for temporary arrays dominates, making the overall space complexity O(n).

Note: The O(n) space requirement is a drawback of Merge Sort compared to in-place sorting algorithms like Quick Sort (which requires O(log n) space on average). However, there are variations of Merge Sort that can reduce the space requirement at the cost of increased implementation complexity.

Applications of Merge Sort

Merge Sort is used in various applications due to its stability and predictable performance:

External Sorting

When data is too large to fit in memory, Merge Sort is effective for external sorting. It's used in database systems for sorting large data sets that don't fit in RAM, as it can efficiently merge pre-sorted chunks of data from disk.

Linked List Sorting

Merge Sort is particularly well-suited for linked lists, as it doesn't require random access to elements. The merging process can be implemented without extra space by rearranging pointers, making it more efficient than other algorithms for linked lists.

Inversion Count Problems

Merge Sort can be adapted to count inversions in an array (pairs of elements that are out of order). This is useful in various computational problems and data analysis applications where determining how "out of order" a dataset is becomes important.

Stable Sorting Requirements

In applications where maintaining the relative order of equal elements is important (like sorting a list of objects by multiple criteria), Merge Sort's stability makes it a preferred choice over unstable algorithms like Quick Sort.

Merge Sort vs. Other Sorting Algorithms

Feature	Merge Sort	Quick Sort	Heap Sort	Bubble Sort
Time Complexity (Worst)	O(n log n)	O(n²)	O(n log n)	O(n²)
Space Complexity	O(n)	O(log n)	O(1)	O(1)
Stability	Yes	No	No	Yes
In-place	No	Yes	Yes	Yes
Adaptive	No	No	No	Yes

Next Steps

Now that you understand Merge Sort, here are some related sorting algorithms and advanced topics to explore:

Quick Sort-Another divide-and-conquer sorting algorithm with O(n log n) average-case complexity
Heap Sort-A comparison-based sorting algorithm using binary heap data structure
Counting Sort-A non-comparison based sorting algorithm with O(n+k) complexity

Introduction to DSA

Basic Data Structures

Advanced Data Structures

Searching Algorithms

Sorting Algorithms

Advanced Algorithms