填坑-排序算法优化

排序算法优化

之前挖的一个坑，填一下

冒泡排序优化

普通的冒泡排序

时间复杂度： O(n^2)，简单粗暴

#include <algorithm>
void bubbleSort(int arr[], int len) {
    for (int i = len - 1; i > 0; i--) {
        for (int j = 0; j < i; j++) {
            if (arr[j] > arr[j+1]) {
                swap(arr[j], arr[j+1]);
            }
        }
    }
}

带有检查的冒泡排序

如果经过一轮的检查，没有发生任何的swap交换活动，则表明已经有序了，此时就可以直接跳出循环表明排序完成了。

#include <algorithm>
void bubbleSort(int arr[], int len) {
	bool swapped = true;
	int j = 0;
	while (swapped) {
		swapped = false;
		j++;
		for (int i = 0; i < len - j; i++) {
			if (arr[i] > arr[i+1]) {
				swap(arr[i], arr[i+1]);
				swapped = true;
			}
		}
	}
}

鸡尾酒排序

普通的冒泡排序太依赖初始的顺序了，大的元素冒泡的很快，小的数字沉底却很慢
例如：
乌龟排序 {2, 3, 4, 5, 1} 1是那个乌龟，需要O(n^2)的时间复杂度
兔子排序 {6, 1, 2, 3, 4, 5} 6是那个兔子，需要O(n)的时间复杂度

要解决这个问题则需要鸡尾酒排序

鸡尾酒排序排序又叫作双向冒泡排序，算是冒泡排序的一种轻微改进版本。普通的冒泡排序只能每次从前往后进行一个次序遍历，而鸡尾酒排序每次遍历包括两个方向，先从前往后遍历记录最后发生交换的两个元素位置，然后从这个位置开始从后往前遍历，这种双向交替比较不仅会使小的浮上水面，也会使大的沉到水底，因而效率会比较高。

#include <algorithm>
void shakerSort(int arr[], int len) {
	int left = 0, right = len - 1;
	while (left < right) {
		// from l to r
		for (int i = left; i < right; ++i) {
			if (arr[i] > arr[i+1]) {
				swap(arr[i], arr[i+1]);
			}
		}
		--right;
		for (int j = right; j > left; --j) {
			if (arr[j] < arr[j-1]) {
				swap(arr[j], arr[j-1]);
			}
		}
		++left;
	}
}

选择排序

#include <algorithm>
void selectionSort(int arr[], int len) {
	for (int i = 0; i < len - 1; i++) {
		int minIndex = i;
		for (int j = i + 1; j < len; j++) {
			if (arr[j] < arr[minIndex]) {
				minIndex = j;
			}
		}
		if (minIndex != i) {
			swap(arr[i], arr[minIndex]);
		}
	}
}

插入排序

直接插入排序

直接插入排序的基本操作是将一个记录插入到已经排好序的有序表中，从而得到一个新的、记录数增加1的有序表。
实际使用的时候可以在序列的头部添加一个哨兵，将i的数据放到哨兵后就空出一个位置，便于后面数据的挪动，找到空位后将哨兵位置的原数据插入进去就可以了。

void insertSort(int arr[], int len) {
	int j, temp;
	for (int i = 0; i < len; i++) {
		temp = arr[i];
		for (j = i - 1; j >= 0; --j) {
			if (arr[j] > temp) {
				arr[j+1] = arr[j];
			}
			else {
				break;
			}
		}
		arr[j+1] = temp;
	}
}

二分插入排序

二分插入排序又叫折半插入排序，算是前面直接插入排序的变种改进，主要是用了折半查找的思路进行了优化。

void insertSort(int arr[], int len) {
	for (int i = 1; i < len; ++i) {
		int left = 0;
		int elem = arr[i];
		int right = i - 1;
		
		while (left <= right) {
			int mid = (left + right) / 2;
			if (elem > arr[mid]) {
				left = mid + 1;
			}
			else {
				right = mid - 1;
			}
		}
		for (int j = i; j > left; --j) {
			arr[j] = arr[j-1];
		}
		arr[left] = elem;
	}
}

希尔排序

希尔排序的思路是：将待排序列分割成若干个子序列( 一般是一半)，此时每个子序列待排序的记录个数比较少，可以在这些子序列内分别进行直接插入排序，当整个序列都基本有序时，注意只是基本有序时，再对全体记录进行一次直接插入排序。

但是这里的分组不是简单相邻的分组，而是将相隔某个“增量/increment”的记录组成一个子序列，实现跳跃式的交换移动，所以使得排序的效率提高。随着增量的不断减少，跳跃移动的步伐慢慢变小，而整个系列也变的更为的“基本有序”。还需要注意要确保最终的increment=1来实现最后一次精细的排序，然后整个序列就变的有序了。

void shellSort(int arr[], int len) {
	int gap = len / 2;
	while (gap) {
		// 间隔为gap的为一组，第一个元素不排序，所以跳过gap
		for (int i = gap; i < len; ++i) {
			int elem = arr[i];
			int j = i;
			while (j >= gap && elem < arr[j-gap]) {
				arr[j] = arr[j-gap];
				j -= gap;
			}
			arr[j] = elem;
		}
		gap /= 2;
	}
}

快速排序

普通的2-way快排

#include <algorithm>
void quickSort(int arr[], int left, int right) {
	int i = left, j = right;
	// choose a pivot
	int pivot = arr[(left + right) / 2];
	// partition
	while (i <= j) {
		while (i <= j && arr[i] < pivot)
			++i;
		while (i <= j && arr[j] > pivot)
			--j;
		if (i <= j) {
			swap(arr[i], arr[j]);
			++i;
			--j;
		}
	}
	// recursion
	if (i < right)
		quickSort(arr, i, right);
	if (j > left)
		quickSort(arr, left, j);
}

快排C++11

快速排序一一顺序实现版

#include <list>
#include <future>
template <typename T>
std::list<T> sequential_quick_sort(std::list<T> input) {
	if (input.empty())
		return input;
    std::list<T> result;
    // 使用splice 将输入的首个元素（轴值）放入结果列表中
    result.splice(result.begin(), input, input.begin());
    // 使用引用，避免过多的拷贝
    T const& pivot = *result.begin();
    // 使用std::partition 将序列中的值分成大于轴值和小于轴值的两部分，同样使用引用
    auto divide_point = std::partition(input.begin(), input.end(), [&] (T const& t) { return t < pivot; });
    std::list<T> lower_part;
    // 把小于的部分移动到表lower_part，其他的继续留在input
    lower_part.splice(lower_part.end(), input, input.begin(), divide_point);
    //递归调用对两部分进行排序
    auto new_lower(sequential_quick_sort(std::move(lower_part)));
    auto new_higher(sequential_quick_sort(std::move(input)));
    //再次使用splice()以正确的顺序拼接
    result.splice(result.end(), new_higher);
    result.splice(result.begin(), new_higher.get());
    return result;
}

快排并行版

快速排序一一“期望”并行版

#include <list>
#include <future>
template <typename T>
std::list<T> parallel_quick_sort(std::list<T> input) {
    if (input.empty())
        return input;
    std::list<T> result;
    result.splice(result.begin(), input, input.begin());
    T const& pivot = *result.begin();
    auto divide_point = std::partition(input.begin(), input.end(), [&] (T const& t) { return t < pivot; });
    std::list<T> lower_part;
    lower_part.splice(lower_part.end(), input, input.begin(), divide_point);
    // 不对小于中间值的部分排序，使用std::async在另一线程对其进行排序
    std::future<std::list<T> > new_lower(std::async(&parallel_quick_sort<T>, std::move(lower_part)));
    // 大于部分如同之前一样，使用递归排序，递归两次，4个线程；三次，8个线程...
    auto new_higher(parallel_quick_sort(std::move(input)));
    result.splice(result.end(), new_higher);
    result.splice(result.begin(), new_higher.get());
    return result;
}

优化的快排

快排改进的方案主要有三种：

一种是调整标值的选择方式，降低在某些特定的输入序列的情况下，快速排序性能接近其最坏性能的问题；

另一种是解决其递归问题，采用模拟递归栈的方式，用循环代替递归，减少对系统内存栈资源的使用，提高性能；
还有一种是在待排序序列已基本排序时，换用其他更为简介紧凑的排序方式，如果插入排序等，提高效率

堆排序

#include <algorithm>
void buildHeap(int arr[], int first, int last) {
	int father = first;
	int son = 2 * father + 1;
	while (son < last) {
		// choose bigger son
		if (son + 1 < last && arr[son] < arr[son+1]) {
			son++;
		}
		// if father > son , break
		if (arr[father] > arr[son]) {
			break;
		}
		// swap the father and son, and go to child
		else {
			swap(arr[father], arr[son]);
			father = son;
			son = 2 * father + 1;
		}
	}
}
void heapSort(int arr[], int len) {
	// build heap
	for (int i = len / 2 - 1; i >= 0; --i) {
		buildHeap(arr, i, len);
	}
	// get the biggest then build heap
	for (int i = len - 1; i > 0; --i) {
		swap(arr[0], arr[i]);
		buildHeap(arr, 0, i);
	}
}

归并排序

void merge(int arr[], int first, int last, int mid) {
	int index_1 = first, index_2 = mid + 1;
	int index_s = 0;
	int temp[last-first+1];
	while (index_1 <= mid || index_2 <= last) {
		if (index_1 > mid) {
			while (index_2 <= last) {
				temp[index_s++] = arr[index_2++];
			}
		}
		else if (index_2 > last) {
			while (index_1 <= mid) {
				temp[index_s++] = arr[index_1++];
			}
		}
		else {
			if (arr[index_1] < arr[index_2]) {
				temp[index_s++] = arr[index_1++];
			}
			else {
				temp[index_s++] = arr[index_2++];
			}
		}
	}
	// save result
	for (int i = 0; i < last-first+1; ++i) {
		arr[first+i] = temp[i];
	}
}
void mergeSort(int arr[], int first, int last) {
	if (first < last) {
		int mid = (first + last) / 2;
		mergeSort(arr, first, mid);
		mergeSort(arr, mid+1, last);
		merge(arr, first, last, mid);
	}
}

计数排序

计数排序基本算法：

　　(1). 取出待排序列中的最小值n1和最大值n2，建立一个n2-n1+1长度的数组；
　　(2). 依次遍历待排元素，根据待排元素的值将对应数组项计数自增；
　　(3). 这步比较关键，统计数组计数，每项保存前N项和count_arr[k] += count_arr[k-1];这其实是进行了值到最终排列位置的映射关系；
　　(4). 再依次遍历待排元素，根据元素值查找count_arr中对应的最终排序位置，计入到排序结果中。

缺点是空间要求比较大。

桶排序

桶排序是计数排序的升级版，通过一个映射函数将待排数据分配到各个桶里面，然后桶内部如果大于一个元素可以采用快速排序等操作方式；最后实现桶数据的合并；

　　(1). 设置桶的数目：bucketCount = Math.floor((maxValue - minValue) / bucketSize) + 1;
　　(2). 待排元素和桶的映射关系：buckets[Math.floor((arr[i] - minValue) / bucketSize)].push(arr[i]);
　　(3). 对桶进行从小到大的合并操作：arr.push(buckets[i][j]);
　　桶排序的f(k)值的计算，其作用就相当于快排中划分，已经把大量数据分割成了基本有序的数据块(桶)，然后只需要对桶中的少量数据做先进的比较排序即可。在内存足够的情况下桶的数目越多越好，确保每个桶中元素尽可能少甚至一个元素。

基数排序

基数排序包括：从高位开始进行排序(MSD)和从低位开始进行排序(LSD)，大部分的例子都是LSD来排序的。其主要思路是从按照低位到高位，依次进行多次的桶排序，当最高位桶排序结束后，整个数据就是有序的了。

参考

数据结构和算法（四）：主流内排序算法