NumPy - Sorting, Searching and Counting
The NumPy package provides a number of functions for sorting, searching and counting. The most frequently used functions for performing these operations on an array are listed below.
Function | Description |
---|---|
sort() | Returns a sorted copy of an array. |
argsort() | Returns the indices that would sort an array. |
lexsort() | Performs an indirect stable sort using a sequence of keys. |
argmax() | Returns the indices of the maximum values along an axis. |
argmin() | Returns the indices of the minimum values along an axis. |
where() | Returns elements chosen from x or y depending on a condition. |
nonzero() | Returns the indices of the elements that are non-zero. |
extract() | Returns the elements of an array that satisfy some condition. |
Let's discuss these functions in detail:
numpy.sort() function
The NumPy sort() function returns a sorted copy of the specified array.
Syntax
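numpy.sort(a, axis=-1, kind=None, order=None)
Parameters
Parameter | Description |
---|---|
a | Required. Specify the array (array_like) to be sorted. |
axis | Optional. Specify the axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis. |
kind | Optional. Specify the sorting algorithm. It can take values from {'quicksort', 'mergesort', 'heapsort', 'stable'}. The default, None, is equivalent to 'quicksort'. |
order | Optional. Specify a string or list of strings naming fields. When a is an array with fields defined, this argument specifies the order in which the fields are compared. |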
Example:
In the example below, the sort() function is used to sort the elements of a 2-D array. As the axis parameter is not provided, sorting is done along the last axis, so each row is sorted independently.
    import numpy as np

    Arr = np.array([[1, 20, 5], [21, 4, 3], [11, 5, 50]])
    SortedArr = np.sort(Arr)

    print("Original Array:")
    print(Arr)
    print("\nSorted Array:")
    print(SortedArr)
The output of the above code will be:
    Original Array:
    [[ 1 20  5]
     [21  4  3]
     [11  5 50]]

    Sorted Array:
    [[ 1  5 20]
     [ 3  4 21]
     [ 5 11 50]]
Example:
To sort each column independently, set the axis parameter to 0. When axis=None is used, the array is flattened before sorting. Consider the example below:
    import numpy as np

    Arr = np.array([[1, 20, 5], [21, 4, 3], [11, 5, 50]])
    SortedArr1 = np.sort(Arr, axis=0)
    SortedArr2 = np.sort(Arr, axis=None)

    print("Original Array:")
    print(Arr)
    print("\nSorted Array1:")
    print(SortedArr1)
    print("\nSorted Array2:")
    print(SortedArr2)
The output of the above code will be:
    Original Array:
    [[ 1 20  5]
     [21  4  3]
     [11  5 50]]

    Sorted Array1:
    [[ 1  4  3]
     [11  5  5]
     [21 20 50]]

    Sorted Array2:
    [ 1  3  4  5  5 11 20 21 50]
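Because sort() returns a sorted copy, the input array is left unchanged. The short sketch below illustrates this and contrasts it with the ndarray.sort() method, which sorts in place. The method is not covered in this tutorial and is shown only for comparison; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 20, 5], [21, 4, 3], [11, 5, 50]])

# np.sort() returns a new array; the input is left untouched
sorted_copy = np.sort(arr)
print(arr[0].tolist())          # [1, 20, 5]
print(sorted_copy[0].tolist())  # [1, 5, 20]

# ndarray.sort() sorts the array in place and returns None
in_place = arr.copy()
result = in_place.sort(axis=0)
print(result)                   # None
print(in_place.tolist())        # [[1, 4, 3], [11, 5, 5], [21, 20, 50]]
```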
numpy.argsort() function
The NumPy argsort() function returns the indices that would sort an array.
Syntax
numpy.argsort(a, axis=-1, kind=None, order=None)
Parameters
Parameter | Description |
---|---|
a | Required. Specify the array (array_like) to be sorted. |
axis | Optional. Specify the axis along which to sort. If None, the array is flattened before sorting. The default is -1, which sorts along the last axis. |
kind | Optional. Specify the sorting algorithm. It can take values from {'quicksort', 'mergesort', 'heapsort', 'stable'}. The default, None, is equivalent to 'quicksort'. |
order | Optional. Specify a string or list of strings naming fields. When a is an array with fields defined, this argument specifies the order in which the fields are compared. |
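The kind parameter matters when the array contains equal values. A minimal sketch, assuming a small made-up array named scores: with kind='stable', equal elements keep their original relative order, which the default algorithm does not guarantee.

```python
import numpy as np

# Hypothetical data with repeated values
scores = np.array([2, 1, 2, 1, 2])

# A stable sort keeps tied elements in their original relative order
order = np.argsort(scores, kind='stable')
print(order.tolist())          # [1, 3, 0, 2, 4]
print(scores[order].tolist())  # [1, 1, 2, 2, 2]
```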
Example:
In the example below, the argsort() function is used to get the sorting indices, which are then used to produce the sorted array.
    import numpy as np

    Arr = np.array([50, 40, 10, 60, 30, 20])

    # displaying the array
    print("The original array:")
    print(Arr)

    # getting the indices from argsort()
    x = np.argsort(Arr)
    print("\nIndices from argsort():")
    print(x)

    # yielding sorted array
    print("\nSorted array:")
    print(Arr[x])
The output of the above code will be:
    The original array:
    [50 40 10 60 30 20]

    Indices from argsort():
    [2 5 4 1 0 3]

    Sorted array:
    [10 20 30 40 50 60]
Example:
The order parameter can be used to specify which field is compared first, which second, and so on. Consider the example below:
    import numpy as np

    Arr = np.array([(20, 60), (20, 50), (20, 55),
                    (10, 75), (10, 25), (10, 50)],
                   dtype=[('x', '<i4'), ('y', '<i4')])

    # displaying the array
    print("The original array:")
    print(Arr)

    # yielding sorted array
    indices = np.argsort(Arr, order=('x', 'y'))
    print("\nSorted array:")
    print(Arr[indices])
The output of the above code will be:
    The original array:
    [(20, 60) (20, 50) (20, 55) (10, 75) (10, 25) (10, 50)]

    Sorted array:
    [(10, 25) (10, 50) (10, 75) (20, 50) (20, 55) (20, 60)]
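For a 1-D array, the indices from argsort() can be used directly as Arr[x]. For a multi-dimensional array sorted along an axis, that direct indexing no longer selects individual elements. One possible approach, sketched below, is np.take_along_axis(), which is not otherwise covered in this tutorial.

```python
import numpy as np

arr = np.array([[1, 20, 5], [21, 4, 3], [11, 5, 50]])

# Indices that sort each row (axis=-1 is the default)
idx = np.argsort(arr, axis=-1)
print(idx.tolist())          # [[0, 2, 1], [2, 1, 0], [1, 0, 2]]

# arr[idx] would pick whole rows; take_along_axis applies
# each row of idx to the matching row of arr
sorted_rows = np.take_along_axis(arr, idx, axis=-1)
print(sorted_rows.tolist())  # [[1, 5, 20], [3, 4, 21], [5, 11, 50]]
```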
numpy.lexsort() function
The NumPy lexsort() function performs an indirect stable sort using a sequence of keys.
When multiple sorting keys are provided, they can be interpreted as columns, and lexsort() returns an array of integer indices that describes the sort order by multiple columns. The last key in the sequence is used for the primary sort order, the second-to-last key for the secondary sort order, and so on. The keys argument must be a sequence of objects that can be converted to arrays of the same shape. If a 2-D array is provided for the keys argument, its rows are interpreted as the sorting keys, and sorting is done according to the last row, then the second-to-last row, and so on.
Syntax
numpy.lexsort(keys, axis=-1)
Parameters
Parameter | Description |
---|---|
keys | Required. Specify the k different 'columns' to be sorted. The last column (or row if keys is a 2-D array) is the primary sort key. |
axis | Optional. Specify the axis to be indirectly sorted. By default, sorting is done over the last axis. |
Example:
In the example below, the lexsort() function is used to sort by the x column first and then by the y column. Note that x, the primary key, is passed last.
    import numpy as np

    # x column - First column
    x = np.array([10, 20, 10, 20, 10, 25, 10])
    # y column - Second column
    y = np.array([40, 10, 45, 60, 50, 25, 30])

    # getting the array of indices that sorts
    # x column first, y column second
    indices = np.lexsort((y, x))

    # displaying the indices
    print("Array of indices to sort columns")
    print(indices)

    # using indices to sort columns
    print("\nSorted x and y columns:")
    for i in indices:
        print(x[i], y[i])
The output of the above code will be:
    Array of indices to sort columns
    [6 0 2 4 1 3 5]

    Sorted x and y columns:
    10 30
    10 40
    10 45
    10 50
    20 10
    20 60
    25 25
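The description above notes that a 2-D array can be passed as keys, with its rows acting as sorting keys and the last row as the primary key. The sketch below reuses the x and y columns to illustrate this, and also shows how reversing the key order changes the primary key. The names keys, idx_2d, idx_tuple and idx_y_first are illustrative.

```python
import numpy as np

x = np.array([10, 20, 10, 20, 10, 25, 10])
y = np.array([40, 10, 45, 60, 50, 25, 30])

# Stack the keys as rows; the LAST row (x) is the primary key
keys = np.array([y, x])
idx_2d = np.lexsort(keys)
idx_tuple = np.lexsort((y, x))
print(np.array_equal(idx_2d, idx_tuple))  # True

# Reversing the key order makes y the primary key instead
idx_y_first = np.lexsort((x, y))
print(idx_y_first.tolist())               # [1, 5, 6, 0, 2, 4, 3]
```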
numpy.argmax() and numpy.argmin() functions
The NumPy argmax() and argmin() functions return the indices of the maximum and minimum values along an axis, respectively. By default, the index is computed over the flattened array; otherwise, it is computed along the specified axis.
Syntax
numpy.argmax(a, axis=None, out=None)
numpy.argmin(a, axis=None, out=None)
Parameters
Parameter | Description |
---|---|
a | Required. Specify the input array (array_like). |
axis | Optional. Specify the axis along which the indices of the maximum/minimum values are computed. By default, the index is into the flattened array. |
out | Optional. Specify the output array for the result. The default is None. If provided, it must have the appropriate shape and dtype to hold the result. |
Example:
In the example below, the functions are used to find the index of the maximum/minimum value in the whole array.
    import numpy as np

    Arr = np.arange(12).reshape(3, 4)

    print("Array is:")
    print(Arr)

    # index of maximum value
    idx1 = np.argmax(Arr)
    print("\nIndex of maximum value is:", idx1)

    # index of minimum value
    idx2 = np.argmin(Arr)
    print("Index of minimum value is:", idx2)
The output of the above code will be:
    Array is:
    [[ 0  1  2  3]
     [ 4  5  6  7]
     [ 8  9 10 11]]

    Index of maximum value is: 11
    Index of minimum value is: 0
Example:
When the axis parameter is provided, the index of the maximum/minimum value is computed along the specified axis. Consider the following example.
    import numpy as np

    Arr = np.arange(12).reshape(3, 4)

    print("Array is:")
    print(Arr)

    # Index of maximum value along axis=0
    print("\nIndex of maximum value along axis=0")
    print(np.argmax(Arr, axis=0))

    # Index of maximum value along axis=1
    print("\nIndex of maximum value along axis=1")
    print(np.argmax(Arr, axis=1))

    # Index of minimum value along axis=0
    print("\nIndex of minimum value along axis=0")
    print(np.argmin(Arr, axis=0))

    # Index of minimum value along axis=1
    print("\nIndex of minimum value along axis=1")
    print(np.argmin(Arr, axis=1))
The output of the above code will be:
    Array is:
    [[ 0  1  2  3]
     [ 4  5  6  7]
     [ 8  9 10 11]]

    Index of maximum value along axis=0
    [2 2 2 2]

    Index of maximum value along axis=1
    [3 3 3]

    Index of minimum value along axis=0
    [0 0 0 0]

    Index of minimum value along axis=1
    [0 0 0]
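When no axis is given, the returned index refers to the flattened array. A minimal sketch of turning that flat index back into row and column coordinates is shown below, using np.unravel_index(), which is not otherwise covered in this tutorial. The sample array is made up and deliberately contains the maximum twice, to show that the first occurrence is reported.

```python
import numpy as np

# Hypothetical data: the maximum value 9 appears twice
arr = np.array([[3, 9, 2], [9, 1, 7]])

# Index into the flattened array; the first occurrence wins
flat_idx = np.argmax(arr)
print(flat_idx)            # 1

# Convert the flat index to (row, column) coordinates
row, col = np.unravel_index(flat_idx, arr.shape)
print(int(row), int(col))  # 0 1
```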
numpy.where() function
The NumPy where() function returns elements chosen from x or y depending on condition. When only condition is provided, the function returns the indices of the elements that satisfy the condition.
Syntax
numpy.where(condition, [x, y])
Parameters
Parameter | Description |
---|---|
condition | Required. Specify an array_like of booleans. Where True, yield x, otherwise yield y. |
x, y | Optional. Specify array_like values from which to choose. Either both or neither must be given. x, y and condition need to be broadcastable to some shape. |
Example:
In the example below, the where() function is used to replace all negative elements of an array with 0.
    import numpy as np

    x = np.arange(-2, 5)

    # replacing all negative elements with 0
    y = np.where(x > 0, x, 0)

    # displaying the content of x and y
    print("x contains:", x)
    print("y contains:", y)
The output of the above code will be:
    x contains: [-2 -1  0  1  2  3  4]
    y contains: [0 0 0 1 2 3 4]
Example:
In this example, the where() function is used to choose elements from two arrays based on a given condition.
    import numpy as np

    x = np.asarray([[10, 20], [30, 40]])
    y = np.asarray([[15, 15], [25, 25]])

    # applying where condition
    z = np.where(x > y, x, y)

    # displaying the content of x, y and z
    print("x =")
    print(x)
    print("\ny =")
    print(y)
    print("\nz =")
    print(z)
The output of the above code will be:
    x =
    [[10 20]
     [30 40]]

    y =
    [[15 15]
     [25 25]]

    z =
    [[15 20]
     [30 40]]
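As noted in the description, calling where() with only the condition returns indices rather than values. The sketch below illustrates that form; the result is a tuple with one index array per dimension, the same as what nonzero() returns for the condition.

```python
import numpy as np

x = np.arange(-2, 5)  # values from -2 to 4

# Only the condition is given: the result is a tuple of index arrays
idx = np.where(x > 0)
print(idx[0].tolist())  # [3, 4, 5, 6]

# The tuple can be used directly to select the matching elements
print(x[idx].tolist())  # [1, 2, 3, 4]

# Same indices as nonzero() applied to the condition
same = np.nonzero(x > 0)
print(np.array_equal(idx[0], same[0]))  # True
```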
numpy.nonzero() function
The NumPy nonzero() function returns the indices of the elements that are non-zero. The returned value is a tuple of arrays, one for each dimension of a, containing the indices of the non-zero elements in that dimension. The values in a are always tested and returned in row-major, C-style order.
Syntax
numpy.nonzero(a)
Parameters
Parameter | Description |
---|---|
a | Required. Specify the input array (array_like). |
Example:
In the example below, the nonzero() function is used to find the indices of all non-zero elements.
    import numpy as np

    Arr = np.array([[1, 2, 0], [0, 5, 0], [7, 0, 9]])

    print("Array is:")
    print(Arr)

    # index of all elements which are non-zero
    x = np.nonzero(Arr)
    print("\nIndices of nonzero elements:")
    print(x)

    # displaying all nonzero elements
    print("\nAll nonzero elements:")
    print(Arr[x])
The output of the above code will be:
    Array is:
    [[1 2 0]
     [0 5 0]
     [7 0 9]]

    Indices of nonzero elements:
    (array([0, 0, 1, 2, 2]), array([0, 1, 1, 0, 2]))

    All nonzero elements:
    [1 2 5 7 9]
Example:
A boolean condition can be passed to the nonzero() function, in which case it returns the indices where the condition is True, as shown in the example below.
    import numpy as np

    Arr = np.array([90, 80, 10, 20, 50])

    print("Array is:")
    print(Arr)

    # index of all elements which are greater than 25
    x = np.nonzero(Arr > 25)
    print("\nIndices of elements greater than 25:")
    print(x)

    # displaying all elements greater than 25
    print("\nAll elements greater than 25:")
    print(Arr[x])
The output of the above code will be:
    Array is:
    [90 80 10 20 50]

    Indices of elements greater than 25:
    (array([0, 1, 4]),)

    All elements greater than 25:
    [90 80 50]
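Since nonzero() returns one index array per dimension, the coordinates of each non-zero element are spread across the tuple. A small sketch of grouping them per element is shown below, using np.transpose(); np.argwhere(), which is not covered in this tutorial, gives the same result in one call.

```python
import numpy as np

arr = np.array([[1, 2, 0], [0, 5, 0], [7, 0, 9]])

# One index array per dimension
rows, cols = np.nonzero(arr)

# Pair them up so that each row is one (row, column) coordinate
coords = np.transpose((rows, cols))
print(coords.tolist())  # [[0, 0], [0, 1], [1, 1], [2, 0], [2, 2]]

# np.argwhere() produces the same coordinates directly
print(np.array_equal(coords, np.argwhere(arr)))  # True
```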
numpy.extract() function
The NumPy extract() function returns the elements of an array that satisfy some condition. If condition is boolean, the function is equivalent to arr[condition].
Syntax
numpy.extract(condition, arr)
Parameters
Parameter | Description |
---|---|
condition | Required. Specify an array whose nonzero or True entries indicate the elements of arr to extract. |
arr | Required. Specify the array (array_like) of the same size as condition. |
Example:
In the example below, the extract() function is used to extract elements from an array based on a given condition.
    import numpy as np

    Arr = np.arange(12).reshape((3, 4))

    # displaying the array
    print("The original array:")
    print(Arr)

    # defining condition
    condition = np.mod(Arr, 3) == 0
    print("\nCondition is:")
    print(condition)

    # applying condition on array
    print("\nExtracting elements based on condition:")
    print(np.extract(condition, Arr))
The output of the above code will be:
    The original array:
    [[ 0  1  2  3]
     [ 4  5  6  7]
     [ 8  9 10 11]]

    Condition is:
    [[ True False False  True]
     [False False  True False]
     [False  True False False]]

    Extracting elements based on condition:
    [0 3 6 9]
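The equivalence with arr[condition] mentioned in the description can be checked directly. The sketch below compares extract() with boolean indexing and with np.compress() applied to the flattened inputs. The latter is not covered in this tutorial and is included only as a cross-check.

```python
import numpy as np

arr = np.arange(12).reshape((3, 4))
condition = np.mod(arr, 3) == 0

by_extract = np.extract(condition, arr)
by_indexing = arr[condition]
by_compress = np.compress(np.ravel(condition), np.ravel(arr))

print(by_extract.tolist())                       # [0, 3, 6, 9]
print(np.array_equal(by_extract, by_indexing))   # True
print(np.array_equal(by_extract, by_compress))   # True
```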