Pandas DataFrame - nunique() function
The Pandas DataFrame nunique() function counts the number of distinct elements along the specified axis. It returns a Series containing the number of distinct elements in each column (or in each row, depending on the axis).
Syntax
DataFrame.nunique(axis=0, dropna=True)
Parameters
axis: Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', the distinct elements are counted for each column. If 1 or 'columns', the distinct elements are counted for each row. Default is 0.
dropna: Optional. If True, NaN values are excluded from the counts. Specify False to include NaN in the counts. Default is True.
Return Value
Returns a Series with the number of distinct elements. The Series is indexed by column labels when axis is 0 and by row labels when axis is 1.
Example: using nunique() column-wise on the whole DataFrame
In the example below, a DataFrame df is created. The nunique() function is used to count the distinct elements in each column.
import pandas as pd
import numpy as np

df = pd.DataFrame({
  "x": [5, 5, 2, 2, 7],
  "y": [10, 5, 5, 10, 5]},
  index=["a", "b", "c", "d", "e"]
)

print("The DataFrame is:")
print(df)

# getting the count of distinct
# elements in each column
print("\ndf.nunique() returns:")
print(df.nunique())
The output of the above code will be:
The DataFrame is:
   x   y
a  5  10
b  5   5
c  2   5
d  2  10
e  7   5

df.nunique() returns:
x    3
y    2
dtype: int64
Example: using nunique() row-wise on the whole DataFrame
To count the distinct elements in each row instead, set the axis parameter to 1.
import pandas as pd
import numpy as np

df = pd.DataFrame({
  "a": [5, 10],
  "b": [5, 5],
  "c": [2, 5],
  "d": [2, 10],
  "e": [7, 5]},
  index=["x", "y"]
)

print("The DataFrame is:")
print(df)

# getting the count of distinct
# elements in each row
print("\ndf.nunique(axis=1) returns:")
print(df.nunique(axis=1))
The output of the above code will be:
The DataFrame is:
    a  b  c   d  e
x   5  5  2   2  7
y  10  5  5  10  5

df.nunique(axis=1) returns:
x    3
y    2
dtype: int64
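The Parameters section lists 'columns' as an alternative to 1 for the axis parameter. The short sketch below, which reuses the same sample data, checks that both spellings give the same result. The variable names by_number and by_name are chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
  "a": [5, 10],
  "b": [5, 5],
  "c": [2, 5],
  "d": [2, 10],
  "e": [7, 5]},
  index=["x", "y"]
)

# count distinct elements in each row, using the integer form of axis
by_number = df.nunique(axis=1)

# same operation, using the string form of axis
by_name = df.nunique(axis="columns")

# True when both Series hold the same values with the same index
print(by_number.equals(by_name))
```

The string form can make the intent a little easier to read, although both forms behave identically.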
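Example: using nunique() on selected columns
Instead of the whole DataFrame, the nunique() function can be applied to selected columns. Applied to a single column (a Series), it returns a single integer. Applied to several columns, it returns a Series. Consider the following example.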
import pandas as pd
import numpy as np

df = pd.DataFrame({
  "x": [5, 5, 2, 2, 7],
  "y": [10, 5, 5, 10, 5],
  "z": [1, 1, 1, 1, 1]},
  index=["a", "b", "c", "d", "e"]
)

print("The DataFrame is:")
print(df)

# count of distinct elements in a single column
print("\ndf['z'].nunique() returns:")
print(df["z"].nunique())

# count of distinct elements in multiple columns
print("\ndf[['x', 'z']].nunique() returns:")
print(df[["x", "z"]].nunique())
The output of the above code will be:
The DataFrame is:
   x   y  z
a  5  10  1
b  5   5  1
c  2   5  1
d  2  10  1
e  7   5  1

df['z'].nunique() returns:
1

df[['x', 'z']].nunique() returns:
x    3
z    1
dtype: int64
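Example: using nunique() with the dropna parameter
The examples above do not contain missing values, so the dropna parameter has no visible effect on them. The sketch below uses a small DataFrame with NaN entries, made up here for illustration, to show how the counts change when dropna is set to False.

```python
import numpy as np
import pandas as pd

# illustrative data: "x" has one NaN, "y" has two NaN
df = pd.DataFrame({
  "x": [5, 5, np.nan, 2, 7],
  "y": [10, np.nan, np.nan, 10, 5]},
  index=["a", "b", "c", "d", "e"]
)

# default behavior: NaN is ignored when counting distinct elements
with_default = df.nunique()
print("df.nunique() returns:")
print(with_default)

# dropna=False: NaN is counted as one additional distinct element
with_nan = df.nunique(dropna=False)
print("\ndf.nunique(dropna=False) returns:")
print(with_nan)
```

With the default setting, the counts are 3 for x and 2 for y. With dropna=False, they become 4 and 3, because NaN is counted once per column no matter how many times it appears.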