Pandas - DataFrame Comparison Functions
The pandas package provides a set of comparison functions for element-wise comparison operations on a DataFrame or Series. The most frequently used of these functions are listed below.
Functions | Description |
---|---|
lt() | Returns the element-wise "less than" (<) comparison of the DataFrame and the argument. |
gt() | Returns the element-wise "greater than" (>) comparison of the DataFrame and the argument. |
le() | Returns the element-wise "less than or equal to" (<=) comparison of the DataFrame and the argument. |
ge() | Returns the element-wise "greater than or equal to" (>=) comparison of the DataFrame and the argument. |
eq() | Returns the element-wise "equal to" (==) comparison of the DataFrame and the argument. |
ne() | Returns the element-wise "not equal to" (!=) comparison of the DataFrame and the argument. |
Let's discuss these functions in detail:
Comparison Functions
Comparison operations can be performed element-wise on a given DataFrame using the lt(), gt(), le(), ge(), eq() and ne() functions. They are equivalent to the operators <, >, <=, >=, == and !=, but additionally let you choose the axis (rows or columns) and the MultiIndex level used for the comparison. Each function returns a DataFrame of boolean values. The syntax for these functions is given below:
Syntax

    DataFrame.lt(other, axis='columns', level=None)
    DataFrame.gt(other, axis='columns', level=None)
    DataFrame.le(other, axis='columns', level=None)
    DataFrame.ge(other, axis='columns', level=None)
    DataFrame.eq(other, axis='columns', level=None)
    DataFrame.ne(other, axis='columns', level=None)
Parameters
Parameter | Description |
---|---|
other | Required. Specify the object to compare with: a scalar, a list-like object, a Series, or a DataFrame. |
axis | Optional. Specify whether to compare by the index (0 or 'index') or columns (1 or 'columns'). Default is 'columns'. |
level | Optional. Specify int or label to broadcast across a level, matching Index values on the passed MultiIndex level. Default is None. |
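The effect of the axis and level parameters is easier to see on a small case. The sketch below is only an illustration: the `thresholds` Series and the `quarterly` DataFrame are hypothetical data invented for this example, and the DataFrame matches the one used in the examples that follow.

```python
import pandas as pd

df = pd.DataFrame(
    {"Bonus": [5, 4, 2], "Salary": [60, 62, 65]},
    index=["John", "Marry", "Sam"],
)

# Hypothetical per-row thresholds, one value for each row label.
thresholds = pd.Series([5, 3, 3], index=["John", "Marry", "Sam"])

# axis="index" aligns the Series with the row labels, so every value in a
# row is compared with that row's threshold.
by_row = df.gt(thresholds, axis="index")
print(by_row)

# Hypothetical quarterly figures indexed by (quarter, name).
quarterly = pd.DataFrame(
    {"Bonus": [4, 4, 3, 6, 3, 2], "Salary": [60, 60, 60, 65, 65, 65]},
    index=pd.MultiIndex.from_product(
        [["Q1", "Q2"], ["John", "Marry", "Sam"]]
    ),
)

# level=1 broadcasts df across the second index level (the names), so each
# person's row in df is compared with that person's row in every quarter.
by_level = df.le(quarterly, level=1)
print(by_level)
```

With the default axis='columns', a list-like or Series argument is matched against the column labels instead, which is the form used in the examples below.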
Example:
In the example below, a DataFrame df is created and the different comparison functions are applied to it.

    import pandas as pd

    df = pd.DataFrame({
      "Bonus": [5, 4, 2],
      "Salary": [60, 62, 65]},
      index=["John", "Marry", "Sam"]
    )

    print("The DataFrame is:")
    print(df)

    # check whether each entry of the DataFrame is less than 4
    print("\ndf.lt(4) returns:")
    print(df.lt(4))

    # compare each entry of the Bonus column with 4
    # and each entry of the Salary column with 62
    print("\ndf.gt([4,62]) returns:")
    print(df.gt([4,62]))

    # check whether each entry of the DataFrame
    # is less than or equal to 4
    print("\ndf.le(4) returns:")
    print(df.le(4))

    # compare each entry of the Bonus column with 4
    # and each entry of the Salary column with 62
    print("\ndf.ge([4,62]) returns:")
    print(df.ge([4,62]))
The output of the above code will be:

    The DataFrame is:
           Bonus  Salary
    John       5      60
    Marry      4      62
    Sam        2      65

    df.lt(4) returns:
           Bonus  Salary
    John   False   False
    Marry  False   False
    Sam     True   False

    df.gt([4,62]) returns:
           Bonus  Salary
    John    True   False
    Marry  False   False
    Sam    False    True

    df.le(4) returns:
           Bonus  Salary
    John   False   False
    Marry   True   False
    Sam     True   False

    df.ge([4,62]) returns:
           Bonus  Salary
    John    True   False
    Marry   True    True
    Sam    False    True
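The same comparison methods are also available on a Series, as noted in the introduction. A minimal sketch, assuming a single `salary` Series that mirrors the Salary column above (the variable names are chosen only for this illustration):

```python
import pandas as pd

salary = pd.Series([60, 62, 65], index=["John", "Marry", "Sam"], name="Salary")

# Element-wise "greater than or equal to" on a Series returns a boolean Series.
at_least_62 = salary.ge(62)
print(at_least_62)

# The boolean Series can be used as a mask to keep only the matching entries.
selected = salary[at_least_62]
print(selected)
```

Here the mask keeps the entries for Marry and Sam, whose values are at least 62.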
Example:
Similarly, the eq() and ne() functions can be used on a DataFrame. Consider the example below.

    import pandas as pd

    df = pd.DataFrame({
      "Bonus": [5, 4, 2],
      "Salary": [60, 62, 65]},
      index=["John", "Marry", "Sam"]
    )

    print("The DataFrame is:")
    print(df)

    # check whether each entry of the DataFrame is equal to 4
    print("\ndf.eq(4) returns:")
    print(df.eq(4))

    # compare each entry of the Bonus column with 4
    # and each entry of the Salary column with 62
    print("\ndf.ne([4,62]) returns:")
    print(df.ne([4,62]))
The output of the above code will be:

    The DataFrame is:
           Bonus  Salary
    John       5      60
    Marry      4      62
    Sam        2      65

    df.eq(4) returns:
           Bonus  Salary
    John   False   False
    Marry   True   False
    Sam    False   False

    df.ne([4,62]) returns:
           Bonus  Salary
    John    True    True
    Marry  False   False
    Sam     True    True
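The equivalence between these functions and the comparison operators can be checked directly. The sketch below reuses the same DataFrame and compares each method call with its operator form; the `pairs` and `matches` names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Bonus": [5, 4, 2], "Salary": [60, 62, 65]},
    index=["John", "Marry", "Sam"],
)

# Pair each method call with the corresponding operator expression.
pairs = {
    "lt": (df.lt(4), df < 4),
    "gt": (df.gt([4, 62]), df > [4, 62]),
    "le": (df.le(4), df <= 4),
    "ge": (df.ge([4, 62]), df >= [4, 62]),
    "eq": (df.eq(4), df == 4),
    "ne": (df.ne([4, 62]), df != [4, 62]),
}

# DataFrame.equals() checks that two results have the same shape and values.
matches = {name: method.equals(op) for name, (method, op) in pairs.items()}
print(matches)
```

The method form is mainly useful when the comparison has to run along the rows (axis='index') or across a MultiIndex level, which the operators do not let you specify.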