Pandas Series - diff() function
The Pandas Series diff() function calculates the difference between each element of a Series and another element of the same Series. By default, each element is compared with the element in the previous row.
Syntax
Series.diff(periods=1)
Parameters
periods: Optional. The number of periods (rows) to shift when calculating the difference. Negative values are allowed and compare each element with a later row. Default: 1
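As a minimal sketch of the negative-value case mentioned above, the snippet below uses a small illustrative Series (the values and the variable names s and forward are made up for this example). With periods=-1, each element is compared with the element in the next row, so the missing value appears at the end instead of the start.

```python
import pandas as pd

# Illustrative data, not taken from any real dataset
s = pd.Series([1.5, 2.5, 3.5, 1.5])

# periods=-1 computes s[i] - s[i+1] (current row minus the NEXT row)
forward = s.diff(periods=-1)

print(forward)
# The last row has no following element, so its result is NaN
```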
Return Value
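Returns a Series containing the first discrete difference of the elements, or the difference over the specified number of periods. Positions that have no element to compare with are filled with NaN.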
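One way to reason about the returned values, for numeric data, is that diff(periods) gives the same result as subtracting a shifted copy of the Series from itself. The sketch below checks this for one illustrative Series. The names manual and result are chosen for this example only.

```python
import pandas as pd

x = pd.Series([1.5, 2.5, 3.5, 1.5, 2.5, -1])
periods = 2

# Subtract the Series shifted down by `periods` rows from the original
manual = x - x.shift(periods)
result = x.diff(periods)

# equals() treats NaN values in the same positions as equal
print(manual.equals(result))

# The first `periods` rows have nothing to compare with, so they are NaN
print(result.isna().sum())
```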
Example: using diff() on a Series
In the example below, the diff() function is used to compute the difference between each element and the element one row before it (the default), and then the element two rows before it (periods=2).
import pandas as pd

x = pd.Series([1.5, 2.5, 3.5, 1.5, 2.5, -1])

print("The Series contains:")
print(x)

# difference with the previous element (periods=1)
print("\nx.diff() returns:")
print(x.diff())

# difference with the element two rows back (periods=2)
print("\nx.diff(2) returns:")
print(x.diff(2))
The output of the above code will be:
The Series contains:
0    1.5
1    2.5
2    3.5
3    1.5
4    2.5
5   -1.0
dtype: float64

x.diff() returns:
0    NaN
1    1.0
2    1.0
3   -2.0
4    1.0
5   -3.5
dtype: float64

x.diff(2) returns:
0    NaN
1    NaN
2    2.0
3   -1.0
4   -1.0
5   -2.5
dtype: float64
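Note that x.diff(2) is a difference over a lag of two rows, which is not the same as a second-order difference (the difference of the differences). The sketch below contrasts the two on the same Series. The names lag2 and second_order are chosen for this example only.

```python
import pandas as pd

x = pd.Series([1.5, 2.5, 3.5, 1.5, 2.5, -1])

# Lag-2 difference: x[i] - x[i-2]
lag2 = x.diff(2)

# Second-order difference: apply the first difference twice
second_order = x.diff().diff()

print(pd.DataFrame({"lag2": lag2, "second_order": second_order}))
```

Both results start with two NaN values, but the remaining values differ.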
Example: using diff() on selected series in a DataFrame
Similarly, the diff() function can be applied to a selected column (Series) of a given DataFrame. Consider the following example.
import pandas as pd
import numpy as np

df = pd.DataFrame({
  "GDP": [1.5, 2.5, 3.5, 1.5, 2.5, -1],
  "GNP": [1, 2, 3, 3, 2, -1],
  "HPI": [2, 3, 2, np.nan, 2, 2]},
  index=["2015", "2016", "2017", "2018", "2019", "2020"]
)

print("The DataFrame is:")
print(df)

# first discrete difference of the 'GDP' column
print("\ndf['GDP'].diff() returns:")
print(df['GDP'].diff())
The output of the above code will be:
The DataFrame is:
      GDP  GNP  HPI
2015  1.5    1  2.0
2016  2.5    2  3.0
2017  3.5    3  2.0
2018  1.5    3  NaN
2019  2.5    2  2.0
2020 -1.0   -1  2.0

df['GDP'].diff() returns:
2015    NaN
2016    1.0
2017    1.0
2018   -2.0
2019    1.0
2020   -3.5
Name: GDP, dtype: float64
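The 'HPI' column in the DataFrame above contains a missing value. As a rough sketch of how diff() behaves in that case, the snippet below rebuilds just that column (the variable names hpi and hpi_diff are chosen for this example). A single NaN in the input affects two rows of the result: the row where it occurs and the row that is compared against it.

```python
import pandas as pd
import numpy as np

hpi = pd.Series(
    [2, 3, 2, np.nan, 2, 2],
    index=["2015", "2016", "2017", "2018", "2019", "2020"],
    name="HPI",
)

# Any subtraction that involves NaN yields NaN
hpi_diff = hpi.diff()

print(hpi_diff)
```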