
Pandas Series - diff() function



The Pandas Series diff() function calculates the difference between each element of a Series and another element of the same Series. By default, each element is compared with the element in the previous row.

Syntax

Series.diff(periods=1)

Parameters

periods Optional. Number of rows to shift when calculating the difference. A negative value compares each element with a later row instead of an earlier one. Default: 1
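To illustrate a negative value for periods, here is a minimal sketch. The variable name forward is chosen only for this illustration. With periods=-1, each element has the next element subtracted from it, so the last row has nothing to compare against.

```python
import pandas as pd

x = pd.Series([1.5, 2.5, 3.5, 1.5, 2.5, -1])

# periods=-1 computes x[i] - x[i+1], i.e. each element minus the NEXT one.
# The last row has no following element, so its result is NaN.
forward = x.diff(periods=-1)
print(forward)
```

Here the NaN appears in the last row rather than the first.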

Return Value

Returns a Series of the same length containing the discrete difference for the given period. Positions that have no element to compare against are filled with NaN.
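One way to picture the return value is as the Series minus a shifted copy of itself. The sketch below compares diff() with that manual calculation for floating-point data. The names manual and builtin are used only for this illustration.

```python
import pandas as pd

x = pd.Series([1.5, 2.5, 3.5, 1.5, 2.5, -1])

# Subtract a copy of the Series shifted down by 2 rows
manual = x - x.shift(2)

# Same calculation using diff()
builtin = x.diff(2)

# Series.equals() treats NaN values in the same positions as equal
print(manual.equals(builtin))
```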

Example: using diff() on a Series

In the example below, the diff() function is used to calculate the difference with the previous element (the default) and with the element two rows earlier.

import pandas as pd
import numpy as np

x = pd.Series([1.5, 2.5, 3.5, 1.5, 2.5, -1])

print("The Series contains:")
print(x)

#difference with the previous element (periods=1)
print("\nx.diff() returns:")
print(x.diff())

#difference with the element two rows earlier (periods=2)
print("\nx.diff(2) returns:")
print(x.diff(2))

The output of the above code will be:

The Series contains:
0    1.5
1    2.5
2    3.5
3    1.5
4    2.5
5   -1.0
dtype: float64

x.diff() returns:
0    NaN
1    1.0
2    1.0
3   -2.0
4    1.0
5   -3.5
dtype: float64

x.diff(2) returns:
0    NaN
1    NaN
2    2.0
3   -1.0
4   -1.0
5   -2.5
dtype: float64
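Note that x.diff(2) is a lag-2 difference (each element minus the element two rows earlier), which is not the same as a second-order difference (the difference of the differences). The following sketch, which assumes the same Series as above, shows one way to get the second-order difference by calling diff() twice. The names lag_two and second_order are illustrative.

```python
import pandas as pd

x = pd.Series([1.5, 2.5, 3.5, 1.5, 2.5, -1])

# Lag-2 difference: x[i] - x[i-2]
lag_two = x.diff(2)

# Second-order difference: difference of the first differences
second_order = x.diff().diff()

print("x.diff(2) returns:")
print(lag_two)
print("\nx.diff().diff() returns:")
print(second_order)
```

Both results start with two NaN values, but the remaining values differ.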

Example: using diff() on selected series in a DataFrame

Similarly, the diff() function can be applied to a selected column of a DataFrame, since each column is a Series. Consider the following example.

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "GDP": [1.5, 2.5, 3.5, 1.5, 2.5, -1],
  "GNP": [1, 2, 3, 3, 2, -1],
  "HPI": [2, 3, 2, np.nan, 2, 2]},
  index= ["2015", "2016", "2017", 
          "2018", "2019", "2020"]
)

print("The DataFrame is:")
print(df)

#first discrete difference of 'GDP' Series
print("\ndf['GDP'].diff() returns:")
print(df['GDP'].diff())

The output of the above code will be:

The DataFrame is:
      GDP  GNP  HPI
2015  1.5    1  2.0
2016  2.5    2  3.0
2017  3.5    3  2.0
2018  1.5    3  NaN
2019  2.5    2  2.0
2020 -1.0   -1  2.0

df['GDP'].diff() returns:
2015    NaN
2016    1.0
2017    1.0
2018   -2.0
2019    1.0
2020   -3.5
Name: GDP, dtype: float64
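The 'HPI' column in the DataFrame above contains a missing value. As a small sketch of how diff() behaves in that case, the example below rebuilds that column as a standalone Series (the name hpi is used only here). Any difference that involves a NaN is itself NaN, so one missing value affects two rows of the result.

```python
import numpy as np
import pandas as pd

hpi = pd.Series(
    [2, 3, 2, np.nan, 2, 2],
    index=["2015", "2016", "2017", "2018", "2019", "2020"],
    name="HPI",
)

# 2018 is NaN, so both 2018 (NaN - 2) and 2019 (2 - NaN) become NaN
hpi_diff = hpi.diff()
print(hpi_diff)
```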
