Pandas DataFrame - cumsum() function
The Pandas DataFrame cumsum() function computes the cumulative sum along a DataFrame or Series axis. It returns a DataFrame or Series of the same size in which each element is the running total of the values up to and including that position.
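To make the idea of a running total concrete, the sketch below builds one by hand and compares it with cumsum(). The sample values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the examples on this page.
values = pd.Series([3, 1, 4, 1, 5])

# Build the running total by hand: each entry is the sum of
# all entries up to and including the current one.
running_total = []
total = 0
for v in values.tolist():
    total += v
    running_total.append(total)

print(running_total)             # [3, 4, 8, 9, 14]
print(values.cumsum().tolist())  # [3, 4, 8, 9, 14]
```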
Syntax
DataFrame.cumsum(axis=None, skipna=True)
Parameters
axis | Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', cumulative sums are generated down each column. If 1 or 'columns', cumulative sums are generated across each row. Default: 0.
skipna | Optional. Specify True to exclude NA/null values when computing the result. If False, every value after an NA along the chosen axis is also NA. Default: True.
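As a quick check of the parameters above, the following sketch compares the numeric and string forms of axis and shows the effect of skipna=False. The DataFrame df and its column names are hypothetical, used only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column "a".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# axis=0 and axis="index" both accumulate down each column.
print(df.cumsum(axis=0).equals(df.cumsum(axis="index")))    # True

# axis=1 and axis="columns" both accumulate across each row.
print(df.cumsum(axis=1).equals(df.cumsum(axis="columns")))  # True

# With skipna=False, the NaN in column "a" propagates to later rows.
print(df.cumsum(skipna=False))
```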
Return Value
Returns a Series or DataFrame of the same size as the caller, containing the cumulative sum.
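The short sketch below illustrates what "same size" means in practice: the result keeps the shape and index of the input, and when no values are missing its last row matches the ordinary column totals. The DataFrame df is a hypothetical example.

```python
import pandas as pd

# Hypothetical data without missing values.
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]}, index=["a", "b", "c"])
result = df.cumsum()

# The result has the same shape and the same index as the input.
print(result.shape == df.shape)       # True
print(result.index.equals(df.index))  # True

# The last row of the cumulative sum equals the plain column sums.
print(result.iloc[-1].tolist())       # [6, 60]
print(df.sum().tolist())              # [6, 60]
```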
Example: using cumsum() column-wise on the whole DataFrame
In the example below, a DataFrame named info is created. The cumsum() function is used to get the cumulative sum of each column, first with the default skipna=True and then with skipna=False.
import pandas as pd
import numpy as np

info = pd.DataFrame({
  "Salary": [25, 24, 30, 28, 25],
  "Bonus": [10, 8, 9, np.nan, 9]},
  index=["2015", "2016", "2017", "2018", "2019"]
)

# display the DataFrame
print(info, "\n")

# display the cumulative sum
print("info.cumsum() returns:")
print(info.cumsum(), "\n")

# using skipna=False
print("info.cumsum(skipna=False) returns:")
print(info.cumsum(skipna=False))
The output of the above code will be:
      Salary  Bonus
2015      25   10.0
2016      24    8.0
2017      30    9.0
2018      28    NaN
2019      25    9.0

info.cumsum() returns:
      Salary  Bonus
2015      25   10.0
2016      49   18.0
2017      79   27.0
2018     107    NaN
2019     132   36.0

info.cumsum(skipna=False) returns:
      Salary  Bonus
2015      25   10.0
2016      49   18.0
2017      79   27.0
2018     107    NaN
2019     132    NaN
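In the output above, the NaN in the Bonus column stays NaN in the result even when it is skipped. If a missing value should instead count as zero so that the running total has no gap, one possible approach (an assumption about the desired behaviour, not something the example above requires) is to fill the missing values with fillna(0) before calling cumsum():

```python
import numpy as np
import pandas as pd

info = pd.DataFrame({
    "Salary": [25, 24, 30, 28, 25],
    "Bonus": [10, 8, 9, np.nan, 9]},
    index=["2015", "2016", "2017", "2018", "2019"]
)

# Treat the missing bonus as 0, then accumulate down each column.
filled = info.fillna(0).cumsum()

print(filled["Bonus"].tolist())  # [10.0, 18.0, 27.0, 27.0, 36.0]
```

Whether zero is a sensible stand-in for a missing value depends on the data, so this is best treated as a sketch.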
Example: using cumsum() row-wise on the whole DataFrame
To get the row-wise cumulative sum, the axis parameter can be set to 1.
import pandas as pd
import numpy as np

info = pd.DataFrame({
  "2016": [25, 24, 30, 28, 25],
  "2017": [18, 20, 25, np.nan, 28],
  "2018": [25, 24, 25, 30, 25]},
  index=["P1", "P2", "P3", "P4", "P5"]
)

# display the DataFrame
print(info, "\n")

# display the cumulative sum
print("info.cumsum(axis=1) returns:")
print(info.cumsum(axis=1), "\n")

# using skipna=False
print("info.cumsum(axis=1, skipna=False) returns:")
print(info.cumsum(axis=1, skipna=False))
The output of the above code will be:
    2016  2017  2018
P1    25  18.0    25
P2    24  20.0    24
P3    30  25.0    25
P4    28   NaN    30
P5    25  28.0    25

info.cumsum(axis=1) returns:
    2016  2017  2018
P1  25.0  43.0  68.0
P2  24.0  44.0  68.0
P3  30.0  55.0  80.0
P4  28.0   NaN  58.0
P5  25.0  53.0  78.0

info.cumsum(axis=1, skipna=False) returns:
    2016  2017  2018
P1  25.0  43.0  68.0
P2  24.0  44.0  68.0
P3  30.0  55.0  80.0
P4  28.0   NaN   NaN
P5  25.0  53.0  78.0
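One way to think about axis=1 is that it gives the same values as transposing the DataFrame, taking the column-wise cumulative sum, and transposing back. The sketch below checks this on the data from the example above; the variable names row_wise and via_transpose are hypothetical.

```python
import numpy as np
import pandas as pd

info = pd.DataFrame({
    "2016": [25, 24, 30, 28, 25],
    "2017": [18, 20, 25, np.nan, 28],
    "2018": [25, 24, 25, 30, 25]},
    index=["P1", "P2", "P3", "P4", "P5"]
)

# Row-wise cumulative sum, computed directly.
row_wise = info.cumsum(axis=1)

# The same values via transpose, column-wise cumsum, transpose back.
via_transpose = info.T.cumsum().T

# Compare the values, treating NaN in the same position as equal.
same = np.allclose(
    row_wise.to_numpy(dtype=float),
    via_transpose.to_numpy(dtype=float),
    equal_nan=True,
)
print(same)  # True
```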
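Example: using cumsum() on selected columns
Instead of the whole DataFrame, the cumsum() function can be applied to selected columns. Selecting a single column gives a Series, while selecting a list of columns gives a DataFrame. Consider the following example.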
import pandas as pd
import numpy as np

info = pd.DataFrame({
  "Salary": [25, 24, 30, 28, 25],
  "Bonus": [10, 8, 9, np.nan, 9],
  "Others": [5, 4, 7, 5, 8]},
  index=["2015", "2016", "2017", "2018", "2019"]
)

# display the DataFrame
print(info, "\n")

# cumulative sum on a single column
print("info['Salary'].cumsum() returns:")
print(info['Salary'].cumsum(), "\n")

# cumulative sum on multiple columns
print("info[['Salary', 'Others']].cumsum() returns:")
print(info[['Salary', 'Others']].cumsum(), "\n")
The output of the above code will be:
      Salary  Bonus  Others
2015      25   10.0       5
2016      24    8.0       4
2017      30    9.0       7
2018      28    NaN       5
2019      25    9.0       8

info['Salary'].cumsum() returns:
2015     25
2016     49
2017     79
2018    107
2019    132
Name: Salary, dtype: int64

info[['Salary', 'Others']].cumsum() returns:
      Salary  Others
2015      25       5
2016      49       9
2017      79      16
2018     107      21
2019     132      29
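A common follow-up is to keep the running total next to the original values. The sketch below stores the cumulative sum of one column as a new column; the column name SalaryToDate is hypothetical and chosen only for this illustration.

```python
import pandas as pd

info = pd.DataFrame({
    "Salary": [25, 24, 30, 28, 25]},
    index=["2015", "2016", "2017", "2018", "2019"]
)

# Add the running total as a new column; the original column is unchanged.
info["SalaryToDate"] = info["Salary"].cumsum()

print(info["SalaryToDate"].tolist())  # [25, 49, 79, 107, 132]
```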