Pandas - DataFrame Attributes
DataFrame attributes expose the intrinsic properties of a DataFrame, such as its labels, data types, and dimensions. Unlike methods, attributes are accessed without parentheses. The most commonly used attributes are listed below:
Attribute | Description |
---|---|
DataFrame.columns | Returns column labels of the DataFrame. |
DataFrame.dtypes | Returns the data type (dtype) of each column in the DataFrame. |
DataFrame.empty | Indicates whether the DataFrame is empty. |
DataFrame.index | Returns the index (row labels) of the DataFrame. |
DataFrame.ndim | Returns an int representing the number of axes / array dimensions. |
DataFrame.shape | Returns a tuple (rows, columns) representing the dimensionality of the DataFrame. |
DataFrame.size | Returns an int representing the number of elements in the DataFrame. |
DataFrame.values | Returns a NumPy representation of the DataFrame. |
Let's discuss these attributes in detail:
DataFrame.columns
The columns attribute is used to return column labels of the DataFrame. Consider the following example:
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4, 3, 4],
    "Last Salary": [58, 60, 63, 57, 62, 59],
    "Salary": [60, 62, 65, 59, 63, 62]},
    index=["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame contains:")
print(df)
print("\nThe column labels are:")
print(df.columns)
The output of the above code will be:
The DataFrame contains:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The column labels are:
Index(['Bonus', 'Last Salary', 'Salary'], dtype='object')
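Beyond printing the labels, the columns attribute can be converted to a list, tested for membership, or reassigned. The following is a minimal sketch; the shortened data and the lowercase replacement labels are illustrative assumptions, not part of the example above.

```python
import pandas as pd

# Shortened version of the salary data, used only for illustration
df = pd.DataFrame(
    {"Bonus": [5, 3], "Last Salary": [58, 60], "Salary": [60, 62]},
    index=["John", "Marry"],
)

# Convert the Index of column labels to a plain Python list
labels = df.columns.tolist()
print(labels)

# Check whether a column label exists
has_bonus = "Bonus" in df.columns
print(has_bonus)

# Assign new labels; the list must have one entry per column
df.columns = ["bonus", "last_salary", "salary"]
print(df.columns.tolist())
```

Assigning to df.columns renames the columns in place, so it is worth keeping a copy of the original labels if they are needed later.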
DataFrame.dtypes
The dtypes attribute is used to get the data type of each column in the given DataFrame. Consider the following example.
import pandas as pd

data = {'Name': ['John', 'Marry', 'Jo', 'Sam'],
        'Age': [25, 24, 30, 28]}
df = pd.DataFrame(data)

print("dtypes of df:\n", df.dtypes)
The output of the above code will be:
dtypes of df:
 Name    object
Age      int64
dtype: object
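Because dtypes returns a Series indexed by column label, the type of a single column can be looked up by name. The sketch below also uses select_dtypes to pick out numeric columns; the variable names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Marry"], "Age": [25, 24]})

# dtypes is a Series whose index holds the column labels
dtypes_series = df.dtypes
print(type(dtypes_series))

# Look up the dtype of one column by its label
age_dtype = df.dtypes["Age"]
print(age_dtype)

# Select only the numeric columns based on their dtypes
numeric_cols = df.select_dtypes(include="number").columns.tolist()
print(numeric_cols)
```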
DataFrame.empty
The empty attribute is used to check whether the given DataFrame is empty or not.
import pandas as pd

names = ['John', 'Marry', 'Jo', 'Sam']
df1 = pd.DataFrame(names)   # DataFrame with four rows
df2 = pd.DataFrame()        # DataFrame with no rows and no columns

print("Is df1 empty?:", df1.empty)
print("Is df2 empty?:", df2.empty)
The output of the above code will be:
Is df1 empty?: False
Is df2 empty?: True
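A point that is easy to miss: empty only checks whether the DataFrame has no elements, so a DataFrame holding nothing but NaN values is not considered empty. A short sketch with hypothetical data:

```python
import numpy as np
import pandas as pd

# Columns are defined but there are no rows, so the DataFrame is empty
no_rows = pd.DataFrame(columns=["Name", "Age"])
print(no_rows.empty)

# Two rows that contain only NaN still count as elements
only_nan = pd.DataFrame({"Age": [np.nan, np.nan]})
print(only_nan.empty)

# After dropping the NaN rows, nothing is left
after_drop = only_nan.dropna()
print(after_drop.empty)
```

If the goal is to detect a DataFrame with no usable data, dropping missing values before checking empty is one way to do it.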
DataFrame.index
The index attribute is used to return the index (row labels) of the DataFrame.
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4, 3, 4],
    "Last Salary": [58, 60, 63, 57, 62, 59],
    "Salary": [60, 62, 65, 59, 63, 62]},
    index=["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame contains:")
print(df)
print("\nThe index (row labels) are:")
print(df.index)
The output of the above code will be:
The DataFrame contains:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The index (row labels) are:
Index(['John', 'Marry', 'Sam', 'Jo', 'Ramesh', 'Kim'], dtype='object')
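The row labels returned by index are the same labels used for label-based selection with loc. The sketch below, on a shortened version of the data, also shows that a DataFrame created without an explicit index gets a default RangeIndex.

```python
import pandas as pd

df = pd.DataFrame({"Salary": [60, 62, 65]}, index=["John", "Marry", "Sam"])

# Row labels as a plain Python list
row_labels = df.index.tolist()
print(row_labels)

# Use a row label together with a column label to select one value
sam_salary = df.loc["Sam", "Salary"]
print(sam_salary)

# Without an explicit index, pandas assigns a RangeIndex starting at 0
default_index = pd.DataFrame({"Salary": [60, 62, 65]}).index
print(default_index)
```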
DataFrame.ndim
The ndim attribute returns the number of axes (array dimensions) of the given DataFrame. For a DataFrame this is always 2: one axis for rows and one for columns. Consider the example below:
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4, 3, 4],
    "Last Salary": [58, 60, 63, 57, 62, 59],
    "Salary": [60, 62, 65, 59, 63, 62]},
    index=["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

# Number of dimensions of df
print("Dimension of df:", df.ndim)
The output of the above code will be:
Dimension of df: 2
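To put the value 2 in context, ndim can be compared between a DataFrame and a Series. This is a minimal sketch on shortened data; selecting with single brackets yields a Series, while double brackets keep a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"Bonus": [5, 3], "Salary": [60, 62]})

# A single column selected with one pair of brackets is a Series (1 axis)
series_ndim = df["Salary"].ndim
print(series_ndim)

# A list of columns keeps the result a DataFrame (2 axes)
frame_ndim = df[["Salary"]].ndim
print(frame_ndim)
```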
DataFrame.shape
The shape attribute returns a tuple of the form (number of rows, number of columns) representing the dimensionality of the DataFrame. Consider the following example.
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4, 3, 4],
    "Last Salary": [58, 60, 63, 57, 62, 59],
    "Salary": [60, 62, 65, 59, 63, 62]},
    index=["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

# Shape of df as (rows, columns)
print("Shape of df:", df.shape)
The output of the above code will be:
Shape of df: (6, 3)
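Since shape is an ordinary tuple, it can be unpacked into separate row and column counts. A brief sketch using the same data; the names n_rows and n_cols are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4, 3, 4],
    "Last Salary": [58, 60, 63, 57, 62, 59],
    "Salary": [60, 62, 65, 59, 63, 62]},
    index=["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

# Unpack the tuple into the number of rows and the number of columns
n_rows, n_cols = df.shape
print("Rows:", n_rows)
print("Columns:", n_cols)

# len(df) gives the number of rows, i.e. the first entry of shape
print(len(df) == n_rows)
```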
DataFrame.size
The size attribute is used to get the number of elements in the given DataFrame, which equals the number of rows multiplied by the number of columns. Consider the example below:
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4, 3, 4],
    "Last Salary": [58, 60, 63, 57, 62, 59],
    "Salary": [60, 62, 65, 59, 63, 62]},
    index=["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame is:")
print(df)
print("\nThe number of elements in df:", df.size)
The output of the above code will be:
The DataFrame is:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The number of elements in df: 18
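Note that size counts every cell, including missing values. The sketch below, on hypothetical data with one NaN, contrasts size with a count of non-missing values obtained from count().

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value
df = pd.DataFrame({"Bonus": [5, np.nan, 2], "Salary": [60, 62, 65]})

# size is rows * columns, regardless of missing values
total = df.size
print(total)

# count() returns the non-missing values per column; sum them for a total
non_missing = int(df.count().sum())
print(non_missing)
```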
DataFrame.values
The values attribute is used to return a NumPy representation of the DataFrame. Only the data is returned; the row and column labels are dropped. Consider the following example:
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4, 3, 4],
    "Last Salary": [58, 60, 63, 57, 62, 59],
    "Salary": [60, 62, 65, 59, 63, 62]},
    index=["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame is:")
print(df)
print("\nThe NumPy representation of df:")
print(df.values)
The output of the above code will be:
The DataFrame is:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The NumPy representation of df:
[[ 5 58 60]
 [ 3 60 62]
 [ 2 63 65]
 [ 4 57 59]
 [ 3 62 63]
 [ 4 59 62]]
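The pandas documentation recommends the to_numpy() method over values for new code. The sketch below uses it on shortened data and shows how mixing column types affects the dtype of the resulting array; the variable names are illustrative.

```python
import pandas as pd

# All-integer columns convert to an integer array
numbers = pd.DataFrame({"Bonus": [5, 3], "Salary": [60, 62]})
arr = numbers.to_numpy()
print(arr.shape)
print(arr.dtype)

# Mixing text and integer columns forces a common dtype of object
mixed = pd.DataFrame({"Name": ["John", "Marry"], "Age": [25, 24]})
mixed_arr = mixed.to_numpy()
print(mixed_arr.dtype)
```

An object array loses the speed benefits of numeric NumPy arrays, so it can help to select only the numeric columns before converting.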