How to get size of Pandas DataFrame? | Get the number of rows and columns in dataframe
In this pandas tutorial, we will discuss:
- what is a dataframe in pandas?
- how to get the size of a pandas dataframe?
- how to get the number of rows and columns in a pandas dataframe?
- how to get the size in memory of a pandas dataframe?
Here we will see how much memory the data in a DataFrame consumes and how to get the number of rows and columns in a pandas dataframe.
There are several ways to get the size of a dataframe in pandas. We will discuss them one by one.
Before going into the details, let us see what a DataFrame is.
What is a dataframe in pandas?
A DataFrame is a data structure that stores data in rows and columns, so it is a two-dimensional data structure.
In pandas, we create a dataframe by calling DataFrame(). Since this data structure is provided by pandas, we first have to import the pandas module.
In Python, the import keyword is used to import a module.
Syntax:
import module_name as alias
where,
1. module_name is the name of the module we want to import. Here it is pandas.
2. alias is the short name we assign to the module.
import pandas as pd
We are aliasing pandas as pd.
pandas.DataFrame() is the constructor used to create a dataframe.
Syntax:
pandas.DataFrame(data, index, columns)
where,
1. data is the input list/dictionary that supplies the data for the dataframe
2. index is used to assign the index/row labels
3. columns is used to assign the column labels
Note: Here we are creating a dataframe from a dictionary, so there is no need to pass columns as a separate parameter. Each dictionary key becomes a column name, and the list stored under that key becomes the values of that column.
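The note above can be sketched with a small example. The dictionary and the row labels 'r1' and 'r2' are made up for illustration; the sketch only shows how the dictionary keys, index and columns parameters relate to each other.

```python
import pandas as pd

# hypothetical sample data, used only for illustration
data = {'a': [1, 2], 'b': [3, 4]}

# the dictionary keys become the column names automatically
df_default = pd.DataFrame(data)
print(list(df_default.columns))   # ['a', 'b']

# index assigns custom row labels,
# columns selects and orders the dictionary keys
df_labeled = pd.DataFrame(data, index=['r1', 'r2'], columns=['b', 'a'])
print(list(df_labeled.index))     # ['r1', 'r2']
print(list(df_labeled.columns))   # ['b', 'a']
```

Passing index and columns as keyword arguments, as done here, avoids depending on their position in the call.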
Create pandas dataframe
Let’s create a sample DataFrame with 3 rows and 4 columns with building data.
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# display the dataframe
print(dataframe_object)
Output:
  block_no              name    cost  sqaure-feet
0   ba-001  villas ohri daba  456700         5000
1   ba-002              CSD3   12000         1200
2   ba-003             fgdb1  450000         4564
The output shows the 3 rows and 4 columns of the dataframe.
We will use this dataframe in the remaining examples of this tutorial.
Get the number of rows and columns in pandas dataframe
In this part, we will discuss how to get number of rows and columns in pandas dataframe.
Method 1 : Get the number of rows and columns in pandas dataframe using shape
shape is an attribute that returns a tuple containing two values.
Here,
- the first value is the total number of rows in the DataFrame, and
- the second value is the total number of columns in the DataFrame.
Syntax:
dataframe_object.shape
where, dataframe_object is the input dataframe.
Parameters:
shape is an attribute, not a method, so it takes no parameters and is written without parentheses.
Example:
In this example, we use the shape attribute to get the number of rows and columns of the dataframe created above.
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# get the shape
print(dataframe_object.shape)
Output:
(3, 4)
From the above output, we can see that the dataframe has 3 rows and 4 columns.
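Since shape is a plain tuple, its two values can also be unpacked into separate variables. The following is a minimal sketch using the same building data; the variable names rows and columns are chosen here for illustration.

```python
import pandas as pd

building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}
dataframe_object = pd.DataFrame(building_data)

# unpack the (rows, columns) tuple into two variables
rows, columns = dataframe_object.shape
print(f"rows: {rows}, columns: {columns}")   # rows: 3, columns: 4

# the same values are available by position
print(dataframe_object.shape[0])   # 3
print(dataframe_object.shape[1])   # 4
```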
Method 2 : Get size of dataframe in pandas using size
size is an attribute that returns the total number of values in the DataFrame, that is, the number of rows multiplied by the number of columns.
Syntax:
dataframe_object.size
where, dataframe_object is the input dataframe.
Parameters:
size is an attribute, so it takes no parameters.
Example:
In this example, we use the size attribute to return the number of values in the dataframe.
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# get the size
print(dataframe_object.size)
Output:
12
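The value 12 is simply 3 rows times 4 columns. As a small sketch of that relationship, the example below compares size with the product of the two shape values, and also shows size on a single column.

```python
import pandas as pd

building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}
dataframe_object = pd.DataFrame(building_data)

rows, columns = dataframe_object.shape

# size equals rows multiplied by columns
print(dataframe_object.size == rows * columns)   # True

# for a single column (a Series), size is the number of values in it
print(dataframe_object['cost'].size)             # 3
```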
Method 3 : Get size of dataframe in pandas using len()
The len() function returns the total number of rows in the DataFrame.
Syntax:
len(dataframe_object)
where, dataframe_object is the input dataframe.
Parameters:
It takes the dataframe object as input.
Example:
In this example, we use the len() function to get the number of rows in a dataframe.
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# get the number of rows in a dataframe
print(len(dataframe_object))
Output:
3
To get the number of columns in a dataframe, we apply len() to the columns attribute, which holds the column labels.
Syntax:
len(dataframe_object.columns)
Example:
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# get the number of columns in a dataframe
print(len(dataframe_object.columns))
Output:
4
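One place where the two len() calls are handy is checking a filtered result. The sketch below is an illustration only; the cost threshold and the variable name expensive are assumptions made for this example. It shows that a filter can leave zero rows while the columns remain.

```python
import pandas as pd

building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}
dataframe_object = pd.DataFrame(building_data)

# len() on the dataframe and on its index give the same row count
print(len(dataframe_object) == len(dataframe_object.index))   # True

# hypothetical filter: no building costs more than 1,000,000
expensive = dataframe_object[dataframe_object['cost'] > 1_000_000]
print(len(expensive))           # 0
print(len(expensive.columns))   # 4
```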
Method 4 : Get the number of rows and columns in pandas dataframe using len() with axes
axes is an attribute that returns a list holding the row labels (index) and the column labels of the dataframe.
Applying len() to each element of this list gives the total number of rows and columns.
Syntax:
For rows - len(dataframe_object.axes[0])
For columns - len(dataframe_object.axes[1])
where, dataframe_object is the input dataframe.
Parameters:
axes is an attribute, so it takes no parameters.
Example:
In this example, we use the axes attribute to get the number of rows and columns in a pandas dataframe.
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# get the number of rows in a dataframe
print(len(dataframe_object.axes[0]))
# get the number of columns in a dataframe
print(len(dataframe_object.axes[1]))
Output:
3
4
Here we get the number of rows and columns in pandas dataframe.
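To make it clearer what axes holds, the following sketch prints its two elements for the same building data. It is meant only as an illustration of the structure: the first element is the row index and the second is the column labels.

```python
import pandas as pd

building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}
dataframe_object = pd.DataFrame(building_data)

axes = dataframe_object.axes

# axes is a list with two elements: row labels, then column labels
print(type(axes))      # <class 'list'>
print(list(axes[0]))   # [0, 1, 2]
print(list(axes[1]))   # ['block_no', 'name', 'cost', 'sqaure-feet']

# the two lengths together match the shape tuple
print((len(axes[0]), len(axes[1])) == dataframe_object.shape)   # True
```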
Get size in memory of pandas dataframe
Method 1 : Get size of dataframe in pandas using memory_usage
memory_usage() returns the memory consumed by each column in bytes.
Syntax:
dataframe_object.memory_usage(index=True)
where, dataframe_object is the input dataframe.
Parameters:
index is an optional Boolean parameter. If it is set to True, the memory consumed by the Index is also displayed. If it is set to False, the memory consumed by the Index is not displayed. By default it is True.
Example:
In this example, we get the memory occupied by the dataframe created above.
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# get the memory in bytes
print(dataframe_object.memory_usage())
Output:
Index          128
block_no        24
name            24
cost            24
sqaure-feet     24
dtype: int64
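From the above output, each column occupies 24 bytes (3 values of 8 bytes each) and the index occupies 128 bytes. For the object columns block_no and name, the 24 bytes cover only the references to the strings, not the strings themselves.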
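memory_usage() also accepts a deep parameter, which asks pandas to inspect the actual contents of object columns. The sketch below compares the two modes and sums the per-column values into a single total. The exact byte counts depend on the pandas version and platform, so no expected totals are shown.

```python
import pandas as pd

building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}
dataframe_object = pd.DataFrame(building_data)

# per-column usage without the Index entry
shallow = dataframe_object.memory_usage(index=False)

# deep=True also measures the contents of object columns
deep = dataframe_object.memory_usage(index=False, deep=True)

# sum() turns the per-column values into one total in bytes
print("shallow total:", shallow.sum())
print("deep total:", deep.sum())
```

The deep total is never smaller than the shallow one, and it is usually the better estimate when a dataframe holds many strings.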
Method 2 : Get size in memory of pandas dataframe using info()
info() prints a summary of the dataframe that includes each column name with its data type, the count of non-null values and the memory consumed by the dataframe.
Syntax:
dataframe_object.info()
Example:
In this example, we are getting information from the dataframe
import pandas as pd
# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
'name':['villas ohri daba','CSD3','fgdb1'],
'cost':[456700,12000,450000],
'sqaure-feet':[5000,1200,4564]}
# pass this building_data to the DataFrame
dataframe_object=pd.DataFrame(building_data)
# get the information
print(dataframe_object.info())
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 4 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 block_no 3 non-null object
1 name 3 non-null object
2 cost 3 non-null int64
3 sqaure-feet 3 non-null int64
dtypes: int64(2), object(2)
memory usage: 224.0+ bytes
None
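From the above output, the memory consumed by the entire dataframe is 224 bytes (128 bytes for the index plus 24 bytes for each of the 4 columns). The + sign means the real usage may be higher, because the contents of the object columns are not measured. The final None appears because info() prints its report itself and returns None, so wrapping it in print() is not required.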
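If the report is needed as text rather than printed output, info() can write into a buffer through its buf parameter, and memory_usage='deep' asks for the fuller memory estimate. The following is a minimal sketch; the variable names buffer, result and report are chosen here for illustration.

```python
import io
import pandas as pd

building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}
dataframe_object = pd.DataFrame(building_data)

# write the report into an in-memory text buffer instead of the console
buffer = io.StringIO()
result = dataframe_object.info(buf=buffer, memory_usage='deep')
report = buffer.getvalue()

print(result)   # None, because info() does not return the report

# show only the line of the report that states the memory usage
for line in report.splitlines():
    if line.startswith('memory usage'):
        print(line)
```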
Conclusion
In this tutorial, we discussed how to get the size of a dataframe in memory using the memory_usage() and info() methods. In order to get the total number of rows and columns, we used shape, size, len() and axes.
About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University.
Published 1400+ Technical Articles on Python, R, Swift, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date :
Feb 15,2022