
How to get size of Pandas DataFrame? | Get the number of rows and columns in dataframe


In this pandas tutorial, we will discuss:

  • what is a dataframe in pandas?
  • how to get the size of pandas dataframe?
  • how to get number of rows and columns in pandas dataframe?
  • get size in memory of pandas dataframe

Here we look at how much memory the data in a DataFrame consumes and how to get the number of rows and columns in a pandas dataframe.

 

There are several ways to get the size of a dataframe in pandas. We will discuss them one by one.

 

Before going into the details, let us see what a DataFrame is.

 

What is a dataframe in pandas?

A DataFrame is a data structure that stores data in rows and columns. In other words, it is a two-dimensional data structure.

In pandas, we can create a dataframe by calling DataFrame(). Since this data structure is provided by pandas, we have to import the pandas module first.

 

In Python, the import keyword is used to import a module.

Syntax:

import module_name as alias

where,

1.      module_name is the name of the module which we want to import. Here it is pandas

2.      alias is the module nickname which we assign to the module_name

 

import pandas as pd

We are aliasing pandas as pd.

 

pandas.DataFrame() is the constructor used to create a dataframe.

Syntax:

pandas.DataFrame(data, index=index, columns=columns)

where,

1.      data is the input list/dictionary that holds the values for the dataframe

2.      columns is used to assign the column labels in the dataframe

3.      index is used to assign the index/row labels

 

Note: Here we are creating a dataframe from a dictionary, so there is no need to pass columns as a separate parameter. Each dictionary key becomes a column name, and the list stored under that key becomes the values of that column.
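As a minimal sketch of the columns and index parameters described above, the snippet below builds a dataframe from a list of rows instead of a dictionary. The row values and the row labels r1 and r2 are made up for illustration.

```python
import pandas as pd

# hypothetical rows: each inner list is one row of the dataframe
rows = [['ba-001', 456700],
        ['ba-002', 12000]]

# columns assigns the column labels, index assigns the row labels
df = pd.DataFrame(rows, columns=['block_no', 'cost'], index=['r1', 'r2'])

print(df)
```

Passing index and columns by keyword avoids depending on the order of the positional parameters.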


Create pandas dataframe

Let’s create a sample DataFrame with 3 rows and 4 columns of building data.

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

# display the dataframe
print(dataframe_object)

Output:

  block_no              name    cost  sqaure-feet
0   ba-001  villas ohri daba  456700         5000
1   ba-002              CSD3   12000         1200
2   ba-003             fgdb1  450000         4564

The output shows the 3 rows and 4 columns of the dataframe.

We will use this dataframe in the rest of this tutorial.


Get the number of rows and columns in pandas dataframe

In this part, we will discuss how to get the number of rows and columns in a pandas dataframe.

 

Method 1 : Get the number of rows and columns in pandas dataframe using shape

shape is an attribute that returns a tuple containing two values.

Here,

  • First value represents the total number of rows in the DataFrame, and
  • Second value represents the total number of columns in the DataFrame.

Syntax:

dataframe_object.shape

where, dataframe_object is the input dataframe.

Parameters:

shape is an attribute, not a method, so it takes no parameters and is written without parentheses.

Example:

In this example, we use the shape attribute to get the number of rows and columns of the dataframe created above.

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

# get the shape
print(dataframe_object.shape)

Output:

(3, 4)

From the above output, we can read that the dataframe has 3 rows and 4 columns.
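Because shape is an ordinary tuple, it can be unpacked into two variables. The sketch below shows one way to do that; the names n_rows and n_cols are chosen only for illustration.

```python
import pandas as pd

# same building data as in the example above
building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}

dataframe_object = pd.DataFrame(building_data)

# unpack the (rows, columns) tuple into two variables
n_rows, n_cols = dataframe_object.shape

print("rows:", n_rows)
print("columns:", n_cols)
```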


Method 2 : Get size of dataframe in pandas using size

size is an attribute that returns the total number of values (rows multiplied by columns) in the DataFrame.

Syntax:

dataframe_object.size

where, dataframe_object is the input dataframe.

Parameters:

size is an attribute, not a method, so it takes no parameters and is written without parentheses.

Example:

In this example, we use the size attribute to return the number of values in the dataframe.

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

# get the size
print(dataframe_object.size)

Output:

12
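One detail worth checking is that size counts every cell, including cells that hold missing values. The sketch below uses a small made-up dataframe with two missing entries and compares size with count(), which counts only the non-null values of each column.

```python
import pandas as pd

# hypothetical data with one missing value in each column
df = pd.DataFrame({'cost': [456700, None, 450000],
                   'name': ['CSD3', 'fgdb1', None]})

# size is rows * columns, so missing cells are included
total_cells = df.size

# count() returns the non-null count per column; sum() adds them up
non_null_cells = df.count().sum()

print(total_cells)     # 6
print(non_null_cells)  # 4
```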

Method 3 : Get size of dataframe in pandas using len()

The built-in len() function returns the total number of rows when it is passed a DataFrame.

Syntax:

len(dataframe_object)

where, dataframe_object is the input dataframe.

Parameters:

It takes the dataframe object as input.

Example:

In this example, we use the len() function to get the number of rows in a dataframe.

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

# get the number of rows in a dataframe
print(len(dataframe_object))

Output:

3

If we want the number of columns in a dataframe, we first get the column labels with the columns attribute and then apply len() to it.

Syntax:

len(dataframe_object.columns)

Example:

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

# get the number of columns in a dataframe
print(len(dataframe_object.columns))

Output:

4

Method 4 : Get the number of rows and columns in pandas dataframe using len() with axes

axes is an attribute that returns a list holding the row labels (index) followed by the column labels.

It is used with len() to get the total number of rows and columns.

Syntax:

For rows - len(dataframe_object.axes[0])

For columns - len(dataframe_object.axes[1])

where, dataframe_object is the input dataframe.

Parameters:

axes is an attribute, not a method, so it takes no parameters and is written without parentheses.

Example:

In this example, we use the axes attribute to get the number of rows and columns in a pandas dataframe.

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

# get the number of rows in a dataframe
print(len(dataframe_object.axes[0]))

# get the number of columns in a dataframe
print(len(dataframe_object.axes[1]))

Output:

3
4

Here we get the number of rows and columns in pandas dataframe.
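As a quick cross-check, the sketch below collects the row and column counts from each approach covered so far and shows that they agree. The list names row_counts and col_counts are used only for illustration.

```python
import pandas as pd

# same building data as in the examples above
building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}

dataframe_object = pd.DataFrame(building_data)

# number of rows from shape, len(), the index and axes[0]
row_counts = [dataframe_object.shape[0],
              len(dataframe_object),
              len(dataframe_object.index),
              len(dataframe_object.axes[0])]

# number of columns from shape, the columns attribute and axes[1]
col_counts = [dataframe_object.shape[1],
              len(dataframe_object.columns),
              len(dataframe_object.axes[1])]

print(row_counts)  # [3, 3, 3, 3]
print(col_counts)  # [4, 4, 4]
```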


Get size in memory of pandas dataframe

Method 1 : Get size of dataframe in pandas using memory_usage

 

memory_usage() returns the memory consumed by each column of the dataframe, in bytes.

Syntax:

dataframe_object.memory_usage(index=True, deep=False)

where, dataframe_object is the input dataframe.

Parameters:

index is an optional Boolean parameter. If it is set to True, the memory consumed by the Index is included in the result. If it is set to False, it is left out. By default it is True. deep is also an optional Boolean parameter. If it is set to True, pandas inspects object columns and counts the memory of the values they refer to. By default it is False.

Example:

In this example, we get the memory occupied by the dataframe created above.

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

# get the memory in bytes
print(dataframe_object.memory_usage())

Output:

Index          128
block_no        24
name            24
cost            24
sqaure-feet     24
dtype: int64

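From the above output, each column occupies 24 bytes: 3 values of 8 bytes each. For the object columns block_no and name, this counts only the 8-byte references to the strings, not the strings themselves. The Index entry is the memory used by the row index, and its exact value can differ between pandas versions.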
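The sketch below compares the default measurement with deep=True, which also counts the contents of object columns, and adds up the per-column figures to get one total for the dataframe. Exact byte counts depend on the pandas and Python versions, so none are shown here.

```python
import pandas as pd

# same building data as in the example above
building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}

dataframe_object = pd.DataFrame(building_data)

# default measurement per column, leaving out the index
shallow = dataframe_object.memory_usage(index=False)

# deep measurement: also counts the values held by object columns
deep = dataframe_object.memory_usage(index=False, deep=True)

print(shallow)
print(deep)

# memory_usage() returns a Series, so sum() gives the total in bytes
total_bytes = dataframe_object.memory_usage(deep=True).sum()
print("total bytes:", total_bytes)
```

The deep figure is usually the more realistic one for dataframes that hold text, at the cost of a slower measurement.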

 

Method 2 : Get size in memory of pandas dataframe using info()

info() prints a summary of the dataframe that includes each column name with its data type, the count of non-null values, and the memory consumed by the dataframe.

Syntax:

dataframe_object.info()

Example:

In this example, we print the summary of the dataframe.

import pandas as pd

# create a dictionary with building data
building_data={'block_no':['ba-001','ba-002','ba-003'],
               'name':['villas ohri daba','CSD3','fgdb1'],
               'cost':[456700,12000,450000],
               'sqaure-feet':[5000,1200,4564]}
                              
# pass this  building_data to the DataFrame  
dataframe_object=pd.DataFrame(building_data)     

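# print the summary (info() prints the report itself and returns None,
# so it does not need to be wrapped in print())
dataframe_object.info()

Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 4 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   block_no     3 non-null      object
 1   name         3 non-null      object
 2   cost         3 non-null      int64 
 3   sqaure-feet  3 non-null      int64 
dtypes: int64(2), object(2)
memory usage: 224.0+ bytes

From the above output, the memory consumed by the entire dataframe is at least 224 bytes: 128 bytes for the index plus 24 bytes for each of the 4 columns. The + sign after the number marks it as a lower bound, because the contents of the strings in the object columns are not counted.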
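info() also accepts a memory_usage parameter. The sketch below passes memory_usage='deep' so that the reported figure includes the contents of object columns, and then captures the same report as a string through the buf parameter. The names buffer and report are chosen only for illustration.

```python
import io
import pandas as pd

# same building data as in the example above
building_data = {'block_no': ['ba-001', 'ba-002', 'ba-003'],
                 'name': ['villas ohri daba', 'CSD3', 'fgdb1'],
                 'cost': [456700, 12000, 450000],
                 'sqaure-feet': [5000, 1200, 4564]}

dataframe_object = pd.DataFrame(building_data)

# print the summary with a deep memory measurement
dataframe_object.info(memory_usage='deep')

# write the same report into a text buffer instead of printing it
buffer = io.StringIO()
dataframe_object.info(buf=buffer, memory_usage='deep')
report = buffer.getvalue()
```

Capturing the report is useful when the summary should go to a log file rather than to the console.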


Conclusion

In this tutorial, we discussed how to get the size of a dataframe in memory using the memory_usage() and info() methods. In order to get the total number of rows and columns, we used the shape and size attributes, the len() function, and the axes attribute.



About the Author
Gottumukkala Sravan Kumar 171FA07058
B.Tech (Hon's) - IT from Vignan's University. Published 800+ Technical Articles on Python, R, Java, C#, LISP, PHP - MySQL and Machine Learning
Published Date : Feb 15, 2022