search
Search
Login
Math ML Join our weekly DS/ML newsletter
menu
menu search toc more_vert
Robocat
Guest 0reps
Thanks for the thanks!
close
Outline
chevron_left Creating DataFrames Cookbook
Combining multiple Series into a DataFrameCombining multiple Series to form a DataFrameConverting a Series to a DataFrameConverting list of lists into DataFrameConverting list to DataFrameConverting percent string into a numeric for read_csvConverting scikit-learn dataset to Pandas DataFrameConverting string data into a DataFrameCreating a DataFrame from a stringCreating a DataFrame using listsCreating a DataFrame with different type for each columnCreating a DataFrame with empty valuesCreating a DataFrame with missing valuesCreating a DataFrame with random numbersCreating a DataFrame with zerosCreating a MultiIndex DataFrameCreating a Pandas DataFrameCreating a single DataFrame from multiple filesCreating empty DataFrame with only column labelsFilling missing values when using read_csvImporting DatasetImporting tables from PostgreSQL as Pandas DataFramesInitialising a DataFrame using a constantInitialising a DataFrame using a dictionaryInitialising a DataFrame using a list of dictionariesInserting lists into a DataFrame cellKeeping leading zeroes when using read_csvParsing dates when using read_csvPreventing strings from getting parsed as NaN for read_csvReading data from GitHubReading file without headerReading large CSV files in chunksReading n random lines using read_csvReading space-delimited filesReading specific columns from fileReading tab-delimited filesReading the first few lines of a file to create DataFrameReading the last n lines of a fileReading URL using read_csvReading zipped csv file as a DataFrameRemoving Unnamed:0 columnResolving ParserError: Error tokenizing dataSaving DataFrame as zipped csvSkipping rows without skipping header for read_csvSpecifying data type for read_csvTreating missing values as empty strings rather than NaN for read_csv
Comments
Log in or sign up
Cancel
Post
account_circle
Profile
exit_to_app
Sign out
help Ask a question
Share on Twitter
search
keyboard_voice
close
Searching Tips
Search for a recipe:
"Creating a table in MySQL"
Search for an API documentation: "@append"
Search for code: "!dataframe"
Apply a tag filter: "#python"
Useful Shortcuts
/ to open search panel
Esc to close search panel
to navigate between search results
d to clear all current filters
Enter to expand content preview
icon_star
Doc Search
icon_star
Code Search Beta
SORRY NOTHING FOUND!
mic
Start speaking...
Voice search is only supported in Safari and Chrome.
Navigate to
A
A
brightness_medium
share
arrow_backShare
Twitter
Facebook
chevron_left Creating DataFrames Cookbook
Combining multiple Series into a DataFrameCombining multiple Series to form a DataFrameConverting a Series to a DataFrameConverting list of lists into DataFrameConverting list to DataFrameConverting percent string into a numeric for read_csvConverting scikit-learn dataset to Pandas DataFrameConverting string data into a DataFrameCreating a DataFrame from a stringCreating a DataFrame using listsCreating a DataFrame with different type for each columnCreating a DataFrame with empty valuesCreating a DataFrame with missing valuesCreating a DataFrame with random numbersCreating a DataFrame with zerosCreating a MultiIndex DataFrameCreating a Pandas DataFrameCreating a single DataFrame from multiple filesCreating empty DataFrame with only column labelsFilling missing values when using read_csvImporting DatasetImporting tables from PostgreSQL as Pandas DataFramesInitialising a DataFrame using a constantInitialising a DataFrame using a dictionaryInitialising a DataFrame using a list of dictionariesInserting lists into a DataFrame cellKeeping leading zeroes when using read_csvParsing dates when using read_csvPreventing strings from getting parsed as NaN for read_csvReading data from GitHubReading file without headerReading large CSV files in chunksReading n random lines using read_csvReading space-delimited filesReading specific columns from fileReading tab-delimited filesReading the first few lines of a file to create DataFrameReading the last n lines of a fileReading URL using read_csvReading zipped csv file as a DataFrameRemoving Unnamed:0 columnResolving ParserError: Error tokenizing dataSaving DataFrame as zipped csvSkipping rows without skipping header for read_csvSpecifying data type for read_csvTreating missing values as empty strings rather than NaN for read_csv
check_circle
Mark as learned
thumb_up
0
thumb_down
0
chat_bubble_outline
0
auto_stories new
settings

Reading large CSV files in chunks in Pandas

Pandas
chevron_right
Cookbooks
chevron_right
DataFrame Cookbooks
chevron_right
Creating DataFrames Cookbook
schedule Jul 1, 2022
Last updated
local_offer PythonPandas
Tags
tocTable of Contents
expand_more

To read large CSV files in chunks in Pandas, use the read_csv(~) method and specify the chunksize parameter. This is particularly useful if you are facing a MemoryError when trying to read in the whole DataFrame at once.

Example

Consider the following sample.txt file:

A,B
1,2
3,4
5,6
7,8
9,10

To read this file in chunks of two rows, set chunksize like so:

for chunk in pd.read_csv("sample.txt", chunksize=2):
print(chunk)
print("-----")
A B
0 1 2
1 3 4
-----
A B
2 5 6
3 7 8
-----
A B
4 9 10
-----

Each chunk is a DataFrame, allowing you to work with the dataset piece by piece if you do not need the whole dataset in memory at one time.

mail
Join our newsletter for updates on new DS/ML comprehensive guides (spam-free)
robocat
Published by Arthur Yanagisawa
Edited by 0 others
Did you find this page useful?
thumb_up
thumb_down
Ask a question or leave a feedback...
thumb_up
0
thumb_down
0
chat_bubble_outline
0
settings
Enjoy our search
Hit / to insta-search docs and recipes!