The label of the columns to include. All other columns will be excluded from the resulting csv. By default, all columns are included in the resulting csv.

6. headerlink | boolean or list of string | optional

Whether or not to include the column labels in the csv. If a list of strings is passed, then those strings will overwrite the existing column labels. By default, header=True.

7. index | boolean | optional

Whether or not to include row labels in the csv. By default, index=True.

8. index_labellink | string or sequence or False or None | optional

The column labels to use. By default, index_label=None.

9. mode | string | optional

The mode to open the file. By default, mode="w", which stands for write mode.

10. encoding | string | optional

The encoding to use when writing to a file. By default, encoding="utf-8".

11. compression | string or dict | optional

The compression algorithm to use. The allowed values are as follows:


        
        
            
                
                
                    "infer" "gzip" "bz2" "zip" "xz"

By default, compression="infer", which means that if a path is supplied for path_or_buf, then the compression algorithm will be inferred from the extension you appended. For instance, if path_or_buf is "my_data.zip", then the "zip" compression will be used. If an extension is not supplied, then no compression will take place.

12. quoting | optional

By default, quoting=csv.QUOTE_MINIMAL.

13. quotecharlink | string of length one | optional

When a value happens to contain a delimiter, then the value will erroneously be divided up. You can wrap your value using quotechar to prevent such unwanted splits from happening. By default, '"'.

14. line_terminatorlink | string | optional

The character used to indicate a line break. Defaults to os.linesep.

15. chunksize | int or None | optional

The number of rows to write at one time. If your DataFrame is large, using a large chunksize (e.g. 10000) may lead to better performance. By default, chunksize=None.

16. date_format | string | optional

The format string for datetime objects.

17. doublequotelink | boolean | optional

Whether or not to parse quotechar. By default, doublequote=True.

18. escapecharlink | string | optional

The character to escape the double quotation marks. By default, escapechar=None.

19. decimal | string | optional

The character denoting a decimal point. By default, decimal=".". This parameter exists since some European countries like France use a , to denote a decimal point instead.

Return Value

If path_or_buf is specified, then None is returned. Otherwise, a string is returned.

Examples

Writing csv to a file

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":[3,4], "B":[5,6]}, index=["a","b"])
df
                
            
               A  B
a  3  5
b  4  6

To save our df as a CSV file:


        
        
            
                
                
                    df.to_csv(path_or_buf="my_data")

The outputted file my_data, which is located in the same directory as the Python script, is as follows:

Writing csv as a string

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":[3,4], "B":[5,6]}, index=["a","b"])
df
                
            
               A  B
a  3  5
b  4  6

To save our df as a csv string, don't specify path_or_buf:


        
        
            
                
                
                    df.to_csv()
                
            
            ',A,B\na,3,5\nb,4,6\n'

Note that printing this out will render the \n to take effect:


        
        
            
                
                
                    print(df.to_csv())
                
            
            ,A,B
a,3,5
b,4,6

Specifying float_format

Consider the following DataFrame that contains floating point numbers:


        
        
            
                
                
                    df = pd.DataFrame({"A":[3.00005,4],"B":[5,6]}, index=["a","b"])
df
                
            
               A        B
a  3.00005  5
b  4.00000  6

In order to format the floating point numbers (e.g. how many decimal places to include), we use the float_format parameter. The syntax follows that of Python's standard string formatter, which we cover here in detail.

As an example, to include up to 3 decimal places:


        
        
            
                
                
                    df.to_csv(float_format="%.3f")
                
            
            ',A,B\na,3.000,5\nb,4.000,6\n'

Specifying columns

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":[3,4], "B":[5,6]}, index=["a","b"])
df
                
            
               A  B
a  3  5
b  4  6

By default, all columns are included in the resulting csv. To include only specific columns, specify their column labels like so:


        
        
            
                
                
                    print(df.to_csv(columns=["A"]))
                
            
            ,A
a,3
b,4

Specifying header

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":[3,4],"B":[5,6]}, index=["a","b"])
df
                
            
               A  B
a  3  5
b  4  6

By default, header=True, which means that the header is include in the resulting csv:


        
        
            
                
                
                    print(df.to_csv())   # header=True
                
            
            ,A,B
a,3,5
b,4,6

To exclude the headers, set header=False like so:


        
        
            
                
                
                    print(df.to_csv(header=False))
                
            
            a,3,5
b,4,6

We can also pass a list of new column labels like so:


        
        
            
                
                
                    print(df.to_csv(header=["C","D"]))
                
            
            ,C,D
a,3,5
b,4,6

Specifying index_label

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":[3,4],"B":[5,6]}, index=["a","b"])
df
                
            
               A  B
a  3  5
b  4  6

By default, index_label=None, which means that an empty index label will be included:


        
        
            
                
                
                    print(df.to_csv())
                
            
            ,A,B
a,3,5
b,4,6

Notice how we begin with a comma here - the index label is empty, but it is still included.

To remove the index names, set index_label=False like so:


        
        
            
                
                
                    print(df.to_csv(index_label=False))
                
            
            A,B
a,3,5
b,4,6

Some statistical software like R may find this format easier to parse.

Specifying quotechar

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":["3,9",4], "B":[5,6]}, index=["a","b"])
df
                
            
               A    B
a  3,9  5
b  4    6

Here, notice how one of our values is "3,9", which unfortunately contains the default separator ,. If we were to turn this into a csv, we would end up with 3,9,5 in the first row, which is incorrect since it indicates that we have 3 values in this row instead of 2.

In order to indicate that 3,9 is one single value, the to_csv(~) method wraps a quotation mark (") around it by default:


        
        
            
                
                
                    df.to_csv(sep=",")
                
            
            ',A,B\na,"3,9",5\nb,4,6\n'

Notice how we have "3,9" now.

Instead of ", we can specify a single character as the wrapper by passing in quotechar like so:


        
        
            
                
                
                    df.to_csv(quotechar="@")
                
            
            ',A,B\na,@3,9@,5\nb,4,6\n'

Specifying line_terminator

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":[3,4], "B":[5,6]}, index=["a","b"])
df
                
            
               A  B
a  3  5
b  4  6

By default, each row is split using a new line character (\n):


        
        
            
                
                
                    df.to_csv()
                
            
            ',A,B\na,3,5\nb,4,6\n'

We can use a custom character to split the rows by passing in line_terminator:


        
        
            
                
                
                    df.to_csv(line_terminator="@")
                
            
            ',A,B@a,3,5@b,4,6@'

Specifying doublequote and escapechar

Consider the following DataFrame:


        
        
            
                
                
                    df = pd.DataFrame({"A":['3"9',4],"B":[5,6]}, index=["a","b"])
df
                
            
               A    B
a  3"9  5
b  4    6

Notice how df contains a value that has a single double quotation mark ".

By default, doublequote=True, which means that the quotechar is duplicated like so:


        
        
            
                
                
                    print(df.to_csv())
                
            
            ,A,B
a,"3""9",5
b,4,6

Here, note the following:

originally the output was "3"9", where the outer "" is there to indicate that the value is a string.
the problem here is that, the value actually contains " so this results in syntax error as "3"9" is not a valid string.
to overcome this problem, the method escapes the middle " using the quotechar, which is (confusingly) " by default.
as a result, we end up with this peculiar-looking "3""9".

We can turn off this behaviour where strings are wrapped in "" by setting doublequote=False, and then providing the escapechar parameter:


        
        
            
                
                
                    print(df.to_csv(doublequote=False, escapechar="@"))
                
            
            ,A,B
a,3@"9,5
b,4,6

Here, the escapechar is needed since "39 alone is invalid syntax - " denotes a start of a string, and there is no closing ". The role of escapechar is to indicate that " is not at all related to a string.

Published by Isshin Inada

Edited by 0 others

Did you find this page useful?

thumb_up

thumb_down

Comment

Citation

Ask a question or leave a feedback...

Official Pandas Documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html

thumb_up

thumb_down

chat_bubble_outline

settings

Enjoy our search

Hit / to insta-search docs and recipes!