The column labels to use for the DataFrame. By default, if columns is not passed and data provides no column labels, then integer indices will be used.

4. dtype | dtype | optional

The data type to use for the DataFrame if possible. Only one type is allowed, and no error is thrown if type conversion is unsuccessful. By default, dtype=None, that is, the data type is inferred.

5. copy | boolean | optional

This parameter is only relevant if data is a DataFrame or a 2D ndarray.

If True, then a new DataFrame is returned. Modifying this returned DataFrame will not affect data, and vice versa.
If False, then modifying the returned DataFrame will also mutate the original data, and vice versa.

By default, copy=False.
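The difference can be sketched as follows, assuming a 2D NumPy array as data. Only the copy=True result is stated with confidence. Whether copy=False actually shares memory may depend on the pandas version and its settings, so that value is printed rather than assumed.

```python
import numpy as np
import pandas as pd

arr = np.array([[3, 4], [5, 6]])

# copy=True: the DataFrame holds its own copy of the data
df_copy = pd.DataFrame(arr, copy=True)
arr[0, 0] = 99                    # mutate the original array
print(df_copy.iloc[0, 0])         # still 3

# copy=False: the DataFrame may reuse the memory of the array
arr2 = np.array([[3, 4], [5, 6]])
df_view = pd.DataFrame(arr2, copy=False)
arr2[0, 0] = 99
print(df_view.iloc[0, 0])         # 99 if memory is shared, 3 otherwise
```

If independence from data matters, passing copy=True explicitly is the safer choice.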

Return value

A DataFrame object.

Examples

Using a dictionary of arrays

To create a DataFrame using a dictionary of arrays:

    import pandas as pd

    df = pd.DataFrame({"A":[3,4], "B":[5,6]})
    df

       A  B
    0  3  5
    1  4  6

Here, each key-value pair of the dictionary is interpreted as follows:

key: column label
value: values of that column

Also, since the data does not contain any index (i.e. row labels), the default integer indices are used.

Using a nested dictionary

To create a DataFrame using a nested dictionary:

    col_one = {"a":3,"b":4}
    col_two = {"a":5,"b":6}
    df = pd.DataFrame({"A":col_one, "B":col_two})
    df

       A  B
    a  3  5
    b  4  6

Here, we've specified the index in col_one and col_two.
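Because the inner keys act as row labels, the inner dictionaries do not need to share the same keys. As a small sketch (the mismatched keys here are chosen purely for illustration), rows are aligned by label and any missing entry becomes NaN:

```python
import pandas as pd

col_one = {"a": 3, "b": 4}
col_two = {"b": 5, "c": 6}   # no "a", but an extra "c"

df = pd.DataFrame({"A": col_one, "B": col_two})
print(df)

# Row labels are the union of the inner keys
print(sorted(df.index))      # ['a', 'b', 'c']
```

Since NaN is a float, columns with missing entries end up as floats rather than integers.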

Using a Series

To create a DataFrame using a Series:

    s_one = pd.Series([3,4], index=["a","b"])
    s_two = pd.Series([5,6], index=["a","b"])
    df = pd.DataFrame({"A":s_one, "B":s_two})
    df

       A  B
    a  3  5
    b  4  6

Using a 2D array

We can pass in a 2D list or 2D NumPy array like so:

    df = pd.DataFrame([[3,4],[5,6]])
    df

       0  1
    0  3  4
    1  5  6

Notice how the default row and column labels are integer indices.
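The example above uses a 2D list. For completeness, here is the same construction sketched with a 2D NumPy array, which should give the same shape and default integer labels:

```python
import numpy as np
import pandas as pd

arr = np.array([[3, 4], [5, 6]])
df = pd.DataFrame(arr)

print(df)
print(df.shape)           # (2, 2)
print(list(df.columns))   # [0, 1]
```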

Using a constant

To initialise a DataFrame using a single constant, we need to specify parameters columns and index so as to define the shape of the DataFrame:

    pd.DataFrame(2, index=["a","b"], columns=["A","B","C"])

       A  B  C
    a  2  2  2
    b  2  2  2

Specifying column labels and index

To explicitly set the column labels and index (i.e. row labels):

    df = pd.DataFrame([[3,4],[5,6]], columns=["A","B"], index=["a","b"])
    df

       A  B
    a  3  4
    b  5  6
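When data is a dictionary, columns behaves a little differently from the list-of-lists case above. Rather than naming the columns, it selects and orders the dictionary keys. A small sketch, where the label "C" is deliberately absent from the data:

```python
import pandas as pd

data = {"A": [3, 4], "B": [5, 6]}

# "B" and "A" are picked from the dictionary in the requested order;
# "C" has no matching key, so its column is filled with NaN
df = pd.DataFrame(data, columns=["B", "A", "C"], index=["a", "b"])
print(df)
print(list(df.columns))   # ['B', 'A', 'C']
```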

Specifying dtype

To set a preference for the type of all columns:

    df = pd.DataFrame([["3",4],["5",6]], dtype=float)
    df

         0    1
    0  3.0  4.0
    1  5.0  6.0

Notice how "3" was cast to a float.

Note that no error will be thrown even if the type conversion is unsuccessful. For instance:

    df = pd.DataFrame([["3@@@",4],["5",6]], dtype=float)
    df

          0    1
    0  3@@@  4.0
    1     5  6.0

Here, the dtypes of the columns are as follows:

    df.dtypes

    0     object
    1    float64
    dtype: object
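Whether a failed conversion is silently ignored or raises an error may depend on the pandas version installed, so it is worth checking rather than assuming. A defensive sketch that handles either outcome (the fallback to type inference is just one possible choice):

```python
import pandas as pd

data = [["3@@@", 4], ["5", 6]]

try:
    df = pd.DataFrame(data, dtype=float)
    outcome = "no error raised"
except (ValueError, TypeError) as err:
    # Conversion failed loudly: fall back to letting pandas infer the types
    df = pd.DataFrame(data)
    outcome = f"conversion raised: {err}"

print(outcome)
print(df.dtypes)
```

In both cases the first column keeps the unconvertible string "3@@@" and is therefore stored as strings rather than floats.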
