# Converting column type to float in Pandas DataFrame

Last updated: May 20, 2023

Tags: Python, Pandas
# Solution

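To convert the column type to `float` in a Pandas DataFrame, use either the Series' `astype()` method or Pandas' `to_numeric()` method.

We recommend `to_numeric()` since it is more flexible: it can downcast to a smaller float type and can handle values that cannot be converted.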

# Example - converting data type of a single column

Consider the following DataFrame:

```
import pandas as pd

df = pd.DataFrame({"A":["3","4"],"B":["5","6"]})
df
   A  B
0  3  5
1  4  6
```

Currently, the column types are as follows:

```
df.dtypes
A    object
B    object
dtype: object
```

## Using astype method

To convert column `A` into type `float`, use the Series' `astype()` method:

```
df["A"] = df["A"].astype("float")
df.dtypes
A    float64
B     object
dtype: object
```
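As a minimal, self-contained sketch of how `astype()` behaves, the following converts a column of valid numeric strings and then attempts the same on a column containing an invalid value. The DataFrame and the `b_converted` flag are illustrative assumptions, not part of the original example. It suggests why `to_numeric()` is the more flexible choice: `astype()` has no option for turning invalid values into `NaN`.

```python
import pandas as pd

# Illustrative data: column B contains a value that is not a valid number
df = pd.DataFrame({"A": ["3", "4"], "B": ["5#", "6"]})

# Column A holds only valid numbers, so astype() succeeds
df["A"] = df["A"].astype("float")
print(df["A"].dtype)

# Column B holds "5#", so astype() raises a ValueError
try:
    df["B"].astype("float")
    b_converted = True
except ValueError as err:
    b_converted = False
    print(f"Conversion failed: {err}")
```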

## Using to_numeric method

To convert column `A` into type `float32`, use Pandas' `to_numeric(~)` method with `downcast="float"`:

```
df["A"] = pd.to_numeric(df["A"], downcast="float")
df.dtypes
A    float32
B     object
dtype: object
```
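To illustrate the effect of the `downcast` argument, here is a small sketch that converts the same Series with and without it. The Series `s` and the variable names `default` and `compact` are assumptions made for this example.

```python
import pandas as pd

# Illustrative Series of numeric strings
s = pd.Series(["3.5", "4.25"])

# Without downcast, decimal values are parsed as float64
default = pd.to_numeric(s)

# With downcast="float", the smallest suitable float type is chosen (float32 at minimum)
compact = pd.to_numeric(s, downcast="float")

print(default.dtype, compact.dtype)
```

A `float32` column uses half the memory of a `float64` column, at the cost of reduced precision.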

### Case when conversion is not possible

Consider the following DataFrame:

```
df = pd.DataFrame({"A":["3#","4"]})
df
    A
0  3#
1   4
```

Here, the value `"3#"` cannot be converted into a numeric type. By default, the `to_numeric(~)` method raises an error in such cases:

```
df["A"] = pd.to_numeric(df["A"])
ValueError: Unable to parse string "3#" at position 0
```

We can map values that cannot be converted into `NaN` instead:

```
df["A"] = pd.to_numeric(df["A"], errors='coerce')
df.dtypes
A    float64
dtype: object
```

Note that, among the standard numeric types, only `float` can hold `NaN`. This is why the column ends up as `float64` even though the remaining value `4` is a whole number.
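Since `errors='coerce'` silently replaces invalid values, it can be useful to check which original values were lost. The following is one possible sketch, where the names `converted` and `failed` are assumptions made for this example:

```python
import pandas as pd

df = pd.DataFrame({"A": ["3#", "4"]})

# Invalid values become NaN instead of raising an error
converted = pd.to_numeric(df["A"], errors="coerce")

# Values that were present originally but are NaN after conversion
failed = df.loc[converted.isna() & df["A"].notna(), "A"]

print(converted.dtype)
print(failed.tolist())
```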

# Example - converting data type of multiple columns to float

To convert the data type of multiple columns to float, use the DataFrame's `apply(~)` method together with Pandas' `to_numeric(~)`.

## Case when conversion is possible

Consider the following DataFrame:

```
df = pd.DataFrame({"A":["3","4"],"B":["5","6"]})
df
   A  B
0  3  5
1  4  6
```

Currently, the column types are as follows:

```
df.dtypes
A    object
B    object
dtype: object
```

To convert the type of all the columns, use the DataFrame's `apply(~)` method:

```
df = df.apply(pd.to_numeric)
df.dtypes
A    int64
B    int64
dtype: object
```

Here, we are applying Pandas' `to_numeric(~)` method to each column of the DataFrame in turn. The `to_numeric(~)` method takes as argument a single column (Series) and converts its type to numeric (e.g. `int` or `float`). Since every value here is a whole number, the resulting type is `int64` rather than `float`.
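When the columns must end up as `float` regardless of whether the values are whole numbers, one option is to follow the conversion with an explicit cast. This is a sketch under that assumption; the names `as_numeric` and `as_float` are introduced only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["3", "4"], "B": ["5", "6"]})

# Whole numbers are inferred as int64 by to_numeric
as_numeric = df.apply(pd.to_numeric)

# Explicitly cast every column to float64
as_float = as_numeric.astype("float")

print(as_numeric.dtypes)
print(as_float.dtypes)
```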

## Case when conversion is not possible

Consider the following DataFrame:

```
df = pd.DataFrame({"A":["3","4"],"B":["5#","6"]})
df
   A   B
0  3  5#
1  4   6
```

Here, column `B` cannot be converted into a numeric type since `5#` is not a valid number. Applying the `to_numeric(~)` method without additional arguments results in an error:

```
df = df.apply(pd.to_numeric)
ValueError: Unable to parse string "5#" at position 0
```

### Ignoring unsuccessful columns

Instead of raising an error, we can pass `errors='ignore'` to `to_numeric()` so that columns where the conversion is not possible are left unchanged:

```
df = df.apply(pd.to_numeric, errors='ignore', downcast='float')
df.dtypes
A    float32
B     object
dtype: object
```
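Note that pandas 2.2 deprecated `errors='ignore'` in `to_numeric()` in favor of catching the exception explicitly. A minimal sketch of that alternative follows; the helper `to_numeric_or_keep` is a hypothetical name introduced here, not a pandas function.

```python
import pandas as pd


def to_numeric_or_keep(column):
    """Hypothetical helper: convert a column to numeric, or return it unchanged on failure."""
    try:
        return pd.to_numeric(column, downcast="float")
    except (ValueError, TypeError):
        # Conversion is not possible for this column, so leave it as-is
        return column


df = pd.DataFrame({"A": ["3", "4"], "B": ["5#", "6"]})
df = df.apply(to_numeric_or_keep)

print(df.dtypes)
```

Column `A` is converted while column `B` keeps its original string values.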

### Replacing unsuccessful values with NaN

To replace values that cannot be converted with `NaN`, pass `errors='coerce'`:

```
df = df.apply(pd.to_numeric, errors='coerce', downcast='float')
df
     A    B
0  3.0  NaN
1  4.0  6.0
```

Here, the value `"5#"` could not be converted into a numeric type and therefore we end up with a `NaN` instead. The converted data types are as follows:

```
df.dtypes
A    float32
B    float32
dtype: object
```
