# Converting K and M to numerical form in Pandas DataFrame

Last updated: Jul 1, 2022

Tags: Python, Pandas

Consider the following DataFrame:

```
import pandas as pd

df = pd.DataFrame({"A":["20K","2.5K","30M","3.5M","500"]})
df
      A
0   20K
1  2.5K
2   30M
3  3.5M
4   500
```

Here, every value in column `A` is a string.

# Solution

To convert `"K"` (thousand) and `"M"` (million) to numerical form:

```
df["A"].replace({"K":"*1e3", "M":"*1e6"}, regex=True).map(pd.eval).astype(int)
0       20000
1        2500
2    30000000
3     3500000
4         500
Name: A, dtype: int64
```

# Explanation

We first use the `replace(~)` method to replace `K` and `M` with `*1e3` and `*1e6`, respectively:

```
df["A"].replace({"K":"*1e3", "M":"*1e6"}, regex=True)
0     20*1e3
1    2.5*1e3
2     30*1e6
3    3.5*1e6
4        500
Name: A, dtype: object
```

Note the following:

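• `regex=True` is needed because, by default, `replace(~)` only replaces values that match a key in full (e.g. a cell that is exactly `"K"`). With `regex=True`, each key is treated as a regular expression, so a `K` that appears inside a longer string such as `"20K"` is replaced by `"*1e3"`.

• `1e3` is scientific notation for `1000`, and `1e6` for `1000000`.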
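The effect of `regex=True` can be seen by running both variants on a small Series. This is a minimal sketch; the Series `s` and the variable names are made up for illustration.

```python
import pandas as pd

s = pd.Series(["20K", "K", "500"])

# Default (regex=False): only a cell whose entire value equals "K" is replaced.
whole_match = s.replace({"K": "*1e3"})

# regex=True: "K" is treated as a pattern and replaced wherever it occurs.
substring_match = s.replace({"K": "*1e3"}, regex=True)

print(whole_match.tolist())      # ['20K', '*1e3', '500']
print(substring_match.tolist())  # ['20*1e3', '*1e3', '500']
```

Without `regex=True`, `"20K"` is left untouched, which is why the flag matters for this recipe.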

Next, we mathematically evaluate each value using `map(pd.eval)`:

```
df["A"].replace({"K":"*1e3", "M":"*1e6"}, regex=True).map(pd.eval)
0       20000.0
1        2500.0
2    30000000.0
3     3500000.0
4         500.0
Name: A, dtype: float64
```

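Here, the Series' `map(~)` method applies the `pd.eval(~)` function to each value, so a string such as `"20*1e3"` is evaluated as an arithmetic expression. The resulting Series is of type `float64` because `1e3` and `1e6` are floats.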
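To see what `pd.eval(~)` does to a single value, it can be called directly on an expression string. This is a small sketch with illustrative variable names; the `float(~)` call is only there to make the printed form predictable. Since `pd.eval(~)` evaluates whatever expression it is given, it is best used on data you trust.

```python
import pandas as pd

# pd.eval parses the string as an arithmetic expression and returns the result.
thousand = pd.eval("2.5*1e3")
million = pd.eval("30*1e6")

print(float(thousand))  # 2500.0
print(float(million))   # 30000000.0
```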

Finally, we convert all the values to integers using `astype(int)`:

```
df["A"].replace({"K":"*1e3", "M":"*1e6"}, regex=True).map(pd.eval).astype(int)
0       20000
1        2500
2    30000000
3     3500000
4         500
Name: A, dtype: int64
```
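The three steps can also be wrapped in a small helper and written back to the DataFrame. The following is a sketch only: the function name `suffix_to_int` and the column name `A_num` are hypothetical, and the `astype(float).round()` step is an added precaution that is not part of the recipe above.

```python
import pandas as pd

def suffix_to_int(series: pd.Series) -> pd.Series:
    """Hypothetical helper: convert strings such as "2.5K" or "30M" to integers."""
    # Step 1: turn the suffixes into multiplication expressions, e.g. "2.5K" -> "2.5*1e3".
    expressions = series.replace({"K": "*1e3", "M": "*1e6"}, regex=True)
    # Step 2: evaluate each expression string.
    values = expressions.map(pd.eval)
    # Step 3: astype(int) truncates toward zero, so round first (an assumption
    # added here) in case floating-point error leaves a value just below a whole number.
    return values.astype(float).round().astype(int)

df = pd.DataFrame({"A": ["20K", "2.5K", "30M", "3.5M", "500"]})
df["A_num"] = suffix_to_int(df["A"])
print(df["A_num"].tolist())  # [20000, 2500, 30000000, 3500000, 500]
```

Keeping the original column and writing the result to a new one makes it easy to compare the strings against the converted numbers.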