# Pandas DataFrame | quantile method

Aug 10, 2023

Pandas `DataFrame.quantile(~)` method returns the interpolated value at the specified quantile.

# Parameters

1. `q` | `float` or `array-like` of `float` | `optional`

The desired quantile(s) to compute, each of which must be between 0 and 1 (both inclusive). By default, `q=0.5`, that is, the value at the 50th percentile is computed.

2. `axis` | `None` or `int` or `string` | `optional`

Whether to compute the quantile row-wise or column-wise:

| Axis | Description |
|---|---|
| `0` or `"index"` | Compute the quantile for each column. |
| `1` or `"columns"` | Compute the quantile for each row. |

By default, `axis=0`.

3. `numeric_only` | `boolean` | `optional`

Whether or not to compute the quantiles only for rows/columns of numeric type. If set to `False`, then quantiles of rows/columns with `datetime` and `timedelta` will also be computed. By default, `numeric_only=True` in pandas versions before 2.0; from pandas 2.0 onwards, the default is `False`.
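As a rough illustration of `numeric_only`, the sketch below passes the flag explicitly so that the result does not depend on the installed version's default. The DataFrame `df_mixed` and its column names are made up for this illustration.

```python
import pandas as pd

# Hypothetical mixed-type DataFrame, used only for this illustration
df_mixed = pd.DataFrame({
    "A": [1, 2],
    "B": [pd.Timestamp("2010"), pd.Timestamp("2011")],
    "C": [pd.Timedelta("1 days"), pd.Timedelta("2 days")],
})

# Only the numeric column A is considered
numeric_result = df_mixed.quantile(0.5, numeric_only=True)

# The datetime and timedelta columns are included as well
all_result = df_mixed.quantile(0.5, numeric_only=False)

print(numeric_result)
print(all_result)
```

Passing the flag explicitly is a reasonable habit when the same code has to run on pandas versions both before and after 2.0.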

4. `interpolation` | `string` | `optional`

How the values are interpolated when the given percentile sits between two data-points, say `i` and `j` where `i<j`:

| Value | Description |
|---|---|
| `"linear"` | Standard linear interpolation |
| `"lower"` | Returns `i` |
| `"higher"` | Returns `j` |
| `"nearest"` | Returns `i` or `j`, whichever is nearest |
| `"midpoint"` | Returns `(i+j)/2` |

By default, `interpolation="linear"`.
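To make the default `"linear"` option more concrete, here is a minimal sketch of the arithmetic. The helper `linear_quantile` is hypothetical (it is not part of pandas), and it assumes the quantile sits at fractional position `(n - 1) * q` in the sorted data.

```python
import math
import pandas as pd

def linear_quantile(values, q):
    """Hypothetical helper: linear interpolation between the two neighbours."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q      # assumed fractional position in the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    i, j = ordered[lo], ordered[hi]   # the two surrounding data-points, i <= j
    return i + (j - i) * (pos - lo)   # move from i towards j by the fractional part

data = [2, 4, 6, 8]
manual = linear_quantile(data, 0.75)
from_pandas = pd.DataFrame({"A": data}).quantile(0.75)["A"]

print(manual, from_pandas)
```

For `q=0.75` the position is `2.25`, so the result lies a quarter of the way from 6 to 8.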

# Return Value

If `q` is a scalar, then a `Series` is returned. Otherwise, a `DataFrame` is returned.
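A quick way to check the return type is to compare a scalar `q` with a one-element list. The variable names below are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

scalar_result = df.quantile(0.5)    # scalar q
list_result = df.quantile([0.5])    # array-like q, even with a single element

print(type(scalar_result).__name__)
print(type(list_result).__name__)
```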

# Examples

Consider the following DataFrame:

```
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})
df
   A  B
0  2  5
1  4  6
2  6  7
3  8  8
```

## Computing percentile column-wise

To compute the 50th percentile of each column:

```
df.quantile()   # q=0.5
A    5.0
B    6.5
Name: 0.5, dtype: float64
```

Here, the return type is `Series`. To interpret the output, half of the values in column `A` (2 and 4) are smaller than `5.0`.
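This interpretation can be checked directly. The sketch below rebuilds the same data, measures the share of values in column `A` that fall below the computed quantile, and compares the 50th percentile with the median. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

q50 = df.quantile()                          # q=0.5 by default
share_below = (df["A"] < q50["A"]).mean()    # fraction of column A below the quantile
matches_median = bool((q50 == df.median()).all())

print(share_below)
print(matches_median)
```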

## Computing percentile row-wise

To compute the 30th percentile of each row:

```
df.quantile(q=0.3, axis=1)
0    2.9
1    4.6
2    6.3
3    8.0
Name: 0.3, dtype: float64
```
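One way to read the row-wise result is that each row is treated as its own small dataset. As a sketch, the same numbers can be obtained by applying `Series.quantile(~)` to every row. This is shown only for comparison, not as a recommended replacement.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

row_wise = df.quantile(q=0.3, axis=1)

# Equivalent per-row computation: the first row [2, 5] gives 2 + 0.3 * (5 - 2) = 2.9
via_apply = df.apply(lambda row: row.quantile(0.3), axis=1)

print(row_wise)
print(via_apply)
```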

## Computing multiple percentiles

To get the values at the 50th and 75th percentiles for each column:

```
df.quantile([0.5, 0.75])   # returns a DataFrame
        A     B
0.50  5.0  6.50
0.75  6.5  7.25
```
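The returned DataFrame is indexed by the requested quantiles, so individual values can be looked up with `.loc`. A small sketch, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

result = df.quantile([0.5, 0.75])

a_75 = result.loc[0.75, "A"]   # 75th percentile of column A
b_50 = result.loc[0.5, "B"]    # 50th percentile of column B

print(a_75, b_50)
```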

## Changing interpolation methods

Consider the following DataFrame:

```
df = pd.DataFrame({"A": [2, 4, 6, 8]})
df
   A
0  2
1  4
2  6
3  8
```

### linear

Consider the case when the value corresponding to the specified quantile does not exist:

```
df.quantile(0.5)   # interpolation="linear"
A    5.0
Name: 0.5, dtype: float64
```

Here, since no value in column `A` sits exactly at the 50th percentile, the value was linearly interpolated between 4 and 6.

### lower

```
df.quantile(0.5, interpolation="lower")
A    4
Name: 0.5, dtype: int64
```

Again, since no value sits exactly at the 50% quantile, we need to perform interpolation. We know it lies between the values 4 and 6. By passing in `"lower"`, we select the lower value, that is, 4 in this case.

### higher

```
df.quantile(0.5, interpolation="higher")
A    6
Name: 0.5, dtype: int64
```

The same logic as `"lower"` applies, but we take the upper value, that is, 6.

Here's the same `df` for your reference:

```
df
   A
0  2
1  4
2  6
3  8
```

### nearest

```
df.quantile(0.5, interpolation="nearest")
A    6
Name: 0.5, dtype: int64
```

By passing in `"nearest"`, instead of always selecting the lower or upper value, we take whichever is nearest. In this case, the linearly interpolated 50% quantile is 5, which happens to lie exactly halfway between 4 and 6. For this tie, the upper value, 6, is returned.

### midpoint

```
df.quantile(0.5, interpolation="midpoint")
A    5.0
Name: 0.5, dtype: float64
```

Here, we just take the midpoint of the lower and upper values, so `(4+6)/2=5`.
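To compare the five options side by side without running into the halfway tie, the sketch below uses `q=0.4` on the same column. This choice of quantile is an assumption made for illustration: the position then falls closer to 4 than to 6, so `"nearest"` has a clear winner.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8]})

methods = ["linear", "lower", "higher", "nearest", "midpoint"]

# One quantile per interpolation option, collected into a plain dict of floats
results = {m: float(df.quantile(0.4, interpolation=m)["A"]) for m in methods}

for name, value in results.items():
    print(f"{name:>8}: {value}")
```

Here the 40% quantile sits between 4 and 6, so `"lower"` and `"nearest"` both give 4, `"higher"` gives 6, `"midpoint"` gives 5, and `"linear"` gives a value of about 4.4.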