# Pandas DataFrame | quantile method

Last updated: Aug 10, 2023
Pandas' `DataFrame.quantile(~)` method returns the value at the specified quantile of each column or row, interpolating between data-points where necessary.

# Parameters

1. `q` | `float` or `array-like` of `float` | `optional`

The desired quantile(s) to compute, each of which must be between 0 (inclusive) and 1 (inclusive). By default, `q=0.5`, that is, the value at the 50th percentile (the median) is computed.

2. `axis` | `None` or `int` or `string` | `optional`

Whether to compute the quantile row-wise or column-wise:

| Axis | Description |
|---|---|
| `0` or `"index"` | Compute the quantile for each column. |
| `1` or `"columns"` | Compute the quantile for each row. |

By default, `axis=0`.

3. `numeric_only` | `boolean` | `optional`

Whether to compute the quantiles only for rows/columns of numeric type. If set to `False`, then quantiles of rows/columns with `datetime` and `timedelta` values will also be computed. By default, `numeric_only=True` in pandas versions before 2.0; from pandas 2.0 onwards, the default is `False`.

4. `interpolation` | `string` | `optional`

How the values are interpolated when the given percentile sits between two data-points, say `i` and `j` where `i<j`:

| Value | Description |
|---|---|
| `"linear"` | Standard linear interpolation |
| `"lower"` | Returns `i` |
| `"higher"` | Returns `j` |
| `"midpoint"` | Returns `(i+j)/2` |
| `"nearest"` | Returns `i` or `j`, whichever is closer |

By default, `interpolation="linear"`.
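To illustrate the `numeric_only` parameter described above, here is a minimal sketch using a hypothetical DataFrame (`df_mixed`, not part of the examples below) with one numeric and one datetime column. The parameter is passed explicitly in both calls so that the result does not rely on the default:

```python
import pandas as pd

# Hypothetical data: one numeric column and one datetime column
df_mixed = pd.DataFrame({
    "A": [1, 2, 3],
    "D": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03"]),
})

# Only the numeric column A is included in the result
numeric = df_mixed.quantile(0.5, numeric_only=True)

# Both column A and the datetime column D are included
everything = df_mixed.quantile(0.5, numeric_only=False)

print(numeric)
print(everything)
```

With `numeric_only=False`, the median of column `D` is itself a timestamp, so the returned `Series` holds mixed value types.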

# Return Value

If `q` is a scalar, then a `Series` is returned. Otherwise, a `DataFrame` is returned.
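As a quick check of the return types, the following sketch (using a small illustrative DataFrame) passes a scalar and then a list to `q`:

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

scalar_result = df.quantile(0.5)    # q is a scalar
list_result = df.quantile([0.5])    # q is array-like, even with one element

print(type(scalar_result).__name__)
print(type(list_result).__name__)
```

Note that a one-element list still counts as array-like, so the second call returns a `DataFrame` with a single row.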

# Examples

Consider the following DataFrame:

```
import pandas as pd

df = pd.DataFrame({"A":[2,4,6,8],"B":[5,6,7,8]})
df
   A  B
0  2  5
1  4  6
2  6  7
3  8  8
```

## Computing percentile column-wise

To compute the 50th percentile of each column:

```
df.quantile()   # q=0.5
A    5.0
B    6.5
Name: 0.5, dtype: float64
```

Here, the return type is `Series`. To interpret the output, 50% of the values in column `A` are smaller than `5.0`.
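This interpretation can be checked directly. The sketch below rebuilds the same DataFrame and computes the share of values in column `A` that fall below the 50th percentile (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

median_a = df["A"].quantile(0.5)            # 5.0
share_below = (df["A"] < median_a).mean()   # fraction of values below 5.0

print(median_a, share_below)   # 5.0 0.5
```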

## Computing percentile row-wise

To compute the 30th percentile of each row:

```
df.quantile(q=0.3, axis=1)
0    2.9
1    4.6
2    6.3
3    8.0
Name: 0.3, dtype: float64
```
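To see where these numbers come from, take the first row, which holds the values 2 and 5: the 30th percentile lies 30% of the way from 2 to 5, that is, `2 + 0.3*(5-2) = 2.9`. The sketch below also shows that, for this DataFrame, the row-wise result matches a column-wise quantile on the transposed DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

row_wise = df.quantile(q=0.3, axis=1)     # one quantile per row
via_transpose = df.T.quantile(q=0.3)      # rows become columns, then column-wise

# Manual calculation for the first row (values 2 and 5)
first_row_manual = 2 + 0.3 * (5 - 2)

print(row_wise)
print(via_transpose)
print(first_row_manual)
```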

## Computing multiple percentiles

To get the values at the 50th and 75th percentiles for each column:

```
df.quantile([0.5, 0.75])   # returns a DataFrame
        A     B
0.50  5.0  6.50
0.75  6.5  7.25
```
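In the returned DataFrame, the requested quantiles form the index and the original columns are kept as columns. A minimal sketch of pulling individual values out of the result (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})
result = df.quantile([0.5, 0.75])

# Look up a single value by quantile (row label) and column name
a_75 = result.loc[0.75, "A"]

# All quantiles computed for column B, as a Series indexed by quantile
b_quantiles = result["B"]

print(a_75)
print(b_quantiles)
```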

## Changing interpolation methods

Consider the following DataFrame:

```
df = pd.DataFrame({"A":[2,4,6,8]})
df
   A
0  2
1  4
2  6
3  8
```

### linear

Consider the case when the value corresponding to the specified quantile does not exist:

```
df.quantile(0.5)   # interpolation="linear"
A    5.0
Name: 0.5, dtype: float64
```

Here, since the value corresponding to the 50th percentile does not exist in column `A`, the value was linearly interpolated between 4 and 6.
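The linear interpolation can be reproduced by hand. The sketch below assumes the quantile sits at the fractional position `(n-1)*q` in the sorted data, which is the convention linear interpolation follows here; `manual_linear_quantile` is a hypothetical helper written for illustration, not part of Pandas:

```python
import math
import pandas as pd

def manual_linear_quantile(values, q):
    """Hypothetical helper: linearly interpolated quantile of a list."""
    data = sorted(values)
    pos = (len(data) - 1) * q      # fractional position in the sorted data
    lower_idx = math.floor(pos)
    upper_idx = math.ceil(pos)
    i, j = data[lower_idx], data[upper_idx]   # neighbouring data-points
    # Move from i towards j by the fractional part of the position
    return i + (j - i) * (pos - lower_idx)

values = [2, 4, 6, 8]
manual_50 = manual_linear_quantile(values, 0.5)   # halfway between 4 and 6
manual_30 = manual_linear_quantile(values, 0.3)   # 90% of the way from 2 to 4
pandas_30 = pd.Series(values).quantile(0.3)

print(manual_50, manual_30, pandas_30)
```

For `q=0.5`, the position is 1.5, so the result lies halfway between the values at positions 1 and 2, which are 4 and 6.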

### lower

```
df.quantile(0.5, interpolation="lower")
A    4
Name: 0.5, dtype: int64
```

Again, since the 50% quantile does not exist, we need to perform interpolation. We know it is between the values 4 and 6. By passing in `"lower"`, we select the lower value, that is, 4 in this case.

### higher

```
df.quantile(0.5, interpolation="higher")
A    6
Name: 0.5, dtype: int64
```

Same logic as `"lower"`, but we take the upper value.

Here's the same `df` for your reference:

```
df
   A
0  2
1  4
2  6
3  8
```

### nearest

```
df.quantile(0.5, interpolation="nearest")
A    6
Name: 0.5, dtype: int64
```

By passing in `"nearest"`, instead of always selecting the lower or upper value, we take whichever is nearest. In this case, the 50% quantile is 5, which happens to lie exactly halfway between 4 and 6. For this tie, the upper value, 6, is returned.
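The behaviour is easier to see when there is no tie. In the sketch below, which uses the same data, the 40% quantile lies closer to 4 and the 60% quantile lies closer to 6 (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8]})

# The linearly interpolated 40% quantile is about 4.4, so 4 is the nearest data-point
nearest_40 = df.quantile(0.4, interpolation="nearest")["A"]

# The linearly interpolated 60% quantile is about 5.6, so 6 is the nearest data-point
nearest_60 = df.quantile(0.6, interpolation="nearest")["A"]

print(nearest_40, nearest_60)
```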

### midpoint

```
df.quantile(0.5, interpolation="midpoint")
A    5.0
Name: 0.5, dtype: float64
```

Here, we just take the midpoint of the lower and upper value, so `(4+6)/2=5`.
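For a side-by-side comparison, the sketch below loops over all five interpolation methods on the same column and collects the 50% quantile from each (the `results` dictionary is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8]})

methods = ["linear", "lower", "higher", "nearest", "midpoint"]
results = {}
for method in methods:
    # Series.quantile with a scalar q returns a single value
    results[method] = df["A"].quantile(0.5, interpolation=method)

for method, value in results.items():
    print(f"{method:>8}: {value}")
```

The values match those in the sections above: 5 for `"linear"` and `"midpoint"`, 4 for `"lower"`, and 6 for `"higher"` and `"nearest"`.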

