
# Pandas DataFrame | quantile method

Last updated: Jul 1, 2022

Pandas `DataFrame.quantile(~)` method returns the interpolated value at the specified quantile.

# Parameters

1. `q` | `float` or `array-like` of `float` | `optional`

The desired quantile(s) to compute, each of which must be between 0 and 1 (both inclusive). By default, `q=0.5`, that is, the value at the 50th percentile (the median) is computed.
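As a small illustration of the allowed range for `q`, the sketch below computes the two endpoint quantiles and then tries a value outside the range. The DataFrame is made up for this example, and catching `ValueError` is an assumption based on how current pandas versions behave.

```python
import pandas as pd

# Illustrative data, not from any particular dataset
df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

# The endpoints q=0 and q=1 give the column minimum and maximum
lowest = df.quantile(0)
highest = df.quantile(1)

# A q outside [0, 1] is rejected (assumed here to raise ValueError)
try:
    df.quantile(1.5)
    rejected = False
except ValueError:
    rejected = True

print(lowest["A"], highest["B"], rejected)
```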

2. `axis` | `None` or `int` or `string` | `optional`

Whether to compute the quantile row-wise or column-wise:

| Axis | Description |
| --- | --- |
| `0` or `"index"` | Compute the quantile for each column. |
| `1` or `"columns"` | Compute the quantile for each row. |

By default, `axis=0`.

3. `numeric_only` | `boolean` | `optional`

Whether or not to compute the quantiles only for rows/columns of numeric type. If set to `False`, then quantiles of rows/columns with `datetime` and `timedelta` will also be computed. By default, `numeric_only=True` in pandas versions before 2.0; from pandas 2.0 onwards, the default is `False`.
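To see what `numeric_only=False` allows, here is a minimal sketch with a datetime column. The DataFrame `events` and its column name are hypothetical, and `numeric_only` is passed explicitly so the example does not depend on the default.

```python
import pandas as pd

# Hypothetical DataFrame holding only a datetime column
events = pd.DataFrame({"when": pd.to_datetime(["2022-01-01", "2022-01-03"])})

# With numeric_only=False, the datetime column is included in the computation
median_when = events.quantile(0.5, numeric_only=False)

print(median_when["when"])
```

With the default linear interpolation, the result falls halfway between the two dates.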

4. `interpolation` | `string` | `optional`

How the values are interpolated when the given percentile sits between two data-points, say `i` and `j` where `i<j`:

| Value | Description |
| --- | --- |
| `"linear"` | Standard linear interpolation |
| `"lower"` | Returns `i` |
| `"higher"` | Returns `j` |
| `"midpoint"` | Returns `(i+j)/2` |
| `"nearest"` | Returns `i` or `j`, whichever is closer |

By default, `interpolation="linear"`.
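The default linear interpolation can be reproduced by hand, which may help make the roles of `i` and `j` concrete. The helper `linear_quantile` below is an illustrative function written for this page, not part of pandas; it assumes the quantile sits at fractional position `q * (n - 1)` in the sorted data.

```python
import math

import pandas as pd


def linear_quantile(values, q):
    """Hand-rolled linear-interpolation quantile (illustrative, not a pandas API)."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)  # fractional index into the sorted data
    lower = math.floor(position)
    upper = math.ceil(position)
    i, j = ordered[lower], ordered[upper]
    # Move from i towards j by the fractional part of the position
    return i + (j - i) * (position - lower)


data = [2, 4, 6, 8]
manual = linear_quantile(data, 0.75)
from_pandas = pd.DataFrame({"A": data}).quantile(0.75)["A"]

print(manual, from_pandas)
```

For `q=0.75` and four values, the position is `2.25`, so the result lies a quarter of the way from 6 to 8.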

# Return Value

If `q` is a scalar, then a `Series` is returned. Otherwise, a `DataFrame` is returned.
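A quick way to check this rule is to inspect the type of the result. The sketch below uses a small made-up DataFrame; note that a one-element list still counts as array-like.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

scalar_result = df.quantile(0.5)   # scalar q
list_result = df.quantile([0.5])   # array-like q, even with a single element

print(type(scalar_result).__name__)
print(type(list_result).__name__)
```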

# Examples

Consider the following DataFrame:

```
import pandas as pd

df = pd.DataFrame({"A":[2,4,6,8],"B":[5,6,7,8]})
df
   A  B
0  2  5
1  4  6
2  6  7
3  8  8
```

## Computing percentile column-wise

To compute the 50th percentile of each column:

```
df.quantile()   # q=0.5
A    5.0
B    6.5
Name: 0.5, dtype: float64
```

Here, the return type is `Series`. To interpret the output, exactly 50% of the values in column `A` are smaller than `5.0`.
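This interpretation can be checked directly by counting the share of values below the computed quantile. The variable names here are chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

median_a = df.quantile()["A"]               # 50th percentile of column A
share_below = (df["A"] < median_a).mean()   # fraction of values below it

print(median_a, share_below)
```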

## Computing percentile row-wise

To compute the 30th percentile of each row:

```
df.quantile(q=0.3, axis=1)
0    2.9
1    4.6
2    6.3
3    8.0
Name: 0.3, dtype: float64
```
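One way to reason about `axis=1` is that it gives the same numbers as transposing the DataFrame and computing the quantile column-wise. The sketch below compares the two; it is a sanity check rather than a recommended pattern.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

row_wise = df.quantile(q=0.3, axis=1)    # one value per row
via_transpose = df.T.quantile(q=0.3)     # rows become columns, then column-wise

# Largest absolute difference between the two results
max_diff = (row_wise - via_transpose).abs().max()
print(max_diff)
```

For the first row, `[2, 5]`, the 30th percentile is `2 + 0.3 * (5 - 2) = 2.9`.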

## Computing multiple percentiles

To get the values at the 50th and 75th percentiles for each column:

```
df.quantile([0.5, 0.75])   # returns a DataFrame
        A     B
0.50  5.0  6.50
0.75  6.5  7.25
```
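Since the quantiles become the row labels of the returned DataFrame, a single value can be looked up with `.loc`. A minimal sketch, with variable names chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8], "B": [5, 6, 7, 8]})

quantiles = df.quantile([0.5, 0.75])

# Row label is the quantile, column label is the original column name
q75_a = quantiles.loc[0.75, "A"]
q50_b = quantiles.loc[0.5, "B"]

print(q75_a, q50_b)
```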

## Changing interpolation methods

Consider the following DataFrame:

```
df = pd.DataFrame({"A":[2,4,6,8]})
df
   A
0  2
1  4
2  6
3  8
```

### linear

Consider the case when the value corresponding to the specified quantile does not exist:

```
df.quantile(0.5)   # interpolation="linear"
A    5.0
Name: 0.5, dtype: float64
```

Here, since the value corresponding to the 50th percentile does not exist in column `A`, the value was linearly interpolated between 4 and 6.

### lower

```
df.quantile(0.5, interpolation="lower")
A    4
Name: 0.5, dtype: int64
```

Again, since the 50% quantile does not exist, we need to perform interpolation. We know it is between the values 4 and 6. By passing in `"lower"`, we select the lower value, that is, 4 in this case.

### higher

```
df.quantile(0.5, interpolation="higher")
A    6
Name: 0.5, dtype: int64
```

Same logic as `"lower"`, but we take the upper value.

Here's the same `df` for your reference:

```
df
   A
0  2
1  4
2  6
3  8
```

### nearest

```
df.quantile(0.5, interpolation="nearest")
A    6
Name: 0.5, dtype: int64
```

By passing in `"nearest"`, instead of always selecting the lower or upper value, we take whichever is nearest. In this case, the 50% quantile is 5, which is exactly halfway between 4 and 6, so neither value is nearer. Here, the tie is resolved to the upper value, 6.

### midpoint

```
df.quantile(0.5, interpolation="midpoint")
A    5.0
Name: 0.5, dtype: float64
```

Here, we just take the midpoint of the lower and upper value, so `(4+6)/2=5`.
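To compare the five interpolation methods side by side, the results can be collected in a loop. This is only a convenience sketch; the `results` dictionary is a name introduced for this example.

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 4, 6, 8]})

methods = ["linear", "lower", "higher", "nearest", "midpoint"]

# Map each interpolation method to the 50% quantile it yields for column A
results = {m: df.quantile(0.5, interpolation=m)["A"] for m in methods}

for method, value in results.items():
    print(f"{method}: {value}")
```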
