
Pandas DataFrame | sum method

Last updated: Aug 12, 2023
Tags: Python, Pandas

Pandas `DataFrame.sum(~)` method computes the sum for each row or column of the source DataFrame.

Parameters

1. `axis` | `int` or `string` | `optional`

Whether to compute the sum row-wise or column-wise:

| Axis | Description |
|---|---|
| `"index"` or `0` | Sum is computed for each column. |
| `"columns"` or `1` | Sum is computed for each row. |

By default, `axis=0`.
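As a quick check of the two `axis` options, the string and integer forms can be compared directly. This is a minimal sketch that assumes pandas is imported as `pd` and uses a small illustrative DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"A": [2, 3], "B": [4, 5]})

# "index" and 0 both sum down each column
col_sums = df.sum(axis="index")
assert col_sums.equals(df.sum(axis=0))

# "columns" and 1 both sum across each row
row_sums = df.sum(axis="columns")
assert row_sums.equals(df.sum(axis=1))

print(col_sums.tolist())  # [5, 9]
print(row_sums.tolist())  # [6, 8]
```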

2. `skipna` | `boolean` | `optional`

Whether or not to ignore missing values (`NaN`). By default, `skipna=True`.

3. `level` | `string` or `int` | `optional`

The name or the integer index of the level to consider for summation. This is relevant only if your DataFrame has a MultiIndex. Note that `level` was deprecated in pandas 1.3 and removed in pandas 2.0.
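On pandas versions where `sum` no longer accepts `level` (it was removed in pandas 2.0), a grouped sum over an index level gives the same kind of per-level result. The sketch below assumes a hypothetical two-level index whose levels are named `outer` and `inner`; these names are illustrative only:

```python
import pandas as pd

# Hypothetical two-level index, for illustration only
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1)], names=["outer", "inner"]
)
df_multi = pd.DataFrame({"A": [2, 3, 4], "B": [5, 6, 7]}, index=index)

# Sum the rows that share the same value of the "outer" level
by_outer = df_multi.groupby(level="outer").sum()
print(by_outer)
```

Here `by_outer` is a DataFrame with one row per distinct value of `outer`.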

4. `numeric_only` | `None` or `boolean` | `optional`

The allowed values are as follows:

| Value | Description |
|---|---|
| `True` | Only numeric rows/columns will be considered (e.g. `float`, `int`, `boolean`). |
| `False` | Attempt computation with all types (e.g. strings and dates), and throw an error whenever the summation is invalid. |
| `None` | Attempt computation with all types, and ignore all rows/columns that do not allow for summation without raising an error. |

For summation to be valid, the `+` operator must be well-defined between the types.

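By default, `numeric_only=None`. This describes pandas versions before 2.0; from pandas 2.0 onward, the default is `numeric_only=False`.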

5. `min_count` | `int` | `optional`

The minimum number of non-`NaN` values that must be present to perform summation. If there are fewer than `min_count` such values, then `NaN` will be returned. By default, `min_count=0`, so no minimum is enforced.
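To illustrate `min_count`, the sketch below uses a small DataFrame in which column `A` holds a single non-`NaN` value. The data is illustrative, and NumPy is assumed to be imported as `np`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [2, np.nan], "B": [4, 5]})

# Column A has only one non-NaN value, so requiring two yields NaN.
# Column B has two valid values, so its sum is computed as usual.
result = df.sum(min_count=2)
print(result)
```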

Return Value

If the `level` parameter is specified, then a `DataFrame` will be returned. Otherwise, a `Series` will be returned.

Examples

Consider the following DataFrame:

```
import pandas as pd

df = pd.DataFrame({"A":[2,3], "B":[4,5]})
df
   A  B
0  2  4
1  3  5
```

Column-wise summation

To compute the sum for each column:

```
df.sum()   # axis=0
A    5
B    9
dtype: int64
```

Here, the return type is `Series`.

Row-wise summation

To compute the sum for each row, set `axis=1`:

```
df.sum(axis=1)
0    6
1    8
dtype: int64
```

Specifying skipna

Consider the following DataFrame with a missing value:

```
import numpy as np

df = pd.DataFrame({"A":[2,np.nan], "B":[4,5]})
df
     A  B
0  2.0  4
1  NaN  5
```

By default, `skipna=True`, which means that `NaN`s are ignored in the computation:

```
df.sum()
A    2.0
B    9.0
dtype: float64
```

Setting `skipna=False` takes the `NaN`s into account:

```
df.sum(skipna=False)
A    NaN
B    9.0
dtype: float64
```

The reason we get `NaN` for the sum of column `A` is that any arithmetic computation involving `NaN`s will result in `NaN`s.
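The same propagation can be seen outside pandas. The sketch below uses plain NumPy, with `np.nansum` standing in as a rough analogue of `skipna=True`; it is an illustration of the idea, not a description of how pandas computes the sum internally:

```python
import numpy as np

values = np.array([2.0, np.nan])

# Ordinary summation propagates NaN
plain_total = values.sum()

# np.nansum treats NaN as zero, comparable to skipna=True
skipped_total = np.nansum(values)

print(plain_total)    # nan
print(skipped_total)  # 2.0
```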

Specifying numeric_only

Consider the following DataFrame:

```
df = pd.DataFrame({"A":[4,5], "B":[2,True], "C":["6",False]})
df
   A     B      C
0  4     2    "6"
1  5  True  False
```

Here, both columns `B` and `C` contain mixed types, but the key difference is that summation is defined for `B`, but not for `C`. Recall that the internal representation of a `True` boolean is `1`, so the operation `2+True` actually evaluates to `3`:

```
2 + True
3
```

On the other hand, `"6"+False` throws an error:

```
"6" + False
TypeError: can only concatenate str (not "bool") to str
```

None

By default, `numeric_only=None`, which means that rows/columns with mixed types will also be considered:

```
df.sum(numeric_only=None)
A    9
B    3
dtype: int64
```

Here, notice how summation was performed on column `B`, but not on `C`. By passing in `None`, rows/columns that result in invalid summations will simply be ignored without throwing an error.

False

By setting `numeric_only=False`, rows/columns with mixed types will again be considered, but an error will be thrown when summation cannot be performed:

```
df.sum(numeric_only=False)
TypeError: can only concatenate str (not "bool") to str
```

Here, we end up with an error because column `C` contains mixed types where the `+` operation is not defined.

True

By setting `numeric_only=True`, only numeric rows/columns will be considered:

```
df.sum(numeric_only=True)
A    9
dtype: int64
```

Notice how columns `B` and `C` were ignored: because they contain mixed types, pandas stores them with the non-numeric `object` dtype.
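One way to see which columns `numeric_only=True` will keep is to inspect the column dtypes. The sketch below also uses `select_dtypes` to pick out the numeric columns by hand; this is shown only for comparison and is not necessarily how `sum` selects columns internally:

```python
import pandas as pd

df = pd.DataFrame({"A": [4, 5], "B": [2, True], "C": ["6", False]})

# Mixed-type columns B and C are stored with the object dtype
print(df.dtypes)

# Manually keep only the numeric columns, then sum them
numeric = df.select_dtypes(include="number")
manual_sum = numeric.sum()

# Both approaches leave only column A
builtin_sum = df.sum(numeric_only=True)
print(manual_sum.equals(builtin_sum))
```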

Case of empty DataFrame

Computing a sum of an empty DataFrame or Series will result in `0`:

```
df = pd.DataFrame({"A":[]})
df.sum()
A    0.0
dtype: float64
```
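If `0` is not the desired result for empty input, the `min_count` parameter can be used to return `NaN` instead. The sketch below uses an empty Series with an explicit `float64` dtype, which is an assumption made here so that the result does not depend on how a given pandas version infers the dtype of an empty list:

```python
import numpy as np
import pandas as pd

# Explicit dtype so the example behaves the same across pandas versions
empty = pd.Series([], dtype="float64")

default_total = empty.sum()              # no minimum: the empty sum is 0.0
guarded_total = empty.sum(min_count=1)   # at least one value required: NaN

print(default_total)  # 0.0
print(guarded_total)  # nan
```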