# Describing certain columns of a DataFrame in Pandas

Aug 12, 2023
To describe only certain columns of a DataFrame rather than all of them, first extract the desired columns using the `[]` notation and then call the `describe(~)` method on the result.

Consider the following DataFrame:

```
import pandas as pd

names = pd.Series(["alex","bob","cathy"], dtype="string")
gender = pd.Series(["male","male","female"], dtype="category")
age = pd.Series([20,30,20], dtype="int")
df = pd.DataFrame({"names":names,"gender":gender,"age":age})
df
   names  gender  age
0   alex    male   20
1    bob    male   30
2  cathy  female   20
```

To describe only columns `gender` and `age`:

```
df[["gender","age"]].describe(include="all")
       gender        age
count       3   3.000000
unique      2        NaN
top      male        NaN
freq        2        NaN
mean      NaN  23.333333
std       NaN   5.773503
min       NaN  20.000000
25%       NaN  20.000000
50%       NaN  20.000000
75%       NaN  25.000000
max       NaN  30.000000
```

Here, note the following:

• the `df[["gender","age"]]` syntax extracts the columns `gender` and `age` from `df` as a DataFrame

• the `include="all"` argument indicates that we want to compute descriptive statistics for all of the selected columns, regardless of their type. If it is left out, only the numeric columns are considered, so the `gender` column would be ignored.
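To see the effect of the `include` argument, the following minimal sketch rebuilds the same DataFrame and compares the default call with the `include="all"` call. The variable names `subset`, `numeric_only` and `everything` are only illustrative and do not come from the example above.

```python
import pandas as pd

# Rebuild the DataFrame from the example above
names = pd.Series(["alex", "bob", "cathy"], dtype="string")
gender = pd.Series(["male", "male", "female"], dtype="category")
age = pd.Series([20, 30, 20], dtype="int")
df = pd.DataFrame({"names": names, "gender": gender, "age": age})

# Extract the two columns of interest as a DataFrame
subset = df[["gender", "age"]]

# Default call: only the numeric column is summarised
numeric_only = subset.describe()
print(list(numeric_only.columns))   # ['age']

# include="all": both columns are summarised, with NaN where a
# statistic does not apply to the column's type
everything = subset.describe(include="all")
print(list(everything.columns))     # ['gender', 'age']
```

Comparing the two column lists makes it easy to check which of the selected columns actually ended up in the summary.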

