# Guides on probability and statistics

Aug 12, 2023

Heya guys 👋, welcome to our comprehensive probability and statistics series!

Unlike our ML series, this series has a chronological flow like a textbook since concepts in probability and statistics tend to be progressive.

The series is far from complete and we hope to publish comprehensive guides every week. As always, please feel free to email me at isshin@skytowner.com or join our Discord channel if you get stuck!

# Chapter 1 - Basics of statistics

1.1. Population, samples and sampling techniques

This guide covers common techniques to sample from a population such as random, stratified and convenience sampling.

1.2. Measures of central tendency

This guide covers three main measures of central tendency: the mean, median and mode.

1.3. Measures of spread

This guide covers three main measures of spread: variance, standard deviation and mean absolute deviation.

1.4. Quantiles, quartiles and percentiles

The q-quantiles are the cut points that divide the sorted data points into q equal-sized portions. Quartiles are 4-quantiles and percentiles are 100-quantiles.
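
As a rough illustration of the idea, the sketch below uses `statistics.quantiles` from Python's standard library on a made-up list of scores (the data is hypothetical). Different interpolation methods can give slightly different cut points, so the exact values depend on the method chosen.

```python
import statistics

# Hypothetical sample of exam scores, used only for illustration.
scores = [52, 58, 61, 64, 67, 70, 73, 75, 79, 82, 88, 94]

# n=4 asks for quartiles: three cut points that split the sorted data into four parts.
quartiles = statistics.quantiles(scores, n=4)

# n=100 asks for percentiles: ninety-nine cut points.
percentiles = statistics.quantiles(scores, n=100)

print(len(quartiles), len(percentiles))  # 3 99
```

Note that q portions always need q - 1 cut points, and that the second quartile coincides with the median.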

1.5.1. Visualizing data - histogram

A histogram is a diagram that illustrates the distribution of a given set of values.

1.5.2. Visualizing data - boxplot diagrams

A boxplot diagram, or box-whisker diagram, is a popular way to visualize the spread of a dataset using quartiles.

# Chapter 2 - Basics of probability theory

2.1. Basics of set theory and Venn diagrams

Set theory is a branch of mathematics that studies sets, which are simply collections of distinct elements where ordering does not matter.

2.2. Counting with permutations

Permutations refer to the number of ways of ordering r elements chosen from a total of n elements.

2.3. Counting with combinations

Combinations refer to the number of ways we can pick a set of k elements from a total of n elements without regard to ordering.
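
To make the difference between the two counts concrete, here is a minimal sketch using `math.perm` and `math.comb` (available in Python 3.8+). The values of `n` and `r` are arbitrary and chosen only for illustration.

```python
import math

n, r = 5, 2  # hypothetical: pick 2 elements out of 5

ordered = math.perm(n, r)    # ordering matters: n! / (n - r)!
unordered = math.comb(n, r)  # ordering ignored: n! / (r! * (n - r)!)

# Each unordered selection of r elements can be arranged in r! ways.
assert ordered == unordered * math.factorial(r)

print(ordered, unordered)  # 20 10
```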

2.4. Sample space, events and probability axioms

This guide is about sample space, events (simple, compound and disjoint) and the three axioms of probability.

2.5. Conditional probability

Given two events A and B, the conditional probability of B given A is the probability that B occurs given that A has already occurred.

2.6. Multiplication and addition rule

The multiplication rule is the rearranged version of the definition of conditional probability, and the addition rule takes into account double-counting of events.
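
The following sketch checks conditional probability and both rules on a single roll of a fair die. The events `A` and `B` are hypothetical choices for illustration, and `prob` is a small helper that assumes all outcomes are equally likely.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                      # event: the roll is even
B = {4, 5, 6}                      # event: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

# Conditional probability: P(B | A) = P(A and B) / P(A)
p_b_given_a = prob(A & B) / prob(A)

# Multiplication rule: P(A and B) = P(A) * P(B | A)
assert prob(A & B) == prob(A) * p_b_given_a

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)

print(p_b_given_a, prob(A | B))  # 2/3 2/3
```

Without subtracting P(A and B), the outcomes 4 and 6 would be counted twice in the addition rule.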

2.7. Law of total probability

The law of total probability partitions the sample space, allowing us to compute marginal probabilities.

2.8. Bayes' theorem

Bayes' theorem is a formula for computing the conditional probability of A given B from the conditional probability of B given A.
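
As a small worked example, the sketch below combines the law of total probability with Bayes' theorem for a diagnostic test. All three input probabilities are made-up numbers chosen only to illustrate the arithmetic.

```python
from fractions import Fraction

# Hypothetical numbers for a diagnostic test, chosen only for illustration.
p_disease = Fraction(1, 100)             # P(D)
p_pos_given_disease = Fraction(95, 100)  # P(+ | D)
p_pos_given_healthy = Fraction(5, 100)   # P(+ | not D)

# Law of total probability over the partition {D, not D}
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Bayes' theorem: P(D | +) = P(+ | D) * P(D) / P(+)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos

print(p_pos, p_disease_given_pos)  # 59/1000 19/118
```

Even with a fairly accurate test, the posterior probability here is only about 16% because the disease is rare in this made-up scenario.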

# Chapter 3 - Random variables

3.1. Random variables

A random variable X is a function that assigns a numerical value x to every possible outcome in the sample space of an experiment.

3.2. Expected value

The expected value of random variable X is a number that tells us the average value of X we expect to see when we perform a large number of independent repetitions of an experiment.

3.3. Properties of expected value

This guide goes over all the main properties of the expected value of random variables along with their proofs.

3.4. Variance

Variance is the average squared distance between a random variable and its mean, measuring the spread of the random variable's distribution.
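
For a concrete case, this sketch computes the expected value and variance of a fair six-sided die directly from their definitions. The die is just an illustrative choice, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Probability of each face of a fair six-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Expected value: sum of x * P(X = x)
mean = sum(x * p for x, p in pmf.items())

# Variance: expected squared distance from the mean
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(mean, variance)  # 7/2 35/12
```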

3.5. Properties of variance

This guide goes over all the main properties of the variance of random variables along with their proofs.

3.6. Covariance

The covariance of two random variables is a measure of the linear relationship between them.

3.7. Correlation

The correlation coefficient measures the strength and direction of the linear relationship between two variables. It normalizes covariance so that values fall between -1 (perfect negative linear relationship) and 1 (perfect positive linear relationship).

# Chapter 4 - Point estimation

4.1. Sample mean

The sample mean, which is computed as the average of the sample observations, is an unbiased estimator of the population mean.

4.2. Sample variance

Sample variance is an unbiased estimator for the population variance that can be computed by dividing the sum of squared differences from the mean by n-1.
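
A minimal sketch of the n-1 divisor, using a small made-up dataset. It compares a hand-rolled computation against `statistics.variance`, which also divides by n-1.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

n = len(data)
mean = sum(data) / n
squared_diffs = sum((x - mean) ** 2 for x in data)

sample_var = squared_diffs / (n - 1)  # sample variance (n - 1 divisor)
divide_by_n = squared_diffs / n       # dividing by n instead gives a smaller value

# The standard library's sample variance uses the same n - 1 divisor.
assert math.isclose(sample_var, statistics.variance(data))

print(sample_var > divide_by_n)  # True
```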

4.3. Sample covariance

Sample covariance is an unbiased estimator of the population covariance and measures the association between two variables.

4.4. Sample correlation

Sample correlation is a quantity between -1 and 1 that measures the level of association between two variables.
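
One way to sketch sample covariance and sample correlation in plain Python is shown below. The helper names `sample_covariance` and `sample_correlation` and the paired data are hypothetical, introduced only for illustration.

```python
import statistics

def sample_covariance(xs, ys):
    """Sample covariance of paired observations, using the n - 1 divisor."""
    n = len(xs)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """Sample covariance scaled by both sample standard deviations."""
    return sample_covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

# Hypothetical paired observations
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 70, 72]

r = sample_correlation(hours, scores)
print(round(r, 3))
```

Because the covariance is divided by both standard deviations, the result does not depend on the units of either variable.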

4.5.1. Properties of estimators - bias

The bias of an estimator tells us how far its estimates are, on average, from the true population parameter.

4.5.2. Properties of estimators - mean squared error

The mean squared error of an estimator represents the average squared difference between the computed estimates and the true population parameter.

4.6. Central limit theorem

The central limit theorem states that regardless of the population distribution, the sampling distribution of the sample mean is approximately normal given a large sample size.
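
The simulation below is a rough sketch of this behavior, assuming a skewed Exponential(1) population (mean 1, standard deviation 1). The sample size, number of repetitions, and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

sample_size = 50    # observations per sample
num_samples = 2000  # number of repeated samples

# Draw many samples from the skewed population and record each sample mean.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(num_samples)
]

center = statistics.fmean(sample_means)  # should be near the population mean, 1
spread = statistics.stdev(sample_means)  # should be near 1 / sqrt(sample_size)

# For a normal distribution, roughly 68% of values lie within one standard deviation.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / num_samples

print(round(center, 2), round(spread, 2), round(within_one_sd, 2))
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the underlying population is far from normal.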

# Chapter 5 - Discrete probability distributions

5.1. Probability mass function

The probability mass function (PMF) assigns probabilities to every possible value of a discrete random variable.

5.2. Binomial distribution

The binomial distribution is the discrete probability distribution of the number of successes in a fixed number of independent Bernoulli trials.
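
As an illustration, the binomial PMF can be sketched in a few lines. The function name `binomial_pmf` and the coin-flip numbers are hypothetical; here `n` is the number of trials, `k` the number of successes, and `p` the success probability of each trial.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 10 flips of a fair coin
n, p = 10, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(probs[5])  # 0.24609375
```

Summing `probs` over all k from 0 to n gives 1, as required of any PMF.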

5.3. Geometric distribution

The geometric distribution is the discrete distribution of the number of trials to observe the first success in repeated independent Bernoulli trials.

5.4. Negative binomial distribution

The negative binomial distribution is the discrete distribution of the number of trials to observe the first r successes in repeated independent Bernoulli trials.

5.5. Hypergeometric distribution

The hypergeometric distribution is the discrete distribution of the number of successes in a sequence of draws from a finite population without replacement.

5.6. Poisson distribution

The Poisson distribution models the number of events occurring within a given time interval.
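
A minimal sketch of the Poisson PMF follows. The function name `poisson_pmf` and the rate of 3 events per interval are assumptions made for illustration, with `lam` denoting the average number of events per interval.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k): probability of exactly k events when the average count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3.0  # hypothetical average number of events per interval

# Probabilities of observing 0, 1, ..., 49 events (the tail beyond that is negligible here).
probs = [poisson_pmf(k, lam) for k in range(50)]

# The mean of the distribution should come out close to lam.
mean = sum(k * q for k, q in enumerate(probs))
print(round(sum(probs), 6), round(mean, 6))
```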

Published by Isshin Inada
