1) Overview:
Statistics.
Probability and Distribution.
More Distributions and The Central Limit Theorem.
Correlation and Hypothesis Testing.
2) Details:
Statistics
Measures of Center.
Statistics.
Measures of Spread.
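Ex: a minimal sketch of the usual measures of center and spread, using Python's standard `statistics` module. The `data` list is a made-up sample, chosen only for illustration.

```python
import statistics

# Hypothetical sample, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of center
mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

# Measures of spread
data_range = max(data) - min(data)     # largest minus smallest value
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation

print(mean, median, mode)
print(data_range, variance, std_dev)
```

Use `statistics.variance` and `statistics.stdev` instead when the data is a sample drawn from a larger population.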
Probability and Distribution
Conditional Probability.
Dependent vs. Independent events.
Given ...., the probability of ...
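Ex: a small sketch of conditional probability, P(A | B) = P(A and B) / P(B), on one roll of a fair six-sided die. The die and the two events (A: the roll is even, B: the roll is greater than 3) are assumptions made for this example only.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
outcomes = range(1, 7)

def prob(event):
    """Probability of an event (a predicate on outcomes), all outcomes equally likely."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

def is_even(o):
    return o % 2 == 0

def greater_than_three(o):
    return o > 3

p_a = prob(is_even)
p_b = prob(greater_than_three)
p_a_and_b = prob(lambda o: is_even(o) and greater_than_three(o))

# Given B, the probability of A.
p_a_given_b = p_a_and_b / p_b

# A and B are independent only if P(A and B) == P(A) * P(B).
independent = p_a_and_b == p_a * p_b

print(p_a, p_a_given_b, independent)
```

Here knowing B changes the probability of A, so the two events are dependent.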
Discrete Distribution.
Sample mean vs. theoretical mean.
As the size of the sample increases, the sample mean will approach the expected value.
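Ex: a quick simulation of the statement above, assuming a fair six-sided die as the discrete distribution. The seed and the sample sizes are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Theoretical mean (expected value) of a fair six-sided die.
theoretical_mean = sum(range(1, 7)) / 6

def sample_mean(n):
    """Mean of n simulated die rolls."""
    rolls = [random.randint(1, 6) for _ in range(n)]
    return sum(rolls) / n

# Larger samples should land closer to the theoretical mean.
for n in (10, 100, 10_000, 100_000):
    print(n, round(sample_mean(n), 3), "vs", theoretical_mean)
```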
Continuous Distribution.
Normal Distribution.
Uniform Distribution.
More Distributions
Normal Distribution.
Skewness (asymmetry of the distribution).
Kurtosis (how peaked or flat the distribution is).
Binomial Distribution.
Used for independent events producing binary outcomes.
Counts the number of successes in independent events.
Determined by the parameters n (total number of events) and p (probability of success).
Expected value = n × p
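Ex: a sketch that builds the binomial probabilities from the formula P(X = k) = C(n, k) p^k (1 - p)^(n - k) and checks that the expected value matches n × p. The values n = 10 and p = 0.3 are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent events."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical parameters, chosen only for illustration.
n, p = 10, 0.3

# Probability of each possible number of successes, 0 through n.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Expected value computed from the definition: sum of k * P(X = k).
expected_value = sum(k * pk for k, pk in enumerate(pmf))

print(round(sum(pmf), 6))          # the probabilities should sum to 1
print(round(expected_value, 6), n * p)
```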
The Central Limit Theorem
Central Limit Theorem.
The sampling distribution of the sample mean (or sum) becomes closer to the normal distribution as the size of the sample increases, provided the observations are independent and have finite variance.
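Ex: a rough simulation of the theorem using sample means. The source population is assumed to be exponential (strongly right-skewed), and skewness is used as a simple stand-in for "distance from normal", since a normal distribution has skewness 0. The helper names, seed, and sample sizes are arbitrary.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def skewness(values):
    """Population skewness: average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(sample_size, n_samples=2000):
    """Means of n_samples independent samples drawn from an exponential distribution."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

# As the sample size grows, the skewness of the sample means should shrink toward 0.
for size in (1, 5, 50):
    print(size, round(skewness(sample_means(size)), 2))
```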
The Poisson Distribution
Poisson Processes:
The average number of events in a period is known, but the time or space between individual events is random.
Ex: number of website visits in a day.
Poisson Distribution:
Probability of some number of events occurring over a fixed period of time.
Ex: Probability of fewer than 200 visits to a website in a day.
Lambda (λ):
The average number of events per time interval.
Or: the expected value of the distribution.
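Ex: a sketch of the website example, using P(X = k) = e^(-λ) λ^k / k!. The notes do not give a value for λ, so λ = 220 visits per day is a made-up assumption. The probabilities are computed in log space to avoid overflow for large k.

```python
from math import exp, lgamma, log

def poisson_pmf(k, lam):
    """Probability of exactly k events when the average per interval is lam."""
    # Log-space form of exp(-lam) * lam**k / k!; lgamma(k + 1) equals log(k!).
    return exp(k * log(lam) - lam - lgamma(k + 1))

# Hypothetical average number of visits per day (not given in the notes).
lam = 220

# Probability of fewer than 200 visits in a day: sum over k = 0..199.
p_fewer_than_200 = sum(poisson_pmf(k, lam) for k in range(200))

# Expected value from the definition; it should come out close to lambda.
expected_value = sum(k * poisson_pmf(k, lam) for k in range(600))

print(p_fewer_than_200)
print(expected_value)
```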
Difference from the Normal Distribution:
The normal distribution is continuous and symmetric, used for modeling continuous data.
The Poisson distribution is discrete and right-skewed (less so as λ grows), used for modeling count data involving events occurring over intervals.
Correlation and Hypothesis Testing
Hypothesis testing.
Null Hypothesis: This is the default assumption that there is no effect, no difference, or no relationship. It represents the status quo or the absence of an effect.
Alternative Hypothesis: This is the statement that contradicts the null hypothesis. It suggests that there is an effect, a difference, or a relationship in the data.
Experiments.
A subset of hypothesis testing.
Ex: What is the effect of a treatment (independent variable) on a response (dependent variable)?
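Ex: a minimal sketch of testing such an experiment with a permutation test, which is one of several possible tests and not necessarily the one intended by these notes. The null hypothesis is that the treatment has no effect on the response. The control and treatment responses are synthetic numbers generated for illustration, and the 0.05 threshold is a conventional choice assumed here.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Synthetic responses, invented for illustration only.
control = [random.gauss(50, 5) for _ in range(30)]
treatment = [random.gauss(55, 5) for _ in range(30)]

def permutation_p_value(a, b, n_permutations=5000):
    """Two-sided p-value for the difference in means under the null hypothesis."""
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = a + b  # under the null, group labels are interchangeable
    extreme = 0
    for _ in range(n_permutations):
        random.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[: len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            extreme += 1
    # Add 1 to both counts so the estimate is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)

p_value = permutation_p_value(control, treatment)
alpha = 0.05  # conventional significance level (an assumption here)

print(p_value)
print("reject the null hypothesis" if p_value < alpha else "fail to reject the null hypothesis")
```

A small p-value means a difference this large would rarely arise from random group assignment alone, which counts as evidence for the alternative hypothesis.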