[Probability] Law of Large Numbers & Central Limit Theorem

2023. 12. 11. 14:05ㆍMathematics/Probability

1. Markov and Chebyshev Inequalities

1.1. Markov Inequality

If a random variable $X$ can only take nonnegative values, then

$$P(X \ge a) \le \frac{E[X]}{a}, \quad \text{for all } a > 0$$

Proof

Fix $a > 0$ and define

$$Y_a = \begin{cases} 0, & \text{if } X < a \\ a, & \text{if } X \ge a \end{cases}$$

Since $X \ge 0$, we have $Y_a \le X$ in both cases, and therefore $E[Y_a] \le E[X]$. Hence

$$E[Y_a] = 0 \cdot P(X < a) + a \cdot P(X \ge a) = a \cdot P(X \ge a) \\ P(X \ge a) = \frac{E[Y_a]}{a} \le \frac{E[X]}{a}$$
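The bound can be checked numerically. The following is a minimal sketch, assuming for illustration that $X$ is exponential with rate 1 (so $E[X] = 1$); the helper name `markov_check` is hypothetical and not part of the original notes. It compares the empirical tail frequency with the Markov bound computed from the same samples.

```python
import random


def markov_check(samples, a):
    """Return (empirical P(X >= a), Markov bound E[X]/a) for nonnegative samples."""
    n = len(samples)
    tail = sum(1 for x in samples if x >= a) / n  # fraction of samples at or above a
    bound = (sum(samples) / n) / a                # sample mean divided by a
    return tail, bound


random.seed(0)  # fixed seed so the run is reproducible
# Assumption for illustration: X ~ Exponential(rate=1), a nonnegative variable.
samples = [random.expovariate(1.0) for _ in range(100_000)]

for a in (1.0, 2.0, 5.0):
    tail, bound = markov_check(samples, a)
    print(f"a={a}: P(X >= a) ~ {tail:.4f}  <=  E[X]/a ~ {bound:.4f}")
```

The bound is usually loose: it uses only the mean of $X$, so it cannot reflect how quickly the tail of a specific distribution decays.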

1.2. Chebyshev Inequality

If $X$ is a random variable with mean $\mu$ and variance $\sigma^2$, then

$$P(|X - \mu| \ge c) \le \frac{\sigma^2}{c^2}, \quad \text{for all } c > 0$$

Proof

Apply the Markov inequality to the nonnegative random variable $(X - \mu)^2$ with $a = c^2$:

$$P(|X - \mu| \ge c) = P((X - \mu)^2 \ge c^2) \le \frac{E[(X - \mu)^2]}{c^2} = \frac{\sigma^2}{c^2}$$
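As a quick numerical illustration, the sketch below compares the empirical frequency of $|X - \mu| \ge c$ with $\sigma^2 / c^2$. It assumes standard normal samples purely as an example; the helper name `chebyshev_check` is hypothetical, and the mean and variance are estimated from the samples themselves.

```python
import random


def chebyshev_check(samples, c):
    """Return (empirical P(|X - mu| >= c), Chebyshev bound var/c^2)."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n  # variance with divisor n
    tail = sum(1 for x in samples if abs(x - mu) >= c) / n
    return tail, var / c ** 2


random.seed(0)  # fixed seed so the run is reproducible
# Assumption for illustration: X ~ Normal(0, 1).
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]

for c in (1.0, 2.0, 3.0):
    tail, bound = chebyshev_check(samples, c)
    print(f"c={c}: P(|X - mu| >= c) ~ {tail:.4f}  <=  var/c^2 ~ {bound:.4f}")
```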

2. Law of Large Numbers

Let $X_1, X_2, \cdots$ be i.i.d. random variables with mean $\mu$ and variance $\sigma^2$. For every $\epsilon > 0$, we have

$$P(|M_n - \mu| \ge \epsilon) = P\left(\left| \frac{X_1 + \cdots + X_n}{n} - \mu \right| \ge \epsilon\right) \rightarrow 0, \quad \text{as } n \rightarrow \infty$$

Proof

The variance of the sum splits into a sum of variances because the $X_i$ are independent. Applying the Chebyshev inequality to $M_n$ then gives the result:

$$M_n = \frac{X_1 + \cdots + X_n}{n} \\ E[M_n] = \frac{E[X_1] + \cdots + E[X_n]}{n} = \frac{n\mu}{n} = \mu \\ \text{var}(M_n) = \frac{\text{var}(X_1 + \cdots + X_n)}{n^2} = \frac{\text{var}(X_1) + \cdots + \text{var}(X_n)}{n^2} = \frac{n \sigma^2}{n^2} = \frac{\sigma^2}{n} \\ P(|M_n - \mu| \ge \epsilon) \le \frac{E[(M_n - \mu)^2]}{\epsilon^2} = \frac{\text{var}(M_n)}{\epsilon^2} = \frac{\sigma^2}{n\epsilon^2} \rightarrow 0 \quad \text{as } n \rightarrow \infty$$

The idea of the Law of Large Numbers

As the number of samples $n$ increases, the gap between the sample mean $M_n$ and the true mean $\mu$ gets smaller: the more samples we draw, the closer the sample mean tends to be to the true mean.
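This behaviour is easy to see in a simulation. The sketch below assumes a fair six-sided die (true mean 3.5) as the sampled variable; the function name `sample_mean` and the chosen values of $n$ are illustrative only.

```python
import random


def sample_mean(n, rng):
    """Mean of n rolls of a fair six-sided die drawn from the generator rng."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


rng = random.Random(42)  # dedicated, seeded generator for reproducibility
true_mean = 3.5          # E[X] for a fair die

for n in (10, 100, 1_000, 10_000, 100_000):
    m = sample_mean(n, rng)
    print(f"n={n:>6}: M_n = {m:.4f}, |M_n - mu| = {abs(m - true_mean):.4f}")
```

The gap does not have to shrink monotonically on any single run. The theorem only says that large gaps become increasingly unlikely as $n$ grows.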

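2.1. Example of the Law of Large Numbers

$X \sim \text{Bernoulli}(p), \quad p_X(x) = \begin{cases} p, & \text{if } x = 1 \\ 1-p, & \text{if } x = 0 \end{cases}$, where $p$ is the probability that a randomly chosen voter supports a particular candidate.

The mean of $X$ is $p$ and the variance of $X$ is $p(1-p)$.

With $\epsilon = 0.1$ and $n = 100$, we get $P(|M_n - p| \ge 0.1) \le \frac{p(1-p)}{0.01 \cdot 100} \le \frac{1}{4} = 0.25$, because $p(1-p) \le \frac{1}{4}$ for every $p$.

So when we sample 100 voters, the probability that $M_n$ differs from $p$ by at least the margin of error (= 0.1) is at most 25%.

In other words, the confidence level is at least 75%.

With $n = 1000$, the same bound becomes $\frac{1}{4 \cdot 0.01 \cdot 1000} = 0.025$, so the confidence level rises to at least 97.5%. As $n$ grows, the probability of falling outside the margin of error shrinks.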
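The arithmetic of this polling bound can be sketched in a few lines. The function names `chebyshev_poll_bound` and `samples_needed` are hypothetical helpers, and the code uses exact fractions so that the printed values are not affected by floating-point rounding.

```python
from fractions import Fraction
import math


def chebyshev_poll_bound(n, eps):
    """Worst-case Chebyshev bound on P(|M_n - p| >= eps), using p(1-p) <= 1/4."""
    eps = Fraction(str(eps))
    bound = Fraction(1, 4) / (n * eps ** 2)
    return float(min(Fraction(1), bound))  # a probability bound above 1 is vacuous


def samples_needed(eps, delta):
    """Smallest n for which the bound 1 / (4 n eps^2) is at most delta."""
    eps, delta = Fraction(str(eps)), Fraction(str(delta))
    return math.ceil(Fraction(1, 4) / (delta * eps ** 2))


print(chebyshev_poll_bound(100, 0.1))   # 0.25
print(chebyshev_poll_bound(1000, 0.1))  # 0.025
print(samples_needed(0.1, 0.05))        # 500
```

Because Chebyshev uses only the variance, these sample sizes are conservative. The true probability of missing the margin is typically much smaller than the bound.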

3. The Central Limit Theorem

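Let $S_n = X_1 + \cdots + X_n$, where the $X_i$ are i.i.d. random variables with mean $\mu$ and finite variance $\sigma^2$.

If $n$ is large, then the distribution of $S_n$ is approximately normal with mean $n\mu$ and variance $n\sigma^2$.

That is, whatever the distribution of $X$, the sum of $n$ independent samples drawn from it is approximately normally distributed once $n$ is large enough.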

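The idea of the Central Limit Theorem

Whatever the distribution of $X$, it is enough to know the mean $\mu$ and variance $\sigma^2$ of each $X_i$ in $S_n = X_1 + X_2 + \cdots + X_n$.

Standardizing the approximately normal $S_n$ to $Z_n = \frac{S_n - n \cdot \mu}{\sigma\sqrt n}$, which is approximately standard normal, lets us compute probabilities about $S_n$ from the standard normal CDF $\Phi$.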
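The standardization can be illustrated with a small simulation. The sketch below assumes the $X_i$ are exponential with rate 1 (so $\mu = 1$ and $\sigma = 1$), a clearly non-normal distribution chosen only as an example. The helper names `phi` and `empirical_cdf_of_z` are hypothetical. It compares the empirical frequency of $Z_n \le z$ with $\Phi(z)$.

```python
import math
import random


def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def empirical_cdf_of_z(n, z, trials, rng):
    """Estimate P(Z_n <= z) where Z_n standardizes a sum of n Exponential(1) draws."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(1)
    hits = 0
    for _ in range(trials):
        s_n = sum(rng.expovariate(1.0) for _ in range(n))
        z_n = (s_n - n * mu) / (sigma * math.sqrt(n))
        if z_n <= z:
            hits += 1
    return hits / trials


rng = random.Random(0)  # seeded generator for reproducibility
for z in (-1.0, 0.0, 1.0):
    est = empirical_cdf_of_z(30, z, 5_000, rng)
    print(f"z={z:+.1f}: empirical {est:.4f} vs Phi(z) {phi(z):.4f}")
```

For a skewed distribution such as this one, the agreement is only approximate at moderate $n$ and improves as $n$ increases.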

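3.1. Example of the Central Limit Theorem

Let $X$ be the outcome of a fair die. Find the probability that the sum of 100 rolls is at most 360.

The mean of $X$ is $3.5$ and the variance of $X$ is $\frac{35}{12} \approx 2.92$.

$P(S_{100} \le 360) = P\left(Z_{100} \le \frac{360 - 3.5 \cdot 100}{\sqrt{2.92 \cdot 100}}\right) \approx P(Z_{100} \le 0.58520) \approx \Phi(0.58520) = 0.72079$

The probability that the sum is at most 360 is approximately 72.079%.

Even with an unfair die, $S_n$ is still approximately normal for large $n$, so the probability can be computed just as easily from that die's mean and variance.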
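The dice calculation can be reproduced and compared against the exact answer. In the sketch below, `clt_prob_sum_le` applies the normal approximation and `exact_prob_dice_sum_le` counts outcomes by repeated convolution. Both names are hypothetical helpers introduced only for this illustration.

```python
import math


def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def clt_prob_sum_le(k, n, mu, var):
    """Normal approximation to P(S_n <= k) for i.i.d. terms with mean mu, variance var."""
    return phi((k - n * mu) / math.sqrt(n * var))


def exact_prob_dice_sum_le(k, n, sides=6):
    """Exact P(sum of n fair dice <= k), obtained by counting outcomes."""
    ways = {0: 1}  # ways[s] = number of roll sequences so far with sum s
    for _ in range(n):
        nxt = {}
        for s, w in ways.items():
            for face in range(1, sides + 1):
                nxt[s + face] = nxt.get(s + face, 0) + w
        ways = nxt
    favorable = sum(w for s, w in ways.items() if s <= k)
    return favorable / sides ** n


approx = clt_prob_sum_le(360, 100, 3.5, 35 / 12)
exact = exact_prob_dice_sum_le(360, 100)
print(f"CLT approximation: {approx:.5f}")
print(f"Exact probability: {exact:.5f}")
```

The two values differ slightly because the sum of dice is discrete. Using $360.5$ instead of $360$ in the approximation (a continuity correction) usually narrows the gap.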

License: Attribution, NonCommercial, NoDerivatives
