Empirical distribution function

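In statistics, the empirical distribution function is the distribution function associated with the empirical measure of a sample. This cumulative distribution function is a step function that jumps by $1/n$ at each of the $n$ data points. For any given value of the measured variable, the function gives the fraction of the observed sample that is less than or equal to that value.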
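The description above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the name `ecdf` and the example sample are assumptions made for this sketch.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function t -> fraction of sample values that are <= t."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def f_hat(t):
        # bisect_right returns how many of the sorted values are <= t
        return bisect_right(ordered, t) / n

    return f_hat


# Hypothetical sample with one repeated value
f_hat = ecdf([3.0, 1.0, 2.0, 2.0])
print(f_hat(0.5))  # 0.0
print(f_hat(1.0))  # 0.25
print(f_hat(2.0))  # 0.75
print(f_hat(10))   # 1.0
```

The value 2.0 occurs twice in this sample, so the function jumps by $2/n$ there: each data point contributes $1/n$, and coinciding points add up.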

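The empirical distribution function is an estimate of the cumulative distribution function that generated the sample. By the Glivenko–Cantelli theorem, it converges to that cumulative distribution function with probability 1.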
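The convergence can be observed numerically. The sketch below assumes, purely for illustration, that the sample comes from the uniform distribution on $[0, 1]$, so that $F(t) = t$; it computes the largest gap between the empirical and the true distribution function (the quantity known as the Kolmogorov–Smirnov statistic). The function name, the seed, and the sample sizes are choices made for this example.

```python
import random


def sup_distance_uniform(sample):
    """sup over t of |F_n(t) - t| for a sample assumed drawn from Uniform(0, 1)."""
    ordered = sorted(sample)
    n = len(ordered)
    worst = 0.0
    for i, x in enumerate(ordered, start=1):
        # F_n jumps from (i - 1)/n to i/n at x, so check both sides of the jump
        worst = max(worst, i / n - x, x - (i - 1) / n)
    return worst


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
for n in (10, 100, 1000, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, sup_distance_uniform(sample))
```

The printed distances should tend to shrink as $n$ grows. A single run only illustrates the theorem and does not prove it.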

Definition

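Let $(x_1, \dots, x_n)$ be independent, identically distributed real random variables with common cumulative distribution function $F(t)$. The empirical distribution function is then defined as ^[1]^[2]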

{\hat {F}}_{n}(t)={\frac {{\mbox{number of elements in the sample}}\leq t}{n}}={\frac {1}{n}}\sum _{i=1}^{n}\mathbf {1} _{x_{i}\leq t},

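where $\mathbf{1}_{A}$ is the indicator function of the event $A$. For a fixed $t$, the indicator $\mathbf{1}_{x_{i}\leq t}$ is a Bernoulli random variable with parameter $p = F(t)$. Hence $n\hat{F}_{n}(t)$ is a binomial random variable with mean $nF(t)$ and variance $nF(t)(1 - F(t))$. This implies that $\hat{F}_{n}(t)$ is an unbiased estimator of $F(t)$.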
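The step from the binomial distribution to unbiasedness can be written out explicitly. The following short derivation uses only linearity of expectation and the binomial mean and variance; the variance of $\hat{F}_{n}(t)$ is included as a direct consequence.

```latex
\operatorname{E}\bigl[\hat{F}_{n}(t)\bigr]
  = \frac{1}{n}\sum_{i=1}^{n}\operatorname{E}\bigl[\mathbf{1}_{x_{i}\leq t}\bigr]
  = \frac{1}{n}\cdot nF(t)
  = F(t),
\qquad
\operatorname{Var}\bigl[\hat{F}_{n}(t)\bigr]
  = \frac{1}{n^{2}}\operatorname{Var}\bigl[n\hat{F}_{n}(t)\bigr]
  = \frac{F(t)\bigl(1-F(t)\bigr)}{n}.
```

Since the variance tends to zero as $n$ grows, $\hat{F}_{n}(t)$ concentrates around $F(t)$ at each fixed $t$.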

However, some references instead define the empirical distribution function as ^[3]^[4]

{\hat {F}}_{n}(t)={\frac {1}{n+1}}\sum _{i=1}^{n}\mathbf {1} _{x_{i}\leq t}.
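To compare the two definitions side by side, here is a small sketch that evaluates both at the observed points. The helper `ecdf_at`, its `denominator_offset` parameter, and the sample are hypothetical names and data introduced for this example only.

```python
from bisect import bisect_right
from fractions import Fraction


def ecdf_at(sample, t, denominator_offset=0):
    """Number of sample values <= t, divided by n + denominator_offset."""
    ordered = sorted(sample)
    count = bisect_right(ordered, t)
    return Fraction(count, len(ordered) + denominator_offset)


sample = [1, 2, 3, 4]  # hypothetical data
for t in sample:
    standard = ecdf_at(sample, t)        # divides by n
    alternative = ecdf_at(sample, t, 1)  # divides by n + 1
    print(t, standard, alternative)
```

At the largest observation the first definition gives 1, while the second gives $n/(n+1)$, here $4/5$, so the second form never reaches 1 on the sample.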

References

[1] van der Vaart, A.W. (1998) Asymptotic Statistics. Cambridge University Press, p. 265. ISBN 0-521-78450-6.
[2] PlanetMath. Archived at the Internet Archive, May 9, 2013.
[3] Coles, S. (2001) An Introduction to Statistical Modeling of Extreme Values. Springer, p. 36, Definition 2.4. ISBN 978-1-4471-3675-0.
[4] Madsen, H.O., Krenk, S., Lind, S.C. (2006) Methods of Structural Safety. Dover Publications, pp. 148-149. ISBN 0486445976.
