\documentclass[全部作业]{subfiles}
\input{mysubpreamble}
\begin{document}
\setcounter{chapter}{7}
\setcounter{section}{3}
\section{Likelihood Ratio Tests and Goodness-of-Fit Tests}
\begin{enumerate}
\questionandanswer[3]{
Let $x_1,x_2, \cdots ,x_n$ be a sample from the exponential distribution $\operatorname{Exp}(\lambda_1)$ and let $y_1,y_2, \cdots ,y_m$ be a sample from the exponential distribution $\operatorname{Exp}(\lambda_2)$, with the two samples independent, where $\lambda_1,\lambda_2$ are unknown positive parameters.
}{}
\begin{enumerate}
\questionandanswerSolution[]{
Find the likelihood ratio test of $H_0:\lambda_1=\lambda_2 \quad\mathrm{vs}\quad H_1:\lambda_1\neq \lambda_2$;
}{
The parameter spaces are $\Theta_0 = \{ (\lambda_1,\lambda_2)\mid \lambda_1 = \lambda_2 >0 \}$ and $\Theta = \{ (\lambda_1,\lambda_2)\mid\lambda_1>0, \lambda_2>0 \}$. The maximum likelihood estimates over $\Theta$ are $\hat{\lambda}_1,\hat{\lambda}_2$, and the maximum likelihood estimate of the common rate over $\Theta_0$ is $\hat{\lambda}_0$:
$$
\hat{\lambda}_1 = \frac{n}{\sum_{i=1}^{n} x_i}, \quad \hat{\lambda}_2 = \frac{m}{\sum_{i=1}^{m} y_i}, \quad \hat{\lambda}_0=\frac{n+m}{\sum_{i=1}^{n} x_i + \sum_{i=1}^{m} y_i}
$$
so the likelihood ratio statistic is
$$
\Lambda = \frac{\left( \frac{n}{\sum_{i=1}^{n} x_i} \right) ^{n} \left( \frac{m}{\sum_{i=1}^{m} y_i} \right) ^{m}}{\left( \frac{n+m}{\sum_{i=1}^{n} x_i + \sum_{i=1}^{m} y_i} \right) ^{n+m}}
$$
and the test rejects $H_0$ when $\Lambda \geqslant c$.
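As a brief check of where $\Lambda$ comes from, here is a sketch that assumes the rate parametrization of $\operatorname{Exp}(\lambda)$ with density $\lambda e^{-\lambda x}$ (consistent with the estimates above); the symbol $L$ for the likelihood is notation introduced only for this step:
$$
\begin{aligned}
L(\lambda_1,\lambda_2) &= \lambda_1^{n}\lambda_2^{m}\exp \Bigl( -\lambda_1\sum_{i=1}^{n} x_i-\lambda_2\sum_{i=1}^{m} y_i \Bigr), \\
\sup_{\Theta} L &= \hat{\lambda}_1^{n}\hat{\lambda}_2^{m}e^{-(n+m)}, \qquad \sup_{\Theta_0} L = \hat{\lambda}_0^{n+m}e^{-(n+m)}
\end{aligned}
$$
The exponential factors cancel in the ratio $\sup_{\Theta} L / \sup_{\Theta_0} L$, which leaves only the powers of the estimated rates.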
}
\questionandanswerProof[]{
Show that the rejection region of the above test depends only on the ratio $\displaystyle \left. \sum_{i=1}^{n} x_i \middle/ \sum_{i=1}^{m} y_i \right.$;
}{
Write $T = \left. \sum_{i=1}^{n} x_i \middle/ \sum_{i=1}^{m} y_i \right.$. The statistic $\Lambda$ depends on the data only through $T$, and the rejection region of this test is
$$
\{ \Lambda \geqslant c \}= \left\{ T \leqslant c_1 \text{ or } T \geqslant c_2 \right\}
$$
for suitable constants $c_1<c_2$. This shows that the rejection region depends only on the ratio $\displaystyle \left. \sum_{i=1}^{n} x_i \middle/ \sum_{i=1}^{m} y_i \right.$.
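One way to see the two-sided form is sketched below, writing $T$ for the ratio $\left. \sum_{i=1}^{n} x_i \middle/ \sum_{i=1}^{m} y_i \right.$; the auxiliary function $g$ is introduced only for this step:
$$
\Lambda = \frac{n^{n}m^{m}}{(n+m)^{n+m}}\, g(T), \qquad g(t) = \frac{(1+t)^{n+m}}{t^{n}}, \qquad \frac{\mathrm{d}}{\mathrm{d}t}\ln g(t) = \frac{n+m}{1+t}-\frac{n}{t} = \frac{mt-n}{t(1+t)}
$$
So $g$ is decreasing on $(0, n/m)$ and increasing on $(n/m, \infty)$, and the event $\{ \Lambda\geqslant c \}$ is the union of a lower tail and an upper tail of $T$.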
}
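\questionandanswerSolution[]{
Find the distribution of the statistic $\displaystyle \left. \sum_{i=1}^{n} x_i \middle/ \sum_{i=1}^{m} y_i \right. $ under the null hypothesis.
}{
Since $\sum_{i=1}^{n} x_i \sim \operatorname{Ga}(n, \lambda_1)$ and $\sum_{i=1}^{m} y_i \sim \operatorname{Ga}(m, \lambda_2)$ are independent, we have $2\lambda_1\sum_{i=1}^{n} x_i \sim \chi^{2}(2n)$ and $2\lambda_2\sum_{i=1}^{m} y_i \sim \chi^{2}(2m)$. Under the null hypothesis $\lambda_1=\lambda_2$ the rates cancel, so
$$
\frac{m}{n}\cdot \left. \sum_{i=1}^{n} x_i \middle/ \sum_{i=1}^{m} y_i \right. = \frac{2\lambda_1\sum_{i=1}^{n} x_i / (2n)}{2\lambda_2\sum_{i=1}^{m} y_i / (2m)} \sim F(2n, 2m)
$$
that is, the ratio itself is distributed as $\frac{n}{m}F(2n,2m)$.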
}
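\end{enumerate}
\questionandanswerProof[4]{
Let $x_1,x_2, \cdots ,x_n$ be an i.i.d. sample from the normal population $N(\mu,\sigma^{2})$, where $\mu,\sigma^{2}$ are unknown. Show that the one-sided $t$ test of $H_0:\mu\leqslant \mu_0 \quad\mathrm{vs}\quad H_1:\mu>\mu_0$ is a likelihood ratio test (significance level $\alpha < \frac{1}{2}$).
}{
The likelihood ratio statistic is
$$
\Lambda = \frac{(2\pi \hat{\sigma}^{2})^{-\frac{n}{2}} \exp (-\frac{n}{2})}{(2\pi \hat{\sigma}_0^{2})^{-\frac{n}{2}}\exp (-\frac{n}{2})} = \left( \frac{\hat{\sigma}_0^{2}}{\hat{\sigma}^{2}} \right) ^{\frac{n}{2}}
$$
where $\hat{\sigma}^{2}$ and $\hat{\sigma}_0^{2}$ are the maximum likelihood estimates of $\sigma^{2}$ over the full parameter space and under $H_0$, respectively. The rejection region is $\displaystyle \{ \Lambda\geqslant c \}=\left\{ \frac{\sqrt{n}(\bar{x}-\mu_0)}{s}\geqslant t_{1-\alpha}(n-1) \right\} $, so the likelihood ratio test here coincides with the one-sided $t$ test.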
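The equality of the two regions can be sketched as follows, assuming the usual definitions $s^{2}=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^{2}$ and $t=\sqrt{n}(\bar{x}-\mu_0)/s$, which are not spelled out above. If $\bar{x}\leqslant \mu_0$, the unrestricted estimate of $\mu$ already satisfies $H_0$ and $\Lambda=1$. If $\bar{x}>\mu_0$, the restricted estimate of $\mu$ is $\mu_0$ and
$$
\hat{\sigma}^{2}=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^{2}, \qquad \hat{\sigma}_0^{2}=\frac{1}{n}\sum_{i=1}^{n}(x_i-\mu_0)^{2}=\hat{\sigma}^{2}+(\bar{x}-\mu_0)^{2}, \qquad \Lambda=\left( 1+\frac{t^{2}}{n-1} \right) ^{\frac{n}{2}}
$$
Since $\alpha<\frac{1}{2}$, a level-$\alpha$ rejection region must lie inside $\{ \bar{x}>\mu_0 \}$, where $\Lambda$ is increasing in $t>0$; hence $\{ \Lambda\geqslant c \}$ has the form $\{ t\geqslant c' \}$, and size $\alpha$ at $\mu=\mu_0$ gives $c'=t_{1-\alpha}(n-1)$.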
}
\questionandanswerSolution[6]{
A die is rolled 60 times, with the following results:
\begin{center}
\begin{tabular}{ccccccc}
\toprule
Face & 1 & 2 & 3 & 4 & 5 & 6 \\
\midrule
Count & 7 & 8 & 12 & 11 & 9 & 13 \\
\bottomrule
\end{tabular}
\end{center}
Test at significance level 0.05 whether the die is fair.
}{
This is a goodness-of-fit test. Under the null hypothesis each face has probability $\frac{1}{6}$, so every expected count is $60\times \frac{1}{6}=10$:
$$
\chi^{2} = \sum_{i=1}^{6} \frac{(n_i - 10)^{2}}{10}=2.8, \quad W=\{ \chi^{2}\geqslant \chi^{2}_{0.95}(5)=11.0705 \}
$$
Since $2.8<11.0705$, we do not reject the null hypothesis; the data are consistent with a fair die.
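Spelling out the arithmetic with the observed counts from the table, as a quick check:
$$
\chi^{2} = \frac{(-3)^{2}+(-2)^{2}+2^{2}+1^{2}+(-1)^{2}+3^{2}}{10} = \frac{28}{10} = 2.8
$$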
}
\questionandanswerSolution[9]{
A lifetime test on 300 bulbs drawn from a batch gave the following results:
\begin{center}
\begin{tabular}{ccccc}
\toprule
Lifetime (h) & $<100$ & $[100,200)$ & $[200,300)$ & $\geqslant 300$ \\
\midrule
Number of bulbs & 121 & 78 & 43 & 58 \\
\bottomrule
\end{tabular}
\end{center}
At significance level 0.05, can the bulb lifetime be regarded as following the exponential distribution $\operatorname{Exp}(0.005)$?
}{
This is again a goodness-of-fit test. The lifetimes are grouped into four intervals. Since the survival function of the exponential distribution is $P(X>t)=e^{-\lambda t}$, for $\lambda=0.005$ the interval probabilities $p$ (in the order of the table) and the expected counts $np$ are
$$
p=\left( 1-e^{-100 \lambda},\ e^{-100 \lambda}-e^{-200 \lambda},\ e^{-200 \lambda}-e^{-300 \lambda},\ e^{-300 \lambda} \right) \approx (0.3935, 0.2387, 0.1447, 0.2231)
$$
$$
np = 300 \times (0.3935, 0.2387, 0.1447, 0.2231) \approx (118.05, 71.61, 43.41, 66.93)
$$
Hence, with these rounded values, $\chi^{2} = \sum_{i=1}^{4} \frac{(n_i-np_i)^{2}}{np_i} \approx 1.8393$. No parameter is estimated, so the rejection region is $\{ \chi^{2}\geqslant \chi^{2}_{0.95}(3)\approx 7.8147 \}$. We therefore cannot reject the null hypothesis, and the bulb lifetime can be regarded as following the exponential distribution $\operatorname{Exp}(0.005)$.

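As a worked check of the statistic, using the observed counts from the table and the rounded expected counts $118.05$, $71.61$, $43.41$, $66.93$ for the four intervals in table order:
$$
\chi^{2} \approx \frac{(121-118.05)^{2}}{118.05}+\frac{(78-71.61)^{2}}{71.61}+\frac{(43-43.41)^{2}}{43.41}+\frac{(58-66.93)^{2}}{66.93} \approx 0.0737+0.5702+0.0039+1.1915 = 1.8393
$$
Carrying more digits in the expected counts gives roughly $1.845$; either way the value is far below the critical value and the conclusion is unchanged.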
}
\questionandanswerSolution[10]{
The table below summarizes the number of rainstorms per year (May to September) in Shanghai, based on the 63 years with observations out of the 81 years from 1875 to 1955:
\begin{center}
\begin{tabular}{ccccccccccc}
\toprule
$i$ & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & $\geqslant 9$ \\
\midrule
$n_i$ & 4 & 8 & 14 & 19 & 10 & 4 & 2 & 1 & 1 & 0 \\
\bottomrule
\end{tabular}
\end{center}
Test whether the annual number of rainstorms follows a Poisson distribution ($\alpha=0.05$).
}{
For the Poisson distribution the moment estimate and the maximum likelihood estimate of the parameter coincide, so it suffices to compute the sample mean $\sum_{i=0}^{9} ( n_i \times i )/ 63 = 180/63 \approx 2.8571$, which is $\hat{\lambda}$.

To ensure that every class contains at least 5 observations, merge the classes with $i\leqslant 1$ and the classes with $i\geqslant 5$, which leaves 5 classes.

Then $\chi^{2}=\sum_{k=1}^{5} (n_k - n \hat{p}_{k})^{2} / (n \hat{p}_{k})\approx 2.5065$. One parameter is estimated, so the rejection region is $W=\{ \chi^{2}\geqslant \chi^{2}_{0.95}(5-1-1)\approx 7.8147 \}$. We therefore cannot reject the null hypothesis, and the annual number of rainstorms can be regarded as Poisson distributed.

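As a worked check, the table below lists the merged classes $\{0,1\}$, $2$, $3$, $4$, $\geqslant 5$ with their estimated expected counts. It is a sketch that assumes $\hat{\lambda}=180/63$ and Poisson probabilities $\hat{p}_k$ computed from $e^{-\hat{\lambda}}\hat{\lambda}^{i}/i!$, with the last class taken as the full upper tail; values are rounded.
\begin{center}
\begin{tabular}{cccc}
\toprule
Class & $n_k$ & $n\hat{p}_k$ & $(n_k-n\hat{p}_k)^{2}/(n\hat{p}_k)$ \\
\midrule
$i\leqslant 1$ & 12 & 13.96 & 0.274 \\
$i=2$ & 14 & 14.77 & 0.040 \\
$i=3$ & 19 & 14.07 & 1.731 \\
$i=4$ & 10 & 10.05 & 0.000 \\
$i\geqslant 5$ & 8 & 10.16 & 0.461 \\
\bottomrule
\end{tabular}
\end{center}
The contributions sum to about $2.51$; intermediate rounding can shift the last digits slightly, and the value stays well below $7.8147$.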
}
\questionandanswerProof[12]{
Suppose $n$ items are classified into four categories according to whether they have attributes $A$ and $B$, forming a $2\times 2$ contingency table:
\begin{center}
\begin{tabular}{c|cc|c}
\toprule
$ $ & $B$ & $\bar{B}$ & Total \\
\hline
$A$ & $a$ & $b$ & $a+b$ \\
$\bar{A}$ & $c$ & $d$ & $c+d$ \\
\hline
Total & $a+c$ & $b+d$ & $n$ \\
\bottomrule
\end{tabular}
\end{center}
where $n=a+b+c+d$. Show that the $\chi^{2}$ statistic of the independence test for this table can be written as
$$
\chi^{2} = \frac{n(ad-bc)^{2}}{(a+b)(c+d)(a+c)(b+d)}
$$
}{
Under independence, the maximum likelihood estimate of the probability of the cell with count $a$ is $\frac{(a+c)(a+b)}{n^{2}}$, so its estimated expected count is $\frac{(a+b)(a+c)}{n}$; the other cells are handled in the same way. The test statistic is therefore
$$
\begin{aligned}
&\chi^{2} = \frac{\left( a-\frac{(a+b)(a+c)}{n} \right) ^{2}}{\frac{(a+b)(a+c)}{n}} + \frac{\left( b - \frac{(a+b)(b+d)}{n} \right)^{2} }{\frac{(a+b)(b+d)}{n}} + \frac{\left( c-\frac{(a+c)(c+d)}{n} \right) ^{2}}{\frac{(a+c)(c+d)}{n}} + \frac{\left( d-\frac{(c+d)(b+d)}{n} \right) ^{2}}{\frac{(c+d)(b+d)}{n}} \\
&= \frac{\begin{split}
(a + b) (a + c) (d n - (b + d) (c + d))^{2} + (a + b) (b + d) (c n - (a + c) (c + d))^{2}\\ + (a + c) (c + d) (b n - (a + b) (b + d))^{2} + (b + d) (c + d) (a n - (a + b) (a + c))^{2}
\end{split}}{n (a + b) (a + c) (b + d) (c + d)} \\
&= \frac{n(ad-bc)^{2}}{(a + b) (a + c) (b + d) (c + d)}
\end{aligned}
$$
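The last simplification can be checked as follows, as a sketch that uses only $n=a+b+c+d$. Each bracket in the numerator reduces to $\pm(ad-bc)$:
$$
an-(a+b)(a+c) = dn-(b+d)(c+d) = ad-bc, \qquad bn-(a+b)(b+d) = cn-(a+c)(c+d) = -(ad-bc)
$$
so the numerator equals $(ad-bc)^{2}$ times
$$
(a+b)(a+c)+(a+b)(b+d)+(a+c)(c+d)+(b+d)(c+d) = (a+b)\,n+(c+d)\,n = n^{2}
$$
and dividing $n^{2}(ad-bc)^{2}$ by $n(a+b)(a+c)(b+d)(c+d)$ gives the stated form.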
}
\questionandanswerSolution[13]{
In a study of the effect of a new measure for preventing and treating white scour in piglets, the following data were obtained:
\begin{center}
\begin{tabular}{c|cc|c|c}
\toprule
 & Survived & Died & Total & Mortality \\
\hline
Control & 114 & 36 & 150 & 24\% \\
New measure & 132 & 18 & 150 & 12\% \\
\hline
Total & 246 & 54 & 300 & 18\% \\
\bottomrule
\end{tabular}
\end{center}
Does the new measure have a significant effect on this disease ($\alpha=0.05$)?
}{
The null hypothesis is that the new measure has no effect on this disease, i.e. survival is independent of the treatment.
Using the statistic from Exercise 12,
$$
\chi^{2} = \frac{300\times (114\times 18 - 132\times 36)^{2}}{(114+36)(36+18)(18+132)(132+114)} = \frac{300}{41} \approx 7.3171
$$
Here $r=c=2$, so the degrees of freedom are $(r-1)(c-1)=1$ and $\chi^{2}_{0.95}(1)=3.8415$. The rejection region is $\{ \chi^{2}\geqslant 3.8415 \}$; since $7.3171\geqslant 3.8415$, we reject the null hypothesis and conclude that the new measure has a significant effect on this disease.

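For reference, the intermediate arithmetic behind the value $\frac{300}{41}$ can be laid out as:
$$
114\times 18-132\times 36 = 2052-4752 = -2700, \qquad \chi^{2} = \frac{300\times 2700^{2}}{150\times 54\times 150\times 246} = \frac{2.187\times 10^{9}}{2.9889\times 10^{8}} = \frac{300}{41}
$$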
}
\end{enumerate}
\end{document}