Chi-Squared GOF Test for finitely many values

Test Statistic $T_{n} = \sum_{j = 1}^{s} \frac{( N _{j} - n p _{j} ) ^{2}}{n p _{j}}$

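data sequence size $n$
number of possible values $s$
frequencies $n_{j}$ of Item Expression $ξ_{j}$ (realizations of the Random Variable $N_{j}$ ), $j = 1, \dots, s$
probability $p_{j}$ of Item Expression $ξ_{j}$ under $G$
Some data sequence from an unknown CDF $F$ .
A known discrete CDF $G$ .
Null hypothesis $H_{0}$ : $F = G$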
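As a small worked example of the test statistic, the following sketch computes it for made-up data. The probabilities `p` and the observed frequencies `counts` are hypothetical and chosen only for illustration.

```r
# Hypothetical example: s = 3 possible values with known probabilities under G
p <- c(0.5, 0.3, 0.2)       # p_j, must sum to 1
counts <- c(55, 25, 20)     # observed frequencies n_j (made-up data)
n <- sum(counts)            # data sequence size, here n = 100

expected <- n * p           # expected frequencies n * p_j: 50, 30, 20

# Test statistic: sum over all values of (observed - expected)^2 / expected
t_obs <- sum((counts - expected)^2 / expected)
t_obs                       # 25/50 + 25/30 + 0 = 4/3
```

Each class contributes its squared deviation from the expected frequency, scaled by that expected frequency, so deviations in rare classes weigh more heavily.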

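Under $H_{0}$ the Test Statistic tends to be small, so large values speak against $H_{0}$ . That's why the p-Value of an observed value $t$ is given by $p (t) = P (T_{n} \geq t)$ , with the probability computed under $H_{0}$ .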

We can simulate the distribution of the test statistic under $H_{0}$ by drawing $n$ times with replacement from an urn whose balls show value $ξ_{j}$ with probability $p_{j}$ .
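The simulation described above could be sketched as follows. The probabilities, the observed frequencies, the number of repetitions `B` and the helper names `chisq_stat` and `simulate_stat` are all assumptions made for this illustration.

```r
set.seed(1)                     # only for reproducibility of the sketch

p <- c(0.5, 0.3, 0.2)           # hypothetical p_j under H0
counts_obs <- c(55, 25, 20)     # hypothetical observed frequencies
n <- sum(counts_obs)

# Test statistic for a vector of frequencies
chisq_stat <- function(counts, n, p) sum((counts - n * p)^2 / (n * p))

# One simulated value of the statistic: draw n times with replacement,
# count how often each value occurs, then evaluate the statistic
simulate_stat <- function(n, p) {
  draws <- sample(seq_along(p), size = n, replace = TRUE, prob = p)
  counts <- tabulate(draws, nbins = length(p))
  chisq_stat(counts, n, p)
}

t_obs <- chisq_stat(counts_obs, n, p)
B <- 10000                      # number of simulated data sequences
t_sim <- replicate(B, simulate_stat(n, p))

# Simulated p-value: share of simulated statistics at least as large as t_obs
p_sim <- mean(t_sim >= t_obs)
p_sim
```

The accuracy of `p_sim` depends on `B`. A larger `B` reduces the Monte Carlo error at the cost of runtime.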

Alternatively, use the fact that for $n \to \infty$ the CDF of the test statistic converges, under $H_{0}$ , to the CDF of the Chi-Squared Distribution with $s - 1$ Degrees of Freedom. This approximation is only good if the Class Condition is met. Otherwise use the simulation above.
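A minimal sketch of the asymptotic p-value, again with hypothetical probabilities and frequencies. The check `n * p >= 5` is a common rule of thumb and is used here only as a stand-in. It is an assumption and may not match the exact Class Condition meant in these notes.

```r
p <- c(0.5, 0.3, 0.2)           # hypothetical p_j under H0
counts <- c(55, 25, 20)         # hypothetical observed frequencies
n <- sum(counts)
s <- length(p)                  # number of possible values

t_obs <- sum((counts - n * p)^2 / (n * p))

# Assumed stand-in for the Class Condition: all expected frequencies >= 5
class_ok <- all(n * p >= 5)

# Upper tail of the chi-squared distribution with s - 1 degrees of freedom
p_asym <- pchisq(t_obs, df = s - 1, lower.tail = FALSE)
p_asym
```

For fully specified probabilities, `chisq.test(counts, p = p)` from base R reports the same statistic and p-value, and its `simulate.p.value = TRUE` option switches to a Monte Carlo p-value instead.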

Example Code

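Test whether the data follow a Binomial distribution with $N$ trials and unknown success probability. The unknown parameter is estimated with the Plug-In-Method, which costs one additional Degree of Freedom.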

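N = 6    # number of trials of the hypothesised Binomial(N, p); outcomes are 0..N
n = 100  # data sequence size

# Data with unknown probability
# (drawn from Binomial(5, 0.8) here, so H0 "Binomial(6, p)" is not exactly true)
x = rbinom(n, 5, 0.8)

# Plug-in estimate of the success probability (E[X] = N * p under H0)
pt = mean(x) / N

# Use the estimated probability to get the hypothesised probabilities of all N + 1 outcomes
p = dbinom(0:N, N, pt)

# Count the absolute frequencies of the outcomes 0..N in our data
tab = tabulate(x + 1, nbins = N + 1)

# Calculate the test statistic: scaled squared differences between observed and expected frequencies
# (named t_stat instead of T, because T is shorthand for TRUE in R)
t_stat = sum((tab - n * p)^2 / (n * p))
t_stat

# Use the CDF of the chi-squared distribution to determine the p-value of the test statistic.
# s = N + 1 classes and one estimated parameter give df = (N + 1) - 1 - 1 = N - 1
pval = pchisq(t_stat, df = N - 1, lower.tail = FALSE)
pval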
