Kolmogorov Smirnov Test

Should be used when the data come from a continuous distribution.

What can we test with this test?

Goodness of fit (GOF) between two samples
Whether one of two samples is stochastically larger or smaller than the other sample
GOF of one sample to a given CDF
Using the Lilliefors Test of Normality we can test whether a sample comes from the family of normal distributions
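The Lilliefors variant can be sketched in base R by Monte Carlo: once mean and standard deviation are estimated from the sample, the Kolmogorov distribution no longer applies, so the null distribution is simulated here. The helper names `lillie_stat` and `lilliefors_mc` and the number of replications `B` are assumptions made for this illustration only.

```r
# Monte Carlo sketch of the Lilliefors test of normality (hypothetical helpers).
# Statistic: KS distance between the ECDF and the normal CDF with estimated parameters.
lillie_stat <- function(x) {
  n <- length(x)
  z <- sort((x - mean(x)) / sd(x))   # standardise with the estimated parameters
  p <- pnorm(z)
  max(seq_len(n) / n - p, p - (seq_len(n) - 1) / n)
}

lilliefors_mc <- function(x, B = 2000) {
  d_obs <- lillie_stat(x)
  # The statistic is invariant under shifts and rescaling, so standard normal
  # samples of the same size give its distribution under H0.
  d_sim <- replicate(B, lillie_stat(rnorm(length(x))))
  list(statistic = d_obs, p.value = mean(d_sim >= d_obs))
}

set.seed(1)
lilliefors_mc(rnorm(50, mean = 3, sd = 2))$p.value   # normal data
lilliefors_mc(rexp(50))$p.value                      # clearly non-normal data
```

A small p-value speaks against normality. The simulation keeps the sketch free of table lookups and extra packages.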

GOF

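The Null Hypothesis is $H_{0} : F = G$, where $G$ is the true CDF of the sample and $F$ is the hypothesized CDF.

The Test Statistic is motivated by the theorem of Glivenko-Cantelli, which states that the ECDF $F_{n}$ converges uniformly to the true CDF almost surely:

$D_{n} (X_{1}, \dots, X_{n}) := \sup_{t \in \mathbb{R}} \left| F_{n} (t; X_{1}, \dots, X_{n}) - F (t) \right|$

Under $H_{0}$ the test statistic should be small.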

We can implement a supremum function in R like this:

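# Supremum distance between the ECDF of x and a continuous CDF F
sup = function(x, F){
  fn = ecdf(x)
  xi = c(knots(fn),Inf)    # jump positions of the ECDF, padded on the right
  eta = c(-Inf, knots(fn)) # previous jump positions: fn(eta) is the left limit of fn at xi
  max(abs(fn(xi) - F(xi)), abs(fn(eta) - F(xi)))
}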

where knots returns the jump positions of a step function. One can then use the function to get the supremum of the differences between the ECDF of a sample $x$ and a CDF like this:

sup(x, function(t) punif(t))

For the rescaled Test Statistic (multiplied by $\sqrt{n}$) we have, under $H_{0}$ and for continuous $F$: $\lim_{n \to \infty} P (\sqrt{n} D_{n} (X_{1}, \dots, X_{n}) \leq t) = K (t)$, where $K (t)$ is the CDF of the Kolmogorov Distribution.
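The limit can be checked numerically. The sketch below evaluates the Kolmogorov CDF through its alternating series $K(t) = 1 - 2 \sum_{k \geq 1} (-1)^{k-1} e^{-2 k^{2} t^{2}}$ and compares it with simulated values of the statistic rescaled by $\sqrt{n}$. The names `pkolm` and `ks_dist`, the truncation `kmax`, and the sample sizes are assumptions chosen for illustration.

```r
# Kolmogorov CDF via its alternating series, truncated at kmax terms.
# The truncation is adequate for moderate t; the series converges slowly near 0.
pkolm <- function(t, kmax = 100) {
  k <- seq_len(kmax)
  sapply(t, function(s) {
    if (s <= 0) return(0)
    1 - 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * s^2))
  })
}

# KS distance of a sample from the uniform CDF on (0, 1)
ks_dist <- function(x) {
  n <- length(x)
  u <- sort(x)
  max(seq_len(n) / n - u, u - (seq_len(n) - 1) / n)
}

set.seed(42)
n <- 500
stat <- replicate(2000, sqrt(n) * ks_dist(runif(n)))

# Simulated probabilities next to the limiting values
t_grid <- c(0.6, 0.8, 1.0, 1.36)
cbind(t = t_grid, simulated = ecdf(stat)(t_grid), limit = pkolm(t_grid))
```

The two columns should agree up to simulation noise and a small finite-sample effect.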

We can also test the one-sided Null Hypothesis $H_{0} : G \leq F$, where $G$ is the true CDF of the sample and $F$ is the hypothesized CDF, with the Test Statistic $D_{n}^{+} := \sup_{t \in \mathbb{R}} (F_{n} (t; X_{1}, \dots, X_{n}) - F (t))$; large values speak against $H_{0}$. If $F$ is continuous, the distribution of $D_{n}^{+}$ under $F = G$ does not depend on $F$, so it can be simulated with any continuous distribution.

For large $n$ the distribution of $\sqrt{n} D_{n}^{+}$ has an explicit approximate form, a variation of the Kolmogorov Distribution: $K^{+} (t) = (1 - e^{- 2 t^{2}}) 1_{(0, \infty)} (t)$
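Both claims, the independence of the underlying continuous distribution and the limiting form $K^{+}$ for the statistic rescaled by $\sqrt{n}$, can be illustrated by simulation. The helper names `d_plus` and `k_plus` and the choice of uniform and exponential samples are assumptions for this sketch.

```r
# One-sided statistic D_n^+ = sup_t (F_n(t) - F(t)); for a continuous CDF the
# supremum is attained at a jump of the ECDF.
d_plus <- function(x, cdf) {
  n <- length(x)
  max(seq_len(n) / n - cdf(sort(x)))
}

# Limiting CDF K^+ of sqrt(n) * D_n^+
k_plus <- function(t) ifelse(t > 0, 1 - exp(-2 * t^2), 0)

set.seed(7)
n <- 400
# The same statistic simulated under two different continuous distributions
s_unif <- replicate(2000, sqrt(n) * d_plus(runif(n), punif))
s_exp  <- replicate(2000, sqrt(n) * d_plus(rexp(n), pexp))

t_grid <- c(0.5, 1.0, 1.5)
cbind(t = t_grid,
      uniform = ecdf(s_unif)(t_grid),
      exponential = ecdf(s_exp)(t_grid),
      limit = k_plus(t_grid))
```

The uniform and exponential columns should differ only by simulation noise, and both should be close to the limit column.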

How to perform the test

Formulate the Null Hypothesis
Fix the probability $\alpha$ of a Type 1 Error
Determine the $(1 - \alpha)$ Quantile $K_{1 - \alpha}$ of the Kolmogorov Distribution
Perform the experiment and calculate the Test Statistic
Accept $H_{0}$ if $D_{n} (x_{1}, \dots, x_{n}) \leq \frac{K_{1 - \alpha}}{\sqrt{n}}$
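The steps above can be sketched in R for the hypothesis that a sample comes from the standard normal distribution, using the asymptotic critical value $K_{1 - \alpha} / \sqrt{n}$. The level `alpha`, the simulated sample, and the helper `pkolm` (a truncated series for the Kolmogorov CDF) are assumptions made for illustration.

```r
# Kolmogorov CDF for t > 0 via its alternating series (truncated, hypothetical helper)
pkolm <- function(t, kmax = 100) {
  k <- seq_len(kmax)
  1 - 2 * sum((-1)^(k - 1) * exp(-2 * k^2 * t^2))
}

alpha <- 0.05                       # step 2: probability of a type 1 error

# step 3: (1 - alpha) quantile of the Kolmogorov distribution, found numerically
k_quant <- uniroot(function(t) pkolm(t) - (1 - alpha), c(0.3, 3))$root

set.seed(123)
x <- rnorm(200)                     # step 4: the "experiment"
n <- length(x)
p <- pnorm(sort(x))                 # hypothesized CDF at the ordered sample
d_n <- max(seq_len(n) / n - p, p - (seq_len(n) - 1) / n)

# step 5: decision based on the asymptotic critical value
reject <- d_n > k_quant / sqrt(n)
c(statistic = d_n, critical = k_quant / sqrt(n), reject = reject)
```

For small samples the asymptotic quantile is only an approximation; `ks.test` uses exact p-values where it can.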

Test of Equality of two continuous distributions

$H_{0} : F = G$

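Test Statistic

$D_{n, m} := \sup_{t \in \mathbb{R}} \left| F_{n} (t; X_{1}, \dots, X_{n}) - G_{m} (t; Y_{1}, \dots, Y_{m}) \right|$, where $F_{n}$ and $G_{m}$ are the ECDFs of the two samples.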
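The two-sample statistic, the largest absolute difference between the two empirical CDFs, can be computed by hand and compared with the value reported by `ks.test`. The function name `ks_two_sample` is an assumption for this sketch.

```r
# Two-sample KS statistic: largest absolute difference between the two ECDFs.
# Both ECDFs are step functions that only change at sample points, so it
# suffices to evaluate them on the pooled sample.
ks_two_sample <- function(x, y) {
  pooled <- sort(c(x, y))
  max(abs(ecdf(x)(pooled) - ecdf(y)(pooled)))
}

set.seed(1)
x <- rnorm(100)
y <- rnorm(100, 0.2, 1.1)

ks_two_sample(x, y)
unname(ks.test(x, y)$statistic)   # built-in value for comparison
```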

In R

Test if equal

x = rnorm(100)
y = rnorm(100, 0.2, 1.1)
 
ks.test(x=x, y=y, alternative="two.sided")

Test if equal to some CDF

x = rnorm(100)
 
ks.test(x=x, "pnorm", alternative="two.sided")

Test if less or greater:

x = rnorm(100)
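y = rnorm(100, -1, 1)  # shifted to the left of x; the standard deviation must be positive
 
# "less": the alternative is that the CDF of x lies below the CDF of y
ks.test(x=x, y=y, alternative="less")
# "greater": the alternative is that the CDF of x lies above the CDF of y
ks.test(x=x, y=y, alternative="greater")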
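The object returned by `ks.test` can also be inspected programmatically, which helps when the decision is needed inside a script. A minimal sketch, with the sample parameters and the 5% level chosen only for illustration:

```r
set.seed(2)
x <- rnorm(100)
y <- rnorm(100, mean = 0.5)

res <- ks.test(x, y)
res$statistic          # observed two-sample statistic D
res$p.value            # p-value of the test
res$p.value < 0.05     # TRUE means H0 is rejected at the 5% level
```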
