The geometric model of visual meteor magnitudes is a commonly used statistical approach to describe the magnitude distribution of a meteor shower. The observable magnitude distribution of meteors is then \[ {\displaystyle P[M = m] \sim f(m) \, \mathrm r^{-m}} \,\mathrm{,} \] where \(m \ge -0.5\) denotes the difference between the limiting magnitude and the meteor magnitude, and \(f(m)\) is the perception probability function.
The estimation of the population index r, briefly called the r-value, is a common task in the evaluation of meteor magnitudes. Here we demonstrate two methods for unbiased estimation of this parameter.
First, we obtain some magnitude observations from the example data set, which also includes the limiting magnitude.
observations <- with(PER_2015_magn$observations, {
idx <- !is.na(lim.magn) & sl.start > 135.81 & sl.end < 135.87
data.frame(
magn.id = magn.id[idx],
lim.magn = lim.magn[idx]
)
})
head(observations, 5) # Example values
magn.id | lim.magn |
---|---|
225413 | 5.30 |
225432 | 5.95 |
225438 | 6.01 |
225449 | 6.48 |
225496 | 5.50 |
Next, the observed meteor magnitudes are matched with the corresponding observations. This is necessary as we need the limiting magnitudes of the observations to determine the r-value.
Using
magnitudes <- with(new.env(), {
magnitudes <- merge(
observations,
as.data.frame(PER_2015_magn$magnitudes),
by = 'magn.id'
)
magnitudes$magn <- as.integer(as.vector(magnitudes$magn))
subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5)
})
head(magnitudes, 5) # Example values
we obtain a data frame with the absolute observed frequencies
Freq
for each observation of a magnitude class. The
expression
subset(magnitudes, (magnitudes$lim.magn - magnitudes$magn) > -0.5
ensures that meteors fainter than the limiting magnitude are not used if
they exist.
magn.id | lim.magn | magn | Freq | |
---|---|---|---|---|
9 | 225413 | 5.30 | 4 | 1.0 |
11 | 225413 | 5.30 | 1 | 2.0 |
14 | 225413 | 5.30 | 3 | 3.0 |
15 | 225432 | 5.95 | 4 | 2.0 |
17 | 225432 | 5.95 | 3 | 1.5 |
This data frame contains a total of 97 meteors. This is a sufficiently large number to estimate the r-value.
The maximum likelihood method can be used to estimate the r-value in
an unbiased manner. For this, the function dvmgeom()
is
needed, which returns the probability density of the observable meteor
magnitudes when the r-value and the limiting magnitudes are known.
The following algorithm estimates the r-value by maximizing the
likelihood with the optim()
function. The function
ll()
returns the negative log-likelihood, as
optim()
identifies a minimum.
# maximum likelihood estimation (MLE) of r
result <- with(magnitudes, {
# log likelihood function
ll <- function(r) -sum(Freq * dvmgeom(magn, lim.magn, r, log=TRUE))
r.start <- 2.0 # starting value
r.lower <- 1.2 # lowest expected value
r.upper <- 4.0 # highest expected value
# find minimum
optim(r.start, ll, method='Brent', lower=r.lower, upper=r.upper, hessian=TRUE)
})
This gives the expected value and the variance of the r-value:
print(result$par) # mean of r
#> [1] 2.344528
print(1/result$hessian[1][1]) # variance of r
#> [1] 0.01718636
We can additionally visualize the likelihood function here.
with(new.env(), {
data.plot <- data.frame(r = seq(2.0, 2.8, 0.01))
data.plot$ll <- mapply(function(r){
with(magnitudes, {
# log likelihood function
sum(Freq * dvmgeom(magn, lim.magn, r, log = TRUE))
})
}, data.plot$r)
data.plot$l <- exp(data.plot$ll - max(data.plot$ll))
data.plot$l <- data.plot$l / sum(data.plot$l)
plot(data.plot$r, data.plot$l,
type = "l",
col = "blue",
xlab = "r",
ylab = "likelihood"
)
abline(v = result$par, col = "red", lwd = 1)
})
The likelihood function is approximately a normal distribution in this case. This is important in this context because the variance of the estimated r-value is derived from the curvature (the second derivative at the maximum) of the log-likelihood function.
Estimation based on the maximum likelihood principle is
computationally demanding. As an alternative, a variance-stabilizing
transformation can be applied. This transformation maps meteor
magnitudes onto a different scale, yielding a distribution whose
variance no longer depends on the parameter r
.
The variance-stabilizing transformation has the following additional advantages:
n
, but not on the true r-value,The resulting procedure is straightforward: it suffices to compute
the mean of the transformed meteor magnitudes, from which an estimate of
the parameter r
is obtained.
tm.mean <- with(magnitudes, {
N <- sum(Freq)
tm <- vmgeomVstFromMagn(magn, lim.magn)
tm.mean <- sum(Freq * tm)/N
tm.var <- sum(Freq * (tm - tm.mean)^2)/(N-1)
tm.mean.var <- tm.var / N
list('val' = tm.mean, 'var' = tm.mean.var, 'sd' = sqrt(tm.mean.var))
})
Thus, one obtains the mean and the variance of the mean of
tm
.
print(paste('tm mean:', tm.mean$val))
#> [1] "tm mean: 4.65971707325243"
print(paste('tm var:', tm.mean$var))
#> [1] "tm var: 0.00639641463423595"
Using the bootstrap method, it can be assessed whether the mean is normally distributed.
# Bootstrapping Method
tm.means <- with(magnitudes, {
N <- sum(Freq)
tm <- vmgeomVstFromMagn(magn, lim.magn)
replicate(50000, {
mean(sample(tm, size = N, replace = TRUE, prob = Freq))
})
})
The graphical representation indicates that this is indeed approximately the case.
with(new.env(), {
tm.min <- tm.mean$val - 3 * tm.mean$sd
tm.max <- tm.mean$val + 3 * tm.mean$sd
tm.means <- subset(tm.means, tm.means > tm.min & tm.means < tm.max)
brks <- seq(min(tm.means) - 0.02, max(tm.means) + 0.02, by = 0.02)
hist(tm.means,
breaks = brks,
col = "skyblue",
border = "black",
main = "Histogram of mean tm",
xlab = "tm",
ylab = "count"
)
})
The corresponding transformation to the r-value likewise results in an approximate normal distribution.
with(new.env(), {
tm.min <- tm.mean$val - 3 * tm.mean$sd
tm.max <- tm.mean$val + 3 * tm.mean$sd
tm.means <- subset(tm.means, tm.means > tm.min & tm.means < tm.max)
r <- vmgeomVstToR(tm.means)
brks <- seq(min(r) - 0.02, max(r) + 0.02, by = 0.02)
hist(r,
breaks = brks,
col = "skyblue",
border = "black",
main = "Histogram of mean r",
xlab = "r",
ylab = "count"
)
abline(v = vmgeomVstToR(tm.mean$val), col = "red", lwd = 1)
})
Therefore, the first-order delta method can be applied to estimate the mean and its variance.
# estimation of r
result <- with(new.env(), {
r.hat <- vmgeomVstToR(tm.mean$val)
dr_dtm <- vmgeomVstToR(tm.mean$val, deriv.degree = 1L)
# Delta method: variance and standard error of r.hat
r.var <- (dr_dtm^2) * tm.mean$var
list('r.hat' = r.hat, 'r.var' = r.var)
})
This gives the expected value and the variance of the r-value:
So far, we have operated under the assumption that the real
distribution of meteor magnitudes is exponential and that the perception
probabilities are accurate. We now use the Chi-Square goodness-of-fit
test to check whether the observed frequencies match the expected
frequencies. Then, using the estimated r-value, we retrieve the relative
frequencies p
for each observation and add them to the data
frame magnitudes
:
We must also consider the probabilities for the magnitude class with the brightest meteors.
The smallest magnitude class magn.min
is -6. In
calculating the probabilities, we assume that the magnitude class -6
contains meteors that are either brighter or equally bright as -6 and
thus use the function pvmgeom()
to determine their
probability.
idx <- magnitudes$magn == magn.min
magnitudes$p[idx] <- with(
magnitudes[idx,],
pvmgeom(m = magn + 1L, lm = lim.magn, result$r.hat, lower.tail = TRUE)
)
This ensures that the probability of observing a meteor of any given magnitude is 100%. This is known as the normalization condition. Accordingly, the Chi-Square goodness-of-fit test will fail if this condition is not met.
We now create the contingency table magnitutes.observed
for the observed meteor magnitudes and its margin table.
magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
magnitutes.observed.mt <- margin.table(magnitutes.observed, margin = 2)
print(magnitutes.observed.mt)
#> magn
#> -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6
#> 0.0 0.0 0.0 0.0 3.0 4.0 7.0 10.0 23.0 26.5 20.0 3.0 0.5
Next, we check which magnitude classes need to be aggregated so that each contains at least 10 meteors, allowing us to perform a Chi-Square goodness-of-fit test.
The last output shows that meteors of magnitude class 0
or brighter must be combined into a magnitude class 0-
.
Meteors with a brightness less than 4
are grouped here in
the magnitude class 4+
, and a new contingency table
magnitudes.observed is created:
magnitudes$magn[magnitudes$magn <= 0] <- '0-'
magnitudes$magn[magnitudes$magn >= 4] <- '4+'
magnitutes.observed <- xtabs(Freq ~ magn.id + magn, data = magnitudes)
print(margin.table(magnitutes.observed, margin = 2))
#> magn
#> 0- 1 2 3 4+
#> 14.0 10.0 23.0 26.5 23.5
We now need the corresponding expected relative frequencies
magnitutes.expected <- xtabs(p ~ magn.id + magn, data = magnitudes)
magnitutes.expected <- magnitutes.expected/nrow(magnitutes.expected)
print(sum(magnitudes$Freq) * margin.table(magnitutes.expected, margin = 2))
#> magn
#> 0- 1 2 3 4+
#> 14.67835 13.54105 19.20148 21.13485 28.44428
and then carry out the Chi-Square goodness-of-fit test:
chisq.test.result <- chisq.test(
x = margin.table(magnitutes.observed, margin = 2),
p = margin.table(magnitutes.expected, margin = 2)
)
As a result, we obtain the p-value:
If we set the level of significance at 5 percent, then it is clear that the p-value with 0.4155366 is greater than 0.05. Thus, under the assumption that the magnitude distribution follows an geometric meteor magnitude distribution and assuming that the perception probabilities are correct (i.e., error-free or precisely known), the assumptions cannot be rejected. However, the converse is not true; the assumptions may not necessarily be correct. The total count of meteors here is too small for such a conclusion.
To verify the p-value, we also graphically represent the Pearson residuals:
chisq.test.residuals <- with(new.env(), {
chisq.test.residuals <- residuals(chisq.test.result)
v <- as.vector(chisq.test.residuals)
names(v) <- rownames(chisq.test.residuals)
v
})
plot(
chisq.test.residuals,
main="Residuals of the chi-square goodness-of-fit test",
xlab="m",
ylab="Residuals",
ylim=c(-3, 3),
xaxt = "n"
)
abline(h=0.0, lwd=2)
axis(1, at = seq_along(chisq.test.residuals), labels = names(chisq.test.residuals))