# A statistical approach to estimating station biases and error levels

Yael Radzyner, Meirav Galun and Boaz Nadler
April 2023
__Statistical Model for Station Biases and Error Levels__

Consider a seismic network of
`N` monitoring stations.

Suppose a seismic event *j* of unknown (true) magnitude `Y`_{j} was
detected by a subset of stations *S*_{j}. Denote by *X*_{i,j} the reported magnitudes
by these stations, *i* ∈* S*_{j}.

The currently employed standard method to compute a network estimate of the event magnitude is by
a simple averaging (excluding outliers) of the reported magnitudes

*Y*_{net}(event *j*) =
$\frac{\text{1}}{\text{|\hspace{0.17em}Sj\hspace{0.17em} |}}$
∑_{i∈ Sj}*
X*_{i,j}

The above formula implicitly assumes that stations have very small (if any) systematic errors (biases) and that all stations have the same (or very similar) error levels, i.e.,
the random errors in their estimated event magnitudes all have the same
standard deviations. Namely the above averaging is consistent with the following model,

*X*_{i,j} = *Y*_{j} + *σ ξ*_{i,j}

where *σ* is the error level at all stations and *ξ*_{i,j} are random variables with zero mean and unit variance.

As we illustrate in our manuscript, using a large dataset from the REB, empirical data is inconsistent with the above model.

In our work we instead consider the following model. Each station *i* has two unknown parameters: a station bias *b*_{i} and a station error level *σ*_{i}.

We assume that for an event *j* of unknown magnitude *Y*_{j}, for each station *i* that reported a magnitude for the event, the follows holds

*X*_{i,j} = *Y*_{j} + *b*_{i} + *σ*_{i} ξ_{i,j}

Given a large collection of reported station magnitudes *X*_{i,j} over a large set of events *j* the goal is to estimate the vector of station biases
*b* = (*b*_{1},···,b_{N})
and the vector of station error levels
*σ* = (*σ*_{1},···,σ_{N})

__Inference Method__

The key idea in our approach to estimate the vectors *b* and *σ* is to construct quantitites that do not depend on the unknown event magnitudes *Y*_{j}, which
may be viewed as nuisance parameters.

To this end we consider for any event *j* and for any pair of stations *i,k* that detected this event, the following quantity

*X*_{i,j} - X_{k,j}
=
b_{i} - b_{k}
+ σ_{i}ξ_{i,j} - σ_{k}ξ_{k,j}

Conveniently in the above quantity the unknown magnitude *Y*_{j} has been cancelled out.

**Error Level Estimation:**

Let *v* = (σ_{1}^{2},···,
σ_{N}^{2}).
Based on the above, we propose to estimate *v* by minimizing the following
functional

*T*(*v*) = ∑_{i≠k} |*M*_{i,k} | ·
( * V*_{i,k} - v_{i} - v_{k} )^{2}
where *V*_{i,k} = Var( *X*_{i,j} - X_{k,j} )
is the empirical variance of the above differences, over all events *j* jointly detected
by stations *i,k* and
and *M*_{i,k} is the total number of such events jointly detected by stations *i,k*
In the above the summation is only over pairs of stations for which |*M*_{i,k}| ≥ 2.
The above is a quadratic objective, and its solution is given by the solution of an *N×N*
system of linear equations.

**Stations' Biases Estimation:**

Similarly, to estimate the station biases we first compute the following empirical means
*D*_{i,k} =
$\frac{\text{1}}{\text{|\hspace{0.17em}Mi,k\hspace{0.17em} |}}$
·
∑_{ j ∈ Mi,k } (
*X*_{i,j} - X_{k,j} )

We then propose to estimate the station biases by minimizing the following objective
*G* (*b*) =
$\frac{\text{|Mi,k|}}{\text{\sigma i2 + \sigma k2}}$
·
∑_{i≠k}
(* D*_{i,k} - b_{i} + b_{k} )^{2}

Since the true error levels are unknown, in the above we replace *σ*_{i}^{2}
by *v*_{i}, estimated by the solution of the linear system described above.
Estimating the station biases also amounts to solving an *N× N* system of linear equations. Importantly, this system is rank defficient. Assuming the graph of stations having at least 2 jointly detected events is connected the rank defficiently is one. We thus find the solution of minimal norm,
and then remove the remaining single degree of freedom by the normalization
∑_{i} *b*_{i} = 0.

**Python Code and Demo:**

The following Python script generates random data from a network of *N* monitoring stations
and then estimates the station error levels and biases as described above.

demo_seismic_station_bias_error_levels.py

Download Python source