# A statistical approach to estimating station biases and error levels

April 2023

Statistical Model for Station Biases and Error Levels

Consider a seismic network of N monitoring stations.
Suppose a seismic event j of unknown (true) magnitude Yj   was detected by a subset of stations Sj. Denote by Xi,j the reported magnitudes by these stations, i ∈ Sj.
The currently employed standard method to compute a network estimate of the event magnitude is by a simple averaging (excluding outliers) of the reported magnitudes

Ynet(event j) = $\frac{\text{1}}{\text{|\hspace{0.17em}Sj\hspace{0.17em} |}}$i Sj Xi,j

The above formula implicitly assumes that stations have very small (if any) systematic errors (biases) and that all stations have the same (or very similar) error levels, i.e., the random errors in their estimated event magnitudes all have the same standard deviations. Namely the above averaging is consistent with the following model,

Xi,j = Yj + σ ξi,j

where σ is the error level at all stations and ξi,j are random variables with zero mean and unit variance.
As we illustrate in our manuscript, using a large dataset from the REB, empirical data is inconsistent with the above model.

In our work we instead consider the following model. Each station i has two unknown parameters: a station bias bi and a station error level σi.
We assume that for an event j of unknown magnitude Yj, for each station i that reported a magnitude for the event, the follows holds

Xi,j = Yj + bi + σi ξi,j

Given a large collection of reported station magnitudes Xi,j over a large set of events j the goal is to estimate the vector of station biases b  =  (b1,···,bN)  and the vector of station error levels σ  =  (σ1,···,σN)

Inference Method

The key idea in our approach to estimate the vectors b and σ is to construct quantitites that do not depend on the unknown event magnitudes Yj, which may be viewed as nuisance parameters.
To this end we consider for any event j and for any pair of stations i,k that detected this event, the following quantity
Xi,j - Xk,j   =   bi - bk + σiξi,j - σkξk,j

Conveniently in the above quantity the unknown magnitude Yj has been cancelled out.
Error Level Estimation:
Let v  =   (σ12,···, σN2). Based on the above, we propose to estimate v by minimizing the following functional
T(v) = ∑i≠k |Mi,k | · ( Vi,k - vi - vk )2
where Vi,k = Var( Xi,j - Xk,j ) is the empirical variance of the above differences, over all events j jointly detected by stations i,k and and Mi,k is the total number of such events jointly detected by stations i,k In the above the summation is only over pairs of stations for which |Mi,k| ≥ 2. The above is a quadratic objective, and its solution is given by the solution of an N×N system of linear equations.

Stations' Biases Estimation:

Similarly, to estimate the station biases we first compute the following empirical means
Di,k   =   $\frac{\text{1}}{\text{|\hspace{0.17em}Mi,k\hspace{0.17em} |}}$ · ∑ j ∈ Mi,k   (  Xi,j - Xk,j )

We then propose to estimate the station biases by minimizing the following objective
G (b)   =   $\frac{\text{|Mi,k|}}{\text{σi2 + σk2}}$ · ∑i≠k   ( Di,k - bi + bk )2

Since the true error levels are unknown, in the above we replace σi2 by vi, estimated by the solution of the linear system described above. Estimating the station biases also amounts to solving an N× N system of linear equations. Importantly, this system is rank defficient. Assuming the graph of stations having at least 2 jointly detected events is connected the rank defficiently is one. We thus find the solution of minimal norm, and then remove the remaining single degree of freedom by the normalization ∑i bi   =   0.
Python Code and Demo:
The following Python script generates random data from a network of N monitoring stations and then estimates the station error levels and biases as described above.
demo_seismic_station_bias_error_levels.py