# Math Test

The intrinsic conditional autoregressive (ICAR) model specifies the following set of conditional distributions for the spatial random effect parameter:

$$
\begin{aligned}
R_i &= S_i \\
S_i \mid \bm{S}_{\smallsetminus i} &\sim \mathcal{N}\left(\frac{1}{\sum_j{w_{ij}}}\sum_j{w_{ij}s_j}, \frac{\sigma_s^2}{\sum_j{w_{ij}}}\right)
\end{aligned}
$$

where $w_{ij}$ is the element of a spatial weights matrix $\bm{W}$ corresponding to row $i$ and column $j$ [@Besag91; @Besag74; @Lee11; @Best05]. The spatial weights matrix determines the spatial proximity between the random effects, and is most commonly defined as a binary, first-order, adjacency matrix, i.e.

$$
w_{ij} = \begin{cases}
1 & \text{if areas } i \text{ and } j \text{ are adjacent} \\
0 & \text{otherwise}
\end{cases}.
$$
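As a rough illustration of the conditional distribution above, the following sketch evaluates the conditional mean and variance of one random effect under a binary adjacency matrix. The four-area layout, the values in `s`, and the function name `icar_conditional` are hypothetical and chosen only for this example.

```python
# Hypothetical layout: four areas in a row, 0 - 1 - 2 - 3,
# so each area is adjacent only to its immediate neighbours.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]


def icar_conditional(i, s, W, sigma2_s):
    """Return the mean and variance of S_i given all other effects (ICAR)."""
    n_i = sum(W[i])  # sum_j w_ij: the number of neighbours when W is binary
    mean = sum(w_ij * s_j for w_ij, s_j in zip(W[i], s)) / n_i
    var = sigma2_s / n_i
    return mean, var


s = [0.5, -0.2, 0.1, 0.4]  # illustrative values of the random effects
mean_1, var_1 = icar_conditional(1, s, W, sigma2_s=1.0)
```

Area 1 has two neighbours (areas 0 and 2), so its conditional mean is the average of their values and its conditional variance is $\sigma_s^2 / 2$.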

This model implies that the conditional expectation of $S_i$ is equal to the mean of the random effects at neighbouring locations. The $S_i$ can be regarded as *structured* spatial random effects. The BYM model (named in honour of the authors @Besag91) is an extension of the ICAR model which includes an additional parameter for *unstructured* spatial random effects:

$$R_i = S_i + U_i$$

where

$$U_i \sim \mathcal{N}\left(0, \sigma_U^2\right).$$

Note that there are two identifiability issues with this model which have practical implications.

The first issue is the identifiability of the structured spatial random effects $S_i$. If an intercept term is included in the model, then these random effects must be constrained to sum to zero, i.e. $\sum_{i=1}^N{S_i} = 0$. Alternatively, if no intercept is included, then this constraint can be removed. Both parameterisations of the model will provide the same inference. However, note that software like WinBUGS automatically enforces this constraint, so an intercept should be included in this case. Note also that this identifiability issue applies equally to the ICAR model.
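The reason for the constraint can be sketched numerically. Assuming an additive linear predictor of the form intercept plus $S_i$ (an assumption made only for this illustration), adding a constant to every $S_i$ and subtracting it from the intercept leaves the fitted values unchanged, so the data cannot distinguish the two. The helper `centre_effects` and the numbers below are hypothetical.

```python
from statistics import mean


def centre_effects(intercept, s):
    """Shift the effects to sum to zero and absorb the shift into the intercept."""
    shift = mean(s)
    return intercept + shift, [s_i - shift for s_i in s]


intercept, s = 1.0, [0.5, -0.2, 0.1, 0.4]  # illustrative values
new_intercept, s_centred = centre_effects(intercept, s)

# The linear predictor is the same under both parameterisations.
before = [intercept + s_i for s_i in s]
after = [new_intercept + s_i for s_i in s_centred]
```

The centred effects sum to zero (up to floating-point error) while `before` and `after` agree, which is why the two parameterisations give the same inference.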

The second issue is a likelihood identifiability problem, which arises because the two spatial random effects $S_i$ and $U_i$ are not *uniquely* identifiable; only their sum is identifiable [@Eberly]. However, a simple remedy, easily applied post-estimation, largely resolves this issue. It modifies (or “corrects”) the parameters $S_i$ and $U_i$ by an amount that represents the “proportion of excess variation”: variation that cannot be explained by the rest of the model (including any covariates) and therefore ought to be included in the unstructured random effect parameter. Let $S_i^{(m)}$ and $U_i^{(m)}$ denote the estimates of the structured and unstructured spatial random effects, respectively, at the $m^{\text{th}}$ MCMC iteration. Assuming a posterior sample of size $M$, the modification is as follows:

$$
\begin{aligned}
S_i^{(m)} &:= S_i^{(m)} - \psi \cdot U_i^{(m)}, \\
U_i^{(m)} &:= U_i^{(m)} + \psi \cdot U_i^{(m)}
\end{aligned}
$$

where $\psi$ is the proportion of excess variation,

$$
\begin{aligned}
\psi &= \frac{\mathscr{S}(\bm{S})}{\mathscr{S}(\bm{S})+\mathscr{S}(\bm{U})}, \\
\mathscr{S}(\bm{S}) &= \text{sd}\left\{\underset{m=1,\ldots,M}{\text{median}}\, S_i^{(m)} \right\}.
\end{aligned}
$$

For convenience, the computation of the proportion of excess variation $\psi$ is coded as a user-defined function, which is then applied to each of the random effects. Note that this function assumes $S$ and $U$ are two-dimensional arrays of dimension $M \times N$, where $N$ is the number of observations (areas).
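A minimal sketch of such a function is given below, using only the Python standard library. It is not the original implementation. It assumes that $\mathscr{S}(\bm{U})$ is defined in the same way as $\mathscr{S}(\bm{S})$, and that "sd" means the sample standard deviation (denominator $N - 1$) taken across areas. The function names and the toy arrays are hypothetical.

```python
from statistics import median, stdev


def spread_of_medians(X):
    """sd across areas of the per-area posterior medians; X is M x N."""
    medians = [median(column) for column in zip(*X)]  # one median per area
    return stdev(medians)  # sample sd (assumed convention)


def excess_variation(S, U):
    """Proportion of excess variation, psi = sd_S / (sd_S + sd_U)."""
    sd_S = spread_of_medians(S)
    sd_U = spread_of_medians(U)  # assumed to mirror the definition for S
    return sd_S / (sd_S + sd_U)


def apply_correction(S, U):
    """Apply S := S - psi * U and U := U + psi * U at every iteration."""
    psi = excess_variation(S, U)
    S_new = [[s - psi * u for s, u in zip(row_S, row_U)]
             for row_S, row_U in zip(S, U)]
    U_new = [[u + psi * u for u in row_U] for row_U in U]
    return S_new, U_new, psi


# Toy posterior draws: M = 3 iterations (rows), N = 3 areas (columns).
S = [[1.0, 2.0, 3.0], [1.2, 2.2, 3.2], [0.8, 1.8, 2.8]]
U = [[0.1, -0.1, 0.3], [0.0, 0.1, 0.2], [0.2, 0.0, 0.1]]
S_new, U_new, psi = apply_correction(S, U)
```

Both updates use the original draws of $U$, so the sum of the two effects is the same before and after the correction at every iteration.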

The result is that the structured spatial random effect is now readily identifiable, independently of the unstructured spatial random effect, even though the sum $S + U$ remains unchanged. The smoothed spatial pattern becomes much clearer after the modification. Assuming the axes represent cardinal directions, the spatial pattern reveals lower values in the south and higher values in the north. In practice, the modification may not yield such a stark change or improve clarity as noticeably. However, if there is enough data to identify a structured spatial pattern, this technique will help separate the signal ($S$) from the noise ($U$).