Multilevel Models

My significant other, in getting his M.S. research ready for publication, introduced me to multilevel modeling, and we worked through a reanalysis together. Now, I’m also interested to apply this framework to my 2nd dissertation chapter before publishing, so I can model cohesive community response while still extracting species’ responses.

The following code block is copied straight from a forum post, but I wanted to save it in case the thread disappears someday. The author is converting multilevel models from lme to (Bayesian) R-INLA format.

“For each model I have first the lme formulation and then the formula for INLA” – Ignacio Quintero

# random intercept
m1 = lmer(dr ~ st.ele + (1|poId), data = sdat)
form = dr ~ st.ele + f(poId, model = 'iid')

#organize data
n.loc = max(sdat$poId)
sdat$ = sdat$poId
sdat$ = sdat$poId + n.loc
sdat$ = sdat$ + n.loc

# uncorrelated random intercept and random slope
m2 = lmer(dr ~ st.ele + (1|poId) + (0 + st.ele|poId), data = sdat)
form = dr ~ st.ele + f(, model = 'iid') + f(, st.ele, model = 'iid')

# correlated random intercept and random slope
m3 = lmer(dr ~ st.ele + (st.ele |poId), data = sdat)
form = dr ~ st.ele + f(, model = 'iid2d', n = 2*n.loc) + f(, st.ele, copy = '')

# uncorrelated random intercept and random slope of st.ele^2
m4 = lmer(dr ~ st.ele + I(st.ele^2) + (1|poId) + (0 + I(st.ele^2)|poId), data = sdat)
form = dr ~ st.ele + I(st.ele^2) + f(, model = 'iid') + f(, I(st.ele^2), model = 'iid')

I’m currently running a model selection per community, to determine which variables are most important across each avian community.

Spatio-temporal Models

A few tips and tricks along the way for spatial models…

  • you must have an observation for every feature
  • those observations must be in order of the shape file
  • the corresponding identifier can only be (1:length of shape file)

I’m not sure the spatial effect alone will be much more than a localized range map (but maybe I just need to think through it more)? I built an ecological regression model by adding a spatial term to my previous best-fit model:

model = abundance ~ f(ID,model="bym",graph="ch2.adj") + f(Year,model="rw2") + yearfix + f(ObsN, model="iid") + firstyear + SPI

I’m also looking at a spatial trend model:

model = abundance ~ f(ID,model="bym",graph="ch2.adj") + f(ID.area,Year,model="iid") + Year + f(ObsN, model="iid") + firstyear

It is possible to model nonlinear trends as well (though I’m not sure I totally understand the mechanism in the example, because it doesn’t seem like it necessarily has anything to do with the predictor).

I’m trying out models that are interactions of space and my co-variates of interest (i.e. weather and land cover), such as…

model = abundance ~ f(ID,model="bym",graph="ch2.adj") + f(Year,model="iid") + yearfix + f(ObsN, model="iid") + firstyear + f(ID.area,SPI,model="iid")

Bayesian Inference Ideas from R-INLA

I kind of hesitate to post “general” R-INLA posts because it’s a.) not my job so b.) I don’t really have time to give it the treatment it deserves. I don’t want to rewrite the website, yet, though I do want to make it more accessible. Right now, I’m mining the R-INLA site for biologically relevant ideas…

  • dispersal barrier models
  • spatial models of whatever phenomena (e.g. range maps)
  • spatio-temporal models of whatever phenomena (e.g. spatially varying relationships between response and covariate)
  • species distribution models
  • clustering analyses (? look more into this)

Until I have my own graphs and maps for the cover photo, here is where there is currently a R-INLA conference going on. I wish I could be there, for several reasons!

Bayesian Models with R-INLA

Fitting Bayesian models has gotten even better with the R package INLA. The fact that it’s integrated into R has led to more simplistic scripting, and the approach means much reduced computational time. There is a lot of literature out there about “why and how” the integrated nested Laplace approximation, which is why my treatment of it in this blog will likely be more practical. “The” paper about INLA for Bayesian stats starts out talking about the class of structured additive regression models. Fortunately, for non-statisticians like me, they throw me a bone and give me a list of examples from which I can pick out a few that I recognize.

  • (generalized) linear models
  • (generalized) additive models
  • smoothing spline models
  • state space models
  • semiparametric regression
  • spatial/temporal models
  • log-Gaussian Cox processes
  • geostatistical/additive models

The focus is on latent Gaussian models (about which there will be an entire conference this fall; the postdoc I work with and I have joked about going). As the name would imply, the latent field is Gaussian, “controlled by a few hyperparameters and with non-Gaussian response variable” (Rue et al. 2009).

Now, there are nice tutorials on the website! There are “volumes” that correspond to the OpenBUGS examples. The OpenBUGS write-ups are nice for explanation of the Bayesian stats behind the examples.

Another cool feature is that you can run it remotely within your R script. With that general question of why my models crash or how the fit is going, I found a reference to error checking that might help me diagnose.