I also found that I needed to add the line:

`(add-to-list 'auto-mode-alist '("\\.Rmd" . poly-markdown+r-mode))`

to my .emacs file so that polymode starts automatically when I open a .Rmd file. This was mentioned in the polymode instructions, which I should probably have read first. The full entries in my .emacs file for markdown-mode and polymode are:

```elisp
;;; Markdown mode
(autoload 'markdown-mode "markdown-mode" "Major mode for editing Markdown files" t)
(setq auto-mode-alist (cons '("\\.markdown" . markdown-mode) auto-mode-alist))
(setq auto-mode-alist (cons '("\\.md" . markdown-mode) auto-mode-alist))
(setq auto-mode-alist (cons '("\\.ronn?" . markdown-mode) auto-mode-alist))

;;; Polymode
(setq load-path
      (append '("/home/sbonner/.emacs.d/polymode/"
                "/home/sbonner/.emacs.d/polymode/modes")
              load-path))
(require 'poly-R)
(require 'poly-markdown)
(add-to-list 'auto-mode-alist '("\\.Rmd" . poly-markdown+r-mode))
```

Looking great!

I (try to) stay up to date with the articles published in the following journals:

- Biometrics
- Methods in Ecology and Evolution
- Journal of Agricultural, Biological, and Environmental Statistics
- Environmental and Ecological Statistics
- Biometrika (to a lesser degree)
- The Auk (mostly because I like birds)

Other journals that I should be watching include:

- Ecology
- Journal of Wildlife Management

But, there is only so much time in the day. The easiest way to keep up with what’s published in these journals is to subscribe to the table of contents alerts. For example, alerts for JABES can be requested by providing your e-mail address on the journal’s webpage.

The other good way to find recent articles is to search for specific authors or to look for new articles that cite older articles of interest. Several tools do this, including the Web of Science and Google Scholar, and each has its own advantages and disadvantages. For example, the Web of Science allows you to limit searches by the field of the publication, while Google Scholar can alert you when a specific author publishes new papers or is cited by other authors. I use Google Scholar to follow the articles and citations of Bill Link of the USGS, one of the leaders in Bayesian inference for the analysis of ecological data.

Several organizations also provide unified portals for journal alerts and allow for alerts using keyword searches as well (e.g. Zetoc). I have never used these myself. Just another thing to do!

Please feel free to comment if you think I’ve missed something important. It would be great to get more ideas.

I recently found Mr. Sparkle's wonderful LaTeX in Word project, which makes this a lot easier! The heart of this is a Word document containing embedded macros that use a remote server to process LaTeX code into embedded images. To use the package you simply download and extract the tarball, save the document to a new file, and start typing away. The Alt-L keystroke opens a new window into which you can type LaTeX commands, and pressing the convert button turns these commands into images. The best thing is that the code stays embedded in the document: with the same keystroke you can reopen the command window and edit the existing code. Fantastic!

I was originally generating the zero-truncated Poisson values using a while loop to simulate from a Poisson with mean `lambda` until a non-zero value was generated. This is inefficient, and it runs into problems if `lambda` is sufficiently small that the probability of a non-zero value is below computer precision.
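For reference, a minimal sketch of that rejection approach (the name `rtpois_naive` is mine, and this version assumes a scalar `lambda`):

```r
rtpois_naive <- function(n, lambda) {
    ## Zero-truncated Poisson by rejection: redraw until non-zero.
    ## Inefficient when lambda is small, since P(X > 0) = 1 - exp(-lambda)
    ## shrinks toward 0, and the loop never exits if that probability
    ## underflows to zero.
    x <- numeric(n)
    for (i in seq_len(n)) {
        repeat {
            y <- rpois(1, lambda)
            if (y > 0) break
        }
        x[i] <- y
    }
    x
}
```

The expected number of draws per accepted value is 1/(1 - exp(-lambda)), which is what makes the direct inversion approach below attractive.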

To remedy this I found this nice solution from Peter Dalgaard, in response to a question from Spencer Graves on the R-help list. The solution uses the inverse probability transform but truncates the CDF so that the generated values are guaranteed to be at least 1. The same could be done for any distribution that has a quantile function implemented in R. My code below implements this trick but adds a tolerance that avoids problems when the value of `lambda` is too small.

```r
rtpois <- function(n, lambda, tol = 1e-10){
    ## Simulate from zero-truncated Poisson

    ## Initialize output
    x <- rep(NA, n)

    ## Identify lambda values below tolerance
    low <- which(lambda < tol)
    nlow <- length(low)

    if(nlow > 0){
        x[low] <- 1
        if(nlow < n)
            x[-low] <- qpois(runif(n - nlow, dpois(0, lambda[-low]), 1),
                             lambda[-low])
    }
    else
        x <- qpois(runif(n, dpois(0, lambda), 1), lambda)

    return(x)
}
```
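Since the trick only needs a quantile function, the same pattern carries over directly to other discrete distributions. Here is a sketch for a zero-truncated negative binomial (the name `rtnbinom` is my own invention, not an existing R function):

```r
rtnbinom <- function(n, size, prob) {
    ## Zero-truncated negative binomial via the inverse-CDF trick:
    ## draw uniforms on (P(X = 0), 1) and invert with qnbinom, so
    ## every generated value is guaranteed to be at least 1.
    qnbinom(runif(n, dnbinom(0, size, prob), 1), size, prob)
}
```

The same one-liner works with any `d*`/`q*` pair, e.g. `dgeom`/`qgeom` for a zero-truncated geometric.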

Bonner, S.J. and Schofield, M. (2014). MC(MC)MC: Exploring Monte Carlo integration within MCMC for mark-recapture models with individual covariates. Methods in Ecology and Evolution, 5(12):1305–1315.

As a simple test, I have used mcparallel to implement multiple chains for a Metropolis-Hastings sampler of the mean parameter in a multivariate log-normal distribution. That is, the goal is to sample from the posterior distribution of mu given an iid sample of random p-vectors x_1, ..., x_n such that log(x_i) ~ N(mu, Sigma), with Sigma known. In the simulation I have set p = 2, n = 25, mu = (1, .5), and Sigma equal to the 2 x 2 matrix with unit variances and covariance .5.

The sampler I have implemented generates proposals from a symmetric multivariate normal distribution with covariance matrix `Sigma.prop`. I have manually tuned the proposal variance to achieve acceptance rates just above .5 and visually pleasing traceplots.

The use of the parallel package comes in three steps:

1. Initialize the random number stream:

This part of the code selects and initializes the L’Ecuyer random number generator so that each chain uses a different RNG.

```r
## Initialize random number generator to obtain different chains from
## each parallel run
RNGkind("L'Ecuyer-CMRG")
set.seed(7777)
mc.reset.stream()
```

2. Start the parallel processes:

The second part of the code starts the parallel processes running. I have used the lapply function here so that the number of parallel processes is not hardwired and can be changed easily.

```r
## Start parallel processes
niter <- 1000
Sigma.prop <- .03 * Sigma

MHprocesses <- lapply(1:nchains, function(i){
    mcparallel(MHalg(mu.init[i,], x, Sigma, niter, Sigma.prop))
})
```

3. Collect results:

The final chunk uses the mccollect function to wait for the parallel jobs to finish and collect the output.

```r
## Wait and collect output
MHout <- mccollect(MHprocesses)
```

In this case, the chains could also have been run in parallel using the mclapply function, which would avoid the need to set the RNG streams and collect the results manually. Both functions use forks and share memory until objects are modified, so they should not differ in speed or memory use. However, mcparallel seems to be more flexible, and I will need that flexibility in some of my projects.
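A minimal, self-contained sketch of the mclapply pattern (this assumes a Unix-alike, since mclapply forks; the toy "chains" here just draw normal variates in place of the full sampler):

```r
library(parallel)

## With the L'Ecuyer-CMRG generator, mclapply seeds a separate
## random number stream for each job automatically (mc.set.seed
## is TRUE by default), so no manual mc.reset.stream() call or
## mccollect() step is needed.
RNGkind("L'Ecuyer-CMRG")
set.seed(7777)
draws <- mclapply(1:4, function(i) rnorm(5), mc.cores = 2)
```

In the sampler above, the lapply/mcparallel pair and the mccollect call would collapse into a single mclapply over 1:nchains.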

Here is the full code. Any comments are welcome!

```r
############################################################
#
# parallel_1.R
#
# First test with the parallel package.
#
# In this example I run parallel chains for a Metropolis-
# Hastings sampler for the multivariate log-normal
# distribution using the mcparallel() function to
# parallelize the code.
#
############################################################

##### Preliminaries #####

## Load packages
library(parallel)
library(mvtnorm)

## Define functions
dmvlnorm <- function(x, mu, Sigma, log = FALSE){
    ## Multivariate log-normal density
    dmvnorm(log(x), mu, Sigma, log = log)
}

rmvlnorm <- function(n, mu, Sigma){
    ## Multivariate log-normal random generator

    ## Generate normal random variates
    logx <- rmvnorm(n, mu, Sigma)

    ## Exponentiate to get log-normal random variates
    x <- exp(logx)

    return(x)
}

MHstep <- function(mu.curr, x, Sigma, Sigma.prop){
    ## Metropolis-Hastings step

    ## Generate proposal
    mu.prop <- mu.curr + rmvnorm(1, rep(0, length(mu.curr)), Sigma.prop)

    ## Compute Hastings ratio (on log scale)
    log.alpha <- sum(dmvlnorm(x, mu.prop, Sigma, log = TRUE)) -
        sum(dmvlnorm(x, mu.curr, Sigma, log = TRUE))

    ## Accept or reject proposal
    if(-rexp(1) < log.alpha)
        return(mu.prop)
    else
        return(mu.curr)
}

MHalg <- function(mu, x, Sigma, niter, Sigma.prop){
    ## Initialize storage
    mu.store <- matrix(nrow = niter + 1, ncol = length(mu))
    mu.store[1,] <- mu

    ## Initialize acceptance monitor
    accept <- rep(0, niter)

    ## Run MH algorithm
    for(k in 1:niter){
        ## Perform one MH step
        mu <- MHstep(mu, x, Sigma, Sigma.prop)

        ## Store values
        mu.store[k + 1,] <- mu

        ## Track acceptance
        if(all(mu.store[k + 1,] != mu.store[k,]))
            accept[k] <- 1
    }

    ## Return sampled values
    return(list(mu = mu.store, accept = accept))
}

##### Run Samplers #####

## Parameters
n <- 25
mu <- c(1, .5)
Sigma <- .5 + .5 * diag(2)

## Contour plot of true density
xgrid <- seq(0, 30, length = 100)
Z <- sapply(xgrid, function(x1){
    dmvlnorm(cbind(x1, xgrid), mu, Sigma)
})
contour(xgrid, xgrid, Z)

## Generate data
x <- rmvlnorm(n, mu, Sigma)

## Generate initial values
nchains <- 4
mu.init <- rmvnorm(nchains, apply(log(x), 2, mean), 2 * Sigma)

## Initialize random number generator to obtain different chains from
## each parallel run
RNGkind("L'Ecuyer-CMRG")
set.seed(7777)
mc.reset.stream()

## Start parallel processes
niter <- 1000
Sigma.prop <- .03 * Sigma

MHprocesses <- lapply(1:nchains, function(i){
    mcparallel(MHalg(mu.init[i,], x, Sigma, niter, Sigma.prop))
})

## Wait and collect output
MHout <- mccollect(MHprocesses)

##### Examine Output #####

## One-dimensional traceplots
col <- rainbow(nchains)
par(mfrow = c(2, 1))

trace1 <- sapply(1:nchains, function(i) MHout[[i]]$mu[,1])
matplot(trace1, type = "l", lty = 1, col = col)
abline(h = mu[1], lty = 2, col = "grey")

trace2 <- sapply(1:nchains, function(i) MHout[[i]]$mu[,2])
matplot(trace2, type = "l", lty = 1, col = col)
abline(h = mu[2], lty = 2, col = "grey")

## Two-dimensional traceplot
par(mfrow = c(1, 1))
lim <- range(sapply(MHout, function(out) range(out$mu)))
plot(NA, NA, xlim = lim, ylim = lim)
for(i in 1:nchains)
    lines(MHout[[i]]$mu[,1], MHout[[i]]$mu[,2], col = col[i])
abline(v = mu[1], lty = 2, col = "grey")
abline(h = mu[2], lty = 2, col = "grey")

## Acceptance rates
sapply(1:nchains, function(i) mean(MHout[[i]]$accept))

## Density plots
par(mfrow = c(2, 1))

dens1 <- lapply(1:nchains, function(i){
    density(MHout[[i]]$mu[,1], from = lim[1], to = lim[2])
})
ylim1 <- c(0, max(sapply(1:nchains, function(i) max(dens1[[i]]$y))))
plot(NA, NA, xlim = lim, ylim = ylim1)
for(i in 1:nchains)
    lines(dens1[[i]], col = col[i])
abline(v = mu[1], lty = 2, col = "grey")

dens2 <- lapply(1:nchains, function(i){
    density(MHout[[i]]$mu[,2], from = lim[1], to = lim[2])
})
ylim2 <- c(0, max(sapply(1:nchains, function(i) max(dens2[[i]]$y))))
plot(NA, NA, xlim = lim, ylim = ylim2)
for(i in 1:nchains)
    lines(dens2[[i]], col = col[i])
abline(v = mu[2], lty = 2, col = "grey")
```