This project was supported by the Political Science program of the National Science Foundation.

MCMC source and R commands

Modifying MCMCpack

To recreate the estimator, download the vintage MCMCpack source from CRAN (MCMCpack_0.8.1.tar.gz) and unpack it on a UNIX/Linux machine. Copy the .cc and .R files below over the existing files of the same name, repackage the directory as a new gzipped tarball (MCMCpack_custom.tar.gz), and install the custom package in R (R CMD INSTALL MCMCpack_custom.tar.gz).

MCMChierEI.cc : C++ code (copy to /src directory)
MCMChierEI.R : R code (copy to /R directory)
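
The repackaging steps above can also be scripted from within R. The following is a minimal sketch, not the procedure actually used for the book: the function name rebuild_package and all of its arguments are hypothetical, and it assumes the CRAN tarball unpacks to a top-level MCMCpack directory containing src and R subdirectories, as R source packages conventionally do.

```r
# Sketch: unpack a source tarball, overwrite two files, and repackage it.
# rebuild_package() and its arguments are illustrative names, not part of MCMCpack.
rebuild_package <- function(tarball, cc_file, r_file,
                            out = "MCMCpack_custom.tar.gz",
                            workdir = tempfile("mcmcpack_")) {
  dir.create(workdir, showWarnings = FALSE)
  utils::untar(tarball, exdir = workdir)
  pkg <- file.path(workdir, "MCMCpack")   # assumed top-level directory name
  stopifnot(dir.exists(pkg))

  # Overwrite the existing files of the same name (C++ in src/, R code in R/).
  copied <- c(
    file.copy(cc_file, file.path(pkg, "src", basename(cc_file)), overwrite = TRUE),
    file.copy(r_file,  file.path(pkg, "R",   basename(r_file)),  overwrite = TRUE)
  )
  stopifnot(all(copied))

  # Resolve the output path before changing directory, then repackage.
  out <- file.path(normalizePath(dirname(out)), basename(out))
  old <- setwd(workdir)
  on.exit(setwd(old))
  utils::tar(out, files = "MCMCpack", compression = "gzip", tar = "internal")
  out
}

# Hypothetical usage, assuming the three files sit in the working directory:
# out <- rebuild_package("MCMCpack_0.8.1.tar.gz", "MCMChierEI.cc", "MCMChierEI.R")
# install.packages(out, repos = NULL, type = "source")
```

The final install step is equivalent to running R CMD INSTALL MCMCpack_custom.tar.gz from the shell; compiling the C++ code requires a working compiler toolchain.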

Running the simulation

The sample R script below loads the datasets stored in the DATA folder, initializes and runs the MCMC simulation, and creates and stores a sample from the posterior distribution of the state-level quantities of interest. The MCD posterior is available in R as an object named "posterior", but that object is not written to disk and is lost when the session ends. The R program uses the posterior to create the state-level summary quantities that are archived as ESTIMATES. The sample from the MCD posterior is an extremely large file, so we do not archive these files. The Illinois 1920 posterior is available as a STATA dataset so that the figures from Chapter 4 can be replicated. All other posteriors can be reproduced with the R programs; each state/election simulation requires about 4 hours.

CT_1920.r : R commands to run the simulation for 1920 Connecticut and save the state-level estimates

Reading and evaluating diagnostics

All MCMC diagnostics reported in the book are generated from the Illinois 1920 posterior. Each time the simulation runs, a set of parameters is written out as diag.txt, and the R code copies this file to preserve the diagnostics for the state-level logits (the filename is styeardiag.txt, for example il20diag.txt for Illinois in 1920).

IL20diag.txt: thinned chain for each state-level logit (used to produce Table 4.9)
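
Because diag.txt is overwritten on every run, the copy step matters. The sketch below illustrates the naming convention described above (two-letter state abbreviation plus two-digit year); the function name preserve_diag and its arguments are hypothetical, and the lower-case filename is an assumption based on the il20diag.txt example.

```r
# Sketch: copy diag.txt to a state/year-specific name so the next run
# does not overwrite it. preserve_diag() is an illustrative name only.
preserve_diag <- function(state, year, dir = ".") {
  src <- file.path(dir, "diag.txt")
  stopifnot(file.exists(src))
  # e.g. state = "IL", year = 1920 gives "il20diag.txt"
  name <- sprintf("%s%02ddiag.txt", tolower(state), as.integer(year) %% 100L)
  dest <- file.path(dir, name)
  stopifnot(file.copy(src, dest, overwrite = TRUE))
  dest
}

# Hypothetical usage after an Illinois 1920 run:
# preserve_diag("IL", 1920)
```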

Archiving the estimates

Each MCMC simulation generates a set of 8 probabilities computed from 6,000 draws from the posterior distribution (thinned from a full set of 1.2 million iterations, after a burn-in period of typically 10 million iterations). To make simulations easier to compare across states and over time, 2,000 observations from the converged chain are used for the figures and tables in the book. The STATA commands below were used to read the simulated chains, extend the burn-in period if indicated by the convergence diagnostics, and sample 2,000 observations from the portion of the series after convergence. Xxx_archive.do retrieves the appropriate posterior, trims it to 2,000 observations, and writes the result as xxposumyydta for state (xx) and election (yy).
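
The trimming logic can be illustrated in R. This is a sketch on synthetic data, not a translation of the STATA .do files: the function name trim_posterior, its arguments, and the choice of a random sample without replacement (kept in chain order) are all assumptions. It also makes the thinning arithmetic explicit: keeping 6,000 of 1.2 million iterations means retaining every 200th draw.

```r
# Thinning interval implied by the text: 1.2 million iterations, 6,000 kept.
thin_interval <- 1.2e6 / 6000   # every 200th iteration

# Sketch: drop extra initial draws (an extended burn-in), then keep n_keep
# of the remaining draws. trim_posterior() is an illustrative name only.
trim_posterior <- function(draws, extra_burnin = 0, n_keep = 2000, seed = 1) {
  stopifnot(is.matrix(draws), extra_burnin >= 0)
  pool <- which(seq_len(nrow(draws)) > extra_burnin)
  stopifnot(length(pool) >= n_keep)
  set.seed(seed)
  # Sample row indices without replacement and keep them in chain order.
  idx <- sort(pool[sample.int(length(pool), n_keep)])
  draws[idx, , drop = FALSE]
}

# Synthetic stand-in for a thinned posterior: 6,000 draws of 8 probabilities.
fake_posterior <- matrix(runif(6000 * 8), nrow = 6000, ncol = 8)
trimmed <- trim_posterior(fake_posterior, extra_burnin = 1000)
dim(trimmed)
```

Fixing every archived posterior at the same 2,000 draws is what makes summaries comparable across states and elections, even when some chains needed a longer burn-in than others.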