Plotting maps using R (example with Brazilian Municipal level data)

This is an example of how to plot GIS data using R. In this example, I plot maps of Brazilian municipalities in which the color of each municipality is mapped to its GDP, to the proportion of votes received by the winning presidential candidate in the 2014 election, and to the proportion of families receiving assistance from the government program Bolsa Familia. You need:

  1. Shapefiles (Municipalities, States, Region)
  2. Information about municipal GDP (here or here)
  3. Information about votes (here)
  4. Information about Bolsa Familia (here)
  5. A file linking the ID of the cities used by the Shapefiles (in the example it is the Brazilian Institute of Geography and Statistics’ code) and the ID of the cities used by election data (not provided here).

Note: as I didn’t provide the last item, you may not be able to reproduce the map with the election results. You should be able to reproduce the GDP map, though.
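If you do have such a linking file, the merge could look roughly like the sketch below. Everything in it is a toy stand-in: the column names `codigo_tse` and `propWinner`, the data frames, and all the codes and values are hypothetical, invented only to show the two-step merge and the return to the shapefile's row order.

```r
# Hypothetical linking file: IBGE code <-> code used by the election data
crosswalk <- data.frame(codigo_ibg = c(1, 2, 3),
                        codigo_tse = c(310, 78, 450))

# Hypothetical election results, keyed by the electoral code
votes <- data.frame(codigo_tse = c(78, 310, 450),
                    propWinner = c(0.61, 0.48, 0.55))

# Stand-in for the shapefile's attribute table, with its original row order
shapeData <- data.frame(codigo_ibg = c(3, 1, 2), order = 1:3)

# Step 1: attach the IBGE code to the election results
linked <- merge(crosswalk, votes, by = "codigo_tse")

# Step 2: attach the results to the shapefile data, keeping every municipality
shapeData <- merge(shapeData, linked, by = "codigo_ibg", all.x = TRUE)

# merge() reorders rows, so go back to the original shapefile order
shapeData <- shapeData[order(shapeData$order), ]
```

Restoring the order matters because the polygons in a shapefile are matched to the rows of the attribute table by position.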

The tricky part is to create categories, so that colors can be attached to the values of a variable that may be continuous. If only a limited number of colors is available, you must create as many categories as there are colors. Let's say you have 500 different colors but 5000 cities: you must create 500 categories of GDP. One suggestion is to use percentiles as the cut points. It may not make much difference in the final result, but you should compare the map colored by category with the map colored directly by the continuous variable, without this transformation.
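As a minimal sketch of this idea, the snippet below uses simulated values rather than the real GDP data. The names `gdp`, `palette500`, `gdpCat` and the two endpoint colors are assumptions made for illustration; only base R is used.

```r
# Toy data: 5000 simulated, right-skewed "GDP per capita" values (not real data)
set.seed(1)
gdp <- rlnorm(5000, meanlog = 9, sdlog = 1)

# 500 colors interpolated between a light and a dark blue
nColors    <- 500
palette500 <- colorRampPalette(c("#F7FBFF", "#08306B"))(nColors)

# One category per color, with percentile-based cut points
breaks <- quantile(gdp, probs = seq(0, 1, length.out = nColors + 1))
gdpCat <- cut(gdp, breaks = breaks, include.lowest = TRUE)

# The color of each city is the color of its category
cityColor <- palette500[as.integer(gdpCat)]

# For comparison: map the raw values linearly onto the same palette
linIdx          <- 1 + floor((gdp - min(gdp)) / diff(range(gdp)) * (nColors - 1))
cityColorLinear <- palette500[linIdx]
```

With percentile cut points every color is used by roughly the same number of cities. With the linear mapping and skewed data, most cities tend to end up in a few light colors, which is why the two versions are worth comparing.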

I present a map colored according to the proportion of votes each candidate received in each municipality in the second round of the 2014 Brazilian election, along with another map that shows the proportion of families that receive Bolsa Familia in each city. Bolsa Familia is a federal redistributive welfare program. Although the maps may look alike, *these two maps do not imply that Bolsa Familia explains or is correlated with the presidential vote. They are just two maps, and any stronger statement about the causes of (or factors correlated with) the election outcome must be further verified with appropriate statistical techniques*.

# =====================================================
# MapBRMunic.R
# --------------
# Author: Diogo A. Ferrari 
# =====================================================

# options 
par(las=1,cex.axis=.7,bty='l', pch=20, cex.main=.9, mar=c(4,5,3,2)) 

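# packages
library(maptools)     # readShapePoly
library(RColorBrewer) # brewer.pal
library(plotrix)      # color.legend
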
# Municipal level data
# ================================================================
# Load files
# ================================================================
# shapefiles
municBR   <- readShapePoly(fn='municipios_2010.shp')
statesBR  <- readShapePoly(fn='estados_2010.shp')
regionsBR <- readShapePoly(fn='regioes_2010.shp')
pib <- read.csv2('pib2010.csv')
# ================================================================

# ================================================================
# merging shapefile and data
# ================================================================
# this order var will make it possible to go back to the original order
municBR@data$order <- 1:nrow(municBR@data)

# NOTE: there are more municipalities in the pib file than in the shapefile.
# We will ignore them here (all.x=TRUE keeps only the shapefile's rows)
municBR@data <- merge(municBR@data, pib,
                      by.x='codigo_ibg', by.y='C._digo', all.x=TRUE)
# creating GDP per capita
municBR@data$X2010 <- municBR@data$X2010/as.numeric(as.character(municBR@data$populacao))

# ================================================================
# plot
# ================================================================
myPaletteBlue <- brewer.pal(9,'Blues')
createColors  <- colorRampPalette(myPaletteBlue)
myColorsBlue  <- createColors(nrow(municBR@data))
#check the colors
#pie(rep(1,nrow(municBR@data)), col=myColorsBlue, lty=0, labels=NA)

# good way to use colors: creating categories of income (as many as colors we have)
municBR@data            <- municBR@data[order(municBR@data$X2010),]
numberColors            <- length(unique(myColorsBlue))
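# the cut points are arbitrary, and you can choose others
# (the original call was truncated here; the probs below are an assumption
#  that follows the text: percentile cut points, one category per color)
cuts                    <- quantile(municBR@data$X2010,
                                    probs = seq(0, 1, length.out = numberColors + 1),
                                    na.rm = TRUE)
municBR@data$GDPcat     <- cut(municBR@data$X2010, breaks = cuts, include.lowest = TRUE)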

# ordering the vector by GDP
municBR@data            <- municBR@data[order(municBR@data$X2010),]

# merging GDPcat and their colors
colorOfCats             <- data.frame(GDPcat=levels(municBR@data$GDPcat),colors=unique(myColorsBlue))
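# all.x=TRUE keeps municipalities without GDP data (they get no fill color),
# so the number of rows still matches the number of polygons
municBR@data            <- merge(municBR@data, colorOfCats, all.x=TRUE)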

# going back to the original order of the data in the shapefile
municBR@data            <- municBR@data[order(municBR@data$order),]

plot(municBR, col=as.character(municBR@data$colors), lty=0, main='right colors')
plot(statesBR, add=T) # if you want to emphasize the states boundaries
plot(regionsBR, add=T, lwd=3) # to emphasize the macroregions boundaries

# legend
color.legend(xl=-70, xr=-60,yb=-25,yt=-26, 
	     rect.col=myColorsBlue, gradient='x',
	     cex=.7, pos=c(1,1,1))
text(x=-68,y=-24,label='GDP per capita', cex=1)



Election and Bolsa Familia
