Lab 3 - Exploring colour and visualizing spatial data in R with ggplot2's geom_polygon and ggmap

General lab instructions

3 [Mechanics]

Overview

In this lab you will explore how to visualize spatial data in R using ggplot2's geom_polygon function and the ggmap package. You will also explore the effective use of colour.

Resources:

ggmap cheatsheet

Exercise 1 - Create a choropleth map of Canada to visualize statistics

  • Follow this example on how to create a choropleth map of Canadian provinces and territories to visualize population count data by province. The data required for this example live here. The population data were sourced from this table from Statistics Canada.
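As a rough illustration of the idea behind the linked example, the sketch below joins one value per region onto polygon outlines and fills each polygon by that value with geom_polygon. The two square "regions" and their population numbers are hypothetical placeholders, not real boundaries or Statistics Canada figures; the actual lab data will have its own column names.

```r
library(ggplot2)

# Hypothetical outlines: two unit squares standing in for province boundaries.
outlines <- data.frame(
  region = rep(c("Region A", "Region B"), each = 4),
  long   = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat    = c(0, 0, 1, 1,  0, 0, 1, 1)
)

# Hypothetical attribute table: one placeholder value per region.
values <- data.frame(
  region     = c("Region A", "Region B"),
  population = c(100, 400)
)

# merge() may reorder rows, so record the vertex order and restore it after
# the join; polygons are drawn in row order.
outlines$vertex <- seq_len(nrow(outlines))
map_df <- merge(outlines, values, by = "region")
map_df <- map_df[order(map_df$vertex), ]

# One polygon per region, filled by the joined value.
choropleth <- ggplot(map_df, aes(x = long, y = lat, group = region)) +
  geom_polygon(aes(fill = population), colour = "white") +
  coord_quickmap() +
  scale_fill_gradient(low = "grey90", high = "darkblue", name = "Population") +
  labs(title = "Toy choropleth (placeholder data)",
       x = "Longitude", y = "Latitude")
```

Printing `choropleth` draws the map. A single-hue gradient that varies mainly in luminance is used here because the placeholder attribute is sequential; that is one possible choice, not the required one.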

1a

1 [Code]   2 [Visualizations]

1b

2 [Reasoning]
  • Discuss the results of your visualization (what did you find out about the data by creating the visualization).
  • Reflect on and discuss how the data is represented visually and why you think it is or is not effective. Explicitly state and comment on the marks and channels used in your visual encoding, the tasks that are well supported by it, any choices you made to derive additional attributes beyond the input dataset, and the scale of the data in terms of the number of observations and the number of levels of each categorical or ordered attribute (if such attributes are present). Provide a rationale for your use of colour, with explicit discussion of its constituent visual channels of luminance, saturation, and hue, and of sequential, diverging, or cyclic attribute characteristics.

Exercise 2 - Correcting count data for population size

source: http://xkcd.com/1138/

2a

1 [Code]   2 [Visualizations]
  • Using a strategy similar to the one you used in Exercise 1, make a choropleth map that illustrates the number of asthma cases per Canadian province/territory for 2014. The asthma data is available here, and was sourced from here. You can decide how you want to deal with, or ignore, the gender aspect of the data. Note: you will have to do some data wrangling to get this data into a usable shape for plotting.
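One way to handle the gender aspect is to keep only the 2014 rows and sum the counts across the sexes, leaving one row per province. The sketch below uses base R only; the table layout, column names, and counts are hypothetical placeholders, and the real asthma file will likely need extra cleaning first.

```r
# Hypothetical long-format table; names and counts are placeholders.
asthma <- data.frame(
  province = c("Region A", "Region A", "Region A", "Region B", "Region B"),
  sex      = c("Females", "Females", "Males", "Females", "Males"),
  year     = c(2013, 2014, 2014, 2014, 2014),
  cases    = c(28, 30, 20, 120, 80)
)

# Keep the year of interest, then sum cases over sex within each province.
asthma_2014  <- subset(asthma, year == 2014)
asthma_total <- aggregate(cases ~ province, data = asthma_2014, FUN = sum)
```

Keeping the sexes separate and drawing one map per sex is an equally valid choice; summing simply gives a single count per region to fill the map with.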

2b

1 [Code]   2 [Visualizations]
  • Make another choropleth map using this data, but this time standardize by the population size of each province (e.g., provide cases of asthma per 1000 people, or whatever seems reasonable). You can use the 2014 provincial/territorial population data from the population data set used in Exercise 1 to do this.
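The standardization amounts to joining the case counts to the population table by province and computing rate = cases / population × 1000. A minimal base R sketch follows, with hypothetical placeholder numbers chosen so that the region with more cases ends up with the lower rate.

```r
# Hypothetical per-province totals (placeholder numbers, not real data).
cases <- data.frame(
  province = c("Region A", "Region B"),
  cases    = c(50, 200)
)
population <- data.frame(
  province   = c("Region B", "Region A"),  # row order need not match
  population = c(4000, 500)
)

# Join on the province name, then convert counts to a rate per 1000 people.
rates <- merge(cases, population, by = "province")
rates$cases_per_1000 <- rates$cases / rates$population * 1000
```

Here Region B has four times as many cases as Region A but half the rate (50 versus 100 per 1000), which is the kind of difference the two maps in this exercise are meant to expose.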

2c

2 [Reasoning]
  • Discuss the results of the two visualizations. How are they similar? How do they differ? Which visualization do you think is more informative? Explain.

Exercise 3 - Plotting points on a map using ggmap

3a

1 [Code]   2 [Visualizations]
  • Download Canadian earthquake data from the Earthquake Database for a time period you are interested in (the database goes all the way back to the 1960s). Note: you will have to do some data wrangling to get the data into good shape for plotting.

  • Get a map of Canada using ggmap's get_map() function, and overlay the locations of the earthquakes on the map using ggplot2's geom_point.

  • Colour-code the points based on the depth of the earthquake.

  • Have the size of the point/bubble represent the magnitude of the earthquake.

  • Don't forget to label the visualization so that all someone has to do is look at it to understand it.
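The encoding described in the steps above (position for location, colour for depth, size for magnitude) can be sketched with ggplot2 alone. The example uses R's built-in `quakes` dataset, which records earthquakes near Fiji, purely as a stand-in for the Canadian download. In the lab itself the base layer would presumably come from `ggmap()` applied to the result of `get_map()` rather than from `ggplot()`, with the data and aesthetics passed to `geom_point()`.

```r
library(ggplot2)

# `quakes` ships with R (columns: lat, long, depth, mag, stations).
# It is a stand-in here; the lab asks for Canadian earthquake data.
quake_plot <- ggplot(quakes, aes(x = long, y = lat)) +
  # Colour encodes depth, point size encodes magnitude;
  # transparency reduces overplotting where points are dense.
  geom_point(aes(colour = depth, size = mag), alpha = 0.5) +
  scale_colour_viridis_c(name = "Depth (km)") +
  scale_size_continuous(name = "Magnitude", range = c(0.5, 4)) +
  labs(title = "Earthquake locations, coloured by depth and sized by magnitude",
       x = "Longitude", y = "Latitude")
```

Printing `quake_plot` draws the figure. The viridis scale is one reasonable sequential palette for an ordered quantitative attribute such as depth; other choices can be justified in the reasoning section.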

3b

2 [Reasoning]
  • Discuss the results of your visualization (what did you find out about the data by creating the visualization).
  • Reflect on and discuss how the data is represented visually and why you think it is or is not effective. Explicitly state and comment on the marks and channels used in your visual encoding, the tasks that are well supported by it, any choices you made to derive additional attributes beyond the input dataset, and the scale of the data in terms of the number of observations and the number of levels of each categorical or ordered attribute (if such attributes are present). Provide a rationale for your use of colour, with explicit discussion of its constituent visual channels of luminance, saturation, and hue, and of sequential, diverging, or cyclic attribute characteristics.

Exercise 4 - Contour heat maps with ggmap

4a

1 [Code]   2 [Visualizations]
  • Download the .csv file for the Vancouver Police Department Crime dataset (or find another dataset with similar data) and create a contour heat map (also known as a contour plot, a filled isocontour plot, or a level plot) with ggmap. Refer to Lesson 3 in this tutorial as a guide on how to do this.
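A filled contour heat map of point locations can be sketched with ggplot2's `stat_density_2d()`, which estimates a 2D kernel density and draws its contour levels as polygons. This is not necessarily the approach the linked tutorial takes. The points below are simulated around two arbitrary centres and are not real crime data; with ggmap, the same layer would be added on top of a base map instead of a blank `ggplot()`.

```r
library(ggplot2)

set.seed(1)
# Simulated incident locations (placeholder data, two clusters).
incidents <- data.frame(
  lon = c(rnorm(200, -123.10, 0.01), rnorm(100, -123.05, 0.02)),
  lat = c(rnorm(200,   49.28, 0.01), rnorm(100,   49.25, 0.02))
)

# Filled density contours: each polygon is one density level,
# and the fill colour encodes that level.
heat <- ggplot(incidents, aes(x = lon, y = lat)) +
  stat_density_2d(aes(fill = after_stat(level)),
                  geom = "polygon", alpha = 0.4) +
  scale_fill_gradient(low = "yellow", high = "red", name = "Density level") +
  labs(title = "Contour heat map of simulated incidents",
       x = "Longitude", y = "Latitude")
```

Printing `heat` draws the contours. The semi-transparent fill is a design choice that lets an underlying map stay visible through the density layer.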

4b

2 [Reasoning]
  • Discuss the results of your visualization (what did you find out about the data by creating the visualization).
  • Reflect on and discuss how the data is represented visually and why you think it is or is not effective. Explicitly state and comment on the marks and channels used in your visual encoding, the tasks that are well supported by it, any choices you made to derive additional attributes beyond the input dataset, and the scale of the data in terms of the number of observations and the number of levels of each categorical or ordered attribute (if such attributes are present). Provide a rationale for your use of colour, with explicit discussion of its constituent visual channels of luminance, saturation, and hue, and of sequential, diverging, or cyclic attribute characteristics.