Land Use Regression Modeling of Air Quality using R

This workshop was designed and taught by Marshall Lloyd. At the time of recording, Marshall was a PhD Candidate in the Department of Epidemiology and Biostatistics, School of Population and Global Health, McGill University. Supervisor: Dr. Scott Weichenthal. Website: https://scottweichenthal.weebly.com/people.html

This workshop uses data from CANUE, which are available through the CANUE data access page: https://canue.ca/data/

This workshop will provide an introduction to developing Land Use Regression (LUR) models with an application in outdoor air pollution. LUR models have been used extensively to estimate within-city spatial variations in outdoor air pollutants such as PM2.5, ozone, and ultrafine particles. LUR models are typically developed by regressing air pollution levels onto various land use parameters from curated geographic information system (GIS) databases.

This workshop will simulate an air pollution monitoring campaign in Toronto, using CANUE estimates of annual ambient PM2.5 as the “observed values” (i.e., air pollution measured at fixed sites during a monitoring campaign). Land use data will come from the City of Toronto, the National Pollutant Release Inventory, and OpenStreetMap. With the simulated air pollution monitoring data and the land use data, participants will select land use parameters, train LUR models, and evaluate those models. Finally, participants will create a raster of air pollution predictions for the entire Greater Toronto Area (GTA) based on the new models they have developed. This workshop will use a simple approach to LUR model development, but will also present additional approaches that participants can try on their own time.

This workshop will use R software, a provided dataset from CANUE, and supplemental open datasets. Previous experience with spatial analysis, vector data, coordinate reference systems, and linear regression, as well as an intermediate knowledge of R, are required. R packages that will be extensively used during the workshop: tidyverse, sf, raster, purrr, mgcv.
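
To make the train, evaluate, and predict steps described above concrete, here is a minimal sketch in base R. It is not the workshop's code: it uses only `lm()` instead of the packages listed above, every site, predictor name (`road_length_m`, `green_frac`, `npri_dist_km`), and coefficient is simulated and hypothetical, and a plain matrix stands in for the prediction raster.

```r
# Minimal LUR sketch on fully simulated data (base R only).
# All variable names, values, and coefficients below are hypothetical.
set.seed(42)

# Simulate 100 monitoring sites with three made-up land use predictors.
n_sites <- 100
sites <- data.frame(
  road_length_m = runif(n_sites, 0, 5000),  # road length within a buffer
  green_frac    = runif(n_sites, 0, 1),     # fraction of green space nearby
  npri_dist_km  = runif(n_sites, 0.1, 20)   # distance to nearest industrial emitter
)

# Simulated "observed" annual PM2.5 (ug/m3) with arbitrary effects plus noise.
sites$pm25 <- 6 + 0.0006 * sites$road_length_m - 1.5 * sites$green_frac -
  0.05 * sites$npri_dist_km + rnorm(n_sites, sd = 0.5)

# Hold out 20 sites for evaluation; train on the remaining 80.
test_idx <- sample(n_sites, size = 20)
train <- sites[-test_idx, ]
test  <- sites[test_idx, ]

# Train: regress PM2.5 onto the land use parameters.
lur_fit <- lm(pm25 ~ road_length_m + green_frac + npri_dist_km, data = train)

# Evaluate: RMSE and squared correlation on the held-out sites.
pred <- predict(lur_fit, newdata = test)
rmse <- sqrt(mean((test$pm25 - pred)^2))
r2   <- cor(test$pm25, pred)^2

# Predict: apply the model to every cell of a 20 x 20 grid.
# In practice each cell's predictors would be derived from GIS layers;
# here they are simulated so the example is self-contained.
grid <- expand.grid(x = 1:20, y = 1:20)
grid$road_length_m <- runif(nrow(grid), 0, 5000)
grid$green_frac    <- runif(nrow(grid), 0, 1)
grid$npri_dist_km  <- runif(nrow(grid), 0.1, 20)
grid$pm25_pred     <- predict(lur_fit, newdata = grid)

# Reshape the predictions into a matrix as a stand-in for a raster surface.
pm25_surface <- matrix(grid$pm25_pred, nrow = 20, ncol = 20)
```

Calling `summary(lur_fit)` shows the fitted coefficients, and `image(pm25_surface)` gives a quick look at the predicted surface. A random hold-out is only one way to evaluate; with real monitoring data, cross-validation or a spatially structured split may be more appropriate.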

Code and data for this workshop are available on GitHub in the repository lloydm6/lur_model_workshop ("Develop, evaluate, and use land use regression models").

Please credit Marshall Lloyd whenever you use these files.

These workshops are presented in collaboration by the GeoHealth Network, the Canadian Urban Environmental Health Research Consortium (CANUE), Population Data BC, and University of Victoria Continuing Studies, and are sponsored by the University of Toronto Tri-Campus Graduate Program in Geography and Planning, School of Cities, and University of Toronto - Mississauga (The Angela B. Lange and Ian Orchard Graduate Student Initiatives Fund).