6 min read

Residential zoning in Pittsburgh

The New York Times recently published an article about zoning in U.S. cities, particularly single-unit detached residential housing. The article did not include Pittsburgh, so I downloaded the zone shapefile from the WPRDC and made my own map.

This blog quickly goes through the steps to make the map and other graphs about the data.

First, load the required libraries and set up the environment:

library(tidyverse)
library(sf)
library(ggmap)
library(janitor)
library(hrbrthemes)

options(scipen = 999)

Read in the shapefile with st_read and inspect the data with glimpse:

shapefile <- st_read("data/zoning/Zoning.shp")
## Reading layer `Zoning' from data source `C:\Users\Conor\Documents\github\blog\content\post\data\zoning\Zoning.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1055 features and 18 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -80.09534 ymin: 40.36161 xmax: -79.86577 ymax: 40.50097
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
glimpse(shapefile)
## Observations: 1,055
## Variables: 19
## $ objectid   <dbl> 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 101...
## $ area       <dbl> 313156.6, 6630129.1, 2907276.6, 14974381.6, 976915....
## $ perimeter  <dbl> 3950.255, 27722.334, 14342.342, 51430.490, 7215.884...
## $ zoning_    <dbl> 505, 507, 627, 640, 660, 666, 740, 806, 878, 914, 9...
## $ zoning_id  <dbl> 505, 507, 627, 640, 660, 666, 740, 806, 878, 914, 9...
## $ zon_new    <fct> RP, R2-L, UNC, R2-L, UNC, R1D-M, R1A-VH, R1D-L, R1D...
## $ shape_leng <dbl> 3950.255, 27722.334, 14342.342, 51430.490, 8527.517...
## $ correction <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ full_zonin <fct> RESIDENTIAL PLANNED UNIT DEVELOPMENT, TWO-UNIT RESI...
## $ legendtype <fct> Planned Unit Development, Two-Unit Residential, Urb...
## $ municode   <fct> http://library.municode.com/HTML/13525/level4/PIZOC...
## $ status     <fct> Approved, Approved, Approved, Approved, Approved, A...
## $ created_us <fct> pgh.admin, pgh.admin, pgh.admin, pgh.admin, pgh.adm...
## $ created_da <fct> 2019-04-16T15:07:40.150Z, 2019-04-16T15:07:40.150Z,...
## $ last_edite <fct> pgh.admin, pgh.admin, pgh.admin, pgh.admin, pgh.adm...
## $ last_edi_1 <fct> 2019-04-16T15:07:40.150Z, 2019-04-16T15:07:40.150Z,...
## $ Shape__Are <dbl> 313156.6, 6630129.1, 2907276.6, 14975696.6, 1117801...
## $ Shape__Len <dbl> 3950.255, 27722.334, 14342.342, 51227.951, 8549.557...
## $ geometry   <POLYGON [°]> POLYGON ((-79.98031 40.4449..., POLYGON ((-...

We need to munge the data to get it in shape for analysis. This makes some simple TRUE|FALSE flags for basic zone information and uses case_when to create type, which represents aggregated zone types.

df <- shapefile %>% 
  mutate(residential = str_detect(full_zonin, "RESIDENT"),
         single_unit = str_detect(full_zonin, "SINGLE-UNIT"),
         attached = str_detect(full_zonin, "ATTACHED"),
         type = case_when(residential == TRUE & single_unit == TRUE & attached == FALSE ~ "Single-unit detached residential",
                          residential == TRUE & single_unit == FALSE | attached == TRUE ~ "Other residential",
                          full_zonin == "EDUCATIONAL/MEDICAL INSTITUTION" ~ "Educational/Medical",
                          residential == FALSE ~ "Other non-residential"),
         type = factor(type, levels = c("Single-unit detached residential", 
                                        "Other residential",
                                        "Educational/Medical",
                                        "Other non-residential")),
         alpha_flag = type == "Single-unit detached residential")
## Observations: 1,055
## Variables: 23
## $ objectid    <dbl> 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 10...
## $ area        <dbl> 313156.6, 6630129.1, 2907276.6, 14974381.6, 976915...
## $ perimeter   <dbl> 3950.255, 27722.334, 14342.342, 51430.490, 7215.88...
## $ zoning_     <dbl> 505, 507, 627, 640, 660, 666, 740, 806, 878, 914, ...
## $ zoning_id   <dbl> 505, 507, 627, 640, 660, 666, 740, 806, 878, 914, ...
## $ zon_new     <fct> RP, R2-L, UNC, R2-L, UNC, R1D-M, R1A-VH, R1D-L, R1...
## $ shape_leng  <dbl> 3950.255, 27722.334, 14342.342, 51430.490, 8527.51...
## $ correction  <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA...
## $ full_zonin  <fct> RESIDENTIAL PLANNED UNIT DEVELOPMENT, TWO-UNIT RES...
## $ legendtype  <fct> Planned Unit Development, Two-Unit Residential, Ur...
## $ municode    <fct> http://library.municode.com/HTML/13525/level4/PIZO...
## $ status      <fct> Approved, Approved, Approved, Approved, Approved, ...
## $ created_us  <fct> pgh.admin, pgh.admin, pgh.admin, pgh.admin, pgh.ad...
## $ created_da  <fct> 2019-04-16T15:07:40.150Z, 2019-04-16T15:07:40.150Z...
## $ last_edite  <fct> pgh.admin, pgh.admin, pgh.admin, pgh.admin, pgh.ad...
## $ last_edi_1  <fct> 2019-04-16T15:07:40.150Z, 2019-04-16T15:07:40.150Z...
## $ Shape__Are  <dbl> 313156.6, 6630129.1, 2907276.6, 14975696.6, 111780...
## $ Shape__Len  <dbl> 3950.255, 27722.334, 14342.342, 51227.951, 8549.55...
## $ residential <lgl> TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, ...
## $ single_unit <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRU...
## $ attached    <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FA...
## $ type        <fct> Other residential, Other residential, Other non-re...
## $ alpha_flag  <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE, TR...

This counts the number of rows per full zone description (full_zonin) and type:

df_zones <- df %>% 
  count(full_zonin, type, sort = TRUE) %>% 
  st_drop_geometry()
## # A tibble: 59 x 3
##    full_zonin                              type                           n
##  * <fct>                                   <fct>                      <int>
##  1 PARKS AND OPEN SPACE                    Other non-residential        148
##  2 LOCAL NEIGHBORHOOD COMMERCIAL           Other non-residential        126
##  3 SINGLE-UNIT DETACHED RESIDENTIAL LOW D~ Single-unit detached resi~    75
##  4 MULTI-UNIT RESIDENTIAL MODERATE DENSITY Other residential             69
##  5 HILLSIDE                                Other non-residential         65
##  6 URBAN INDUSTRIAL                        Other non-residential         52
##  7 TWO-UNIT RESIDENTIAL LOW DENSITY        Other residential             51
##  8 SINGLE-UNIT DETACHED RESIDENTIAL HIGH ~ Single-unit detached resi~    44
##  9 SINGLE-UNIT ATTACHED RESIDENTIAL HIGH ~ Other residential             43
## 10 RESIDENTIAL PLANNED UNIT DEVELOPMENT    Other residential             42
## # ... with 49 more rows

Create a basic bar chart to show the distribution of type:

df %>% 
  st_drop_geometry() %>% 
  group_by(type, residential) %>% 
  summarize(area = sum(area)) %>% 
  ungroup() %>% 
  mutate(type = fct_reorder(type, area)) %>% 
  ggplot(aes(type, area / 1000000, fill = residential)) +
    geom_col() +
    scale_y_comma() +
    scale_fill_discrete("Is the zone residential?") +
    labs(x = "Zone type",
         y = "Land area in millions of feet") +
    coord_flip() +
    theme_ipsum()

Use a bar chart to show the distribution of full_zonin:

df %>% 
  st_drop_geometry() %>% 
  group_by(full_zonin, residential) %>% 
  summarize(area = sum(area)) %>% 
  ungroup() %>% 
  mutate(full_zonin = fct_reorder(full_zonin, area)) %>% 
  ggplot(aes(full_zonin, area / 1000000, fill = residential)) +
    geom_col() +
    scale_y_comma() +
    scale_fill_discrete("Is the zone residential?") +
    labs(x = "Full zone description",
         y = "Land area in millions of feet") +
    coord_flip() +
    theme_ipsum()

This calculates the total land area zoned for any type of residential housing:

df %>% 
  st_drop_geometry() %>% 
  mutate(single_unit_flag = type == "Single-unit detached residential") %>% 
  filter(residential == TRUE) %>% 
  summarize(total_area = sum(area))
##   total_area
## 1  770835393

This calculates the % of residential zoning that is zoned for single-unit detached residential housing units:

df %>% 
  st_drop_geometry() %>% 
  filter(residential == TRUE) %>% 
  mutate(single_unit_flag = (type == "Single-unit detached residential")) %>% 
  group_by(single_unit_flag) %>% 
  summarize(zone_area = sum(area)) %>% 
  mutate(pct_area = zone_area / sum(zone_area))
## # A tibble: 2 x 3
##   single_unit_flag  zone_area pct_area
##   <lgl>                 <dbl>    <dbl>
## 1 FALSE            335471796.    0.435
## 2 TRUE             435363597.    0.565

This creates a map of the zones, fills them by type, and overlays it on a GoogleMaps basemap. I also insert the boundaries of the City of Pittsburgh.

city_boundary <- st_read("data/Pittsburgh_City_Boundary/Pittsburgh_City_Boundary.shp")
## Reading layer `Pittsburgh_City_Boundary' from data source `C:\Users\Conor\Documents\github\blog\content\post\data\Pittsburgh_City_Boundary\Pittsburgh_City_Boundary.shp' using driver `ESRI Shapefile'
## Simple feature collection with 8 features and 6 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -80.09534 ymin: 40.36161 xmax: -79.86577 ymax: 40.50097
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
#register_google(key = "your key here")
pgh_map <- get_map(location = "The Hill, Pittsburgh, PA", zoom = 12, maptype = "roadmap", source = "google")

ggmap(pgh_map) +
  geom_sf(data = df %>% filter(type != "Other non-residential"), aes(fill = type), inherit.aes = FALSE, size = .5, alpha = 1, color = NA) +
  geom_sf(data = city_boundary, inherit.aes = FALSE, alpha = 0, size = 2) +
  coord_sf(crs = st_crs(4326)) +
  scale_fill_manual("Zone type",
                      values = c("#ea60b9", "#4cafc5", "yellow", "light grey")) +
  labs(title = "56% of residential zoned land area is single-family detached residential",
         subtitle = "City of Pittsburgh zoning",
         caption = "@conor_tompkins, data from WPRDC") +
  theme_void()

I used scale_fill_manual to manually set the color palette to match the NYTimes article.