A Federal Use Case of the National Water Service Boundary Layer: Screening communities for drinking water finance for the Justice40 Initiative.

Author

Kyle Onda, Center for Geospatial Solutions

Published

June 29, 2022

1 Introduction

The purpose of this exercise is to demonstrate the value of the National Water Service Area Boundary layer with a Federal government use case. The use case quantifies the extent to which adding a drinking water quality-based indicator to the Climate and Economic Justice Screening Tool (CEJST) would change the universe of “communities” (2010 Census Tract geographies) that are highlighted. This document describes how to construct a binary indicator that corresponds to a drinking-water-related risk and implements the indicator.

2 Defining a drinking water environmental justice indicator

The most common way to define such an indicator is to calculate an indicator variable based on drinking water violations data from the U.S. EPA SDWIS database. Here, I review the pros and cons of the most common approaches in governance and the literature. First, some necessary terminology:

2.1 Terminology

Violation

A violation of regulations of the Safe Drinking Water Act

Maximum Contaminant Level (MCL) Violation

When a concentration of a contaminant is detected to be above the limit allowed in drinking water standards.

Total Coliform Rule

A group of maximum contaminant levels and monitoring requirements for the presence of total coliform, fecal coliform, and E. coli bacteria.

Maximum Residual Disinfectant Level (MRDL) Violations

When a concentration of a disinfectant residual is detected to be above the limit allowed in drinking water standards.

Treatment Technique (TT) rule Violations

When specified treatment techniques required for a system’s water source are not applied.

Health-based violations

Violations that are directly related to health risks in drinking water. Corresponds to all MCL, MRDL, and TT violations.

Monitoring and reporting (MR) violations

Failure to conduct drinking water quality tests or to submit the results of those tests in a timely fashion to USEPA or the primacy agency.

Other Violations

A USEPA catch-all category for other violations, which are generally either about conducting sanitary surveys of the system or about reporting their test results to the public or water systems they are interconnected with.

Compliance

Being in a state of not violating a relevant regulation

Compliance period

A period of time when a water system was in violation of a drinking water regulation

2.2 Review of metrics

There are a few considerations in constructing a binary violation indicator:

  1. What types of violations count? MCL only? Health-based only? All?
  2. What contaminants should count in the case of MCL violations? Those monitored under the Total Coliform Rule? Lead and Copper Rule? Nitrates? All?
  3. How long does a violation need to last to be counted? Any duration? 1 month? 1 year?
  4. Over what time period do they count? The past year? 2 years? 5 years? Forever?

A brief review of the literature follows.

EPA “Enforcement Priority” (formerly “Serious Violator”): a threshold composed of points weighted highest towards TCR violations, less so for other health-based violations like nitrates and LCR, and least for repeated reporting violations, over a rolling 5-year period. Does not count violations that are returned to compliance or are undergoing a verified enforcement action.

Pros:

  • Already used by USEPA to prioritize enforcement efforts, legitimating use in CEJST

  • Weighs violations on certain contaminants more highly based on relative risks to health

  • Includes monitoring violations in a scaled manner, focusing on repeat violations

  • Accounts for the speed of return to compliance, i.e. some measure of the temporal aspect of a health risk exposure

Cons:

  • May be considered stigmatizing by the utility community

  • Opaque, masks the difference between contaminant exposures and negligent monitoring or reporting

  • Includes non-health-based violations that may not be desirable in an “access to safe water”-oriented indicator

  • Not including violations with enforcement actions may underestimate current health risks to the service population

All violations, over a multi-year period (Wallsten and Kosec 2008; Konisky and Teodoro 2015; Marcillo and Krometis 2019)

Pros:

  • Simple to calculate

  • Including monitoring violations captures governance and capacity problems that can undermine the ability of water systems to respond to the needs of their customers

Cons:

  • Including monitoring violations may not be appropriate to a health/exposure-based metric

TCR MCL violations only, on an annual basis (Allaire, Wu, and Lall 2018)

Pros:

  • There is some evidence that TCR violations are the most systematically reported. Other violations may give a biased estimate.

Cons:

  • Does not count the many documented cases of other kinds of violations

  • The “bias” evidence is from 2000 and may not be as relevant today

All MCL violations over a multi-year period (Dobbin and Fencl 2021; Switzer and Teodoro 2017)

Pros:

  • Intuitive appeal for considering all contaminants a potential health risk exposure

Cons:

  • MRDLs and treatment failures are also important health risk exposures

  • Does not capture institutional risks associated with monitoring/reporting violations

All Health-Based Violations over a multi-year period (Dobbin and Fencl 2021; Switzer and Teodoro 2017)

Pros:

  • Intuitive appeal for considering all health-based violations

Cons:

  • Does not capture institutional risks associated with monitoring/reporting violations

2.3 Proposed metric

The proposed metric is a binary indicator: whether the system spent more than 1 month in health-based violation in the past two years. The month and year thresholds are open to discussion. I believe a grace period of 1 month over two years leaves room for true fluke events at utilities that otherwise have good process control. I can also simply create the metric for a few combinations and we can see what seems reasonable.
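One way to compare a few month/year combinations is to wrap the metric in a function of the two thresholds. The sketch below is illustrative only: `viol` is a made-up toy table (not SDWIS records), `flag_systems()` is a hypothetical helper, and days are simply summed per violation after clipping to the look-back window, without merging overlapping violations.

```r
# Hypothetical toy data: one row per health-based violation (not real SDWIS records)
viol <- data.frame(
  pwsid = c("SYS_A", "SYS_A", "SYS_B", "SYS_C"),
  start = as.Date(c("2021-01-01", "2021-06-01", "2019-01-01", "2021-03-01")),
  end   = as.Date(c("2021-01-11", "2021-07-15", "2019-12-31", "2021-03-21"))
)

# Return the systems whose violation-days inside the look-back window exceed min_days
flag_systems <- function(viol, as_of, window_years, min_days) {
  window_start <- as_of - 365 * window_years
  s <- pmax(viol$start, window_start)   # clip each violation to the window
  e <- pmin(viol$end, as_of)
  days <- pmax(as.numeric(e - s), 0)    # violations fully outside the window count 0 days
  total <- tapply(days, viol$pwsid, sum)
  names(total)[total > min_days]
}

as_of <- as.Date("2022-04-13")

# Count flagged systems for each combination of window length and day threshold
grid <- expand.grid(window_years = c(1, 2, 5), min_days = c(0, 30, 90))
grid$n_flagged <- mapply(
  function(w, d) length(flag_systems(viol, as_of, w, d)),
  grid$window_years, grid$min_days
)
grid
```

With the toy data, only SYS_A is flagged under the proposed two-year, 30-day combination, while a five-year window also picks up the older SYS_B violation.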

3 Implementing the metric

First, we load required packages

library(tidyverse)
library(sf)
library(mapview)
library(vroom)
library(qs)
library(janitor)
library(knitr)

sf_use_s2(FALSE)

Below, we describe and implement the workflow in R.

3.1 Collate health-based violations with compliance periods ending April 2020 or later from SDWIS

First, we retrieve the .zip archive of the SDWIS data download from USEPA, and unzip the violations data table.

download.file(url = "https://echo.epa.gov/files/echodownloads/SDWA_latest_downloads.zip", destfile = "data/sdwa.zip") # same path that unzip() reads below
unzip("data/sdwa.zip", 
      files = "SDWA_VIOLATIONS_ENFORCEMENT.csv",
      exdir = "data")

This archive has several tables. The relevant one is SDWA_VIOLATIONS_ENFORCEMENT.csv, which, as of this version of the workflow, was last updated on April 13, 2022.

unzip("data/sdwa.zip", list = TRUE)
                              Name     Length                Date
1       SDWA_EVENTS_MILESTONES.csv   36426509 2022-04-13 15:11:00
2              SDWA_FACILITIES.csv  156229271 2022-04-13 15:13:00
3        SDWA_GEOGRAPHIC_AREAS.csv   40040738 2022-04-13 15:13:00
4             SDWA_LCR_SAMPLES.csv  104175670 2022-04-13 15:13:00
5       SDWA_PUB_WATER_SYSTEMS.csv  126596558 2022-04-13 15:14:00
6      SDWA_PN_VIOLATION_ASSOC.csv   31721515 2022-04-13 15:14:00
7          SDWA_REF_ANSI_AREAS.csv      86159 2022-04-13 15:14:00
8         SDWA_REF_CODE_VALUES.csv     117171 2022-04-13 15:14:00
9           SDWA_SERVICE_AREAS.csv   18901420 2022-04-13 15:15:00
10            SDWA_SITE_VISITS.csv  336423869 2022-04-13 15:17:00
11 SDWA_VIOLATIONS_ENFORCEMENT.csv 3208856416 2022-04-13 15:41:00
archive <- unzip("data/sdwa.zip", list = TRUE)
end_date <- as.Date(archive$Date[archive$Name == "SDWA_VIOLATIONS_ENFORCEMENT.csv"]) # release date of the violations table

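We load the violations table and keep only health-based violations. The look-back window starts on April 13, 2020, two years before the most current data was released; the filter to violations ending on or after that date is applied in Section 3.3.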

start_date <- end_date - 365*2
violators <- vroom("data/SDWA_VIOLATIONS_ENFORCEMENT.csv", # read data
                    col_types = cols(.default = "c")) %>%
  filter(IS_HEALTH_BASED_IND == "Y") %>% # select only health-based violations
  mutate( #format dates as dates
    start = as.Date(COMPL_PER_BEGIN_DATE, 
                                   format = "%m/%d/%Y"),
    end = pmin(as.Date(CALCULATED_RTC_DATE,
                                 format = "%m/%d/%Y"),
                              end_date)) # set end date to most current report date
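Two details of this step are easy to check in isolation: the window start date, and how `pmin()` treats missing return-to-compliance dates. The sketch below uses made-up dates and assumes that a violation still open at release time has a missing return-to-compliance date, which is worth verifying against the SDWIS download. By default `pmin()` propagates `NA`, so such rows would keep a missing `end`; `na.rm = TRUE` caps them at the release date instead.

```r
release_date <- as.Date("2022-04-13")  # release date reported in the archive listing

# Window start: 730 days back, or two calendar years back via seq()
start_730 <- release_date - 365 * 2
start_cal <- seq(release_date, by = "-2 years", length.out = 2)[2]

# Hypothetical return-to-compliance dates; NA stands for a violation still open
rtc <- as.Date(c("03/01/2021", NA, "05/01/2022"), format = "%m/%d/%Y")

end_default <- pmin(rtc, release_date)                # NA stays NA
end_capped  <- pmin(rtc, release_date, na.rm = TRUE)  # NA becomes the release date

data.frame(rtc, end_default, end_capped)
```

For this release date the two window definitions coincide; they can differ by a day when a leap day falls inside the window.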

3.2 Calculate the number of days in health-based violation by each CWS within this period

violators <- violators %>%
  mutate(
    violation_duration = end - start
  ) 
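The step above computes a duration for each violation record. If we instead wanted calendar days in violation per system inside the window, concurrent violations would need to be merged so the same days are not counted twice. A minimal sketch of that alternative follows; `days_in_violation()` and the toy data are hypothetical and are not part of the workflow above.

```r
# Days covered by the union of [start, end] intervals, clipped to a window
days_in_violation <- function(start, end, window_start, window_end) {
  s <- pmax(start, window_start)
  e <- pmin(end, window_end)
  keep <- !is.na(s) & !is.na(e) & e > s   # drop intervals outside the window
  if (!any(keep)) return(0)
  s <- as.numeric(s[keep]); e <- as.numeric(e[keep])
  o <- order(s)
  s <- s[o]; e <- e[o]
  total <- 0
  cur_s <- s[1]; cur_e <- e[1]
  for (i in seq_along(s)[-1]) {
    if (s[i] <= cur_e) {
      cur_e <- max(cur_e, e[i])           # overlapping: extend the current run
    } else {
      total <- total + (cur_e - cur_s)    # gap: close the run and start a new one
      cur_s <- s[i]; cur_e <- e[i]
    }
  }
  total + (cur_e - cur_s)
}

# Hypothetical toy violations: SYS_A has two overlapping violations and one separate one
toy <- data.frame(
  pwsid = c("SYS_A", "SYS_A", "SYS_A", "SYS_B"),
  start = as.Date(c("2021-01-01", "2021-01-20", "2021-06-01", "2019-06-01")),
  end   = as.Date(c("2021-01-31", "2021-02-10", "2021-06-11", "2020-05-13"))
)
w_start <- as.Date("2020-04-13")
w_end   <- as.Date("2022-04-13")

by_system <- sapply(split(toy, toy$pwsid), function(d)
  days_in_violation(d$start, d$end, w_start, w_end))
by_system
```

In the toy data SYS_A accumulates 50 calendar days rather than the 61 a naive sum would give, and SYS_B contributes only the 30 days that fall inside the window.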

3.3 Filter to CWS with at least 30 such days

violators <- violators %>%
  filter(end >= start_date) %>%
  filter(violation_duration >= 30) %>%
  distinct(PWSID)

3.4 Filter the TEMM national water service boundary layer to those states with comprehensive, non-county-based Tier 1 boundaries, plus Utah

First we download and compress the TEMM layer for quick access in later parts of the workflow

download.file(url = "https://www.hydroshare.org/django_irods/rest_download/6f3386bb4bc945028391cfabf1ea252e/data/contents/temm_layer_v1.0.0/temm.geojson/?url_download=True&zipped=False&aggregation=False", destfile = "data/temm.geojson")

temm <- sf::read_sf("data/temm.geojson")
qsave(temm, file = "data/temm.qs")

Now we load the file and filter for only the states with comprehensive Tier 1 availability (AZ, CA, CT, KS, NJ, NM, OK, PA, TX, WA), as well as Utah for comparison. We also load Utah’s Culinary Water Service Areas data, filtered to Community Water Systems. This dataset was not included in the original TEMM layer but has been created by Utah DWR; we use it to compare performance between TEMM estimation methods and Tier 1 boundaries where possible.

temm <- qread("data/temm.qs")
states <- c("AZ",
            "CA",
            "CT",
            "KS",
            "NJ",
            "NM",
            "OK",
            "PA",
            "TX",
            "UT",
            "WA")


utah <- sf::read_sf("https://services.arcgis.com/ZzrwjTRez6FJiOq4/arcgis/rest/services/CulinaryWaterServiceAreas/FeatureServer/0/query?where=1%3D1&objectIds=&time=&geometry=&geometryType=esriGeometryEnvelope&inSR=&spatialRel=esriSpatialRelIntersects&resultType=none&distance=0.0&units=esriSRUnit_Meter&returnGeodetic=false&outFields=*&returnGeometry=true&returnCentroid=false&featureEncoding=esriDefault&multipatchOption=xyFootprint&maxAllowableOffset=&geometryPrecision=&outSR=4326&defaultSR=&datumTransformation=&applyVCSProjection=false&returnIdsOnly=false&returnUniqueIdsOnly=false&returnCountOnly=false&returnExtentOnly=false&returnQueryGeometry=false&returnDistinctValues=false&cacheHint=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&having=&resultOffset=&resultRecordCount=&returnZ=false&returnM=false&returnExceededLimitFeatures=true&quantizationParameters=&sqlFormat=none&f=pgeojson") %>% filter(SYSTEMTYPE == "C")


boundaries <- temm %>% filter(
  state_code %in% states
) %>% select(-service_area_type_code)

3.5 Filter the CEJST Census Tracts to the same states

First we download and unzip the data

download.file(url = "https://static-data-screeningtool.geoplatform.gov/data-pipeline/data/score/shapefile/usa.zip", destfile = "data/j40.zip")
unzip("data/j40.zip", exdir = "data/j40")

Then we load it and the data dictionary

j40 <- sf::read_sf("data/j40/usa.shp")
j40_dict <- vroom("data/j40/columns.csv")
Rows: 110 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): shapefile_column, column_name, column_description

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

3.6 Filter to the resulting set of CWS with violations

v <- boundaries %>% 
  filter(pwsid %in% violators$PWSID)

3.7 Assign the indicator to CEJST Census Tracts that spatially intersect this set of CWS

First we spatially intersect the boundaries of the CWS that meet the health-based violation threshold with the CEJST version of the 2010 U.S. Census Tracts. We also do Utah separately. Caveats:

  • I am not setting a threshold for how large an overlap between a census tract and a utility needs to be. A census tract may have only a small portion covered by a utility and we still count it. There are two scenarios, neither of which I have quantified yet:

    • A relatively large system only serves part of a tract on its periphery

    • A small system with a Tier 1 boundary, like a mobile home park or golf course community, lies completely within a tract much larger than it
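The overlap caveat could be quantified by computing, for each tract, the share of its area covered by a violating system and applying a minimum share. The sketch below is only an illustration: it uses toy planar rectangles with no CRS, `toy_rect()` is a hypothetical helper, and the 0.3 cut-off is arbitrary rather than a recommendation.

```r
library(sf)

# Hypothetical helper: axis-aligned rectangle as an sf polygon (toy geometry, planar units)
toy_rect <- function(xmin, ymin, xmax, ymax) {
  st_polygon(list(rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
                        c(xmin, ymax), c(xmin, ymin))))
}

# Two adjacent toy tracts and one toy service area that straddles them
tracts <- st_sf(GEOID10 = c("A", "B"),
                geometry = st_sfc(toy_rect(0, 0, 4, 4), toy_rect(4, 0, 8, 4)))
system <- st_sf(pwsid = "SYS_X",
                geometry = st_sfc(toy_rect(2, 0, 5, 4)))

# Record each tract's full area before intersecting
tracts$tract_area <- as.numeric(st_area(tracts))

# Intersect, then compute the share of each tract covered by the system
pieces <- suppressWarnings(st_intersection(tracts, system))
pieces$share <- as.numeric(st_area(pieces)) / pieces$tract_area

min_share <- 0.3  # arbitrary illustrative cut-off
flagged <- pieces$GEOID10[pieces$share >= min_share]
flagged
```

Here the system covers half of tract A and a quarter of tract B, so only tract A passes the cut-off. With real longitude/latitude data the layers would need to be projected to an equal-area CRS before areas are compared.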

states2 <- c("Arizona",
            "California",
            "Connecticut",
            "Kansas",
            "New Jersey",
            "New Mexico",
            "Oklahoma",
            "Pennsylvania",
            "Texas",
            "Utah",
            "Washington")


j40 <- j40 %>% filter(
  SF %in% states2
)

v1 <- v %>% filter(state_code == states[1])
j1 <- j40 %>% filter(SF == states2[1])

v_j40 <- st_intersection(v1,j1)
although coordinates are longitude/latitude, st_intersection assumes that they are planar
Warning: attribute variables are assumed to be spatially constant throughout all
geometries
for (i in 2:length(states)){
  v1 <- v %>% filter(state_code == states[i])
  j1 <- j40 %>% filter(SF == states2[i])
  v_j40_1 <- st_intersection(v1,j1)
  v_j40 <- bind_rows(v_j40,v_j40_1)
  print(paste0(i))
}
although coordinates are longitude/latitude, st_intersection assumes that they are planar
Warning: attribute variables are assumed to be spatially constant throughout all
geometries
[1] "2"
v_utah <- utah %>% filter(DWSYSNUM %in% v$pwsid)
j_utah <- j40 %>% filter(SF == "Utah")

v_j40_utah <- st_intersection(v_utah,j_utah)
although coordinates are longitude/latitude, st_intersection assumes that they are planar
Warning: attribute variables are assumed to be spatially constant throughout all
geometries
qsave(v_j40,"data/viol_pws_j40.qs")
qsave(v_j40_utah, "data/viol_pws_j40_utah.qs")

Then, we construct the indicator as framed by CEJST: each Census tract gets a binary indicator for whether it both intersects a violating water system and meets the socioeconomic threshold set by the CEJST (above the 65th percentile for percentage of population living on incomes below 200% of the Federal Poverty Line, and with 80% or more of adults over 15 not enrolled in higher education). We only allow Tier 1 and Tier 2a matches, removing all Tier 2b and Tier 3 matches.

states2 <- c("Arizona",
            "California",
            "Connecticut",
            "Kansas",
            "New Jersey",
            "New Mexico",
            "Oklahoma",
            "Pennsylvania",
            "Texas",
            "Utah",
            "Washington")


j40 <- j40 %>% filter(
  SF %in% states2
)

v_j40_utah <- qread("data/viol_pws_j40_utah.qs") 
v_j40 <- qread("data/viol_pws_j40.qs") %>% filter(tier == "Tier 1" | tier == "Tier 2a")

v_j40 <- v_j40 %>% filter(SF != "Utah")

j40_dw <- j40 %>% 
  mutate(
   DW = case_when(GEOID10 %in% v_j40$GEOID10 ~ "SDWA Violation Present",
                       TRUE ~ "SDWA Violation Not Present"), # DW indicator with Tier 2 Utah 
   DW_ut1 = case_when( ((GEOID10 %in% v_j40$GEOID10 & SF != "Utah") | (GEOID10 %in% v_j40_utah$GEOID10)) ~ "SDWA Violation Present",
                       TRUE ~ "SDWA Violation Not Present") # DW indicator with Tier 1 Utah
  )

j40_dw <- j40_dw %>%
  mutate(
    dw_disadv = (DW == "SDWA Violation Present" & M_EBSI == 1),
    dw_disadv_ut1 = (DW_ut1 == "SDWA Violation Present" & M_EBSI == 1)
  )

3.8 Calculate and compare summary statistics for the original set of CEJST tracts (in the filtered states) and the new set, by state, including:

  • counts

Tract counts in selected states by Drinking Water indicator and current CEJST disadvantage status

j40_dw <- j40_dw %>%
  mutate(
    CEJST_disadv = case_when(
      SM_C == 1 ~ "CEJST disadvantaged",
      TRUE ~ "Not CEJST disadvantaged"
    ), 
    CEJST_income_educ_threshold = case_when(
      M_EBSI == 1 ~ "CEJST 'low' inc/edu",
      TRUE ~ "CEJST 'high' inc/edu"
    )
  )
  x <-  j40_dw %>% st_drop_geometry() %>% 
  tabyl(DW_ut1,CEJST_disadv) %>% 
  adorn_totals(where = c("row","col")) %>%
  adorn_percentages("col") %>%
  adorn_pct_formatting() %>%
  adorn_ns() 
  
  kable(x)
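| DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
|:---------------------------|:--------------------|:------------------------|:---------------|
| SDWA Violation Not Present | 88.1% (7135)        | 89.2% (15324)           | 88.9% (22459)  |
| SDWA Violation Present     | 11.9% (961)         | 10.8% (1850)            | 11.1% (2811)   |
| Total                      | 100.0% (8096)       | 100.0% (17174)          | 100.0% (25270) |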
  • Tract counts in selected states by Drinking Water indicator and current CEJST disadvantage status by State

    j40_dw <- j40_dw %>%
      mutate(
        CEJST_disadv = case_when(
          SM_C == 1 ~ "CEJST disadvantaged",
          TRUE ~ "Not CEJST disadvantaged"
        ), 
        CEJST_income_educ_threshold = case_when(
          M_EBSI == 1 ~ "CEJST 'low' inc/edu",
          TRUE ~ "CEJST 'high' inc/edu"
        )
      )
    j40_dw %>% st_drop_geometry() %>% 
      tabyl(DW_ut1,CEJST_disadv,SF) %>% 
      adorn_totals(where = c("row","col")) %>%
      adorn_percentages("col") %>%
      adorn_pct_formatting() %>%
      adorn_ns() %>%
      walk2(names(.), ~ print(kable(.x, caption = .y)))
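    Arizona
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 92.9% (459)         | 96.3% (994)             | 95.2% (1453)   |
    | SDWA Violation Present     | 7.1% (35)           | 3.7% (38)               | 4.8% (73)      |
    | Total                      | 100.0% (494)        | 100.0% (1032)           | 100.0% (1526)  |

    California
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 96.8% (2815)        | 98.0% (5049)            | 97.6% (7864)   |
    | SDWA Violation Present     | 3.2% (92)           | 2.0% (101)              | 2.4% (193)     |
    | Total                      | 100.0% (2907)       | 100.0% (5150)           | 100.0% (8057)  |

    Connecticut
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 96.5% (165)         | 95.8% (634)             | 95.9% (799)    |
    | SDWA Violation Present     | 3.5% (6)            | 4.2% (28)               | 4.1% (34)      |
    | Total                      | 100.0% (171)        | 100.0% (662)            | 100.0% (833)   |

    Kansas
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 90.2% (184)         | 81.3% (460)             | 83.6% (644)    |
    | SDWA Violation Present     | 9.8% (20)           | 18.7% (106)             | 16.4% (126)    |
    | Total                      | 100.0% (204)        | 100.0% (566)            | 100.0% (770)   |

    New Jersey
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 61.2% (301)         | 68.1% (1033)            | 66.4% (1334)   |
    | SDWA Violation Present     | 38.8% (191)         | 31.9% (485)             | 33.6% (676)    |
    | Total                      | 100.0% (492)        | 100.0% (1518)           | 100.0% (2010)  |

    New Mexico
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 73.0% (165)         | 89.4% (244)             | 82.0% (409)    |
    | SDWA Violation Present     | 27.0% (61)          | 10.6% (29)              | 18.0% (90)     |
    | Total                      | 100.0% (226)        | 100.0% (273)            | 100.0% (499)   |

    Oklahoma
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 66.5% (300)         | 78.0% (464)             | 73.0% (764)    |
    | SDWA Violation Present     | 33.5% (151)         | 22.0% (131)             | 27.0% (282)    |
    | Total                      | 100.0% (451)        | 100.0% (595)            | 100.0% (1046)  |

    Pennsylvania
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 87.4% (653)         | 85.1% (2102)            | 85.6% (2755)   |
    | SDWA Violation Present     | 12.6% (94)          | 14.9% (369)             | 14.4% (463)    |
    | Total                      | 100.0% (747)        | 100.0% (2471)           | 100.0% (3218)  |

    Texas
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 85.8% (1776)        | 86.0% (2748)            | 85.9% (4524)   |
    | SDWA Violation Present     | 14.2% (293)         | 14.0% (448)             | 14.1% (741)    |
    | Total                      | 100.0% (2069)       | 100.0% (3196)           | 100.0% (5265)  |

    Utah
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 94.7% (72)          | 86.3% (442)             | 87.4% (514)    |
    | SDWA Violation Present     | 5.3% (4)            | 13.7% (70)              | 12.6% (74)     |
    | Total                      | 100.0% (76)         | 100.0% (512)            | 100.0% (588)   |

    Washington
    | DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
    |:---------------------------|:--------------------|:------------------------|:---------------|
    | SDWA Violation Not Present | 94.6% (245)         | 96.2% (1154)            | 96.0% (1399)   |
    | SDWA Violation Present     | 5.4% (14)           | 3.8% (45)               | 4.0% (59)      |
    | Total                      | 100.0% (259)        | 100.0% (1199)           | 100.0% (1458)  |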
  • Tract counts in selected states by Drinking Water indicator and current CEJST disadvantage status, by the CEJST non-student low-income socioeconomic indicator

x<-j40_dw %>% st_drop_geometry() %>% 
  tabyl(DW_ut1,CEJST_disadv,CEJST_income_educ_threshold) %>% 
  adorn_totals(where = c("row","col")) %>%
  adorn_percentages("col") %>%
  adorn_pct_formatting() %>%
  adorn_ns() %>%
  walk2(names(.), ~ print(kable(.x, caption = .y)))
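CEJST ‘high’ inc/edu
| DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
|:---------------------------|:--------------------|:------------------------|:---------------|
| SDWA Violation Not Present | 91.2% (719)         | 89.2% (14794)           | 89.3% (15513)  |
| SDWA Violation Present     | 8.8% (69)           | 10.8% (1783)            | 10.7% (1852)   |
| Total                      | 100.0% (788)        | 100.0% (16577)          | 100.0% (17365) |

CEJST ‘low’ inc/edu
| DW_ut1                     | CEJST disadvantaged | Not CEJST disadvantaged | Total          |
|:---------------------------|:--------------------|:------------------------|:---------------|
| SDWA Violation Not Present | 87.8% (6416)        | 88.8% (530)             | 87.9% (6946)   |
| SDWA Violation Present     | 12.2% (892)         | 11.2% (67)              | 12.1% (959)    |
| Total                      | 100.0% (7308)       | 100.0% (597)            | 100.0% (7905)  |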
  • total population (see map)
pop_by_tract_type <- j40_dw %>% 
  ungroup() %>%
  group_by(CEJST_disadv,dw_disadv_ut1) %>% 
  summarise(population = sum(TPF,na.rm=TRUE)) %>% ungroup()
`summarise()` has grouped output by 'CEJST_disadv'. You can override using the
`.groups` argument.
although coordinates are longitude/latitude, st_union assumes that they are
planar
j40_types <- j40_dw %>%
  mutate(cat1 = case_when(
    dw_disadv_ut1 & CEJST_disadv == "CEJST disadvantaged"  ~ "DW EJ indicator + Current CEJST",
    !dw_disadv_ut1 & CEJST_disadv == "CEJST disadvantaged"  ~ "No DW EJ indicator + Current CEJST",
    !dw_disadv_ut1 & CEJST_disadv == "Not CEJST disadvantaged"  ~ "No DW EJ indicator + Not Current CEJST",
    dw_disadv_ut1 & CEJST_disadv == "Not CEJST disadvantaged"  ~ "DW EJ indicator + Not Current CEJST"
  ),
        cat2 = case_when(
         DW_ut1=="SDWA Violation Present" & CEJST_disadv == "CEJST disadvantaged"  ~ "DW violation + Current CEJST",
    DW_ut1=="SDWA Violation Not Present" & CEJST_disadv == "CEJST disadvantaged"  ~ "No DW violation + Current CEJST",
    DW_ut1=="SDWA Violation Not Present" & CEJST_disadv == "Not CEJST disadvantaged"  ~ "No DW violation + Not Current CEJST",
    DW_ut1=="SDWA Violation Present" & CEJST_disadv == "Not CEJST disadvantaged"  ~ "DW violation + Not Current CEJST"   
        )) %>% ungroup()

j40_types_1 <- j40_types %>% group_by(cat1) %>% summarise(pop=sum(TPF,na.rm=TRUE))
although coordinates are longitude/latitude, st_union assumes that they are planar
j40_types_2 <- j40_types %>% group_by(cat2) %>% summarise(pop=sum(TPF,na.rm=TRUE))
although coordinates are longitude/latitude, st_union assumes that they are planar
table <- pop_by_tract_type %>% 
  st_drop_geometry() %>% 
  mutate(pop_millions = population/1000000) %>% 
  select(-population) %>%
  pivot_wider(names_from=dw_disadv_ut1, values_from = pop_millions) %>% 
  rename(`DW Violation + low income` = `TRUE`,
         `No DW Violation or not low income` = `FALSE`)
kable(table)
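| CEJST_disadv            | No DW Violation or not low income | DW Violation + low income |
|:------------------------|:----------------------------------|:--------------------------|
| CEJST disadvantaged     | 32.18822                          | 3.655508                  |
| Not CEJST disadvantaged | 82.27428                          | 0.338452                  |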

Map 1: Census Tracts categorized by drinking water violations and current CEJST status

mapviewOptions(fgb = TRUE)
m1<-mapview::mapview(j40_types_2, zcol="cat2", layer.name="Tract Category")
mapview::mapshot(m1,url="map1.html")
m1

Map 2: Census Tracts categorized by provisional drinking water EJ indicator (drinking water violation + low income) and current CEJST status

mapviewOptions(fgb = TRUE)
m2<-mapview::mapview(j40_types_1, zcol="cat1", layer.name="Tract Category")
mapview::mapshot(m2,url="map2.html")
m2
  • measures of income distribution
library(tidycensus)
census_api_key(Sys.getenv("CENSUS_API_KEY")) # read the key from the environment rather than hard-coding it
options(tigris_use_cache = TRUE)
tr <- tidycensus::get_acs()
  • race/ethnicity population distributions

4 Discussion

Will discuss results in detail here.

4.1 Headline 1:

959 low-income tracts (“communities”) across the 11 states intersect a water system with health-based drinking water violations; 67 of these are not currently identified as disadvantaged in the CEJST.
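The headline figures can be traced back to the tract-count tables in Section 3.8. The short check below copies the "SDWA Violation Present" counts from the ‘low’ and ‘high’ inc/edu tables by hand (the variable names are mine) and confirms that they add up to the headline and to the overall total.

```r
# "SDWA Violation Present" tract counts copied from the inc/edu tables in Section 3.8
low_inc_viol  <- c(cejst_disadv = 892, not_cejst_disadv = 67)
high_inc_viol <- c(cejst_disadv = 69,  not_cejst_disadv = 1783)

sum(low_inc_viol)                        # low-income tracts with a violation
low_inc_viol[["not_cejst_disadv"]]       # of which not currently CEJST-disadvantaged
sum(low_inc_viol) + sum(high_inc_viol)   # all tracts with a violation, as in the overall table
```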

References

Allaire, Maura, Haowei Wu, and Upmanu Lall. 2018. “National Trends in Drinking Water Quality Violations.” Proceedings of the National Academy of Sciences 115 (9): 2078–83. https://doi.org/10.1073/pnas.1719805115.
Dobbin, Kristin B., and Amanda L. Fencl. 2021. “Institutional Diversity and Safe Drinking Water Provision in the United States.” Utilities Policy 73 (December): 101306. https://doi.org/10.1016/j.jup.2021.101306.
Konisky, David M., and Manuel P. Teodoro. 2015. “When Governments Regulate Governments.” American Journal of Political Science 60 (3): 559–74. https://doi.org/10.1111/ajps.12221.
Marcillo, Cristina E., and Leigh-Anne H. Krometis. 2019. “Small Towns, Big Challenges: Does Rurality Influence Safe Drinking Water Act Compliance?” AWWA Water Science 1 (1). https://doi.org/10.1002/aws2.1120.
Switzer, David, and Manuel P. Teodoro. 2017. “Class, Race, Ethnicity, and Justice in Safe Drinking Water Compliance*.” Social Science Quarterly 99 (2): 524–35. https://doi.org/10.1111/ssqu.12397.
Wallsten, Scott, and Katrina Kosec. 2008. “The Effects of Ownership and Benchmark Competition: An Empirical Analysis of U.S. Water Systems.” International Journal of Industrial Organization 26 (1): 186–205. https://doi.org/10.1016/j.ijindorg.2006.11.001.