How many Houston homes lost power as a result of the 2021 winter storms in Texas? Is median household income a predictor of community recovery? Were there disproportionate affects on medically vulnerable communities?
Background
Severe winter storms hit Texas February 10th through the 20th in 2021, producing the “worst energy infrastructure failure in Texas state history”.1 Millions of homes lost power, resulting in at least 57 deaths and over $195 billion in property damage.2 As climate change increases the frequency and severity of natural disasters that lead to power outages, environmental justice concerns arise around community recovery and resiliency. Studies have shown median household income to be significant predictors of recovery time 3 and power outages place electricity-dependent individuals (e.g. those with disabilities needing electricity to run medical equipment or refrigeration to store medicine) at higher risk for adverse health outcomes.4 Focusing analysis on Houston, the most populous city in Texas 5, we can use overlay spatial data to quantify how many residents lost power and the potential socioeconomic disparities in those affected by the power outage.
This NASA Earth data product is distributed in 10x10 degree tiles in sinusoidal equal-area projection. Tiles are identified by their horizontal and vertical position in the grid. Houston lies on the border of tiles h08v05 and h08v06, so there will be two tiles per date.
checkout the code
#load required packageslibrary(terra)library(tmap)library(stars)library(tidyverse)library(gridExtra)library(tidycensus)# Set filepathrootdir <- ("/Users/michelle/Documents/UCSB Grad School/Courses/eds_223")datadir <-file.path(rootdir,"data")#accessing stored API key for census datacensus_token <-Sys.getenv('CENSUS_KEY')census_api_key(census_token)#load in night lights tiles from 2021-02-07 night_lights1 <-read_stars(file.path(datadir,"VNP46A1","VNP46A1.A2021038.h08v05.001.2021039064328.tif"))night_lights2 <-read_stars(file.path(datadir,"VNP46A1","VNP46A1.A2021038.h08v06.001.2021039064329.tif"))#combine tiles for 2021-02-07 combined_feb_7 <-st_mosaic(night_lights1, night_lights2)#load in night lights tiles from 2021-02-16night_lights3 <-read_stars(file.path(datadir,"VNP46A1","VNP46A1.A2021047.h08v05.001.2021048091106.tif"))night_lights4 <-read_stars(file.path(datadir,"VNP46A1","VNP46A1.A2021047.h08v06.001.2021048091105.tif"))#combine tiles for 2021-02-16combined_feb_16 <-st_mosaic(night_lights3, night_lights4)
OpenStreetMap Data
OpenStreetMap (OSM) is a project that provides free, publicly available geographic data of the world. This analysis uses OSM roads and building data taken from Geofabrik’s download sites. Roads and building data are shapefiles that are provided via a Geopackage (.gpkg file).
Highways/Roads
Because highways account for a large portion of the night lights observable from space, areas near highways are ignored in order to minimize falsely identifying areas with reduced traffic as areas without power.
checkout the code
#define the SQL query and load in highways datasetquery <-"SELECT * FROM gis_osm_roads_free_1 WHERE fclass='motorway'"highways <-st_read(file.path(datadir,"gis_osm_roads_free_1.gpkg"), query = query)#reproject highways data to EPSG: 3083highways <-st_transform(highways, crs =3083)
Buildings/Houses
To understand which houses were effected by the power outages, a spatial dataset indicating locations of houses for the Houston metropolitan area is utilized.
checkout the code
#define buildings_query and load in buildings datasetbuildings_query <-"SELECT * FROM gis_osm_buildings_a_free_1 WHERE (type IS NULL AND name IS NULL) OR type in ('residential', 'apartments', 'house', 'static_caravan', 'detached')"buildings <-st_read(file.path(datadir,"gis_osm_buildings_a_free_1.gpkg"), query = buildings_query)#reproject buildings data to EPSG: 3083buildings <-st_transform(buildings, crs =3083)
American Community Survey (ACS) Data
American Community Survey (ACS) is a demographics survey program conducted by the U.S. Census Bureau. This program gathers information on a regular, more frequent, basis than the long form decennial census.
Median Household Income
2019 median household income data for each census tract in the Houston area is utilized to understand how socioeconomic factors might play a role in a communities ability to recover from power outages. Income data is provided through an ArcGIS “file geodatabase”. The geodatabase includes layers holding ACS attributes and a separate layer holding the geometry information.
checkout the code
#read in the geometries layer of the ACS data and reproject to EPSG: 3083acs_geom <-st_read(file.path(datadir, "ACS_2019_5YR_TRACT_48_TEXAS.gdb"), layer ="ACS_2019_5YR_TRACT_48_TEXAS")acs_geom <-st_transform(acs_geom, crs =3083)#read in the income layer of the ACS dataacs_income <-st_read(file.path(datadir, "ACS_2019_5YR_TRACT_48_TEXAS.gdb"), layer ="X19_INCOME")#keep only GEOID and B19013e1 column in the ACS income data, rename B19013e1 column to median incomeacs_income_clean <- acs_income |>select(c("GEOID", "B19013e1")) |>rename("median_income"="B19013e1")
Medical Vulnerability
The “one or more disability items allocated” attribute in combination with census tract population from the 2020 ACS is used to see if impacted tracts have a higher population of medically vulnerable people. This data will be accessed through an API and tidycensus package in R. Tidycensus returns a dataframe with desired attributes and geometry column.
checkout the code
#access ACS data variables for 2020 year v20 <-load_variables(2020, "acs5", cache =TRUE)population <-get_acs(state ="Texas",geography ="tract",variables ="B99181_001",geometry =TRUE, year =2020)population_clean <- population |>select("GEOID", "NAME", "estimate", "geometry") |>rename(population ="estimate")disability <-get_acs(state ="Texas", geography ="tract",variables ="B99181_002",geometry =TRUE, year =2020)disability_clean <- disability |>select("GEOID", "NAME", "estimate", "geometry") |>rename(disabled ="estimate")acs_disability <-cbind(disability_clean, population_clean$population) |>rename(population ="population_clean.population") |>mutate(percent_disabled = (disabled/population)*100)acs_disability <-st_transform(acs_disability, crs =3083)
Analysis
In order to find the homes that were impacted by the blackout, I created a blackout mask by doing some map algebra. I subtracted the February 16th raster from the February 7th raster to get a raster containing the change in night lights intensity. I assume that any location that had a drop of more than 200 nW cm-2sr-1 experienced a blackout. Under this assumption, I recalssified the change in night lights intensity raster so that any location with a blackout holds a value of 1 and anywhere else is NA. Then, I vectorized the mask and cropped the mask to the area of interest (in this case the Houston metropolitan area). Here are the coordinates used (-96.5, 29), (-96.5, 30.5), (-94.5, 30.5), (-94.5, 29). To account for the roads and highways, I created a 200m buffer around each highway and excluded them from the blackout mask. Lastly, I spatially joined the blackout mask with the buildings data to get the number of homes that experienced a blackout.
In order to investigate if median household income is a predictor for recovery, I joined income data to census tract geometries and used the blackout mask from the analysis above to find which census tracts experienced blackouts. The distribution of income in the impacted vs unimpacted tracts can then be compared and a map was produced to visualize the findings.
In order to explore if impacted tracts had a higher percentage of people with disabilities, I spatially intersected disability data from ACS with the blackout mask created in the first part of the analysis. This resulting sf object can then be utilized to map relative percentages of disabled people in each census tract and impacted census tracts. Additionally, I visualized the distribution of percent disabled census tracts and calculated the average percent disabled in impacted vs. unimpacted tracts.
checkout the code
#Find the difference between the two rasters (subtracting feb 16 from feb 7 values to get positive values - i.e. nW cm^-2sr^-1 is higher on feb 7 than on feb 16)change_nli <- combined_feb_7-combined_feb_16#check out the change_nli object to see the min and max valueschange_nli#create a reclassification matrix#set lowest value in range to -3120 and highest to 3000 based on the min and max values of the change_nli object#a drop of less than 200 becomes 0 (i.e. no blackout), a drop of more than 200 becomes 1 (i.e. blackout)rcl_matrix <-data.frame(from =c(-3120,200), to =c(199, 3000), becomes =0:1)rcl_matrix#reclassify the change_nli with the matrix created above. rcl_change_nli <-cut(change_nli, breaks =c(-3200, rcl_matrix$to), labels = rcl_matrix$becomes)rcl_change_nli#set 0 (non blackout areas) to NAs rcl_change_nli[rcl_change_nli ==0] =NA#vectorize the blackout maskvect_mask <-st_as_sf(rcl_change_nli) |>st_make_valid()#define Houston areahouston_coord <-rbind(c(-96.5, 29), c(-94.5, 29), c(-94.5, 30.5), c(-96.5,30.5), c(-96.5, 29))#turn Houston coordinates into a polygonhouston_poly <-st_polygon(list(houston_coord))#checkout the CRS of the vect_maskcrs(vect_mask)#convert Houston polygon to simple feature collection and assign the CRS to EPSG: 4326 (same CRS as the vect_mask)houston_geom <-st_sfc(houston_poly, crs =4326)#crop blackout mask to region of interest and reproject to EPSG: 3083blackout_mask_crop <- vect_mask[houston_geom,] |>st_transform(crs =3083)#remove unnecessary columns of highways data frame to create a smaller highways dataframe so it doesn't crash Rhighways_clean <- highways$geom#create a 200m buffer around highways and reproject to EPSG: 3083highway_buffer <-st_union(st_buffer(highways_clean, dist =200))#find areas that experienced blackouts further than 200m from a highway (note: I used st_disjoint to make sure that houses touching the buffer were not included. Alternatively, you could use st_difference and get houses that touch or are on the buffer line.)blackout_highway <- blackout_mask_crop[highway_buffer, ,op = st_disjoint]#find homes within blackout areasblackout_homes <- buildings[blackout_highway,]#join income data to census tract geometriesacs_combined <-left_join(acs_geom, acs_income_clean, by =c("GEOID_Data"="GEOID"))#spatially join census tract data with blackout_homes dataset to get a data frame of census tracts with blackouts, keep only the GEOID_Data and median income, and add a blackout column (used later on for a left_join)blackout_acs <- acs_combined[blackout_homes,] |>select(c("GEOID_Data", "median_income")) |>mutate(blackout ="blackout")
Results
I have calculated a total of 139,103 houses in Houston that lost power as a result of the first two storms. The resulting distributions and calculated averages of impacted vs. unimpacted tracts’ median household income show that impacted tracts had a slightly higher median income (average = $71,462) than the unimpacted census tracts (average = $67,436). This seems a little counter intuitive, but when looking at the data more closely I can see there are a higher number of outliers (i.e. more households with the $250,000 income range) in the impacted tracts which could be pulling the average up. Additionally, when examining if there is a difference in disabled percent across unimpacted vs. impacted tracts, this initial analysis shows no real difference. The calculated mean percent disabled in the impacted tracts is 10.59 and in unimpacted it is 10.62.
checkout the code
#make the acs_combined dataset smaller for quicker processingacs_combined_clean <- acs_combined |>select(c("median_income", "Shape", "GEOID_Data"))#reproject houston_geom to EPGS: 3083houston_geom_3083 <-st_transform(houston_geom, crs =3083)#filter census tract data to Houston extentacs_houston <- acs_combined_clean[houston_geom_3083,]#in order to create a data frame of census tracts in Houston that show which experienced blackouts and which did not, I need to create a new blackout_acs_nogeom dataframe with the geometries dropped in order to left join it to the acs_houston data frame#remove median income from the new blackout_acs data frame since it already exists in the acs_houston data frameblackout_acs_nogeom <-st_drop_geometry(blackout_acs) |>select(c("GEOID_Data", "blackout"))#create the combined data frame (showing blackout and no blackout tracts in Houston)blackout_acs_combined <-left_join(acs_houston, blackout_acs_nogeom, by ="GEOID_Data") |>mutate(blackout =if_else(is.na(blackout), "no blackout", "blackout"))#plot distribution of income in impacted vs. unimpacted tractsggplot(data = blackout_acs_combined, aes(x = blackout, y = median_income, col = blackout)) +geom_jitter(width =0.1) +geom_boxplot(alpha =0.5) +labs(x ="Blackout", y ="Median Income", title ="Houston Power Outage Median Income Distribution")
checkout the code
#split the blackout_acs_combined data frame into two based on blackout and no blackout to plot the separate histogramsno_blackout <- blackout_acs_combined |>filter(blackout =="no blackout")blackout <- blackout_acs_combined|>filter(blackout =="blackout")#create histogram of income in impacted tractsimpacted_tracts <-ggplot(data = blackout, aes(x = median_income)) +geom_histogram(fill ="darkorchid") +labs(x ="Median Income", title ="Median Income Impacted Tracts") +theme_minimal()#create histogram of income in unimpacted tractsunimpacted_tracts <-ggplot(data = no_blackout, aes(x = median_income)) +geom_histogram(fill ="forestgreen") +labs(x ="Median Income", title ="Median Income Unimpacted Tracts") +theme_minimal()#put the histograms side by sidegrid.arrange(impacted_tracts, unimpacted_tracts, ncol =2, nrow =1)
checkout the code
#calculate the average median income in the impacted vs. unimpacted census tractsacs_houston_summary <- blackout_acs_combined |>group_by(blackout) |>summarize(average_income =mean(median_income, na.rm =TRUE))#spatially join census tract data with blackout_homes dataset to get a data frame of census tracts with blackouts, keep only the GEOID and disability percent, and add a blackout column (used later on for a left_join)blackout_acs_disabled <- acs_disability[blackout_homes,] |>select(c("GEOID", "percent_disabled")) |>mutate(blackout ="blackout")#filter census tract disability data to Houston extentacs_disability_houston <- acs_disability[houston_geom_3083,]#drop geometryblackout_acs_disabled_nogeom <-st_drop_geometry(blackout_acs_disabled) |>select(-"percent_disabled")#create combined dataframe of blackout and non blackout tractsblackout_acs_disabled_combined <-left_join(acs_disability_houston, blackout_acs_disabled_nogeom, by ="GEOID") |>mutate(blackout =if_else(is.na(blackout), "no blackout", "blackout"))#plot distribution of disability percentage in impacted vs. unimpacted tractsggplot(data = blackout_acs_disabled_combined, aes(x = blackout, y = percent_disabled, col = blackout)) +geom_jitter(width =0.1) +geom_boxplot(alpha =0.5) +labs(x ="Blackout", y ="Percent Disabled", title ="Houston Power Outage Disabled Percentage Distribution")
See below map of median income for each census tract in the Houston metropolitan area and if they experienced a blackout.
checkout the code
#create map of median income by census tract, designating blackout tractstmap_mode("view")
tmap mode set to interactive viewing
checkout the code
tm_shape(acs_houston) +tm_polygons(col ="median_income", title ="Median Income") +tm_shape(blackout_acs) +tm_dots(title ="Blackout Tracts", col ="blackout", palette =c("black"), legend.show =TRUE) +tm_layout(main.title ="Houston Median Income and Feb 2021 Blackouts")