I am trying to convert spatial data from a CDC/HHS dataset on hospital strain, downloadable from here:
https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u
Here's a snippet of the data:
hospital_name hospital_pk geocoded_hospital_address
TRIHEALTH EVENDALE HOSPITAL 360362 POINT (-84.420098 39.253934)
KANE COUNTY HOSPITAL 461309 POINT (-112.52859 37.054324)
CRAIG HOSPITAL 062011 POINT (-104.978247 39.654008)
For entry:
structure(list(hospital_name = c("TRIHEALTH EVENDALE HOSPITAL",
"KANE COUNTY HOSPITAL", "CRAIG HOSPITAL", "JAY HOSPITAL", "HARRISON COUNTY COMMUNITY HOSPITAL"
), geocoded_hospital_address = c("POINT (-84.420098 39.253934)",
"POINT (-112.52859 37.054324)", "POINT (-104.978247 39.654008)",
"POINT (-87.151673 30.950024)", "POINT (-94.025425 40.26528)"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
I'm trying to import it as a CSV, transform it, and then turn it into a shapefile. The file has a field, geocoded_hospital_address, that I want to use as the geometry for the dataset. It is in POINT (longitude latitude) format, e.g., "POINT (-84.420098 39.253934)". I am used to passing two variables (longitude/latitude) to the coords option, and I can neither get the "sf_column_name" option to work nor split the field into two parts:
test_sf <- COVID_19_Reported_Patient_Impact_and_Hospital_Capacity_by_Facility %>%
  st_as_sf(sf_column_name = "geocoded_hospital_address", crs = 4326)
Error in st_sf(x, ..., agr = agr, sf_column_name = sf_column_name) :
no simple features geometry column present
Any ideas?
CodePudding user response:
I think the problem is that you have NA values in geocoded_hospital_address. Removing them and passing the column through the wkt argument should fix your problem.
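library(sf)
library(dplyr)  # provides %>% and filter()
# drop rows with a missing geometry string before parsing
df_0 <- COVID_19_Reported_Patient_Impact_and_Hospital_Capacity_by_Facility %>%
  filter(!is.na(geocoded_hospital_address))
# parse the WKT strings in geocoded_hospital_address into a geometry column
test_sf <- st_as_sf(df_0, wkt = "geocoded_hospital_address", crs = 4326)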
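As a self-contained check, here is a minimal sketch of the same approach using the five sample rows from the question. The names sample_df and sample_sf are made up for illustration, and the extra NA row is a hypothetical stand-in for the missing values in the full file. It assumes the sf and dplyr packages are installed.

```r
library(sf)
library(dplyr)

# Five sample rows from the question, plus one hypothetical row with a missing point
sample_df <- data.frame(
  hospital_name = c(
    "TRIHEALTH EVENDALE HOSPITAL", "KANE COUNTY HOSPITAL", "CRAIG HOSPITAL",
    "JAY HOSPITAL", "HARRISON COUNTY COMMUNITY HOSPITAL", "PLACEHOLDER HOSPITAL"
  ),
  geocoded_hospital_address = c(
    "POINT (-84.420098 39.253934)", "POINT (-112.52859 37.054324)",
    "POINT (-104.978247 39.654008)", "POINT (-87.151673 30.950024)",
    "POINT (-94.025425 40.26528)", NA
  ),
  stringsAsFactors = FALSE
)

# Drop the NA row, then let sf parse the WKT strings into POINT geometries
sample_sf <- sample_df %>%
  filter(!is.na(geocoded_hospital_address)) %>%
  st_as_sf(wkt = "geocoded_hospital_address", crs = 4326)

print(sample_sf)
```

The geometry column keeps the name geocoded_hospital_address, so any later code that expects a column called geometry would need to account for that.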
CodePudding user response:
This is a ridiculous solution, but it's the best I've got since the shapefile isn't downloadable.
library(tidyverse)
library(sf)
x <- read_csv('COVID-19_Reported_Patient_Impact_and_Hospital_Capacity_by_Facility.csv')
# alter geometry column to get just coordinates
# remove 'POINT', parentheses, and whitespace
x$coords <- x$geocoded_hospital_address %>%
  str_remove('POINT') %>%
  str_remove('\\(') %>%
  str_remove('\\)') %>%
  str_trim()
# remove NA coords, then separate 'coords' into x & y, transform to an 'sf' object
x_sf <- x %>%
  filter(!is.na(coords)) %>%
  separate(coords, into = c('x','y'), sep = ' ') %>%
  st_as_sf(coords = c('x','y'))
head(x_sf)
#> Simple feature collection with 6 features and 128 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -108.616 ymin: 24.71104 xmax: -80.21099 ymax: 39.10636
#> CRS: NA
#> # A tibble: 6 × 129
#> hospital_pk collecti…¹ state ccn hospi…² address city zip hospi…³ fips_…⁴
#> <chr> <date> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 060054 2020-06-05 CO 0600… COMMUN… 2351 '… GRAN… 81505 Short … 08077
#> 2 100156 2020-06-19 FL 1001… HCA FL… 340 NW… LAKE… 32055 Short … 12023
#> 3 101312 2020-05-15 FL 1013… FISHER… 3301 O… MARA… 33050 Critic… 12087
#> 4 102001 2020-06-12 FL 1020… SELECT… 955 NW… MIAMI 33128 Long T… 12086
#> 5 102013 2020-06-26 FL 1020… KINDRE… 4801 N… TAMPA 33603 Long T… 12057
#> 6 102028 2020-05-01 FL 1020… SELECT… 5050 C… OXFO… 34484 Long T… 12119
#> # … with 119 more variables: is_metro_micro <lgl>, total_beds_7_day_avg <dbl>,
#> # all_adult_hospital_beds_7_day_avg <dbl>,
#> # all_adult_hospital_inpatient_beds_7_day_avg <dbl>,
#> # inpatient_beds_used_7_day_avg <dbl>,
#> # all_adult_hospital_inpatient_bed_occupied_7_day_avg <dbl>,
#> # inpatient_beds_used_covid_7_day_avg <dbl>,
#> # total_adult_patients_hospitalized_confirmed_and_suspected_covid_7_day_avg <dbl>, …
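The printed object reports CRS: NA, and the question asked for a shapefile, so two steps remain. Here is a minimal sketch of them, assuming the coordinates are WGS 84 longitude/latitude (EPSG:4326, as in the question). The object pts is a two-row stand-in for x_sf so the snippet runs on its own, and the output file name is made up.

```r
library(sf)

# Stand-in for x_sf: two sample points from the question, built without a CRS
pts <- st_as_sf(
  data.frame(
    hospital_name = c("KANE COUNTY HOSPITAL", "CRAIG HOSPITAL"),
    x = c(-112.52859, -104.978247),
    y = c(37.054324, 39.654008)
  ),
  coords = c("x", "y")
)

# Declare the CRS (this labels the coordinates; it does not reproject them)
pts <- st_set_crs(pts, 4326)

# Write an ESRI Shapefile; the path is hypothetical
out_path <- file.path(tempdir(), "hospitals.shp")
st_write(pts, out_path, delete_dsn = file.exists(out_path), quiet = TRUE)
```

Note that the shapefile format limits field names to 10 characters, so the long column names in this dataset will be shortened on write. A format such as GeoPackage avoids that limit if a shapefile is not strictly required.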