I have a df that houses both addresses and zipcodes. For each zipcode there are multiple addresses, and occasionally, there are repeat addresses within each zipcode (Apartment building and high risers). I am interested in finding the number of unique Addresses within each zipcode. I thought about using group_by
from dplyr
but I cannot get the unique counts.
Data structure is:
df = data.frame(zipcode= c(94613,94613,94613,94615,94615, 94615, 94615, 94615, 95901, 95901, 95901, 95901), Address= c('5000 MACARTHUR BLVD','5000 MACARTHUR BLVD','4000 MACARTHUR BLVD','7000 MACARTHUR BLVD','7000 MACARTHUR BLVD','7000 MACARTHUR BLVD', '9000 MACARTHUR BLVD','8000 MACARTHUR BLVD', '1675 9TH ST', '1675 9TH ST','1675 7TH ST','1670 7TH ST'), buildingName = c('THis', 'THat', 'The', 'Other','Blah','Blah','THat','THat','THat','THat','THat','THat'))
This is the start of my script:
dff = df %>% group_by(zipcode) %>% group_by(Address) %>% summarise(HELP)
CodePudding user response:
We can count()
zipcode
and Address
to get the number of each combination. Is this your desired output?
library(dplyr)
df %>%
count(zipcode, Address)
#> zipcode Address n
#> 1 94613 4000 MACARTHUR BLVD 1
#> 2 94613 5000 MACARTHUR BLVD 2
#> 3 94615 7000 MACARTHUR BLVD 3
#> 4 94615 8000 MACARTHUR BLVD 1
#> 5 94615 9000 MACARTHUR BLVD 1
#> 6 95901 1670 7TH ST 1
#> 7 95901 1675 7TH ST 1
#> 8 95901 1675 9TH ST 2
Created on 2022-10-17 with reprex v2.0.2