I have a dataframe of US zipcodes, and I want to add a sequence of numbers to each unique zipcode, repeating the other columns of that row for every value in the sequence. Right now, my data looks like so:
zip city state_name
<chr> <chr> <chr>
1 01001 Agawam Massachusetts
2 01002 Amherst Massachusetts
3 01003 Amherst Massachusetts
4 01005 Barre Massachusetts
5 01007 Belchertown Massachusetts
I'd like each zipcode's row to be repeated once per number in the sequence, like this:
zip city state_name Num
<chr> <chr> <chr> <dbl>
01001 Agawam Massachusetts .8
01001 Agawam Massachusetts 1.0
01001 Agawam Massachusetts 1.2
01001 Agawam Massachusetts 1.4
And so on for the rest of the rows.
Data here:
structure(list(zip = c("01001", "01002", "01003", "01005", "01007"
), city = c("Agawam", "Amherst", "Amherst", "Barre", "Belchertown"
), state_name = c("Massachusetts", "Massachusetts", "Massachusetts",
"Massachusetts", "Massachusetts")), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
CodePudding user response:
If I understand your question, you can accomplish this with group_by() and summarize() in dplyr.
library("dplyr")
df |>
  group_by(across(everything())) |>       # one group per distinct row
  summarize(Num = seq(0.8, 1.4, 0.2)) |>  # four Num values per group
  ungroup()
# A tibble: 20 × 4
zip city state_name Num
<chr> <chr> <chr> <dbl>
1 01001 Agawam Massachusetts 0.8
2 01001 Agawam Massachusetts 1
3 01001 Agawam Massachusetts 1.2
4 01001 Agawam Massachusetts 1.4
5 01002 Amherst Massachusetts 0.8
6 01002 Amherst Massachusetts 1
7 01002 Amherst Massachusetts 1.2
8 01002 Amherst Massachusetts 1.4
9 01003 Amherst Massachusetts 0.8
10 01003 Amherst Massachusetts 1
11 01003 Amherst Massachusetts 1.2
12 01003 Amherst Massachusetts 1.4
13 01005 Barre Massachusetts 0.8
14 01005 Barre Massachusetts 1
15 01005 Barre Massachusetts 1.2
16 01005 Barre Massachusetts 1.4
17 01007 Belchertown Massachusetts 0.8
18 01007 Belchertown Massachusetts 1
19 01007 Belchertown Massachusetts 1.2
20 01007 Belchertown Massachusetts 1.4
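One caveat: since dplyr 1.1.0, having summarize() return more than one row per group is deprecated in favour of reframe(), so the code above may emit a warning on recent versions. Below is a minimal sketch of two alternatives, assuming dplyr >= 1.1.0 is installed. The names nums, out_reframe and out_join are just illustrative, and df is rebuilt from the data in the question so the snippet runs on its own.

```r
library(dplyr)

# Same data as in the question, rebuilt so the example is self-contained
df <- tibble(
  zip        = c("01001", "01002", "01003", "01005", "01007"),
  city       = c("Agawam", "Amherst", "Amherst", "Barre", "Belchertown"),
  state_name = rep("Massachusetts", 5)
)

# The sequence to attach to every row (illustrative name)
nums <- seq(0.8, 1.4, by = 0.2)

# Option 1: reframe() may return any number of rows per group
# and always returns an ungrouped tibble
out_reframe <- df |>
  reframe(Num = nums, .by = c(zip, city, state_name))

# Option 2: a cross join pairs every row of df with every value of Num,
# so no grouping is needed
out_join <- cross_join(df, tibble(Num = nums))
```

Both should give the same 20 rows as the output shown above. The cross join has the side benefit of not collapsing rows that happen to be exact duplicates, which grouping on every column would do.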