Here is my dataframe
df2 <- structure(list(Code = c("ICB-9_label_1", "1", "2", "3",
"4", "5", "1", "ICB-10_label_2", "3", "4", "5",
"1", "2", "3", "3", "5", "1", "2",
"3", "4", "5", "1", "2", "3", "4",
"5", "1", "2", "3", "4", "5", "1",
"2", "3", "4", "5", "1", "2", "3",
"4", "5", "1", "2", "3", "4", "5",
"1"), Description = c("description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here", "description here", "description here", "description here",
"description here")), row.names = c(NA, -47L), class = c("tbl_df",
"tbl", "data.frame"))
And here is what it looks like as a table:
Code Description
ICB-9_label_1 description here
1 description here
2 description here
3 description here
4 description here
5 description here
1 description here
ICB-10_label_2 description here
3 description here
4 description here
I'd like to create a third column called "Labels". It would say "ICB_9_label_1" all the way down until it hits the row number for "ICB_10_label_2", then the column would say "ICB_10_label_2" all the way down. I don't want to override the numbers in the first column, since the 1, 2, 3, 4, 5 values are important.
CodePudding user response:
There are multiple ways to do this. An option is to extract the row that have 'label', while others returns NA and then use fill
to change the NA elements to previous non-NA value
library(dplyr)
library(tidyr)
library(stringr)
df2 <- df2 %>%
mutate(Labels = str_extract(Code, '.*label.*')) %>%
fill(Labels, .direction = 'downup')
-output
df2
# A tibble: 47 × 3
Code Description Labels
<chr> <chr> <chr>
1 ICB-9_label_1 description here ICB-9_label_1
2 1 description here ICB-9_label_1
3 2 description here ICB-9_label_1
4 3 description here ICB-9_label_1
5 4 description here ICB-9_label_1
6 5 description here ICB-9_label_1
7 1 description here ICB-9_label_1
8 ICB-10_label_2 description here ICB-10_label_2
9 3 description here ICB-10_label_2
10 4 description here ICB-10_label_2
# … with 37 more rows
Or using base R
with grep
and cumsum
transform(df2, Labels = grep('label', Code,
value = TRUE)[cumsum(grepl('label', Code))])