Here is a reproducible example.
df2 <- data.frame(Num = c(1,2,3), Comment = c('nick comment12021.12.01 nickn comment2222021.12.02 nickname333 commennnnt222021.12.01', 'nick comment12021.12.01 nickn comment2222021.12.02 nickname333 commeeeent222021.12.01','nick comment12021.12.01 nickn comment2222021.12.02 nickname3333333 comment22021.12.01') )
Num Comment
----------------------------------------------------------------------------
1 Tom comment1~ Jay comment2 Yun comment 3 ~
2 Tim comment1~ Cristal comment2~ Lomio comment3~
3 Tracer comment1~ Teemo comment2~ Irelia comment3~
--------------------------------------------------------------------------
I have a dataframe with 2 columns and many rows. These are comments I got from crawling a website. However, since it is a very dynamic website, I had no choice but to get nicknames and comments from multiple people at once.
I want to remove the nicknames from this irregular chunk of text and build a word cloud from the comments only, but I can't think of a way to delete just the nicknames. Both nicknames and comments vary in length, so the fixed-position approach I know doesn't work.
CodePudding user response:
If the fields are separated by a fixed delimiter (for example exactly seven spaces, as you mentioned in your comments, which is " {7}" as a regular expression), you can do the following:
# Example data: nickname and comment alternate, separated by exactly seven spaces
dd <- data.frame(
  id = 1:3,
  comment = c(
    "Tom       comment1~       Jay       comment2~       Yun       comment3~",
    "Tim       comment1~       Cristal       comment2~       Lomio       comment3~",
    "Tracer       comment1~       Teemo       comment2~       Irelia       comment3~"
  )
)
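extract_comments <- function(comments) {
  lapply(
    comments,
    function(x) {
      # Split on the seven-space separator and drop stray whitespace
      sp <- strsplit(x, " {7}")[[1]]
      sp <- trimws(sp)
      # Odd positions hold nicknames, the following even positions hold comments
      ppl <- seq(1, length(sp), by = 2)
      data.frame(
        ex_person = sp[ppl],
        ex_comment = sp[ppl + 1]
      )
    }
  )
}
# Store one data frame per row in a list column, then expand it to one row per comment
dd$extracted <- extract_comments(dd$comment)
tidyr::unnest(dd, extracted)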
#> # A tibble: 9 x 4
#>      id comment                  ex_person ex_comment
#>   <int> <chr>                    <chr>     <chr>
#> 1     1 Tom comment1~ Jay ~      Tom       comment1~
#> 2     1 Tom comment1~ Jay ~      Jay       comment2~
#> 3     1 Tom comment1~ Jay ~      Yun       comment3~
#> 4     2 Tim comment1~ Cristal ~  Tim       comment1~
#> 5     2 Tim comment1~ Cristal ~  Cristal   comment2~
#> 6     2 Tim comment1~ Cristal ~  Lomio     comment3~
#> 7     3 Tracer comment1~ Teemo~  Tracer    comment1~
#> 8     3 Tracer comment1~ Teemo~  Teemo     comment2~
#> 9     3 Tracer comment1~ Teemo~  Irelia    comment3~
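
Since the end goal is a word cloud built from the comments alone, here is a minimal base-R sketch that skips the nicknames and counts words. It assumes the same seven-space separator and a strict nickname/comment alternation. The helper name comments_only and the sample vector x are made up for illustration and are not part of the code above.

```r
# Hypothetical helper: keep only the even-numbered pieces (the comments),
# assuming pieces alternate nickname, comment and are separated by
# exactly seven spaces.
comments_only <- function(comments, sep = " {7}") {
  pieces <- strsplit(comments, sep)
  unlist(lapply(pieces, function(sp) {
    sp <- trimws(sp)
    sp[seq_along(sp) %% 2 == 0]
  }))
}

# Made-up sample input built with a seven-space separator
s7 <- strrep(" ", 7)
x <- c(
  paste("Tom", "comment1~", "Jay", "comment2~", sep = s7),
  paste("Tim", "comment1~", "Lomio", "comment3~", sep = s7)
)

only <- comments_only(x)

# Split the comments into lower-case words and count them
words <- unlist(strsplit(tolower(only), "[^[:alnum:]]+"))
words <- words[nzchar(words)]
freq <- sort(table(words), decreasing = TRUE)
```

The resulting table pairs each word with its count, which is the shape that word cloud functions usually expect (for example, names(freq) as the words and as.integer(freq) as the frequencies). If a row has an odd number of pieces, the trailing nickname is simply dropped, so it may be worth checking your real data for rows where the alternation breaks.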