I'm trying to create a retweet network from raw tweet text I have. The text is formatted like this:
tweet_vector <- c("RT @person: tweet tweet tweet",
"RT @otherperson: tweet tweet",
"Tweet, this isn't a retweet, @3rdperson.",
"RT @4thperson: this retweet also has a mention, @mentioned")
I want to create a function that returns the following:
[1] "person"
[2] "otherperson"
[3] NA
[4] "4thperson"
I can't just use str_extract(tweet_vector, "@\\w+")
because I don't want to catch @3rdperson.
CodePudding user response:
str_extract(tweet_vector, "(?<=@)\\w+(?=:)")
[1] "person" "otherperson" NA "4thperson"
str_extract(tweet_vector, "(?<=RT @)\\w+")
[1] "person" "otherperson" NA "4thperson"
sub(".*?@(\\w+):.*|.*", "\\1", tweet_vector)
[1] "person" "otherperson" "" "4thperson"
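Since the question asks for a function, here is a minimal base R sketch of one way to wrap the idea. The name extract_retweeted is hypothetical, and the pattern assumes a retweet always begins with "RT @" at the very start of the text, which is stricter than the patterns above.

```r
tweet_vector <- c("RT @person: tweet tweet tweet",
                  "RT @otherperson: tweet tweet",
                  "Tweet, this isn't a retweet, @3rdperson.",
                  "RT @4thperson: this retweet also has a mention, @mentioned")

# Hypothetical helper (not from the thread): returns the retweeted handle,
# or NA when the tweet does not start with "RT @".
extract_retweeted <- function(tweets) {
  # Anchor at the start so later mentions are ignored; capture the handle.
  matches <- regmatches(tweets, regexec("^RT @(\\w+)", tweets))
  # Each element is c(full_match, capture), or character(0) when nothing matched.
  vapply(matches,
         function(m) if (length(m) >= 2) m[2] else NA_character_,
         character(1))
}

extract_retweeted(tweet_vector)
# [1] "person"      "otherperson" NA            "4thperson"
```

Unlike the sub() version, this returns NA rather than "" for non-retweets, which matches the output requested in the question. With stringr loaded, str_match(tweet_vector, "^RT @(\\w+)")[, 2] should give the same result.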