Subsetting a dataframe based on the value in the rows

Time:06-29

I am trying to split this string into a data frame and then subset it into a new data frame, "Population_G", that contains only the rows beginning with G.

str <- "(H:3#3,G:3#3)#3;"
treenodes <- stringr::str_replace_all(str, "[(,)]", " ")
treenodes <- stringr::str_replace_all(treenodes, "  ", " ")
treenodes <- stringr::str_replace_all(treenodes, "   ", " ")
treenodes <- strsplit(treenodes, " ")
treenodes <- as.data.frame(treenodes)
treenodes <- as.data.frame(tidyr::separate_rows(treenodes))
colnames(treenodes) <- 1

Result:

      1
1      
2 H:3#3
3 G:3#3
4   #3;

I have looked through previous answers; however, they do not fit this case. I am new to R and appreciate your help!

CodePudding user response:

Something like this?

library(tidyverse)

str="(H:3#3,G:3#3)#3;"

Population_G <- str %>% 
  data.frame(G = .) %>% 
  mutate(G = trimws(str_replace_all(G,"[(,)]", " "))) %>% 
  separate_rows(G, sep = " ") %>% 
  filter(str_detect(G, "^G"))

Output

  G    
  <chr>
1 G:3#3
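
If the data frame from the question already exists, the same "starts with G" filter can probably be applied to it directly in base R. This is only a sketch: the data frame is rebuilt by hand here so the example is self-contained, and the column name `node` is a hypothetical choice, not something from the original code.

```r
# Rebuild the question's result by hand; `node` is a hypothetical column name.
treenodes <- data.frame(
  node = c("", "H:3#3", "G:3#3", "#3;"),
  stringsAsFactors = FALSE
)

# Keep only rows whose value starts with "G".
# drop = FALSE keeps the result a data frame even with a single column.
Population_G <- treenodes[grepl("^G", treenodes$node), , drop = FALSE]
```

`grepl("^G", ...)` returns one TRUE/FALSE per row, so it can be used as a row index; the anchor `^` restricts the match to the start of the string.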

CodePudding user response:

str.split <- unlist(strsplit(str, "[(,)]"))
data.frame(G = str.split[grepl("^G", str.split)])
#       G
# 1 G:3#3
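
If you need the same subset for several prefixes, the base R approach above could be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the function name `subset_by_prefix`, its arguments, and the column name `node` are all made up for illustration, and it assumes the tokens are always separated by `(`, `,` or `)`.

```r
# Hypothetical helper: split the string on "(", "," or ")" and
# keep only the tokens that start with the given prefix.
subset_by_prefix <- function(x, prefix, split = "[(,)]") {
  tokens <- unlist(strsplit(x, split))
  tokens <- tokens[nzchar(tokens)]       # drop empty strings left by the split
  keep <- startsWith(tokens, prefix)     # literal prefix match, no regex needed
  data.frame(node = tokens[keep], stringsAsFactors = FALSE)
}

tree <- "(H:3#3,G:3#3)#3;"
Population_G <- subset_by_prefix(tree, "G")
Population_H <- subset_by_prefix(tree, "H")
```

`startsWith()` is used instead of `grepl()` so the prefix is treated as plain text rather than as a regular expression.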