I have a df that contains a lot of genes and information about those genes. It looks something like this:

| Genes | Log2FC | SD | Pvalue |
|---|---|---|---|
| A2M | 2 | 3 | 0.001 |
| Aars | 4 | 4 | 0.001 |
| Actb;Actg1 | 3 | 5 | 0.001 |
| Cxcl1;Cxcl2;Cxcl3 | 5 | 6 | 0.001 |

What I would typically do to get the list of genes that I want would be something like df[,1]. However, in this case some of the rows contain multiple genes separated by a ";". Is it possible to pull these genes out?
Using df[,1] I would get a list like... A2M, Aars, Actb;Actg1, Cxcl1;Cxcl2;Cxcl3
What I want instead would be this: A2M, Aars, Actb, Actg1, Cxcl1, Cxcl2, Cxcl3
Thank you!
I can accomplish this in Excel using the "Text to Columns" feature. But I would like to be able to do everything in R. If someone could help, I would greatly appreciate it.
Answer:
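If you want each of the genes to have its own row in the data, `tidyr::separate_rows(your_data, Genes, sep = ";")` should work. Passing `sep = ";"` explicitly matters because the default separator splits on any non-alphanumeric character, which would also break up gene names containing a hyphen. If you want the genes as a vector outside your data frame, use `your_data$Genes |> strsplit(split = ";") |> unlist()`.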
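
As a minimal sketch of both approaches, here is a self-contained example. The data frame `df` is toy data rebuilt from the table in the question, and the names `genes` and `long_df` are only illustrative. It uses nested calls instead of the native pipe so it also runs on R versions older than 4.1, and the `tidyr` step is skipped if the package is not installed.

```r
# Toy data mirroring the table in the question (assumed, not the real data)
df <- data.frame(
  Genes  = c("A2M", "Aars", "Actb;Actg1", "Cxcl1;Cxcl2;Cxcl3"),
  Log2FC = c(2, 4, 3, 5),
  SD     = c(3, 4, 5, 6),
  Pvalue = 0.001,
  stringsAsFactors = FALSE
)

# Approach 1: a plain character vector with one gene per element.
# fixed = TRUE treats ";" as a literal string rather than a regex.
genes <- unlist(strsplit(df$Genes, split = ";", fixed = TRUE))
genes <- trimws(genes)  # drop stray spaces around names, if any
print(genes)

# Approach 2: one row per gene, keeping the other columns (needs tidyr).
if (requireNamespace("tidyr", quietly = TRUE)) {
  long_df <- tidyr::separate_rows(df, Genes, sep = ";")
  print(long_df)
}
```

With the second approach, the Log2FC, SD and Pvalue values are repeated for every gene that came from the same original row, so the result has seven rows instead of four.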