add a new column to a df that contains the colnames of another table

Time:07-14

I have a data frame whose CB column is incomplete. I need to add a column taken from another object; its values have to match CB but carry additional information.

dataframe A:

> head(totalfrag)
                  CB frequency_count mononucleosomal nucleosome_free reads_count
1 TCTTCAAGTTCCGGCT-1           15939            5356            6417       31398
2 AAGTGAAGTAGTAAGA-1           22532            8572            7956       44711
3 TATGCATCATAAGCAA-1             227              92              87         386
4 TCCATCATCCTAGTAA-1           39909           16084           14439       76768

Column of interest:

> head(colnames(subset))

[1] "KO_d3_r1_TCTTCAAGTTCCGGCT-1" "KO_d3_r1_AAACCGGCACCTCGCT-1" "KO_d3_r1_AAGTGAAGTAGTAAGA-1" "KO_d3_r1_TATGCATCATAAGCAA-1" "KO_d3_r1_TCCATCATCCTAGTAA-1"
[6] "KO_d3_r1_AAAGCGGGTCTAACAG-1"

I don't know whether it is possible to replace the column with the version that carries the extra information, or to add a new column that matches CB. My desired result would be:

> head(totalfrag)
                           CB frequency_count mononucleosomal nucleosome_free reads_count
1 ko_d3_r1_TCTTCAAGTTCCGGCT-1           15939            5356            6417       31398
2 ko_d3_r1_AAGTGAAGTAGTAAGA-1           22532            8572            7956       44711
3 ko_d3_r1_TATGCATCATAAGCAA-1             227              92              87         386
4 ko_d3_r1_TCCATCATCCTAGTAA-1           39909           16084           14439       76768

CodePudding user response:

First, let's make the data reproducible (please do this yourself in future questions).

CB <- c("TCTTCAAGTTCCGGCT-1", "AAGTGAAGTAGTAAGA-1", "TATGCATCATAAGCAA-1", "TCCATCATCCTAGTAA-1", "AAAGCGGGTCTAACAG-1", "AAAGCGGGTCTAACAG-1")
fullinfo <- c("KO_d3_r1_TCTTCAAGTTCCGGCT-1", "KO_d3_r1_AAACCGGCACCTCGCT-1", "KO_d3_r1_AAGTGAAGTAGTAAGA-1", "KO_d3_r1_TATGCATCATAAGCAA-1", "KO_d3_r1_TCCATCATCCTAGTAA-1", "KO_d3_r1_AAAGCGGGTCTAACAG-1")

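We can use substr to extract the keys and match to look them up. Every full name starts with the 9-character prefix "KO_d3_r1_", so characters 10 to 27 hold the 18-character barcode. match then returns, for each CB, the position of its first occurrence among the cropped names (NA if there is none), and that position is used to index fullinfo.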

full_cropped <- substr(fullinfo, 10, 27)
result <- fullinfo[match(CB, full_cropped)]
df <- data.frame(CB, result)
> df
                  CB                      result
1 TCTTCAAGTTCCGGCT-1 KO_d3_r1_TCTTCAAGTTCCGGCT-1
2 AAGTGAAGTAGTAAGA-1 KO_d3_r1_AAGTGAAGTAGTAAGA-1
3 TATGCATCATAAGCAA-1 KO_d3_r1_TATGCATCATAAGCAA-1
4 TCCATCATCCTAGTAA-1 KO_d3_r1_TCCATCATCCTAGTAA-1
5 AAAGCGGGTCTAACAG-1 KO_d3_r1_AAAGCGGGTCTAACAG-1
6 AAAGCGGGTCTAACAG-1 KO_d3_r1_AAAGCGGGTCTAACAG-1
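
If the prefix does not always have the same length, the fixed positions in substr will cut in the wrong place. A possible variation, as a sketch only: strip everything up to the last underscore with sub, then write the result straight into the data frame. It assumes the barcode itself never contains an underscore. The name full_names is a stand-in for colnames(subset), CB_full is a hypothetical column name, and only two of the columns from the question are reproduced.

```r
# Cut-down version of the data frame from the question
totalfrag <- data.frame(
  CB = c("TCTTCAAGTTCCGGCT-1", "AAGTGAAGTAGTAAGA-1",
         "TATGCATCATAAGCAA-1", "TCCATCATCCTAGTAA-1"),
  reads_count = c(31398, 44711, 386, 76768),
  stringsAsFactors = FALSE
)

# Stand-in for colnames(subset)
full_names <- c("KO_d3_r1_TCTTCAAGTTCCGGCT-1", "KO_d3_r1_AAACCGGCACCTCGCT-1",
                "KO_d3_r1_AAGTGAAGTAGTAAGA-1", "KO_d3_r1_TATGCATCATAAGCAA-1",
                "KO_d3_r1_TCCATCATCCTAGTAA-1", "KO_d3_r1_AAAGCGGGTCTAACAG-1")

# Remove everything up to and including the last underscore,
# leaving only the barcode (assumes barcodes contain no underscore)
keys <- sub("^.*_", "", full_names)

# For each CB, index of its first occurrence in keys (NA if absent)
idx <- match(totalfrag$CB, keys)

# Use the full name where one was found, otherwise keep the original CB
totalfrag$CB_full <- ifelse(is.na(idx), totalfrag$CB, full_names[idx])
```

To overwrite CB instead of adding a column, assign the same expression to totalfrag$CB. Rows without a match keep their original barcode here; dropping the ifelse would leave them as NA, which makes unmatched cells easier to spot.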