Does R have an "interpolate()" function?

Time:10-05

I have a large df and I'm trying to relocate the columns with patterns instead of manually writing each column name in select(). More details here.

  • A glimpse of the issue (edit): All my columns share a pattern ARG_G1_50_AAA or ARG_G2_50_AAA or NARR_G1_50_AAA or NARR_G2_50_AAA. The final parts are: AAA, AAC, AC and AB. I need two subsets of this data.

Set 1: I need to intercalate "G1" and "G2" columns (in the order 50, 100, 150 and 200) and in the order (AAA, AAC, AC and AB). Ex:

NARR_G1_50_AAA, NARR_G2_50_AAA,
NARR_G1_50_AAC, NARR_G2_50_AAC.... so on

Set 2: I need to intercalate "Narr" and "Arg" columns (again, 50 before 100, 150 and 200 and AAA before AAC, AC and AB). No need to intercalate G1 and G2 now. Ex:

NARR_G1_50_AAA, ARG_G1_50_AAA, 
NARR_G2_50_AAA, ARG_G2_50_AAA... so on 
  • Basically, I was able to partially solve my problem (cf. linked post above) with:
dfPaired <- merged_DF %>%
            dplyr::select(ID, str_subset(names(merged_DF), "G?_50\\w*")) 
  
   head(dfPaired)
       ID ARG_G1_50_AAA ARG_G1_50_AAC ARG_G1_50_AC ARG_G1_50_AB 
          ARG_G2_50_AAA ARG_G2_50_AAC, ARG_G2_50_AC ARG_G2_50_AB....

## I know that I'm only getting the "50" columns here; in fact I need all of them, but it wouldn't be a problem to repeat the code for 100, 150 and 200.
  • How can I make R "intercalate" the strings? I mean, I need:
ARG_G1_50_AAA, ARG_G2_50_AAA 
ARG_G1_50_AAC, ARG_G2_50_AAC, 
ARG_G1_50_AC, ARG_G2_50_AC,
ARG_G1_50_AB, ARG_G2_50_AB ... (so on) 

(intercalate G1 and G2 columns in the case of set 1)

Questions :

  • Could I use something like seq(by = 2)?
  • Is there a way to pass two patterns to str_subset() and ask it to intercalate the output?
  • Is there an "intercalate()" function that I could apply to str_subset(names(merged_DF), "G?_50\\w*")? I mean something like int(str_subset(names(merged_DF), "G1_50\\w*"), str_subset(names(merged_DF), "G2_50\\w*")).

Thanks in advance :)

EDIT:

dput(merged_DF[1:50])
structure(list(ID = structure(c("P1", "P2", "P3", "P4", "P5", 
"P6", "P7", "P8", "P9", "P10", "P11", "P12", "P13", "P14", "P15", 
"P16", "P17", "P18", "P19", "P20", "P21", "P22", "P23", "P24", 
"P25", "P26", "P27", "P28", "P29", "P30", "P31", "P32", "P33", 
"P34", "P35", "P36", "P37", "P38", "P39", "P40", "P41", "P42", 
"P43", "P44", "P45", "P46", "P47", "P48", "P49", "P50", "P51", 
"P52", "P53", "P54", "P55", "P56", "P57", "P58", "P59", "P60", 
"P61", "P62", "P63", "P64", "P65", "P66", "P67", "P68", "P69", 
"P70", "P71"), class = c("glue", "character")), ARG_G1_100_AAA = c(68.53, 
65.9, 69.78, 68.29, NaN, 69.5, 67.05, 73.74, 73.59, 72.57, 64.33, 
67.79, 72.94, 63.75, 71.56, 75.5, 68.16, NA, 65.64, 68.36, 69.75, 
72.73, 67.67, 66.19, 62.94, 72.48, 72.19, 62.44, 72.5, 71.06, 
70.4, 69.14, NA, 67.59, 69.1, 74.05, NA, 68.6, 68.27, 59.12, 
NA, NA, 63.7, 67.18, NA, 68.38, 63.44, 72.56, 66.06, 66.53, 73.19, 
NA, NA, NA, 73.44, 67.45, 72.91, 65.81, 73.96, 75, 75.89, 72, 
NA, 68.2, 67.29, 69.91, NaN, 69.67, 68.39, 69.2, 67.55), ARG_G1_100_AAC = c(70.18, 
67.65, 71.89, 70.42, NaN, 72.38, 69.67, 75.63, 76.7, 76.21, 66.5, 
70.57, 76.72, 66.4, 74.75, 79.17, 70.84, NA, 67.82, 70, 71.88, 
74.55, 69.33, 69.5, 65.25, 75.05, 75.44, 64.56, 74.88, 74.29, 
72.4, 71.93, NA, 69.12, 71.43, 77.53, NA, 71.93, 70.4, 60.25, 
NA, NA, 64.8, 69, NA, 71.19, 71.12, 75.04, 68.89, 68.26, 75.81, 
NA, NA, NA, 75.89, 68.82, 77.35, 68.38, 76.71, 79.12, 78.89, 
73.5, NA, 69.7, 69.82, 70.91, NaN, 72, 71.17, 71.85, 69.7), ARG_G1_100_AC = c(4.35, 
4.95, 1.44, 2.71, NaN, 3.25, 3.95, 2.26, 0.85, 1.21, 5.33, 5.43, 
0.83, 10.4, 2.56, 0.33, 4.92, NA, 10.55, 3.43, 2.94, 1.55, 5.33, 
6.44, 5.25, 2, 3.12, 8.5, 1.38, 3.76, 1.9, 2.79, NA, 4.06, 5.57, 
1.95, NA, 6.07, 2.67, 7, NA, NA, 8, 4.76, NA, 4.19, 2.68, 3, 
4.94, 4.79, 2.19, NA, NA, NA, 1.78, 5.27, 2.52, 5.88, 1.96, 1.12, 
0.67, 3.28, NA, 3.5, 3.41, 3.73, NaN, 3.83, 6.06, 3.3, 3.9), 
    ARG_G1_100_AB = c(4.94, 6.55, 2.44, 3, NaN, 3.25, 4.71, 2.84, 
    1.07, 2, 5.33, 5.43, 1.72, 10.55, 3, 1.17, 5.8, NA, 10.55, 
    4.21, 2.94, 3.55, 6.33, 8.25, 5.88, 2, 3.44, 9.22, 1.69, 
    4.18, 2.5, 4.71, NA, 4.41, 5.9, 2.21, NA, 6.67, 3.33, 7, 
    NA, NA, 8, 4.76, NA, 4.44, 2.68, 3.16, 4.94, 5.42, 2.81, 
    NA, NA, NA, 1.78, 6.09, 2.52, 6.56, 1.96, 1.12, 0.67, 3.78, 
    NA, 3.5, 3.65, 5.27, NaN, 4.33, 6.78, 3.6, 4.35), ARG_G1_150_AAA = c(93.38, 
    90.2, 98.33, 94.69, NaN, 99, 93.64, 104.22, 104.8, 103.17, 
    87, 93.83, 101.89, 87.5, 100.38, 107, 94.69, NA, 90.75, 91.5, 
    93.88, 99.5, NaN, 89.5, 86.5, 100.55, 101, 84.22, 101.88, 
    94.62, 97.2, 96.5, NA, 87.38, 96.82, 103.67, NA, 97.57, 95.86, 
    84, NA, NA, 85.5, 90.5, NA, 96.29, 89.71, 101.64, 92.33, 
    93.89, 104.43, NA, NA, NA, 101.33, 93.5, 105.42, 90.75, 104.23, 
    108.86, 102.67, 97, NA, 91.9, 91.38, 93.5, NaN, 98, 94.78, 
    95.1, 93.4), ARG_G1_150_AAC = c(96.38, 90.9, 100, 96.08, 
    NaN, 99.5, 95.82, 106.33, 106.6, 106.5, 92, 95.83, 104, 89, 
    103.75, 109, 96.92, NA, 93, 93.17, 95.12, 102.75, NaN, 93.5, 
    89.38, 102.09, 104.12, 85.44, 103.38, 96.75, 99.2, 98.5, 
    NA, 90.38, 99.18, 105.89, NA, 99.43, 97, 84, NA, NA, 86.75, 
    91.88, NA, 96.86, 98.64, 103.71, 94.22, 95.22, 105.71, NA, 
    NA, NA, 102.33, 94.25, 108.08, 91.75, 107, 112.29, 106.33, 
    98.22, NA, 93.5, 93.25, 94.25, NaN, 100, 96.78, 97.8, 95.5
    ), ARG_G1_150_AC = c(8.75, 10.1, 3.67, 5.23, NaN, 6.5, 6.73, 
    4.78, 2.27, 3.17, 12, 9.83, 3.44, 21.1, 4.25, 2, 11.85, NA, 
    17.5, 6.17, 7.25, 3, NaN, 13.5, 10.62, 5, 5.75, 17.44, 4, 
    10.75, 5, 5.5, NA, 9.5, 9.36, 3.56, NA, 10, 6.86, 9.5, NA, 
    NA, 16.25, 10.25, NA, 10.43, 6, 6.21, 9.22, 9.22, 5.14, NA, 
    NA, NA, 3, 10.75, 6, 12.88, 3.77, 2.57, 4.33, 7.22, NA, 8.6, 
    7.88, 10, NaN, 7, 11.67, 7.8, 7.7), ARG_G1_150_AB = c(10.12, 
    12.6, 5.33, 5.77, NaN, 6.5, 7.91, 5.44, 2.53, 4.33, 12, 9.83, 
    4.78, 21.4, 5.25, 3, 13.77, NA, 17.5, 7.33, 7.25, 6, NaN, 
    16.5, 11.5, 5, 6.25, 18.67, 4.5, 11.38, 5.8, 8.5, NA, 10, 
    9.82, 4.33, NA, 11, 7.71, 9.5, NA, NA, 16.25, 10.25, NA, 
    10.86, 6, 7, 9.22, 10.33, 6.43, NA, NA, NA, 3.33, 11.75, 
    6, 14, 3.77, 2.57, 4.33, 8.22, NA, 8.8, 9, 12, NaN, 8, 12.67, 
    8.2, 8.4), ARG_G1_200_AAA = c(121.5, 110.6, NaN, 120.57, 
    NaN, NaN, 115.67, 132.4, 131.11, 128.5, NaN, 114.5, 126.25, 
    107.4, 124.67, NaN, 120.5, NA, 108, 110.5, 114.33, 125, NaN, 
    114.67, 108, 123.5, 126.67, 105.5, 129.67, 117.75, 121, 120, 
    NA, 108.5, 122.83, 130.8, NA, 123.67, 119, NaN, NA, NA, NaN, 
    109.75, NA, 119, 114.75, 128.88, 115.25, 117, 134, NA, NA, 
    NA, NaN, 113, 131.86, 110.67, 133.57, 138.33, 127.5, 118.25, 
    NA, 112.8, 111.5, 113, NaN, NaN, 114.25, 118, 112.8), ARG_G1_200_AAC = c(123.25, 
    111.6, NaN, 121.29, NaN, NaN, 116.33, 133.4, 132.89, 130.5, 
    NaN, 115.5, 129.5, 108.2, 128.33, NaN, 123, NA, 108, 111.5, 
    115.67, 125, NaN, 118, 112, 125.17, 129, 105.75, 130.33, 
    119.5, 121.4, 121, NA, 109.75, 124.33, 133.4, NA, 125, 120.33, 
    NaN, NA, NA, NaN, 110.75, NA, 123, 124, 129.75, 117.5, 117.2, 
    134, NA, NA, NA, NaN, 116, 134.43, 111.33, 135, 141.33, 129.5, 
    119.5, NA, 114, 113.5, 113, NaN, NaN, 115.5, 120.6, 114), 
    ARG_G1_200_AC = c(12, 15.6, NaN, 8, NaN, NaN, 10.83, 7.8, 
    5.33, 6, NaN, 16.5, 6.75, 31.2, 9.33, NaN, 18, NA, 30, 14.5, 
    13, 11, NaN, 19.67, 17, 9, 9.33, 25.5, 8, 16.25, 9.6, 9, 
    NA, 16, 12.67, 6.2, NA, 13.67, 11.67, NaN, NA, NA, NaN, 17.5, 
    NA, 17, 9, 9.5, 14.75, 15.8, 8, NA, NA, NA, NaN, 23, 10.43, 
    21.33, 5.71, 4.67, 10.25, 13.25, NA, 14.6, 13.25, 19, NaN, 
    NaN, 21.5, 13.2, 14.6), ARG_G1_200_AB = c(14, 19.4, NaN, 
    8.71, NaN, NaN, 12.5, 9, 6, 8, NaN, 16.5, 8.5, 31.8, 11, 
    NaN, 21, NA, 30, 15.5, 13, 15, NaN, 24, 18, 9, 10, 27, 9, 
    17.25, 10.8, 12, NA, 17, 13.5, 7.2, NA, 14.67, 14, NaN, NA, 
    NA, NaN, 17.5, NA, 17.67, 9, 10.88, 14.75, 17, 9.67, NA, 
    NA, NA, NaN, 24, 10.43, 23.33, 5.71, 4.67, 10.5, 15, NA, 
    14.8, 14.75, 21, NaN, NaN, 23.25, 13.8, 15.8), ARG_G1_50_AAA = c(36.35, 
    35.88, 36.22, 35.72, 36.12, 36.96, 35.24, 37.62, 36.05, 34.63, 
    34.19, 33.71, 36.22, 34.43, 34.95, 34.59, 36.03, NA, 32.61, 
    35.29, 37.17, 37.13, 35.62, 34.64, 34.4, 35.69, 37.36, 36.4, 
    36.69, 35.8, 36.57, 35.97, NA, 36.44, 34.94, 35.26, NA, 34.44, 
    37.85, 33.15, NA, NA, 36.13, 34.91, NA, 35.54, 29.02, 35.55, 
    35.64, 35.79, 35.93, NA, NA, NA, 37, 32.58, 35.71, 34.98, 
    36.64, 33.29, 35.29, 37.2, NA, 36.29, 36.91, 31.26, 34, 37.48, 
    33.89, 36.34, 35.88), ARG_G1_50_AAC = c(41.19, 38.7, 41.22, 
    40.53, 44.12, 41.04, 40.18, 42.38, 42.17, 41.87, 38, 41.21, 
    42.24, 38.69, 42.64, 42.14, 41.53, NA, 39.65, 40.76, 41.88, 
    42.23, 39.62, 41.55, 38.19, 42.53, 42.24, 39.49, 42.07, 43.3, 
    40.92, 39.92, NA, 40.35, 40.49, 44.11, NA, 41.72, 40.64, 
    36.15, NA, NA, 39.03, 40.86, NA, 40.93, 37.95, 42.27, 39.47, 
    39.72, 42.12, NA, NA, NA, 42.11, 39.81, 42.82, 39.12, 42.67, 
    43.02, 43.58, 42.61, NA, 40.04, 41.42, 40.9, 41.5, 41.62, 
    40.02, 41.08, 40.18), ARG_G1_50_AC = c(0.98, 1.5, 0.37, 0.6, 
    0.88, 0.73, 1.51, 0.23, 0.25, 0.42, 1.67, 1.58, 0.31, 3.27, 
    0.62, 0.05, 0.83, NA, 3.71, 1.47, 1.07, 0.1, 1.81, 1.19, 
    1.62, 0.61, 0.76, 1.73, 0.24, 0.64, 0.33, 0.97, NA, 0.6, 
    1.98, 0.34, NA, 1.69, 0.26, 2.12, NA, NA, 1.5, 1.14, NA, 
    1, 0.65, 0.88, 1.62, 1.3, 0.39, NA, NA, NA, 0.57, 1.48, 0.58, 
    2.21, 0.43, 0.24, 0.16, 0.65, NA, 0.96, 0.4, 1.13, 1.5, 1.05, 
    1.91, 0.7, 0.94), ARG_G1_50_AB = c(1.09, 2.24, 0.74, 0.68, 
    0.88, 0.73, 1.82, 0.38, 0.36, 0.89, 1.67, 1.58, 0.76, 3.27, 
    0.83, 0.45, 1.15, NA, 3.71, 1.82, 1.07, 1.16, 2.25, 1.93, 
    1.86, 0.61, 1, 2.09, 0.31, 0.86, 0.61, 1.73, NA, 0.77, 2.18, 
    0.34, NA, 1.92, 0.49, 2.12, NA, NA, 1.5, 1.14, NA, 1.2, 0.65, 
    0.88, 1.62, 1.49, 0.63, NA, NA, NA, 0.57, 1.77, 0.58, 2.6, 
    0.43, 0.24, 0.16, 0.85, NA, 0.96, 0.4, 1.84, 1.5, 1.05, 2.4, 
    0.76, 1.14), ARG_G2_100_AAA = c(64.9, 63.8, 71.73, 67.67, 
    NA, NA, 52.5, 72.35, 65.28, 57.22, NA, NaN, 69, 66.67, NaN, 
    66.58, 69, 60.55, 56.29, 67.45, 68.4, 64.25, NaN, 50.86, 
    67.83, 65.96, 57, 53.07, 66.89, NaN, NA, 59, 61.5, NA, 65.9, 
    64.07, NA, NA, 57.91, 67.89, 68.75, 68.5, NaN, 63.24, 66.19, 
    60.59, 59.24, 54.33, 64.39, 65.83, 65.71, 63, 63.78, 63.62, 
    64, 65.08, NA, 67.61, 67.57, 72.71, 65.46, 61.71, NA, 57.62, 
    NA, NA, NA, 64, 61.33, 62.64, NA), ARG_G2_100_AAC = c(65.7, 
    65.8, 74.45, 68, NA, NA, 53.75, 73.94, 67.24, 58.22, NA, 
    NaN, 71.07, 68.07, NaN, 69.88, 71.32, 62.18, 58.65, 76.45, 
    71.13, 67.25, NaN, 51.76, 69.33, 68.17, 58, 54.27, 68.05, 
    NaN, NA, 61, 61.67, NA, 67.79, 65.93, NA, NA, 59.27, 69.67, 
    71.38, 70, NaN, 64.88, 68.19, 62.06, 61, 55.48, 65.67, 67.72, 
    68.47, 64, 65.11, 66, 67.5, 66.33, NA, 69.61, 69.33, 75.67, 
    68.17, 63, NA, 58.81, NA, NA, NA, 66.5, 62.33, 65, NA), ARG_G2_100_AC = c(7.1, 
    6.4, 0.18, 3.67, NA, NA, 12.75, 1.24, 2.96, 9.78, NA, NaN, 
    1.43, 1.33, NaN, 5.21, 2.76, 7.91, 8.06, 2.36, 2.87, 4, NaN, 
    15.52, 2.67, 4.17, 13, 10.07, 5.05, NaN, NA, 9.5, 8.17, NA, 
    5.86, 3.87, NA, NA, 7, 3.33, 1.75, 3, NaN, 7.94, 3.11, 5.29, 
    5.29, 13.1, 3.78, 3.33, 3.06, 5.18, 2.56, 5.04, 5.5, 5.75, 
    NA, 2.22, 2.48, 1, 3.83, 4.82, NA, 8.19, NA, NA, NA, 5, 6.44, 
    5.29, NA), ARG_G2_100_AB = c(7.1, 7.4, 1.09, 3.67, NA, NA, 
    12.75, 1.24, 3.28, 9.78, NA, NaN, 1.71, 1.93, NaN, 6.21, 
    2.76, 7.91, 8.65, 3.55, 3.4, 5, NaN, 16.05, 3.39, 4.52, 13, 
    11.6, 5.05, NaN, NA, 9.5, 9.67, NA, 7.03, 3.87, NA, NA, 8, 
    3.33, 2.19, 3, NaN, 8.53, 3.37, 5.47, 7.35, 13.48, 5.33, 
    3.83, 3.65, 5.82, 4, 6.17, 6, 6.42, NA, 3.83, 2.71, 2.19, 
    4.58, 5.18, NA, 9.75, NA, NA, NA, 5, 6.44, 5.36, NA), ARG_G2_150_AAA = c(85.25, 
    NaN, 99, NaN, NA, NA, 66.86, 101, 89.31, 71.33, NA, NaN, 
    94.5, 88.57, NaN, 95, 95.5, 81.5, 78.5, 107.75, 93.43, NaN, 
    NaN, 66.18, 92.33, 92.25, NaN, 67.43, 87.44, NaN, NA, NaN, 
    78, NA, 89.81, 86.43, NA, NA, 75.75, 91.67, 95, NaN, NaN, 
    85.12, 91.47, 81.88, 79.38, 72.45, 87.67, 91.22, 90.88, 83, 
    85, 89.23, NaN, 86.2, NA, 92, 93.09, 100.27, 88.62, 83.88, 
    NA, 75, NA, NA, NA, NaN, 80, 83.5, NA), ARG_G2_150_AAC = c(86.75, 
    NaN, 101, NaN, NA, NA, 67.29, 103.75, 91.15, 71.67, NA, NaN, 
    96.33, 88.86, NaN, 96.23, 97.5, 83.5, 79.12, 109.5, 95, NaN, 
    NaN, 66.45, 93.56, 93.42, NaN, 68, 88.33, NaN, NA, NaN, 78, 
    NA, 91.69, 87, NA, NA, 76.75, 93, 96.88, NaN, NaN, 85.5, 
    92.67, 83.38, 80.25, 73.09, 88.33, 92.44, 92.38, 84.25, 85.33, 
    91.23, NaN, 87.8, NA, 92.67, 94.09, 102.09, 90.15, 84.75, 
    NA, 76.14, NA, NA, NA, NaN, 81, 85.67, NA), ARG_G2_150_AC = c(15.75, 
    NaN, 1, NaN, NA, NA, 25.71, 2.62, 6.85, 19.33, NA, NaN, 3.83, 
    4.57, NaN, 9.85, 6.5, 15.5, 13.88, 3.75, 6.29, NaN, NaN, 
    27.36, 5.67, 8.42, NaN, 18.86, 11.33, NaN, NA, NaN, 19, NA, 
    11.25, 9.57, NA, NA, 12.75, 6, 4.5, NaN, NaN, 15.75, 5.67, 
    10.75, 9.75, 24.82, 8.67, 6.67, 5.88, 13.25, 7, 10, NaN, 
    10.6, NA, 6.56, 4.18, 2.55, 8.54, 9.75, NA, 17.86, NA, NA, 
    NA, NaN, 15.67, 13.17, NA), ARG_G2_150_AB = c(15.75, NaN, 
    2, NaN, NA, NA, 25.71, 2.62, 8.69, 19.33, NA, NaN, 4.33, 
    5.43, NaN, 11.31, 6.5, 15.5, 14.75, 6, 7.14, NaN, NaN, 28.27, 
    7.22, 9, NaN, 21.29, 11.33, NaN, NA, NaN, 22, NA, 13.44, 
    9.71, NA, NA, 14.75, 6, 5.12, NaN, NaN, 16.75, 6, 11.25, 
    12.75, 25.36, 11.11, 7.33, 6.62, 14.25, 9.33, 11.62, NaN, 
    11.8, NA, 9.22, 4.91, 4.64, 10, 10.38, NA, 19.86, NA, NA, 
    NA, NaN, 15.67, 13.33, NA), ARG_G2_200_AAA = c(NaN, NaN, 
    125, NaN, NA, NA, 81.33, 129.5, 112.25, NaN, NA, NaN, 117.5, 
    108.33, NaN, 120, 119.25, 99, 94, 134, 113.67, NaN, NaN, 
    77.67, 112.25, 112.86, NaN, 78.33, 106.6, NaN, NA, NaN, NaN, 
    NA, 112.4, 106.67, NA, NA, 93, NaN, 122, NaN, NaN, 104.25, 
    114.89, 101.25, 96.75, 87, 107, 112.25, 112.25, 100, NaN, 
    111.86, NaN, 101, NA, 114, 114.5, 124.17, 108.86, 103.25, 
    NA, 90.67, NA, NA, NA, NaN, NaN, 99, NA), ARG_G2_200_AAC = c(NaN, 
    NaN, 126, NaN, NA, NA, 82.33, 129.75, 113.5, NaN, NA, NaN, 
    118, 109.33, NaN, 120.71, 120.25, 101, 94.25, 136, 114, NaN, 
    NaN, 78, 114, 114, NaN, 78.67, 106.8, NaN, NA, NaN, NaN, 
    NA, 114, 108.33, NA, NA, 93, NaN, 123, NaN, NaN, 104.25, 
    116.67, 102.75, 97.25, 87.67, 107.75, 113.25, 113.25, 101, 
    NaN, 113.14, NaN, 101, NA, 114.5, 115, 126.17, 111.29, 104.25, 
    NA, 92, NA, NA, NA, NaN, NaN, 99, NA), ARG_G2_200_AC = c(NaN, 
    NaN, 1, NaN, NA, NA, 36, 5.25, 12.25, NaN, NA, NaN, 8.5, 
    8.33, NaN, 14.29, 11.38, 24, 22.25, 6, 11.67, NaN, NaN, 42.5, 
    9.25, 13.14, NaN, 32, 19.4, NaN, NA, NaN, NaN, NA, 15.6, 
    17, NA, NA, 24, NaN, 6.67, NaN, NaN, 21.5, 8.89, 17.5, 16, 
    37.83, 15.75, 12.25, 11.75, 20, NaN, 15.43, NaN, 26, NA, 
    12.25, 7.5, 5.67, 12.86, 14.75, NA, 27, NA, NA, NA, NaN, 
    NaN, 28.5, NA), ARG_G2_200_AB = c(NaN, NaN, 2, NaN, NA, NA, 
    36, 5.25, 16, NaN, NA, NaN, 10, 9.33, NaN, 16.57, 11.38, 
    24, 23.25, 9, 13, NaN, NaN, 44.33, 11.5, 14.29, NaN, 35, 
    19.4, NaN, NA, NaN, NaN, NA, 18.8, 17.33, NA, NA, 26, NaN, 
    7.67, NaN, NaN, 22.5, 9.33, 18.25, 20.25, 38.67, 19, 13.25, 
    13.25, 22, NaN, 18, NaN, 28, NA, 15.75, 8.83, 8.17, 15.14, 
    16, NA, 29.33, NA, NA, NA, NaN, NaN, 29, NA), ARG_G2_50_AAA = c(36.97, 
    35.4, 34.72, 33.81, NA, NA, 32.98, 35.7, 35.59, 35.36, NA, 
    36, 37.66, 36.35, 33.44, 34.72, 36.9, 34.32, 32.28, 33.74, 
    36.38, 35.06, 34.5, 31.47, 36.59, 36.18, 34.75, 31.9, 36.53, 
    32.62, NA, 33.85, 34.86, NA, 35.36, 34.52, NA, NA, 33.68, 
    35.89, 36.24, 37.21, 28, 34.05, 36.3, 34.16, 32.86, 32.06, 
    34.65, 35.57, 35.95, 33.19, 34.61, 34.6, 34.92, 34.24, NA, 
    34.33, 35.65, 36.16, 33.91, 34.37, NA, 33.44, NA, NA, NA, 
    33.93, 33.71, 35.42, NA), ARG_G2_50_AAC = c(40.2, 38.6, 42.09, 
    39.25, NA, NA, 35.68, 41.41, 39.12, 37.68, NA, 39, 41.16, 
    40.67, 36.11, 39.25, 40.65, 37.52, 35.14, 41.26, 41.13, 40.71, 
    36.25, 33.33, 40.59, 39.67, 36.83, 34.44, 40.57, 34, NA, 
    37, 36.45, NA, 39.52, 38.17, NA, NA, 36.52, 40.39, 40.69, 
    41.21, 29, 39.63, 40.23, 37.27, 36.58, 34.45, 38.87, 38.98, 
    39.51, 38.13, 37.68, 37.88, 38.85, 38.48, NA, 40, 40.43, 
    42.73, 39.93, 38.19, NA, 36.41, NA, NA, NA, 39.71, 36.43, 
    38.03, NA), ARG_G2_50_AC = c(0.8, 1.9, 0, 0.5, NA, NA, 2.93, 
    0.52, 0.58, 2.75, NA, 1.25, 0.21, 0.25, 2.11, 2, 0.85, 2.03, 
    2.67, 0.71, 0.82, 0.29, 0.75, 4.27, 0.63, 0.78, 2.92, 2.77, 
    1.17, 4.88, NA, 3, 2.64, NA, 1.78, 0.98, NA, NA, 2.29, 0.82, 
    0.45, 0.93, 6, 1.67, 0.86, 1.27, 1.79, 3.37, 1.11, 0.74, 
    0.79, 1.1, 0.71, 1.11, 1.08, 2.48, NA, 0.17, 0.75, 0.22, 
    0.91, 1.19, NA, 1.66, NA, NA, NA, 1.07, 1.75, 1.42, NA), 
    ARG_G2_50_AB = c(0.8, 2, 0.31, 0.5, NA, NA, 2.93, 0.52, 0.58, 
    2.75, NA, 1.25, 0.34, 0.5, 3.33, 2.44, 0.85, 2.03, 2.91, 
    1.42, 1, 0.94, 0.75, 4.63, 0.85, 0.96, 2.92, 3.49, 1.17, 
    4.88, NA, 3, 3.36, NA, 2.3, 0.98, NA, NA, 2.61, 0.82, 0.52, 
    0.93, 6, 1.91, 1.02, 1.34, 2.58, 3.67, 1.59, 0.96, 1.09, 
    1.39, 1.5, 1.65, 1.15, 2.76, NA, 0.93, 0.8, 0.82, 1.25, 1.44, 
    NA, 2.49, NA, NA, NA, 1.07, 1.75, 1.47, NA), NARR_G1_100_AAA = c(71.32, 
    NA, NA, 67.83, NaN, 71.6, 64.2, 71.68, 73.29, 70.53, 73.35, 
    59.31, 71.08, 74.06, 68.7, 74, 69.08, NA, 68.52, 63.47, 68.33, 
    NA, 65.64, 62.11, 63.9, 70.41, 60.36, 65.88, 68.81, 69.62, 
    70.68, 67.5, NA, 68.45, 67.16, 74.39, 60.6, 65.89, 71.94, 
    68.75, NA, NA, 67, 66.85, NA, NA, 62.56, 73.33, 69.81, 67.68, 
    73.06, 65.8, 63.85, NA, 67.64, 71.6, 68.47, 69.39, 71.16, 
    72.33, NA, 66.68, NA, 66.22, 67, 61.27, NaN, 72.33, 68.29, 
    71.33, 65.57), NARR_G1_100_AAC = c(74.26, NA, NA, 70.94, 
    NaN, 75, 66.14, 74.48, 77.07, 73.47, 76, 60.44, 73.92, 77.19, 
    71.4, 77.59, 72, NA, 70.38, 65.47, 70.54, NA, 68.09, 64.61, 
    66.5, 72.52, 62.59, 69.25, 71.48, 71.88, 74.4, 70.1, NA, 
    70, 69.6, 78.04, 62.3, 68.79, 73.44, 72.25, NA, NA, 67, 68.25, 
    NA, NA, 65.94, 75.71, 72.43, 69.68, 76, 68.6, 65.65, NA, 
    70.43, 74, 71.76, 71.17, 74.63, 74.22, NA, 69.47, NA, 68.72, 
    67, 62.82, NaN, 77.33, 69.76, 75.42, 67.62), NARR_G1_100_AC = c(3.05, 
    NA, NA, 2.33, NaN, 2.4, 1.89, 0.84, 0.07, 5.47, 1.12, 8.81, 
    2.39, 1.38, 3.6, 0.88, 2.65, NA, 2.05, 5.18, 2.38, NA, 5, 
    4.78, 6.4, 1.85, 7.41, 3.69, 1.85, 2.62, 1.28, 3.9, NA, 2.35, 
    3.8, 1.87, 5.1, 6.95, 1.67, 4.5, NA, NA, 4, 4.25, NA, NA, 
    7.17, 1.29, 2.62, 1.37, 1.47, 3.3, 7.27, NA, 3.64, 3.6, 2.59, 
    4.83, 0.63, 2.28, NA, 6.58, NA, 4.56, 6, 4.82, NaN, 0.67, 
    3.95, 1.75, 4.38), NARR_G1_100_AB = c(3.42, NA, NA, 3.17, 
    NaN, 2.5, 3.29, 1.64, 1.07, 6, 1.41, 9.25, 3.25, 2.69, 3.8, 
    1.32, 3.04, NA, 2.38, 5.18, 2.38, NA, 6.18, 6.11, 6.4, 1.85, 
    7.45, 3.69, 1.89, 3.25, 1.6, 4.8, NA, 2.8, 4.32, 2.3, 6.6, 
    7.42, 2.83, 4.75, NA, NA, 5, 4.75, NA, NA, 8, 1.71, 2.67, 
    2.05, 1.47, 4.8, 7.96, NA, 4.43, 3.8, 4.47, 4.91, 1.68, 2.78, 
    NA, 6.58, NA, 6.67, 6, 5.18, NaN, 1.67, 4.86, 2.08, 4.38), 
    NARR_G1_150_AAA = c(102, NA, NA, 96.22, NaN, 105.33, 87.1, 
    100.14, 106.17, 97.67, 99.88, 75.43, 99.62, 106.86, 95.3, 
    105.68, 97.14, NA, 92.82, 87.25, 96.23, NA, 88.5, 83.56, 
    89.75, 98.47, 80.64, 92.14, 96.07, 94.62, 99.46, 100, NA, 
    92.6, 94.54, 106.25, 82.5, 93.6, 100.33, 95, NA, NA, NaN, 
    90.9, NA, NA, 87.89, 101.08, 96.18, 95, 103.12, 92.75, 85.71, 
    NA, 94.17, NaN, 95.25, 97.5, 100.67, 100.44, NA, 90.9, NA, 
    90.11, NaN, 81.5, NaN, NaN, 94.45, 100.4, 91.64), NARR_G1_150_AAC = c(103.2, 
    NA, NA, 97.67, NaN, 106.67, 88.55, 102.43, 109.17, 98.78, 
    103.25, 76.57, 102.05, 109.43, 97.4, 108.42, 99.29, NA, 94.73, 
    89, 98, NA, 89.75, 85, 91.75, 100.47, 81.64, 93.14, 97.73, 
    96, 101.08, 101.33, NA, 94.1, 95.92, 110.33, 83.25, 95.5, 
    101.67, 98, NA, NA, NaN, 93, NA, NA, 90.56, 102.38, 99, 96.78, 
    106.5, 94.25, 87.43, NA, 98.33, NaN, 99, 98.92, 103.44, 103, 
    NA, 93.8, NA, 92, NaN, 82.25, NaN, NaN, 95.45, 102.8, 93.82
    ), NARR_G1_150_AC = c(6.4, NA, NA, 5.78, NaN, 5, 4.85, 2.29, 
    0.5, 12.44, 2.5, 19, 4.71, 3, 8, 1.63, 5.86, NA, 4.82, 9.25, 
    4.08, NA, 10.75, 9.44, 12.25, 3.6, 15.73, 7.14, 3.73, 7.12, 
    4.08, 6.33, NA, 5.1, 6.62, 3.08, 10.25, 12.5, 4.56, 7.5, 
    NA, NA, NaN, 8.6, NA, NA, 13.67, 3.15, 6, 2.22, 2.5, 8, 15, 
    NA, 6, NaN, 5.5, 8.75, 2.44, 4.33, NA, 13.9, NA, 8.78, NaN, 
    13.75, NaN, NaN, 7.73, 4.4, 9.36), NARR_G1_150_AB = c(7, 
    NA, NA, 7.33, NaN, 5.33, 7.4, 3.71, 2.17, 13.33, 2.88, 20.14, 
    6, 5.14, 8.5, 2.42, 6.43, NA, 5.18, 9.25, 4.08, NA, 12.5, 
    11.56, 12.25, 3.6, 15.73, 7.14, 4, 8.12, 4.46, 7.33, NA, 
    5.9, 7.54, 3.67, 13, 13.3, 6.78, 8, NA, NA, NaN, 9.1, NA, 
    NA, 15.11, 4.15, 6.09, 3.22, 2.5, 10.5, 16.29, NA, 7.33, 
    NaN, 8.38, 8.83, 4, 5.22, NA, 13.9, NA, 12.11, NaN, 15.25, 
    NaN, NaN, 9.27, 5, 9.36), NARR_G1_200_AAA = c(127.8, NA, 
    NA, 120.25, NaN, NaN, 105.85, 126.62, 134.5, 121.4, 126.25, 
    89.33, 126.23, 136, 120.4, 133.17, 124, NA, 115.5, 106.5, 
    120.86, NA, 115, 104.25, NaN, 123.22, 100, 114, 120.22, 115.67, 
    124.38, NaN, NA, 112.6, 119, 137.29, NaN, 118.4, 127, NaN, 
    NA, NA, NaN, 113.8, NA, NA, 111.5, 123.57, 122.33, 118.8, 
    130, NaN, 106.38, NA, 123.5, NaN, 123.75, 123.29, 127.2, 
    126.5, NA, 113.8, NA, 113.75, NaN, 101, NaN, NaN, 117.83, 
    125, 114.5), NARR_G1_200_AAC = c(130, NA, NA, 123, NaN, NaN, 
    107.54, 128.75, 136.5, 123, 128.5, 90, 128, 137.33, 121.6, 
    136.92, 125.5, NA, 117, 108.25, 122.29, NA, 115, 105, NaN, 
    125.11, 102, 116, 122.33, 117.33, 126.25, NaN, NA, 114.6, 
    121.12, 138.86, NaN, 119.2, 127.75, NaN, NA, NA, NaN, 114.4, 
    NA, NA, 113, 124.43, 124, 120.6, 133, NaN, 107, NA, 124.5, 
    NaN, 127.75, 123.57, 129, 127.5, NA, 115.6, NA, 117, NaN, 
    101, NaN, NaN, 118.5, 129, 115.5), NARR_G1_200_AC = c(11.2, 
    NA, NA, 12.5, NaN, NaN, 9.31, 4.25, 2, 17.8, 4.5, 32.33, 
    7.77, 5.67, 13.4, 2.67, 9.62, NA, 7.67, 15, 6.14, NA, 16, 
    14.75, NaN, 6.22, 24.33, 11, 6.67, 14.33, 7.62, NaN, NA, 
    9.4, 9.75, 4.86, NaN, 18.6, 8.25, NaN, NA, NA, NaN, 13.8, 
    NA, NA, 21.75, 6.14, 9.33, 6, 4.5, NaN, 23.75, NA, 8.5, NaN, 
    6.75, 13.86, 3.8, 6.75, NA, 21.4, NA, 12.75, NaN, 20, NaN, 
    NaN, 12.83, 7, 15.83), NARR_G1_200_AB = c(12, NA, NA, 14.5, 
    NaN, NaN, 12.85, 6.38, 4.5, 18.8, 5.25, 34.67, 9.54, 8.67, 
    14.4, 4, 10.62, NA, 8.33, 15, 6.29, NA, 18, 17.5, NaN, 6.22, 
    24.33, 11.33, 7, 15.33, 8.12, NaN, NA, 10.8, 11, 5.71, NaN, 
    19.6, 10.75, NaN, NA, NA, NaN, 14.6, NA, NA, 24, 7.57, 9.5, 
    8, 5, NaN, 25.75, NA, 10.5, NaN, 10.5, 14, 6, 8.75, NA, 21.4, 
    NA, 17.75, NaN, 22, NaN, NaN, 15.5, 8, 15.83), NARR_G1_50_AAA = c(37.69, 
    NA, NA, 37.02, 35.38, 34.34, 36.19, 37.25, 36.78, 36.83, 
    36.61, 34.2, 34.24, 37.51, 35.74, 34, 35.02, NA, 37.4, 36.18, 
    36.63, NA, 34.42, 34.38, 35.43, 37.2, 34.49, 34.2, 36.41, 
    37.07, 36.56, 34.93, NA, 36.06, 36.49, 35.31, 33.33, 34.27, 
    36.5, 36.5, NA, NA, 34.21, 36.02, NA, NA, 34.02, 35.59, 37.16, 
    36.02, 37.58, 36.53, 35.46, NA, 36.46, 38.42, 36.05, 37.39, 
    37.3, 36.22, NA, 35.31, NA, 33.96, 35.55, 35.03, 35, 35.31, 
    36.54, 36.06, 34.98), NARR_G1_50_AAC = c(41.85, NA, NA, 40.71, 
    37.5, 42.38, 39.05, 41.98, 42.51, 42.47, 43.43, 36.41, 42.17, 
    43.27, 40.42, 43.1, 40.52, NA, 41.65, 38.82, 40.63, NA, 40.35, 
    39.18, 38.93, 41.44, 38.3, 39.54, 40.73, 41.83, 42.54, 40.34, 
    NA, 40.69, 40.31, 43.51, 36.13, 39.1, 41.65, 41.62, NA, NA, 
    38.57, 40.02, NA, NA, 38.26, 42.66, 41.55, 39.7, 42.91, 40.43, 
    38.87, NA, 40.86, 43.26, 40.55, 40.84, 42.13, 42.09, NA, 
    40.31, NA, 39.69, 39.73, 36.97, 37.71, 43.44, 40.44, 42.33, 
    39.65), NARR_G1_50_AC = c(0.77, NA, NA, 0.69, 2.25, 0.45, 
    0.59, 0.12, 0, 1.15, 0.34, 2.61, 0.61, 0.24, 0.64, 0.26, 
    0.79, NA, 0.19, 1.43, 0.65, NA, 1.39, 1.11, 1.87, 0.31, 1.98, 
    1.07, 0.54, 0.29, 0.24, 0.76, NA, 0.59, 1.05, 0.62, 2.17, 
    2.25, 0.33, 1.62, NA, NA, 1.36, 1.53, NA, NA, 2.22, 0.22, 
    0.65, 0.45, 0.42, 0.9, 2.18, NA, 0.97, 0.05, 0.84, 0.98, 
    0, 0.44, NA, 1.83, NA, 1.71, 0.91, 1.16, 1.86, 0.12, 0.69, 
    0.45, 1.24), NARR_G1_50_AB = c(0.88, NA, NA, 0.82, 2.25, 
    0.45, 1.03, 0.45, 0.54, 1.36, 0.55, 2.71, 0.96, 0.73, 0.64, 
    0.47, 0.97, NA, 0.29, 1.43, 0.65, NA, 1.81, 1.69, 1.87, 0.31, 
    2.02, 1.07, 0.54, 0.52, 0.39, 1.1, NA, 0.8, 1.31, 0.82, 2.9, 
    2.44, 0.74, 1.62, NA, NA, 1.86, 1.76, NA, NA, 2.48, 0.38, 
    0.67, 0.66, 0.42, 1.67, 2.38, NA, 1.43, 0.16, 1.64, 1.04, 
    0.57, 0.69, NA, 1.83, NA, 2.6, 0.91, 1.16, 2.71, 0.75, 0.98, 
    0.58, 1.24), NARR_G2_100_AAA = c(64.25, 59, NA, 67.88, 67.08, 
    NA, 60.75, 64.42, 71.17, 58.42, NA, 49.8, 63.36, 65.2, NaN, 
    70.2, 62.85, NaN, 61.6, 53.92, 62.63, NA, NaN, 50.46, 65.14, 
    60.58, 63.29, NA, 64.33, NaN, NA, 68.57, NA, NA, 66.3, NA, 
    57.29, NA, 53.5, 63.48, NA, 57.07, NaN, 61.82, NA, 68.61, 
    57.1, 62.84, 63, 61.91, 58.38, NaN, 61.56, NA, NaN, 65.55, 
    63.8, 65, 63.14, 67.31, 67.75, 57.62, 63.31, 54.83, 66.43, 
    NA, NA, 64.67, 57.92, 59, NA)), row.names = c(NA, -71L), class = "data.frame")

CodePudding user response:

I would suggest pulling your column names into a data frame, separating them into their components, and ordering them as desired:

library(dplyr)
library(tidyr)
col_df = data.frame(names = names(merged_DF)[-1]) ## -1 to skip the ID col
col_df = col_df %>%
  separate(
    col = names, sep = "_",
    into = c("s1", "gnum", "num2", "astring"),
    remove = FALSE, convert = TRUE
  ) %>%
  arrange(s1, num2, astring, gnum) 

## now we have the names in order:
col_df
#              names   s1 gnum num2 astring
# 1    ARG_G1_50_AAA  ARG   G1   50     AAA
# 2    ARG_G2_50_AAA  ARG   G2   50     AAA
# 3    ARG_G1_50_AAC  ARG   G1   50     AAC
# 4    ARG_G2_50_AAC  ARG   G2   50     AAC
# 5     ARG_G1_50_AB  ARG   G1   50      AB
# 6     ARG_G2_50_AB  ARG   G2   50      AB
# 7     ARG_G1_50_AC  ARG   G1   50      AC
# 8     ARG_G2_50_AC  ARG   G2   50      AC
# 9   ARG_G1_100_AAA  ARG   G1  100     AAA
# 10  ARG_G2_100_AAA  ARG   G2  100     AAA
# ...

## we can use this order to rearrange the columns
merged_DF = select(merged_DF, c(ID, col_df$names))
names(merged_DF)
# [1] "ID"              "ARG_G1_50_AAA"   "ARG_G2_50_AAA"   "ARG_G1_50_AAC"   "ARG_G2_50_AAC"  
#  [6] "ARG_G1_50_AB"    "ARG_G2_50_AB"    "ARG_G1_50_AC"    "ARG_G2_50_AC"    "ARG_G1_100_AAA" 
# [11] "ARG_G2_100_AAA"  "ARG_G1_100_AAC"  "ARG_G2_100_AAC"  "ARG_G1_100_AB"   "ARG_G2_100_AB"  
# [16] "ARG_G1_100_AC"   "ARG_G2_100_AC"   "ARG_G1_150_AAA"  "ARG_G2_150_AAA"  "ARG_G1_150_AAC" 
# [21] "ARG_G2_150_AAC"  "ARG_G1_150_AB"   "ARG_G2_150_AB"   "ARG_G1_150_AC"   "ARG_G2_150_AC"  
# [26] "ARG_G1_200_AAA"  "ARG_G2_200_AAA"  "ARG_G1_200_AAC"  "ARG_G2_200_AAC"  "ARG_G1_200_AB"  
# [31] "ARG_G2_200_AB"   "ARG_G1_200_AC"   "ARG_G2_200_AC"   "NARR_G1_50_AAA"  "NARR_G1_50_AAC" 
# [36] "NARR_G1_50_AB"   "NARR_G1_50_AC"   "NARR_G1_100_AAA" "NARR_G2_100_AAA" "NARR_G1_100_AAC"
# [41] "NARR_G1_100_AB"  "NARR_G1_100_AC"  "NARR_G1_150_AAA" "NARR_G1_150_AAC" "NARR_G1_150_AB" 
# [46] "NARR_G1_150_AC"  "NARR_G1_200_AAA" "NARR_G1_200_AAC" "NARR_G1_200_AB"  "NARR_G1_200_AC" 
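
Note that arrange() sorts astring alphabetically, so AB lands before AC, whereas the question asks for AAA, AAC, AC, AB (and lists NARR before ARG). One way to get a custom order is to turn the pieces into factors with explicit levels before sorting. The sketch below uses base R order() on a small made-up subset of the names to stay self-contained; the same factor() calls could presumably go into a mutate() step before arrange(). The variable names (prefix, gnum, num, suffix, set1, set2) are illustrative only.

```r
# Hypothetical subset of the column names from the question
cols <- c("ARG_G1_50_AAA", "ARG_G1_50_AB", "ARG_G1_50_AC", "ARG_G2_50_AAA",
          "ARG_G2_50_AB", "ARG_G2_50_AC", "ARG_G1_100_AAA", "ARG_G2_100_AAA",
          "NARR_G1_50_AAA", "NARR_G2_50_AAA")

# Split each name into its four underscore-separated pieces
parts  <- do.call(rbind, strsplit(cols, "_", fixed = TRUE))
prefix <- factor(parts[, 1], levels = c("NARR", "ARG"))             # NARR before ARG
gnum   <- parts[, 2]                                                # "G1" sorts before "G2"
num    <- as.integer(parts[, 3])                                    # numeric, so 50 < 100
suffix <- factor(parts[, 4], levels = c("AAA", "AAC", "AC", "AB"))  # custom suffix order

# Set 1: G1/G2 alternate within each prefix, number and suffix
set1 <- cols[order(prefix, num, suffix, gnum)]

# Set 2, read from the question's two example lines: NARR/ARG alternate fastest
set2 <- cols[order(num, suffix, gnum, prefix)]
```

With either vector, something like merged_DF[c("ID", set1)] would reorder the data frame, assuming every name in the vector exists as a column.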

CodePudding user response:

I bet that there are simpler ways of doing this but this one seems to work.

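intercalate <- function(X, pattern) {
  # f: h is c(first half, second half) with halves of equal length;
  # return h reordered so that the two halves alternate
  f <- function(h, n) {
    i <- seq(1, length(h), by = 2)
    j <- seq(2, length(h), by = 2)
    h[order(c(i, j))]
  }
  # g: interleave x and y; if their lengths differ, interleave the
  # common part and append the leftover elements of the longer one
  g <- function(x, y) {
    nx <- length(x)
    ny <- length(y)
    if(nx == ny) {
      h <- c(x, y)
      f(h, nx)
    } else if(nx > ny) {
      h <- c(x[seq_along(y)], y)
      h <- f(h, ny)
      c(h, x[-seq_along(y)])
    } else {
      h <- c(x, y[seq_along(x)])
      h <- f(h, nx)
      c(h, y[-seq_along(x)])
    }
  }
  # flag the names that match the pattern, mark where the flag changes,
  # and split X into runs of consecutive matching / non-matching names
  s <- grepl(pattern = pattern, X)
  s <- abs(c(0, diff(s)))
  sp <- split(X, cumsum(s))
  # pair each odd-numbered run with the even-numbered run that follows it
  i_odd <- seq(1, length(sp), by = 2)
  i_even <- seq(2, length(sp), by = 2)
  new_names <- mapply(g, sp[i_odd], sp[i_even])
  unname(unlist(new_names))
}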

newnames <- intercalate(names(merged_DF)[-1], pattern = "G2")
newnames <- c(names(merged_DF)[1], newnames)
merged_DF[newnames]
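
For the equal-length case, base R has a common idiom for interleaving two vectors: rbind() stacks them as the two rows of a matrix, and c() then reads that matrix column by column. A minimal sketch with names taken from the question; it assumes both vectors have the same length, since rbind() would otherwise recycle the shorter one. The names g1, g2 and interleaved are illustrative.

```r
g1 <- c("ARG_G1_50_AAA", "ARG_G1_50_AAC", "ARG_G1_50_AC", "ARG_G1_50_AB")
g2 <- c("ARG_G2_50_AAA", "ARG_G2_50_AAC", "ARG_G2_50_AC", "ARG_G2_50_AB")

# rbind() builds a 2 x 4 matrix; c() flattens it column-wise,
# which alternates the elements of g1 and g2
interleaved <- c(rbind(g1, g2))
interleaved
```

This does not handle unequal lengths the way the function above does, so it is a shortcut rather than a replacement.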

CodePudding user response:

This is probably insufficient for the task:

strings <- c('ARG_G1_50_AAA' ,'ARG_G1_50_AAC', 'ARG_G1_50_AC' ,'ARG_G1_50_AB', 
              'ARG_G2_50_AAA' ,'ARG_G2_50_AAC', 'ARG_G2_50_AC')

substring(strings, regexpr('_\\K[[:upper:]]{2,3}', strings, perl = TRUE), nchar(strings))
[1] "AAA" "AAC" "AC"  "AB"  "AAA" "AAC" "AC" 

idx_strings <- order(substring(strings, regexpr('_\\K[[:upper:]]{2,3}', strings, perl = TRUE), nchar(strings)))

idx_strings
[1] 1 5 2 6 4 3 7

strings[idx_strings]
[1] "ARG_G1_50_AAA" "ARG_G2_50_AAA" "ARG_G1_50_AAC" "ARG_G2_50_AAC"
[5] "ARG_G1_50_AB"  "ARG_G1_50_AC"  "ARG_G2_50_AC"

Whether this extends to _100 and _150, I can't say: I didn't have that data, though probably not.
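
One way the order() approach might extend to _100 and _150 is to pass the number as an additional key, converted to integer so that 50 sorts before 100 (as text, "100" sorts before "50"). This is only a sketch: it assumes every name follows the PREFIX_Gn_NUMBER_SUFFIX layout from the question, and the _100 strings are made up for illustration.

```r
strings <- c("ARG_G1_100_AAA", "ARG_G1_50_AAA", "ARG_G1_50_AC",
             "ARG_G2_100_AAA", "ARG_G2_50_AAA", "ARG_G2_50_AC")

# Third underscore-separated field, as an integer
num <- as.integer(sub("^[^_]+_[^_]+_([0-9]+)_.*$", "\\1", strings))
# Everything after the last underscore
suffix <- sub("^.*_", "", strings)

# Sort by number, then suffix; remaining ties are broken by the full name,
# which puts G1 before G2
sorted <- strings[order(num, suffix, strings)]
sorted
```

The suffix is still sorted alphabetically here, so AB would come before AC, just as in the output above.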
