I have a dataset with 96 data columns. The column names currently run from "A1" to "H12" (rows A-H, columns 1-12), as seen in the table below:
> head(od,1)
time T..OD2.600 A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 B1 B2 B3 B4 B5
1 0.24 25 0.1 0.13 0.13 0.1 0.16 0.12 0.13 0.1 0.09 0.1 0.09 0.09 0.09 0.09 0.13 0.2 0.1
B6 B7 B8 B9 B10 B11 B12 C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 D1
1 0.1 0.1 0.12 0.09 0.09 0.09 0.09 0.09 0.12 0.13 0.11 0.1 0.14 0.1 0.1 0.09 0.09 0.09 0.09 0.1
D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 E1 E2 E3 E4 E5 E6 E7 E8 E9
1 0.09 0.11 0.09 0.14 0.09 0.1 0.09 0.09 0.09 0.09 0.09 0.09 0.09 0.1 0.21 0.12 0.1 0.11 0.1 0.09
E10 E11 E12 F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 G1 G2 G3 G4 G5
1 0.09 0.09 0.1 0.1 0.1 0.1 0.1 0.09 0.09 0.09 0.09 0.09 0.12 0.09 0.09 0.09 0.1 0.11 0.11 0.09
G6 G7 G8 G9 G10 G11 G12 H1 H2 H3 H4 H5 H6 H7 H8 H9 H10 H11 H12
1 0.14 0.1 0.09 0.1 0.14 0.1 0.1 0.09 0.09 0.1 0.1 0.09 0.1 0.1 0.09 0.09 0.09 0.09 0.09
However, I need to change all the column names ("A1", "A2", "A3", etc.) to the corresponding names from the text below, which is in this format:
" A1 Negative Control A2 Ammonia A3 Nitrite A4 Nitrate A5 Urea A6 Biuret A7 L-Alanine A8 L-Arginine A9 L-Asparagine A10 L-Aspartic Acid A11 L-Cysteine A12 L-Glutamic Acid B1 L-Glutamine B2 Glycine B3 L-Histidine B4 L-Isoleucine B5 L-Leucine B6 L-Lysine B7 L-Methionine B8 L-Phenylalanine B9 L-Proline B10 L-Serine B11 L-Threonine B12 L-Tryptophan C1 L-Tyrosine C2 L-Valine C3 D-Alanine C4 D-Asparagine C5 D-Aspartic Acid C6 D-Glutamic Acid C7 D-Lysine C8 D-Serine C9 D-Valine C10 L-Citrulline C11 L-Homoserine C12 L-Ornithine D1 N-Acetyl-LGlutamic Acid D2 N-Phthaloyl-LGlutamic Acid D3 L-Pyroglutamic Acid D4 Hydroxylamine D5 Methylamine D6 N-Amylamine D7 N-Butylamine D8 Ethylamine D9 Ethanolamine D10 Ethylenediamine D11 Putrescine D12 Agmatine E1 Histamine E2 ß-Phenylethylamine E3 Tyramine E4 Acetamide E5 Formamide E6 Glucuronamide E7 D,L-Lactamide E8 D-Glucosamine E9 D-Galactosamine E10 D-Mannosamine E11 N-Acetyl-DGlucosamine E12 N-Acetyl-DGalactosamine F1 N-Acetyl-DMannosamine F2 Adenine F3 Adenosine F4 Cytidine F5 Cytosine F6 Guanine F7 Guanosine F8 Thymine F9 Thymidine F10 Uracil F11 Uridine F12 Inosine G1 Xanthine G2 Xanthosine G3 Uric Acid G4 Alloxan G5 Allantoin G6 Parabanic Acid G7 D,L-α-Amino-NButyric Acid G8 γ-Amino-NButyric Acid G9 ε-Amino-NCaproic Acid G10 D,L-α-AminoCaprylic Acid G11 δ-Amino-NValeric Acid G12 α-Amino-NValeric Acid H1 Ala-Asp H2 Ala-Gln H3 Ala-Glu H4 Ala-Gly H5 Ala-His H6 Ala-Leu H7 Ala-Thr H8 Gly-Asn H9 Gly-Gln H10 Gly-Glu H11 Gly-Met H12 Met-Ala "
So that A1 = Negative Control, A2 = Ammonia, and so on.
I hope everything makes sense, and thanks a lot in advance!
Answer:
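We can convert the string to a key/value dataset by splitting it before each well ID and then separating each ID from its name with a ":"
# split at whitespace that precedes a well ID, then turn the first space after the ID into ":"
key_val <- read.csv(text = sub("(?<=\\d) ", ":",
  strsplit(str1, "\\s+(?=[A-Z]\\d+\\s)", perl = TRUE)[[1]], perl = TRUE),
     header = FALSE, sep = ":")
#or without splitting
#key_val <- read.csv(text = gsub("\\s+(?=[A-Z]\\d+\\b)", "\n",
#    gsub("(?<=\\d)\\s+", ":", str1, perl = TRUE), perl = TRUE),
#     header = FALSE, sep=":")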
-checking
> head(key_val)
V1 V2
1 A1 Negative Control
2 A2 Ammonia
3 A3 Nitrite
4 A4 Nitrate
5 A5 Urea
6 A6 Biuret
> tail(key_val)
V1 V2
91 H7 Ala-Thr
92 H8 Gly-Asn
93 H9 Gly-Gln
94 H10 Gly-Glu
95 H11 Gly-Met
96 H12 Met-Ala
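To make the split-then-separate idea easier to follow, here is a minimal sketch on a shortened string. The names `demo_str`, `pieces`, `pieces_sep` and `demo_kv` are made up for illustration and are not part of the OP's data; the sketch assumes every well ID is one capital letter followed by digits and that no compound name contains such a token.

```r
# Shortened legend string (illustrative only, not the full 96-well text)
demo_str <- "A1 Negative Control A2 Ammonia A10 L-Aspartic Acid G3 Uric Acid"

# Step 1: split at whitespace that is followed by a well ID
# (one capital letter, one or more digits, then whitespace)
pieces <- strsplit(demo_str, "\\s+(?=[A-Z]\\d+\\s)", perl = TRUE)[[1]]
print(pieces)

# Step 2: replace only the first space that follows a digit with ":",
# so spaces inside names such as "Negative Control" are left alone
pieces_sep <- sub("(?<=\\d) ", ":", pieces, perl = TRUE)
print(pieces_sep)

# Step 3: read the "ID:name" lines into a two-column data frame (V1, V2)
demo_kv <- read.csv(text = pieces_sep, header = FALSE, sep = ":",
                    stringsAsFactors = FALSE)
print(demo_kv)
```

Using ":" as the separator is convenient here because names such as "D,L-Lactamide" contain commas but, as far as the posted text shows, no colons.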
Now we rename by matching the column names of the dataset against the 'V1' column and replacing each match with the corresponding 'V2' value
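library(dplyr)
library(tibble)
# keep only the keys that actually occur as column names in 'od'
key_val_sub <- key_val %>%
   filter(V1 %in% names(od))
# deframe() on the (V2, V1) columns gives a named vector c(new = old) for rename()
od1 <- od %>%
    rename(!!! deframe(key_val_sub[2:1]))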
-output
> od1
time Negative Control Ammonia δ-Amino-NValeric Acid
1 0.24 0.1 0.3 0.1
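If loading dplyr and tibble is not desirable, the same matching step can be sketched in base R with match(). This is only an alternative under the same assumptions, not the method used above; `od_demo`, `kv_demo`, `idx`, `hit` and `od_renamed` are hypothetical names built for this example.

```r
# Small stand-ins for 'od' and 'key_val' (illustrative values only)
od_demo <- data.frame(time = 0.24, A1 = 0.1, A2 = 0.3, G3 = 0.1)
kv_demo <- data.frame(V1 = c("A1", "A2", "A3", "G3"),
                      V2 = c("Negative Control", "Ammonia", "Nitrite", "Uric Acid"),
                      stringsAsFactors = FALSE)

# Position of each column name in the key column; NA where there is no key (e.g. "time")
idx <- match(names(od_demo), kv_demo$V1)
hit <- !is.na(idx)

# Overwrite only the matched names and leave the others unchanged
od_renamed <- od_demo
names(od_renamed)[hit] <- kv_demo$V2[idx[hit]]
print(names(od_renamed))
```

Because the new names contain spaces and hyphens, they need backticks or `[[ ]]` when used later, for example od_renamed[["Negative Control"]].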
NOTE: For reproducibility, only a subset of the OP's 'od' data is used here
data
od <- structure(list(time = 0.24, A1 = 0.1, A2 = 0.3, G11 = 0.1), class = "data.frame", row.names = c(NA,
-1L))
str1 <- "A1 Negative Control A2 Ammonia A3 Nitrite A4 Nitrate A5 Urea A6 Biuret A7 L-Alanine A8 L-Arginine A9 L-Asparagine A10 L-Aspartic Acid A11 L-Cysteine A12 L-Glutamic Acid B1 L-Glutamine B2 Glycine B3 L-Histidine B4 L-Isoleucine B5 L-Leucine B6 L-Lysine B7 L-Methionine B8 L-Phenylalanine B9 L-Proline B10 L-Serine B11 L-Threonine B12 L-Tryptophan C1 L-Tyrosine C2 L-Valine C3 D-Alanine C4 D-Asparagine C5 D-Aspartic Acid C6 D-Glutamic Acid C7 D-Lysine C8 D-Serine C9 D-Valine C10 L-Citrulline C11 L-Homoserine C12 L-Ornithine D1 N-Acetyl-LGlutamic Acid D2 N-Phthaloyl-LGlutamic Acid D3 L-Pyroglutamic Acid D4 Hydroxylamine D5 Methylamine D6 N-Amylamine D7 N-Butylamine D8 Ethylamine D9 Ethanolamine D10 Ethylenediamine D11 Putrescine D12 Agmatine E1 Histamine E2 ß-Phenylethylamine E3 Tyramine E4 Acetamide E5 Formamide E6 Glucuronamide E7 D,L-Lactamide E8 D-Glucosamine E9 D-Galactosamine E10 D-Mannosamine E11 N-Acetyl-DGlucosamine E12 N-Acetyl-DGalactosamine F1 N-Acetyl-DMannosamine F2 Adenine F3 Adenosine F4 Cytidine F5 Cytosine F6 Guanine F7 Guanosine F8 Thymine F9 Thymidine F10 Uracil F11 Uridine F12 Inosine G1 Xanthine G2 Xanthosine G3 Uric Acid G4 Alloxan G5 Allantoin G6 Parabanic Acid G7 D,L-α-Amino-NButyric Acid G8 γ-Amino-NButyric Acid G9 ε-Amino-NCaproic Acid G10 D,L-α-AminoCaprylic Acid G11 δ-Amino-NValeric Acid G12 α-Amino-NValeric Acid H1 Ala-Asp H2 Ala-Gln H3 Ala-Glu H4 Ala-Gly H5 Ala-His H6 Ala-Leu H7 Ala-Thr H8 Gly-Asn H9 Gly-Gln H10 Gly-Glu H11 Gly-Met H12 Met-Ala"