I'm new to R, so apologies in advance if I don't describe my problem clearly. I have a set of letters, each with a given value, stored in a data frame, and a string made up of the same letters. I want to map the values from the data frame onto each letter of the string and then calculate the mean over a sliding window of length L. I'm stuck on the first part: I don't know how to match the characters of the string against the data frame columns and assign the values to them so that I can compute the window means. Any tips?
A = data.frame(A = 0.429, C = -0.051, D = -2.024, E = -2.181, F = 0.836,
G = 0.158, H = -1.056, I = 0.959, K = -2.398, L = 0.658,
M = 0.470, N = -1.099, P = -0.675, Q = -1.564, R = -2.501,
S = -0.292, T = -0.182, V = 0.634, W = 0.463, Y = 0.163)
(a <- "MASEFKKKLFWRAVVAEF")
a_split = strsplit(a, "")
L = readline(prompt = "Enter window length: \n")
x = nchar(a)
for(i in 1:x-L)
{
for(j in a_split)
{
}
}
CodePudding user response:
You can do this in a one-liner using your original data format:
sapply(unlist(strsplit(a, "")), \(i) A[[i]])
#> M A S E F K K K L
#> 0.470 0.429 -0.292 -2.181 0.836 -2.398 -2.398 -2.398 0.658
#> F W R A V V A E F
#> 0.836 0.463 -2.501 0.429 0.634 0.634 0.429 -2.181 0.836
Or, if you don't want the letters attached as names on the result, the one-liner is:
as.numeric(sapply(unlist(strsplit(a, "")), \(i) A[[i]]))
#> [1] 0.470 0.429 -0.292 -2.181 0.836 -2.398 -2.398 -2.398 0.658
#> [10] 0.836 0.463 -2.501 0.429 0.634 0.634 0.429 -2.181 0.836
CodePudding user response:
Instead of using a data.frame, use a named numeric vector:
A = c(A = 0.429, C = -0.051, D = -2.024, E = -2.181, F = 0.836,
G = 0.158, H = -1.056, I = 0.959, K = -2.398, L = 0.658,
M = 0.470, N = -1.099, P = -0.675, Q = -1.564, R = -2.501,
S = -0.292, T = -0.182, V = 0.634, W = 0.463, Y = 0.163)
Then use this as a 'map' between the letters and the values
values <- A[ a_split[[1]] ]
values
# M A S E F K K K L F W
# 0.470 0.429 -0.292 -2.181 0.836 -2.398 -2.398 -2.398 0.658 0.836 0.463
# R A V V A E F
# -2.501 0.429 0.634 0.634 0.429 -2.181 0.836
(For the original data.frame, you could write unlist(A)[ a_split[[1]] ].)
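One thing worth checking with this kind of lookup is what happens when the string contains a letter that has no entry in the map. The sketch below uses a three-letter subset of the values above and a made-up letter "X" (both are assumptions for illustration only). Indexing a named vector by a missing name returns NA instead of raising an error, so the gaps can be found with is.na().

```r
# Small illustrative map (a subset of the values above)
A <- c(A = 0.429, K = -2.398, M = 0.470)

# "X" is a hypothetical letter with no entry in the map
letters_in <- strsplit("MAXK", "")[[1]]
looked_up <- A[letters_in]

# Indexing by a missing name yields NA rather than an error,
# so the unmatched letters can be collected like this
missing_letters <- unique(letters_in[is.na(looked_up)])
missing_letters
```

If missing_letters is non-empty, the window means that cover those positions would also come out as NA, so it is probably worth handling them before averaging.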
Use convolve() to calculate the sliding window average:
window_size = 4
convolve(values, rep(1, window_size) / window_size, type = "filter")
# M A S E F K K K
# -0.39350 -0.30200 -1.00875 -1.53525 -1.58950 -1.63400 -0.82550 -0.11025
# L F W R A V V
# -0.13600 -0.19325 -0.24375 -0.20100 0.53150 -0.12100 -0.07050
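As a cross-check on the convolve() result, the same averages can be computed with an explicit loop over window start positions. This is only a sketch: window_means() is a hypothetical helper name, not something from base R, and it converts the window length with as.integer() on the assumption that it may arrive as text from readline(), which returns a character string.

```r
# Hypothetical helper: mean of every window of length L over a numeric vector
window_means <- function(values, L) {
  L <- as.integer(L)  # readline() returns a character string, so convert first
  stopifnot(!is.na(L), L >= 1, L <= length(values))
  starts <- seq_len(length(values) - L + 1)  # first index of each window
  vapply(starts, function(i) mean(values[i:(i + L - 1)]), numeric(1))
}

A <- c(A = 0.429, C = -0.051, D = -2.024, E = -2.181, F = 0.836,
       G = 0.158, H = -1.056, I = 0.959, K = -2.398, L = 0.658,
       M = 0.470, N = -1.099, P = -0.675, Q = -1.564, R = -2.501,
       S = -0.292, T = -0.182, V = 0.634, W = 0.463, Y = 0.163)
a <- "MASEFKKKLFWRAVVAEF"
values <- A[strsplit(a, "")[[1]]]

res <- window_means(values, "4")  # "4" mimics what readline() would hand back
res
```

Note that in the question's loop, 1:x-L is parsed as (1:x) - L because ":" binds more tightly than "-", so the window starts need parentheses, for example 1:(x - L + 1), or seq_len() as above.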