How can I correspond the values of characters from a dataframe to characters of a list

Time:11-06

Since I am really new to R, I am not sure I can express my problem correctly, so sorry in advance. I have some letters that each have a given value. I created a dataframe for those, and I also have a string made up of the same set of letters. I want to map the values from the dataframe onto each letter of my string and then calculate the mean over a window of length L. I can't find a way to do the first part: I don't know how to compare the characters of the string with the column names of the dataframe and then assign the values to the string's characters so that I can find the mean of each window. Any tips?

A = data.frame(A = 0.429, C = -0.051, D = -2.024, E = -2.181, F = 0.836, 
     G = 0.158, H = -1.056, I = 0.959, K = -2.398, L = 0.658, 
     M = 0.470, N = -1.099, P = -0.675, Q = -1.564, R = -2.501, 
     S = -0.292, T = -0.182, V = 0.634, W = 0.463, Y = 0.163)
(a <- "MASEFKKKLFWRAVVAEF")
a_split = strsplit(a, "")
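L = as.integer(readline(prompt = "Enter window length: \n"))  # readline() returns a string
x = nchar(a)
for(i in 1:(x - L + 1))  # parentheses needed: 1:x-L parses as (1:x) - L
{
  for(j in a_split[[1]])  # a_split is a list of length 1; iterate over its letters
  {
    # this is the missing part: look up the value of letter j in A
  }
}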

CodePudding user response:

You can do this in a one-liner using your original data format:

sapply(unlist(strsplit(a, "")), \(i) A[[i]])
#>      M      A      S      E      F      K      K      K      L 
#>  0.470  0.429 -0.292 -2.181  0.836 -2.398 -2.398 -2.398  0.658 
#>      F      W      R      A      V      V      A      E      F 
#>  0.836  0.463 -2.501  0.429  0.634  0.634  0.429 -2.181  0.836 

Or, if you don't want the letter names on the result, the one-liner is:

as.numeric(sapply(unlist(strsplit(a, "")), \(i) A[[i]]))
#>  [1]  0.470  0.429 -0.292 -2.181  0.836 -2.398 -2.398 -2.398  0.658
#> [10]  0.836  0.463 -2.501  0.429  0.634  0.634  0.429 -2.181  0.836
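To connect this back to the loop in the question, here is a minimal sketch of the window step in base R. The names `vals`, `n_windows` and `window_means` are made up for illustration, `L` is hard-coded to 4 rather than read with readline(), and `function(i)` is used instead of `\(i)` so it also runs on R versions before 4.1.

```r
# Lookup table and sequence from the question
A <- data.frame(A = 0.429, C = -0.051, D = -2.024, E = -2.181, F = 0.836,
                G = 0.158, H = -1.056, I = 0.959, K = -2.398, L = 0.658,
                M = 0.470, N = -1.099, P = -0.675, Q = -1.564, R = -2.501,
                S = -0.292, T = -0.182, V = 0.634, W = 0.463, Y = 0.163)
a <- "MASEFKKKLFWRAVVAEF"
L <- 4  # assumed window length, hard-coded for the example

# One value per letter of the string, without names
vals <- as.numeric(sapply(unlist(strsplit(a, "")), function(i) A[[i]]))

# A string of n letters has n - L + 1 full windows of length L
n_windows <- length(vals) - L + 1

# Mean of each window: letters i, i + 1, ..., i + L - 1
window_means <- sapply(seq_len(n_windows),
                       function(i) mean(vals[i:(i + L - 1)]))
window_means
```

The first window here is M, A, S, E, so the first mean is (0.470 + 0.429 - 0.292 - 2.181) / 4 = -0.3935.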

CodePudding user response:

Instead of using a data.frame, use a named numeric vector:

A = c(A = 0.429, C = -0.051, D = -2.024, E = -2.181, F = 0.836, 
     G = 0.158, H = -1.056, I = 0.959, K = -2.398, L = 0.658, 
     M = 0.470, N = -1.099, P = -0.675, Q = -1.564, R = -2.501, 
     S = -0.292, T = -0.182, V = 0.634, W = 0.463, Y = 0.163)

Then use this as a 'map' between the letters and the values:

values <- A[ a_split[[1]] ]
values
#      M      A      S      E      F      K      K      K      L      F      W
#  0.470  0.429 -0.292 -2.181  0.836 -2.398 -2.398 -2.398  0.658  0.836  0.463
#      R      A      V      V      A      E      F
# -2.501  0.429  0.634  0.634  0.429 -2.181  0.836
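One thing to watch with this kind of lookup: a letter that is not in the vector gives NA rather than an error. The sketch below shows one way to detect that. It uses a shortened table and a made-up test string ("MAXB"), and the name `unknown` is just for illustration.

```r
# Shortened lookup table (only two of the letters from the answer)
A <- c(A = 0.429, M = 0.470)

# Hypothetical input: X and B have no entry in the table
chars <- strsplit("MAXB", "")[[1]]
values <- A[chars]

# Indexing by a name that does not exist returns NA
unknown <- unique(chars[is.na(values)])
unknown

# Stop early instead of silently averaging over NA values
if (length(unknown) > 0) {
  message("No value for: ", paste(unknown, collapse = ", "))
}
```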

(For the original data.frame, you could write unlist(A)[ a_split[[1]] ].)

Use convolve() to calculate the sliding window average:

window_size = 4
convolve(values, rep(1, window_size) / window_size, type = "filter")
#        M        A        S        E        F        K        K        K
# -0.39350 -0.30200 -1.00875 -1.53525 -1.58950 -1.63400 -0.82550 -0.11025
#        L        F        W        R        A        V        V
# -0.13600 -0.19325 -0.24375 -0.20100  0.53150 -0.12100 -0.07050
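As a sanity check on the convolve() result, the same sliding mean can be computed another way and compared. This is only a sketch: it swaps in embed() plus rowMeans(), which is a different technique from the one in the answer, and the names `via_convolve` and `via_embed` are made up for the example.

```r
A <- c(A = 0.429, C = -0.051, D = -2.024, E = -2.181, F = 0.836,
       G = 0.158, H = -1.056, I = 0.959, K = -2.398, L = 0.658,
       M = 0.470, N = -1.099, P = -0.675, Q = -1.564, R = -2.501,
       S = -0.292, T = -0.182, V = 0.634, W = 0.463, Y = 0.163)
a <- "MASEFKKKLFWRAVVAEF"
values <- unname(A[strsplit(a, "")[[1]]])
window_size <- 4

# Approach from the answer: convolution with equal weights
via_convolve <- as.numeric(
  convolve(values, rep(1, window_size) / window_size, type = "filter")
)

# Alternative: embed() puts each window in one row of a matrix,
# so the row means are the window means
via_embed <- rowMeans(embed(values, window_size))

# Compare with a tolerance, since convolve() goes through an FFT
all.equal(via_convolve, via_embed)
```

With equal weights the order of the values inside a window does not matter, which is why the reversed column order that embed() produces is harmless here.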