I need a push to the right direction... or a hint where to look further. I have a large dataframe containing results of analyses for samples - a key column and then several columns for analytical targets. I managed to filter certain conditions, summarise some statistics and store them in a second dataframe. (dft2_stat_norm)
Now I want to column-wise divide all target values of the original (unfiltered) tibble (e. g. below "dft_Resp_norm") by the corresponding (row) value stored in the second tibble. I have all targets stored in a vector "analytes". The column and row are linked by the target descriptions, e. g. "G444".
- can you suggest a solution?
- can you suggest a source (comprehensive for lateral entrants ;)) to read further on this?
Thanks!
> dft_Resp_norm
# A tibble: 39 × 7
Datum Lfd_Nr Probe cond G444 G448 S453
\<chr\> \<dbl\> \<chr\> \<chr\> \<dbl\> \<dbl\> \<dbl\>
1 09.01.2020 16 NK NK 0.00586 0.0591 0.0594
2 04.02.2020 37 NK NK 0.00661 0.0609 0.0944
3 12.02.2020 57 NK NK 0.00611 0.0674 0.116
4 13.03.2020 67 NK-2 NK 0.00122 0.0678 0.115
5 13.03.2020 68 NK-007 NK 0.0138 0.115 0.359
> dft2_stat_norm
# A tibble: 3 × 2
An_Names median_Resp_norm
<chr> <dbl>
1 G444 0.00678
2 G448 0.0696
3 S453 0.126
I tried it directly ...
dft_MOM_full <- df_Resp_full %>%
select(1:7) %>%
mutate(across(any_of(analytes); -?-
)
but I can link the column name to the corresponding row in the second tibble, and I tried it by grouping ...
dft_MOM_full <- df_Resp_full %>%
select(1:7) %>%
pivot_longer(any_of(analytes),
names_to = "target",
values_to = "MOM") %>%
mutate(MOM = MOM / -?-
)
... but I didn't manage to find out how to link the group name to the second tibble
CodePudding user response:
Using a left_join
you could do:
library(dplyr)
library(tidyr)
analytes <- dft2_stat_norm$An_Names
dft_MOM_full <- dft_Resp_norm %>%
pivot_longer(any_of(analytes),
names_to = "target",
values_to = "MOM") |>
left_join(dft2_stat_norm, by = c("target" = "An_Names")) |>
mutate(MOM = MOM / median_Resp_norm)
dft_MOM_full
#> # A tibble: 15 × 7
#> Datum Lfd_Nr Probe cond target MOM median_Resp_norm
#> <chr> <int> <chr> <chr> <chr> <dbl> <dbl>
#> 1 09.01.2020 16 NK NK G444 0.864 0.00678
#> 2 09.01.2020 16 NK NK G448 0.849 0.0696
#> 3 09.01.2020 16 NK NK S453 0.471 0.126
#> 4 04.02.2020 37 NK NK G444 0.975 0.00678
#> 5 04.02.2020 37 NK NK G448 0.875 0.0696
#> 6 04.02.2020 37 NK NK S453 0.749 0.126
#> 7 12.02.2020 57 NK NK G444 0.901 0.00678
#> 8 12.02.2020 57 NK NK G448 0.968 0.0696
#> 9 12.02.2020 57 NK NK S453 0.921 0.126
#> 10 13.03.2020 67 NK-2 NK G444 0.180 0.00678
#> 11 13.03.2020 67 NK-2 NK G448 0.974 0.0696
#> 12 13.03.2020 67 NK-2 NK S453 0.913 0.126
#> 13 13.03.2020 68 NK-007 NK G444 2.04 0.00678
#> 14 13.03.2020 68 NK-007 NK G448 1.65 0.0696
#> 15 13.03.2020 68 NK-007 NK S453 2.85 0.126