r: filter_at or filter_all strings that meet two conditions and then pivot_longer

Time:01-25

I have a very large data frame that records tumor size on MRI scans across multiple follow-ups.

Let's say I have p:

   id debut_extramea_xy_dimension_MR fu1_extramea_xy_dimension_MR fu2_extramea_xy_dimension_MR fu3_extramea_xy_dimension_MR
1 134                          14x14                        14x14                    12.5x10.5                    12.5x10.5
2 434                   24 x 19 x 13                      24 x 17                      24 x 17                      21 x 16
3 437                        40 x 30                   20 x 20 mm                      20 x 20                      25 x 18
4 440                        26 x 24                      26 x 24                      26 x 24                      26 x 24
5 498                         13x6.4                     14.8x8.7                    19.4x12.3                    21.7x13.5

As you can see, the data records two-dimensional "xy"-axis data on the tumor. However, there are two issues:

(1) for some patients, whoever entered the data accidentally recorded three dimensions ("xyz"-axis). This is demonstrated in row 2 of p$debut_extramea_xy_dimension_MR, corresponding to p$id == 434

and

(2) in some cases, the unit of measurement was accidentally recorded, e.g. "mm" as in row 3 of p$fu1_extramea_xy_dimension_MR, corresponding to p$id == 437

I need to filter and pivot_longer so that I obtain a data frame with three columns: (1) the id, (2) the follow-up and (3) the erroneous value. I need to go into the database manually to correct these entries, so this information would be a great help.

Expected output

   id  name        value
1 434 debut 24 x 19 x 13
2 437   fu1   20 x 20 mm

Data

p <- structure(list(
  id = c(134L, 434L, 437L, 440L, 498L),
  debut_extramea_xy_dimension_MR = c("14x14", "24 x 19 x 13", "40 x 30", "26 x 24", "13x6.4"),
  fu1_extramea_xy_dimension_MR = c("14x14", "24 x 17", "20 x 20 mm", "26 x 24", "14.8x8.7"),
  fu2_extramea_xy_dimension_MR = c("12.5x10.5", "24 x 17", "20 x 20", "26 x 24", "19.4x12.3"),
  fu3_extramea_xy_dimension_MR = c("12.5x10.5", "21 x 16", "25 x 18", "26 x 24", "21.7x13.5")
), row.names = c(NA, -5L), class = "data.frame")

CodePudding user response:

Here is a solution

library(tidyverse)

# Flag a third dimension (two "x" separators) or a recorded unit ("mm")
rgx <- "x.*x|mm"

p %>%
  filter(if_any(-id, ~ str_detect(.x, rgx))) %>%  # rows with at least one bad cell
  pivot_longer(-id) %>%
  filter(str_detect(value, rgx)) %>%              # keep only the bad cells
  mutate(name = str_remove(name, "_extramea_xy_dimension_MR"))

# A tibble: 2 × 3
     id name  value       
  <int> <chr> <chr>       
1   434 debut 24 x 19 x 13
2   437 fu1   20 x 20 mm  
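
Since the question mentions two distinct kinds of error, one possible extension is to label which kind each flagged cell contains. The following is only a sketch, not part of the original answer: the column name `error` and its labels are made up for illustration, and a cell containing both problems would only get the first label.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Same example data as in the question
p <- data.frame(
  id = c(134L, 434L, 437L, 440L, 498L),
  debut_extramea_xy_dimension_MR = c("14x14", "24 x 19 x 13", "40 x 30", "26 x 24", "13x6.4"),
  fu1_extramea_xy_dimension_MR = c("14x14", "24 x 17", "20 x 20 mm", "26 x 24", "14.8x8.7"),
  fu2_extramea_xy_dimension_MR = c("12.5x10.5", "24 x 17", "20 x 20", "26 x 24", "19.4x12.3"),
  fu3_extramea_xy_dimension_MR = c("12.5x10.5", "21 x 16", "25 x 18", "26 x 24", "21.7x13.5")
)

flagged <- p %>%
  pivot_longer(-id) %>%
  mutate(
    name = str_remove(name, "_extramea_xy_dimension_MR"),
    # Hypothetical labels; case_when() returns NA when no condition matches
    error = case_when(
      str_count(value, "x") >= 2 ~ "three dimensions",
      str_detect(value, "mm")    ~ "unit recorded"
    )
  ) %>%
  filter(!is.na(error))  # keep only the cells that matched a rule

flagged
```

On the sample data this should flag the same two cells as above (id 434 at debut and id 437 at fu1), each with its label in the extra `error` column.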