I have this dataframe
a <- c("5*", 7, 9, "11***")
b <- c("-8*", "-10*", -3, -1)
c <- c(-4, -1, "-6**", "3*")
df <- data.frame(a,b,c)
a b c
1 5* -8* -4
2 7 -10* -1
3 9 -3 -6**
4 11*** -1 3*
then I want the rows be ordered for the maximum net value in them (just considering those values with at least one "*"), Independently of the columns. I mean, I expect something like this:
a b c
4 11*** -1 3*
2 7 -10* -1
1 5* -8* -4
3 9 -3 -6**
as you can see it was sorted because the first row has the value of 11***, the second (-10*), the third (-8*), the fourth (-6**). You can notice that in the last row it was omited the 9 in the sorting, that's because we just want to consider those numbers with at least one "*"
CodePudding user response:
One way - get the abs
olute max
among the *
suffixed values, create a order
index based on those and use that to reorder the rows
df[order(-apply(df, 1, FUN = function(x)
max(abs(as.numeric(sub("\\* ", "", x[grep("\\*", x)] )))))),]
-output
a b c
4 11*** -1 3*
2 7 -10* -1
1 5* -8* -4
3 9 -3 -6**
or using tidyverse
, reshape to 'long' format with pivot_longer
, filter
only the elements having *
, arrange
on the numeric part by extraction (parse_number
), slice
the rows of original based on the row number arranged
library(dplyr)
library(stringr)
library(tidyr)
df %>%
mutate(rn = row_number()) %>%
pivot_longer(cols = -rn) %>%
filter(str_detect(value, fixed("*"))) %>%
arrange(desc(abs(readr::parse_number(value)))) %>%
distinct(rn) %>%
pull(rn) %>%
slice(df, .)
-output
a b c
1 11*** -1 3*
2 7 -10* -1
3 5* -8* -4
4 9 -3 -6**