Finding min across multiple dataframes in R

Time:12-03

I have 3 dataframes with the same dimensions. I want to create a dataframe in which each cell holds the minimum of the corresponding cells in the 3 dataframes. Is there a more efficient way than looping over each column and then each row, going through the elements one by one?

Dataframe X
| Column A | Column B |
| -------- | -------- |
| Cell X1   | Cell X2   |
| Cell X3   | Cell X4   |

Dataframe Y
| Column A | Column B |
| -------- | -------- |
| Cell Y1   | Cell Y2   |
| Cell Y3   | Cell Y4   |

Dataframe Z
| Column A | Column B |
| -------- | -------- |
| Cell Z1   | Cell Z2   |
| Cell Z3   | Cell Z4   |

Dataframe Target Output
| Column A | Column B |
| -------- | -------- |
| Min (Cell X1,Y1,Z1)   | Min (Cell X2,Y2,Z2)    |
| Min (Cell X3,Y3,Z3)   | Min (Cell X4,Y4,Z4)   |

Thank you!

I tried a simple loop over each column and then each row: for (c in 1:3){ for (r in 1:2){ ........ } }

CodePudding user response:

You can use the pmin() function from base R to get the element-wise ("parallel") minimum across the three dataframes, one column at a time.

For example:

df_target = data.frame(c1 = pmin(X$ColumnA, Y$ColumnA, Z$ColumnA), 
                       c2 = pmin(X$ColumnB, Y$ColumnB, Z$ColumnB))
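Since the question has no sample data, here is a minimal sketch of the same idea on made-up values. The column names A and B and all numbers are assumptions for illustration only. It also shows the na.rm argument of pmin(), which may be useful if the real data contains missing values.

```r
# Hypothetical sample data; names and values are assumptions, not from the question.
X <- data.frame(A = c(1, 5), B = c(7, NA))
Y <- data.frame(A = c(3, 4), B = c(6, 8))
Z <- data.frame(A = c(2, 9), B = c(9, 1))

# pmin() compares its arguments position by position.
# na.rm = TRUE skips missing values instead of propagating NA.
df_target <- data.frame(
  A = pmin(X$A, Y$A, Z$A, na.rm = TRUE),
  B = pmin(X$B, Y$B, Z$B, na.rm = TRUE)
)
```

Without na.rm = TRUE, any position where one of the inputs is NA would come back as NA.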

Or you can use the apply() function to apply min() to each row of the combined columns (note min, not pmin: pmin on a single row would return the row unchanged):

df_target = data.frame(c1 = apply(cbind(X$ColumnA, Y$ColumnA, Z$ColumnA), 1, min), 
                       c2 = apply(cbind(X$ColumnB, Y$ColumnB, Z$ColumnB), 1, min))

CodePudding user response:

You could loop over the columns and use pmin:

as.data.frame(lapply(1:ncol(X), \(i) pmin(X[[i]], Y[[i]], Z[[i]])))

If you need to generalize to more data frames, I would put them in a list and use do.call with the pmin.
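One way the list plus do.call idea could look is sketched below. The sample data and the name dfs are assumptions for illustration. Map() walks over the columns of all data frames in parallel, and do.call() lets the list hold any number of data frames.

```r
# Hypothetical sample data; names and values are assumptions.
X <- data.frame(A = c(1, 5), B = c(7, 2))
Y <- data.frame(A = c(3, 4), B = c(6, 8))
Z <- data.frame(A = c(2, 9), B = c(9, 1))

# Collect the data frames in a list; more can be appended freely.
dfs <- list(X, Y, Z)

# Equivalent to Map(pmin, X, Y, Z): pmin is applied to the first
# columns of all data frames, then to the second columns, and so on.
result <- as.data.frame(do.call(Map, c(f = pmin, dfs)))
```

The result keeps the column names of the first data frame in the list, so all data frames are assumed to share the same column order.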

(Answer is untested as no sample data is provided.)

The more general solution would be to stack the data frames into a 3-d array (here with abind() from the abind package) and use apply:

library(abind)
arr = abind(X, Y, Z, along = 3)
result = apply(arr, 1:2, min)
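If installing an extra package is not an option, the same 3-d array can be built with base R only. This is a sketch under the assumption that all data frames are numeric and share the same dimensions and column order; the sample data and the name mats are made up.

```r
# Hypothetical sample data; names and values are assumptions.
X <- data.frame(A = c(1, 5), B = c(7, 2))
Y <- data.frame(A = c(3, 4), B = c(6, 8))
Z <- data.frame(A = c(2, 9), B = c(9, 1))

# Convert each data frame to a matrix, then stack them along a third dimension.
mats <- lapply(list(X, Y, Z), as.matrix)
arr <- array(
  unlist(mats),
  dim = c(nrow(X), ncol(X), length(mats)),
  dimnames = list(NULL, names(X), NULL)
)

# Take the minimum over the third dimension for every (row, column) cell.
result <- as.data.frame(apply(arr, 1:2, min))
```

Swapping min for another summary function such as max or median gives the corresponding cell-wise statistic, which is where the array approach is more flexible than pmin.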

CodePudding user response:

Other options, assuming the dataframes are collected in a list such as df_list = list(X, Y, Z):

Tidyverse:

library(purrr)
map_dfc(transpose(df_list), lift(pmin))

data.table:

library(data.table)
rbindlist(df_list, idcol = "group")[,lapply(.SD,min),rowid(group)]

Base R:

aggregate(do.call(rbind,df_list), list(sequence(sapply(df_list, nrow))), min)

EDIT:

Base R approach:

Reduce(function(x, y) replace(x, x > y, y[x > y]), df_list)