I have a data frame which looks like this:
df <- structure(list(Image = c("low_1_red", "low_1_red", "low_1_red", "low_1_red", "low_1_red", "low_1_red", "low_1_red", "low_1_red","low_1_red", "low_1_red", "low_1_green", "low_1_green", "low_1_green", "low_1_green", "low_1_green", "low_1_green", "low_1_green", "low_1_green", "low_1_green", "low_1_green", "low_1_blue", "low_1_blue", "low_1_blue", "low_1_blue", "low_1_blue", "low_1_blue", "low_1_blue", "low_1_blue", "low_1_blue", "low_1_blue"), Mean = c(232, 233, 233, 233, 97,232, 232, 233, 232, 95, 98, 100, 98, 98, 231, 99, 98, 99, 99, 231, 232, 233, 233, 233, 231, 232, 232, 233, 232, 231), X = c(146, 39, 1129, 1252, 872, 1002, 300, 275, 17, 1521, 146, 39, 1129, 1252, 872, 1002, 300, 275, 17, 1521, 146, 39, 1129, 1252, 872, 1002, 300, 275, 17, 1521), Y = c(1056, 156, 1553, 878, 310, 108, 383, 726, 199, 1069, 1056, 156, 1553, 878, 310, 108, 383, 726, 199, 1069, 1056, 156, 1553, 878, 310, 108, 383, 726, 199, 1069)), row.names = c(NA,-30L), class = "data.frame")
which is the output from image analysis of 10 points (see X and Y coords), overlayed on one image ("low 1"), which has been split into red, green and blue layers. What I want is to match up the red, green and blue output from each point by their X and Y coordinates so I end up with a dataframe that looks like this:
df2 <- structure(list(Image = c("low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue", "low_1_red", "low_1_green", "low_1_blue"), Mean = c(232, 98, 232, 233, 100, 233, 233, 98, 233, 233, 98, 233, 97, 231, 231, 232, 99, 232, 232, 98, 232, 233, 99, 233, 232, 99, 232, 95, 231, 231), X = c(146, 146, 146, 39, 39, 39, 1129, 1129, 1129, 1252, 1252, 1252, 872, 872, 872, 1002, 1002, 1002, 300, 300, 300, 275, 275, 275, 17, 17, 17, 1521, 1521, 1521), Y = c(1056, 1056, 1056, 156, 156, 156, 1553, 1553, 1553, 878, 878, 878, 310, 310, 310, 108, 108, 108, 383, 383, 383, 726, 726, 726, 199, 199, 199, 1069, 1069, 1069)), row.names = c(NA, -30L), class = "data.frame")
I've sorted them manually here but my actual data frame is around 90,000 rows so I'm looking for a quick fix! Any help much appreciated :)
CodePudding user response:
You can use dplyr::arrange
:
dplyr::arrange(df, X,Y, Image)
Output:
Image Mean X Y
1 low_1_blue 232 17 199
2 low_1_green 99 17 199
3 low_1_red 232 17 199
4 low_1_blue 233 39 156
5 low_1_green 100 39 156
6 low_1_red 233 39 156
7 low_1_blue 232 146 1056
8 low_1_green 98 146 1056
9 low_1_red 232 146 1056
10 low_1_blue 233 275 726
11 low_1_green 99 275 726
12 low_1_red 233 275 726
13 low_1_blue 232 300 383
14 low_1_green 98 300 383
15 low_1_red 232 300 383
16 low_1_blue 231 872 310
17 low_1_green 231 872 310
18 low_1_red 97 872 310
19 low_1_blue 232 1002 108
20 low_1_green 99 1002 108
21 low_1_red 232 1002 108
22 low_1_blue 233 1129 1553
23 low_1_green 98 1129 1553
24 low_1_red 233 1129 1553
25 low_1_blue 233 1252 878
26 low_1_green 98 1252 878
27 low_1_red 233 1252 878
28 low_1_blue 231 1521 1069
29 low_1_green 231 1521 1069
30 low_1_red 95 1521 1069
Faster option (you might consider if you have large data) is to set your df
to data.table
and use order
data.table::setDT(df)[order(X,Y,Image)]
Update: add group column:
library(data.table)
result = setDT(df)[order(X,Y,Image)][, grp:=rleid(X,Y)][]
Output:
Image Mean X Y grp
1: low_1_blue 232 17 199 1
2: low_1_green 99 17 199 1
3: low_1_red 232 17 199 1
4: low_1_blue 233 39 156 2
5: low_1_green 100 39 156 2
6: low_1_red 233 39 156 2
7: low_1_blue 232 146 1056 3
8: low_1_green 98 146 1056 3
9: low_1_red 232 146 1056 3
10: low_1_blue 233 275 726 4
11: low_1_green 99 275 726 4
12: low_1_red 233 275 726 4
13: low_1_blue 232 300 383 5
14: low_1_green 98 300 383 5
15: low_1_red 232 300 383 5
16: low_1_blue 231 872 310 6
17: low_1_green 231 872 310 6
18: low_1_red 97 872 310 6
19: low_1_blue 232 1002 108 7
20: low_1_green 99 1002 108 7
21: low_1_red 232 1002 108 7
22: low_1_blue 233 1129 1553 8
23: low_1_green 98 1129 1553 8
24: low_1_red 233 1129 1553 8
25: low_1_blue 233 1252 878 9
26: low_1_green 98 1252 878 9
27: low_1_red 233 1252 878 9
28: low_1_blue 231 1521 1069 10
29: low_1_green 231 1521 1069 10
30: low_1_red 95 1521 1069 10