I'd like to reorganize a data frame and Imake:
# Open my dataset
res_F <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/ds_bandsIVs.csv")
str(res_F)
# 'data.frame': 16 obs. of 7 variables:
# $ TILE: chr "T22KBC" "T22KBC" "T22KBC" "T22KBC" ...
# $ DATE: chr "2022-04-20" "2022-04-20" "2022-04-20" "2022-04-20" ...
# $ BAND: chr "B2" "B3" "B4" "B8" ...
# $ MEAN: num 1205 1332 1235 3770 1205 ...
# $ SD : num 38.2 63.8 104.7 283.7 38.1 ...
# $ MIN : int 1141 1217 1135 2638 1142 1217 1135 2636 1111 1222 ...
# $ MAX : int 1494 1800 2218 4928 1496 1803 2218 4924 1456 1786 ...
Here, I,d like to create a new data frame with BANDS
in the columns and mean
, sd
, min
, max
in the rows, in a new column STATS
.
For this, I try:
res_F%>%spread(res_F, key = BAND, value = c(MEAN,SD,MIN,MAX))
Error:
! Must extract column with a single valid subscript.
x Subscript `var` has size 4 but must be size 1.
Run `rlang::last_error()` to see where the error occurred.
Doesn't work and if I try with one variable (eg. MEAN
) it not works too:
res_F%>%spread(key = BAND, value = MEAN)
# TILE DATE SD MIN MAX B2 B3 B4 B8
# 1 T22KBC 2022-04-20 38.19472 1141 1494 1205.221 NA NA NA
# 2 T22KBC 2022-04-20 63.80129 1217 1800 NA 1332.218 NA NA
# 3 T22KBC 2022-04-20 104.72500 1135 2218 NA NA 1234.708 NA
# 4 T22KBC 2022-04-20 283.66055 2638 4928 NA NA NA 3770.484
# 5 T22KBC 2022-04-25 34.81989 1111 1456 1182.414 NA NA NA
# 6 T22KBC 2022-04-25 59.19710 1222 1786 NA 1333.160 NA NA
# 7 T22KBC 2022-04-25 93.12933 1152 2236 NA NA 1239.121 NA
# 8 T22KBC 2022-04-25 274.51874 2554 4748 NA NA NA 3646.049
# 9 T22KBD 2022-04-20 38.11758 1142 1496 1205.348 NA NA NA
# 10 T22KBD 2022-04-20 63.74426 1217 1803 NA 1331.933 NA NA
# 11 T22KBD 2022-04-20 104.72817 1135 2218 NA NA 1234.931 NA
# 12 T22KBD 2022-04-20 283.34937 2636 4924 NA NA NA 3770.361
# 13 T22KBD 2022-04-25 35.06260 1106 1454 1177.922 NA NA NA
# 14 T22KBD 2022-04-25 59.53104 1217 1784 NA 1329.845 NA NA
# 15 T22KBD 2022-04-25 93.62821 1149 2238 NA NA 1236.762 NA
# 16 T22KBD 2022-04-25 275.44175 2550 4748 NA NA NA 3647.263
I need some help here. My desirable output is:
# TILE DATE B2 B3 B4 B8 STATS
# 1 T22KBC 2022-04-20 1205.221 1332.218 1234.708 3770.484 mean
# 1 T22KBC 2022-04-20 38.19472 63.80129 104.72500 283.66055 sd
# ...
Please, any help with it?
CodePudding user response:
You can do this with pivot_longer()
, followed by pivot_wider()
, or, equivalently with melt()
, followed by dcast()
- with
tidyverse
library(tidyverse)
res_F %>%
pivot_longer(cols = MEAN:MAX,names_to = "STATS") %>%
pivot_wider(id_cols = c(TILE,DATE,STATS),names_from = BAND,values_from = value)
Output:
TILE DATE STATS B2 B3 B4 B8
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 T22KBC 2022-04-20 MEAN 1205. 1332. 1235. 3770.
2 T22KBC 2022-04-20 SD 38.2 63.8 105. 284.
3 T22KBC 2022-04-20 MIN 1141 1217 1135 2638
4 T22KBC 2022-04-20 MAX 1494 1800 2218 4928
5 T22KBD 2022-04-20 MEAN 1205. 1332. 1235. 3770.
6 T22KBD 2022-04-20 SD 38.1 63.7 105. 283.
7 T22KBD 2022-04-20 MIN 1142 1217 1135 2636
8 T22KBD 2022-04-20 MAX 1496 1803 2218 4924
9 T22KBC 2022-04-25 MEAN 1182. 1333. 1239. 3646.
10 T22KBC 2022-04-25 SD 34.8 59.2 93.1 275.
11 T22KBC 2022-04-25 MIN 1111 1222 1152 2554
12 T22KBC 2022-04-25 MAX 1456 1786 2236 4748
13 T22KBD 2022-04-25 MEAN 1178. 1330. 1237. 3647.
14 T22KBD 2022-04-25 SD 35.1 59.5 93.6 275.
15 T22KBD 2022-04-25 MIN 1106 1217 1149 2550
16 T22KBD 2022-04-25 MAX 1454 1784 2238 4748
- with
data.table
:
library(data.table)
dcast(
melt(setDT(res_F),measure.vars = c("MEAN", "SD", "MIN","MAX"),variable.name = "STATS"),
TILE DATE STATS~BAND,
value.var="value"
)
Output:
TILE DATE STATS B2 B3 B4 B8
<char> <char> <fctr> <num> <num> <num> <num>
1: T22KBC 2022-04-20 MEAN 1205.22129 1332.21834 1234.70774 3770.4842
2: T22KBC 2022-04-20 SD 38.19472 63.80129 104.72500 283.6606
3: T22KBC 2022-04-20 MIN 1141.00000 1217.00000 1135.00000 2638.0000
4: T22KBC 2022-04-20 MAX 1494.00000 1800.00000 2218.00000 4928.0000
5: T22KBC 2022-04-25 MEAN 1182.41399 1333.15975 1239.12132 3646.0489
6: T22KBC 2022-04-25 SD 34.81989 59.19710 93.12933 274.5187
7: T22KBC 2022-04-25 MIN 1111.00000 1222.00000 1152.00000 2554.0000
8: T22KBC 2022-04-25 MAX 1456.00000 1786.00000 2236.00000 4748.0000
9: T22KBD 2022-04-20 MEAN 1205.34830 1331.93288 1234.93142 3770.3607
10: T22KBD 2022-04-20 SD 38.11758 63.74426 104.72817 283.3494
11: T22KBD 2022-04-20 MIN 1142.00000 1217.00000 1135.00000 2636.0000
12: T22KBD 2022-04-20 MAX 1496.00000 1803.00000 2218.00000 4924.0000
13: T22KBD 2022-04-25 MEAN 1177.92235 1329.84519 1236.76162 3647.2630
14: T22KBD 2022-04-25 SD 35.06260 59.53104 93.62821 275.4417
15: T22KBD 2022-04-25 MIN 1106.00000 1217.00000 1149.00000 2550.0000
16: T22KBD 2022-04-25 MAX 1454.00000 1784.00000 2238.00000 4748.0000