I have two data frames
The first one: AEZ_2
head(AEZ_2)
gid AEZ AEZ_area_sum
1 142976 Tropics, lowland; semi-arid 18.20585
2 142977 Tropics, lowland; semi-arid 924.40126
3 142978 Tropics, lowland; semi-arid 509.15215
4 142978 Tropics, lowland; sub-humid 11.47290
5 143696 Tropics, lowland; semi-arid 858.59022
6 143697 Dominantly hydromorphic soils 589.91021
dput(AEZ_2[1:6, c(1: 3)])
structure(list(gid = c(142976, 142977, 142978, 142978, 143696,
143697), AEZ = c("Tropics, lowland; semi-arid", "Tropics, lowland; semi-arid",
"Tropics, lowland; semi-arid", "Tropics, lowland; sub-humid",
"Tropics, lowland; semi-arid", "Dominantly hydromorphic soils"
), AEZ_area_sum = c(18.2058489094, 924.401258895, 509.152149209,
11.4728955973, 858.590216109, 589.9102080814)), row.names = c(NA,
6L), class = "data.frame")
The second one: Farming system
gid Farming_system FS_area_sum
1 142976 5. Cereal-root crop mixed farming system 1.820585e 01
2 142977 5. Cereal-root crop mixed farming system 9.244013e 02
3 142978 5. Cereal-root crop mixed farming system 5.206250e 02
4 143696 2. Agropastoral farming system 6.979757e 02
5 143696 5. Cereal-root crop mixed farming system 1.606145e 02
6 143697 2. Agropastoral farming system 2.107575e 03
dput(FS_2[1:5, c(1:3)])
structure(list(gid = c(142976, 142977, 142978, 143696, 143696
), Farming_system = structure(c(9L, 9L, 9L, 6L, 9L), .Label = c("1. Maize mixed farming system",
"10. Forest-based farming system", "11. Large-scale irrigated farming system",
"12. Perennial mixed farming system", "13. Arid pastoral oasis farming system",
"2. Agropastoral farming system", "3. Highland perennial farming system",
"4. Root and tuber crop farming system", "5. Cereal-root crop mixed farming system",
"6. Highland mixed farming system", "7. Humid lowland tree crop farming system",
"8. Pastoral farming system", "9. Fish-based farming system"), class = "factor"),
FS_area_sum = c(18.205849004, 924.40125911, 520.625044495,
697.975740616, 160.614476324)), row.names = c(NA, 5L), class = "data.frame")
I merged the two data frames: Final.27
head(Final.27)
gid Farming_system FS_area_sum AEZ AEZ_area_sum
1 62356 <NA> NA Land with severe soil/terrain limitations 334.9770
2 79599 9. Fish-based farming system 198.0554185 Sub-tropics, moderately cool; sub-humid 74.4029
3 79599 9. Fish-based farming system 198.0554185 Land with severe soil/terrain limitations 123.7758
4 79599 12. Perennial mixed farming system 0.1306899 Sub-tropics, moderately cool; sub-humid 74.4029
5 79599 12. Perennial mixed farming system 0.1306899 Land with severe soil/terrain limitations 123.7758
6 79600 9. Fish-based farming system 603.2818466 Land with severe soil/terrain limitations 141.4297
dput(Final.27[1:5, c(1:5)])
structure(list(gid = c(62356, 79599, 79599, 79599, 79599), Farming_system = structure(c(NA,
13L, 13L, 4L, 4L), .Label = c("1. Maize mixed farming system",
"10. Forest-based farming system", "11. Large-scale irrigated farming system",
"12. Perennial mixed farming system", "13. Arid pastoral oasis farming system",
"2. Agropastoral farming system", "3. Highland perennial farming system",
"4. Root and tuber crop farming system", "5. Cereal-root crop mixed farming system",
"6. Highland mixed farming system", "7. Humid lowland tree crop farming system",
"8. Pastoral farming system", "9. Fish-based farming system"), class = "factor"),
FS_area_sum = c(NA, 198.0554184899, 198.0554184899, 0.130689938047,
0.130689938047), AEZ = c("Land with severe soil/terrain limitations",
"Sub-tropics, moderately cool; sub-humid", "Land with severe soil/terrain limitations",
"Sub-tropics, moderately cool; sub-humid", "Land with severe soil/terrain limitations"
), AEZ_area_sum = c(334.9769749362, 74.4028953581, 123.77575431,
74.4028953581, 123.77575431)), row.names = c(NA, 5L), class = "data.frame")
They have in common the 'gid' column (=cell identifier of 0.5x0.5). So a gid can have multiple observations (=polygons). That is why we can see that some gid are duplicated. In addition, sometimes the gid is fulfilled by an AEZ and not the farming system. Which is fine for me to have a NA observations (3rd column).
What I would like is a final table where the AEZ column is pivoted and filled with AEZ_area_sum and the FS column is pivoted and filled with FS_area_sum. In order to have many observations by gid and not a gid duplicated according to the different observations.
I tried to do this
First.flanked <- Final.27 %>%
pivot_wider(names_from = AEZ, values_from = AEZ_area_sum)
and then do it a second time for the farming system
Final.flanked <- First.flanked %>%
pivot_wider(names_from = Farming_system, value_from = FS_area_sum)
but I get:
Erreur : Failed to create output due to bad names.
- Choose another strategy with
names_repair
Runrlang::last_error()
to see where the error occurred. De plus : Message d'avis : Values are not uniquely identified; output will contain list-cols. - Use
values_fn = list
to suppress this warning. - Use
values_fn = length
to identify where the duplicates arise - Use
values_fn = {summary_fun}
to summarise duplicates
I know it's not working because of the duplicated gid. But how can I get the results I want ? to say => One gid for multiple observations within the same row ?
CodePudding user response:
Final.flanked <- Final.27 %>%
pivot_wider(id_cols = gid, names_from = c(Farming_system, AEZ), values_from = c(AEZ_area_sum, FS_area_sum))
View(Final.flanked)
I'm not sure you want all these columns, but it's easy to eliminate the unneeded ones from here.