Difficulty getting frequency/proportion output using prop.table and merge for numeric and categorica

Time:06-09

I have data on plant species cover at site and plot level which looks like this:

   SITE    PLOT   SPECIES   AREA
    1         1        A    0.3
    1         1        B    25.5
    1         1        C    1.0
    1         2        A    0.3
    1         2        C    0.3
    1         2        D    0.3
    2         1        B    17.9
    2         1        C    131.2
    2         2        A    37.3
    2         2        C    0.3
    2         3        A    5.3
    2         3        D    0.3

I have successfully used the following code to obtain percentage values for each species within each site:

dfnew <- merge(df1, prop.table(xtabs(AREA ~ SPECIES + SITE, df1), 2)*100)

I am now trying to find the relative proportion of each species within each plot (as a percentage of all species in the plot), with a desired output like the one below:

SITE    PLOT    SPECIES  AREA   Plot-freq
1         1       A       0.3    1.06
1         1       B       25.5   95.39
1         1       C       1.0    3.56
1         2       A       0.3    33.33
1         2       C       0.3    33.33
1         2       D       0.3    33.33
2         1       B       17.9   12.02
2         1       C       131.2  87.98
2         2       A       37.3   99.25
2         2       C       0.3    0.75
2         3       A       5.3    94.94
2         3       D       0.3    5.06

I tried adding the PLOT variable to the original code, but ended up with tiny values:

a <- merge(df1, prop.table(xtabs(AREA ~ SPECIES + PLOT + SITE, woods2), 2)*100)

I have been looking at similar questions, but most of them use differently structured data, and none of the solutions seem to work for me. Any help is much appreciated.

data

> dput(df1)
structure(list(SITE = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2), 
    PLOT = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 3, 3), SPECIES = c("A", 
    "B", "C", "A", "C", "D", "B", "C", "A", "C", "A", "D"), AREA = c(0.3, 
    25.5, 1, 0.3, 0.3, 0.3, 17.9, 131.2, 37.3, 0.3, 5.3, 0.3)), class = "data.frame", row.names = c(NA, 
-12L))

CodePudding user response:

I'm not sure I completely understand your calculation, but I believe you can do this:

library(dplyr)
df1 %>% group_by(SITE, PLOT) %>%  mutate(Plot_freq = AREA/sum(AREA))

Output:

    SITE  PLOT SPECIES  AREA Plot_freq
   <dbl> <dbl> <chr>   <dbl>     <dbl>
 1     1     1 A         0.3   0.0112 
 2     1     1 B        25.5   0.951  
 3     1     1 C         1     0.0373 
 4     1     2 A         0.3   0.333  
 5     1     2 C         0.3   0.333  
 6     1     2 D         0.3   0.333  
 7     2     1 B        17.9   0.120  
 8     2     1 C       131.    0.880  
 9     2     2 A        37.3   0.992  
10     2     2 C         0.3   0.00798
11     2     3 A         5.3   0.946  
12     2     3 D         0.3   0.0536 
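
The output above is a proportion between 0 and 1, whereas the desired output in the question is a percentage. A minimal sketch of the same grouped calculation scaled to percent follows; the rounding to two decimals and the name `result` are assumptions added here for illustration, not part of the original answer.

```r
library(dplyr)

# Example data, copied from the dput() output in the question
df1 <- data.frame(
  SITE    = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  PLOT    = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 3, 3),
  SPECIES = c("A", "B", "C", "A", "C", "D", "B", "C", "A", "C", "A", "D"),
  AREA    = c(0.3, 25.5, 1, 0.3, 0.3, 0.3, 17.9, 131.2, 37.3, 0.3, 5.3, 0.3)
)

# Share of each species within its SITE/PLOT combination, as a percentage
result <- df1 %>%
  group_by(SITE, PLOT) %>%
  mutate(Plot_freq = round(AREA / sum(AREA) * 100, 2)) %>%
  ungroup()

result
```

Within each SITE/PLOT group the percentages should add up to 100, up to rounding.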

CodePudding user response:

Merging with the prop.table is a very interesting idea! I also had no luck modifying your approach, though.

However, if you want to avoid dplyr, you can use ave to calculate the plot sums and then pipe (|>) the result into a second transform to calculate the relative areas, like so:

transform(df1, Psum=ave(AREA, SITE, PLOT, FUN=sum)) |> transform(Plot_freq=AREA/Psum*100)
#    SITE PLOT SPECIES  AREA  Psum  Plot_freq
# 1     1    1       A   0.3  26.8  1.1194030
# 2     1    1       B  25.5  26.8 95.1492537
# 3     1    1       C   1.0  26.8  3.7313433
# 4     1    2       A   0.3   0.9 33.3333333
# 5     1    2       C   0.3   0.9 33.3333333
# 6     1    2       D   0.3   0.9 33.3333333
# 7     2    1       B  17.9 149.1 12.0053655
# 8     2    1       C 131.2 149.1 87.9946345
# 9     2    2       A  37.3  37.6 99.2021277
# 10    2    2       C   0.3  37.6  0.7978723
# 11    2    3       A   5.3   5.6 94.6428571
# 12    2    3       D   0.3   5.6  5.3571429

Note: the native pipe |> requires R >= 4.1.
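
As for the tiny values in the question, a likely cause is the margin argument. In a three-way table built as SPECIES + PLOT + SITE, a margin of 2 normalises over PLOT alone, so each value is divided by the total for that plot number pooled across all sites. Passing both PLOT and SITE as margins should give the per-plot shares instead. The sketch below rests on that assumption; the names `tab`, `pct` and `pct_df` are illustrative, and the key columns are converted back to numeric so that they match `df1` before merging.

```r
# Example data, copied from the dput() output in the question
df1 <- data.frame(
  SITE    = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  PLOT    = c(1, 1, 1, 2, 2, 2, 1, 1, 2, 2, 3, 3),
  SPECIES = c("A", "B", "C", "A", "C", "D", "B", "C", "A", "C", "A", "D"),
  AREA    = c(0.3, 25.5, 1, 0.3, 0.3, 0.3, 17.9, 131.2, 37.3, 0.3, 5.3, 0.3)
)

# Three-way table of summed AREA: dimension 1 = SPECIES, 2 = PLOT, 3 = SITE
tab <- xtabs(AREA ~ SPECIES + PLOT + SITE, df1)

# Normalise within each PLOT x SITE cell (margins 2 and 3), then scale to percent.
# PLOT/SITE combinations that do not occur have a zero total and yield NaN.
pct <- prop.table(tab, margin = c(2, 3)) * 100

# Long format; keep the keys as character, then restore the numeric keys
pct_df <- as.data.frame(pct, responseName = "Plot_freq", stringsAsFactors = FALSE)
pct_df$PLOT <- as.numeric(pct_df$PLOT)
pct_df$SITE <- as.numeric(pct_df$SITE)

# Inner join on SITE, PLOT and SPECIES keeps only the rows present in df1
dfnew <- merge(df1, pct_df)
dfnew
```

The merge drops the combinations that are absent from `df1`, including the NaN cells, so `dfnew` keeps the original 12 rows with a `Plot_freq` column added.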
