I have a data table that includes two columns (crop and plotnum) that identify a group (crop) and elements of the group (plotnum). I want to calculate the ratio of the value of subgroups within a group to the value of one of the subgroups in the group (e.g. plot_4). The idea is to compare the performance of the subgroup within the group.
Here's a sample data table.
DT <- data.table(value = c(71.03528, 212.1139, 91.72795, 37.65809, 194.7215, 334.2902, 371.2177, 122.9278, 218.7671, 552.3567, 193.3274, 147.1677, 197.4963, 194.9721, 156.7755, 184.1638),
crop = c("BellPepper", "BellPepper", "BellPepper", "BellPepper", "ChiliPepper", "ChiliPepper", "ChiliPepper", "ChiliPepper", "Eggplant", "Eggplant", "Eggplant", "Eggplant", "Kale", "Kale", "Kale", "Kale"),
plotnum = c("plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4"
))
One way to do this would be to duplicate the plot_4 value for the groups in a new column and then calculate the ratio of the value column to the plot_4 column but I can't figure out how to grab the subgroup value and fill just the cells of the group in the plot_4 column. Other approaches also welcome!
CodePudding user response:
We can subset based on the subgroup by group
library(data.table)
DT[, ratio := value/value[plotnum == 'plot_4'], by = crop]
CodePudding user response:
An extension is if you want to compare each row with all unique plotnum
:
DT[, (unique(DT$plotnum)) :=
lapply(setNames(nm=unique(plotnum)), function(pn) value/value[plotnum == pn]),
by = crop]
# value crop plotnum plot_1 plot_2 plot_3 plot_4
# <num> <char> <char> <num> <num> <num> <num>
# 1: 71.03528 BellPepper plot_1 1.0000000 0.3348921 0.7744126 1.8863219
# 2: 212.11390 BellPepper plot_2 2.9860360 1.0000000 2.3124239 5.6326250
# 3: 91.72795 BellPepper plot_3 1.2913013 0.4324467 1.0000000 2.4358099
# 4: 37.65809 BellPepper plot_4 0.5301322 0.1775371 0.4105411 1.0000000
# 5: 194.72150 ChiliPepper plot_1 1.0000000 0.5824924 0.5245480 1.5840314
# 6: 334.29020 ChiliPepper plot_2 1.7167606 1.0000000 0.9005233 2.7194028
# 7: 371.21770 ChiliPepper plot_3 1.9064032 1.1104654 1.0000000 3.0198027
# 8: 122.92780 ChiliPepper plot_4 0.6313006 0.3677278 0.3311475 1.0000000
# 9: 218.76710 Eggplant plot_1 1.0000000 0.3960613 1.1315887 1.4865157
# 10: 552.35670 Eggplant plot_2 2.5248618 1.0000000 2.8571051 3.7532468
# 11: 193.32740 Eggplant plot_3 0.8837133 0.3500046 1.0000000 1.3136537
# 12: 147.16770 Eggplant plot_4 0.6727140 0.2664360 0.7612356 1.0000000
# 13: 197.49630 Kale plot_1 1.0000000 1.0129465 1.2597396 1.0723948
# 14: 194.97210 Kale plot_2 0.9872190 1.0000000 1.2436388 1.0586885
# 15: 156.77550 Kale plot_3 0.7938149 0.8040920 1.0000000 0.8512829
# 16: 184.16380 Kale plot_4 0.9324924 0.9445649 1.1746976 1.0000000
Though this is only useful if you intend to compare one with all other plotnums. If you don't need the extension of all plotnums, then the inner method is based on the other answer, so that one should be accepted instead.