Home > front end >  r data.table calculate ratio of value column to value identified by another column
r data.table calculate ratio of value column to value identified by another column

Time:10-23

I have a data table that includes two columns (crop and plotnum) that identify a group (crop) and elements of the group (plotnum). I want to calculate the ratio of the value of subgroups within a group to the value of one of the subgroups in the group (e.g. plot_4). The idea is to compare the performance of the subgroup within the group.

Here's a sample data table.

DT <- data.table(value = c(71.03528, 212.1139, 91.72795, 37.65809, 194.7215, 334.2902, 371.2177, 122.9278, 218.7671, 552.3567, 193.3274, 147.1677, 197.4963, 194.9721, 156.7755, 184.1638), 
                 crop = c("BellPepper", "BellPepper", "BellPepper", "BellPepper", "ChiliPepper", "ChiliPepper", "ChiliPepper", "ChiliPepper", "Eggplant", "Eggplant", "Eggplant", "Eggplant", "Kale", "Kale", "Kale", "Kale"),
                 plotnum = c("plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4", "plot_1", "plot_2", "plot_3", "plot_4"
                 ))

One way to do this would be to duplicate the plot_4 value for the groups in a new column and then calculate the ratio of the value column to the plot_4 column but I can't figure out how to grab the subgroup value and fill just the cells of the group in the plot_4 column. Other approaches also welcome!

CodePudding user response:

We can subset based on the subgroup by group

library(data.table)
DT[, ratio := value/value[plotnum == 'plot_4'], by = crop]

CodePudding user response:

An extension is if you want to compare each row with all unique plotnum:

DT[, (unique(DT$plotnum)) :=
    lapply(setNames(nm=unique(plotnum)), function(pn) value/value[plotnum == pn]), 
    by = crop]
#         value        crop plotnum    plot_1    plot_2    plot_3    plot_4
#         <num>      <char>  <char>     <num>     <num>     <num>     <num>
#  1:  71.03528  BellPepper  plot_1 1.0000000 0.3348921 0.7744126 1.8863219
#  2: 212.11390  BellPepper  plot_2 2.9860360 1.0000000 2.3124239 5.6326250
#  3:  91.72795  BellPepper  plot_3 1.2913013 0.4324467 1.0000000 2.4358099
#  4:  37.65809  BellPepper  plot_4 0.5301322 0.1775371 0.4105411 1.0000000
#  5: 194.72150 ChiliPepper  plot_1 1.0000000 0.5824924 0.5245480 1.5840314
#  6: 334.29020 ChiliPepper  plot_2 1.7167606 1.0000000 0.9005233 2.7194028
#  7: 371.21770 ChiliPepper  plot_3 1.9064032 1.1104654 1.0000000 3.0198027
#  8: 122.92780 ChiliPepper  plot_4 0.6313006 0.3677278 0.3311475 1.0000000
#  9: 218.76710    Eggplant  plot_1 1.0000000 0.3960613 1.1315887 1.4865157
# 10: 552.35670    Eggplant  plot_2 2.5248618 1.0000000 2.8571051 3.7532468
# 11: 193.32740    Eggplant  plot_3 0.8837133 0.3500046 1.0000000 1.3136537
# 12: 147.16770    Eggplant  plot_4 0.6727140 0.2664360 0.7612356 1.0000000
# 13: 197.49630        Kale  plot_1 1.0000000 1.0129465 1.2597396 1.0723948
# 14: 194.97210        Kale  plot_2 0.9872190 1.0000000 1.2436388 1.0586885
# 15: 156.77550        Kale  plot_3 0.7938149 0.8040920 1.0000000 0.8512829
# 16: 184.16380        Kale  plot_4 0.9324924 0.9445649 1.1746976 1.0000000

Though this is only useful if you intend to compare one with all other plotnums. If you don't need the extension of all plotnums, then the inner method is based on the other answer, so that one should be accepted instead.

  • Related