Home > Back-end >  create a joint frequency distribution table in r
create a joint frequency distribution table in r

Time:09-22

I am trying to create in R a joint frequency distribution table between two continuous variables, electricity price and renewable energy production, for certain ranges, using for instance 10 20-euro ranges for price and 10 10.000-MWh ranges for renewable electricity production. My data looks like this:

structure(list(DATE = structure(c(18628, 18629, 18630, 18631, 
18632, 18633, 18634, 18635, 18636, 18637, 18638, 18639, 18640, 
18641, 18642, 18643, 18644, 18645, 18646, 18647), class = "Date"), 
    price = c(45.5804166666667, 47.11125, 43.4683333333333, 47.4579166666667, 
    50.8945833333333, 52.84375, 44.64375, 55.7233333333333, 51.64125, 
    51.9604166666667, 51.6475, 52.4683333333333, 67.7625, 46.1591666666667, 
    63.68875, 50.955, 53.9758333333333, 66.94625, 71.73, 63.4458333333333
    ), RES = c(23534.08, 21286.57, 42159.53, 45660.36, 36967.42, 
    33727.61, 45660.92, 59931.54, 48146.49, 48424.06, 60207.69, 
    45786.85, 38909.96, 49843.98, 43003.73, 41491, 41655.27, 
    42133.29, 31207.87, 36045.62)), row.names = c(NA, 20L), class = "data.frame") 

I haven't been able to construct this kind of table with certain frequency packages, does anyone have any idea on how to do this? Thank you in advance.

CodePudding user response:

Data:

df <- structure(list(DATE = structure(c(18628, 18629, 18630, 18631, 
                                        18632, 18633, 18634, 18635, 18636, 18637, 18638, 18639, 18640, 
                                        18641, 18642, 18643, 18644, 18645, 18646, 18647), class = "Date"), 
                     price = c(45.5804166666667, 47.11125, 43.4683333333333, 47.4579166666667, 
                               50.8945833333333, 52.84375, 44.64375, 55.7233333333333, 51.64125, 
                               51.9604166666667, 51.6475, 52.4683333333333, 67.7625, 46.1591666666667, 
                               63.68875, 50.955, 53.9758333333333, 66.94625, 71.73, 63.4458333333333
                     ), RES = c(23534.08, 21286.57, 42159.53, 45660.36, 36967.42, 
                                33727.61, 45660.92, 59931.54, 48146.49, 48424.06, 60207.69, 
                                45786.85, 38909.96, 49843.98, 43003.73, 41491, 41655.27, 
                                42133.29, 31207.87, 36045.62)), row.names = c(NA, 20L), class = "data.frame") 

Code:

# check min and max of your respective vectors price and RES

# use cut to create ranges 
df$priceRanges <- cut(df$price, seq(40, 80, by = 10))
df$RESRanges <- cut(df$RES, seq(20000, 70000, by = 10000))

# table the created variables with relative frequencies
prop.table(table(df[, c("priceRanges", "RESRanges")]))

## 1
# export as txt with rownames to working directory
new <- prop.table(table(df[, c("priceRanges", "RESRanges")]))
write.table(new, file = "new.txt", sep = ",", quote = FALSE, row.names = TRUE)

## 2
# export as csv ...
write.csv2(new, file = "new.csv", quote = FALSE, row.names = TRUE)
  • Related