I am working on some data in R
. For reproducibility, the data is as follows:
month,source,amount1,amount2,total
jan,central,200,400,600
jan,tax,100,200,300
jan,fines,100,200,300
jan,east,150,50,200
jan,tax,100,25,125
jan,fine,0,75,75
jan,levies,0,0,0
Jan,tithe,0,0,0
Note that the amount for central
is the sum of tax
and fines
. I want to rearrange the data by adding a column that will hold the term central, as follows.
month,source,amount1,amount2,total,new_column
jan,tax,100,200,300,central
jan,fines,100,200,300,central
jan,tax,100,25,125,east
jan,fine,0,75,75,east
jan,levies,0,0,0,east
Jan,tithe,0,0,0,east
I appreciate any help. Note that the rows are not uniform.
CodePudding user response:
I recommend that you change the source data, if possible. Fines, tax and central or east seem like data that should not be in the same column. Central or East probably represents location data while Fines and Tax are the source of income.
CodePudding user response:
Assuming there is a pattern of row order: "source", "tax", "fine", we can split every 3 rows, remove 1st row, add new column from the first row "source" column, then rbind the result:
do.call(rbind,
lapply(split(x, rep(1:(nrow(x)/3), each = 3)), function(i){
res <- i[2:nrow(i), ]
res$new_column <- i[1, "source" ]
res
}))
# month source amount1 amount2 total new_column
# 1.2 jan tax 100 200 300 central
# 1.3 jan fines 100 200 300 central
# 2.5 jan tax 100 25 125 east
# 2.6 jan fine 0 75 75 east