Total rows and divide by columns of all tables in a list object?

I have a list object containing several tables, each with a year column followed by the frequencies of particular words. The tables may have slightly different dimensions depending on the range of years and the words used.

Year	word1	word2	word3
2009	1	5	4
2010	2	3	5

I would like to create a table that sums every row (not including the year) and then divides each value by its row sum, so that it produces a table like this:

Year	word1	word2	word3
2009	0.1	0.5	0.4
2010	0.2	0.3	0.5

Is there a way to do this for every table in a list object? TIA

CodePudding user response：

Does this work?

cbind(df[1], t(apply(df[-1], 1, function(x) x/sum(x))))
  Year word1 word2 word3
1 2009   0.1   0.5   0.4
2 2010   0.2   0.3   0.5

If you have a list of such data frames:

mylist <- list(df, df)
mylist
[[1]]
  Year word1 word2 word3
1 2009     1     5     4
2 2010     2     3     5

[[2]]
  Year word1 word2 word3
1 2009     1     5     4
2 2010     2     3     5

lapply(mylist, function(y) cbind(y[1], t(apply(y[-1], 1, function(x) x/sum(x)))))
[[1]]
  Year word1 word2 word3
1 2009   0.1   0.5   0.4
2 2010   0.2   0.3   0.5

[[2]]
  Year word1 word2 word3
1 2009   0.1   0.5   0.4
2 2010   0.2   0.3   0.5

Data used:

df
  Year word1 word2 word3
1 2009     1     5     4
2 2010     2     3     5
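
To run the snippets above, df first has to exist. The following is a sketch that rebuilds it by hand from the printed table (the numeric column types are an assumption) and also shows why the t() call is needed: apply() over rows returns one column per input row, so the result has to be transposed back. The names props_raw, props and result are only illustrative.

```r
# Example data rebuilt by hand from the printed table (column types assumed)
df <- data.frame(Year  = c(2009, 2010),
                 word1 = c(1, 2),
                 word2 = c(5, 3),
                 word3 = c(4, 5))

# apply() with MARGIN = 1 works row by row, but stores each row's
# result as a column, so this matrix is words x years (3 x 2)
props_raw <- apply(df[-1], 1, function(x) x / sum(x))
dim(props_raw)

# t() flips it back to one row per year before re-attaching Year
props <- t(props_raw)
result <- cbind(df[1], props)
result
```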

CodePudding user response：

For a single data.frame, you can use the following function:

doit <- function(df) {
  # divide each row of the word columns by that row's total, keep Year as is
  cbind(df[1], sweep(df[-1], 1, rowSums(df[-1]), "/"))
}

e.g.

df <- data.frame(Year = 1:3, Word1 = c(1,2,3), Word2 = c(3,2,1), Word3 = c(6,6,6))
doit(df)
#  Year Word1 Word2 Word3
#1    1   0.1   0.3   0.6
#2    2   0.2   0.2   0.6
#3    3   0.3   0.1   0.6
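
As a possible alternative, base R's prop.table() with margin = 1 should give the same row proportions. This is only a sketch on the same example data; the as.matrix() conversion is a cautious choice here because prop.table() is documented for arrays and tables, and the names props and result are illustrative.

```r
df <- data.frame(Year = 1:3, Word1 = c(1, 2, 3), Word2 = c(3, 2, 1), Word3 = c(6, 6, 6))

# margin = 1 divides every entry by the total of its row
props <- prop.table(as.matrix(df[-1]), margin = 1)

# re-attach the Year column in front of the proportions
result <- cbind(df[1], props)
result
```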

If you have multiple data.frames in a list, apply the function to each element with lapply, e.g. lapply(dfList, doit).
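
Since the question mentions tables of different dimensions, here is a small sketch of the lapply call on a list whose elements differ in years and words. The contents of dfList are made up for illustration; doit is repeated from above so the snippet runs on its own. Note that a row whose counts are all zero would come out as NaN, because 0/0 is NaN in R.

```r
# doit() as defined in the answer above
doit <- function(df) {
  cbind(df[1], sweep(df[-1], 1, rowSums(df[-1]), "/"))
}

# Hypothetical tables with different year ranges and different words
dfList <- list(
  a = data.frame(Year = c(2009, 2010), word1 = c(1, 2), word2 = c(5, 3), word3 = c(4, 5)),
  b = data.frame(Year = 2011:2013, word1 = c(2, 0, 1), word4 = c(2, 4, 3))
)

# each element is processed independently, so the shapes need not match
res <- lapply(dfList, doit)
res$b

# sanity check: the word columns of every row should now sum to 1
lapply(res, function(d) rowSums(d[-1]))
```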