R count by group / Loop function and output to csv

I have a data frame containing user data:

age = c(45, 21, 32, 33, 46)
gender = c('female', 'female', 'male', 'male', 'female')
income = c('low', 'low', 'medium', 'high', 'low')
education = c('high', 'high', 'high', 'medium', 'medium')

df = data.frame(age, gender, income, education)

From this I would like to obtain a list with a count and share of total for every attribute, which I would then append to a table / CSV file. The output should be legible for further use rather than be a functioning data frame. For one attribute that would be something like this:

library(dplyr)
nusers = nrow(df)
stat = count(df, gender)
stat$sot = stat$n / nusers
write.table(stat, 'stat.csv', sep = ';', row.names = FALSE, append = TRUE)

With the following result desired for multiple attributes:

gender,n,sot
female,10,0.526315789
male,9,0.473684211
income,Freq,sot
low,4,0.210526316
medium,10,0.526315789
high,5,0.263157895
education,Freq,sot
low,8,0.421052632
medium,1,0.052631579
high,10,0.526315789

My (not very proficient) attempts to put this into a loop failed. How would I best go about this?

CodePudding user response：

You can use sink() for this:

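library(dplyr)
# One summary per attribute: count and share of total
n_gen <- df %>% group_by(gender) %>% summarise(Freq = n(), sot = n()/nrow(df))
n_inc <- df %>% group_by(income) %>% summarise(Freq = n(), sot = n()/nrow(df))
n_edu <- df %>% group_by(education) %>% summarise(Freq = n(), sot = n()/nrow(df))

# Redirect console output to the file
sink('export.csv')

# With no file argument, write.csv prints to the console, which sink() captures
write.csv(n_gen, row.names = FALSE)
write.csv(n_inc, row.names = FALSE)
write.csv(n_edu, row.names = FALSE)

# Stop redirecting
sink()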

You could shorten this by writing it as a for loop. Depending on how many columns you have in df, that might be preferable.
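As a rough sketch of that loop, the three summaries can be produced by one helper and written inside a single sink() session. The helper name summarise_attr and the vector attrs are made up for illustration, and the across(all_of(...)) grouping assumes dplyr 1.0 or later.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  age = c(45, 21, 32, 33, 46),
  gender = c('female', 'female', 'male', 'male', 'female'),
  income = c('low', 'low', 'medium', 'high', 'low'),
  education = c('high', 'high', 'high', 'medium', 'medium')
)

# Hypothetical: names of the columns to summarise
attrs <- c('gender', 'income', 'education')

# Hypothetical helper: count and share of total for one column,
# where the column name is passed as a string
summarise_attr <- function(data, col) {
  total <- nrow(data)
  data %>%
    group_by(across(all_of(col))) %>%
    summarise(Freq = n(), sot = n() / total, .groups = 'drop')
}

# Redirect console output to the file, write each block, then stop redirecting
sink('export.csv')
for (col in attrs) {
  write.csv(summarise_attr(df, col), row.names = FALSE)
}
sink()
```

Each block keeps its own header line, so export.csv is a stack of small tables rather than one rectangular data set.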

CodePudding user response：

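You should use 'count_()' instead of 'count()'. It is the same function, but it takes the column name as a string instead of an unquoted variable, so it works with a loop variable. Note that 'count_()' is deprecated in recent versions of dplyr.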

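# Loop over the column names (as strings) and write one file per attribute
for (i in c('gender', 'income', 'education')) {
   counts = count_(df, i)
   write.csv(counts, row.names = TRUE, file = paste0('Title_', i, '.txt'))
}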
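Because count_() is deprecated, here is a sketch of the same loop in base R only, appending every block to one file as the question asks. The file name stat.csv is taken from the question. The vector attrs and the choice to delete an existing file first are assumptions.

```r
# Sample data from the question
df <- data.frame(
  age = c(45, 21, 32, 33, 46),
  gender = c('female', 'female', 'male', 'male', 'female'),
  income = c('low', 'low', 'medium', 'high', 'low'),
  education = c('high', 'high', 'high', 'medium', 'medium')
)

# Hypothetical: names of the columns to summarise
attrs <- c('gender', 'income', 'education')
out_file <- 'stat.csv'

# Assumption: start from an empty file so repeated runs do not pile up
if (file.exists(out_file)) file.remove(out_file)

for (col in attrs) {
  # table() counts each level; as.data.frame() yields columns Var1 and Freq
  counts <- as.data.frame(table(df[[col]]), stringsAsFactors = FALSE)
  names(counts) <- c(col, 'n')
  counts$sot <- counts$n / nrow(df)
  # Appending with column names raises a warning, which is expected here
  suppressWarnings(
    write.table(counts, out_file, sep = ';', row.names = FALSE,
                quote = FALSE, append = TRUE)
  )
}
```

With the sample data, the file starts with the lines gender;n;sot, female;3;0.6 and male;2;0.4, followed by the income and education blocks.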