here little example of my data.
sales.data=structure(list(MDM_Key = c(370L, 370L, 370L, 370L, 370L, 370L,
370L, 371L, 371L, 371L, 371L, 371L, 371L, 371L), sale_count = c(30L,
32L, 32L, 24L, 20L, 15L, 23L, 30L, 32L, 32L, 24L, 20L, 15L, 23L
), iek_disc_price = c(38227.08, 38227.08, 33739.7, 38227.08,
38227.08, 28844.16, 31649.255, 38227.08, 38227.08, 33739.7, 38227.08,
38227.08, 28844.16, 31649.255)), class = "data.frame", row.names = c(NA,
-14L))
i perform regression analysis
str(sales.data)
m1<-lm(formula=sale_count~iek_disc_price,data=sales.data)
summary(m1)
But the main difficulty is that for each group (MDM_Key) I don't need all the regression results from the summary
, but only one beta
coefficient.
here
B=0.0008559.
but then i need calculate mean value for sale_count and the mean for iek_disc_price (also for each mdm key group)
so the desired result would be like this
MDM_Key beta mean(sale_count) mean(iek_disc_price)
370 0.0008559 25.14 35305
371 0.0008559 25.14 35305
How to take only beta (nor intercept)regression coefficient for each group mdm_key
and also for each group, calculate the mean values for sale_count and iek_disc_price to get the summary table indicated above.
Thank you for your help.
CodePudding user response:
If I understood correctly, you want to apply one regression per MDM_Key
.
library(dplyr)
library(purrr)
library(broom)
sales.data %>%
group_by(MDM_Key) %>%
mutate(
mean_sale_count = mean(sale_count),
mean_iek_disc_price = mean(iek_disc_price)
) %>%
nest(-MDM_Key,-mean_sale_count,-mean_iek_disc_price) %>%
mutate(
coefs = map(.x = data,.f = ~tidy(lm(formula=.$sale_count~.$iek_disc_price,data=.)))
) %>%
unnest(coefs) %>%
filter(term != "(Intercept)") %>%
select(MDM_Key,beta = estimate,mean_sale_count,mean_iek_disc_price)
# A tibble: 2 x 4
# Groups: MDM_Key [2]
MDM_Key beta mean_sale_count mean_iek_disc_price
<int> <dbl> <dbl> <dbl>
1 370 0.000856 25.1 35306.
2 371 0.000856 25.1 35306.
CodePudding user response:
Using R base and the split apply combine strategy:
do.call(rbind, lapply(split(sales.data, sales.data$MDM_Key), function(i) {
c(beta=coef(lm(sale_count~iek_disc_price, data=i))[2],
sale_count_mean=mean(i$sale_count),
iek_disc_price_mean=mean(i$iek_disc_price))
} ))
beta.iek_disc_price sale_count_mean iek_disc_price_mean
370 0.0008558854 25.14286 35305.92
371 0.0008558854 25.14286 35305.92
CodePudding user response:
Get the means using aggregate
and the beta values using lmList
and then put them together and rearrange the columns in the order shown in the question. Omit [, c(2:1, 3:4)]
if the column order doesn't matter. Note that nlme comes with R and does not have to be installed.
library(nlme) # lmList
means <- aggregate(. ~ MDM_Key, sales.data, mean)
fm <- lmList(sale_count ~ iek_disc_price | MDM_Key, sales.data)
cbind(beta = coef(fm)[, 2], means)[, c(2:1, 3:4)]
## MDM_Key beta sale_count iek_disc_price
## 1 370 0.0008558854 25.14286 35305.92
## 2 371 0.0008558854 25.14286 35305.92