How to combine similar rows of data into one in R?-CodePudding

I have a dataset as shown in the first table below. I would like to create a new dataframe that combines different versions of the same item (i.e. "googleV1" and "googleV2" are combined) and adds (quantity, revenue fields) or averages (ratings field) the numbers associated so that it reads as just a single line item. The second table below illustrates my desired output. How would I go about combining and adding/averaging values as such? Thank you in advance!

I would like the solution to apply for when there may be multiple columns that work like the "group" column. I would also like to refer to the columns "quantity", "revenue", and "rating" by column number.

item	group	quantity	revenue	rating
googleV1	blue	4525	$523513	94%
googleV2	blue	2452	$134134	82%
123V5	red	3563	$134134	82%
123V6	red	34534	$2345	34%
123V7	yellow	4574	$34535	64%

item	group	quantity	revenue	rating
google	blue	6977	$657647	88%
123	red	38097	$136479	58%
123	yellow	4574	$34535	64%

CodePudding user response：

We may remove the substring in the first column and then do a group by summarise approach

library(dplyr)
library(stringr)
df1 %>% 
  group_by(item = str_remove(item, "V\\d $")) %>% 
  summarise(quantity = sum(quantity), 
   revenue = str_c("$", sum(as.numeric(str_remove(revenue, fixed("$"))))), 
    rating = str_c(mean(readr::parse_number(rating)), "%"), .groups = 'drop')

-output

# A tibble: 2 × 4
  item   quantity revenue rating
  <chr>     <int> <chr>   <chr> 
1 123       42671 $171014 60%   
2 google     6977 $657647 88%

data

df1 <- structure(list(item = c("googleV1", "googleV2", "123V5", "123V6", 
"123V7"), quantity = c(4525L, 2452L, 3563L, 34534L, 4574L),
 revenue = c("$523513", 
"$134134", "$134134", "$2345", "$34535"), rating = c("94%", "82%", 
"82%", "34%", "64%")), class = "data.frame", row.names = c(NA, 
-5L))