As in the title, I need to convert a data frame:
data1 <- data.frame(Year = rep(c(2016, 2017, 2018, 2019), each = 12), Month = rep(month.abb, 4), Expenses = sample(50e3:100e3, 48))
Create a list year_y in which each element (year) contains a data frame with the expenses for each month. Then, using year_y, create a list containing, for each year (element), the month with the biggest expenses. Here is what the final result should look like:
$`2016`
[1] "Jul"
$`2017`
[1] "Nov"
$`2018`
[1] "May"
$`2018`
[1] "May"
The catch is that I need to use the apply family of functions in both steps.
CodePudding user response:
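In base R, we can use tapply(). Note that month.abb[which.max(x)] relies on each year's rows being in calendar order (Jan to Dec), as they are in data1:
as.list(tapply(data1$Expenses, data1$Year, function(x) month.abb[which.max(x)]))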
#> $`2016`
#> [1] "Jul"
#>
#> $`2017`
#> [1] "Mar"
#>
#> $`2018`
#> [1] "Sep"
#>
#> $`2019`
#> [1] "Dec"
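If the rows are not guaranteed to be in calendar order, a variant that looks the month up by row index may be safer. This is only a sketch: the seed, the shuffled copy, and the names shuffled, top_row and res are assumptions added for illustration, not part of the original answer.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
data1 <- data.frame(
  Year = rep(c(2016, 2017, 2018, 2019), each = 12),
  Month = rep(month.abb, 4),
  Expenses = sample(50e3:100e3, 48),
  stringsAsFactors = FALSE
)
shuffled <- data1[sample(nrow(data1)), ]  # rows no longer in calendar order

# For each year, find the row index that holds the largest expense
top_row <- tapply(seq_len(nrow(shuffled)), shuffled$Year, function(i) {
  i[which.max(shuffled$Expenses[i])]
})

# Look the month up by row, then name the result by year
res <- as.list(setNames(shuffled$Month[as.vector(top_row)], names(top_row)))
res
```

Because the month is read from the Month column rather than from month.abb, the result does not depend on how the rows are sorted.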
CodePudding user response:
We group by 'Year', slice the row where 'Expenses' is the max, and then split the 'Month' by the 'Year' column:
library(dplyr)
data1 %>%
group_by(Year) %>%
slice_max(n = 1, order_by = Expenses) %>%
{split(.$Month, .$Year)}
Or another option is deframe():
library(tibble)
data1 %>%
group_by(Year) %>%
slice_max(n = 1, order_by = Expenses) %>%
ungroup %>%
select(Year, Month) %>%
deframe() %>%
as.list
$`2016`
[1] "Nov"
$`2017`
[1] "Dec"
$`2018`
[1] "Dec"
$`2019`
[1] "Mar"
Or with base R: subset the data where 'Expenses' is the max value within each year, and then split:
with(subset(data1, Expenses == ave(Expenses, Year, FUN = max)),
split(Month, Year))
-output
$`2016`
[1] "Nov"
$`2017`
[1] "Dec"
$`2018`
[1] "Dec"
$`2019`
[1] "Mar"
CodePudding user response:
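Using base R.
Use the split() function to divide the original data frame by year. Then use lapply() with which.max() to determine which month has the highest expenses in each year. (The formula form split(data1, ~Year) needs R 4.1 or later; on older versions, use split(data1, data1$Year).)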
data1 <- data.frame(Year = rep(c(2016, 2017, 2018, 2019), each = 12), Month = rep(month.abb, 4), Expenses = sample(50e3:100e3, 48))
lapply(split(data1, ~Year), function(mon) {
mon$Month[which.max(mon$Expenses)]
})
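The question asks for two explicit steps, with the intermediate list year_y and an apply-family function in each. A minimal sketch of that, assuming lapply() is acceptable for both steps (the seed and the names years and max_month are added here for illustration):

```r
set.seed(42)  # assumption: seed added only so the sketch is reproducible
data1 <- data.frame(
  Year = rep(c(2016, 2017, 2018, 2019), each = 12),
  Month = rep(month.abb, 4),
  Expenses = sample(50e3:100e3, 48),
  stringsAsFactors = FALSE
)

# Step 1: one data frame of monthly expenses per year, built with lapply()
years <- unique(data1$Year)
year_y <- lapply(years, function(y) {
  data1[data1$Year == y, c("Month", "Expenses")]
})
names(year_y) <- years  # numeric years are coerced to character names

# Step 2: for each year's data frame, pick the month with the largest expense
max_month <- lapply(year_y, function(df) df$Month[which.max(df$Expenses)])
max_month
```

Since the expenses are sampled at random, the months printed will differ from the ones shown in the question.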
CodePudding user response:
Here is one more tidyverse approach, which makes use of the name argument of dplyr::pull():
library(dplyr)
data1 %>%
group_by(Year) %>%
filter(max(Expenses) == Expenses) %>%
pull(var = Month, name = Year) %>%
as.list()
#> $`2016`
#> [1] "Feb"
#>
#> $`2017`
#> [1] "Apr"
#>
#> $`2018`
#> [1] "Mar"
#>
#> $`2019`
#> [1] "Dec"
Created on 2022-03-26 by the reprex package (v0.3.0)
CodePudding user response:
Here is one more solution, using purrr's map_chr():
library(purrr)
library(dplyr)
data1 %>%
group_by(Year) %>%
arrange(desc(Expenses), .by_group = TRUE) %>%
slice(1) %>%
group_split() %>%
setNames(unique(data1$Year)) %>%
map_chr(., 2) %>%
as.list()
$`2016`
[1] "Apr"
$`2017`
[1] "Jan"
$`2018`
[1] "Mar"
$`2019`
[1] "Nov"