Plotting grouped data in R (same column)

Time:06-16

I have data for Formula 1 drivers in 3 columns and want to make a time series plot of the cumulative points for every driver.

The problem is that all my drivers are in the first column, the points are in the second column, and the cumulative sum is in the third column.

testdf <- c("Driver A", "Driver A", "Driver A", "Driver B", "Driver B", "Driver B")

values <- c(1,5,7,3,5,8)

driversmatrix <- cbind(testdf, values); driversmatrix

example data picture here

Link to picture of View of dataframe

How could I make a time series out of this where every driver's cumulative points are plotted against each other?

CodePudding user response:

library(data.table)
library(ggplot2)

# dummy data
df <- data.table(driver = c("Driver A", "Driver A", "Driver A", "Driver B", "Driver B", "Driver B")
                 , points = c(1,5,7,3,5,8)
                 ); df

# set as data table if yours isn't one already (no-op for the dummy data above)
setDT(df)

# calculate cumulative sum and date (assumes data sorted in ascending date already)
df[, `:=` (cum_sum = cumsum(points)
           , date = 1:.N
           )
   , driver
   ]

# plot: one line per driver, distinguished by line type
ggplot(data=df, aes(x=date, y=cum_sum, group=driver)) +
  geom_line(aes(linetype=driver)) +
  geom_point()

Note that plotting one line per driver, as done here, may produce a cluttered plot if there are many drivers.
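One possible way around the clutter is to give each driver a separate panel with facet_wrap() from ggplot2. The following is only a rough sketch: the third driver and its points are made up for illustration, and base R's ave() is used here in place of data.table to keep the example short.

```r
library(ggplot2)

# Hypothetical data: three drivers with three races each (values are invented)
df <- data.frame(
  driver = rep(c("Driver A", "Driver B", "Driver C"), each = 3),
  points = c(1, 5, 7, 3, 5, 8, 2, 2, 10)
)

# Cumulative points and race index per driver (assumes rows are in race order)
df$cum_sum <- ave(df$points, df$driver, FUN = cumsum)
df$date    <- ave(df$points, df$driver, FUN = seq_along)

# One panel per driver instead of overlapping lines in a single panel
p <- ggplot(df, aes(x = date, y = cum_sum)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ driver)
p
```

Whether panels are easier to read than overlaid lines depends on how many drivers there are and whether you need to compare them directly.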

CodePudding user response:

First you need a column that indicates a race number or date. The following assumes that your data has the same number of races per driver:

library(tidyverse)
testdf <- data.frame(Driver= c("Driver A", "Driver A", "Driver A", "Driver B", "Driver B", "Driver B") , Points=c(1,5,7,3,5,8))

testdf <- testdf %>% group_by(Driver) %>% mutate(Cum_Points=cumsum(Points), Race_No=row_number())

Then plot cumulative points against the race number, with driver as the colour variable:

ggplot(testdf, aes(Race_No, Cum_Points, colour=Driver)) + geom_line()