I am trying to remove docked_bike from the results table with the following:
library(dplyr)
a_v2 <- a_v1[!a_v1$rideable_type == "docked_bike" | a_v1$ride_length<0,]
a_v2 %>% group_by(member_casual) %>%
  summarise(number_of_rides = n(),
            average_duration = mean(ride_length))
a_v2 %>% group_by(member_casual, rideable_type) %>%
  summarise(number_of_rides = n())
Output:
member_casual | rideable_type | number_of_rides |
---|---|---|
casual | classic_bike | 1132892 |
casual | docked_bike | 5 |
casual | electric_bike | 1162202 |
member | classic_bike | 1922749 |
member | electric_bike | 1456488 |
What change should I make to the code so that the docked_bike rows are removed?
Suggested Answer:
a_v2 <- a_v1 [a_v1$rideable_type != "docked_bike" & a_v1$ride_length<0,]
Updated Table:
member_casual | rideable_type | number_of_rides |
---|---|---|
casual | classic_bike | 24 |
casual | electric_bike | 37 |
member | classic_bike | 51 |
member | electric_bike | 32 |
The counts dropped from over a million rides to double digits, so I do not think that is the correct result.
Follow up question:
Is there a way to remove the docked_bike row from the original table?
member_casual | rideable_type | number_of_rides |
---|---|---|
casual | classic_bike | 1132892 |
casual | docked_bike | 5 |
casual | electric_bike | 1162202 |
member | classic_bike | 1922749 |
member | electric_bike | 1456488 |
CodePudding user response:
The following line:
a_v2 <- a_v1 [!a_v1$rideable_type == "docked_bike" | a_v1$ride_length<0,]
selects rows where a_v1$rideable_type IS NOT "docked_bike" OR ride_length < 0.
So a "docked_bike" row with a negative ride_length (< 0) will still be selected.
Also, instead of ! <something> == <something>, you can use <something> != <something>.
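To make the operator-precedence point concrete, here is a minimal sketch on a made-up data frame (`toy` and its values are invented for illustration and are not the asker's data). In R, `!` binds more loosely than `==` but more tightly than `|`, so the original condition reads as `(!(type == "docked_bike")) | (ride_length < 0)`. Assuming the intent is to drop both docked bikes and negative ride lengths, the whole test needs to be wrapped in parentheses, or rewritten with `&`:

```r
# Toy data frame; the values are invented purely for illustration.
toy <- data.frame(
  rideable_type = c("classic_bike", "docked_bike", "docked_bike", "electric_bike"),
  ride_length   = c(120, -5, 300, -10)
)

# Original condition: parsed as (!(type == "docked_bike")) | (ride_length < 0),
# so any row with a negative ride_length is kept, docked or not.
keep_or <- !toy$rideable_type == "docked_bike" | toy$ride_length < 0

# Parenthesised condition: drop a row if it is docked OR has a negative length.
keep_not <- !(toy$rideable_type == "docked_bike" | toy$ride_length < 0)

# Equivalent form by De Morgan's law: keep non-docked AND non-negative rows.
keep_and <- toy$rideable_type != "docked_bike" & toy$ride_length >= 0

toy[keep_or, ]   # still contains the docked_bike row with ride_length -5
toy[keep_not, ]  # only the classic_bike row remains
```

The last two logical vectors are identical, so either form can go inside `a_v1[ ... , ]`.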
CodePudding user response:
Following @RobertoT's answer, try:
a_v2 <- a_v1 [a_v1$rideable_type != "docked_bike" & a_v1$ride_length >= 0,]
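The suggested answer quoted in the question combines `!= "docked_bike"` with `ride_length < 0` using `&`, which keeps only the negative-duration rides. That would explain the counts collapsing to double digits. Since the question already loads dplyr, the same filter can be sketched with `filter()`, assuming the goal is to drop docked bikes and negative ride lengths. The data frame `a_v1_toy` and its values are hypothetical stand-ins for `a_v1`:

```r
library(dplyr)

# Toy stand-in for a_v1; values are invented for illustration only.
a_v1_toy <- data.frame(
  member_casual = c("casual", "casual", "casual", "member", "member"),
  rideable_type = c("classic_bike", "docked_bike", "electric_bike",
                    "classic_bike", "electric_bike"),
  ride_length   = c(600, -30, 450, -15, 900)
)

# Conditions separated by commas in filter() are combined with AND:
# keep rows that are not docked bikes and have a non-negative ride length.
a_v2_toy <- a_v1_toy %>%
  filter(rideable_type != "docked_bike", ride_length >= 0)

# Same summary as in the question, now without any docked_bike group.
a_v2_toy %>%
  group_by(member_casual, rideable_type) %>%
  summarise(number_of_rides = n(), .groups = "drop")
```

One difference from base-R bracket indexing is that `filter()` drops rows where the condition evaluates to `NA`, whereas `a_v1[cond, ]` returns all-`NA` rows for them.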