I would like to filter all rows between two patterns that follow a numerical order. For example, how could I keep all rows > 1st.7.1.* and < 1st.13.1.*?
Here is what the data frame looks like:
CodePudding user response:
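We may use parse_number to get the numeric part and then do the filter. parse_number reads only the first number it finds in a string (the 1 of 1st), so the prefix is removed first.
library(dplyr)
library(stringr)
df1 %>%
  # drop the leading "1st." so that parse_number sees e.g. "7.1"
  filter(between(readr::parse_number(str_remove(ball, "^1st\\.")), 7.1, 13.1))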
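Another option is to extract the trailing numeric substring and filter on that:
library(stringr)
df1 %>%
  # match digits, optionally followed by a dot and more digits, at the end of the string
  filter(between(as.numeric(str_extract(ball, "\\d+(\\.\\d+)?$")), 7.1, 13.1))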
-output
# A tibble: 61 × 2
   ball    team
   <chr>   <chr>
 1 1st.7.1 New Zealand
 2 1st.7.2 New Zealand
 3 1st.7.3 New Zealand
 4 1st.7.4 New Zealand
 5 1st.7.5 New Zealand
 6 1st.7.6 New Zealand
 7 1st.7.7 New Zealand
 8 1st.7.8 New Zealand
 9 1st.7.9 New Zealand
10 1st.8   New Zealand
# … with 51 more rows
data
df1 <- tibble(ball = str_c('1st.', seq(0.1, 13.5, by = 0.1)), team = 'New Zealand')
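One caveat with converting "7.1" to a decimal: "7.10" would become the same number as "7.1", so if an over can ever contain ten or more deliveries the numeric comparison would misplace that row. A minimal base R sketch that compares over and ball as separate integers instead is below. The sample values are hypothetical, and it assumes every value has exactly three dot-separated parts and fewer than 100 deliveries per over.

```r
# Hypothetical sample data in the same "1st.<over>.<ball>" format
df1 <- data.frame(
  ball = c("1st.6.6", "1st.7.1", "1st.7.2", "1st.7.10",
           "1st.12.7", "1st.13.1", "1st.13.2"),
  stringsAsFactors = FALSE
)

# Split "1st.7.10" into c("1st", "7", "10") and keep over and ball as integers
parts <- strsplit(df1$ball, ".", fixed = TRUE)
over  <- as.integer(vapply(parts, `[`, character(1), 2))
ball  <- as.integer(vapply(parts, `[`, character(1), 3))

# Combine into one sortable key (assumption: fewer than 100 deliveries per over)
key <- over * 100L + ball

# Keep rows strictly after 7.1 and strictly before 13.1
res <- df1[key > 7L * 100L + 1L & key < 13L * 100L + 1L, , drop = FALSE]
res$ball
```

With this key, "1st.7.10" sorts after "1st.7.2" rather than being treated as equal to "1st.7.1".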
CodePudding user response:
You can extract the numerical part and subset on this:
library(dplyr)
library(stringr)
df %>%
  # keep everything after "st." and convert it to a number
  mutate(num = as.numeric(str_extract(ball, "(?<=st\\.).*"))) %>%
  filter(num > 7.1 & num < 13.1) %>%
  select(-num)
      ball
1  1st.7.9
2 1st.12.7
Data:
df <- data.frame(
ball = c("1st.7.1","1st.7.9", "1st.12.7", "1st.13.1")
)
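Note that this answer uses strict bounds (> and <), as the question asks, whereas dplyr's between() includes both endpoints. A small base R sketch of the difference, using the same four values as the data above (the variable names strict and inclusive are only for illustration):

```r
# Same four values as the data above
df <- data.frame(
  ball = c("1st.7.1", "1st.7.9", "1st.12.7", "1st.13.1"),
  stringsAsFactors = FALSE
)

# Drop the leading "1st." and convert the rest to a number
num <- as.numeric(sub("^1st\\.", "", df$ball))

# Strict bounds, as in the question: both endpoints are excluded
strict <- df$ball[num > 7.1 & num < 13.1]

# Inclusive bounds, which is what dplyr::between() computes
inclusive <- df$ball[num >= 7.1 & num <= 13.1]
```

Here strict keeps only the two middle rows, while inclusive also keeps 1st.7.1 and 1st.13.1.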
CodePudding user response:
We could remove the constant 1st. prefix and use the numbers. Here I changed the range to show the effect on the provided data.
library(dplyr)
library(stringr)
df %>%
  # anchor and escape the pattern so only the literal leading "1st." is removed
  filter(between(as.numeric(str_remove(ball, "^1st\\.")), 0.1, 1.1))
     ball        team     batsman              bowler  nonStriker byes legbyes noballs
1 1st.0.1 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
2 1st.0.2 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
3 1st.0.3 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
4 1st.0.4 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
5 1st.0.5 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
6 1st.0.6 New Zealand  MJ Guptill Shaheen Shah Afridi DJ Mitchell    0       0       0
7 1st.1.1 New Zealand DJ Mitchell          Imad Wasim  MJ Guptill    0       0       0
Data:
df <- structure(list(ball = c("1st.0.1", "1st.0.2", "1st.0.3", "1st.0.4",
"1st.0.5", "1st.0.6", "1st.1.1", "1st.1.2", "1st.1.3", "1st.1.4",
"1st.1.5", "1st.1.6", "1st.2.1", "1st.2.2"), team = c("New Zealand",
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand",
"New Zealand", "New Zealand", "New Zealand", "New Zealand", "New Zealand",
"New Zealand", "New Zealand", "New Zealand"), batsman = c("MJ Guptill",
"MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill", "MJ Guptill",
"DJ Mitchell", "DJ Mitchell", "MJ Guptill", "MJ Guptill", "DJ Mitchell",
"MJ Guptill", "DJ Mitchell", "DJ Mitchell"), bowler = c("Shaheen Shah Afridi",
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Shaheen Shah Afridi",
"Shaheen Shah Afridi", "Shaheen Shah Afridi", "Imad Wasim", "Imad Wasim",
"Imad Wasim", "Imad Wasim", "Imad Wasim", "Imad Wasim", "Shaheen Shah Afridi",
"Shaheen Shah Afrid"), nonStriker = c("DJ Mitchell", "DJ Mitchell",
"DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "DJ Mitchell", "MJ Guptill",
"MJ Guptill", "DJ Mitchell", "DJ Mitchell", "MJ Guptill", "DJ Mitchell",
"MJ Guptill", "MJ Guptill"), byes = c(0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), legbyes = c(0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), noballs = c(0L, 0L,
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), class = "data.frame", row.names = c(NA,
-14L))