I have a trip_duration column in interval format. I want to remove all observations less than 90 seconds and count how many observations match this condition.
My current SQL query is
WITH
org_table AS (
SELECT
ended_at - started_at as trip_duration
FROM `cyclistic-328701.12_month_user_data_cyclistic.20*`
)
SELECT
COUNTIF(x < 1:30) AS false_start
FROM trip_duration AS x;
I returns Syntax error: Expected ")" but got ":" at [8:16]
I have also tried
SELECT
COUNTIF(x < "0-0 0 0:1:30") AS false_start
FROM trip_duration AS x
It returns Table name "trip_duration" missing dataset while no default dataset is set in the request.
I've read through other questions and have not been able to write a solution. My first thought is to cast the trip_duration from INTERVAL to TIME format so COUNT IF statements can reference a TIME formatted column instead of INTERVAl.
~ Marcus
CodePudding user response:
Below example shows you the way to handle intervals
with trip_duration as (
select interval 120 second as x union all
select interval 10 second union all
select interval 2 minute union all
select interval 50 second
)
select
count(*) as all_starts,
countif(x < interval 90 second) as false_starts
from trip_duration
with output
CodePudding user response:
To filter the data without the durations less than 90 secs:
SELECT
* # here is whatever field(s) you want to return
FROM
`cyclistic-328701.12_month_user_data_cyclistic.20*`
WHERE
TIMESTAMP_DIFF(ended_at, started_at, SECOND) > 90
You can read about the TIMESTAMP_DIFF function here.
To count the number of occurrences:
SELECT
COUNTIF(TIMESTAMP_DIFF(ended_at, started_at,SECOND) < 90) AS false_start,
COUNTIF(TIMESTAMP_DIFF(ended_at, started_at,SECOND) >= 90) AS non_false_start
FROM
`cyclistic-328701.12_month_user_data_cyclistic.20*`