Home > Enterprise >  Filter data in SQL based on multiple column in a table
Filter data in SQL based on multiple column in a table

Time:03-09

I have a table geo_loc which contains the geo location (latitude and longitude) of each delivery guy along with time stamps in other columns.

GEO_LOC Table view on any day: -

agent_id       date            geo_lat  geo_long
1134    13-02-2021T09:09:54    17.66    89.44
1134    13-02-2021T10:10:47    19.99    76.56
1134    13-02-2021T10:50:47    19.99    76.56
1134    13-02-2021T11:57:47    19.99    33.33
1134    13-02-2021T13:13:23    34.44    89.67
2678    13-02-2021T10:25:11    45.55    34.67
4657    13-02-2021T11:55:33    22.34    66.78
4657    13-02-2021T12:20:27    22.34    66.78
4657    13-02-2021T15:15:13    33.45    45.67
7545    13-02-2021T08:17:55    12.45    56.56
7545    13-02-2021T11:55:23    18.56    87.77
0908    13-02-2021T16:55:56    19.99    79.99
0908    13-02-2021T17:43:12    19.99    79.99
0908    13-02-2021T18:12:34    19.99    79.99

GEO_LOC Table has entries like above for each day and for multiple delivery agent_id.

For any day I want to filter all the records for all those agents who have more than one gps entry (geo_lat and geo_long) on any day.

For ex:
0908 has same geo_lat and geo_long on 13-02-2021, so I don’t want this row.
But 1134 has multiple geo_lat and geo_long entry on 13-02-2021 so I want all the rows for this agent on this day.
2678 has single entry on 13-02-2021 so I don’t what this row also.

Desired output: -

agent_id       date           geo_lat   geo_long
1134    13-02-2021T09:09:54    17.66    89.44
1134    13-02-2021T10:10:47    19.99    76.56
1134    13-02-2021T10:50:47    19.99    76.56
1134    13-02-2021T11:57:47    19.99    33.33
1134    13-02-2021T13:13:23    34.44    89.67
4657    13-02-2021T11:55:33    22.34    66.78
4657    13-02-2021T12:20:27    22.34    66.78
4657    13-02-2021T15:15:13    33.45    45.67
7545    13-02-2021T08:17:55    12.45    56.56
7545    13-02-2021T11:55:23    18.56    87.77

CodePudding user response:

There are a couple of things we need to do to get the data you want

  1. We need to convert the Date column to show only the date and not the time
  2. We need to then group the data by ID and Date with a distinct count of a concatenated lat/long column
  3. We can then select from the original table using the agent_id with a WHERE IN clause

For 1, we can use CONVERT to change the ISO8601 datetime into an NVARCHAR date:

convert(nvarchar,date,103)

For 2, we use the above as well as COUNT DISTINCT and CONCAT; CONCAT to create a single column with both the lat and long:

concat(geo_lat, ',', geo_long)

And then COUNT DISTINCT to only return unique lat/long combinations:

concat(geo_lat, ',', geo_long)

We can then put these together along with the GROUP BY clause on agent_id and the new date column to give you a filtered table

select 
agent_id
, convert(nvarchar,date,103)
, count(distinct(concat(geo_lat, ',', geo_long))) 
from [71405703] 
GROUP BY agent_id, convert(nvarchar,date,103)
agent_id    date    count
908     13/02/2021  1
1134    13/02/2021  4
2678    13/02/2021  1
4657    13/02/2021  2
7545    13/02/2021  2

I then put that query into a CTE so that I can easily write WHERE clauses against the columns.

The final script would look like:

WITH TableFilter (agent, date, count) 
AS 
(
select 
agent_id
, convert(nvarchar,date,103)
, count(distinct(concat(geo_lat, ',', geo_long))) 
from [71405703] 
GROUP BY agent_id, convert(nvarchar,date,103))

SELECT * FROM [71405703] 
WHERE agent_id IN (select agent FROM TableFilter WHERE count > 1)
  • Related