Home > Back-end >  SQL aggregate similar records
SQL aggregate similar records

Time:06-21

I have a dataset that I need to query from and aggregate the selected records based on a string similarity. For example a simple SELECT, COUNT query can return a set of records like:

NAME COUNT
CAR 6
CAR 1 12
CAR 22 2
CAR 45 9
Van 2 4
Van 22 5
Van 12 16
UNKNOWN 6

I want the result to be aggregate by a substring as:

NAME COUNT
CAR 29
VAN 25
UNKOWN 6

A sample of the current query I have is:

SELECT "vehicle_name" AS "NAME", count(*) AS "Count"
FROM "vehicles"
WHERE ("registered" = 'true'GROUP BY "public"."players"."device_name" 
ORDER BY "vehicle_name" ASC

CodePudding user response:

Please try like this

SELECT 
SUM(COUNT),
SUBSTRING(vehicle_name,1,ISNULL(NULLIF(PATINDEX('%[^A-Za-z.''0-9]%',LTRIM(RTRIM(vehicle_name))),0)-1,LEN(vehicle_name))) AS Value
FROM
vehicles
GROUP BY SUBSTRING(vehicle_name,1,ISNULL(NULLIF(PATINDEX('%[^A-Za-z.''0-9]%',LTRIM(RTRIM(vehicle_name))),0)-1,LEN(vehicle_name)))  

CodePudding user response:

You can use the metabase regexextract function to solve your problem, matching the first word by exploiting the start of string symbol.

SELECT REGEXEXTRACT([NAME], "^[A-Za-z] ") AS "NAME", 
       SUM([COUNT])                       AS "COUNT"
FROM vehicles
GROUP BY REGEXEXTRACT([NAME], "^[A-Za-z] ")

Does it solve your problem?

  • Related