Given a name_loc column of text like the following:
{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}
I'm trying to extract the names, ideally separated by commas:
Charlie, Wrigley, Ana
I've gotten this far:
SELECT SUBSTRING(CAST(name_loc AS VARCHAR) from '"([^ –] )')
FROM table;
which returns
Charlie
How can I extend this query to extract all names?
CodePudding user response:
You can do this with a combination of regexp_matches
(to extract the names), array_agg
(to regroup all matches in a row) and array_to_string
(to format the array as you'd like, e.g. with a comma separator):
WITH input(name_loc) AS (
VALUES ('{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}')
, ('{"Other - somewhere}') -- added this to show multiple rows do not get merged
)
SELECT array_to_string(names, ', ')
FROM input
CROSS JOIN LATERAL (
SELECT array_agg(name)
FROM regexp_matches(name_loc, '"(\w )', 'g') AS f(name)
) AS f(names);
array_to_string |
---|
Charlie, Wrigley, Ana |
Other |
CodePudding user response:
My two cents, though I'm rather new to postgreSQL and I had to copy the 1st piece from @Marth's his answer:
WITH input(name_loc) AS (
VALUES ('{"Charlie – White Plains, NY","Wrigley – Minneapolis, MN","Ana – Decatur, GA"}')
, ('{"Other - somewhere"}')
)
SELECT REGEXP_REPLACE(name_loc, '{?(,)?"(\w )[^"] "}?','\1\2', 'g') FROM input;
regexp_replace |
---|
Charlie,Wrigley,Ana |
Other |
CodePudding user response:
Your string literal happens to be a valid array literal.
(Maybe not by coincidence? And the column should be type text[]
to begin with?)
If that's the reliable format, there is a safe and simple solution:
SELECT t.id, x.names
FROM tbl t
CROSS JOIN LATERAL (
SELECT string_agg(split_part(elem, ' – ', 1), ', ') AS names
FROM unnest(t.name_loc::text[]) elem
) x;
Or:
SELECT id, string_agg(split_part(elem, ' – ', 1), ', ') AS names
FROM (SELECT id, unnest(name_loc::text[]) AS elem FROM tbl) t
GROUP BY id;
db<>fiddle here
Steps
Unnest the array with
unnest()
in aLATERAL CROSS JOIN
, or directly in theSELECT
list.Take the first part with
split_part()
. I chose' – '
as delimiter, not just' '
, to allow for names with nested space like "Anne Nicole". See:Aggregate results with
string_agg()
. I added no particular order as you didn't specify one.