Home > Mobile >  Is there a way to write a Postgres query to select all rows that two columns are the same after remo
Is there a way to write a Postgres query to select all rows that two columns are the same after remo

Time:03-15

I wanted to see if I could get some help with a PostgreSQL query I am trying to update:

 SELECT games.* 
 FROM games 
 INNER JOIN games_players 
   ON games.id = games_players.game_id 
 WHERE games.status = 'closed' 
 AND games_players.player_id = $1 
 AND games.away ~* '/. ?(?=-)/' <--- attempt at REGEX pattern match 
 ORDER BY scheduled DESC 
 LIMIT 5

I need to edit this query to only return the last 5 rows with existing conditions, but I also only want to return rows in that 'games' table that contain the same course name in the away column. Those column strings look like this: "TPC Four Seasons - RD 1". I attempted to do this above with a REGEX, but it does not seem to work and returns 0 rows. I would like to return the last 5 rows ordered by the scheduled column in descending order and also constrained to a regex for the away columns with matching substring before that "-" in the string. Is this possible and would anyone be able to help?

Update:

Additional attempt at query:

SELECT games.* 
FROM games 
INNER JOIN games_players ON games.id = games_players.game_id 
WHERE SUBSTRING(games.away, 0, STRPOS('-', games.away)) IN 
(
 SELECT SUBSTRING(games.away, 0, STRPOS('-', games.away)) FROM games 
 GROUP BY SUBSTRING(games.away, 0, STRPOS('-', games.away)) 
 HAVING  COUNT(*) > 1
) 
AND games.status = 'closed' AND games_players.player_id = $1 
ORDER BY scheduled DESC LIMIT 5

Update with final answer:

I kept playing with the logic and doing inner selects to drill down to the info I needed and ended up getting the following correct result from this query:

select games.* 
FROM games 
INNER JOIN games_players ON games.id = games_players.game_id 
WHERE games.status = 'closed' AND 
games_players.player_id = $1 and
games.away LIKE concat(regexp_replace((select games.away FROM games ORDER BY scheduled DESC LIMIT 1 ), '-.*', ''), '%') 
ORDER BY scheduled desc
limit 5

CodePudding user response:

Rephrase the requirement as two columns are the same after removing everything after the dash, for which the easiest way is

regexp_replace(games.away, '-.*', '')

which matches a dash and everything following with nothing (effectively deleting it)

Putting that into the query:

SELECT distinct games.* 
FROM games
JOIN games g2 on regexp_replace(games.away, '-.*', '') = regexp_replace(g2.away, '-.*', '')
  AND g2.id != games.id
JOIN games_players ON games.id = games_players.game_id
WHERE games.status = 'closed' 
AND games_players.player_id = $1 
ORDER BY scheduled DESC 
LIMIT 5

BTW those slash characters in your regex match the character /, which is not what you want. Some languages (eg JavaScript) use the slash to delimit regexes, but slashes have no special meaning in regex.

  • Related