I am trying to implement the social network problem from the Graph Databases book, but in SQL (using PostgreSQL). I have these 2 tables, that will define a person and their friendship with someone else.
CREATE TABLE IF NOT EXISTS Person(
ID INTEGER PRIMARY KEY,
Person VARCHAR(200)
);
CREATE TABLE IF NOT EXISTS PersonFriend(
PersonID INTEGER,
FriendID INTEGER,
PRIMARY KEY (PersonID, FriendID)
);
What I am looking for is a query that will return all the friends, friends of friends, friends of friends of friends, etc, up to the 5th level, of a person.
I've attached a graph example that I made in Neo4j that should help the visualization. I solved it easily using Cypher, but I'm having some trouble to do the equivalent in SQL. Using Alice as reference, my query should return Olivia, Zach, Bob, Mary, Caitlyn, Violet and Zerus, as they're her (direct and indirect) friends up to the 5th level.
The book includes this snippet of code, but it only goes up to the 2nd level (and does not return the first one).
SELECT p1.Person AS PERSON, p2.Person AS FRIEND_OF_FRIEND
FROM PersonFriend pf1 JOIN Person p1
ON pf1.PersonID = p1.ID
JOIN PersonFriend pf2
ON pf2.PersonID = pf1.FriendID
JOIN Person p2
ON pf2.FriendID = p2.ID
WHERE p1.Person = 'Alice' AND pf2.FriendID <> p1.ID
I would very much appreciate any suggestions to solve this problem.
CodePudding user response:
Recursive queries are the easiest way to handle this:
WITH RECURSIVE FriendCTE (PersonID, depth)
AS (
SELECT 1, 0 -- 1 being the ID of Alice
UNION
SELECT pf.FriendID, depth 1
FROM FriendCTE cte
JOIN PersonFriend pf ON (cte.PersonID = pf.PersonID)
WHERE depth < 5
)
SELECT PersonID, Person
FROM FriendCTE cte
JOIN Person p ON (p.ID = cte.PersonID)
WHERE depth > 0
GROUP BY PersonID, Person; -- to deal with loops
PersonID | Person
---------- ---------
2 | Olivia
3 | Zach
4 | Bob
5 | Mary
6 | Caitlyn
7 | Violet
8 | Zerus
(7 rows)
Sidenote, you should be using the text
type and not VARCHAR(200)
.