Cypher: slow query optimization-CodePudding

I am using redisgraph with a custom implementation of ioredis. The query runs 3 to 6 seconds on a database that has millions of nodes. It basically filters (b:brand) by different relationship counts by adding the following match and where multiple times on different nodes.

(:brand) - 1mil nodes
(:w) - 20mil nodes
(:e) - 10mil nodes

// matching b before this codeblock
MATCH (b)-[:r1]->(p:p)<-[:r2]-(w:w)
WHERE w.deleted IS NULL
WITH count(DISTINCT w) as count, b
WHERE  count >= 0   AND count <= 10

The full query would look like this.

MATCH (b:brand)
WHERE  b.deleted IS NULL
MATCH (b)-[:r1]->(p:p)<-[:r2]-(w:w)
WHERE w.deleted IS NULL
WITH count(DISTINCT w) as count, b
WHERE  count >= 0   AND count <= 10
MATCH (c)-[:r3]->(d:d)<-[:r4]-(e:e)
WHERE e.deleted IS NULL
WITH count(DISTINCT e) as count, b
WHERE  count >= 0   AND count <= 10
WITH b ORDER by b.name asc
WITH count(b) as totalCount, collect({id: b.id)[$cursor..($cursor $limit)] AS brands
RETURN brands, totalCount

How can I optimize this query as it's really slow?

CodePudding user response：

A few thoughts:

Property lookups are expensive; is there a way you can get around all the .deleted checks?
If possible, can you avoid naming r1, r2, etc.? It's faster when it doesn't have to check the relationship type.
You're essentially traversing the entire graph several times. If the paths b-->p<--w and c-->d<--e don't overlap, you can include them both in the match statement, separated by a comma, and aggregate both counts at once
I don't know if it'll help much, but you don't need to name p and d since you never refer to them
This is a very small improvement, but I don't see a reason to check count >= 0

Also, I'm sure you have your reasons, but why does the c-->d<--e path matter? This would make more sense to me if it were b-->d<--e to mirror the first portion.