I have more than 10 million rows. Counting the number of rows is too slow. I want to estimate the number of rows.
Current query is like this:
SELECT count(*) FROM `database` WHERE `location` LIKE '%New York%'
This gives me the exact number, but it's too slow. What I want is something like this:
ESTIMATE count(*) FROM `database` WHERE `location` LIKE '%New York%'
Maybe this can be done by limiting the query to a subset of the table (every nth row or something). Maybe there is a built-in function that can be used.
Edit: Perhaps I can limit the query to every nth row, and then multiply the count by n.
Edit2: I just tried counting every 10th row and then multiplying the count by ten, but there was no increase in speed, even though the auto-incrementing `counter` column is the primary key.
SELECT count(*) FROM `database` WHERE `location` LIKE '%Los Angeles%' AND `counter` % 10 = 0
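For reference, the full sampled estimate would look something like this (a sketch, assuming `counter` is a dense auto-increment with few gaps):

```sql
-- Count every 10th row by the auto-increment key, then scale up.
-- This is only an approximation, and it is no faster here because the
-- LIKE '%...%' predicate still has to examine every row anyway.
SELECT COUNT(*) * 10 AS estimated_rows
FROM `database`
WHERE `location` LIKE '%Los Angeles%'
  AND `counter` % 10 = 0;
```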
CodePudding user response:
The performance of COUNT(*) is not the biggest performance problem in this case, and it cannot be improved, because your search condition LIKE '%...%'
is bound to cause a table-scan: the pattern starts with a wildcard, so no index can be used, and every row in the table must be examined to check whether it matches your condition.
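One way to confirm the table-scan (a sketch, assuming MySQL; table and column names taken from the question):

```sql
EXPLAIN SELECT COUNT(*)
FROM `database`
WHERE `location` LIKE '%New York%';
-- A "type: ALL" access method with "rows" close to the table size
-- means a full scan; the leading wildcard rules out any index on location.
```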
Alternatives to make it quicker:

- Match the pattern using fulltext indexing instead of LIKE predicates with wildcards
- Store the count and keep it updated as you add or change data, as the comment above suggests (but you'd need a separate counter for each possible search pattern)
- Get a more powerful computer
- Store less data
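The fulltext option might be sketched like this (assuming MySQL with InnoDB; note that MATCH ... AGAINST does word-based matching, not arbitrary substring matching, so it is not an exact drop-in for LIKE '%...%'):

```sql
-- One-time setup: a FULLTEXT index on the searched column.
ALTER TABLE `database` ADD FULLTEXT INDEX ft_location (`location`);

-- Phrase search uses the fulltext index instead of scanning every row.
SELECT COUNT(*)
FROM `database`
WHERE MATCH(`location`) AGAINST('"New York"' IN BOOLEAN MODE);
```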
CodePudding user response:
This would run immensely faster if you have INDEX(city):

WHERE `city` = 'New York'

This would run nearly as fast as that:

WHERE `city` LIKE 'New York%'
If you currently have location as a combination of address, city, state, zip, then you may need to rethink the choice of having location in a single column. Or go with a FULLTEXT index, as Karwin suggests.
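If restructuring is an option, a split-column layout might look like this (a sketch; the `city` column and index name are hypothetical):

```sql
-- Hypothetical restructuring: one column per address part,
-- with an index on the part you actually search by.
ALTER TABLE `database`
  ADD COLUMN `city` VARCHAR(100),
  ADD INDEX idx_city (`city`);

-- Equality and prefix searches can then use the index:
SELECT COUNT(*) FROM `database` WHERE `city` = 'New York';
SELECT COUNT(*) FROM `database` WHERE `city` LIKE 'New York%';
```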