I've developed a site with a basic search function. The search accepts input via GET parameters, and the site is hosted on a shared hosting server, so there are limits on SQL execution.
My goal is to stop (or at least reduce the chance of) processing automated search queries, so the site does not hit the SQL limit because of bogus searches where there is no real user. To prevent this I've added a CSRF token to the landing page from which the search is initiated.
What else can I try to make sure the search is performed only for real users and not for automated/bot searches? I've considered CAPTCHAs, but asking users to solve a CAPTCHA for every search query would make the experience much worse.
CodePudding user response:
Welcome to the eternal dichotomy between usability and security. ;)
Many of the measures used to detect and block bots also impact usability (such as the extra steps required by opaque CAPTCHAs), and none of them solves the bot problem 100% either (think CAPTCHA farms).
The trick is to use a reasonable mix of controls, and to impact the user experience as little as possible.
A good combination that I have used with success to protect high-cost functions on high-volume sites is:
CSRF: this is a good basic measure in itself to stop blind script submission, but it won't slow down or stop a sophisticated attacker at all (this and the next two items are sketched in code after the list);
response caching: search tends to get used for the same things repeatedly, so by caching the answers to common searches you avoid making the SQL request altogether (which saves the resource consumption and also speeds things up for regular users);
source throttling: track the source IP and restrict the number of operations within a reasonable window (not rejecting requests over the limit, just queueing them and so throttling the volume to a reasonable level); and
transparent captcha: something like Google's reCAPTCHA v3, which runs invisibly and scores each request, will help you drop a lot of the automated requests without impacting the user experience (see the verification sketch below).
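To make the first three items concrete, here is a minimal sketch in Python/Flask (presumably not your stack on a shared host, but the shape is the same): a per-session CSRF token on the landing page, a sliding-window per-IP throttle, and a small result cache in front of the SQL query. run_search(), the limits, and the TTL are placeholder assumptions, and the in-memory dictionaries only work for a single process; on shared hosting you would back them with something like a cache table or Redis/APCu.

```python
# Sketch only: CSRF check, per-IP throttling, and response caching in front of
# a placeholder search function. Numbers and names are assumptions, not advice.
import secrets
import time
from collections import defaultdict, deque

from flask import Flask, abort, jsonify, request, session

app = Flask(__name__)
app.secret_key = "replace-me"     # needed for Flask sessions (placeholder)

CACHE_TTL = 300                   # seconds a cached result stays valid
MAX_REQUESTS, WINDOW = 10, 60     # per-IP budget within a sliding window

_cache = {}                       # normalized query -> (timestamp, results)
_hits = defaultdict(deque)        # client IP -> recent request timestamps


def run_search(query):
    """Placeholder for the real, SQL-backed search."""
    return [f"result for {query}"]


@app.route("/")
def landing():
    # Embed a per-session CSRF token in the search form (real template omitted).
    token = secrets.token_urlsafe(32)
    session["csrf"] = token
    return (f'<form action="/search"><input type="hidden" name="csrf" value="{token}">'
            f'<input name="q"><button>Search</button></form>')


@app.route("/search")
def search():
    # 1. CSRF: blind scripted GETs without a valid token stop here.
    sent = request.args.get("csrf", "")
    if not sent or not secrets.compare_digest(sent, session.get("csrf", "")):
        abort(403)

    query = request.args.get("q", "").strip().lower()
    if not query:
        abort(400)

    # 2. Source throttling: count requests per IP in a sliding window.
    now, hits = time.time(), _hits[request.remote_addr]
    while hits and now - hits[0] > WINDOW:
        hits.popleft()
    if len(hits) >= MAX_REQUESTS:
        abort(429)   # the answer above prefers queueing; rejecting is the simple stand-in
    hits.append(now)

    # 3. Response caching: skip SQL entirely for recent identical queries.
    cached = _cache.get(query)
    if cached and now - cached[0] < CACHE_TTL:
        return jsonify(cached[1])

    results = run_search(query)
    _cache[query] = (now, results)
    return jsonify(results)
```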
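For the transparent-CAPTCHA item, the page calls grecaptcha.execute() to obtain a token and sends it along with each search; the server then verifies the token against Google's siteverify endpoint before running the query. A sketch of that verification step, where RECAPTCHA_SECRET and the 0.5 score threshold are assumptions you would tune:

```python
# Sketch of server-side verification of a reCAPTCHA v3 token.
import requests

RECAPTCHA_SECRET = "your-secret-key"   # placeholder
SCORE_THRESHOLD = 0.5                  # tune based on observed traffic


def is_probably_human(token, client_ip=None):
    """Return True if Google scores the request at or above the threshold."""
    payload = {"secret": RECAPTCHA_SECRET, "response": token}
    if client_ip:
        payload["remoteip"] = client_ip
    resp = requests.post(
        "https://www.google.com/recaptcha/api/siteverify",
        data=payload,
        timeout=5,
    )
    data = resp.json()
    return data.get("success", False) and data.get("score", 0.0) >= SCORE_THRESHOLD
```

Requests that fail the check don't have to be rejected outright; they can be routed to the throttled/queued path or to a visible challenge.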
CodePudding user response:
You could also look at developing your search function to search an XML export of the data instead of querying the database. Searches would then not count against the SQL execution limits, so you could search as many times as you like.
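A minimal sketch of that idea, assuming the data has been exported to a flat products.xml file whose structure (item elements with a name attribute) is invented here for illustration:

```python
# Sketch: search a flat XML export instead of querying the database.
import xml.etree.ElementTree as ET


def search_xml(query, path="products.xml"):
    """Return names of items whose name contains the query (case-insensitive)."""
    query = query.lower()
    tree = ET.parse(path)
    return [
        item.get("name")
        for item in tree.getroot().iter("item")
        if query in (item.get("name") or "").lower()
    ]


print(search_xml("widget"))
```

The export would need to be regenerated whenever the underlying data changes.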