"The page you requested was removed" with 410 and redirect

Time:01-03

I have the following requirement:

  • A user arrives at a job page on our customer's website, but the job has already been taken, so the page no longer exists.
  • The user should NOT get a 404 but a 410 (Gone), and should then be redirected to a job overview page that says the job is no longer available and lists the available jobs.
  • Instead of a 302 (temporarily moved) or a 404 (the current behavior), Google should get a 410 (Gone) status to indicate that the page is permanently unavailable.
  • The old URL should therefore be removed from the index, and the new URL should not be treated as its replacement.

So how can I redirect the user with a 410 status? If I try something like this:

string overviewUrl = _urlContentResolver.GetAbsoluteUrl(overviewPage.ContentLink);
HttpContext context = _httpContextResolver.GetCurrent();
context.Response.Clear();
context.Response.Redirect(overviewUrl, false);
context.Response.StatusCode = 410;
context.Response.TrySkipIisCustomErrors = true;
context.Response.End();

I get a static error page in Chrome with nothing but:

The page you requested was removed

The status code is correct (410) and the Location header is set correctly, but no redirect happens.

If I use Redirect and set the status first:

context.Response.StatusCode = 410;
context.Response.Redirect(overviewUrl, true); // true => endResponse

the redirect happens, but I get a 302 instead of the desired 410.

Is this possible at all? If so, how?

CodePudding user response:

I think you're trying to bend the rules of HTTP. The documentation states:

The HyperText Transfer Protocol (HTTP) 410 Gone client error response code indicates that access to the target resource is no longer available at the origin server and that this condition is likely to be permanent.

If you don't know whether this condition is temporary or permanent, a 404 status code should be used instead.

In your situation, either 404 or 410 seems to be the right status code, but 410 says nothing about redirection as behavior that browsers should implement. Browsers follow a Location header only on 3xx responses, so you have to assume a redirect is not going to work.

Now, to the philosophically right way to implement your way out of this...

With your stated requirements, "taken" does not mean the resource is gone. It means it exists for the client that claimed it. So, do you 302 Redirect a different client to something else that might be considered correct? You implemented that, and it seems like the right way to do it.

That said, I don't know if you "own" the behavior across the client and server to change the requirements to this approach. Looking at it from the "not found" angle, a 404 also seems reasonable. It's not found because "someone" already has the resource.

In short, if your requirements are set in stone, they may be in opposition to the HTTP spec. If you still must have a 410, you would need to change the behavior on the client side somehow. If the client is JavaScript, it would need to expect a 410 from the server with a helpful payload that it interprets to do something else (e.g. a simulated redirect).

If you don't "own" the client code... well that's a different problem.

There's a short blog post by Tommy Griffith that backs up what I am saying. It says, in part:

The “Gone” error response code means that the page is truly gone—it’s no longer available on the origin server and no redirect was set up.

Sometimes, webmasters want to be very explicit to Google and other search engines that a page is gone. This is a much more direct signal to Google that a page is truly gone and never coming back. It's slightly more direct than a 404.

So, is it possible? Yes, but you're going to need to "fake" it by changing both client and server code.
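
One way to sketch such a "faked" redirect, assuming the client is an ordinary browser, is to send the 410 together with a small HTML body that shows the message and contains a meta refresh pointing at the overview page. The browser renders the body of an error response like any other page, so the user still ends up on the overview, while the status line stays 410. `GonePage` and `BuildBody` are hypothetical names that do not appear in the question, and how a particular crawler treats a meta refresh on an error response is not something established here.

```csharp
using System.Net;

// Hypothetical helper, not part of the original question.
public static class GonePage
{
    // Builds the HTML body that would be sent together with a 410 status.
    // delaySeconds is the meta refresh delay before the browser navigates.
    public static string BuildBody(string overviewUrl, int delaySeconds)
    {
        // Encode the URL so it cannot break out of the attribute values.
        string safeUrl = WebUtility.HtmlEncode(overviewUrl);

        return "<!DOCTYPE html>\n"
            + "<html><head>\n"
            + "<meta charset=\"utf-8\">\n"
            + "<meta http-equiv=\"refresh\" content=\"" + delaySeconds + ";url=" + safeUrl + "\">\n"
            + "<title>Job no longer available</title>\n"
            + "</head><body>\n"
            + "<p>This job is no longer available. "
            + "<a href=\"" + safeUrl + "\">See all open jobs</a>.</p>\n"
            + "</body></html>";
    }
}
```

In the handler from the question, this would presumably replace the call to Response.Redirect: set StatusCode to 410, keep TrySkipIisCustomErrors = true so IIS does not substitute its own static error page, set ContentType to "text/html", and write the result of GonePage.BuildBody(overviewUrl, 0) to the response.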

CodePudding user response:

I will accept Kit's answer since he's right in general, but maybe I overcomplicated my requirement a bit, so I want to share my solution:

What did I actually want?

  1. Give crawlers a 410 so that the taken job page is delisted from search engine indexes.
  2. Give the user a better experience than a 404: redirect them to a job overview where they can find similar jobs and see a message.

These are two separate requirements for two separate kinds of user, so I could simply provide one solution for crawlers and one for "normal" users.

In case someone needs something similar, I can provide more details. For now, just a snippet:

if (HttpContext.Current.IsInSearchBotMode())
{
    Deliver410ForSearchBots(HttpContext.Current);
}
else
{
    // redirect(301) to job-overview, omitting details
}

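private void Deliver410ForSearchBots(HttpContext context)
{
    context.Response.Clear();
    context.Response.StatusCode = 410;
    // Reason phrase only; the numeric code is already sent via StatusCode.
    context.Response.StatusDescription = "Job taken";
    // Keep IIS from replacing the response with its own error page.
    context.Response.TrySkipIisCustomErrors = true;
    context.Response.End();
}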

public static bool IsInSearchBotMode(this HttpContext context)
{
    ISearchBotConfiguration configuration = ServiceLocator.Current.GetInstance<ISearchBotConfiguration>();
    string userAgent = context.Request?.UserAgent;
    return !(string.IsNullOrEmpty(userAgent) || configuration.UserAgents == null)
        && configuration.UserAgents.Any(bot => userAgent!.IndexOf(bot, StringComparison.InvariantCultureIgnoreCase) >= 0);
}

These are the user agents I used for crawler detection:

<add key="SearchBot.UserAgents" value="Googlebot;Googlebot-Image;Googlebot-News;APIs-Google;AdsBot-Google;AdsBot-Google-Mobile;AdsBot-Google-Mobile-Apps;DuplexWeb-Google;Google-Site-Verification;Googlebot-Video;Google-Read-Aloud;googleweblight;Mediapartners-Google;Storebot-Google;LinkedInBot;bitlybot;SiteAuditBot;FacebookBot;YandexBot;DataForSeoBot;SiteCheck-sitecrawl;MJ12bot;PetalBot;Yeti;SemrushBot;Roboter;Bingbot;AltaVista;Yahoobot;YahooCrawler;Slurp;MSNbot;Lycos;AskJeaves;IBMResearchWebCrawler;BaiduSpider;facebookexternalhit;XING-contenttabreceiver;Twitterbot;TweetmemeBot" />
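
The semicolon-separated value above still has to be split into the individual tokens that IsInSearchBotMode compares against. The ISearchBotConfiguration implementation is not shown in the post, so the following is only a sketch of how the parsing and matching might look. `SearchBotMatcher`, `ParseUserAgents`, and `IsSearchBot` are hypothetical names, and the sketch uses an ordinal case-insensitive comparison rather than the culture-based one in the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; the original ISearchBotConfiguration is not shown in the post.
public static class SearchBotMatcher
{
    // Splits a setting such as "Googlebot;Bingbot" into trimmed, non-empty tokens.
    public static IReadOnlyList<string> ParseUserAgents(string settingValue)
    {
        if (string.IsNullOrWhiteSpace(settingValue))
            return Array.Empty<string>();

        return settingValue
            .Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(token => token.Trim())
            .Where(token => token.Length > 0)
            .ToArray();
    }

    // True if the User-Agent header contains any of the bot tokens, ignoring case.
    public static bool IsSearchBot(string userAgent, IEnumerable<string> botTokens)
    {
        if (string.IsNullOrEmpty(userAgent) || botTokens == null)
            return false;

        return botTokens.Any(bot =>
            userAgent.IndexOf(bot, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

Called with the tokens parsed from "Googlebot;Bingbot;YandexBot", IsSearchBot matches a header such as "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" and rejects an ordinary browser header that contains none of the tokens. Substring matching on the User-Agent is a heuristic: the header is client-supplied, so a client can send any value it likes.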