We learned that we need to apply the noindex tag and request a re-crawl BEFORE blocking a page in robots.txt and removing it from the sitemap. (Googlebot can't see a noindex directive on a page it is blocked from crawling, so blocking first leaves the URL stuck in the index.)

Because we have historically done everything at once, or done one step but not the other, we've caused these Search Console errors:

  • Indexed, though blocked by robots.txt > we blocked the page in robots.txt before noindexing it, so Google can't crawl the page to see the noindex tag and keeps the URL in the index
  • Submitted URL blocked by robots.txt > the sitemap still submits the URL even though robots.txt blocks it, again because we blocked before noindexing and removing it from the sitemap

We updated the excluded articles so that they are removed from the sitemap and carry the noindex robots tag. All errors have been submitted for validation in Search Console.
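
Before the next validation pass, a quick audit script can catch the bad combination described above. The following is a minimal sketch (Python standard library only, not our actual tooling, and the URL is a placeholder): it flags URLs that robots.txt blocks before a crawler has had a chance to see a noindex directive.

```python
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse


def is_blocked_by_robots(url: str, user_agent: str = "Googlebot") -> bool:
    """True if the site's robots.txt disallows crawling this URL for the given user agent."""
    parts = urlparse(url)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()
    return not rp.can_fetch(user_agent, url)


def serves_noindex(url: str) -> bool:
    """Crude check: True if the response has noindex in the X-Robots-Tag header
    or what looks like a robots meta tag containing noindex in the HTML."""
    req = urllib.request.Request(url, headers={"User-Agent": "noindex-audit-sketch"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        header = resp.headers.get("X-Robots-Tag", "")
        body = resp.read(200_000).decode("utf-8", errors="ignore").lower()
    return "noindex" in header.lower() or ('name="robots"' in body and "noindex" in body)


if __name__ == "__main__":
    # Placeholder URL; swap in the excluded articles being validated.
    for url in ["https://example.com/excluded-article"]:
        blocked = is_blocked_by_robots(url)
        noindexed = serves_noindex(url)
        if blocked and not noindexed:
            # The state that produced our errors: Google can't crawl the page,
            # so it never sees a noindex tag and may keep the URL indexed.
            print(f"{url}: blocked by robots.txt but not noindexed")
        elif blocked and noindexed:
            print(f"{url}: serving noindex, but robots.txt still hides it from crawlers")
        elif noindexed:
            print(f"{url}: crawlable and serving noindex (safe to block/remove later)")
        else:
            print(f"{url}: crawlable, no noindex found")
```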