Search engines like Google and Bing provide free tools that let a webmaster monitor, submit, and withdraw/remove pages from search listings. As Google handles the largest volume of search queries and traffic, we will show how to use its tool, but Bing has similar features and processes.
After running through these instructions, you will:
- Have your web property added to Google Search Console.
- Know how to allow another person to access the same reporting through their own Google Search Console account.
- Know the options and processes for removing content from search engines (primarily Google).
- Know how to prevent pages from being indexed.
We will not cover SEO topics or how to improve your search ranking.
Getting Started with Google Search Console
Sign up for Google Search Console.
Add the property that you are interested in monitoring. If you are not asked to add one when you first sign in, use the menu at the top left, which lets you search your registered properties and add new ones ("+ Add Property").
Select Property Type. Initially, I recommend using "URL prefix" rather than the new "Domain" method, because it allows more granularity if you wish to grant/share access, and also offers more verification options.
Verify the property using the option that you prefer. From experience, I have found that the DNS option (adding a specified TXT entry) is often the fastest, provided you have access to your domain's DNS settings.
Once you have been granted access, you can explore the options in the left-hand menu. If this is the first time this domain has been added to Google Search Console, you may have to wait 24 hours before the reports are populated. (There is also a Python API to access some of this data.)
Removing URLs
Why would I want to remove a URL from search indexes? A common reason is that the page is for an older product, but Google is giving it too much emphasis in search results compared with newer content. You aren’t ready to remove the pages from the site and you still want them accessible to existing users, but you want new people to discover the new product when they search. If the old product was popular, it may continue to outrank the newer one, and possibly even get listed in the sitelinks section of the site’s search results. The only real option here is removal from the search index.
The "Removals" option in the left menu allows you to quickly remove indexed pages temporarily, or flag outdated content. Whilst this sounds useful, in many cases it is only a short-term remedy, because the removal is temporary and Google can index the page again when it is re-crawled.
The long-term method is to add meta tags to the pages to give search engines a strong hint. Good search engines will obey, and the page will be removed in time (this may take up to 14 days, depending on the search engine's refresh cycle).
The two main directives for the robots meta tag are noindex and nofollow.
noindex - Do not show this page, media, or resource in search results
nofollow - Do not follow the links on this page. If you don’t specify this directive, Google may use the links on the page to discover those linked pages.
To prevent (well-behaved) search engine web crawlers from indexing a page on your site, add the tags inside the <head>...</head> of the page(s) you wish to remove from the index.
<meta name="robots" content="noindex">
<meta name="robots" content="nofollow">
or combine them
<meta name="robots" content="noindex,nofollow">
If you want to do this for all pages on a particular site, you could add this to your page template.
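Once the tags are in a template, it can be handy to confirm that a rendered page really carries them. The following is a minimal sketch, using only Python's standard library, that collects the directives from any robots meta tags in an HTML string. The class and function names (RobotsMetaParser, robots_directives) and the sample page are made up for illustration and are not part of any Google tooling.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # A content value may hold several comma-separated directives.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html_text):
    """Return the set of robots directives found in the given HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Hypothetical page used only to demonstrate the helper.
page = '<html><head><meta name="robots" content="noindex,nofollow"></head></html>'
print(sorted(robots_directives(page)))  # ['nofollow', 'noindex']
```

This only inspects HTML you hand to it; fetching the live page, and handling pages that build their head with JavaScript, is left out of the sketch.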
Other Methods - Robots.txt
Another method to control crawler access, without altering the pages, is to use a robots.txt file. However, it is the wrong mechanism for removing existing indexed pages. As Google says:
A robots.txt file tells search engine crawlers which URLs the crawler can access on your site. This is used mainly to avoid overloading your site with requests; it is not a mechanism for keeping a web page out of Google. To keep a web page out of Google, block indexing with noindex or password-protect the page.
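To make the distinction concrete, here is a small sketch using Python's standard urllib.robotparser module to evaluate a robots.txt rule the way a well-behaved crawler would. The robots.txt content, the example.com URLs, and the build_parser helper are assumptions made up for this illustration. Note that the answer is about whether a crawler may fetch a URL, not about whether the URL appears in search results.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /old-product/
"""


def build_parser(robots_txt):
    """Parse robots.txt text (no network access) and return the parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The rule blocks crawling of the old product pages...
print(parser.can_fetch("Googlebot", "https://example.com/old-product/index.html"))  # False
# ...and leaves everything else crawlable.
print(parser.can_fetch("Googlebot", "https://example.com/new-product/index.html"))  # True
```

A crawler that is blocked from fetching a page also cannot read a noindex tag on that page, which is one reason the meta tag approach and a robots.txt block do not combine well for removal.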
Other Methods - Site Map
Sitemaps are a great way to allow search engines to discover all of your pages, and to hint at the emphasis for each page you list. However, they are not effective for removing pages from the search index. You can find out more about sitemaps here: [What is a Sitemap?](https://developers.google.com/search/docs/advanced/sitemaps/overview).
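For readers who have not seen one, the sketch below builds a tiny sitemap with Python's standard library. It assumes the sitemaps.org XML format, where each url entry has a loc and an optional priority between 0.0 and 1.0; the build_sitemap function and the example.com URLs are hypothetical. The priority value is only a hint, and search engines are free to ignore it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from an iterable of (url, priority) tuples."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # Priority is written with one decimal place, e.g. "0.1".
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: emphasise the new product over the old one.
print(build_sitemap([
    ("https://example.com/new-product/", 1.0),
    ("https://example.com/old-product/", 0.1),
]))
```

Leaving the old page out of the sitemap, or giving it a low priority as here, does not remove it from the index; it only changes what you are suggesting to the crawler.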
Reference: Remove a page hosted on your site from Google