Google tools update: Want to remove content from Google’s index? No problem!

It looks like Vanessa Fox and the Google team have listened to the suggestions and comments made by Rand Fishkin (several of the questions came from SEOmozzers) during an interview a few days ago. Deleting URLs from the Google index was one of the topics they discussed, and Google has now updated its tools to help site owners remove content from its search engine.

In a new post on the Official Google Webmaster Central Blog, Vanessa Fox writes that, as a site owner, you control which content on your site is indexed by search engines. She goes on to say that “The easiest way to let search engines know what content you don’t want indexed is to use a robots.txt file or robots meta tag. But sometimes, you want to remove content that’s already been indexed. What’s the best way to do that?”

Well, it always depends on the type of content that you want to remove.

The Google webmaster help center provides detailed information about each situation. Once Google recrawls the page, it will remove the content from its index automatically.

This is great, but if you cannot wait for the next crawl, and would like to expedite the removal, then the way to do that has just gotten easier.

If you have a verified site in Google webmaster tools, you will now see a new option under the Diagnostic tab called URL Removals. To get going, simply click the URL Removals link, then New Removal Request. Choose the option that matches the type of removal you’d like.

These options allow you to remove the following:

  1. Individual URLs (a specific URL or image that is listed in Google)
  2. Directories (all files and folders within a directory on your site)
  3. Entire site (your entire site from the Google index, if you ever wanted to do that)
  4. Cached copies (cached copies of pages from the Google index)
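  1. How to remove individual URLs
    In order for the URL to be eligible for removal, one of the following must be true:

      • The URL returns a 404 or 410 status code.
      • The URL is blocked by the site’s robots.txt file.
      • The URL is blocked by a robots meta tag.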

    Once the URL is ready for removal, enter the URL and indicate whether it appears in the web search results or image search results. Click Add. You are able to add up to 100 URLs in a single request. Once you’ve added all the URLs you would like removed, click Submit Removal Request.

  2. Removing a directory
    This will remove all the files and folders within a directory on your site. For example, if you request removal of http://www.mydomain.com/myfolder, this will remove all URLs that begin with that path, such as:

    http://www.mydomain.com/myfolder
    http://www.mydomain.com/myfolder/mypage.html
    http://www.mydomain.com/myfolder/images/myimage.jpg

    In order for the whole directory to be eligible for removal, you must block it using a robots.txt file. In other words, using the example above, http://www.mydomain.com/robots.txt should include the following:

    User-agent: Googlebot
    Disallow: /myfolder

  3. Removing your entire site
    I’ve never seen a reason to remove an entire site from the Google index, but if you ever want to, here’s how. Choosing this option will remove all subdirectories and files. To use this option, you must block the site using a robots.txt file. Do not use this option to keep the non-preferred version of your site’s URLs from being indexed. For instance, if you want all of your URLs indexed using the www version, don’t use this tool to request removal of the non-www version. Instead, specify the version you want indexed using the Preferred domain tool (and do a 301 redirect to the preferred version, if possible).
  4. Removing cached copies
    Choose this option to remove cached copies of pages in Google’s index. You have two options for making pages eligible for cache removal.

    Using a meta noarchive tag and requesting expedited removal
    If you don’t want the page cached at all, you can add a meta noarchive tag to the page and then request expedited cache removal using this tool. When you request removal using this tool, Google removes the cached copy right away, and because of the meta noarchive tag, it will not cache the page again. (If you change your mind later, you can remove the meta noarchive tag.)

    Changing the page content
    If you want to remove the cached version of a page because it contained content that you’ve removed and don’t want indexed, you can request the cache removal here. Google checks to see that the content on the live page is different from the cached version and if so, will remove the cached version.
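Before submitting a request, it can help to check locally that your blocks are really in place. Below is a minimal sketch in Python, using only the standard library, that tests the robots.txt rules from the directory example and looks for a noarchive directive in a page’s robots meta tag. The function names, the sample HTML, and the exact form of the meta tag are my own assumptions for illustration. They are not part of Google’s tool, and Python’s parser may not match Googlebot’s behaviour in every edge case.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# robots.txt rules copied from the directory example in this post.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /myfolder
"""


def blocked_for_googlebot(robots_txt: str, url: str) -> bool:
    """Return True if the robots.txt text disallows Googlebot from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch("Googlebot", url)


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> or <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noarchive(html: str) -> bool:
    """Return True if the page carries a noarchive directive in a robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


if __name__ == "__main__":
    # Hypothetical URLs and page markup, used only to exercise the checks.
    for url in (
        "http://www.mydomain.com/myfolder/mypage.html",
        "http://www.mydomain.com/otherfolder/page.html",
    ):
        state = "blocked" if blocked_for_googlebot(ROBOTS_TXT, url) else "allowed"
        print(url, state)

    sample_page = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print("noarchive present:", has_noarchive(sample_page))
```

Keep in mind that Disallow rules are matched as path prefixes, so in this parser /myfolder also covers paths such as /myfolder2. Add a trailing slash if you only mean the directory itself.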

To reinclude content
If a request is successful, it appears in the Removed Content tab, and you can reinclude the content at any time simply by removing the robots.txt or robots meta tag block and clicking Reinclude. Otherwise, Google will exclude the content for six months. After that six-month period, if Google has recrawled the page and the content is still blocked or returns a 404 or 410 status code, it won’t be reincluded in the index. However, if the page is available to Google’s crawlers after this six-month period, it will once again be included in the index.

Requesting removal of content you don’t own
What if you want to remove content that’s located on a site that you don’t own? The new Webpage removal request tool steps through the process for each type of removal request.

It is important to note that since Google indexes the web and doesn’t control the content on web pages, it generally can’t remove results from the index unless the webmaster has blocked or modified the content or removed the page. If you would like content removed, you have to work with the site owner to do so, and then use this tool to expedite the removal from Google’s search results.

Removal of personal information
If you have found search results that contain specific types of personal information, you can request removal even if you’ve been unable to work with the site owner. For this type of removal, provide your email address to work directly with Google.

To check the status of removal requests
Removal requests show as pending until they have been processed, at which point, the status changes to either Denied or Removed. Generally, a request is denied if it doesn’t meet the eligibility criteria for removal.

The Webpage removal request tool also lets you check on the status of pending requests. As with the version available in webmaster tools, the status will change to Removed or Denied once the request has been processed. For requests that involve personal information, you won’t see the status here, but will instead receive an email with more information about next steps.

3 Responses to “Google tools update: Want to remove content from Google’s index? No problem!”

    1. […] www.mantaseosolutions.com/blog.....em/… […]

    2. Marleen Walmsley on 13 Jun 2007 at 1:27 am

      I already submitted this request. (?)

      Am so grateful for your help because I am being inundated by telemarketers. And creeps wanting a date - I’m scared because they have my contact info.

      Thanks in advance for your genius. I just found out how to contact Google to accomplish this.

      Marleen Walmsley

    3. Melt du Plooy on 13 Jun 2007 at 10:51 am

      Marleen, Google provides all the tools and how-tos to do it. I have no control over how and when Google does the removal. Best is to follow their guidelines.
