Google Not Following Own Best Practices for Own Web Site

We all have issues with our web sites at one time or another. Certainly, when designing, programming, developing, and optimizing our web sites, we try to make sure that we follow the industry’s best practices. But that doesn’t always happen, even if you are Google. I have uncovered a few instances where Google is not following its own guidelines when it comes to the industry’s best practices.

One of the more serious, yet basic, optimization issues that I found is that Google is creating all sorts of duplicate content on their very own web site by linking internally to more than one copy of their directory home pages. And Google is not just doing this a few times: Google is doing this over 26,000 times! Take a look at all of the duplicate pages of content that Google has on their web site:

Recently, while reading Google’s own Webmaster Guidelines, I read the Duplicate Content section. What caught my eye specifically was the 2nd bullet point on that page:

Be consistent: Try to keep your internal linking consistent. For example, don’t link to http://www.example.com/page/ and http://www.example.com/page and http://www.example.com/page/index.htm.

It appears to me that Google is not following their own guidelines when it comes to duplicate content. In fact, Google has an extra 26,000 pages on their web site that should NOT be there: they’re duplicates of other pages on the site.

Here’s an example where Google is not being consistent in their internal linking. If you go to http://www.google.com/agencyland/ you will notice that the logo at the top of that page links to a duplicate page, http://www.google.com/agencyland/index.html. That is a classic no-no when it comes to search engine optimization best practices. Never link to index.html, default.html, index.htm, or default.asp pages. They are always going to be duplicates of what’s on the home page of that directory, and if you ever change the CMS on your site or go with another web server (e.g., you move from Apache to a Windows server), you will be left with a bunch of URLs that you will have to redirect. I always recommend linking internally on your web site to the directory URLs (e.g., http://www.domain.com/directory/) rather than to specific files.
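To make that recommendation concrete, here is a minimal sketch in Python of how you might rewrite an internal link so that it points at the directory URL instead of the default file. The function name to_directory_url and the DEFAULT_DOCUMENTS list are my own assumptions for illustration; the list simply mirrors the file names mentioned above, and your server may use others.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of default document names (taken from the examples in this post).
# Adjust it to match whatever your own web server treats as a directory index.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "default.html", "default.asp"}


def to_directory_url(url):
    """Return url with a trailing default document name stripped off.

    URLs that do not end in one of DEFAULT_DOCUMENTS are returned unchanged.
    """
    parts = urlsplit(url)
    # Split the path into everything before the last slash and the file name.
    directory, _, filename = parts.path.rpartition("/")
    if filename.lower() in DEFAULT_DOCUMENTS:
        parts = parts._replace(path=directory + "/")
    return urlunsplit(parts)


if __name__ == "__main__":
    print(to_directory_url("http://www.google.com/agencyland/index.html"))
    print(to_directory_url("http://www.google.com/urchin/download.html"))
```

This sketch only handles the index-file case. It does not try to decide between http://www.example.com/page and http://www.example.com/page/, because which of those is correct depends on how the server is set up.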

But apparently those who are creating pages for certain services at Google haven’t read their own Google Webmaster Guidelines. Let’s look at another example:

If you look at the Urchin download page (http://www.google.com/urchin/download.html) you will notice that they have a “Home” link to Urchin’s home page: http://www.google.com/urchin/index.html. OOPS!! That’s NOT the home page for Urchin. The home page is http://www.google.com/urchin/ and that’s where the “Home” link should point, not to the index.html page.
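If you want to check your own pages for this kind of link, something like the following rough sketch could work. It uses Python's standard html.parser module to collect every link on a page that points at a default document. The class and function names, the DEFAULT_DOCUMENTS list, and the sample HTML are all hypothetical; the snippet is not the actual markup of Google's page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit

# Assumed set of default document names; adjust for your own server.
DEFAULT_DOCUMENTS = {"index.html", "index.htm", "default.html", "default.asp"}


class IndexLinkFinder(HTMLParser):
    """Collects absolute URLs of <a href> links that end in a default document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.offending = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the URL of the page being checked.
        absolute = urljoin(self.base_url, href)
        filename = urlsplit(absolute).path.rpartition("/")[2]
        if filename.lower() in DEFAULT_DOCUMENTS:
            self.offending.append(absolute)


def find_index_links(base_url, html):
    """Return the links in html that point at a default document."""
    finder = IndexLinkFinder(base_url)
    finder.feed(html)
    return finder.offending


if __name__ == "__main__":
    # Hypothetical markup, for illustration only.
    sample = '<a href="index.html">Home</a> <a href="/urchin/">Home</a>'
    page = "http://www.google.com/urchin/download.html"
    for link in find_index_links(page, sample):
        print("Links to a default document:", link)
```

Each URL this reports is a candidate for being changed to the directory URL instead.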

As I mentioned previously, these are not a few isolated cases where Google’s internal linking is leading to duplicate content on their site. In fact, it appears that they are doing this over 26,000 times, adding an additional 26,000 pages to their site.

It just goes to show that not even Google can optimize their very own web site properly based on the industry’s best practices. Is that a bad thing? Not necessarily. But I have to admit that when you are at the top of your game and demand that other web sites adhere to the industry’s best practices, it’s more helpful if you lead by example.