It appears that Google has removed the www subdomain of LinkedIn from their organic, natural search engine results pages. This is commonly referred to as being banned in Google when pages are removed. Initially we didn’t know who removed the pages. We now know, as I’ll explain in this post, what really happened and what you need to know as a result of this whole incident. This was a very fluid situation, so this post has been heavily changed and updated as we learned more about what really happened.
Over the past few years, LinkedIn profile pages have typically ranked very well for personal names, including my own. Google, for some reason, appears to have removed the www subdomain of LinkedIn’s pages. There are a few reasons why pages disappear all of a sudden; one is that the site has been given a manual action (penalty) by Google. Initially we didn’t know for certain whether it was a manual action from Google or whether the pages were mistakenly removed by LinkedIn. In either case, all of the www subdomain pages have been removed from Google’s search engine results.
For many, this would have a very large impact, especially at a time when so many people are out of work. As I mentioned, LinkedIn’s profiles have typically ranked very well for personal names. For example, when you searched for “Bill Hartzer” in Google, my LinkedIn profile page typically ranked very well, as it’s almost like my resume that’s there online. Granted, it’s probably been ranking well for my name because I’ve optimized it, am connected with thousands of people on LinkedIn, and I’ve even done some link building to get the profile page to rank well. If I didn’t have a website, I would probably rely on my LinkedIn profile to rank well. So many people do, and many get business directly because their LinkedIn profile ranks in the Google search engine results.
Let’s take a look at what LinkedIn being gone from Google’s search engine results looks like:
Using the site: command in Google, you can see that profiles no longer are in the search engine results. An example LinkedIn profile is my profile here. It uses the www subdomain, and there are no results in Google for that subdomain when I search for it.
When I search for my LinkedIn profile URL in Google, another URL comes up in the results, which is a result from another country. It’s the www subdomain that I typically see rank well for my name in Google’s results:
The am subdomain shows up for a search for my LinkedIn profile URL, which is the Amsterdam version, which is in German. Sneakily enough, though, LinkedIn apparently sees this, and they are redirecting the Amsterdam version to the www subdomain when I click through from Google. That, in itself, could potentially be a violation of Google’s search engine guidelines: they are showing one version to Google and another version to me, the user. I was actually looking to see what the Amsterdam version of my LinkedIn profile page is, but I’m not able to see that version. This is a concern for me, as I do not know what LinkedIn is showing to Google. They could be showing different content. Well, they are: it’s in another language. My recommendation to LinkedIn is to NOT redirect users to the local language version. If someone goes to the am subdomain of a profile, they SHOULD be able to see that am version (the same goes for other subdomains on LinkedIn).
We are going through a recent CORE update from Google: Google is updating its CORE algorithm, which was announced just the other day. However, I do not think the fact that LinkedIn has been banned in Google has anything to do with this latest algorithm update. It appears to me, because all pages on the www subdomain of LinkedIn have been removed, that Google did this on purpose: it’s a manual action penalty. Google’s users do expect LinkedIn to be there in the search engine results, and many people rely on it. However, it’s been removed. In extreme cases like this, usually the site itself (LinkedIn) has done something to violate Google’s search engine guidelines, and Google has therefore removed the pages from its search engine results.
Whatever it is LinkedIn did, Google doesn’t just remove pages from LinkedIn, and usually, before a large site is banned, Google may try to reach out to the site to see if they can fix the issue. Nonetheless, LinkedIn’s pages have been removed from Google, with the www subdomain essentially gone completely.
And in case you’re wondering, I don’t see anything in LinkedIn’s robots.txt file (https://www.linkedin.com/robots.txt) that stops Google’s crawlers from crawling, and there is no noindex tag on pages. There is a “disallow all” directive, but it applies to the other bots and crawlers; the group in the robots.txt file that is written specifically for Google essentially “overrides” it.
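To illustrate how a crawler-specific group takes precedence over a “disallow all” group, here is a minimal sketch using Python’s standard-library robots.txt parser. The robots.txt content, the example.com URLs, and the bot name SomeOtherBot are made up for illustration; this is not LinkedIn’s actual file, and Python’s parser is only an approximation of how Google reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only (NOT LinkedIn's real file).
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /in/
Disallow: /private/

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

# Hypothetical profile URL on a placeholder domain.
profile_url = "https://www.example.com/in/some-profile"

# Googlebot matches its own group, so the catch-all "Disallow: /" is ignored.
googlebot_allowed = parser.can_fetch("Googlebot", profile_url)

# A crawler without its own group falls back to the "*" group.
other_bot_allowed = parser.can_fetch("SomeOtherBot", profile_url)

print("Googlebot allowed:", googlebot_allowed)
print("SomeOtherBot allowed:", other_bot_allowed)
```

In this sketch the catch-all group never applies to Googlebot, which is the same reason the “disallow all” line alone does not explain pages disappearing from Google.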
Update 1: LinkedIn Removed Pages
It appears that it’s quite possible that LinkedIn’s www subdomain was removed from Google’s search engine results due to something that LinkedIn did, which was to remove the http:// version of their site. See John Mueller’s tweet. LinkedIn may have done this to themselves, and it might not be a ‘ban’ after all:
PSA: Removing the "http://" version of your site will remove all variations (http/https/www/non-www). Don't use the removal tools for canonicalization. https://t.co/yTfRzWZGtd
- John (@JohnMu) May 6, 2020
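The point of the tweet can be pictured with a small sketch: if a removal request is matched without regard to protocol or a leading www, then all four common variations of a URL collapse to the same key. The function name site_key and the example.com URLs are hypothetical, and this is only an illustration of the behavior the tweet describes, not how Google implements it.

```python
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Collapse the protocol and a leading 'www.' so URL variations compare equal.

    Hypothetical helper for illustration; not part of any Google tool.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host + (parts.path or "/")


variants = [
    "http://example.com/",
    "https://example.com/",
    "http://www.example.com/",
    "https://www.example.com/",
]

# All four variations map to a single key.
keys = {site_key(u) for u in variants}
print(keys)

# A different subdomain keeps its own key, so it would not be swept up.
other_key = site_key("https://am.example.com/")
print(other_key)
```

That would be consistent with what showed up in the search results: the www pages vanished while a country subdomain was still listed.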
LinkedIn Pages Back in Hours
Around 1:30pm on the same day, I confirmed that LinkedIn’s pages, the ones that were previously removed, were back in the index. Well, as I’ll explain in the next section, the pages were never really removed from Google’s index. They were hidden from the search results. Based on information that I’ve received, we now know what most likely happened: someone from LinkedIn used the Google Removal Tool to attempt to remove pages from Google’s index. However, what really happened is that all of the pages on the www subdomain were removed. Based on John Mueller’s tweet, as referenced above, they probably wanted to remove the http:// pages in Google’s search engine results, but ended up removing the www subdomain pages at the same time. Once this was corrected in the Google Removal Tool, the pages re-appeared in Google’s search engine results pages, all 233,000,000 of them.
The fact that all of these 233,000,000 pages re-appeared in Google’s search engine results pages so quickly brought up a very good question: how is it that millions of web pages can re-appear in Google’s search engine results so quickly? We know that if pages are removed from the search engine results pages in Google, then it typically takes a long time for Google to recrawl and reindex pages, especially if it involves millions of web pages. Something like that just doesn’t happen overnight. It can take days, weeks, or even months for most sites to get that many pages recrawled, reindexed, and appearing again in Google’s search engine results pages. So, how did a website like LinkedIn get all those pages (millions of pages) back into Google’s search engine results pages after they were removed? Surely something else is at play here. And that’s where this all gets very interesting. It has led to us learning more about the functionality of the Google Removal Tool, and how it really works.
In other words, the Google Removal Tool doesn’t exactly remove pages from Google’s index. Here’s what we learned.
What We Learned from This
We know that LinkedIn, allegedly, used the Google Removal Tool incorrectly, and millions of pages, especially LinkedIn profile pages, were de-indexed in Google’s search engine result pages. These were pages on the www subdomain of LinkedIn.com. Later in the day, LinkedIn’s pages came back into the index just as quickly as they had vanished. Here is what we learned from this:
- It’s possible to screw things up when using the Google Removal Tool in Google Search Console, especially if you’re using parameters and not removing one URL at a time.
- Google will remove the pages quickly if you use the Google Removal Tool. And in this case they removed millions of URLs. That can be a good thing (or a bad thing!)
- The Google Removal Tool is just a “mask” and the pages aren’t really removed from Google’s index. Technically speaking, the URLs are still in Google’s index; they’re just “hidden” and won’t show up at the time of the search.
Probably the most important takeaway here is that the Google Removal Tool doesn’t actually remove pages from Google’s index. The pages are still in the index. The tool just hides those URLs at the time of a search, so they don’t show up in the search engine results, and it only does so for 90 days. Every 90 days you need to use the tool again to “hide” the URLs; otherwise, there’s a chance that those URLs will come back after 90 days.
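One way to picture the “mask” is a toy model in which the index and the list of hidden URLs are kept separately. The class ToyIndex, its method names, and the example.com URL are all hypothetical; the 90-day figure is the one from this post. This is only a mental model, not a description of Google’s actual systems.

```python
from datetime import datetime, timedelta

# Hide duration as described in this post.
HIDE_DURATION = timedelta(days=90)


class ToyIndex:
    """Toy model: hiding a URL never touches the index itself."""

    def __init__(self):
        self.indexed = set()    # URLs that have been crawled and indexed
        self.hidden_until = {}  # URL -> time when the hide request expires

    def add(self, url):
        self.indexed.add(url)

    def hide(self, url, now):
        # The URL stays in self.indexed; only the mask is recorded.
        self.hidden_until[url] = now + HIDE_DURATION

    def cancel_hide(self, url):
        self.hidden_until.pop(url, None)

    def visible(self, now):
        # URLs with no mask, or with an expired mask, can be shown.
        return sorted(
            u for u in self.indexed if self.hidden_until.get(u, now) <= now
        )


index = ToyIndex()
url = "https://www.example.com/in/some-profile"  # placeholder URL
t0 = datetime(2020, 5, 6)

index.add(url)
index.hide(url, t0)
hidden_results = index.visible(t0)    # masked, so nothing to show

index.cancel_hide(url)
restored_results = index.visible(t0)  # back right away, no recrawl needed
```

In a model like this, cancelling the request brings every URL back immediately because nothing was ever deleted, which would explain how millions of pages returned within hours.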
Are you confused? Yes, probably just as confused as I am at this point. The name of the Google Removal Tool has the word removal in it. So, we assume that the tool actually removes web pages from Google’s index. However, what we have learned after this LinkedIn fiasco, is that the tool only really hides the URLs from the search engine results pages so they don’t show up in the search engine results. They’re still in Google’s index.
So What Do You Do Now?
At this point, if you truly want Google to remove web pages from their search engine index, you need to completely remove those pages from your web server, or make sure that a “410 Gone” is served in the server header. You also need to remove all of the internal links pointing to those URLs on your website and make sure the URLs are removed from the website’s XML sitemap file(s). Google’s crawlers will crawl them and remove them from their index, but it may take some time. If you’re truly in a hurry and need to keep URLs from showing in the search engine results pages quickly, then you still need to make sure they’re delivering a “410 Gone” in the server header for that URL. At this point, I would only use the Google Removal Tool for URLs that desperately need to be removed, for example, URLs that contain sensitive data. Keep in mind that using the tool only hides the URLs and doesn’t really remove them.
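As a rough sketch of what serving a “410 Gone” looks like, here is a minimal WSGI application in Python. The paths in REMOVED_PATHS and the helper status_for are hypothetical; on a real site you would more likely configure this in your web server or CMS, so treat this as an illustration of the idea rather than a recommended setup.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical paths that have been permanently removed from the site.
REMOVED_PATHS = {"/old-page", "/retired/offer"}


def app(environ, start_response):
    """Answer removed paths with 410 Gone and everything else with 200 OK."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    if path in REMOVED_PATHS:
        start_response("410 Gone", headers)
        return [b"This page has been permanently removed.\n"]
    start_response("200 OK", headers)
    return [b"OK\n"]


def status_for(path):
    """Call the app directly (no network) and return the status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers):
        captured["status"] = status

    b"".join(app(environ, start_response))
    return captured["status"]


print(status_for("/old-page"))
print(status_for("/"))
```

To try it against real requests, the same app can be served locally with wsgiref.simple_server.make_server from the standard library.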