The Provenance of Panda

Like all good bandits in the tumbleweed-strewn landscape that is the search engine morphological ecosystem (I’m coming down with a bad case of logorrhea, here, but you get my drift…), Google Panda has iterative ‘form’.

Previous Panda Updates

Here’s the Panda update schedule so far, as confirmed by Google:

  • Panda Update 1.0: Feb. 24, 2011
  • Panda Update 2.0: April 11, 2011 (about 7 week gap)
  • Panda Update 2.1: May 10, 2011 (about 4 week gap)
  • Panda Update 2.2: June 16, 2011 (about 5 week gap)
  • Panda Update 2.3: July 23, 2011 (about 5 week gap)
  • Panda Update 2.4: August 12, 2011 (about 3 week gap)
  • Panda Update 2.5: September 28, 2011 (about 7 week gap)
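
If you'd like to check those 'about N week' gaps for yourself, here's a minimal Python sketch. It takes the dates exactly as listed above and does nothing cleverer than subtract one from the next; the names `PANDA_UPDATES` and `gaps_in_weeks` are made up for this illustration.

```python
from datetime import date

# Release dates exactly as listed above (all 2011).
PANDA_UPDATES = [
    ("1.0", date(2011, 2, 24)),
    ("2.0", date(2011, 4, 11)),
    ("2.1", date(2011, 5, 10)),
    ("2.2", date(2011, 6, 16)),
    ("2.3", date(2011, 7, 23)),
    ("2.4", date(2011, 8, 12)),
    ("2.5", date(2011, 9, 28)),
]


def gaps_in_weeks(updates):
    """Return (version, days, rounded weeks) for each update after the first."""
    out = []
    # Pair each update with the one that follows it.
    for (_, previous), (version, current) in zip(updates, updates[1:]):
        days = (current - previous).days
        out.append((version, days, round(days / 7)))
    return out


for version, days, weeks in gaps_in_weeks(PANDA_UPDATES):
    print(f"Panda {version}: {days} days after the previous update (about {weeks} weeks)")
```

The gaps come out at 46, 29, 37, 37, 20 and 47 days, which round to the 7, 4, 5, 5, 3 and 7 weeks quoted in the list.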

Panda 1.0

While Google has come under intense pressure in the past month to act against content farms, the company’s work on this change has been under way since January 2010.

In January 2011, Google promised that it would take action against content farms that were gaining top listings with ‘shallow’ or ‘low-quality’ content, subsequently announcing a change to its ranking algorithm – the, like, software thingey that Google uses to rank web pages – designed to take out such material.

In short, Google said in January that it was going after sites with low levels of original content and, shortly thereafter, delivered a new algorithm known as ‘Google Panda’ (‘Panda’ being the surname of one of Google’s lead developers on the project), also known as the ‘Farmer Update’. You couldn’t make it up, could you?

Panda 2.0

Okay, so late February 2011, Google launches a substantial algorithm change (known as ‘Farmer’ or ‘Panda’) aimed at identifying low-quality pages and sites; these are pages (often seen on so-called ‘content farms’) with text that is relevant for a query, but may not provide the best user experience.

The original algorithm update impacted only U.S. queries. As of April 11, 2011, this change is live for all English queries worldwide. This includes both English-speaking countries (such as searches on google.co.uk and google.com.au) and English queries in non-English countries (for instance, for a searcher using google.fr who’s chosen English-language results).

Google has always used a number of signals in determining relevant search results. Some of these are on the pages themselves (such as the text on a page), some are on other sites (such as anchor text in links to a page), and some are based on user behaviour (for instance, Google gathers data about how long pages take to load by using toolbar data from users who access those pages).

Google Panda – what Google calls a ‘high quality sites algorithm’ – is clear about its ranking objectives: enough low quality content on a site can reduce the entire site’s rankings, not just the low quality pages.

Panda 2.1, 2.2, 2.3

2.1: some tweaks. 2.2: improved scraper detection. 2.3: more tweaks. This from Google:

We’re continuing to iterate on our Panda algorithm as part of our commitment to returning high-quality sites to Google users. This most recent update is one of the roughly 500 changes we make to our ranking algorithms each year.

The 2.3 update incorporates some new signals that help differentiate between higher- and lower-quality sites. As a result, some sites are ranking higher after this most recent update.

The 2.1 to 2.3 dates are approximate. Google may have rolled them out just a bit earlier or later than those exact dates cited above. The bigger issue is that you can see a trend developing. About every month, Google reruns its Panda algorithm.

Each time Panda is run, there’s a chance that some sites hit by the last update will improve, while other sites might see traffic drops. In between updates, changes people make specifically in the hope of fixing a Panda problem won’t show any impact until the next update is run.

Panda 2.4

Google’s Panda, first launched in the United States in late February and rolled out to English-language indices internationally in April, has now launched internationally in all languages other than Chinese, Japanese, and Korean.

2011, then, sees Google focused on identifying sites with a large number of low quality pages as part of their overall goal of providing the best possible search experience. The Panda updates have been evolutions of algorithms that increasingly detect this and lower those sites in search rankings. Since we’re talking about algorithms with many inputs, there’s no one thing that can cause a site to lose rankings due to Panda. Rather it’s an accumulation of factors, such as:

  • can visitors easily find their way around?
  • is it obvious what topic each page is about?
  • is the content original or is it aggregated (‘pulled’) from other sources?
  • do the number and placement of the ads obscure the visitor’s ability to quickly access the content?
  • when looking objectively at the site, is the primary focus the user need or the business goal?
  • is the content on the page authoritative and valuable? Does it answer the query better than other pages on the web?
  • if some of the pages on the site are very high quality and engaging, are other pages on the site not as high quality? (Google has stated that enough low quality content on a site can reduce the entire site’s rankings, not just the low quality pages.)

You can’t look for one specific thing to fix and you can’t compare one specific thing on your site to that same thing on another site: there are too many moving parts.

Panda 2.5

In late September, Google launched what’s being called version 2.5 of its Panda algorithm. On October 5th, Google’s Matt Cutts tweeted:

“Weather report: expect some Panda-related flux in the next few weeks, but will have less impact than previous updates (~2%).”

Panda-related flux? Indeed, this seems to be the case, with site owners reporting Panda-related changes on at least two occasions in early October.

  • some Panda updates are due to new signals being incorporated into the overall Panda algorithms
  • some Panda updates are recalculations of how sites perform within the Panda algorithms, based on updated data about the sites since the last recalculation
  • the only difference with this update vs. the previous ones is that there will be (and have been) several updates (presumably of both types) within days or weeks of each other

Panda 2.5, then, is a series of Panda algorithm and site recalculation updates over a period of several weeks. September 27th, October 3rd, and October 13th have been confirmed by Google, but it appears that there may have been several other updates (of either Panda algorithm changes or site recalculations) as well during this period.

The key thing to remember, here, is that Panda is a site-wide assessment (so even high quality pages will be impacted) and that the key recovery strategies centre on:

  • creating valuable content (so the page is the best answer to the query on the web)
  • consolidating approximate duplication (if lots of pages on the site are about the same topic)
  • getting rid of exact duplication (syndication, manufacturer feeds and other measures that result in the exact text appearing on multiple sites)
  • improving usability (such as ensuring a valuable and engaging user experience, providing easy and useful navigation, not obscuring the content with an overwhelming amount of ads)
  • working on engagement (building a site that people want to stay on, link to, return to, share, and otherwise show happiness towards)
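
For the two duplication items in that list, a rough self-audit can be sketched in a few lines of Python. To be clear, this is not how Panda itself works (Google hasn't published that); it's simply one common way to compare page text, using overlapping word 'shingles' and Jaccard similarity. The function names, the sample pages and the 0.5 threshold are all assumptions made up for illustration, so tune them against your own site.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicates(pages, threshold=0.5):
    """Yield (url_a, url_b, similarity, kind) for pairs at or above threshold."""
    sets = {url: shingles(text) for url, text in pages.items()}
    for url_a, url_b in combinations(sorted(sets), 2):
        score = jaccard(sets[url_a], sets[url_b])
        if score >= threshold:
            # "exact" here means identical once case and punctuation are ignored.
            kind = "exact" if score == 1.0 else "approximate"
            yield url_a, url_b, score, kind


# Hypothetical page texts, purely for illustration.
pages = {
    "/widgets/blue": "Our blue widget is hand made in Dublin and ships worldwide within two days",
    "/widgets/blue-copy": "Our blue widget is hand made in Dublin and ships worldwide within two days",
    "/widgets/blue-sale": "Our blue widget is hand made in Dublin and ships worldwide within five days",
    "/about": "We are a small family business founded in 1998",
}

for url_a, url_b, score, kind in find_duplicates(pages):
    print(f"{kind}: {url_a} vs {url_b} (similarity {score:.2f})")
```

Pairs flagged as 'exact' are candidates for removal or canonicalisation; 'approximate' pairs are candidates for consolidating into a single, better page. A person still has to make the call on each one.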


John Hargaden is CEO of wevolution digital, your full-service digital marketing partner focussed on growing your business online.