[SOLVED] Google Indexing ‘?tmpl=component&type=raw’ Webpages

Problem

When running Search Engine Optimisation tests, you discover that your web pages are duplicated in the Google index. One record is correct and the other includes ‘?tmpl=component&type=raw’.

The ‘?tmpl=component&type=raw’ URL returns a copy of the web page without CSS styling. For example: http://www.itsupportguides.com/?tmpl=component&type=raw

The Google search results below show the duplicate. The second result is the ‘?tmpl=component&type=raw’ version.

Joomla-DuplicateGoogle1

Resolution

Before going into the fix, it is important to understand what caused the problem, so you can make sure it doesn’t happen again.

So what is this ?tmpl?

‘?tmpl=component&type=raw’ is a query string that can be added to a page address to return a text (or raw) version of the web page. It is commonly used with AJAX programming, but it can also be used by screen readers (accessibility software) or small-screen devices like phones and PDAs.
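To make the difference between the two addresses concrete, here is a minimal Python sketch that recognises the raw variant of a URL and maps it back to the normal page address. The function names are hypothetical, and the sketch assumes that ‘tmpl’ and ‘type’ are the only parameters that need to go.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters that select the raw view (taken from the URLs in this article).
RAW_VIEW_PARAMS = {"tmpl", "type"}


def is_raw_variant(url):
    """Return True if the URL asks for the tmpl=component view."""
    params = dict(parse_qsl(urlsplit(url).query))
    return params.get("tmpl") == "component"


def canonical_url(url):
    """Strip tmpl/type from a raw-variant URL; leave other URLs untouched."""
    if not is_raw_variant(url):
        return url
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in RAW_VIEW_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Any other query parameters are kept, so only the part that causes the duplicate is removed.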

How did the ?tmpl pages end up on Google?

There are two common possibilities: either somewhere on your website there is a link to a ‘?tmpl=component&type=raw’ page, or part of your web page uses such a link in an AJAX reference. Google then indexes that link and every link it leads to. Either way, you end up with a duplicate of your web page in Google search results.

How do I fix this?

Because the issue can have more than one cause, there are a few steps you can take to minimise the risk of it happening. After applying the changes, it may take several days before the duplicates are removed from the Google search results.

Step 1: Check your website for any links to ‘?tmpl=component&type=raw’ pages

Have you created any links on your website which include the ?tmpl string? If so, change each link so it does not include it.
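If the site has many pages, checking links by hand gets tedious. The following is a rough Python sketch that lists the links in a piece of HTML whose address contains ‘tmpl=component’. The class and function names are made up for illustration, and it only looks at ordinary <a href="..."> links, not at addresses built by JavaScript.

```python
from html.parser import HTMLParser


class TmplLinkFinder(HTMLParser):
    """Collect href values of <a> tags that contain 'tmpl=component'."""

    def __init__(self):
        super().__init__()
        self.matches = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # value is None for attributes written without a value
            if name == "href" and value and "tmpl=component" in value:
                self.matches.append(value)


def find_tmpl_links(html):
    """Return every link target in the HTML that points at a ?tmpl page."""
    finder = TmplLinkFinder()
    finder.feed(html)
    return finder.matches
```

Feed it the saved source of each page you want to check; an empty list means no plain links to ?tmpl pages were found in that page.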

Step 2: Are you using AJAX for your ‘More Articles’ or ‘Read More’ links?

  1. Using Firefox, open a web page of your site.
  2. Right-click and select ‘View page source’.
  3. Press ‘F3’ to open the find bar.
  4. Enter =component in the find bar.

If the search finds a match, you’ve found the source of the ‘?tmpl’ duplicates.
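The same search can be scripted if you would rather not open each page in the browser. This is a small Python sketch under the assumption that the page is publicly reachable and UTF-8 encoded; the function names are hypothetical. It searches the raw source, so it also catches addresses inside scripts.

```python
from urllib.request import urlopen


def fetch_source(url):
    """Download the page source as text (assumes UTF-8)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def find_in_source(source, needle="=component"):
    """Return (line_number, line) pairs for source lines containing needle."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if needle in line
    ]


# Example usage (performs a real HTTP request):
# for number, line in find_in_source(fetch_source("http://www.itsupportguides.com/")):
#     print(number, line)
```

Running it again after changing the template settings is a quick way to confirm the string has gone from the page source.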

Joomla-DuplicateGoogle2

Step 3: Review your AJAX implementation. Is it really required?

If you’ve come this far, it’s likely you’re happy to disable it to help improve your search engine results.

The steps below assume you’re using a Rocket Theme template, as this issue seems to be related to the way they implement their AJAX:

  1. From the Joomla template manager, open your Rocket Theme template
  2. Expand ‘Features’ and scroll down until you see ‘More Articles’
  3. Disable the feature.

Joomla-DuplicateGoogle3

Step 4: Confirm that the ?tmpl link is no longer in the page source

  1. Repeat Step 2

Step 5: Remove the duplicates from the Google search results

Now that you’ve removed the source of the issue you can go about fixing the Google search results.

To do this you will need to do three things: create a new sitemap.xml of your website, add a ‘Disallow’ to your robots.txt file, and request the removal of each ?tmpl page which is causing issues.

The steps assume you already have your site added to Google Webmaster Tools and have a sitemap.xml and robots.txt file implemented.

Step 5.1: Create a new sitemap.xml

  1. Go to http://www.xml-sitemaps.com/ and create a new sitemap file
  2. Open the file and check that it does not include any ‘?tmpl=component&type=raw’ links
  3. Upload the file to the root level of your website. E.g. www.itsupportguides.com/sitemap.xml
  4. Log into Google Webmaster Tools
  5. Select the ‘Site Configuration’ menu then ‘Sitemaps’
  6. Either submit the new sitemap file or select the sitemap and then click ‘Resubmit’
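The check in item 2 can be automated for larger sitemaps. Below is a minimal Python sketch, assuming the file follows the standard sitemaps.org format with <url><loc> entries; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def tmpl_urls_in_sitemap(xml_text):
    """Return every <loc> address in the sitemap that contains 'tmpl=component'."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text and "tmpl=component" in loc.text
    ]


# Example usage with a local copy of the file:
# with open("sitemap.xml", "rb") as handle:
#     print(tmpl_urls_in_sitemap(handle.read()))
```

An empty list means the sitemap is clean and ready to upload.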

Step 5.2: Add a ‘Disallow’ to your robots.txt file

Using your cPanel (or other access to the website files), open the robots.txt file on the root level of your website (e.g. www.itsupportguides.com/robots.txt).

Add a ‘Disallow’ entry for each page you want removed from the Google search results. Note that each entry is relative to your website domain. The original robots.txt convention does not define wildcards, so this guide lists each directory separately; Googlebot does accept ‘*’ as a wildcard in paths, so for Google a single pattern entry could cover every directory.

For example:

User-agent: *
Disallow: /?tmpl=component&type=raw
Disallow: /Windows7/?tmpl=component&type=raw

Save the file back to your webhost.
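Before relying on the new rules, it can help to test them. The sketch below uses Python’s standard urllib.robotparser module to check which addresses the example rules block. The is_blocked helper and the ‘Googlebot’ agent name are assumptions for illustration, and Python’s parser does simple prefix matching, so it will not reproduce Google’s wildcard handling.

```python
from urllib.robotparser import RobotFileParser

# The example rules from this step.
ROBOTS_TXT = """\
User-agent: *
Disallow: /?tmpl=component&type=raw
Disallow: /Windows7/?tmpl=component&type=raw
"""


def is_blocked(robots_txt, url, agent="Googlebot"):
    """Return True if the rules forbid the given agent from fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Example usage:
# is_blocked(ROBOTS_TXT, "http://www.itsupportguides.com/?tmpl=component&type=raw")
# is_blocked(ROBOTS_TXT, "http://www.itsupportguides.com/Windows7/")
```

The ?tmpl addresses should come back as blocked while the normal pages stay allowed; if a normal page is blocked, the rule is too broad.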

Step 5.3: Request the removal of each ?tmpl page which is causing issues

  1. Log into Google Webmaster Tools
  2. Select the ‘Site Configuration’ menu then ‘Crawler Access’
  3. Select the ‘Remove URL’ tab
  4. Click on ‘New removal request’ and enter the address to remove (repeat if you have more than one directory, e.g. /Windows7/?tmpl=component&type=raw)

Joomla-DuplicateGoogle4

Now wait several days for the changes to take effect, and good luck!