Sometimes we rank well on one engine for a specific keyphrase and assume that, because one search engine likes our pages, we will rank well for that keyphrase across a range of engines. Unfortunately, this is rarely the case. All the major search engines differ somewhat, so what gets you ranked high on one engine may actually lower your ranking on another.
Because of this, some people prefer to optimize pages for each particular search engine. Usually these pages are only slightly different, but that slight difference can make all the difference when it comes to ranking high.
However, because search engine spiders crawl through sites indexing every page they can find, a spider may come across your engine-specific optimized pages. Since those pages are very similar, the spider may conclude that you are spamming it and do one of two things: ban your website altogether, or punish you severely in the form of lower rankings.
The solution in this case is to stop specific search engine spiders from indexing some of your web pages. This is done using a robots.txt file, which resides on your webspace.
A robots.txt file is a vital part of any webmaster's defense against being banned or penalized by the search engines when he or she designs different pages for different engines.
As the file extension suggests, robots.txt is just a plain text file. Create it using a simple text editor such as Notepad or WordPad; complex word processors like Microsoft Word will only corrupt the file.
You insert certain directives into this text file to make it work. Here is the basic format:
User-Agent: (Spider Name)
Disallow: (File Name)
User-Agent is the name of the search engine's spider, and Disallow is the name of the file that you do not want that spider to index.
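For instance, a minimal entry telling Google's spider to skip a single page might look like this (the file name is purely illustrative):

```
User-Agent: Googlebot
Disallow: altavista-page.html
```

With this in place, Googlebot would skip that one page while still indexing the rest of the site, and every other spider would remain unaffected.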
You have to begin a new batch of code for each engine, but if you want to list multiple disallowed files, you can place them one below another. For example:
User-Agent: Slurp (Inktomi's spider)
Disallow: gg1.html
Disallow: gg2.html
Disallow: al1.html
Disallow: al2.html
The above code (with illustrative file names) stops Inktomi from spidering two pages optimized for Google (the "gg" pages) and two pages optimized for AltaVista (the "al" pages). If Inktomi were allowed to spider these pages as well as the pages made specifically for Inktomi, you would run the risk of being banned or penalized. Hence, it is a good idea to use a robots.txt file.
The robots.txt file resides on your webspace, but where on your webspace? In the root directory! If you upload the file to a sub-directory, it will not work. If you want to disallow all engines from indexing a file, you simply use the "*" character where the engine's name would usually be. But beware: the "*" character will not work on the Disallow line.
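So a hypothetical entry blocking every spider from a single file (again, the file name is illustrative) would look like:

```
User-Agent: *
Disallow: private.html
```

Note that the "*" appears only on the User-Agent line; each file you want hidden still needs its own Disallow line.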
Here are the spider names of a few of the big engines:
Excite – ArchitextSpider
AltaVista – Scooter
Lycos – Lycos_Spider_(T-Rex)
Google – Googlebot
Alltheweb – FAST-WebCrawler
Be sure to check over the file before uploading it, as you may have made a simple mistake, which could mean your pages get indexed by engines you do not want indexing them, or, perhaps worse, none of your pages get indexed at all.
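If you want to go beyond simply eyeballing the file, Python's standard library includes a robots.txt parser that lets you test your rules before uploading. This is a quick sketch; the rules and file names are illustrative, matching the examples above:

```python
# Sanity-check a robots.txt file locally before uploading it,
# using Python's built-in urllib.robotparser.
from urllib.robotparser import RobotFileParser

# The robots.txt rules to test (illustrative file names).
rules = """\
User-Agent: Googlebot
Disallow: /altavista-page.html

User-Agent: *
Disallow: /private.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Googlebot is blocked from the AltaVista-specific page...
print(parser.can_fetch("Googlebot", "/altavista-page.html"))  # False
# ...but may still fetch everything else.
print(parser.can_fetch("Googlebot", "/index.html"))  # True
# Every other spider is blocked from /private.html.
print(parser.can_fetch("Scooter", "/private.html"))  # False
```

Running a check like this catches typos in spider names or file paths before a mistake in the live file costs you rankings.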