What is a Spider?
What is a spider?

A 'spider' is a searchbot - the tool that a search engine uses to crawl a website and index its pages for the search engine results. Because it 'crawls the web' it is popularly termed a spider.

In fact there is, of course, no entity leaving a search engine's premises, setting out on a journey across the web, entering websites, listing their pages, and going on to the next site. It is simply a program that runs at the search engine's datacentre and requests pages over the network - but it is convenient to think of there being something physical that goes out there and grabs website data.

Why do we need to bother about spiders?

Without spiders, no websites would be listed or appear in the search results. So we need to make their task as easy as possible, and ensure that spiders are both welcome at our site and have a straightforward job indexing our resources.

Once you realise this, it becomes easier to arrange things so that your website is indexed properly - for example, you will remove all Flash from your website navigation, as spiders hate it. Most cannot follow Flash links, and those that can do not do so easily.

What is a good website for spidering?

Best practice for SEO means that a website must be set up to be easily spidered. All these points can be covered:
What is a bad website for spidering?

A poorly-arranged website will be indexed badly, and will have low or erratic positions in the search results. These faults may exist:
Websites need to be built to be easily indexed by search engines, as this is a commercial necessity. Sites can generally be repaired, although it is cheaper to build them correctly in the first place.

What is spider food?

'Spider food' is said to be useful resources that are not web pages, such as images, video, PDF files, and forums.

This is because such alternative resources appear to be favoured by search engine spiders, and are well-spidered compared to basic web pages. This is probably because search engines are looking for useful resources for their customers, and such items are slightly more favoured as there are fewer of them and they may present more useful or popular information.

Therefore, a site should include such additional resources where possible. These can include such items as graphics, PDF files for download, MPEG video, Flash video, charts, tutorials, reviews, forum, blog, wiki, directory, net resources, photos, images etc.

How does a search engine work?

The final part of the question 'what is a spider' is an explanation of how a search engine works, and how it uses the resources a spider finds. Here is a sequence that explains how search engines find a web page, how they index and rate it, and how it appears in search results.

A search engine is a group of computers that may exist at one place, but is more commonly located at many computer centres, often called datacentres. There are research computers, storage computers, spidering computers, and server computers, which work as follows:
The search-answer sequence goes like this:

- A person who wants to know something opens their computer and asks their browser to find a suitable resource.
- The browser connects to the search engine's datacentre. It asks a server computer there for information.
- The server asks a storage computer for the listings, and a list of results is delivered to the enquirer.
- The enquirer chooses a result from the list and clicks on it, and is passed along to the web resource chosen.

They may choose a regular listing, aka an organic result - or they may choose an advertisement, and these are normally of the pay-per-click (PPC) type. Approximately 40% of clickthroughs are said to go to the #1 slot, the first result in the organic results.

Probably the most impressive feature of modern search engines, especially the top performers, is their sheer speed. The way they can produce a ranked list of results for any enquiry in a second or so, from billions of resources, is impressive if not miraculous.

The majority of new visitors to most websites come from search engines. If the site is a good resource, people bookmark it and return later. Most conversions (orders, sign-ups etc.) occur on a second or subsequent visit, not on the first visit.
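To make the spider, storage and server roles described above more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any real search engine is built: the `PAGES` dictionary is a made-up, in-memory stand-in for the web, the URLs and function names are hypothetical, and ranking by simple word count is only a placeholder for the far more elaborate rating a real engine applies.

```python
from collections import Counter, deque
import re

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real spider would fetch these pages over HTTP instead.
PAGES = {
    "site/home": ("welcome to our spider guide", ["site/food", "site/faq"]),
    "site/food": ("spider food includes images video and pdf files", ["site/home"]),
    "site/faq": ("how a search engine spider crawls a website", []),
    "site/orphan": ("no page links here so the spider never finds it", []),
}

def crawl(start):
    """Spidering step: follow links breadth-first from a start URL."""
    seen, queue = {start}, deque([start])
    found = {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        found[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return found

def build_index(found):
    """Storage step: map each word to the pages containing it, with counts."""
    index = {}
    for url, text in found.items():
        words = re.findall(r"[a-z]+", text.lower())
        for word, count in Counter(words).items():
            index.setdefault(word, {})[url] = count
    return index

def search(index, query):
    """Server step: score pages by how often the query words occur."""
    scores = Counter()
    for word in re.findall(r"[a-z]+", query.lower()):
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Highest score first; ties are broken alphabetically by URL.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [url for url, _ in ranked]

index = build_index(crawl("site/home"))
results = search(index, "spider food")
```

Note that `site/orphan` never reaches the index, because no crawled page links to it. That is the same reason navigation a spider cannot follow keeps pages out of the search results.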
