Web Hosting, Web Design, Graphic design with MOJO!
The PC Mojo Difference:
One of the tasks we will perform for you, if you have your web site designed by PC Mojo, is to optimize your site for effective indexing by the search engines.
PC Mojo has over 15 years of experience in computer programming and networking... We work closely with you until your technical problem is resolved...
GUARANTEED!

EEEEWWWWW - a SPIDER!!!

What's a Spider?
Many search engines send out programs called robots, spiders, webbots, or crawlers to visit your site and gather information about all of your web pages. If you can get to your server's access logs, you can look for their tracks. If you can't, ask your web host for access, or switch hosts. Once you know what to look for, you can tell when a spider has come to call. If you spent a lot of time registering your site with a search engine and have been waiting for its spider to come calling, it's good to know that it actually showed! The logs will also tell you which pages the spider did and did not find, so you can make corrections as necessary.
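
Here is a rough sketch of the "what they have and have not found" idea. It assumes your server writes its access log in the Common or Combined Log Format (many servers do, but check yours), and the sample lines, addresses, and the "ExampleBot" name are made up for illustration.

```python
import re

# Pulls the requested path and the status code out of one log line.
# Assumes Common/Combined Log Format; adjust if your server logs differently.
REQUEST_AND_STATUS = re.compile(r'"(?:\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})')

def found_and_missing(lines):
    """Split requested paths into those served (2xx) and those not found (404)."""
    found, missing = [], []
    for line in lines:
        match = REQUEST_AND_STATUS.search(line)
        if match is None:
            continue  # not a line in the assumed format
        if match["status"] == "404":
            missing.append(match["path"])
        elif match["status"].startswith("2"):
            found.append(match["path"])
    return found, missing

# Made-up sample lines for one hypothetical spider.
SPIDER_LINES = [
    '192.0.2.10 - - [10/Oct/2000:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [10/Oct/2000:13:55:40 -0700] "GET /old-page.html HTTP/1.0" 404 310 "-" "ExampleBot/1.0"',
]

found, missing = found_and_missing(SPIDER_LINES)
print("found:", found)
print("missing:", missing)
```

Anything that lands in the missing list is a page the spider asked for and didn't get, which usually means a broken link somewhere that is worth correcting.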

For a good explanation of spiders, please take a look at the following link. There's no sense in us reinventing the wheel and telling you all over again when they do it better!
http://info.webcrawler.com/mak/projects/robots/robots.html

How can you tell if the Spiders are visiting?
You have to be able to get into the access logs for your server and see who has visited your site. You should have direct access to these logs, but your host may instead give you a more 'user friendly' front end or control panel. As long as you can get the information on EACH visitor to your site, including who they are and what they looked at, you should be cookin'.
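
If you have the raw logs, a few lines of Python can pull out "who they are and what they looked at" for each visitor. This is only a sketch: it assumes the Combined Log Format, and the sample lines, addresses, and the "ExampleBot" name are invented for the example.

```python
import re
from collections import defaultdict

# Combined Log Format (assumed). The referer and user agent fields are
# optional so that plain Common Log Format lines still match.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def pages_by_visitor(lines):
    """Map each (host, user agent) pair to the list of paths it requested."""
    visits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        key = (match["host"], match["agent"] or "-")
        visits[key].append(match["path"])
    return dict(visits)

# Made-up sample log lines.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2000:13:55:36 -0700] "GET /robots.txt HTTP/1.0" 200 26 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [10/Oct/2000:13:55:37 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [10/Oct/2000:14:02:11 -0700] "GET /services.html HTTP/1.0" 200 5120 "http://example.com/" "Mozilla/4.0"',
]

visits = pages_by_visitor(SAMPLE_LOG)
for (host, agent), paths in visits.items():
    print(host, agent, paths)
```

A visitor whose first request is for /robots.txt, like the first one in the sample, is a strong hint that a spider has come to call.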

There are log analysis tools out there for free and for sale. We won't review them here, but the industry leader and the best that PC Mojo has seen is WebTrends. Be prepared to spend some money to purchase the product.

WebTrends is a GREAT product, and it's the one PC Mojo uses. It's not cheap, but it isn't even close to being the most expensive software package out there. Go take a look at their stuff, try it out, and buy it if you want to do your own analysis. OR... you can just have PC Mojo do it for you!

What are the names of the Spiders?
It should be fairly obvious when you look at your logs 'who' the spiders are and who the real people are. Depending on the log analysis software you're using, you may be able to filter exclusively for the robots if you know what their names are.
Other people stay right on top of all this stuff. PC Mojo reads their pages now and then, along with other resources, so we might as well send you over there. Just don't stay TOO long; come on back to get your Mojo workin'!
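
If your log analysis software can't filter for robots by name, the idea is simple enough to sketch yourself. The name fragments below are examples only and will go stale, so swap in names from a current spider list. The visitors in the sample data are made up.

```python
# Example name fragments only; replace them with names from a current spider list.
SPIDER_NAME_HINTS = ("googlebot", "slurp", "scooter")

def looks_like_spider(user_agent, requested_paths):
    """Guess whether a visitor is a robot from its user agent and its requests."""
    agent = user_agent.lower()
    if any(hint in agent for hint in SPIDER_NAME_HINTS):
        return True
    # Well-behaved robots ask for /robots.txt; browsers normally do not.
    return "/robots.txt" in requested_paths

# Made-up visitors: user agent string -> paths requested.
visitors = {
    "Googlebot/2.1 (+http://www.googlebot.com/bot.html)": ["/robots.txt", "/index.html"],
    "UnknownCrawler/0.1": ["/robots.txt", "/contact.html"],
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)": ["/index.html", "/services.html"],
}

spiders = [agent for agent, paths in visitors.items() if looks_like_spider(agent, paths)]
people = [agent for agent in visitors if agent not in spiders]
print("spiders:", spiders)
print("people:", people)
```

This is a guess, not proof: a robot can send any user agent it likes, and a badly behaved one may never ask for /robots.txt at all.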

Spider Chart
http://searchenginewatch.com/webmasters/spiderchart.html
This is a great site. Bookmark their home page, subscribe to their newsletter, and stop in now and then to see what's cookin' search engine-wise.

The Web Robots Database
http://www.robotstxt.org/wc/active.html
This place has a bunch of techie info on the whole spider deal. You just have to cruise around a little to dig out the good stuff.

How can you kill the Spiders?
If the idea of spiders crawling through your web site is just too icky for you, you can squash the better behaved ones (the ones that check for and obey your instructions; badly behaved robots simply ignore them). It's simple to do, actually, IF you have access to the root directory of your server. If you don't, because your web site is hosted on somebody else's domain, you can ask them to set your web site up so that spiders are squashed right out of your site. If they won't cooperate, and having NO spiders rooting around in your site is important to you, it's time to shop for another place to host your site. PC Mojo, for example. <== blatant plug
Basically, if you DO have access to the root directory of your server, place a file called 'robots.txt' in there with the following lines:

User-agent: *
Disallow: /

The first line tells robots that ALL of them are affected by the next line, which tells them that ALL directories of your site are off limits. Easy, huh? If you want to delve further into spider control via the robots.txt file, please check out the following link:
http://www.robotstxt.org/wc/exclusion.html
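
If you'd like to check what a robots.txt file will do before you upload it, Python's standard library ships a parser for it (urllib.robotparser). The sketch below feeds it the two-line file shown above, plus a second, hypothetical file that blocks only a made-up /private/ directory. "ExampleBot" and the example.com URLs are placeholders, not real robots or pages.

```python
from urllib.robotparser import RobotFileParser

# The two-line file from this page: every robot, every directory.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# Hypothetical variation: only the /private/ directory is off limits.
BLOCK_ONE_DIRECTORY = """\
User-agent: *
Disallow: /private/
"""

def allowed(robots_txt, agent, url):
    """Return True if a robot that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

home = "http://www.example.com/index.html"
secret = "http://www.example.com/private/page.html"

print(allowed(BLOCK_EVERYTHING, "ExampleBot", home))       # False
print(allowed(BLOCK_ONE_DIRECTORY, "ExampleBot", home))    # True
print(allowed(BLOCK_ONE_DIRECTORY, "ExampleBot", secret))  # False
```

Keep in mind this only shows how an obedient robot reads the file. Nothing in robots.txt can force a robot to obey it.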

You should also put the following meta tag in the <HEAD> section of any page you do NOT want indexed:
<META NAME="ROBOTS" CONTENT="NOINDEX">
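
To double-check that a page really carries the tag, something like the following sketch would do. It uses Python's built-in html.parser module. The class and function names are our own inventions for this example, and the sample pages are made up.

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects the directives from any <META NAME="ROBOTS"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        for item in attributes.get("content", "").split(","):
            if item.strip():
                self.directives.add(item.strip().lower())

def is_noindex(html_text):
    """Return True if the page asks robots not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return "noindex" in finder.directives

hidden_page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX"></head><body>Secret</body></html>'
public_page = '<html><head><title>Welcome</title></head><body>Hello</body></html>'

print(is_noindex(hidden_page))  # True
print(is_noindex(public_page))  # False
```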
