|Log Analyzers VS Hosted Statistics
|Article ID: 34
|Created On: 22 Dec 2004
|Author: Joshua Johanson
|Edited On: 06 Jan 2005
|Edited By: Joshua Johanson
|Before you read, make sure you have reviewed the definitions of page views, visitors, unique and returning visitors, visits, and hits. With those basic definitions out of the way, read this comparison of log analyzers and hosted statistics/trackers.
Log analyzers are pieces of software that parse the log files generated by any webserver that follows the standard webserver logging formats. These log files can be very large and are VERY detailed about which files were requested from your webserver and when. Every time a page, image, movie, or any other kind of file is downloaded from your webserver, the date/time and the IP address of the requestor are logged in the webserver's logfile. Whenever you want a report, the log analyzer has to parse your webserver's logfile line by line and categorize what was requested into meaningful reports that can be read by the user.
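To make the line-by-line parsing concrete, here is a minimal sketch in Python. It assumes the webserver writes the widely used Common Log Format (host, identity, user, timestamp, request, status, size); the sample line, the pattern name CLF_PATTERN, and the function parse_line are illustrative and are not taken from any particular log analyzer.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    """Return a dict describing one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Timestamps look like 22/Dec/2004:10:15:32 -0700
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    # A "-" in the size field means no body was sent; treat it as 0 bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry

# Hypothetical sample line, made up for illustration.
sample = '203.0.113.7 - - [22/Dec/2004:10:15:32 -0700] "GET /index.html HTTP/1.0" 200 2326'
entry = parse_line(sample)
print(entry["ip"], entry["request"], entry["size"])
```

A real analyzer would run something like this over every line of the logfile, which is why report generation slows down as the file grows.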
PROS OF USING A LOG ANALYZER
Log Analyzers record every single request to download a file, along with the requestor's IP address. This means that Log Analyzers keep track of a few things that hosted statistics/trackers cannot. For example, Log Analyzers can record visits from spiders, or crawlers. This is especially useful if you'd like to know when Google is parsing your site. Hosted statistics, up to this point, do not have a way to do this. Also, log analyzers can keep a tally of how much actual bandwidth has been used over time. Because the log files contain the names and sizes of the files requested, the sizes can be added together to find the total amount used.
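As a rough sketch of both ideas, the Python below totals the bytes served and picks out crawler requests from log entries that have already been parsed. The entries, the field names, and the CRAWLER_MARKERS list are hypothetical, and spotting crawlers this way assumes the webserver also logs the user-agent string, which the basic log format does not necessarily include.

```python
# Hypothetical, already-parsed log entries (made up for illustration).
entries = [
    {"path": "/index.html", "size": 2326,
     "user_agent": "Mozilla/4.0 (compatible; MSIE 6.0)"},
    {"path": "/logo.gif", "size": 4100,
     "user_agent": "Mozilla/4.0 (compatible; MSIE 6.0)"},
    {"path": "/index.html", "size": 2326,
     "user_agent": "Googlebot/2.1 (+http://www.google.com/bot.html)"},
]

# Illustrative substrings only; a real list would be much longer.
CRAWLER_MARKERS = ("googlebot", "slurp", "msnbot")

def is_crawler(user_agent):
    """Guess whether a request came from a search engine crawler."""
    ua = user_agent.lower()
    return any(marker in ua for marker in CRAWLER_MARKERS)

def total_bytes(entries):
    """Add up the size of every file served to estimate bandwidth used."""
    return sum(e["size"] for e in entries)

crawler_hits = [e for e in entries if is_crawler(e["user_agent"])]
print("bytes served:", total_bytes(entries))
print("crawler requests:", len(crawler_hits))
```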
CONS OF USING A LOG ANALYZER
Log Analyzers must be installed on the webserver. They can be painful to install and configure, and you need a capable network administrator to administer the software itself.
Log Analyzers ONLY count a visitor by IP address. This is far from an accurate method for tallying how many actual visitors came to your site in a given time. Log Analyzers will give a MUCH more inflated count of visitors than the site actually sees, for a few reasons:
1. IP addresses are often "DYNAMIC". This means they can change each time a person connects to the internet with their computer (usually when they use a dialup account). Other ISPs, like AOL, will actually CHANGE a visitor's IP address with every single page request. So, for example, if I were an AOL user and I came to your site and viewed 10 pages, I would be counted as 10 visitors by a log analyzer.
2. Visitors can be counted that aren't really visitors. For example, when Google or any other search engine parses your site, each one will be counted as a visitor, because the log analyzer doesn't know the difference between an automated machine and a real person. There are many automated systems that can parse a site each day.
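The inflation described in point 1 can be illustrated with a small Python sketch. The request data below is invented: one person views 10 pages through an ISP that hands out a new IP address for each request, while a second person views 2 pages from a single address. The function count_visitors_by_ip is a hypothetical stand-in for IP-based counting, not the method of any specific analyzer.

```python
# Hypothetical requests as (ip, path) pairs.
# Person A: 10 page views, each arriving from a different IP address.
requests = [("198.51.100.%d" % n, "/page%d.html" % n) for n in range(1, 11)]
# Person B: 2 page views from one stable IP address.
requests += [("203.0.113.7", "/index.html"), ("203.0.113.7", "/about.html")]

def count_visitors_by_ip(requests):
    """Count 'visitors' the way IP-based counting does: one per distinct IP."""
    return len({ip for ip, _path in requests})

# Two real people visited, but every distinct IP is counted as a visitor.
print("visitors by IP:", count_visitors_by_ip(requests))
```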
Another negative aspect of using a log analyzer is that every time you want the latest reports, the log analyzer has to parse the log file again. Parsing can take an enormous amount of time if the log file is very big. Also, if you move your site to another webserver or hosting provider, the log analyzer has to be installed on the new webserver to continue your stats.
PROS OF USING A HOSTED TRACKER
CONS OF USING A HOSTED SERVICE
Currently there is no way for WebSTAT to know when Google or any other search engine parses a site. This is valuable data that we wish we could capture, but we haven't yet found a way to do it. Also, there is no way for WebSTAT to know how much actual bandwidth a site uses, because WebSTAT cannot know the sizes of the files downloaded.
With that said, there are some things that should be understood. Log analyzers will log EVERY SINGLE request made for a page/file. Hosted stats will ONLY log those pages that have actually been in a browser long enough to load most or all of the page, so only the pages that can actually be LOOKED at, even if for VERY BRIEF moments of time, are counted. Make up your own mind about which you think is best. I, for one, prefer to know only about the pages that have actually been SEEN by human eyes.