Why is Webalizer showing fewer page views than my raw log files report?
My Webalizer server stats report far fewer page views than I find by looking through my raw log files. As I'm not sure I am interpreting the raw logs correctly, I would appreciate your advice on the method I am using. Here is an example of what a line of my actual raw log files looks like:
62.77.164.550 - - [20/May/2009:09:28:04 -0700] "GET /index.php/ HTTP/1.1" 200 45626 "http://www.google.ie/search?hl=en&q=..." "Mozilla/4.0 (compatible; ....)"
Below I break it out into fields (each numbered value corresponds to one field in the line):
1. 62.77.164.550
2. -
3. -
4. [20/May/2009:09:28:04 -0700]
5. "GET /index.php/ HTTP/1.1"
6. 200
7. 45626
8. "http://www.google.ie/search?hl=en&q=..."
9. "Mozilla/4.0 (compatible; ....)"
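To make the field breakdown above concrete, here is a minimal sketch of how such a line could be split into the same nine fields. It assumes the log is in the common "combined" format shown in the example; the names `LOG_PATTERN` and `parse_line` are my own, and the pattern does not handle escaped quotes inside a field.

```python
import re

# Assumed layout: host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def parse_line(line):
    """Return a dict of the nine fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None

# The example line from this question, split across string literals for readability.
sample = ('62.77.164.550 - - [20/May/2009:09:28:04 -0700] '
          '"GET /index.php/ HTTP/1.1" 200 45626 '
          '"http://www.google.ie/search?hl=en&q=..." '
          '"Mozilla/4.0 (compatible; ....)"')

fields = parse_line(sample)
print(fields["request"])   # field 5
print(fields["referrer"])  # field 8
```

Lines that do not fit the pattern come back as `None`, so they can be counted separately instead of silently skewing the totals.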
To get the number of page views, I count only the lines where field 5 requests a file with a .php extension. With this method I get a number of page views hugely bigger than what Webalizer reports.
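A rough sketch of this counting method follows. It is a slightly stricter variant than simply looking for ".php" in field 5: it assumes that only GET requests answered with status 200 should count, and it strips the query string so dynamic URLs are matched on their path. The function name `is_php_page` and the sample requests are made up for illustration.

```python
from urllib.parse import urlsplit

def is_php_page(request, status):
    """True if the request is a GET for a path with a .php segment and status is 200."""
    parts = request.split()
    # Assumption: only well-formed, successful GET requests count as page views.
    if len(parts) != 3 or parts[0] != "GET" or status != "200":
        return False
    path = urlsplit(parts[1]).path  # drops any ?query=string
    # Check each path segment so that "/index.php/" also matches.
    return any(seg.lower().endswith(".php") for seg in path.split("/"))

# Hypothetical (field 5, field 6) pairs, not taken from my real logs.
requests = [
    ("GET /index.php/ HTTP/1.1", "200"),
    ("GET /index.php?page=2 HTTP/1.1", "200"),
    ("GET /images/logo.png HTTP/1.1", "200"),
    ("GET /index.php HTTP/1.1", "404"),
    ("POST /contact.php HTTP/1.1", "200"),
]

page_views = sum(is_php_page(req, status) for req, status in requests)
print(page_views)
```

Whether errors, redirects, and POST requests belong in the total is a choice, and a different choice here will change the count.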
I also noticed that in 84% of the raw log lines, field 8 (the referrer) is "-". As I understand it, this happens when the user typed in the URL or used a bookmark to load the page, but 84% of page views coming from bookmarks or direct requests does not seem realistic. Could a "-" referrer instead mean that the visit came from a web spider or crawler bot? Even then, 84% does not seem realistic.
Is there a method to tell which log entries are caused by robots?
Also, since most of my page views have a dynamic URL, I wonder: does Webalizer count all page views that have dynamic URLs?
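One heuristic I could imagine, sketched below, is to flag a host as a robot when its user agent (field 9) contains a crawler-like keyword or when it has requested /robots.txt. The keyword list, the function name `find_robot_hosts`, and the sample entries are assumptions for illustration only; a robot that sends a browser-like user agent and never fetches /robots.txt would not be caught.

```python
import re

# Hypothetical keyword list; real crawlers may use other names or imitate a browser.
BOT_HINTS = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)

def find_robot_hosts(entries):
    """Return the set of hosts that look like robots.

    entries: iterable of (host, path, agent) tuples taken from fields 1, 5 and 9.
    A host is flagged if its agent matches BOT_HINTS or it requested /robots.txt.
    """
    robots = set()
    for host, path, agent in entries:
        if BOT_HINTS.search(agent) or path == "/robots.txt":
            robots.add(host)
    return robots

# Made-up entries using documentation-range IP addresses.
entries = [
    ("192.0.2.10", "/index.php",
     "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("192.0.2.20", "/robots.txt", "-"),
    ("192.0.2.30", "/index.php",
     "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"),
]

print(sorted(find_robot_hosts(entries)))
```

Lines from the flagged hosts could then be left out before counting page views, to see how much of the gap they explain.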