A web server keeps a record of visitors to a website in a file called a web log. A log file therefore contains records of client visits. Typically, each record contains the visitor's (user's or client's) IP address, the date and time of the visit, referral information, and so on. Table 1 gives a complete listing of the logging fields available with IIS 6.0 in the W3C Extended Log File Format.

Table 1. W3C Extended Log File fields available in IIS 6.0

| Field | Comments |
|---|---|
| date | The date on which the visit took place. |
| time | The time at which the visit occurred. |
| c-ip | The IP address of the visitor (user or, more accurately, client) making the visit. |
| cs-username | The name of the authenticated user who made the visit. (Note: anonymous users are indicated by a hyphen.) |
| s-sitename | The Internet service name and instance number of the site that served the request. |
| s-computername | The name of the server on which the log file entry was generated. |
| s-ip | The IP address of the server on which the log file entry was generated. |
| s-port | The server port number configured for the service. |
| cs-method | The requested action, e.g., a GET method. |
| cs-uri-stem | The target of the action, e.g., active-server-pages.asp. |
| cs-uri-query | The query string, if any, used in dynamic pages. |
| sc-status | HTTP status code |
| sc-win32-status | Windows status code |
| sc-bytes | Number of bytes sent from the server |
| cs-bytes | Number of bytes the server received |
| time-taken | Length of action time (in milliseconds) |
| cs-version | The protocol version (FTP or HTTP) that the client/visitor used. |
| cs-host | Host header name |
| cs(User-Agent) | The type of browser the client used. |
| cs(Cookie) | Content of any sent or received cookie |
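| cs(Referer) | URI of the previously visited site (or web page), which provided the link to the current one. The field name follows the spelling of the HTTP Referer header. |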
| sc-substatus | Substatus error code |
Note: in IIS Manager you can check or uncheck the fields shown in Table 1 to suit your specific website needs. For example, you may choose not to log cs-username (user name) if your site's users are not authenticated.
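Because the set of logged fields is configurable, a program reading the log should not assume fixed column positions. W3C Extended log files list the active fields in a `#Fields:` directive line, so a reader can use that line to label each value. The following is a minimal Python sketch under that assumption; the function name `parse_w3c_log` and the sample log lines (including the IP address and page names) are invented for illustration and are not taken from a real server.

```python
from typing import Dict, Iterable, Iterator, List


def parse_w3c_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per log entry, keyed by the names in the #Fields: directive."""
    fields: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directive lines start with '#'; only #Fields: is needed here.
            if line.startswith("#Fields:"):
                fields = line[len("#Fields:"):].split()
            continue
        if not fields:
            continue  # no field list seen yet, so the entry cannot be labeled
        values = line.split()
        if len(values) != len(fields):
            continue  # skip entries that do not match the declared fields
        yield dict(zip(fields, values))


# Invented sample data in the general shape of a W3C Extended log file.
SAMPLE_LOG = """\
#Software: Microsoft Internet Information Services 6.0
#Version: 1.0
#Fields: date time c-ip cs-method cs-uri-stem sc-status time-taken
2007-05-02 17:42:15 192.0.2.10 GET /default.asp 200 120
2007-05-02 17:42:20 192.0.2.10 GET /products.asp 404 15
"""

entries = list(parse_w3c_log(SAMPLE_LOG.splitlines()))
for entry in entries:
    print(entry["c-ip"], entry["cs-uri-stem"], entry["sc-status"])
```

Splitting on whitespace relies on each field value containing no spaces; check this against your own log files before depending on it.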
Your web server therefore records these fields for every visit. As the number of visits increases, so does the number of records in the log file. The log file may become huge, depending on the traffic volume and how often the server writes to it. You may now be asking:
1. Why do we track visits?
2. What do we do with the collected data?
3. How do we analyze those records?
The answer to the first question is that we want not only to make our website available but also to learn what clients are accessing on it and when. Once a website is publicly accessible, anyone with an internet connection can reach it, whether a hacker or a legitimate user. Log files that are not analyzed and viewed regularly will not readily show what happened in each visit (for example, whether the server generated an error for a particular request or the request was successful), but once analyzed they may reveal meaningful and useful information that you may not have thought possible. By analyzing the log files, you may learn, for example, which pages are visited most often, where your visitors come from, and how long their visits last.
Answers to these questions may help you make your website more appealing or satisfying to the user. If, for instance, you know the most frequently visited pages on your website, you may enhance those pages and perhaps promote any new product or service on them for maximum exposure. Additionally, you may carry the content and styling of the most frequently visited pages over to the rest of your website.
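Finding the most frequently visited pages amounts to counting how often each cs-uri-stem value appears. The sketch below assumes the log has already been parsed into a list of dictionaries keyed by field name; the helper name `most_visited_pages`, the choice to count only requests with a 2xx sc-status, and the sample entries are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Invented, already-parsed log entries (only the fields used here).
entries: List[Dict[str, str]] = [
    {"cs-uri-stem": "/default.asp", "sc-status": "200"},
    {"cs-uri-stem": "/products.asp", "sc-status": "200"},
    {"cs-uri-stem": "/default.asp", "sc-status": "200"},
    {"cs-uri-stem": "/missing.asp", "sc-status": "404"},
    {"cs-uri-stem": "/products.asp", "sc-status": "200"},
    {"cs-uri-stem": "/default.asp", "sc-status": "200"},
    {"cs-uri-stem": "/contact.asp", "sc-status": "200"},
]


def most_visited_pages(
    log_entries: List[Dict[str, str]], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Return the top_n pages by number of successful (2xx) requests."""
    counts = Counter(
        e["cs-uri-stem"]
        for e in log_entries
        if e.get("sc-status", "").startswith("2")
    )
    return counts.most_common(top_n)


top_pages = most_visited_pages(entries, top_n=2)
print(top_pages)
```

Filtering on the status code keeps failed requests, such as the 404 in the sample, from being counted as page visits. Whether images, style sheets, and other non-page requests should also be excluded depends on the site.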
If the log files reveal visitors from other countries, you may consider translating your website into the languages those visitors speak.
The analysis of the logs may also reveal that most of your visits are short. This may suggest that you should improve your site so that people spend more time viewing it.
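The log does not record visit length directly, but it can be estimated from the date, time, and c-ip fields. The sketch below groups requests by client IP address and starts a new visit whenever a client is silent for longer than a chosen gap. The 30-minute gap, the function name `visit_durations`, and the sample entries are assumptions for illustration, and treating one IP address as one visitor is itself an approximation.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

SESSION_GAP = timedelta(minutes=30)  # assumed cut-off between two visits

# Invented, already-parsed log entries (only the fields used here).
entries: List[Dict[str, str]] = [
    {"date": "2007-05-02", "time": "10:00:00", "c-ip": "192.0.2.10"},
    {"date": "2007-05-02", "time": "10:02:30", "c-ip": "192.0.2.10"},
    {"date": "2007-05-02", "time": "10:05:00", "c-ip": "192.0.2.20"},
    {"date": "2007-05-02", "time": "11:30:00", "c-ip": "192.0.2.10"},
]


def visit_durations(
    log_entries: List[Dict[str, str]]
) -> List[Tuple[str, timedelta]]:
    """Return (client IP, estimated duration) for each visit."""
    by_client: Dict[str, List[datetime]] = defaultdict(list)
    for e in log_entries:
        stamp = datetime.strptime(e["date"] + " " + e["time"], "%Y-%m-%d %H:%M:%S")
        by_client[e["c-ip"]].append(stamp)

    durations: List[Tuple[str, timedelta]] = []
    for ip, stamps in by_client.items():
        stamps.sort()
        start = prev = stamps[0]
        for stamp in stamps[1:]:
            if stamp - prev > SESSION_GAP:
                # Long silence: close the current visit and start a new one.
                durations.append((ip, prev - start))
                start = stamp
            prev = stamp
        durations.append((ip, prev - start))
    return durations


durations = visit_durations(entries)
for ip, length in durations:
    print(ip, length)
```

A visit consisting of a single request gets a duration of zero, because the log holds no record of when the visitor left the last page. Estimates produced this way are therefore lower bounds.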
The motivation behind tracking visits (users or clients) is to gather a wealth of information that can potentially help improve the website for future visitors.
We have already alluded to the answer to our second question (What do we do with the collected data?): we analyze it in order to further improve our websites. The collected data, among other things, helps us find click patterns. A click pattern is the route a single user takes to navigate through a website. Click patterns are also sometimes referred to simply as paths. Because a web server log file can contain many lines of records, it would be difficult to find click patterns or any other desired information without a sophisticated analysis tool. Examples of web analysis tools include WebTrends, ClickTracks, and Web Log Explorer. (If you already have a log file to analyze, you may download a free trial version of Web Log Explorer Log Analyzer. Although this tool is free to try and inexpensive to purchase, it provides useful information.)
Why would we be interested in click patterns? Click patterns are very important to any website manager, because they can be used to improve the navigation and structure of a website. If, for example, it takes a minimum of 10 clicks to reach a destination page, how likely is it that the page will ever be visited? The chances of a user navigating to such a deep page are next to nothing. Such a deep navigation structure suggests that the website is hard to use and has an inefficient click pattern. If, on the other hand, a website lets its visitors find any page in at most, say, 4 or 5 clicks, users will probably not have much difficulty finding the page they want. Consequently, that site will probably have a more efficient click pattern than the first example. An efficient click pattern may also suggest that the user is very familiar with the website.
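To make the idea of a click pattern concrete, the following sketch rebuilds each client's path from parsed log entries and counts how many clicks it took to first reach a given page. It is a simplification and not how any of the tools named above work: one client IP address is treated as one user, all of that client's requests are treated as a single visit, and the names `click_paths` and `clicks_to_reach` as well as the sample entries are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Invented, already-parsed log entries (only the fields used here).
entries: List[Dict[str, str]] = [
    {"date": "2007-05-02", "time": "10:00:00", "c-ip": "192.0.2.10", "cs-uri-stem": "/default.asp"},
    {"date": "2007-05-02", "time": "10:00:05", "c-ip": "192.0.2.20", "cs-uri-stem": "/default.asp"},
    {"date": "2007-05-02", "time": "10:00:40", "c-ip": "192.0.2.10", "cs-uri-stem": "/products.asp"},
    {"date": "2007-05-02", "time": "10:01:10", "c-ip": "192.0.2.10", "cs-uri-stem": "/products/widgets.asp"},
    {"date": "2007-05-02", "time": "10:02:00", "c-ip": "192.0.2.10", "cs-uri-stem": "/order.asp"},
]


def click_paths(log_entries: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Return each client's requested pages in chronological order."""
    paths: Dict[str, List[str]] = defaultdict(list)
    # ISO-style date and time strings sort chronologically as plain text.
    for e in sorted(log_entries, key=lambda e: (e["date"], e["time"])):
        paths[e["c-ip"]].append(e["cs-uri-stem"])
    return dict(paths)


def clicks_to_reach(path: List[str], target: str) -> Optional[int]:
    """Clicks made after the landing page before target first appears, or None."""
    if target not in path:
        return None
    return path.index(target)


paths = click_paths(entries)
depth = clicks_to_reach(paths["192.0.2.10"], "/order.asp")
print(paths["192.0.2.10"])
print("clicks to reach /order.asp:", depth)
```

Averaging this depth over many visitors gives a rough measure of how deep a page sits in practice, which can then be compared against a target such as the 4 or 5 clicks mentioned above.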
We have already hinted at the answer to our third question: how do we analyze those records? We can analyze web log records with a number of software packages, including Web Log Explorer Log Analyzer. As you learn the process of analyzing logs with web log analysis software, you will notice that you need more than one tool. ClickTracks Analyzer 6.1 (the latest version), for instance, does not report spider activity or break down visits by country or city. Web Log Explorer Log Analyzer gives you that information but does not show click patterns as neatly as ClickTracks does. So there is a tradeoff. Try the free trial versions of these packages to learn more about analyzing web logs.