Browser_calc Specifications

Browser_calc is a command-line program, written in C, that searches through the agent_log from a web server and produces statistics about the browsers and machines found. Performance is good: an agent_log of 3 million lines (about 160 Mbytes) was processed in ca. 20 minutes on an SGI R4000. This was achieved by using binary trees to store all the information from the agent log. The usage is:

browser_calc -f agent_log

Alternatively, using stdin:

cat agent_log* | browser_calc

Either form produces results like this:

     Browser Results Feb/1996        Mar/1996        Apr/1996       

         Mozilla 1.1       0(  0.0)        0(  0.0)      234( 30.2) 
         Mozilla 2.x     748(100.0)      498( 99.8)      532( 68.6) 
         NCSA Mosaic       0(  0.0)        1(  0.2)        9(  1.2) 
               Total      748            499            775      

     Machine Results Feb/1996        Mar/1996        Apr/1996       

       Macintosh 68K       4(  0.5)        0(  0.0)        0(  0.0) 
       Macintosh PPC      10(  1.3)       20(  4.0)        0(  0.0) 
         Windows 3.x      24(  3.2)        0(  0.0)        1(  0.1) 
          Windows 95      18(  2.4)        4(  0.8)        0(  0.0) 
                 X11     692( 92.5)      475( 95.2)      774( 99.9) 
               Total      748             499             775       
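
The binary tree storage mentioned at the top can be pictured with a minimal sketch. This is not the browser_calc source: the node layout and the name tree_count are hypothetical, and a real version would presumably keep one count per month rather than a single total.

```c
#include <stdlib.h>
#include <string.h>

/* One node per distinct browser or machine name.
   Hypothetical layout, not taken from browser_calc itself. */
struct node {
    char *name;
    unsigned long count;
    struct node *left, *right;
};

/* Add one occurrence of name to the tree rooted at *root.
   An existing node has its count bumped; otherwise a new node
   is linked in. Returns the node, or NULL if memory runs out. */
struct node *tree_count(struct node **root, const char *name)
{
    while (*root != NULL) {
        int cmp = strcmp(name, (*root)->name);
        if (cmp == 0) {
            (*root)->count++;
            return *root;
        }
        /* Smaller names go left, larger names go right. */
        root = (cmp < 0) ? &(*root)->left : &(*root)->right;
    }

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->name = malloc(strlen(name) + 1);
    if (n->name == NULL) {
        free(n);
        return NULL;
    }
    strcpy(n->name, name);
    n->count = 1;
    n->left = n->right = NULL;
    *root = n;
    return n;
}
```

Each lookup costs roughly the depth of the tree instead of a scan over every name seen so far, which is one plausible reason the approach scales to millions of lines. The tree here is not rebalanced, so that holds only while names arrive in a reasonably mixed order. An in-order walk of such a tree visits names in strcmp order, which is consistent with the alphabetical rows in the sample output.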

In version 1b6, each line from the agent_log must contain a section in square brackets (the time section) and a section in round brackets (the machine information) before it will be looked at.
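
A minimal sketch of such a pre-check follows. The function name has_required_sections is hypothetical, and the sketch assumes that the time section comes before the machine section on the line, which the description of the browser name further down suggests but does not state outright.

```c
#include <string.h>

/* Returns 1 if line contains "[...]" followed later by "(...)",
   otherwise 0. Assumes the time section precedes the machine section. */
int has_required_sections(const char *line)
{
    const char *p = strchr(line, '[');
    if (p == NULL)
        return 0;
    p = strchr(p + 1, ']');
    if (p == NULL)
        return 0;
    p = strchr(p + 1, '(');
    if (p == NULL)
        return 0;
    return strchr(p + 1, ')') != NULL;
}
```

With this check, a made-up line such as "[01/Apr/1996:10:00:00] Mozilla/2.0 (X11; I; IRIX 5.3 IP22)" would be accepted, while a line lacking either section would be skipped.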

First the machine is determined. The words between the opening round bracket and the first semicolon are taken as the machine unless:

  1. the word Mac appears in the input line: if PPC or PowerPC is also in the line, the machine is Macintosh PPC; otherwise it is assumed to be Macintosh 68K.
  2. the word Win appears in the input line: if Win95 or Windows 95 is present, the machine is Windows 95; if NT is found, the machine is Windows NT; otherwise it is assumed to be Windows 3.x.
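
The machine rules above can be sketched as follows. This is an illustration under stated assumptions, not the browser_calc source: the name classify_machine is hypothetical, the Mac test is assumed to run before the Win test, the Windows 95 test is assumed to run before the NT test, and a line with no semicolon is assumed to end the machine name at the closing bracket.

```c
#include <stdio.h>
#include <string.h>

/* Write the machine label for line into out (at most outsz bytes).
   Leaves out empty if the line has no opening round bracket. */
void classify_machine(const char *line, char *out, size_t outsz)
{
    if (outsz == 0)
        return;
    out[0] = '\0';

    if (strstr(line, "Mac") != NULL) {
        /* Rule 1: Macintosh, split into PPC and 68K. */
        int ppc = strstr(line, "PPC") != NULL ||
                  strstr(line, "PowerPC") != NULL;
        snprintf(out, outsz, "%s", ppc ? "Macintosh PPC" : "Macintosh 68K");
    } else if (strstr(line, "Win") != NULL) {
        /* Rule 2: Windows. The order of these tests is an assumption. */
        if (strstr(line, "Win95") != NULL ||
            strstr(line, "Windows 95") != NULL)
            snprintf(out, outsz, "%s", "Windows 95");
        else if (strstr(line, "NT") != NULL)
            snprintf(out, outsz, "%s", "Windows NT");
        else
            snprintf(out, outsz, "%s", "Windows 3.x");
    } else {
        /* Default: text between '(' and the first ';'
           (or ')' when there is no semicolon, an assumption). */
        const char *open = strchr(line, '(');
        if (open == NULL)
            return;
        open++;
        size_t len = strcspn(open, ";)");
        snprintf(out, outsz, "%.*s", (int)len, open);
    }
}
```

Because the Mac and Win tests look at the whole input line rather than only the bracketed section, a stray "Mac" or "Win" elsewhere on the line would also trigger them; that follows the wording of the rules as written.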

Next the browser information is considered. The initial assumption is that the word after the time section, up to the first slash (which introduces the version number, if there is one), is the browser name. The exceptions are:

  1. if the word MSIE appears in the input line, Microsoft Internet Explorer is assumed. This browser usually identifies itself as Mozilla/1.22, so this check is made first.
  2. each of the main Netscape browsers is searched for in turn, so that the beta versions are accumulated together.
  3. all the versions of NCSA Mosaic are brought together.
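
The browser rules above can be sketched in the same way. Again this is only an illustration: classify_browser is a hypothetical name, and the two-entry Netscape table is an assumption built from the labels in the sample output (Mozilla 1.1 and Mozilla 2.x), since the actual list of versions searched for is not given here. Matching NCSA Mosaic by looking for both "NCSA" and "Mosaic" on the line is likewise an assumption.

```c
#include <stdio.h>
#include <string.h>

/* Write the browser label for line into out (at most outsz bytes).
   Leaves out empty if the line has no time section. */
void classify_browser(const char *line, char *out, size_t outsz)
{
    /* Hypothetical grouping table, inferred from the sample output. */
    static const struct { const char *needle, *label; } groups[] = {
        { "Mozilla/1.1", "Mozilla 1.1" },
        { "Mozilla/2.",  "Mozilla 2.x" },
    };

    if (outsz == 0)
        return;
    out[0] = '\0';

    /* Rule 1: MSIE is checked first, because it reports itself as Mozilla. */
    if (strstr(line, "MSIE") != NULL) {
        snprintf(out, outsz, "%s", "Microsoft Internet Explorer");
        return;
    }
    /* Rule 2: fold beta versions into their main Netscape version. */
    for (size_t i = 0; i < sizeof groups / sizeof groups[0]; i++) {
        if (strstr(line, groups[i].needle) != NULL) {
            snprintf(out, outsz, "%s", groups[i].label);
            return;
        }
    }
    /* Rule 3: every NCSA Mosaic version counts under one label. */
    if (strstr(line, "NCSA") != NULL && strstr(line, "Mosaic") != NULL) {
        snprintf(out, outsz, "%s", "NCSA Mosaic");
        return;
    }
    /* Default: the text after the time section, up to the first slash. */
    const char *p = strchr(line, ']');
    if (p == NULL)
        return;
    p++;
    p += strspn(p, " \t");
    size_t len = strcspn(p, "/(\n");
    while (len > 0 && (p[len - 1] == ' ' || p[len - 1] == '\t'))
        len--;
    snprintf(out, outsz, "%.*s", (int)len, p);
}
```

With this table, made-up agents such as Mozilla/2.0b3 would be counted under Mozilla 2.x and Mozilla/1.1N under Mozilla 1.1, while an agent matching none of the rules would be labelled with whatever precedes its first slash.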

Mr C. Leach
Last modified: Mon Dec 9 16:22:32 GMT