Real-time analysis of the web server log with GOACCESS
GOACCESS is a small log analysis tool that you install on your Linux web server. It lets you monitor log data in real time through a report that is visible both in the SSH console and as an HTML file.
In a previous article I explained why it is important to analyze the web server log and how to analyze it with Excel. Today we see how to analyze it in real time with a small piece of server-side software called GOACCESS.
Index:
- Install GOACCESS
- Configure GOACCESS
- View the LOG in real time via SSH
- Save the log in HTML
- Update the report via crontab
- View the LOG in real time via HTML
- Extra commands
To install GOACCESS you need to have root access to the web server via SSH.
Install GOACCESS
I created this guide while installing GOACCESS on my server, which runs Ubuntu and uses NGINX as its web server. For installations in different environments, refer to the official website or to the repository on GitHub.
To install GOACCESS on Ubuntu, connect to your web server via SSH. The following instructions allow you to install the latest version of the software.
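As a rough sketch, the repository setup below follows the steps I recall from the official GOACCESS download page for Debian and Ubuntu. The key URL, keyring path and repository line are assumptions to check against that page before running anything. The steps are wrapped in a function (install_goaccess is just a name chosen here) so that nothing runs until you call it.

```shell
# Sketch of the official Debian/Ubuntu repository setup.
# Assumption: URLs and paths are as published on the GoAccess download
# page; verify them there first. Nothing runs until the function is called.
install_goaccess() {
  # Store the repository signing key in its own keyring.
  wget -O - https://deb.goaccess.io/gnugpg.key \
    | gpg --dearmor \
    | sudo tee /usr/share/keyrings/goaccess.gpg >/dev/null
  # Register the repository for this Ubuntu release and architecture.
  echo "deb [signed-by=/usr/share/keyrings/goaccess.gpg arch=$(dpkg --print-architecture)] https://deb.goaccess.io/ $(lsb_release -cs) main" \
    | sudo tee /etc/apt/sources.list.d/goaccess.list
  # Refresh the package index and install.
  sudo apt-get update
  sudo apt-get install -y goaccess
}

# Usage: install_goaccess && goaccess --version
```

If you do not need the very latest release, the version packaged by Ubuntu itself (sudo apt-get install goaccess) is usually enough to follow this guide.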
Now that you have GOACCESS installed, you need to configure it so that it can interpret your log format. The Apache format can differ slightly from the Nginx one, which in turn differs from the logs of other servers.
Configure GOACCESS
In particular, you need to set time-format, date-format and log-format. These settings tell GOACCESS how to read each log line and extract the time, the date and the other logged information.
The GOACCESS configuration file is usually found at /etc/goaccess.conf. Depending on the type of installation, you may also find it in /usr/etc/ or /usr/local/etc/.
To edit the file, open it from the SSH console with a text editor such as nano.
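Since the location changes between installations, a small helper can look for the file first. This is only a sketch: find_goaccess_conf is a hypothetical name, and it checks just the three directories mentioned above unless you pass others.

```shell
# Print the first goaccess.conf found in the given directories
# (defaults to the three locations mentioned in this guide).
find_goaccess_conf() {
  if [ "$#" -eq 0 ]; then
    set -- /etc /usr/etc /usr/local/etc
  fi
  for dir in "$@"; do
    if [ -f "$dir/goaccess.conf" ]; then
      printf '%s\n' "$dir/goaccess.conf"
      return 0
    fi
  done
  # Nothing found: signal failure to the caller.
  return 1
}

# Usage: open whichever file was found in nano (any editor works).
# sudo nano "$(find_goaccess_conf)"
```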
Depending on your web server, you have to set the three parameters seen above. For an Nginx web server, add the three lines or, if they are already present, uncomment them by removing the leading “#”. For an Apache web server, use the corresponding Apache configuration.
Attention: uncomment only one log-format line, the one that matches your installation. Do not enable both!
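As an illustration, these are the three settings for the standard "combined" log format, which is what both Nginx and Apache write by default. This is an assumption about your server: if you defined a custom log format, the log-format string must be adapted. The sketch writes the lines to a scratch file (the path is hypothetical) so you can review them before copying them into goaccess.conf.

```shell
# Scratch file for review; the location is only an example.
CONF_SNIPPET="${TMPDIR:-/tmp}/goaccess-combined.conf"

# The quoted 'EOF' keeps the shell from touching the % and " characters.
cat > "$CONF_SNIPPET" <<'EOF'
time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h %^[%d:%t %^] "%r" %s %b "%R" "%u"
EOF

# Usage: try the settings without editing the main config file.
# goaccess -f /var/log/nginx/access.log -p "$CONF_SNIPPET"
```

The -p option points GOACCESS at an alternative configuration file, which is a convenient way to test a format before making it permanent.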
View the LOG in real time via SSH
You are now ready to launch GOACCESS and view the data in real time. GOACCESS was born as a tool to be used via SSH, so let’s see how to view the log from the console. First of all you have to locate your logs; in my environment they are under /var/log/nginx/.
Now that we know where the log is, just run GOACCESS from the command line:
- The -f parameter is used to indicate the path of the log file
- The -a parameter allows you to view the user agents
- The -m parameter enables the mouse in the dashboard
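Putting the three parameters together, a minimal sketch of the launch command looks like this. The log path is the one from my environment, and open_dashboard is just a hypothetical wrapper name so that the command can take a different log as an argument.

```shell
# Log path from the author's environment; replace it with your own.
LOG_FILE=/var/log/nginx/access.log

# Open the terminal dashboard: -f log file, -a user agents, -m mouse.
open_dashboard() {
  goaccess -f "${1:-$LOG_FILE}" -a -m
}

# Usage: open_dashboard                       (current log)
#        open_dashboard /path/to/another.log  (any other log file)
```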
Replace the log name with your own! Also remember that the real-time log is the most recent (access.log) but nothing prevents you from opening a previously saved log (for example access.log.3).
The dashboard contains the critical information extracted from the web server log:
- Total Requests
- Unique Visitors
- Unique Files
- Referrers
- Valid Requests
- Processed Time
- Static Files
- Log Size
- Failed Requests
- IP Hits
- Unique 404
- Bandwidth
- Log File Location
- Unique visitors per day – Including spiders
- Requested Files (URL)
- Static Requests
- Not Found (URLs)
- Visitors Hostname and IPs
- Operating Systems
- Browsers
- Time Distribution
- Referring Sites
- Geo Location
- HTTP Status Codes
The data is updated in real time; run a crawl with Screaming Frog to verify it. You can use quick commands to navigate the dashboard data:
- F1 / h: help
- F5: window refresh
- q: to exit GOACCESS or to close the active module
- o / ENTER: expand the highlighted module
- 0-9 and Shift + 0: set the active module
- j: Scroll down within the active module
- k: Scroll up within the active module
- c: change the color scheme
- ^f: Scroll forward within the active module
- ^b: Scroll backward within the active module
- TAB: switch from one module to another (next)
- SHIFT + TAB: switch between modules (back)
- s: sort options for the active module
- /: search the modules via Regex
- n: searches for the position of the next occurrence
- g: to move to the first element of the page
- G: to move to the last element of the page
Save the log in HTML
In case you don’t like using the dashboard via SSH you can always export the report in HTML.
You must specify a path for the HTML file that has the right permissions and is reachable from outside the server, not only from localhost. In my case the string is the following:
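A minimal sketch of the export, assuming the log path used earlier and a hypothetical output path inside the directory published by the web server (adjust both to your setup). The function name write_html_report is also just an example.

```shell
# Write a static HTML report. Both default paths are examples:
# the output must sit in a directory that the web server publishes.
write_html_report() {
  goaccess -f "${1:-/var/log/nginx/access.log}" -a \
    -o "${2:-/var/www/html/report.html}"
}

# Usage: write_html_report
#        write_html_report /var/log/nginx/access.log /var/www/html/stats/report.html
```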
Here you can see a demo of the GOACCESS HTML report.
To refresh the report you need to run the command again or create a cronjob.
Update the report via crontab
From the SSH console type
crontab -e
select the editor you prefer (I advise nano or vim) and add a line following this logic:
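As an example of such a line, the sketch below builds an hourly entry and prints it so it can be pasted into the editor. The log and report paths are hypothetical, and cron may need the full path to the goaccess binary if it is not in cron's PATH.

```shell
# Hypothetical crontab entry: at minute 0 of every hour, rebuild the report.
# Fields: minute, hour, day of month, month, day of week, command.
CRON_LINE='0 * * * * goaccess -f /var/log/nginx/access.log -a -o /var/www/html/report.html'

# Print the line so it can be pasted into the editor opened by "crontab -e".
printf '%s\n' "$CRON_LINE"
```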
This cronjob updates the report every hour. Save the link to the report.html so you can view it updated whenever you want, without having to log into the server via SSH.
View the LOG in real time via HTML
GOACCESS also allows you to view a report in real time simply by typing this command:
- The -o parameter sends the output to a file (HTML, JSON or CSV).
- The --real-time-html parameter activates the real-time report.
- The --ws-url parameter is optional; it is used to view the report on a host other than the monitored server. The address of the host, without http, is passed as the value of the parameter.
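The three parameters above can be combined as in the following sketch. The function name and all paths and host names in the usage lines are hypothetical; the third argument is only needed when the report is viewed through a different host.

```shell
# Start the real-time HTML report.
#   $1 = log file, $2 = HTML output file, $3 = optional host for --ws-url
serve_realtime_report() {
  log=$1
  out=$2
  ws_host=${3:-}
  if [ -n "$ws_host" ]; then
    goaccess -f "$log" -a -o "$out" --real-time-html --ws-url="$ws_host"
  else
    goaccess -f "$log" -a -o "$out" --real-time-html
  fi
}

# Usage:
#   serve_realtime_report /var/log/nginx/access.log /var/www/html/report.html
#   serve_realtime_report /var/log/nginx/access.log /var/www/html/report.html stats.example.com
```

In this mode GOACCESS keeps running and pushes updates to the open page, so leave the command active for as long as you want the report to stay live.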
Extra commands
Below I have collected some console commands that are useful for launching filtered analysis of the web server log. Remember to change the path to your log file.
GOACCESS also allows you to export the report in JSON and in CSV.
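A minimal sketch of both exports: the same -o parameter used for HTML is enough, because GOACCESS picks the output format from the file extension. The function name, the default log path and the report file names are assumptions.

```shell
# Export the same analysis as JSON and as CSV.
#   $1 = log file (example default), $2 = output directory (default: current)
export_data() {
  log=${1:-/var/log/nginx/access.log}
  out_dir=${2:-.}
  goaccess -f "$log" -a -o "$out_dir/report.json"
  goaccess -f "$log" -a -o "$out_dir/report.csv"
}

# Usage: export_data /var/log/nginx/access.log /tmp
```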
With grep it is possible to filter the log, for example to keep only the hits of Googlebot:
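As a sketch, the filter pipes the matching lines into GOACCESS, which reads from standard input when it is given "-" instead of a file. The function name and default log path are assumptions, and matching on the user agent string does not prove that a hit really comes from Google.

```shell
# Keep only the lines whose user agent mentions Googlebot
# (case-insensitive) and analyze them.
googlebot_report() {
  grep -i 'googlebot' "${1:-/var/log/nginx/access.log}" | goaccess -a -
}

# Usage: googlebot_report /var/log/nginx/access.log
```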
Other common cases:
- process only the last log
- process all historical logs
- process today’s data only
- process Googlebot only
- process AMP pages only
- process just one page
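The cases listed above can be sketched as small shell helpers. All function names are hypothetical, the log path is the one from my environment, and the patterns in the usage lines (the /amp URL fragment, the page name) are assumptions about how your site is structured.

```shell
LOG=/var/log/nginx/access.log   # example path, replace with your own

# Only the most recent log.
report_latest() { goaccess -f "${1:-$LOG}" -a; }

# All historical logs: zcat -f reads plain and gzip-rotated files alike.
report_all() { zcat -f "${1:-$LOG}"* | goaccess -a -; }

# Today's data only: keep the lines stamped with today's date,
# written the way the combined format does (for example 05/Mar/2024).
report_today() {
  grep "$(LC_ALL=C date '+%d/%b/%Y')" "${1:-$LOG}" | goaccess -a -
}

# Lines matching a pattern: a user agent, a URL fragment or a single page.
report_matching() { grep -i -- "$1" "${2:-$LOG}" | goaccess -a -; }

# Usage:
#   report_matching 'googlebot'        # Googlebot only
#   report_matching '/amp'             # AMP pages, if their URLs contain /amp
#   report_matching ' /contact.html '  # one page; the path is hypothetical
```

Filtering before the analysis keeps GOACCESS itself unchanged: whatever grep, zcat or any other tool lets through is what ends up in the report.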
Here you will find more examples. What software do you use to monitor the web server log in real time?