At Enrise we love to monitor things. We would monitor our coffeemaker if it had an interface to do so. Extensive monitoring does have its downsides, though: every check has to be configured at some point, be it via configuration management or by hand. Sometimes you’ll have to make things a little easier on yourself and automate even more.
This blog post is about how we took care of automatic SSL monitoring.
For our monitoring we make use of the open source monitoring solution Zabbix.
Background info on Zabbix
Our monitoring server takes care of executing and coordinating the checks, presenting the results and generating alerts whenever an issue is detected. Configuring Zabbix can be cumbersome since there are so many things you can do with it.
Let’s take a look at the inner workings of Zabbix first. There are many ways of monitoring, but the most common is to run the Zabbix Agent on the machine to be monitored, generally in either passive or active mode. In passive mode, the server asks the agent at configured intervals to provide the value for a key. In active mode, the agent collects the values itself and sends them to the server.
Each host has one or more templates assigned, which contain items and triggers. Items are the things to check (e.g. disk space usage, system load) and triggers are the thresholds and actions. For instance, a trigger that monitors disk space fires when only 20% is left.
Zabbix has many checks built in, but in order to integrate custom checks, for instance for your own custom software, you’ll need to configure “UserParameters”. This makes it very flexible: there is virtually no limit to what you can monitor, as long as the output generated by the scripts/commands configured in the UserParameters matches the item and trigger configuration.
Now that we’re up to speed on the basics of Zabbix, let’s get to the cool stuff, shall we?
At Enrise we manage a lot of webservers, both for ourselves and for the web applications we have developed for our customers. Since security is important, these web applications are secured via SSL. An expired certificate raises security warnings in visitors’ browsers, so it is important that certificates are monitored closely and renewed when they are about to expire.
We started automating these checks by using Zabbix’s macro support, which only required us to configure the hostname to check instead of creating new items and triggers for each host.
Using macros made the setup easier to maintain, but it only scales up to a point. It becomes hard to handle once the number of hostnames to monitor exceeds the number of configured items, because new items and triggers then have to be added by hand.
For one of our customers we reached this point: their environments kept growing steadily, which made us look for alternative means of monitoring the certificates.
Luckily, Zabbix provides “Low Level Discovery” functionality to make our lives easier.
Low Level Discovery with Zabbix
Low Level Discovery is quite an interesting concept. LLD items provide JSON-formatted output to the Zabbix server, which automatically creates items and triggers from prototypes using the received content. The JSON could look like this:
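As a minimal sketch of that format, the Python snippet below builds a discovery payload for two filesystems. The `{#FSNAME}` and `{#FSTYPE}` macros are the ones used by Zabbix's built-in filesystem discovery; the filesystem names, the helper name `build_lld_payload` and the `data` wrapper (expected by older Zabbix versions) are assumptions made for illustration.

```python
import json


def build_lld_payload(filesystems):
    """Return LLD-style JSON: one object per discovered entity, keyed by LLD macros."""
    rows = [{"{#FSNAME}": name, "{#FSTYPE}": fstype} for name, fstype in filesystems]
    return json.dumps({"data": rows}, indent=2)


# Two hypothetical filesystems, i.e. two discovered entities.
payload = build_lld_payload([("/", "ext4"), ("/var", "ext4")])
print(payload)
```

Each object in the list is one discovered entity; the macro values are substituted into the item and trigger prototypes.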
With this information Zabbix automatically creates two items and triggers (e.g. inform when there is less than 25% free space on any of the given filesystems) based on the given prototypes.
Custom UserParameters can do the same, which is what we applied to our SSL monitoring.
First, we wrote a script that retrieves all vhosts from the server. Since we have mixed environments, we had to deal with that first. The script checks which webserver is available, executes what is necessary to retrieve the vhosts and outputs them in a format Zabbix can understand.
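To give an idea of what such a discovery script could do, here is a rough Python sketch, not our actual script. It only covers Apache, assumes the vhost list comes from `apachectl -S` style output, and uses the made-up macro names `{#VHOST}` and `{#PORT}`; the sample hostnames and paths are invented.

```python
import json
import re

# Matches lines such as: "port 443 namevhost shop.example.com (/etc/apache2/...:2)"
VHOST_LINE = re.compile(r"port\s+(\d+)\s+namevhost\s+(\S+)")


def parse_apache_vhosts(dump):
    """Extract unique (hostname, port) pairs from `apachectl -S` style output."""
    seen = []
    for port, name in VHOST_LINE.findall(dump):
        if (name, port) not in seen:
            seen.append((name, port))
    return seen


def to_lld_json(vhosts):
    """Render the vhosts as an LLD payload; the macro names are hypothetical."""
    rows = [{"{#VHOST}": name, "{#PORT}": port} for name, port in vhosts]
    return json.dumps({"data": rows})


# Invented sample input standing in for the real command output.
SAMPLE = """\
*:80                   is a NameVirtualHost
         port 80 namevhost www.example.com (/etc/apache2/sites-enabled/www.conf:1)
*:443                  is a NameVirtualHost
         port 443 namevhost www.example.com (/etc/apache2/sites-enabled/www-ssl.conf:2)
         port 443 namevhost shop.example.com (/etc/apache2/sites-enabled/shop.conf:2)
"""

vhosts = parse_apache_vhosts(SAMPLE)
print(to_lld_json(vhosts))
```

A real script would obtain the dump from the webserver itself and would need a second code path for other webservers such as nginx.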
In order to make this available to Zabbix it has to be added to the Agent configuration:
After a restart this item returns a list of all vhosts that are available:
In Zabbix we need to configure the Discovery Rules.
First, an item to retrieve the vhosts:
Every hour this retrieves “discover.vhosts” from the agent. In the Filters tab we apply a filter so that only vhosts on port 443 are processed.
No need for fancy regexps here since the data is already clean.
If you want to leave out specific domains (e.g. localhost) you can add more filters to filter these out as well.
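The effect of such filters can be modelled in a few lines of Python. This only illustrates the idea of matching macro values against regular expressions; it is not how Zabbix implements it, and the macro names `{#VHOST}` and `{#PORT}` as well as the sample data are assumptions.

```python
import re


def apply_filters(rows, includes=(), excludes=()):
    """Keep rows whose macro values match every include and no exclude pattern."""
    kept = []
    for row in rows:
        ok = all(re.search(rx, row.get(macro, "")) for macro, rx in includes)
        bad = any(re.search(rx, row.get(macro, "")) for macro, rx in excludes)
        if ok and not bad:
            kept.append(row)
    return kept


# Invented discovery result with hypothetical macro names.
discovered = [
    {"{#VHOST}": "www.example.com", "{#PORT}": "80"},
    {"{#VHOST}": "www.example.com", "{#PORT}": "443"},
    {"{#VHOST}": "localhost", "{#PORT}": "443"},
]

ssl_vhosts = apply_filters(
    discovered,
    includes=[("{#PORT}", r"^443$")],
    excludes=[("{#VHOST}", r"^localhost$")],
)
print(ssl_vhosts)
```

Only the rows that survive the filters are turned into items and triggers.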
In order to act upon the received information, prototypes have to be created.
The Item Prototype is a “template” that is applied to each discovered entry.
An External Check is a check that is executed on the Zabbix server itself. The zext_ssl_cert.sh script does the heavy lifting. In the prototype we have to pass the script the right information.
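What the heavy lifting boils down to is connecting to the vhost, reading the certificate and reporting how long it remains valid. The Python sketch below illustrates that idea using only the standard library; it is not the zext_ssl_cert.sh script, and the function names are hypothetical.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=443, timeout=10.0):
    """Return the certificate's notAfter string, e.g. 'Jan  5 09:34:43 2018 GMT'.

    The default context verifies the certificate, so an already expired or
    untrusted certificate raises ssl.SSLCertVerificationError here; a real
    check would have to report that case separately.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname enables SNI, so the certificate of the right vhost is served.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Whole days between `now` (epoch seconds, default: current time) and expiry."""
    expires = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return int((expires - current) // 86400)
```

Calling `days_until_expiry(fetch_not_after("www.example.com"))` would then yield the number an item could store, with the hostname filled in from the discovered vhost.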
An item by itself doesn’t do much, so we’ll also need triggers. This is where there is a little snag in the plan. Normally with Zabbix it’s possible to create dependent triggers, which reduces the number of triggers active at the same time.
For our SSL checking we’d like to be informed at different priorities and intervals. For trigger prototypes, dependencies are not possible, so we’ll have to adjust the triggers instead.
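One way to get the same effect without dependencies is to give every trigger its own non-overlapping range of remaining days, so that at most one of them is active at any time. The sketch below shows that logic in Python; the thresholds and severity names are invented for illustration and are not taken from our configuration.

```python
# Hypothetical bands: (severity, lower bound exclusive, upper bound inclusive), in days.
BANDS = [
    ("disaster", None, 3),
    ("high", 3, 14),
    ("warning", 14, 30),
]


def active_severity(days_left):
    """Return the single band that matches, or None when the certificate is fine."""
    for severity, low, high in BANDS:
        if (low is None or days_left > low) and days_left <= high:
            return severity
    return None
```

With these made-up bands, a certificate with 20 days left would only raise the warning, and one with 2 days left only the most severe alert.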
Once the triggers have been created it’ll look like this:
And that’s it! We now have a self-configuring setup for our SSL certificates, with zero human intervention needed to add new vhosts or remove old ones. Every hour it retrieves the current vhosts, and every 6 hours (or whatever interval you’ve configured) their certificates are checked for their expiration date.
A next step would be automating web checks, but unfortunately these are not possible (yet) using Low Level Discovery.
We do, however, have a few ideas up our sleeves to work around this limitation. We’ll share them on our techblog once we have come up with something.