Gauging your federation's performance
It is important to monitor your infrastructure continuously on all levels, so that you can react to system failures and anticipate upcoming problems. There is a multitude of monitoring solutions on the market, and it is not possible to describe how to monitor eduroam infrastructure with all of them; we provide a selection below.
First, for Europe, some parts of monitoring are done by the eduroam Operational Team (OT), as described in the following section; if you are operating outside Europe, please contact your own regional operator for the corresponding monitoring solution in your area.
In the sections after that, we provide general tips for infrastructure monitoring.
Federation monitoring in Europe: the eduroam Operational Team
When you set up a federation-level RADIUS server, the OT will start monitoring your server availability and will send out email alerts in case of failure. This is done by the OT sending authentication requests for the special realm @eduroam.<TLD> from their monitoring server to your server, and your server is expected to mirror these back to the OT monitoring infrastructure. The technical set-up of this is described in the corresponding HOWTOs for federation-level RADIUS servers.
Server availability is tested every hour and the results are summarised on the following web page: http://monitor.eduroam.org/
Note that you can also get more detailed info, including a history, by navigating on the left-hand pane on that website.
There is also a more detailed diagnostic test, where a federation operator can request that a specific path (i.e. from federation A via the European root to federation B) is tested in real time, on demand. The web interface for this testing facility is online at: http://monitor.eduroam.org/inter/test_otm.php (access is restricted to eduroam federation operators only).
Monitoring inside the federation
There are several dimensions to infrastructure monitoring, most of which are unrelated to eduroam: system utilisation, hardware health, network reachability, and so on. There are many solutions on the market to monitor these aspects. It is beneficial to use a monitoring solution which can use plugins to execute some more eduroam-specific monitoring. Nagios and its fork Icinga have proven to be valuable to many eduroam participants, and the following plugins are considered useful.
Nagios/Icinga: EAP Login checks
The tool "rad_eap_test", which is a frontend to wpa_supplicant's "eapol_test", can be used for scripted authentication checks in Nagios. The added value over eapol_test is that eapol_test requires a configuration file on disk at the time of execution, whereas rad_eap_test is completely command-line driven: it generates a temporary configuration file and deletes it again after eapol_test execution.
You can download rad_eap_test from here: http://www.eduroam.cz/rad_eap_test/
It requires eapol_test, part of wpa_supplicant from here: http://hostap.epitest.fi/
To compile eapol_test, unpack the wpa_supplicant distribution, change into the wpa_supplicant/ subdirectory and create the default config file by copying defconfig to .config.
Then, enable compilation of eapol_test by editing the .config file and setting (i.e. uncommenting) CONFIG_EAPOL_TEST=y.
You can then compile eapol_test with "make eapol_test".
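The three build steps above could be scripted roughly as follows. This is a minimal sketch, not part of the original instructions: the function names and the source directory argument are hypothetical, and the script appends the CONFIG_EAPOL_TEST=y setting instead of uncommenting it in place, which has the same effect because .config is read as a list of make variable assignments.

```shell
#!/bin/sh
# Sketch: build eapol_test from an unpacked wpa_supplicant source tree.
# Function names and the directory argument are illustrative assumptions.

# Make sure CONFIG_EAPOL_TEST=y is active in the given .config file.
enable_eapol_test() {
    config_file=$1
    # Append the setting unless an uncommented copy is already present.
    grep -q '^CONFIG_EAPOL_TEST=y' "$config_file" ||
        echo 'CONFIG_EAPOL_TEST=y' >> "$config_file"
}

# $1 = top-level directory of the unpacked wpa_supplicant distribution.
# Runs in a subshell so the caller's working directory is left unchanged.
build_eapol_test() (
    cd "$1/wpa_supplicant" &&
    cp defconfig .config &&
    enable_eapol_test .config &&
    make eapol_test
)
```

Calling build_eapol_test with the path of the unpacked distribution should leave the eapol_test binary in its wpa_supplicant/ subdirectory, provided the usual build dependencies are installed.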
Now, you need to tell the shell script rad_eap_test where to find the eapol_test executable, and tell the eduroam F-Ticks system that these are monitoring-only requests by setting a corresponding MAC address. Edit the rad_eap_test file and adjust the lines that set the eapol_test path and the MAC address accordingly.
That's it for the prerequisites - we can now start defining Nagios/Icinga checks.
Implementing the checks
You would typically execute the Nagios checks by defining your Nagios server as a client to your federation-level RADIUS (FLR) server, and sending requests for known test accounts of your realms to that server.
You can define a check command that calls rad_eap_test with four arguments,
and later use the arguments as follows in your individual checks:
- ARG1 = anonymous outer identity
- ARG2 = inner username
- ARG3 = password
- ARG4 = EAP type (TTLS/PEAP)
You can also define similar checks for other EAP types; simply execute rad_eap_test without arguments to see which parameters it supports.
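A check command of this kind might look like the output of the sketch below, which prints a Nagios/Icinga command object wrapping rad_eap_test. Everything specific here is an assumption: the command name check_eduroam_login, the function name, the install path and shared secret passed in, and the rad_eap_test options used (-H, -P, -S, -A, -u, -p, -m, -e), which you should confirm against the usage text of your rad_eap_test version.

```shell
#!/bin/sh
# Print a Nagios/Icinga command object that wraps rad_eap_test.
# $1 = path to rad_eap_test, $2 = RADIUS shared secret (both site-specific).
# The command name and the option letters are assumptions; verify them by
# running rad_eap_test without arguments.
print_login_check_command() {
    rad_eap_test_path=$1
    shared_secret=$2
    # Single quotes keep the Nagios macros ($HOSTADDRESS$, $ARG1$, ...) literal.
    printf 'define command {\n'
    printf '    command_name  check_eduroam_login\n'
    printf '    command_line  %s -H $HOSTADDRESS$ -P 1812 -S %s' \
        "$rad_eap_test_path" "$shared_secret"
    printf ' -A $ARG1$ -u $ARG2$ -p $ARG3$ -m WPA-EAP -e $ARG4$\n'
    printf '}\n'
}
```

Redirecting the output of print_login_check_command into a file in your Nagios object configuration directory would register the command; the argument order matches the ARG1 to ARG4 list above.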
Example: You want to test a participating realm foobar.aq which uses PEAP, and for which you have the test credentials "testuser" and "testpass", and you want to test whether anonymous outer identities work properly. The corresponding service check passes an anonymous outer identity in that realm, "testuser", "testpass" and PEAP as ARG1 to ARG4 of the check command.
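For illustration, a service object for this example could be generated as in the sketch below. The command name check_eduroam_login, the host name flr1.example.org, the "generic-service" template and the outer identity anonymous@foobar.aq are assumptions chosen for the example; if the identity provider expects a realm-qualified inner identity, pass testuser@foobar.aq instead of testuser.

```shell
#!/bin/sh
# Print a Nagios/Icinga service object for one realm's EAP login check.
# $1 = Nagios host name of the FLR server (hypothetical in the call below)
# $2 = anonymous outer identity, $3 = inner username, $4 = password, $5 = EAP type
# Note: Nagios splits check_command arguments on "!", so the password must not contain one.
print_realm_service() {
    printf 'define service {\n'
    printf '    use                  generic-service\n'
    printf '    host_name            %s\n' "$1"
    printf '    service_description  eduroam login %s (%s)\n' "$2" "$5"
    printf '    check_command        check_eduroam_login!%s!%s!%s!%s\n' \
        "$2" "$3" "$4" "$5"
    printf '}\n'
}

# Example from the text: realm foobar.aq, PEAP, test credentials testuser/testpass.
print_realm_service flr1.example.org anonymous@foobar.aq testuser testpass PEAP
```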
Nagios/Icinga: RADIUS/TLS certificate validity checks
You can use the commodity Nagios plugin "check_ssl_cert" from: https://trac.id.ethz.ch/projects/nagios_plugins/wiki/check_ssl_cert for this purpose. Added to the host as a service check with a warning threshold of 14 days, it will warn you two weeks in advance that your certificate is about to expire.
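To show the idea behind such an expiry check, here is a minimal sketch that uses the openssl command-line tool directly instead of check_ssl_cert. It is a swapped-in technique, not the plugin's implementation: it inspects a local PEM certificate file rather than connecting to the RADIUS/TLS port, and the function name and the 14-day default are illustrative assumptions. The return codes follow the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Nagios-style expiry check for a local PEM certificate file.
# $1 = certificate file, $2 = warning threshold in days (default 14)
check_cert_expiry() {
    cert_file=$1
    warn_days=${2:-14}
    # Bail out if the file is missing or is not a parseable certificate.
    if ! openssl x509 -noout -in "$cert_file" >/dev/null 2>&1; then
        echo "UNKNOWN: cannot read certificate $cert_file"
        return 3
    fi
    # -checkend N exits 0 if the certificate is still valid N seconds from now.
    if ! openssl x509 -checkend 0 -noout -in "$cert_file" >/dev/null 2>&1; then
        echo "CRITICAL: certificate has expired"
        return 2
    fi
    if ! openssl x509 -checkend $((warn_days * 86400)) -noout \
            -in "$cert_file" >/dev/null 2>&1; then
        echo "WARNING: certificate expires within $warn_days days"
        return 1
    fi
    echo "OK: certificate valid for more than $warn_days days"
    return 0
}
```

A call such as check_cert_expiry with the path of your server certificate can be run from cron or wrapped as a local Nagios check on the RADIUS/TLS host itself.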
It is also important to measure how successful the service is in your area of responsibility. eduroam Operations has set up a statistics system called F-Ticks, which is able to capture all roaming events both on a national as well as an international level. It does not cover local campus usage though.
If your FLR server is configured to support F-Ticks (it is, if configured according to this cookbook), statistics will be generated automatically for that federation. They are accessible at the following website: http://monitor.eduroam.org/f-ticks/
On that web page, you can find the historical evolution of roaming service usage in federations, as well as an overview of which realms were most active and which countries visitors come from. In the future, detailed views per SP and per IdP can be made available if your federation opts to send the data in the extended detail level. Please contact your federation operator to find out which level of statistics your federation provides.
If you have configured your federation