If only half of the graphs are working, the cause is probably one of the following:

  • An instance has lost SNMP connectivity (devices appear to be down).
  • An instance is unable to rsync updated RRD files to its partner instance.

Both depend on the Puppet modules working correctly.

If you have SSH access to the Cacti server prod-cacti01-fra-de.geant.net, you can check for the second cause by confirming that plenty of RRD files have been updated in the last few minutes:

ssh -l <username> prod-cacti01-fra-de.geant.net
ls -lart /opt/cacti/rra | less

The listing is sorted oldest first, so you will see many obsolete files for Terminated services that have not been updated recently, but a good number of recently updated files should appear at the end of the list.
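
As an alternative to scanning the listing by eye, the same check can be sketched as a count of recently modified files. This is a minimal sketch, not part of the documented procedure: the function name count_recent_rrds, the .rrd file extension, and the 10-minute default window are assumptions, and -mmin is supported by GNU and BSD find but is not part of POSIX.

```shell
# Hypothetical helper: count RRD files modified within the last N minutes.
#   $1 = directory to inspect (defaults to the path used above)
#   $2 = window in minutes (assumed default: 10)
count_recent_rrds() {
    dir="${1:-/opt/cacti/rra}"
    minutes="${2:-10}"
    # -mmin -N matches files modified less than N minutes ago
    find "$dir" -type f -name '*.rrd' -mmin "-$minutes" | wc -l
}
```

Running count_recent_rrds with no arguments on the Cacti server prints the number of files in /opt/cacti/rra updated in the last 10 minutes. A count at or near zero would point to the rsync problem; what counts as a healthy number depends on how many active services are being polled.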
