On the incident page I have explained each period of Cacti downtime.  Here I would like to explain our view of where Cacti stands and how to go forward with it.

Cacti is a very old and outdated application.  There is very little usable documentation, and the code itself is poorly documented, with almost no comments.  The Geant implementation, which includes many changes from the original code, makes it almost impossible to ascertain its relation to any sort of packaging or version control.  For instance, there is code running from an old developer's home directory.  Many of the programs and scripts don't use the system-wide configuration files, so it is difficult to ascertain which component uses which configuration.


Bringing this application into our standard environment would take a great deal of work.  I would not want to attempt it without a serious commitment to this software, which I understand is scheduled for retirement.


Most of the problems associated with Cacti were caused by the RRD file sync and by issues with Crowd, which it depends on for login.  We attempted to merge the databases from both servers into one and move it to a cluster that is more reliable and for which we at least have some DevOps procedures.  We can easily move it back to the local systems if there is concern.  But we can't keep bailing out this application and taking the blame when something goes wrong during a repair or improvement.  Perhaps the best thing is to leave it completely alone until it is retired.




