OpsDB runs on two servers, named xxxxx-opsdb01.geant.net and xxxxx-opsdb02.geant.net, where xxxxx = prod, uat, or test.


First Steps


If for any reason the system becomes unavailable, the first action is to make sure we have switched from the ‘Primary’ (01) instance of OpsDB to the ‘Secondary’ (02) instance. This allows general users to continue working on OpsDB whilst we investigate why it went down.


If both instances have become unavailable, contacting IT / SWD is of the utmost urgency, as further investigation, steps, and decisions will have to be taken across departments (i.e. IT / SWD / OC) on the best way to resolve the issues.
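
The decision described above can be sketched as a small shell helper. This is only an illustration: probe and decide_action are hypothetical names, and the sketch assumes each instance answers plain HTTP on its root URL, which may not match the real setup.

```shell
# probe prints "up" if the host answers an HTTP request within 10 seconds,
# otherwise "down". -f makes curl treat HTTP errors (4xx/5xx) as failures.
# Assumption: plain HTTP on the root URL; adjust the scheme and path as needed.
probe() {
    if curl -s -f -o /dev/null --max-time 10 "http://$1/"; then
        echo up
    else
        echo down
    fi
}

# decide_action maps the state of the primary ($1) and secondary ($2)
# instance to the first step described in this section.
decide_action() {
    case "$1/$2" in
        up/*)    echo "primary is up: no switch needed" ;;
        down/up) echo "primary is down: switch DNS to secondary" ;;
        *)       echo "both instances are down: escalate to IT / SWD" ;;
    esac
}

# Usage (needs network access to both servers):
#   decide_action "$(probe prod-opsdb01.geant.net)" "$(probe prod-opsdb02.geant.net)"
```

Keeping the decision separate from the probe means the logic can be checked without touching the live servers.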


Change the Domain Name System (DNS) entry for OpsDB (i.e. Move from one instance to another)


Currently the URL opsdb.dante.net resolves to the primary instance of our OpsDB server (which we call ‘Primary’, or ‘01’), prod-opsdb01.geant.net.

If for any reason the primary instance of OpsDB becomes unavailable, we need to change the opsdb.dante.net URL to resolve to the secondary instance of our OpsDB server (which we call ‘Secondary’, or ‘02’), prod-opsdb02.geant.net.

Software Development would not normally be involved in this. The exception is if we noticed that ‘01’ had gone down, in which case we would request the switch via DevOps.

The action taken by DevOps would be as follows:

‘Change the CNAME opsdb.dante.net in Infoblox, to point either to prod-opsdb01.geant.net or to prod-opsdb02.geant.net depending on which instance you wish to point to.’

Once this has been done the system should then be available to the users once again whilst more detailed investigation takes place into why the Primary instance has become unavailable.

Please do not forget to inform the users that OpsDB is back up once this has been done.
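
Before informing users, it may help to confirm where the CNAME now points. The sketch below is one possible check; classify_target is a hypothetical helper, and it assumes the dig utility is installed. Resolvers may keep serving the old record until its TTL expires, so a stale answer shortly after the change is not necessarily an error.

```shell
# classify_target maps a CNAME target to the instance names used in this page.
# It is a hypothetical helper, not an existing OpsDB or Infoblox tool.
classify_target() {
    # dig prints fully qualified names with a trailing dot, so strip it first.
    target="${1%.}"
    case "$target" in
        prod-opsdb01.geant.net) echo "primary" ;;
        prod-opsdb02.geant.net) echo "secondary" ;;
        *)                      echo "unknown: $target" ;;
    esac
}

# Usage (needs network access and the dig utility):
#   classify_target "$(dig +short opsdb.dante.net CNAME)"
```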


Further Investigation



The following points may help troubleshoot any issues that arise with this application.


Check Apache.

  • Has Apache failed? Is it running?

      Log into the appropriate VM

      As the root user, issue the following command at the command line:

              systemctl status httpd

              (or)

              service httpd status

              (If not the root user, prefix either command with sudo)

      You should see output something like this:

              [mark.golder@test-opsdb01 ~]$ sudo service httpd status

              httpd (pid  18768) is running...

             [mark.golder@test-opsdb01 ~]$

  • Start / Restart Apache

      If you need to Start / Restart the httpd (apache) server issue the following command at the command line:

             systemctl restart httpd

             (or)

             service httpd restart

             (If not root user, prefix both commands by sudo)

      This should start or restart the http server (apache) on the VM – please perform this on both VMs separately.
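
The check and restart steps above could be combined into one helper, sketched below. The function names are hypothetical, and the choice to restart only when the service is down is an assumption of this sketch rather than part of the procedure. It tries systemctl first and falls back to the service command, so it should work with whichever of the two the VM provides.

```shell
# is_running succeeds when the named service is up.
# Uses systemctl where it exists and falls back to the older service command.
is_running() {
    if command -v systemctl >/dev/null 2>&1; then
        systemctl is-active --quiet "$1"
    else
        service "$1" status >/dev/null 2>&1
    fi
}

# restart_service restarts the named service with whichever tool is present.
restart_service() {
    if command -v systemctl >/dev/null 2>&1; then
        sudo systemctl restart "$1"
    else
        sudo service "$1" restart
    fi
}

# restart_if_down leaves a healthy service alone and restarts a stopped one.
restart_if_down() {
    if is_running "$1"; then
        echo "$1: running"
    else
        echo "$1: not running, restarting"
        restart_service "$1"
    fi
}

# Usage, on each VM in turn:
#   restart_if_down httpd
```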


Check MySQL.

  • Is the MySQL instance running?

      Log into the appropriate VM

      As the root user, issue the following command at the command line:

            systemctl status mysqld

            (or)

            service mysqld status

            (If not the root user, prefix either command with sudo)

  • Start / Restart MySQL

      If you need to Start / Restart MySQL issue the following command at the command line:

           systemctl restart mysqld

           (or)

           service mysqld restart

           (If not root user, prefix both commands by sudo)

      This should start or restart MySQL on the VM – please perform this on both VMs separately.
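
A running mysqld process does not always mean the server is accepting connections, so a connection-level check can be a useful follow-up. The sketch below uses mysqladmin ping, which exits with status 0 when the server is reachable (even if the login itself is refused). The function names are hypothetical, and it assumes mysqladmin is installed on the VM.

```shell
# mysql_alive succeeds when the local MySQL server answers a ping.
mysql_alive() {
    mysqladmin ping >/dev/null 2>&1
}

# report_mysql prints a one-line summary and returns non-zero when
# the server does not answer.
report_mysql() {
    if mysql_alive; then
        echo "MySQL: answering"
    else
        echo "MySQL: not answering"
        return 1
    fi
}

# Usage, on each VM in turn:
#   report_mysql
```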


Recovery of MySQL Data

       Currently, MySQL data backups are stored in the /opt/backups/mysql folder on each VM.

       Each day the daily DB dump, from each server, is also copied to an appropriate place on the Data Warehouse machine.

       To restore any of these instances of data, locate the appropriate DB dump and go through the MySQL restore procedure (documented elsewhere in the MySQL documentation).
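
As a rough illustration only, and not a replacement for the documented restore procedure, locating the newest dump and loading it might look like the sketch below. The helper names, the .sql / .sql.gz file extensions, and the database name are all assumptions; the mysql client is assumed to pick up credentials from its usual configuration.

```shell
# latest_dump prints the most recently modified dump file in the given folder.
# Assumption: dumps are plain .sql or gzip-compressed .sql.gz files.
latest_dump() {
    newest=""
    for f in "$1"/*.sql "$1"/*.sql.gz; do
        [ -e "$f" ] || continue    # skip glob patterns that matched nothing
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest="$f"
        fi
    done
    if [ -n "$newest" ]; then
        echo "$newest"
    else
        return 1
    fi
}

# restore_dump loads a dump file ($1) into the named database ($2).
restore_dump() {
    case "$1" in
        *.gz) gunzip -c "$1" | mysql "$2" ;;
        *)    mysql "$2" < "$1" ;;
    esac
}

# Usage (folder and database name are placeholders):
#   dump="$(latest_dump /path/to/mysql/backups)" && restore_dump "$dump" some_database
```

A restore overwrites existing data, so check which dump was selected before running the second step.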


Security Updates with underlying software and operating systems

        OpsDB is, in terms of software, an ‘old lady’ now, awaiting retirement.

        It is currently written using PHP 5.3.3, HTML, and JavaScript, and runs in a Linux system environment (CentOS).

        CentOS: CentOS 6 receives updates until November 30, 2020.

        PHP 5.3.3 is no longer officially supported upstream, but it still receives fixes through CentOS back-porting of PHP security releases. Its end of life is therefore the same as that of the CentOS 6 system.

        HTML / JavaScript are currently supported and have no planned end-of-support dates. In fact, older versions are more widely supported than the latest ones.


Check disk usage

        Is the VM disk full?

        Is the allocated OpsDB disk space full?

        This should already be monitored and reported on if it is becoming full, so this scenario should never occur.
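
If monitoring is in doubt, disk usage can be checked by hand. The sketch below wraps df in two hypothetical helpers; the 90% limit in the usage line is an arbitrary example, and the path to check on the OpsDB VMs is an assumption to be replaced with the real mount point.

```shell
# disk_use_pct prints the use% (as a bare integer) of the filesystem
# holding the given path. df -P keeps each filesystem on a single line.
disk_use_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_disk prints a warning and returns non-zero when usage of the
# path ($1) is at or above the limit in percent ($2).
check_disk() {
    pct="$(disk_use_pct "$1")"
    if [ "$pct" -ge "$2" ]; then
        echo "WARNING: $1 is ${pct}% full (limit $2%)"
        return 1
    fi
    echo "OK: $1 is ${pct}% full"
}

# Usage:
#   check_disk / 90
```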


