Comment: Expanded Goals

...

  • Primary and Backup dashboard: prod-newdboard01 and prod-newdboard02. A CNAME - dashboard-primary - points at either the 01 or 02 instance. 
  • Crowd authentication: prod-crowd and uat-crowd contain identical information, but systems (such as Jira) are configured to use one or the other.

Generally, all service deployment can be done in the context of redundant services providing high availability. The impact on SWD is very low. Services should be deployed such that each is essentially the primary in every case.

Goals

Automate service failover.

Create a scalable infrastructure: deploy services independent of location. Services should auto-register, be discoverable, and auto-deregister.

Provide an infrastructure for service discoverability, where services:

  • auto-register when available
  • auto-deregister when no longer available
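As a rough illustration of auto-register and auto-deregister, the sketch below builds the two requests a service could send to its local Consul agent over the agent HTTP API. The agent address, the service name "dashboard", the port, and the health URL are all assumptions made for the example and do not come from this page; the check interval and deregistration timeout are likewise placeholders.

```python
import json
import urllib.request

# Assumption: the Consul agent listens on its default local HTTP address.
CONSUL_AGENT = "http://127.0.0.1:8500"


def build_register_request(name, port, health_url, agent=CONSUL_AGENT):
    """Build (but do not send) a request that registers a service with the agent."""
    payload = {
        "Name": name,
        "ID": name,
        "Port": port,
        "Check": {
            "HTTP": health_url,   # hypothetical health endpoint
            "Interval": "10s",    # placeholder polling interval
            # Ask Consul to drop the service if the check stays critical this long.
            "DeregisterCriticalServiceAfter": "5m",
        },
    }
    return urllib.request.Request(
        f"{agent}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_deregister_request(service_id, agent=CONSUL_AGENT):
    """Build (but do not send) a request that removes a service from the agent."""
    return urllib.request.Request(
        f"{agent}/v1/agent/service/deregister/{service_id}",
        method="PUT",
    )
```

Either request can be sent with urllib.request.urlopen(req), for example from a service's start and stop hooks. Building the request separately from sending it keeps the example testable without a running agent.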

Provide a zero-downtime scheduled maintenance framework. Minimise downtime through redundancy.

Automate service recovery that is tolerant of system failure (hardware failure or outages), within the context of the service reliability infrastructure.

Follow-up work: identify and document remaining single points of failure.

Structure

Each server will run the Consul agent and include a config listing the services it runs and how to monitor them (to test whether they are serviceable). This config should be maintained in Puppet.
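A per-server config of this kind might look like the output of the sketch below, which renders a Consul service definition with an HTTP health check as JSON. The service name, port, health URL, interval, and timeout are hypothetical values chosen for illustration; in practice Puppet would template the file, and the Python here only shows the shape of the data.

```python
import json


def service_definition(name, port, check_url, interval="10s", timeout="2s"):
    """Return a Consul service definition with a single HTTP health check."""
    return {
        "service": {
            "name": name,
            "port": port,
            "check": {
                "http": check_url,     # URL the agent polls to test serviceability
                "interval": interval,  # how often the agent runs the check
                "timeout": timeout,    # how long the agent waits for a response
            },
        }
    }


def render(definition):
    """Serialise a definition as stable, human-readable JSON."""
    return json.dumps(definition, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical example values, not taken from the deployment described here.
    print(render(service_definition("dashboard", 8080, "http://localhost:8080/health")))
```

The rendered JSON is intended to be dropped into the agent's configuration directory, one file per service, so that each server declares only the services it actually runs.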

...