DEV Community

GitProtect Team for GitProtect

Originally published at gitprotect.io

GitHub High Availability – Why It Should Never Be Considered as a Regular Backup

We do backups. That is a fact. The open question is whether we do them well. We can debate this, weigh the pros and cons of specific solutions, or refer to cases we have encountered ourselves. Is our own solution better? How about third-party tools? The cloud? There is no ideal solution, because the approach to backup and recovery often depends not only on what our product is, but also on the technologies we currently use and our experience with them.

What is High Availability?

GitHub Enterprise Server is a self-hosted platform for software development. Because it runs on our own infrastructure, we decide on security policies, access control, monitoring, and other characteristics. It even ships with its own backup tooling, and on top of that it offers some additional protection. It is good to have a backup in case of a problem, but a backup only helps us deal with the consequences of a failure after the fact. What about the disruption itself? What do we do while the outage is happening? Such critical events can also involve data loss, so we have to take care of that aspect as well.

High Availability mode lets us minimize service interruption when our appliance runs into unexpected technical problems such as network issues, hardware failures, or software crashes. A secondary instance (a replica) is kept in sync with the primary one, and proper configuration gives us automated, asynchronous replication of Git repositories, MySQL, and other datastores. Replicas operate in a so-called “active/passive configuration”: the replica appliance is up and running as a standby, its database services run in replication mode, and its application services are stopped.

You should also be aware that GitHub Enterprise Server limits the number of such instances. The information from the official documentation looks precisely like this:

[Image: GitHub Enterprise Server limits]

High Availability configuration

GitHub Enterprise Server provides us with command-line tools that help configure our HA replicas. We connect to a replica over SSH and run commands that trigger the proper actions. Let’s look at them:

  • ghe-repl-setup – puts the Server appliance in replica standby mode:
    an encrypted VPN tunnel is configured for communication,
    database services are configured for replication,
    application services are disabled

  • ghe-repl-start – turns on active replication of all datastores

  • ghe-repl-stop – stops replication services; replication can be resumed afterwards

  • ghe-repl-status – returns the status of each datastore replication stream; use the -v flag for a detailed message

  • ghe-repl-promote – disables replication and promotes the replica appliance to primary status
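
As a rough illustration of how these commands are invoked, here is a minimal Python sketch that builds the SSH command line for a replication command. It assumes the default administrative shell of GitHub Enterprise Server (user admin on port 122); the helper name admin_ssh_command and the hostname are hypothetical, and nothing is executed unless you pass the result to subprocess yourself.

```python
import shlex

# Commands taken from the list above; anything else is rejected.
REPL_COMMANDS = {
    "ghe-repl-setup",
    "ghe-repl-start",
    "ghe-repl-stop",
    "ghe-repl-status",
    "ghe-repl-promote",
}


def admin_ssh_command(host, command, *args):
    """Build the argv for running a replication command over the admin shell.

    Assumption: the appliance uses the default administrative SSH access
    (user `admin`, port 122). Adjust if your setup differs.
    """
    if command not in REPL_COMMANDS:
        raise ValueError(f"not a replication command: {command}")
    return ["ssh", "-p", "122", f"admin@{host}", command, *args]


# Hypothetical hostname, used for illustration only.
argv = admin_ssh_command("replica.example.com", "ghe-repl-status", "-v")
print(shlex.join(argv))
# To actually run it: subprocess.run(argv, check=True)
```

Restricting the helper to the five commands above keeps it from turning into a general remote shell.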

The replica appliance is a redundant copy of the primary appliance. If the main one fails, the high availability mode allows the replica to act as the primary appliance. The creation of the replica itself is quite easy. Without going into technical details, we need to perform the following steps:

  • create a new GitHub Enterprise Server appliance (mirroring the primary one)
  • make sure that communication between the instances works well
  • set up the admin password (it needs to match the primary one)
  • add an SSH key
  • run the ghe-repl-setup command with the proper parameters
  • add the public key to a whitelist
  • start replication
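
The steps above can be sketched as an ordered checklist in Python. This is only an illustration that mirrors the list, not the full official procedure: the Step type and replica_setup_plan are hypothetical names, the "where" column is my reading of which appliance each step applies to, and the final ghe-repl-status check is an addition for verification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    where: str   # "replica", "primary", or "both" (an assumption per step)
    action: str  # a shell command or a manual task


def replica_setup_plan(primary_ip):
    """Return the setup steps from the list above, in order."""
    return [
        Step("replica", "provision a new appliance mirroring the primary"),
        Step("both", "verify network connectivity between the appliances"),
        Step("replica", "set the admin password to match the primary"),
        Step("replica", "add an SSH key"),
        Step("replica", f"ghe-repl-setup {primary_ip}"),
        Step("primary", "add the replica's public key to the whitelist"),
        Step("replica", "ghe-repl-start"),
        Step("replica", "ghe-repl-status"),  # added here as a final check
    ]


# 203.0.113.10 is a documentation-range address, not a real primary.
for number, step in enumerate(replica_setup_plan("203.0.113.10"), start=1):
    print(f"{number}. [{step.where}] {step.action}")
```

Keeping the plan as data rather than executing it directly makes it easy to review before anything touches a real appliance.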

Configuration for geo-replication is somewhat different, but I will not focus on that feature here. The idea is similar, and that is enough for now.

Is High Availability similar to clustering?

Well... no. HA provides redundancy through a primary/secondary failover configuration. Clustering, on the other hand, provides both redundancy and scalability through additional nodes. In both cases we avoid the risks of relying on a single node, such as hardware failure (in one node), virtualization issues, or network issues. Clustering, however, also allows horizontal scaling, which gives us more options and can be crucial for our product.

There is another fact working against HA when it comes to scaling out. Although geo-replication allows us to distribute traffic geographically, there is one “problem” here: the performance of the HA solution is limited by the speed and availability of our primary instance. This is because each write request to a replica requires sending the data to the primary appliance first, and only then distributing it to all replicas.
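
To make the write path concrete, here is a toy Python model of it. It is a sketch under stated assumptions, not a description of GitHub's internals: the class names, the millisecond figures, and the idea that a replica serves reads from its local copy are all illustrative.

```python
class Primary:
    def __init__(self, write_ms):
        self.write_ms = write_ms  # time the primary needs to commit a write
        self.data = {}

    def write(self, key, value):
        self.data[key] = value
        return self.write_ms


class GeoReplica:
    def __init__(self, primary, round_trip_ms):
        self.primary = primary
        self.round_trip_ms = round_trip_ms  # network latency to the primary
        self.data = {}

    def read(self, key):
        # Assumption of this model: reads come from the local copy.
        return self.data.get(key)

    def write(self, key, value):
        # The write goes to the primary first, so its cost includes the
        # trip to the primary plus the primary's own commit time.
        return self.round_trip_ms + self.primary.write(key, value)

    def sync(self):
        # Replication delivers the primary's state to the replica later.
        self.data = dict(self.primary.data)


primary = Primary(write_ms=20)
replica = GeoReplica(primary, round_trip_ms=150)

cost = replica.write("repo", "new commit")
print(f"write through the replica took {cost} ms")
print("visible on replica before sync:", replica.read("repo"))
replica.sync()
print("visible on replica after sync:", replica.read("repo"))
```

However close a replica is to the user, a write can never be faster than the primary allows, and it fails outright if the primary is unavailable.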

High Availability is not a backup solution!

Let’s start by saying that neither solution is in any way a substitute for a full-fledged backup approach. Even GitHub itself states that HA should not be treated as a backup. Let me quote the documentation here:

“A high availability replica does not replace off-site backups in your disaster recovery plan. Some forms of data corruption or loss may be replicated immediately from the primary to the replica. To ensure safe rollback to a stable past state, you must perform regular backups with historical snapshots.”

This follows from what I described above, namely how data replication works in High Availability: the replica mirrors the primary, including any damage. A backup and recovery plan is necessary regardless of whether you use GitHub HA or not. On the need to back up your GitHub environment, I recommend reading our article dedicated to GitHub backup best practices.
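
The difference between a replica and a backup with historical snapshots can be shown with a small simulation. This is a deliberately simplified Python sketch: the Appliance and BackupStore classes are hypothetical, and real replication and backup tools work on far more than an in-memory dictionary.

```python
import copy


class Appliance:
    def __init__(self):
        self.repos = {}


def replicate(primary, replica):
    """Make the replica match the primary, whatever the primary contains."""
    replica.repos = copy.deepcopy(primary.repos)


class BackupStore:
    """Keeps every snapshot, so an older state stays restorable."""

    def __init__(self):
        self.snapshots = []

    def take_snapshot(self, appliance):
        self.snapshots.append(copy.deepcopy(appliance.repos))

    def restore(self, appliance, index):
        appliance.repos = copy.deepcopy(self.snapshots[index])


primary, replica, backups = Appliance(), Appliance(), BackupStore()

primary.repos["app"] = ["commit-1", "commit-2"]
replicate(primary, replica)
backups.take_snapshot(primary)   # snapshot 0: healthy state

primary.repos["app"] = []        # accidental history wipe
replicate(primary, replica)      # the replica faithfully copies the damage
backups.take_snapshot(primary)   # snapshot 1: already damaged

print(replica.repos["app"])      # failing over to the replica would not help
backups.restore(primary, 0)      # roll back to the last healthy snapshot
print(primary.repos["app"])
```

The replica answers the question "what does the primary look like now?", while the snapshot history answers "what did it look like before things went wrong?".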

Conclusion

By definition, and by common sense, a good backup solution should include automation, encryption, data retention, a restore plan, and more. Neither GitHub HA nor clustering provides us with this. Both solutions have their place, and we will certainly benefit from using them, but their task is quite different. GitHub High Availability is not a backup, so we need another tool for that. There are a few options, but when it comes to providing “business continuity” GitProtect.io is the only one with real Disaster Recovery features and backup replication. Both are critical aspects of any DR planning. But that’s a completely different story.
