First, let’s define what split-brain is. Each replica can be either connected to or disconnected from the other. If a replica spontaneously goes to StandAlone, it means that it refuses to accept the other replica’s state and will not synchronize with it. This is a classic split-brain situation.

Continue reading

Over the past few years of working closely with LINSTOR and DRBD9, I have accumulated a number of problems and their solutions, and I decided to collect them all in a single article. You may not face exactly the same problems, but you will at least understand the mechanics of managing and troubleshooting DRBD9 devices.

There is not much information on this subject on the Internet. I hope you’ll find it useful if you use or plan to use LINSTOR.

Continue reading

etcd is a fast, reliable and fault-tolerant key-value database. It is at the heart of Kubernetes and is an integral part of its control plane. It is important to know how to back up and restore both individual nodes and the entire etcd cluster.

In the previous article, we looked in detail at regenerating SSL certificates and static manifests for Kubernetes, as well as at restoring the operability of its control plane. This article is devoted entirely to restoring an etcd cluster.

Continue reading

Kubernetes is a great platform both for container orchestration and for everything else. Recently, Kubernetes has come a long way in terms of functionality, security and resilience. The Kubernetes architecture allows you to easily survive various kinds of failures and always stay afloat. Today we will break the cluster, delete certificates, and rejoin nodes live, doing all this fancy stuff without downtime for already running services.

Continue reading

Andrei Kvapil

DevOps / Cloud Architect

WEDOS Internet a. s.

Czech Republic, EU