MYSQL HA SERVICE (DRBD + HEARTBEAT + MON)

THE CONCEPT

The concept of an active/passive fail-over Cluster is the following:

  • You have 2 servers (also called nodes).
  • They communicate via cluster software (Heartbeat, Corosync, OpenAIS, Red Hat Cluster Suite).
  • They run on DRBD or share a storage system (SAN, NAS) that is connected to both nodes.
  • MySQL runs on only ONE node (active); the other node does nothing (passive).
  • You reach MySQL over a Virtual IP (VIP).
  • If a problem occurs, the cluster software should detect it and fail over the resources, including the VIP, to the passive node. There your database should be working again some time later (anywhere from < 1 min to > 60 min).
  • There should be no need to reconfigure or change anything in your application.

COMMENT ABOUT THE USED TOOLS AND THEIR VERSIONS

The following description is based on MySQL 5.1 (the database version does not really matter here). It was set up on CentOS 5.5, but other Linux distributions and versions should work similarly.
Further, we used DRBD v8.3.8 and Heartbeat v3.0.3. We configured Heartbeat to use the version 1 mode for 3 reasons:

  1. We ran into trouble with Corosync, Pacemaker and Heartbeat (a node was shut down without any obvious reason because of wrong return codes of the underlying Heartbeat scripts) and we found at least one bug in Corosync.
  2. Pacemaker/Corosync is IMHO more difficult and, at least for me, less transparent than the old Heartbeat version 1 mode. IMHO it is overkill for a simple active/passive fail-over Cluster.
  3. The configuration files are human readable and pretty simple.

I am aware that this opinion is old-fashioned! The modern way to do it is Pacemaker/Corosync. We will look into that solution again later…

BEFORE YOU START

Before you start, I recommend drawing a small sketch that shows all relevant components:

[Figure: drbd_sketch.png]

HARDWARE PRECONDITIONS

For testing purposes we used 2 virtual machines on Oracle VirtualBox. In practice you should avoid virtual systems for performance reasons. Furthermore, Enterprise VM servers (for example VMWare ESX server) already implement such fail-over capabilities at the VM level. So IMHO it does not make sense to have fail-over capabilities on two layers…
Another approach to High Availability when you do not have an ESX server is described here: MySQL on VMware Workstation/DRBD vs. VMWare ESX Server/SAN.

Typically it is a good idea for HA reasons to have 2 network cards with 2 ports each so we can bind 1 port of each network card together to a bond (make sure that you bind 2 ports of different cards, not of the same card, otherwise it will not make much sense).

If you have 2 independent power supplies for each node it will not hurt.
For the disk system you ideally have a RAID-1 or RAID-10 for redundancy. When you have a SAN or similar, it can make sense to attach the SAN to each server over 2 paths (multipathing, which is not discussed further here).

More redundancy typically means more availability but also more complexity.

So for our set-up we have 2 (virtual) machines with 4 network ports each and locally attached storage with 2 devices (partitions, disks): one device for DRBD and one device for everything else…

NETWORK AND OPERATING SYSTEM SETTINGS

Typically it makes sense to work as user root for setting-up the cluster software. The following steps have to be taken on both nodes.

Before you begin, it is a good idea to update the operating system and all installed packages (on CentOS, with yum update). This avoids problems that are already known and fixed.


At least for testing purposes you should disable the firewall between the cluster nodes and between the cluster nodes and the ping nodes. Clusters communicate over various ports with different locations, so firewalls between cluster nodes only get in the way in the beginning.
Once you are more advanced you can think about useful firewall settings around your cluster.

To check, stop and disable the firewall, use the following commands:
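The command listing did not survive in this copy of the article. As a minimal sketch, assuming the stock iptables service of CentOS 5 (the function name firewall_off is only illustrative), the three steps could look like this:

```shell
# Assumption: CentOS 5 with the stock iptables init script; run as root.
firewall_off() {
    service iptables status     # show the rules that are currently loaded
    service iptables stop       # stop the firewall right now
    chkconfig iptables off      # keep it disabled after the next reboot
}

# Usage (on both nodes): firewall_off
```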


On CentOS, SELinux appears to be installed and enabled by default. It only gets in the way during testing, so we disable it:


To make it persistent after a reboot we also have to set it in the configuration file:
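Both listings are missing here. A hedged sketch of the two steps, assuming the CentOS default location /etc/selinux/config; the helper name selinux_disable_persistently is hypothetical:

```shell
# Assumption: the SELinux settings live in /etc/selinux/config (CentOS default).
selinux_disable_persistently() {
    config_file="${1:-/etc/selinux/config}"
    # Replace only the SELINUX= line; SELINUXTYPE= is left untouched.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}

# Usage (as root, on both nodes):
#   setenforce 0                    # switch to permissive mode immediately
#   selinux_disable_persistently    # takes effect at the next boot
#   getenforce                      # should now report "Permissive"
```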


To keep things simple we use short network names:


And for ease of use we give the servers meaningful names:


To make the change visible to the system (without a reboot) you also have to set it manually:


When we refer to a server name it should match the following command:
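The command itself is missing from this copy. Heartbeat identifies nodes by the name that uname -n reports, so a small check along these lines can catch a mismatch early. This is a sketch; the helper name check_node_name is made up for illustration, while server1 and server2 are the node names used later in this article:

```shell
# Returns 0 when the kernel node name equals the expected cluster node name.
check_node_name() {
    expected="$1"
    actual="$(uname -n)"
    if [ "$actual" = "$expected" ]; then
        echo "OK: node name is $actual"
    else
        echo "MISMATCH: expected $expected, but uname -n reports $actual" >&2
        return 1
    fi
}

# Usage: check_node_name server1   (on the first node)
#        check_node_name server2   (on the second node)
```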


TIME SYNCHRONIZATION NTP

It is important that all nodes in a cluster have the same, correct time. In case of trouble this makes searching for errors much easier.
To have the correct time on your server, install an NTP client:


NETWORK – BONDING

Before setting up anything else it makes sense to set-up the bonding stuff first. Then you can test if the bonding works and then forget about it for the later steps.

In our set-up we have 2 servers with 2 network cards and 2 ports each. So we bind port 0 from network card 1 with port 0 from network card 2 to bond0 and port 1 from network card 1 with port 1 from network card 2 to bond1 (see sketch above).

In our set-up we decided to use the bond0 for external communication and bond1 for the internal node to node communication.

To configure a bond manually you can use the following commands:


On my virtual machine I got a kernel panic when I used the wrong interfaces! 🙂 So be careful! But this should not happen on real boxes. Maybe I just did something wrong with my VM configuration…

To check if bonding works correctly the following commands can help:


and


and


To destroy the bonding again you can use the following commands:


To make the bond permanent you have to change the following configuration files:




After the network restart the bond should show up:

TESTING

To make sure the actual bonding works we typically use ping on the IP address and unplug the cables.
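While unplugging cables it helps to watch which slave the bond is currently using. The following is only a sketch that reads the status file the Linux bonding driver exposes under /proc/net/bonding/; the function name bond_status is hypothetical:

```shell
# Prints the currently active slave and the MII link state of each slave
# from a bonding status file (default: /proc/net/bonding/bond0).
bond_status() {
    status_file="${1:-/proc/net/bonding/bond0}"
    awk -F': ' '
        /^Currently Active Slave/ { print "active=" $2 }
        /^Slave Interface/        { slave = $2 }
        /^MII Status/ && slave    { print slave "=" $2 }
    ' "$status_file"
}

# Usage while pulling a cable:
#   while true; do bond_status; sleep 1; done
```

In active-backup mode the active= line should switch to the other interface as soon as the link of the unplugged port is reported as down, while the ping keeps running.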

LITERATURE

  1. Bonding (Port Trunking)
  2. How to Set up Network Bonding on CentOS 5.x Tutorial

INSTALLING DRBD

Installing DRBD 8.3 is straightforward. On both machines:


If the module is not loaded automatically you can load it with the following command:


Then prepare your device/partition:


Then we can start configuring DRBD. The configuration of DRBD is done in the following file:
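The file name and its content are missing from this copy. In DRBD 8.3 the main configuration file is /etc/drbd.conf. Below is a minimal sketch of what such a file could contain, wrapped in a small helper so it can be generated once and copied to both nodes. The resource name drbd_r1 and the node names server1/server2 come from this article. The backing partition, the IP addresses, the port and the sync rate are assumptions:

```shell
# Writes a sample DRBD 8.3 configuration to the given path.
# Assumptions (not from the original): /dev/sdb1 as backing partition,
# 192.168.2.x as addresses of the internal bond1 link, port 7789, rate 30M.
write_drbd_conf() {
    target="${1:-./drbd.conf.sample}"
    cat > "$target" <<'EOF'
global {
    usage-count no;
}

resource drbd_r1 {
    protocol C;                  # synchronous replication

    syncer {
        rate 30M;                # cap on resynchronization bandwidth
    }

    on server1 {                 # must match the output of uname -n on that node
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.2.1:7789;
        meta-disk internal;
    }

    on server2 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   192.168.2.2:7789;
        meta-disk internal;
    }
}
EOF
}
```

Usage: run write_drbd_conf /etc/drbd.conf on one node, adjust the values to your hardware, and copy the file unchanged to the other node.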


To avoid problems, make sure the configuration file is identical on both nodes. Then run the following commands on both nodes:


drbdadm up includes the following steps:

  • drbdadm attach drbd_r1
  • drbdadm syncer drbd_r1
  • drbdadm connect drbd_r1

With the following command you can see what DRBD does:


You should see a disk state (ds:) of Inconsistent/Inconsistent.
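The status command is missing here; presumably it is cat /proc/drbd, which is referenced a few lines further down. As a sketch, the interesting fields can be pulled out of that file as follows (the function name drbd_state is hypothetical):

```shell
# Prints connection state (cs), roles (ro) and disk states (ds) of one DRBD
# minor device, read from /proc/drbd or a file in the same format.
drbd_state() {
    minor="${1:-0}"
    status_file="${2:-/proc/drbd}"
    awk -v minor="$minor" '
        $1 == (minor ":") {
            out = ""
            for (i = 2; i <= NF; i++)
                if ($i ~ /^(cs|ro|ds):/)
                    out = out (out == "" ? "" : " ") $i
            print out
        }
    ' "$status_file"
}

# Usage: drbd_state 0
```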

Then we do an initial device synchronization on one node:


Now cat /proc/drbd should show the roles (ro:) as Secondary/Secondary.

To make one DRBD node primary, run the following command (with MySQL databases we should only ever have Primary/Secondary roles, never Primary/Primary, otherwise you destroy your InnoDB data files):


Then we can format and mount the device:


Do not add the device to the fstab. This resource will be controlled by Heartbeat later. If you add it to the fstab, this will cause conflicts when the server reboots.

To manually fail-over the DRBD device you should proceed as follows on the node where the DRBD device is mounted and/or Primary:


Then on the other node:


This will later be automated by the cluster suite (Heartbeat).
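The two command listings for the manual fail-over are missing in this copy. A minimal sketch of both halves, assuming /dev/drbd0 as the DRBD device and /data/mysql as the mount point (both hypothetical, as are the function names):

```shell
# Assumptions (not from the original): /dev/drbd0 as DRBD device and
# /data/mysql as mount point; drbd_r1 is the resource name used in this article.
DRBD_RESOURCE="drbd_r1"
DRBD_DEVICE="/dev/drbd0"
MOUNT_POINT="/data/mysql"

# Run on the node that currently has the device mounted (Primary).
drbd_release() {
    umount "$MOUNT_POINT" &&
    drbdadm secondary "$DRBD_RESOURCE"
}

# Run afterwards on the other node.
drbd_takeover() {
    drbdadm primary "$DRBD_RESOURCE" &&
    mount "$DRBD_DEVICE" "$MOUNT_POINT"
}
```

The order matters: the file system has to be unmounted before the resource can be demoted, and the resource has to be Primary before it can be mounted on the other node.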

LITERATURE

  1. Configuring DRBD

INSTALLING MYSQL

Installing MySQL is straightforward. Just use your preferred MySQL installation method:


I personally prefer to install MySQL on each server and do not place the MySQL binaries on a DRBD device. This has the advantage that you can upgrade the passive node first and then do a fail-over and later on upgrade the other node.
If you put the MySQL binaries on the DRBD device you have the advantage that you only have to upgrade the binaries once.

To avoid different MySQL configurations on both nodes I prefer to locate the my.cnf on the DRBD device. This has the big advantage that you never have different configurations on both nodes and you will not experience bad surprises after a fail-over.

To make this possible we only have a very small /etc/my.cnf:
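The content of the small file is missing from this copy. One way to get this effect is the !include directive of MySQL option files. The sketch below assumes that approach and a hypothetical location /data/mysql/my.cnf on the DRBD device; the helper name is made up as well:

```shell
# Writes a stub option file that only pulls in the real configuration
# from the DRBD-backed file system.
# Assumption: /data/mysql/my.cnf is a hypothetical path on the DRBD device.
write_stub_my_cnf() {
    stub="${1:-/etc/my.cnf}"
    real_cnf="${2:-/data/mysql/my.cnf}"
    printf '!include %s\n' "$real_cnf" > "$stub"
}

# Usage (as root, on both nodes): write_stub_my_cnf
```

On the passive node the included file is not reachable while the DRBD device is unmounted, which is fine as long as MySQL is only started there after a fail-over.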


Then we install our database on the mounted drbd device:


Ideally you clean up the already installed /var/lib/mysql to avoid confusion. Then we create our my.cnf:


Because MySQL will be controlled by the Cluster Software (Heartbeat) we have to disable the automated start/stop mechanism on both nodes:


Now we can try a manual fail-over including MySQL. Start with the active node:


Then on the other node:


HEARTBEAT

As stated above, Heartbeat is no longer a contemporary tool. Nevertheless it is quite easy to configure, straightforward, and has human readable configuration files.

To install Heartbeat use:


You can find some sample configurations under /usr/share/doc/heartbeat-<version>

We configure the Heartbeat in the old (v1) style. XML is not a format made for humans and old style Heartbeat configurations are easily human readable:

There are 3 files located under /etc/ha.d/



And finally the authkeys file should be secured:
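The three files (ha.cf, haresources and authkeys) and the command to secure authkeys are missing from this copy. The following sketch generates sample versions in the old v1 style. The VIP, the ping node, the interfaces, the mount point, the file system type, the init script name and all timing values are assumptions, not the original values:

```shell
# Writes sample Heartbeat v1-style configuration files into a directory.
# Assumptions (not from the original): VIP 192.168.1.5/24 on bond0, ping node
# 192.168.1.254, heartbeat broadcast on bond1, ext3 on /dev/drbd0 mounted at
# /data/mysql, init script named mysqld, and all timing values.
write_ha_configs() {
    dir="${1:-./ha.d.sample}"
    mkdir -p "$dir"

    cat > "$dir/ha.cf" <<'EOF'
logfacility local0
keepalive 2
warntime 10
deadtime 30
initdead 120
udpport 694
bcast bond1
auto_failback off
node server1
node server2
ping 192.168.1.254
EOF

    # Resources are started left to right and stopped right to left.
    cat > "$dir/haresources" <<'EOF'
server1 drbddisk::drbd_r1 Filesystem::/dev/drbd0::/data/mysql::ext3 mysqld IPaddr::192.168.1.5/24/bond0
EOF

    cat > "$dir/authkeys" <<'EOF'
auth 1
1 sha1 ReplaceThisWithYourOwnSecret
EOF
    # Heartbeat expects authkeys to be readable by root only.
    chmod 600 "$dir/authkeys"
}
```

Usage: write_ha_configs /etc/ha.d on one node, then adapt the values. The node names in ha.cf and haresources have to match the output of uname -n.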


Under normal circumstances all 3 files should be the same on both nodes of our Cluster. To ensure this you can either edit them on one node and distribute them with the following command to the other node:


or if you prefer to configure them on both nodes independently I usually use this method to compare them:
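Both listings are missing here. A simple way to compare the files is by checksum; the sketch below assumes md5sum is available and that root can SSH from server1 to server2 (the function name is hypothetical):

```shell
# Prints one checksum line per Heartbeat configuration file.
ha_config_fingerprint() {
    dir="${1:-/etc/ha.d}"
    ( cd "$dir" && md5sum ha.cf haresources authkeys )
}

# Usage on server1, assuming root SSH access to server2:
#   scp /etc/ha.d/ha.cf /etc/ha.d/haresources /etc/ha.d/authkeys server2:/etc/ha.d/
#   ha_config_fingerprint > /tmp/local.md5
#   ssh server2 'cd /etc/ha.d && md5sum ha.cf haresources authkeys' > /tmp/remote.md5
#   diff /tmp/local.md5 /tmp/remote.md5 && echo "configurations are identical"
```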


Now stop all resources and start Heartbeat on both nodes at the same time (more or less):


You should see MySQL, DRBD and the VIP started on server1. Try to access your MySQL database remotely through the VIP.


If this works we try a manual fail-over. On the passive (standby) node run the following command:


Follow all the steps in the syslog on both nodes:


If everything is OK, all the resources should now have moved to server2.

If you run into error or warning messages, find out what causes them and fix it. One thing we found is that the mysqld start/stop script in CentOS 5.5 does not seem to work with Heartbeat. After we changed the following line in the script, it worked fine for us!

TESTING

When your MySQL HA Cluster works properly without any error message and you can fail over back and forth, it is time to test all the different scenarios in which something can go wrong:

  • Disk failures (just unplug the disks)
  • Multipathing fail-over if you have such.
  • Stopping nodes hard (power off)
  • Restarting a server (init 6)
  • Unmounting file system
  • Try to kill DRBD
  • Killing MySQL
  • Stopping VIP
  • Stopping Heartbeat
  • Bonding fail-over (unplug one cable)
  • Split brain (disconnect the interconnect cables simultaneously)
  • etc.

Theoretically all those cases should be handled by your set-up properly.

MONITORING

When you stop/kill MySQL or the VIP you will notice that no fail-over happens. Heartbeat does not notice this “failure” and therefore does not react to it.

Whether you want a fail-over in such a case is debatable and a matter of personal preference. In this situation we typically set up a monitoring solution on the active node which simply stops Heartbeat, and thereby triggers a fail-over, if MySQL is no longer reachable over the VIP.

The philosophy behind this decision is that it is worse for the customer to be unable to work for 15 or 30 minutes while we look for the problem and restart MySQL or the VIP. So we trigger a fail-over and hope MySQL comes up properly on the other side. In most cases it does.

If not, there is still time after the fail-over to figure out what went wrong. We stop Heartbeat to guarantee that no failback takes place without human intervention. Just do not forget to restart Heartbeat on this node once you have found the problem.
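The logic described above can be sketched in a few lines of shell. This is not the mon configuration itself, only an illustration of the check that a monitor would run on the active node. The VIP, the monitoring account, the retry count and the function names are assumptions:

```shell
# Assumptions (not from the original): VIP address, monitoring account,
# retry count, and a Heartbeat init script called "heartbeat".
VIP="192.168.1.5"
MON_USER="monitor"
MON_PASSWORD="secret"
MAX_FAILURES=3

# Returns 0 if MySQL answers a ping through the VIP.
mysql_alive() {
    mysqladmin --host="$VIP" --user="$MON_USER" --password="$MON_PASSWORD" \
        --connect-timeout=5 ping >/dev/null 2>&1
}

# Probes MySQL up to MAX_FAILURES times and stops Heartbeat only if
# every single probe fails.
check_and_failover() {
    failures=0
    while [ "$failures" -lt "$MAX_FAILURES" ]; do
        if mysql_alive; then
            return 0
        fi
        failures=$((failures + 1))
        sleep "${RETRY_DELAY:-5}"
    done
    echo "MySQL not reachable over $VIP, stopping Heartbeat to force a fail-over" >&2
    service heartbeat stop
}
```

Retrying a few times before giving up avoids a fail-over caused by a single lost packet; the price is a slightly longer detection time.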

In our set-ups we typically use mon to monitor the MySQL service. Mon is configured as follows:


All those events should be monitored and immediately reported to the DBA/System Administrator, so that they can investigate right away what has happened and fix the problem before it leads to a complete system outage.

To monitor DRBD we have added a module to our MySQL Performance Monitor for MySQL. When DRBD is in the wrong state an alert can be raised.

It is also a good idea to test a cluster fail-over regularly. I recommend once every 3 months. If you trust your cluster set-up, that is 15 minutes of work outside your peak hours. If you do NOT trust your cluster set-up you should not use it at all.

TROUBLE SHOOTING

If you ever run into a Split Brain situation with DRBD, find out which side you want to continue with, and then run the following on the side where the status is cs:StandAlone ro:Primary/Unknown:


and on the other side:


Be careful: This destroys the data on the DRBD node where you run this command! Ideally you back up both nodes beforehand.
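The two listings are missing from this copy. The sketch below follows the manual split brain recovery procedure for DRBD 8.3 as I understand it; the function names are hypothetical and drbd_r1 is the resource name used in this article:

```shell
DRBD_RESOURCE="drbd_r1"

# Run on the node whose data you want to KEEP (the survivor).
split_brain_keep() {
    drbdadm connect "$DRBD_RESOURCE"
}

# Run on the node whose changes you are willing to LOSE (the victim).
# This is the destructive step.
split_brain_discard() {
    drbdadm secondary "$DRBD_RESOURCE" &&
    drbdadm -- --discard-my-data connect "$DRBD_RESOURCE"
}
```

Double check on which node you are logged in before calling split_brain_discard, and verify with cat /proc/drbd afterwards that the nodes are connected and resynchronizing.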

And now have fun setting up your MySQL HA set-up…
