Sunday, June 17, 2007

Scaling webapps with Tomcat Session Replication and MySQL Database Replication

In a recent project of mine we had to scale our application across multiple (6) plain Linux machines. There was no cluster or failover mechanism available from the host. We had only one simple load balancer without any support for IP/session stickiness. Our application was a web application based on the Tomcat application server and a MySQL database. We decided to run both the application server and the database on every machine and synchronize the database and Tomcat sessions across all of them.

Versions used:
- Tomcat 5.5
- Mysql 5.0.41

Tomcat Session Replication:
  • Install tomcat on all the machines
  • Locate the "Cluster" node in server.xml. It is usually commented out in the default server.xml
  • If it is commented out, uncomment to enable it.
  • Nodes in a Tomcat cluster advertise themselves and discover other nodes through multicast. Multicast connection properties can be configured from the "Membership" node.
  • Verify mcastAddr and mcastPort and change if required.
  • Once the nodes discover each other, replication happens through TCP connections. Each node has a receiver and a sender to send and receive session information from other nodes. Properties for the sender and receiver TCP connection can be configured through nodes "Sender" and "Receiver".
  • Set the "tcpListenPort" of the receiver to an unused TCP port on your machine.
  • If name resolution is not set up properly on your servers, set "tcpListenAddress" to the IP of the machine as seen by the other machines in the cluster.
  • Restart Tomcat and it is done!
  • If automatic application deployments across cluster ("Farming") is not required, the "Deployer" node can be commented out.
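Put together, the steps above correspond to a "Cluster" section roughly like the following. This is a sketch based on the default commented-out example in the Tomcat 5.5 server.xml; the IP address and ports are placeholders you should replace with your own values.

```xml
<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster">
    <!-- Nodes find each other over multicast -->
    <Membership className="org.apache.catalina.cluster.mcast.McastService"
                mcastAddr="228.0.0.4"
                mcastPort="45564"
                mcastFrequency="500"
                mcastDropTime="3000"/>
    <!-- Each node listens for session data from the others over TCP;
         tcpListenAddress is the IP of this machine as seen by its peers -->
    <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
              tcpListenAddress="192.168.1.10"
              tcpListenPort="4001"
              tcpSelectorTimeout="100"
              tcpThreadCount="6"/>
    <!-- Sends this node's session changes to the other nodes -->
    <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
            replicationMode="pooled"/>
</Cluster>
```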
Session replication can be verified by installing a session-dependent application on all machines and then hitting the individual IP of each machine in turn to see whether the application can recall your session information across machines. At this point the "netstat" command should show TCP connections from each machine to every other machine on the TCP send/receive ports.

** Tomcat session replication works by serializing sessions, transferring them to another machine and de-serializing them back there. So unless one is careful to store only serializable session attributes, replication will break.
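A quick way to catch unsafe attributes before they break replication is to round-trip them through Java serialization, the same way Tomcat does. The sketch below uses a hypothetical CartItem attribute; a class missing "implements Serializable" would make the round trip throw NotSerializableException.

```java
import java.io.*;

// Hypothetical session attribute: it must implement Serializable,
// or session replication will fail when Tomcat serializes the session.
class CartItem implements Serializable {
    private static final long serialVersionUID = 1L;
    final String sku;
    final int quantity;
    CartItem(String sku, int quantity) { this.sku = sku; this.quantity = quantity; }
}

public class SerializationCheck {
    // Round-trips an object the same way session replication does;
    // throws NotSerializableException for unsafe attribute types.
    static Object roundTrip(Object attribute) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(attribute);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        CartItem restored = (CartItem) roundTrip(new CartItem("ABC-1", 2));
        System.out.println(restored.sku + " x" + restored.quantity); // prints "ABC-1 x2"
    }
}
```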

We did a load test with a simple application that stores and modifies 10KB of session data, and it worked beautifully with no failures. If you face synchronization delays, try tuning the replicationMode and tcpThreadCount attributes. More details on tuning Tomcat session replication are available in the Apache Tomcat clustering documentation.

MySQL Database Replication:

MySQL database replication works by replaying the SQL queries run on one machine in the chain on all the other machines. The source node is called the "Master" and the destination node the "Slave". The same node can be a master to one node and a slave to a different node at the same time.

Master functionality:
When replication is enabled, the master node records all the SQL queries it executes in a binary log. These logs are typically named bin.000001, bin.000002 and so on. The sequence number of the binary logs is incremented on every restart of the server and when a log reaches its size limit. The list of binary logs on a machine is maintained in an index file named log-bin.index.

Slave functionality:
The slave connects to the master and waits to receive events from the master's binary log. The slave creates a file called master.info where it maintains details of the master and its current position in reading the master's binary log. The slave writes the master's binary log events into relay log files named relay.000001, relay.000002 and so on. It also creates a file called relay-log.info where it maintains the position in the relay log from which to resume processing, and it keeps track of the relay log files through an index file, relay-log.index.

Preventing clash of auto increment values in a replication chain:
To prevent the same sequence number from being generated on two servers in a replication chain, MySQL provides two parameters: auto_increment_increment (the amount by which an auto-increment column is incremented for its next value) and auto_increment_offset (the starting value of an auto-increment column). Using these two parameters you can ensure unique auto-increment values across all the nodes in the chain.
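For example, with auto_increment_increment = 10 and offsets 1, 2 and 3 (the settings used below), each node generates ids in its own disjoint sequence. A sketch with a hypothetical table:

```sql
-- With auto_increment_increment = 10, each node's generated ids
-- fall in a distinct residue class, so they can never clash:
--   node 1 (auto_increment_offset = 1): 1, 11, 21, 31, ...
--   node 2 (auto_increment_offset = 2): 2, 12, 22, 32, ...
--   node 3 (auto_increment_offset = 3): 3, 13, 23, 33, ...
-- An increment of 10 also leaves headroom for up to 10 nodes.
CREATE TABLE orders (id INT AUTO_INCREMENT PRIMARY KEY, item VARCHAR(50));
INSERT INTO orders (item) VALUES ('widget');  -- on a fresh table, id = the node's offset
```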

  • Install MySQL on all the nodes
  • Start MySQL, use your DDL scripts to create databases and tables.
  • Shutdown all MySQL servers.
  • Add the following parameters to the my.cnf (my.ini on Windows) files of all the nodes. Here is an example with 3 servers, which you can extend to the number of nodes in your installation:
    Parameter                    Node 1      Node 2      Node 3
    server-id                    1           2           3
    auto_increment_increment     10          10          10
    auto_increment_offset        1           2           3
    master-host                  Node3 IP    Node1 IP    Node2 IP
    master-user                  reptest     reptest     reptest
    master-password              abc         abc         abc
    master-port                  3306        3306        3306
    replicate-same-server-id     0           0           0
    master-connect-retry         30          30          30
  • Startup all MySQL servers.
  • Create a replication user on all servers:
      GRANT REPLICATION SLAVE ON *.* TO 'reptest'@'%' IDENTIFIED BY 'abc';
    Here reptest is the replication user and abc is the password. You can assign your own user id and password for replication.
  • Log in to mysql using the mysql console.
  • Check that the slaves are running with the command "SHOW SLAVE STATUS".
  • If a slave is not running, you can start it up with the command "START SLAVE".
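Checking slave health from the mysql console looks roughly like this (a sketch; the two *_Running columns should both say Yes):

```sql
SHOW SLAVE STATUS\G
-- Key columns to verify in the output:
--   Slave_IO_Running:      Yes   (I/O thread is copying the master's binary log)
--   Slave_SQL_Running:     Yes   (SQL thread is applying the relay log)
--   Seconds_Behind_Master: 0

-- If either thread is stopped:
START SLAVE;
```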
Replication was fast and changes were reflected across the chain almost instantaneously - fast enough for a web-application type of environment.

We encountered a surprise: if a query uses SYSDATE to set the current date in a column, then because the same SQL is run on all nodes, after replication the column will hold the time at which the query was run on each machine. If you don't want such time differences to crop up, avoid functions like SYSDATE that use machine-specific information. It also helps to sync up the clocks of all nodes through NTP or something similar.
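A short illustration with a hypothetical audit table: in MySQL's statement-based replication, NOW() is taken from the timestamp recorded with the statement in the binary log and so replicates consistently, while SYSDATE() is re-evaluated when each slave replays the statement.

```sql
-- Hypothetical audit table
CREATE TABLE audit_log (id INT AUTO_INCREMENT PRIMARY KEY, created DATETIME);

-- Replication-unsafe: SYSDATE() is re-evaluated on each node,
-- so replicated rows end up with slightly different times
INSERT INTO audit_log (created) VALUES (SYSDATE());

-- Replication-safe: NOW() uses the statement's binlog timestamp,
-- so every node stores the same value
INSERT INTO audit_log (created) VALUES (NOW());
```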

Strategy for adding a new node into a chain:
Inserting a new node becomes easy with the following steps. I'll explain with the 3-node example used before. Let's say we want to introduce Node 4; it should go between Node 3 and Node 1.
  • Shutdown Node 3
  • Take a backup of Node 3 data files and configuration files and copy them to Node 4.
  • In Node 4, change the server-id to 4 and the auto_increment_offset to 4.
  • Set the master of Node 4 to Node 3, and the master of Node 1 to Node 4.
  • Restart node 1
  • Start nodes 3 and 4.
The simple steps above are suitable if the DB size is small and you can afford downtime on a few of the nodes. Otherwise the same can be done by replicating Node 3 while it is online. Note that this simple chain breaks even if one of the nodes goes offline. It is also possible to create backup nodes and avoid breaking the chain when one node goes down, or to have scripts that detect a node going down and quickly change the master and slave settings of the adjacent nodes to keep the loop complete while the failed node is re-installed and inserted as a new node.
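The ring change above boils down to a few edits in the MySQL configuration files of the two affected nodes. A sketch, with the IP addresses as placeholders:

```ini
# Node 4 (data and config files copied from Node 3):
server-id             = 4
auto_increment_offset = 4
master-host           = <Node 3 IP>   # Node 4 replicates from Node 3

# Node 1 (only the master pointer changes):
master-host           = <Node 4 IP>   # Node 1 now replicates from Node 4
```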

Also interesting is this blog posting that talks about efficient selective replication for large databases by using the MySQL Blackhole engine to set up an intermediate filtered database.

1 comment:

tan said...

Also useful is load balancing of Tomcats using an Apache proxy. See this.