Wednesday, September 07, 2011

WSO2 Load Balancer - how it works

... an under-the-hood look at the WSO2 Elastic Load Balancer.

In The role of a Load Balancer in a Platform-as-a-Service, I explained some key concepts behind load balancing. In this article, we will take a look at the inner workings of the WSO2 Elastic Load Balancer to understand how some of these concepts have been incorporated. The WSO2 Elastic Load Balancer is the load balancer used in StratosLive, the first complete open-source Java Platform-as-a-Service (PaaS).

High Level Architecture

As shown in the diagram above, the WSO2 Elastic Load Balancer is built on Apache Tribes (the group management framework used by Apache Tomcat as well as Apache Axis2), the Apache Axis2 clustering module, Apache Synapse (one of the best-performing mediation frameworks), and the autoscaling component from the award-winning WSO2 Carbon framework. The autoscaling component interacts with the Amazon EC2 API to carry out infrastructure-related functions such as starting new instances, terminating running instances, mapping elastic IPs & so on.

Let's start looking at the above diagram from the topmost layer. The Carbon autoscaling component is responsible for keeping track of the traffic to each Cloud Service cluster, and for making the decisions to scale the system up when the load increases & down when the load reduces. It is also responsible for performing a periodic sanity check on the system, which ensures that the minimum system configuration, such as the minimum number of running instances of each Service & load balancer cluster, is maintained at all times. The autoscaling component calls out to the EC2 API for infrastructure-related activities. It has been implemented as two Apache Synapse mediators (the autoscale-in & autoscale-out mediators) and a Synapse task.

A Synapse endpoint called the ServiceDynamicLoadBalanceEndpoint is responsible for routing messages to the appropriate node. First, the Cloud Service cluster to which the message has to be sent is identified using the Host HTTP header. Then, a member of that Service cluster is selected according to the specified load balancing algorithm, and the message is sent to that member. The response from that member is sent back to the client which originated the request. While the LB is trying to forward a request to a member, that member may fail; in such a case, if possible, the ServiceDynamicLoadBalanceEndpoint will try to failover to another available member. A member that has failed will be suspended for a specified period. This endpoint is also responsible for handling sticky sessions: when a request comes in, the endpoint first checks whether a sticky session has been created for that client, and if such a session is found, the request is forwarded to the relevant member. Sticky sessions are identified using the value of the JSESSIONID cookie.
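The routing decision described above can be sketched roughly as follows. This is an illustrative, simplified model, not the actual Synapse endpoint code: the class and field names, the host name appserver.example.com, and the plain round-robin policy are all assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: selects a cluster by Host header, honours sticky
// sessions first, then falls back to round-robin over the cluster's members.
class LoadBalanceRouter {
    // Host header -> ordered list of members in that Service cluster
    private final Map<String, List<String>> clusters = new ConcurrentHashMap<>();
    // JSESSIONID value -> member that owns the sticky session
    private final Map<String, String> stickySessions = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    void addCluster(String host, List<String> members) {
        clusters.put(host, members);
    }

    /** Pick the target member for a request; sticky sessions win over round-robin. */
    String route(String hostHeader, String jsessionId) {
        if (jsessionId != null) {
            String owner = stickySessions.get(jsessionId);
            if (owner != null) return owner;            // existing sticky session
        }
        List<String> members = clusters.get(hostHeader); // cluster by Host header
        if (members == null || members.isEmpty()) return null;
        // simple round-robin over the cluster's members
        String member = members.get(Math.floorMod(counter.getAndIncrement(), members.size()));
        if (jsessionId != null) stickySessions.put(jsessionId, member);
        return member;
    }
}
```

In the real endpoint, a failed member would additionally be suspended and the request retried against another member; that failover path is omitted here for brevity.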

Binary Relay or Message Relay is the message pass-through mechanism in the WSO2 Elastic Load Balancer. This relay enables the LB to pass messages through without building or processing them. This ensures that the overhead introduced by the LB is minimal.

Membership discovery & management is handled by the Axis2 clustering module, which handles Service membership. There are multiple membership handlers, one for each clustering domain, responsible for handling membership. The ServiceDynamicLoadBalanceEndpoint will obtain the relevant members of the cluster from these Axis2 membership handlers, and then apply the load balancing algorithm to determine the next member to which the request has to be sent. The Axis2 clustering implementation uses the underlying Apache Tribes group management framework for group membership management.

Dynamic Clusters & Group Management Agents

In the WSO2 Elastic Load Balancer, we have support for dynamic & hybrid membership discovery. As shown in the above diagram, the Elastic Load Balancer can be set up in a primary-secondary configuration. These LBs will be in a cluster, and state replication can take place between these nodes. The LBs are also configured as Group Management Agents, hence they can manage domains A, B & X. This means that any membership changes in domains A, B & X will be visible to LBpri & LBsec. In a typical deployment, a public IP address will be assigned to LBpri. LBsec will keep observing LBpri, and if LBpri fails, LBsec will map the public IP to itself, spawn another secondary LB, and then become the primary LB. The client sees the public IP address mapped to LBpri, and needs to send the HTTP 1.1 Host header. This Host header will be used by LBpri to determine the destination Service cluster. For example, if the Host is set to Hb, LBpri will select one of the nodes in domain B, and send the request to it.
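The primary/secondary takeover can be sketched along these lines. This is a hypothetical outline only: the Ec2Client interface and its method names are illustrative stand-ins for the real EC2 API calls, and the IP, instance & AMI identifiers in the test are made up.

```java
// Hypothetical sketch of the secondary LB's reaction to a primary failure.
interface Ec2Client {
    void associateElasticIp(String elasticIp, String instanceId); // remap public IP
    String launchInstance(String imageId);                        // spawn a new node
}

class SecondaryWatchdog {
    private final Ec2Client ec2;
    private boolean primary = false;

    SecondaryWatchdog(Ec2Client ec2) {
        this.ec2 = ec2;
    }

    /** Called when membership management reports that the primary LB has left the group. */
    void onPrimaryFailed(String elasticIp, String myInstanceId, String lbImageId) {
        ec2.associateElasticIp(elasticIp, myInstanceId); // take over the public IP
        ec2.launchInstance(lbImageId);                   // spawn a replacement secondary
        primary = true;                                  // this node is now the primary
    }

    boolean isPrimary() {
        return primary;
    }
}
```

In the actual system the "observation" of LBpri falls out of Tribes group membership: LBsec is simply notified when the primary leaves the membership channel.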

For this setup to work, clustering has to be enabled in all the backend worker nodes, as well as the load balancers, because the underlying Axis2 clustering mechanism & Tribes group management framework are used for dynamic membership management.

Real Port to Proxy Port Mapping

The mapping between the proxy port & the real port on the member (worker node) is provided as shown in the diagram above. In the above diagram, we have shown an example of the LB exposing 4 ports; 2 for HTTP & 2 for HTTPS requests. The HTTP ports on the LB are 80 & 8280. The HTTPS ports on the LB are 443 & 8243.

Member A1 exposes 4 ports; HTTP 9762 is proxied via port 80 in the LB, HTTP 9763 is proxied via port 8280 in the LB; HTTPS 9443 is proxied via port 443 in the LB & HTTPS 9444 is proxied via port 8243 in the LB. 

Member A2 also has a similar real port to proxy port mapping as depicted in the diagram above.

These ports & the mapped proxy ports will be advertised by the members when they join the cluster, and the LB can retrieve these values as properties of the member. When the LB receives a request, it will get the incoming port, and before dispatching the request to a member, it will try to get the mapped real port from the member, and route the request to the appropriately mapped port (as advertised by the member).
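The port-mapping lookup described above can be sketched as follows. This is a hypothetical illustration, not the actual LB code, though the port numbers mirror member A1's mapping from the text.

```java
import java.util.Map;

// Illustrative sketch of proxy-port -> real-port dispatch for one member.
class PortMapper {
    // advertised by the member when it joins the cluster: proxy port -> real port
    private final Map<Integer, Integer> proxyToReal;

    PortMapper(Map<Integer, Integer> proxyToReal) {
        this.proxyToReal = proxyToReal;
    }

    /** Resolve the member's real port for the proxy port the request arrived on. */
    int realPortFor(int incomingProxyPort) {
        Integer real = proxyToReal.get(incomingProxyPort);
        if (real == null) {
            throw new IllegalArgumentException("unmapped proxy port " + incomingProxyPort);
        }
        return real;
    }
}
```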

The diagram below shows how requests from clients will be proxied from the LB to member A1.

How Elasticity/Autoscaling is Handled

Autoscaling is handled by the WSO2 Carbon autoscaling component. This component keeps track of the number of messages in flight to each Service cluster, and decides whether to scale the system up or down. The autoscaler component consists of an AutoscaleIn mediator, an AutoscaleOut mediator & a LoadAnalyzerTask, as shown in the diagram below. When a message is received, the AutoscaleIn mediator creates a unique token & puts it into a list. When the response to that message is sent, the AutoscaleOut mediator removes the token from the list. So, this list tracks the number of messages in flight for each backend Service. The LoadAnalyzerTask periodically checks the list lengths and, based on the configuration parameters, decides whether to scale a backend Service up or down.
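The token accounting done by the two mediators and the task can be sketched roughly like this. The class name, method names & thresholds are illustrative assumptions; a simple counter stands in for the token list, since only its length matters to the decision.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of in-flight message accounting per Service cluster.
class InFlightTracker {
    private final Map<String, AtomicInteger> inFlight = new ConcurrentHashMap<>();

    // AutoscaleIn mediator: a request entered the LB for this Service cluster
    void messageIn(String serviceDomain) {
        inFlight.computeIfAbsent(serviceDomain, d -> new AtomicInteger()).incrementAndGet();
    }

    // AutoscaleOut mediator: the response for that request was sent back
    void messageOut(String serviceDomain) {
        AtomicInteger count = inFlight.get(serviceDomain);
        if (count != null) count.decrementAndGet();
    }

    // LoadAnalyzerTask: compare the in-flight count against configured bounds
    String decide(String serviceDomain, int scaleUpAbove, int scaleDownBelow) {
        int load = inFlight.getOrDefault(serviceDomain, new AtomicInteger()).get();
        if (load > scaleUpAbove) return "scale-up";
        if (load < scaleDownBelow) return "scale-down";
        return "steady";
    }
}
```

The real LoadAnalyzerTask is more conservative than this sketch; for instance, it looks at load over time rather than a single snapshot before terminating instances.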

This component will start new Service member instances, and once those members successfully boot up, they will join the relevant Service cluster. The load balancer will then start forwarding requests to the new members as well.

Membership Management using Apache Tribes

As shown in the above diagram, the WSO2 Elastic Load Balancer is a special Tribes member, which can sense membership changes in the groups it is managing. Each group has its own dedicated membership channel, and the LB can connect to all of these channels. The other Service members can only see membership changes in their respective channels. In the above diagram, members in Group A (A1, A2, A3, ..., An) can sense membership changes in Membership Channel A, but cannot see such changes in Membership Channel B. However, the LB can see membership changes in Channels A, B & LB.

Deployment in a Data Center

If your network allows multicasting, you can use the multicast-based membership scheme. In this case, membership is handled using multicast, so a multicast socket needs to be configured.

Deployment on the Cloud

In a Cloud deployment, you would typically use the well-known address (WKA) based membership scheme, since multicasting is disabled in such environments. In such a setup, one or more LBs are made the well-known members. All nodes in the Service clusters SA, SB & SX will see the LB as a well-known member, and membership will revolve around the well-known load balancer.
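As an illustration, WKA membership is configured in the clustering section of a node's axis2.xml along these lines. This is a hedged sketch: the exact clustering class name and parameter set vary by Axis2/Carbon version, and lb.example.com, the domain name & the ports are placeholders.

```xml
<clustering class="org.apache.axis2.clustering.tribes.TribesClusteringAgent"
            enable="true">
    <!-- use well-known address (WKA) membership instead of multicast -->
    <parameter name="membershipScheme">wka</parameter>
    <parameter name="domain">wso2.as.domain</parameter>
    <parameter name="localMemberPort">4000</parameter>
    <!-- every node points at the LB(s) as the well-known member(s) -->
    <members>
        <member>
            <hostName>lb.example.com</hostName>
            <port>4000</port>
        </member>
    </members>
</clustering>
```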

WSO2 Elastic Load Balancer on StratosLive

The above diagram shows how we have deployed the WSO2 Elastic Load Balancers in StratosLive. The diagram shows how a single LB instance can front several Cloud Service clusters. For example, the Manager, Business Rules Server (BRS) & Business Process Server (BPS) Cloud Services are fronted by a single load balancer, and the Governance Registry (G-Reg), Mashup Server (MS) & Gadget Server (GS) Cloud Services are fronted by another. Some other services, such as the Identity Server, Application Server, Business Activity Monitor (BAM) & Data Services Server (DSS), have dedicated load balancers, since these are popular Services with heavy incoming traffic. As you may note, some nodes are marked as permanent nodes. This means that these nodes are not supposed to be shut down if the autoscaler makes a scale-down decision. Special flags mark these nodes as permanent, and the autoscaler will check this flag before deciding to terminate an instance. Only the elastic nodes can be terminated when a scale-down decision is made.
