I am setting up a redundant web server solution and researching the options. This article is a sounding board for my ideas so far: I want to solicit comments and suggestions, and to share what I have learned.
I have looked at several existing products, such as LVS (Linux Virtual Server) and Ultramonkey, and none is quite what I need. As far as I can tell, neither of them (nor the others I looked at) maintains the state information my application needs, so they cannot keep session information straight. They are great load balancers, but the content must either be static or live somewhere else, and that won't work in my situation.
The task is to take a dynamic web application that is designed to run standalone on a single server (web server, app server, code and database), duplicate it on many servers AND provide load balancing. In other words: multiple redundant, self-contained servers with a load balancer in front.
The web application has purposely been designed to be self-contained, to make setup quick and painless and to ensure that configuration and data are consistent everywhere. All the machines that actually run the app are small, self-contained app servers (database included) with the SAME DATA AT ALL TIMES. (Each web, app and db server is a Linux machine with Apache, Tomcat and PostgreSQL, running a Java servlet based application.)
This is tremendously advantageous: setup and maintenance of the app servers is very easy, and scaling is a snap (just add another self-contained web, app and db server box). The disadvantages are that the data needs to be mirrored, and that session information lives only on the machine where the session began. It is certainly debatable whether the database server should go somewhere else, but then the problem of making the database server redundant STILL EXISTS. Keeping track of which server holds which session is the age-old load balancing problem with dynamic web content, and many products address it in many different ways. Basically, once a session comes in from an IP address, the load balancer needs to keep sending that IP address to the server it started the session with, regardless of load.
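To make the "keep sending that IP address to the same server" idea concrete, here is a minimal sketch in Java. It is not taken from LVS, Ultramonkey or any other product; the class name `StickyBalancer`, its methods and the backend names are all hypothetical, and it only decides where a request should go rather than forwarding any traffic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: pins each client IP to the backend that served its first request. */
class StickyBalancer {
    private final List<String> backends;                          // servers currently in rotation
    private final Map<String, String> affinity = new HashMap<>(); // client IP -> pinned backend
    private int next = 0;                                         // round-robin cursor

    StickyBalancer(List<String> backends) {
        if (backends.isEmpty()) {
            throw new IllegalArgumentException("need at least one backend");
        }
        this.backends = new ArrayList<>(backends);
    }

    /** Returns the backend for this client, assigning one round-robin on first contact. */
    synchronized String route(String clientIp) {
        String pinned = affinity.get(clientIp);
        if (pinned != null && backends.contains(pinned)) {
            return pinned; // existing session: ignore load, keep the same server
        }
        String chosen = backends.get(next % backends.size());
        next = (next + 1) % backends.size();
        affinity.put(clientIp, chosen);
        return chosen;
    }

    /** Takes a failed backend out of rotation; its clients are re-pinned on their next request. */
    synchronized void markDown(String backend) {
        if (backends.size() > 1) {
            backends.remove(backend);
        }
    }
}
```

Note that when a backend is marked down, its clients are moved to another server but their session data does not move with them, because in this design the session lives only on the machine where it started.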
More on this will follow as the research progresses. Currently I plan to use a journaled, network-based filesystem to mirror the data across the servers, and to implement "heartbeat" based failover so that if one server fails the next can take over. At this point I don't have a solution for load balancing while maintaining sessions, but that's not a huge issue because each standalone machine can handle the load on its own. Fault tolerance is more important right now; load balancing will become important in the future. At some point, therefore, I will need a custom load balancer that can maintain the "hook" from the outside IP address of a request to a specific server for the duration of the session.
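The heartbeat idea can be sketched roughly as follows. This is only an illustration of the concept, not the Linux-HA heartbeat software or its configuration; `HeartbeatMonitor`, its methods, the node names and the timeout value are hypothetical. Each server reports in periodically, and whichever server is first in a priority list and has reported recently enough is treated as the active one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: tracks the last heartbeat per node and picks the active node. */
class HeartbeatMonitor {
    private final long timeoutMillis;                          // silence longer than this means "dead"
    private final Map<String, Long> lastSeen = new HashMap<>(); // node -> time of last heartbeat

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records a heartbeat from a node at the given time (milliseconds). */
    synchronized void beat(String node, long nowMillis) {
        lastSeen.put(node, nowMillis);
    }

    /** A node is alive if it has sent a heartbeat within the timeout window. */
    synchronized boolean isAlive(String node, long nowMillis) {
        Long seen = lastSeen.get(node);
        return seen != null && nowMillis - seen <= timeoutMillis;
    }

    /** Returns the first live node in priority order, or null if every node is silent. */
    synchronized String activeNode(List<String> priority, long nowMillis) {
        for (String node : priority) {
            if (isAlive(node, nowMillis)) {
                return node;
            }
        }
        return null;
    }
}
```

In a real setup the time would come from something like `System.currentTimeMillis()` and the heartbeats would arrive over the network; passing the time in explicitly just keeps the sketch easy to reason about.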
This is a work in progress, and I will post more info here as I have it. If anyone has ideas or suggestions, please feel free to comment. That's the idea of this post (I am nowhere near done with the design). Thanks! See also: http://linux-ha.org/