
18.3 Scalability

For a large, complex application, there are many reasons to move to a model that includes Enterprise JavaBeans components. But, contrary to popular belief, scalability and performance should not be the only deciding factors. There are many ways to develop scalable applications using just JSP or the servlet/JSP combination, often with better performance than an EJB-based application, because the communication overhead between the web tier and the EJB tier is avoided.

Scalability means that an application can deal with more and more users by changing the hardware configuration rather than the application itself. Typically this means, among other things, that it's partitioned into pieces that can run on separate servers. Most servlet- and JSP-based applications use a database to handle persistent data, so the database is one independent piece. They also use a mixture of static and dynamically generated content. Static content, such as images and regular HTML pages, is handled by a web server, while dynamic content is generated by the servlets and JSP pages running within a web container. So without even trying, we have three different pieces that can be deployed separately.

Initially, you can run all three pieces on the same server. However, both the web container and the database use a lot of memory. The web container needs memory to load all servlet and JSP classes, session data, and shared application information. The database server needs memory to work efficiently with prepared statements, cached indexes, statistics used for query optimization, etc. The server requirements for these two pieces are also different; for instance, the web server must be able to cope with a large number of network connections, and the database server needs fast disk access. Therefore, the first step in scaling a web application is typically to use one server for the web server and servlet container, and another for the database.

If this isn't enough, you can distribute the client requests over a set of servers processing HTTP requests. There are two common models: distributing the requests only for dynamic content (servlet and JSP requests) or distributing requests for all kinds of content.

If the web server is able to keep up with the requests for static content but not with the servlet and JSP requests, you can spread the dynamic content processing over multiple web containers on separate servers, as shown in Figure 18-5. Web containers that offer load-balancing modules for the major web servers include Apache's Tomcat (http://jakarta.apache.org/tomcat/), BEA's WebLogic (http://www.bea.com/), Caucho Technology's Resin (http://www.caucho.com/), and New Atlanta's ServletExec (http://www.newatlanta.com/).

Figure 18-5. Web server distributing load over multiple web containers

The tricky part when distributing dynamic content requests over multiple servers is ensuring that session data is handled appropriately. Most containers keep session data only in memory. In this case, the load-balancing module picks the server with the lowest load to serve the first request from a client. If a session is created by this request, all subsequent requests within the same session are sent to the same server. Alternatively, a container can save session data on disk or in a database. It can then freely distribute each request over all servers in the cluster and can also offer failure recovery in case a server crashes. A container is allowed to move a session from one server to another only for applications marked as distributable, as described in the next section. You can find out which model a certain product uses by looking at the vendor's web site and documentation. Pick one that satisfies your requirements as well as your wallet.
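The in-memory model described above can be sketched in a few lines of plain Java. This is only an illustration, not how any of the products mentioned implements its module: the class name StickyBalancer and its methods are hypothetical, and "load" is assumed to mean the number of sessions currently pinned to a server.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical load-balancing module: least-loaded pick first, then session affinity. */
class StickyBalancer {
    private final int[] activeSessions;   // assumed load measure: sessions pinned to each server
    private final Map<String, Integer> sessionToServer = new HashMap<>();

    StickyBalancer(int serverCount) {
        activeSessions = new int[serverCount];
    }

    /**
     * Returns the index of the server that should handle a request.
     * sessionId is null when the client has not joined a session yet.
     */
    synchronized int route(String sessionId) {
        if (sessionId != null) {
            Integer pinned = sessionToServer.get(sessionId);
            if (pinned != null) {
                return pinned;            // session data lives only on this server
            }
        }
        return leastLoaded();             // no session yet: any server will do
    }

    /** Called when the chosen server reports that the request created a session. */
    synchronized void sessionCreated(String sessionId, int server) {
        if (sessionToServer.putIfAbsent(sessionId, server) == null) {
            activeSessions[server]++;
        }
    }

    /** Called when a session is invalidated or times out. */
    synchronized void sessionEnded(String sessionId) {
        Integer server = sessionToServer.remove(sessionId);
        if (server != null) {
            activeSessions[server]--;
        }
    }

    private int leastLoaded() {
        int best = 0;
        for (int i = 1; i < activeSessions.length; i++) {
            if (activeSessions[i] < activeSessions[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

A container that stores sessions on disk or in a database would not need the sessionToServer map at all, since any server could load the session data for any request.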

For a high-traffic site, you may need to distribute requests for both static and dynamic content over multiple servers, as illustrated in Figure 18-6. You can then place a load-balancing server in front of a set of servers, each running a web server and a web container. As with the previous configuration, session data must be considered when selecting a server for the request. The easiest way to deal with it is to use a load-balancing product that sends all requests from the same client to the same server. This is not ideal, though, since all clients behind the same proxy or firewall appear as the same host. Some load-balancing products try to solve this problem by using cookies or SSL sessions to identify individual clients behind proxies and firewalls.

In this configuration, you get the best performance from a web server that runs a web container in the same process, eliminating the process-to-process communication between the web server and the web container. Most of the web containers mentioned here can be used in-process with all the major web servers. An alternative for this configuration is a pure Java server that acts as both a web server and a web container. Examples are Apache's Tomcat, Ironflare AB's Orion Application Server (http://www.orionserver.com/), and Gefion Software's LiteWebServer (http://www.gefionsoftware.com/LiteWebServer/). Compared to adding a web container to a standard web server, this all-in-one alternative is easier to configure and maintain. The traditional servers written in C or C++ may still be faster for serving static content, but as Java runtimes get faster, pure Java servers come very close.
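To make the proxy problem concrete, here is a minimal sketch of client affinity based on hashing a per-client key. It assumes a balancer that maps the key onto a fixed number of servers; the class AffinityRouter, its method names, and the idea of a balancer-issued cookie value are hypothetical and not taken from any particular product.

```java
/** Hypothetical affinity rule: hash a per-client key onto one of N servers. */
class AffinityRouter {
    private final int serverCount;

    AffinityRouter(int serverCount) {
        this.serverCount = serverCount;
    }

    /** The same key always maps to the same server index. */
    int serverFor(String affinityKey) {
        // floorMod keeps the index non-negative even for negative hash codes
        return Math.floorMod(affinityKey.hashCode(), serverCount);
    }

    /**
     * Prefers a cookie value that identifies the individual client and
     * falls back to the remote address when no cookie is present.
     */
    int serverFor(String remoteAddress, String cookieValue) {
        String key = (cookieValue != null) ? cookieValue : remoteAddress;
        return serverFor(key);
    }
}
```

With only the remote address as the key, every client behind one proxy shares a key and therefore a server. With a per-client cookie value, clients behind the same proxy have different keys and can be spread over the servers.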

Figure 18-6. Load balancing server distributing requests over multiple servers with a web server and container

You shouldn't rely on configuration strategies alone to handle the scalability needs of your application. The application must also be designed for scalability, using all the traditional tricks of the trade. Finally, you must load-test your application with the configuration you will deploy it on to make sure it can handle the expected load. There are many pure Java performance testing tools to choose from, ranging from Apache's simple but powerful JMeter (http://jakarta.apache.org/jmeter/index.html) to sophisticated tools such as Minq Software's PureLoad (http://www.minq.se/products/pureload/), which supports data-driven, session-aware tests that can be executed on a cluster of test machines.

18.3.1 Preparing for Distributed Deployment

As I described in the previous section, some web containers can distribute the requests for a web application's resources over multiple servers, each server running its own Java Virtual Machine (JVM). Of course, this has implications for how you develop your application. For that reason, a web container must by default use only one JVM for an application.

If you want to take advantage of web-container controlled load balancing, you must do two things: mark the application as distributable and follow the rules for a distributed application defined by the servlet specification.

Marking an application as distributable means adding a <distributable/> element to the deployment descriptor for the application:

<web-app>
  <description>A distributable application</description>
  
  <distributable/>
  
  <context-param>
    ...
  
</web-app>

By doing so, you're telling the web container that your application adheres to the rules for distributed applications. According to the servlet specification, a distributed application must be able to work within the following constraints:

  • Each JVM has its own unique servlet instance for each servlet declaration. If a servlet implements the javax.servlet.SingleThreadModel interface, each JVM may maintain multiple instances of the servlet class.

  • Each JVM has its own unique javax.servlet.ServletContext instance. Objects placed in the context are not distributed between JVMs.

  • Each JVM has its own unique listener class instances. Event notification is not propagated to other JVMs.

  • Each object stored in the session must be serializable (must implement the java.io.Serializable interface).

This means you cannot rely on instance variables to keep data shared by all requests for a certain servlet; each JVM has its own instance of the servlet class. For the same reason, be careful with how you use application scope objects (ServletContext attributes); each JVM has its own context, with its own set of objects. In most cases, this is not a problem. For instance, if you use the application scope to provide shared access to cached read-only data, it just means you may have copies of the cached data in each JVM. If you really need access to the same instance of some data between JVMs, you must share it through an external mechanism, such as a database, a file in a filesystem available to all servers, or an EJB component.
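The difference between per-JVM state and externally shared state can be illustrated with a small simulation that needs no servlet container. Everything here is a stand-in: HitCounter plays the role of one servlet instance, SharedCounterStore is a hypothetical interface for an external mechanism such as a database, and InMemoryStore exists only so that the sketch runs on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for an external mechanism such as a database row. */
interface SharedCounterStore {
    int incrementAndGet();
}

/** In-memory stand-in, used only so the sketch runs without a database. */
class InMemoryStore implements SharedCounterStore {
    private final AtomicInteger value = new AtomicInteger();

    public int incrementAndGet() {
        return value.incrementAndGet();
    }
}

/** Models one servlet instance; in a distributed deployment each JVM has its own. */
class HitCounter {
    private int localHits;                    // instance variable: one copy per JVM
    private final SharedCounterStore store;   // external resource seen by every JVM

    HitCounter(SharedCounterStore store) {
        this.store = store;
    }

    /** Records one request and returns {local count, shared count}. */
    synchronized int[] recordHit() {
        localHits++;
        return new int[] { localHits, store.incrementAndGet() };
    }
}
```

If two HitCounter objects that share one store each record a single hit, both local counts are 1 while the shared count reaches 2. Only the value kept in the external store reflects all requests.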

The most interesting part about distributed applications is how sessions are handled. The web container allows only one server at a time to handle a request that's part of a session, but since all objects put into the session must be serializable, the container can save them on disk or in a database as well as in memory. If the server that handles a session gets overloaded or crashes, the container can therefore move the responsibility for the session to another server. The new server simply loads all serialized session data and picks up where the previous server left off. This means that an object may be placed in the session in one JVM but actually used in another.
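What a session migration means for your own classes can be tried out with standard Java serialization. The sketch below is an approximation under stated assumptions: ShoppingCart is a hypothetical session attribute, and SessionMigration.roundTrip() only imitates what a container might do by writing the object to a byte array and reading it back. How a real container stores and transfers sessions is product specific.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical session attribute; Serializable so a container can move it. */
class ShoppingCart implements Serializable {
    private static final long serialVersionUID = 1L;

    private final List<String> items = new ArrayList<>();
    // transient fields are not serialized, so this cache is rebuilt after a move
    private transient Integer cachedCount;

    void add(String item) {
        items.add(item);
        cachedCount = null;
    }

    int count() {
        if (cachedCount == null) {
            cachedCount = items.size();
        }
        return cachedCount;
    }

    List<String> items() {
        return items;
    }
}

class SessionMigration {
    /** Serializes a value and reads it back, as if it were loaded in another JVM. */
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Running every class you intend to store in the session through a round trip like this during testing is a cheap way to catch a non-serializable field before the application is deployed on a cluster. The copy is a different object from the original, so code must not depend on object identity surviving a migration.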

Listeners (described in Chapter 19) are also unique per JVM, and events are sent only to the local listeners. Since a session may migrate to another JVM, this means that a session lifecycle listener in one JVM may be notified about the start of the session, while a listener in another JVM gets the end-of-session notification.
