Java Scalability 101: Volume 1, Web Servers
Note: this is the first in a series of articles on achieving better application response and scalability for small businesses, without having to spend an enormous amount of money on hardware.
Many application developers eschew a dedicated web server in favor of the built-in HTTP server that ships with most popular application servers (Tomcat, JBoss, Resin, BEA, etc). While this is acceptable for a low-traffic website, it simply can’t hold up against sustained heavy traffic or large traffic spikes. When your application becomes popular overnight, and you wake up to a flood of web traffic overloading your application server, it can be incredibly frustrating to see your application performing so poorly while your physical hardware sits underutilized.
One well-known bottleneck at the application server is simply due to HTTP keepalives. One client will, on average, tie up 2-4 threads of your application server for as long as your keepalive timeout (normally 10-30 seconds). With the average app server able to handle 255 threads maximum, it only takes roughly 60-130 simultaneous end users (255 threads divided by 4 or by 2 threads per client) before your application grinds to a halt. Even reducing the keepalive timeout on your app server (if this config change is even possible) will only lessen the impact of browsers holding connections open.
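The arithmetic behind that estimate can be sketched in a few lines. This is a back-of-the-envelope model only: the function name is hypothetical, and it assumes every client holds its threads for the full keepalive timeout, which is the worst case described above.

```python
def users_until_exhaustion(max_threads: int, threads_per_client: int) -> int:
    """Clients that fit before every app server thread is held by a keepalive connection.

    Worst-case assumption: each client keeps all of its threads busy for the
    whole keepalive timeout, so no thread is freed early.
    """
    if max_threads < 0 or threads_per_client <= 0:
        raise ValueError("max_threads must be >= 0 and threads_per_client must be > 0")
    return max_threads // threads_per_client


if __name__ == "__main__":
    # 255 threads and 2-4 threads per client are the figures quoted in the text.
    for threads_per_client in (2, 3, 4):
        users = users_until_exhaustion(255, threads_per_client)
        print(f"{threads_per_client} threads per client: pool exhausted at about {users} users")
```

Real traffic is burstier than this model, so treat the result as an order of magnitude rather than a hard limit.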
Enter the web server. One well-tuned Apache configuration will handle thousands of simultaneous end users, and will saturate your bandwidth long before it overloads your memory, disk I/O, or application server threads. Simply adding the web server can make an enormous difference; however, Apache has its greatest impact when it serves the static content for your application, leaving your application server to handle only your Java-enabled pages (JSP, Struts, JSF, etc).
First, the web server compilation and initial configuration. We recommend the Apache 2.0 branch over the 1.3 or 2.2 branches, for two reasons. One, the 2.0 branch allows for a hybrid multi-threaded, multi-process model (the “worker” MPM); this is an incredible resource savings over the 1.3 model of one process per request. One worker MPM child process can serve hundreds of simultaneous requests. Two, the 2.2 line is not yet completely compatible with all application server connectors, so 2.0 works well for us.
We configure Apache with as many modules as possible built as shared modules, so that removing a module from the configuration results in a lower memory footprint.
./configure --prefix=/usr/local/apache2 \
  --enable-mods-shared="all" \
  --enable-ssl=static --enable-so \
  --with-mpm=worker --with-ssl \
  --enable-deflate=shared
This tells the configuration script to enable every module as a shared library except for SSL (which must be statically linked). It also tells Apache to use the MPM worker module.
Once the server is compiled and installed, move on to the initial configuration: go through your httpd.conf and strip out all unnecessary LoadModule lines (to conserve memory). Also, pay attention to this section:
<IfModule worker.c>
StartServers 2
MaxClients 1000
MinSpareThreads 25
MaxSpareThreads 75
ThreadLimit 100
ThreadsPerChild 100
MaxRequestsPerChild 100000
</IfModule>
With this simple configuration, your web server will be able to handle 1000 simultaneous clients. Raise MaxClients as necessary once you run into traffic issues, and Apache will take care of scaling up on its own. Setting MaxRequestsPerChild to a non-zero value limits the damage from memory leaks, as the worker MPM child processes are periodically killed off, returning their memory to the OS.
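One caveat when raising MaxClients: with the worker MPM, Apache needs MaxClients divided by ThreadsPerChild child processes, and that count is capped by ServerLimit (16 by default), while ThreadsPerChild is capped by ThreadLimit. The sketch below checks a set of values against those two rules. The function names are hypothetical, and it only covers these two constraints, not everything Apache validates at startup.

```python
DEFAULT_SERVER_LIMIT = 16  # worker MPM default when ServerLimit is not set


def processes_needed(max_clients: int, threads_per_child: int) -> int:
    """Child processes required to provide max_clients threads, rounded up."""
    return -(-max_clients // threads_per_child)


def check_worker_settings(max_clients: int, threads_per_child: int,
                          thread_limit: int,
                          server_limit: int = DEFAULT_SERVER_LIMIT) -> list:
    """Return a list of human-readable problems; an empty list means the values fit."""
    problems = []
    if threads_per_child > thread_limit:
        problems.append(
            f"ThreadsPerChild {threads_per_child} exceeds ThreadLimit {thread_limit}")
    needed = processes_needed(max_clients, threads_per_child)
    if needed > server_limit:
        problems.append(
            f"MaxClients {max_clients} needs {needed} processes, "
            f"but ServerLimit is {server_limit}")
    return problems


if __name__ == "__main__":
    # Values from the configuration above: 1000 / 100 = 10 processes, within 16.
    print(check_worker_settings(1000, 100, 100))
    # Doubling MaxClients alone needs 20 processes, so ServerLimit must be raised too.
    print(check_worker_settings(2000, 100, 100))
```

In other words, the configuration shown has headroom up to 1600 clients; beyond that, add a ServerLimit directive or raise ThreadLimit and ThreadsPerChild together.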
Each application server has its own connector for Apache, usually in the form of a shared module. These also have their own configuration files. In the case of Tomcat, a simple workers.properties file will suffice, with the following (your locations may vary):
workers.tomcat_home=/usr/local/tomcat
workers.java_home=/usr/java/jdk1.5.0_12
worker.list=local,status
worker.local.port=8009
worker.local.host=localhost
worker.local.type=ajp13
worker.status.type=status
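mod_jk only starts the workers named in worker.list (to our understanding, the list defaults to the single name ajp13 when the property is absent), so a worker that is configured but not listed will never receive requests. A small check along these lines can catch that before a restart. The helper names are hypothetical, and the parser handles only simple key=value lines, not the full properties syntax.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def unlisted_workers(props: dict) -> list:
    """Names that have a worker.<name>.type entry but are missing from worker.list."""
    # Assumption: mod_jk falls back to "ajp13" when worker.list is not set.
    listed = {n.strip() for n in props.get("worker.list", "ajp13").split(",") if n.strip()}
    defined = set()
    for key in props:
        parts = key.split(".")
        if len(parts) == 3 and parts[0] == "worker" and parts[2] == "type":
            defined.add(parts[1])
    return sorted(defined - listed)


if __name__ == "__main__":
    sample = """
    worker.list=local
    worker.local.port=8009
    worker.local.host=localhost
    worker.local.type=ajp13
    worker.status.type=status
    """
    # The status worker is defined but not listed, so it is reported.
    print(unlisted_workers(parse_properties(sample)))
```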
Lastly, inside your VirtualHost entry for your host, add:
JkMount /*.jsp local
JkMount /status status
This will send all requests ending in .jsp to your application server; obviously, if you are running Struts or similar, this will need to change to your page extension or servlet path. Some developers choose to send everything to the app server except for certain file types; in that case, JkMount /* local followed by JkUnmount /*.jpg local would have only JPEG files served by Apache.
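To see how the two mounting styles split traffic, here is a rough simulation. It is not mod_jk's actual matching algorithm: it approximates the wildcard rules with Python's fnmatch and takes the first matching mount, and the route function and sample paths are made up for illustration.

```python
from fnmatch import fnmatchcase


def route(path, mounts, unmounts=()):
    """Return the worker name for path, or None when Apache serves it directly.

    mounts and unmounts are sequences of (pattern, worker) pairs, mirroring
    the arguments of JkMount and JkUnmount.
    """
    for pattern, worker in mounts:
        if not fnmatchcase(path, pattern):
            continue
        # An unmount for the same worker cancels the mount for this path.
        excluded = any(w == worker and fnmatchcase(path, p) for p, w in unmounts)
        if not excluded:
            return worker
    return None


# Style 1: forward only dynamic pages and the status page.
SELECTIVE = [("/*.jsp", "local"), ("/status", "status")]

# Style 2: forward everything, then carve out static file types.
CATCH_ALL = [("/*", "local")]
CARVE_OUT = [("/*.jpg", "local")]

if __name__ == "__main__":
    for path in ("/index.jsp", "/images/logo.jpg", "/status", "/app/home.do"):
        print(path, route(path, SELECTIVE), route(path, CATCH_ALL, CARVE_OUT))
```

Note how /app/home.do falls through to Apache under the selective style but reaches the app server under the catch-all style, which is why the selective style needs an extra JkMount for each servlet path or extension you use.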
Hopefully this has shown that, for application server scalability, adding a web server in front of your app server is a good first step. Static content on the filesystem is Apache’s specialty, and is where you can get huge speed gains. Your application server threads can be saved for actually serving dynamic content, while the large pool of Apache threads serves content which doesn’t change from client to client.

I am curious whether you have tried any of the NIO-enabled solutions out there. There seems to be a trend away from web servers for application hosting on Java:
http://www.javalobby.org/java/forums/m92139905.html
Comment by James Law — September 24, 2007 @ 1:42 pm
Hi!
From the Tomcat docs: (http://tomcat.apache.org/tomcat-5.5-doc/connectors.html)
“The HTTP connector is setup by default with Tomcat, and is ready to use. This connector features the lowest latency and best overall performance.
[…]
When using a single server, the performance when using a native webserver in front of the Tomcat instance is most of the time significantly worse than a standalone Tomcat with its default HTTP connector, even if a large part of the web application is made of static files.”
Doesn’t that mean it is preferred to NOT use a “real” web server in front of Tomcat (with, say, mod_jk)? How many Tomcat instances are needed before a native web server starts to perform better, if I configure my Tomcat HTTP connectors according to the points you stated (meaning maxKeepAliveRequests="1" and so on)?
Thanks!
Comment by Michael P — September 24, 2007 @ 2:23 pm
Hi James,
We have not tried the NIO solutions as of yet; while NIO certainly looks promising from a throughput standpoint, we still enjoy the versatility and configuration that Apache allows. Many additional benefits that Apache has had years to refine are just now starting to be implemented in Tomcat/Resin/etc.
Michael: first of all, setting MaxKeepAliveRequests to 1 will put an incredible strain on your app server as it closes and reopens TCP/IP connections. MaxKeepAliveRequests should always be set based upon the content of your average page; if your average page has 10 or 15 elements, that should be the minimum value for MaxKeepAliveRequests. You want users to have an incredibly fast initial load (even with pipelining).
The documentation around the Tomcat mod_jk connector applies only for requests that get sent through to the application server. Additionally, the documentation doesn’t take into account the maximum amount of concurrent clients available using a webserver vs. strictly the application server. The slight performance hit of sending one page through the connector is more than made up for by the speed of the web server combined with its versatility.
Thanks for the comments! -Mark
Comment by Mark — September 24, 2007 @ 2:59 pm
Hi Mark!
Thanks for your answer!
My use case is a little bit special in that Tomcat is not running a “normal” web application but instead is used as the backend for a desktop application which performs requests over XML-RPC. Each of these requests is separate (except the login sequence) and is performed approx. every 2 minutes by each client. So I think maxKeepAliveRequests="1" is justified in this special case (for normal web applications, of course, you are right!).
Using NIO, sendfile and Comet-Servlets for large requests/responses I think there is no need for a native web server (well, maybe except for load balancing.)
Looking forward to your next article!
Michael
Comment by Michael P — September 24, 2007 @ 3:56 pm
Hi Mark,
With the proposed configuration for 1000 clients, how can we estimate system resources (CPU and memory) based on the number of simultaneous users?
Comment by Carlos — November 20, 2007 @ 6:55 am