Controlling SPAM-related load on an Exim 3 server

During the summer of 2003, we had a few weeks where skyjammer was plagued by a number of problems related to email. In most cases, these problems seemed to be related to the volume of spam that skyjammer needed to handle on a daily basis.

These various problems generally amounted to a denial-of-service for skyjammer. Often, skyjammer ended up bogged down for anywhere from five to ten minutes, a period during which it either didn't respond at all to network requests, or responded very slowly. During one event, skyjammer was so bogged down that it had not recovered even 45 minutes after the mail deluge began, and we had to power-cycle it to make it come back.

Initially, we thought this was because skyjammer was just getting too old to deal with the load that was being placed on it. However, discussions with some friends revealed that even newer servers (on the order of a Celeron 300) have similar problems. The solution was twofold. First, it's best to use spamd rather than invoking SpamAssassin directly from your .procmailrc. Second, if you run Exim 3 as a daemon, you get some control over how much load Exim can place on your system.

The spamd daemon, since it's always running, eliminates the need to instantiate the entire Perl runtime for every processed message. It reportedly makes SpamAssassin processing as much as four times faster when compared message-to-message, and even larger aggregate gains are possible when many incoming messages are being processed in parallel. As a rule, the improvement declines somewhat as messages get larger.
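
As a rough illustration of the difference, a script can hand each message to the long-running spamd through the lightweight spamc client instead of starting SpamAssassin itself. The sketch below is in Python rather than procmail. The function name filter_message, the timeout, and the fail-open behaviour are assumptions made for illustration, and it assumes spamc is installed and spamd is running on the same machine.

```python
import subprocess


def filter_message(message: bytes, command=("spamc",), timeout=60.0) -> bytes:
    """Pipe a raw message through a filter command and return the result.

    With the default command, spamc passes the message to the running
    spamd and writes the processed message to standard output.
    """
    try:
        result = subprocess.run(
            list(command),
            input=message,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Assumption: fail open, so mail is passed on unfiltered rather
        # than lost when the filter is missing, slow, or exits with an error.
        return message
    return result.stdout
```

The only process started per message is the small spamc client; the Perl runtime lives in spamd and is loaded once.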

Changing Exim to run as a daemon rather than via inetd does even more. Whereas inetd will fork a new Exim process for every SMTP connection on port 25, Exim as a daemon runs all of the time, listening on port 25, which gives much better control over things like how many concurrent messages or connections we're willing to accept. The actual changes to exim.conf are listed below, but basically, I set up new limits on load average, the number of concurrent connections, the number of concurrent connections from a single host, and the number of concurrent queue-running processes.

With these limits in place, we seem to have been able to avoid the debilitating outages that skyjammer experienced during 2003. Most of the time, skyjammer can keep up with the load. When it can't, the source SMTP server gets a "please come back later" response rather than a timeout, which is much better.
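
To make the interaction between the limits easier to follow, here is a simplified Python model of what happens to one incoming message. It is only a reading of the option descriptions in the configuration below, not Exim's actual code. The names Limits, over, disposition, and queue_runner_may_deliver are made up for this sketch, the exact boundary behaviour (strictly greater than) is an assumption, and the rule that later messages on the same connection stay queued once the load has been exceeded is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Thresholds named after the exim.conf options.

    Assumption of this sketch: a value of 0 means "no limit" for every
    option, including the two load options.
    """
    deliver_load_max: float = 10
    queue_only_load: float = 5
    smtp_accept_max: int = 15
    smtp_accept_max_per_host: int = 5
    smtp_accept_queue: int = 10
    smtp_accept_queue_per_connection: int = 20


def over(value, limit):
    """True when a limit is set (non-zero) and the value is above it."""
    return bool(limit) and value > limit


def disposition(limits, load, connections, from_host, messages):
    """Classify one incoming message as "reject", "queue", or "deliver".

    connections and from_host count open SMTP connections including this
    one; messages counts messages on this connection including this one.
    """
    if over(connections, limits.smtp_accept_max):
        return "reject"   # peer is told to try again later
    if over(from_host, limits.smtp_accept_max_per_host):
        return "reject"   # too many connections from one address
    if (over(load, limits.queue_only_load)
            or over(connections, limits.smtp_accept_queue)
            or over(messages, limits.smtp_accept_queue_per_connection)):
        return "queue"    # accepted, delivered later by a queue runner
    return "deliver"      # accepted and delivered immediately


def queue_runner_may_deliver(limits, load):
    """Queue runners deliver only while load is not above deliver_load_max."""
    return not over(load, limits.deliver_load_max)
```

With the values used on skyjammer, a message arriving at a load average of 6 is accepted but queued, and a sixteenth simultaneous connection is turned away instead of being left to time out.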

   ######################################################################
   #             LOAD-RELATED CONFIGURATION SETTINGS                    #
   ######################################################################

   # These settings have been added in an attempt to control SPAM-related
   # denial-of-service problems.  We hope that, when coupled with the use
   # of spamc rather than forked spamassassin processes, the load on the
   # box will be better moderated now than it has been in the past.
   #
   # Note that these settings only make sense if exim has been configured
   # to run as a daemon rather than directly from inetd (as is the default
   # on a Debian system).  To run exim as a daemon, remove the exim entry
   # from /etc/inetd.conf, and then start the daemon using the script in
   # /etc/init.d (which will detect that exim is no longer run via inetd).
   # Note that the init.d script does not seem to properly stop the daemon
   # once it's running.
   #
   # I believe that, if a load- or capacity-related threshold is exceeded,
   # either messages will be queued rather than delivered immediately, or
   # the SMTP peer will receive an error message something like "Too many
   # concurrent SMTP connections; please try again later."  Either is
   # better than bringing the box to its knees.

   # When this option is set, no message deliveries are ever done if the
   # system load average is greater than its value, except for deliveries
   # forced with the -M option. If deliver_queue_load_max is not set and
   # the load gets this high during a queue run, the run is abandoned.

   deliver_load_max = 10

   # If the system load average is higher than this value, all incoming
   # messages are queued, and no automatic deliveries are started. If this
   # happens during local or remote SMTP input, all subsequent messages
   # on the same connection are queued. Deliveries will subsequently be
   # performed by queue running processes, unless the load is higher than
   # deliver_load_max.

   queue_only_load = 5

   # This specifies the maximum number of simultaneous incoming SMTP calls
   # that Exim will accept. It applies only to the listening daemon;
   # there is no control (in Exim) when incoming SMTP is being handled by
   # inetd. If the value is set to zero, no limit is applied. However,
   # it is required to be non-zero if smtp_accept_max_per_host or
   # smtp_accept_queue is set.

   smtp_accept_max = 15

   # This option restricts the number of simultaneous IP connections
   # from a single host (strictly, from a single IP address) to the Exim
   # daemon. Once the limit is reached, additional connection attempts are
   # rejected with error code 421. The default value of zero imposes no
   # limit. If this option is not zero, it is required that smtp_accept_max
   # also be non-zero.

   smtp_accept_max_per_host = 5

   # If the number of simultaneous incoming SMTP calls handled via
   # the listening daemon exceeds this value, messages received are
   # simply placed on the queue, and no delivery processes are started
   # automatically. A value of zero implies no limit, and clearly any
   # non-zero value is useful only if it is less than the smtp_accept_max
   # value (unless that is zero).

   smtp_accept_queue = 10

   # This controls the maximum number of queue-running processes that
   # an Exim daemon will run simultaneously. This does not mean that it
   # starts them all at once, but rather that if the maximum number are
   # still running when the time comes to start another one, it refrains
   # from starting it. This can happen with very large queues and/or very
   # sluggish deliveries. This option does not, however, interlock with
   # other processes, so additional queue-runners can be started by other
   # means, or by killing and restarting the daemon.

   queue_run_max = 5

   # This sets the maximum number of messages that will be accepted in
   # one connection and immediately delivered. If one connection sends
   # more messages than this, any further ones are accepted and queued but
   # not delivered. The default is 10, which is probably enough for most
   # purposes, but is too low on dialup SMTP systems, which often have many
   # more mails queued for them when they connect.

   smtp_accept_queue_per_connection = 20
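
Several of the comments above describe dependencies between options: smtp_accept_max must be non-zero if smtp_accept_max_per_host or smtp_accept_queue is set, and smtp_accept_queue is only useful when it is less than smtp_accept_max. A small Python sketch of a sanity check for those two rules follows. The functions parse_settings and check_limits are hypothetical helpers, not part of Exim; the parser only understands simple "name = value" lines, and options that are absent are not checked because no default values are assumed.

```python
def parse_settings(text):
    """Collect simple 'name = value' lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip()
    return settings


def check_limits(settings):
    """Return warnings for the two dependencies described in the comments.

    Only options that are explicitly present are checked; defaults for
    missing options are deliberately not assumed.
    """
    def number(name):
        return int(settings[name]) if name in settings else None

    warnings = []
    accept_max = number("smtp_accept_max")
    per_host = number("smtp_accept_max_per_host")
    accept_queue = number("smtp_accept_queue")

    if accept_max == 0:
        dependents = (("smtp_accept_max_per_host", per_host),
                      ("smtp_accept_queue", accept_queue))
        for name, value in dependents:
            if value:
                warnings.append(
                    f"{name} is set, so smtp_accept_max must be non-zero")
    elif accept_max and accept_queue and accept_queue >= accept_max:
        warnings.append(
            "smtp_accept_queue is useful only if it is less than "
            "smtp_accept_max")
    return warnings
```

Run against the settings above, check_limits(parse_settings(text)) returns an empty list; setting smtp_accept_queue to 20 while smtp_accept_max stays at 15 produces one warning.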

EximLoad (last edited 2008-07-09 06:21:27 by localhost)