Friction as friend

To help guard against such possibilities, a ccr world places rate limiters, in addition to the message size limit, on every established network connection, to reduce the risk of accepting and acting on network communications, whatever they may be. The inbound bandwidth limiter specifies the maximum average rate, in bytes/second, at which one world is willing to read data from another world; the effective default is 2Kb/s, which is enough to allow most normal channel usage to flow unimpeded. If data arrives faster than 2Kb/s, the receiving world simply doesn't read it until enough time has passed that the overall bandwidth no longer exceeds the set limit.
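The average-rate rule described above can be sketched as follows. This is a minimal illustration, not ccr's actual implementation: the class and method names are hypothetical, time is passed in explicitly as seconds, and the 2Kb/s default is assumed here to mean 2048 bytes/second.

```python
class InboundBandwidthLimiter:
    """Caps the overall average read rate on one connection (hypothetical sketch)."""

    def __init__(self, max_bytes_per_sec=2048.0, start_time=0.0):
        # Assumption: "2Kb/s" is read as 2048 bytes per second.
        self.max_rate = max_bytes_per_sec
        self.start_time = start_time
        self.bytes_read = 0

    def next_read_time(self):
        """Earliest time at which the overall average is back within the limit."""
        return self.start_time + self.bytes_read / self.max_rate

    def may_read(self, now):
        """True if reading now keeps bytes_read / elapsed at or below max_rate."""
        return now >= self.next_read_time()

    def record_read(self, nbytes):
        """Log bytes actually read from the remote world."""
        self.bytes_read += nbytes


limiter = InboundBandwidthLimiter()
limiter.record_read(10240)            # the sender supplied a 10240-byte burst
print(limiter.next_read_time())       # the receiver reads nothing more until this time
```

Because the sketch averages over the whole connection lifetime, a long-idle connection accumulates credit; whether ccr bounds that credit is not stated in the text.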

In addition to the communication bandwidth control, a ccr world also maintains a processing bandwidth limiter for each world with which it is in contact. For two worlds to communicate meaningfully, data sent from one world must somehow affect the processing that occurs on the other world; therein also lies the risk. Although ccr uses several mechanisms to control what operations remote worlds can perform locally (for example, by controlling the language used to express the messages the receiving world will choose to read), here we focus only on the processing time control. Each operation a ccr world can perform has an associated cost in terms of ``work units''. Each computation request received from another world is tagged locally with the identity of the requesting world, and as processing proceeds, work units are logged against that remote world. As with the bandwidth limiter, if processing on behalf of a remote world exceeds a specified rate, then the local world delays accepting further input from that world until the overall processing rate drops within established limits.
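The per-world accounting can be sketched in the same style. The names below are hypothetical and the bookkeeping is deliberately simplified: work units are summed per requesting world and compared against an allowed average rate since first contact.

```python
class ProcessingLimiter:
    """Logs work units against remote worlds (hypothetical sketch, not ccr's code)."""

    def __init__(self, max_units_per_sec=20.0):
        self.max_rate = max_units_per_sec
        self.first_seen = {}   # world id -> time of first contact, in seconds
        self.units = {}        # world id -> total work units logged so far

    def register(self, world_id, now):
        """Start accounting for a remote world; repeated calls are harmless."""
        self.first_seen.setdefault(world_id, now)
        self.units.setdefault(world_id, 0.0)

    def charge(self, world_id, cost):
        """Log the cost of an operation performed on behalf of world_id."""
        self.units[world_id] += cost

    def accepts_input(self, world_id, now):
        """True while the world's average work rate is within the limit."""
        elapsed = now - self.first_seen[world_id]
        return self.units[world_id] <= self.max_rate * elapsed


limiter = ProcessingLimiter()
limiter.register("B", now=0.0)
limiter.charge("B", 100.0)                 # requests from B cost 100 work units
print(limiter.accepts_input("B", 4.0))     # still over the allowed average
print(limiter.accepts_input("B", 5.0))     # back within limits
```

Keeping a separate tally per world means that one world exceeding its allowance does not delay input from any other world.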

**Figure:** Using rate-limiters on bandwidth and processing time to mitigate denial-of-service events. *Case 1:* World `B' floods world `A'--with defenses disabled--starting at 00:00. *Case 2:* World `B' floods world `A'--with defenses in place--starting at 18:00. See text for details.

With a default of 20 work units/world/second, once again most normal inter-ccr operations are at most minimally impacted by the limiter. In pathological situations, however, the protection the limiters afford can be significant. Figure 2 illustrates their effect via two simulated denial-of-service attacks launched by World `B' against World `A'. World `A's data appears in the upper graph and World `B's in the lower graph; in both cases the solid lines represent memory usage and are measured in megabytes of growth (the left side y-axis labels), and the dotted lines represent the current main loop processing rate in cycles/second (the right side y-axis labels). A rate of 20CPS is the target `heartbeat' rate of a ccr world; it will `sleep' if it doesn't require the full 50ms to complete all scheduled tasks.
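The 50ms figure follows from the target rate: 1/20 of a second per cycle. A fixed-rate main loop of this kind might look like the sketch below, which assumes a simple list of task callables; the function name and its parameters are illustrative, not ccr's.

```python
import time


def run_heartbeat(tasks, cycles, target_cps=20.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Run `cycles` main-loop iterations, sleeping away any unused cycle time.

    Returns the time slept in each cycle. The clock and sleep functions
    are parameters so the loop can be driven by a simulated clock.
    """
    period = 1.0 / target_cps          # 0.05 s per cycle at 20 CPS
    slept = []
    for _ in range(cycles):
        started = clock()
        for task in tasks:
            task()                     # all work scheduled for this cycle
        remaining = period - (clock() - started)
        if remaining > 0:
            sleep(remaining)           # finished early: sleep out the cycle
        slept.append(max(remaining, 0.0))
    return slept
```

When the scheduled tasks take longer than one period, the loop never sleeps and the achieved rate falls below 20CPS, which is what a crash in the dotted cycle-rate line corresponds to.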

Before the first attack, which starts at zero minutes into the displayed Case 1 data, World `A's limiters were effectively disabled by setting the acceptable rates to extremely large values. `A' quickly grows by several megabytes and then stabilizes for several minutes; shortly before the 5:00 mark it begins growing exponentially and its cycle rate crashes. The simulated attack was stopped shortly after the 15:00 mark, at which point `A's size had exceeded 200Mb and it had nearly exhausted the swap space on its machine. Both worlds were then restarted and the attack was repeated, this time with the normal values for the limiters. In Case 2, `A's growth is slower and remains under 10Mb, and it sustains a steady 20CPS throughout the event: the controls are performing effectively.

The growth behavior displayed by World `B' through the two events was somewhat unexpected: it differed much less between the two cases than anticipated. The `attack' was performed by instructing a character in world `B' to `speak' 10Kb strings of random numbers approximately 50 times per second. In both cases `B's size gradually grows. In Case 2 `B' grows because `A' is deliberately delaying reading from `B', to protect itself, while `B' continues to speak, causing `B's `pending output' buffers to expand. In Case 1 the same effect occurs, but there it is because `A' is in severe distress from memory thrashing and processing overload, and its cycle rate has crashed, so there too it is not reading from `B' as frequently. In both cases `B's size eventually drops when the communication channel is closed by yet another watchdog within the system. That mechanism injects `Are you alive?' messages into communication streams at random intervals and times how long the response takes; if no response is received after several minutes the connection is killed.
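The watchdog's behavior can be sketched as a small state machine. Everything specific below is an assumption made for illustration: the class name, the 180-second timeout standing in for "several minutes", and the 10 to 60 second range for the random probe interval.

```python
import random


class LivenessWatchdog:
    """Probes a connection at random intervals (hypothetical sketch)."""

    def __init__(self, timeout=180.0, min_gap=10.0, max_gap=60.0,
                 rng=None, now=0.0):
        self.timeout = timeout            # assumed value for "several minutes"
        self.min_gap = min_gap            # assumed bounds on the random interval
        self.max_gap = max_gap
        self.rng = rng or random.Random()
        self.pending_since = None         # send time of the unanswered probe
        self.last_response_time = None    # how long the last reply took
        self.next_probe = now + self._gap()

    def _gap(self):
        return self.rng.uniform(self.min_gap, self.max_gap)

    def tick(self, now):
        """Return 'probe' to send `Are you alive?', 'kill' to drop, else None."""
        if self.pending_since is not None:
            if now - self.pending_since >= self.timeout:
                return "kill"
            return None
        if now >= self.next_probe:
            self.pending_since = now
            return "probe"
        return None

    def on_reply(self, now):
        """Record the response time and schedule the next random probe."""
        if self.pending_since is None:
            return
        self.last_response_time = now - self.pending_since
        self.pending_since = None
        self.next_probe = now + self._gap()
```

In both attack cases, `A' stops reading from `B', so a probe of this kind goes unanswered until the timeout expires and the connection is dropped.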