Optimizing Robots in RoboWar

RoboWar is a programming game where players write software ``robots'' in a programming language called RoboTalk. Then the robots bop around an arena shooting at each other while the player, who might more properly be called a ``coach'' than a ``player'', sits there biting his (or her) nails and shouting ``Go up! Go up! Shoot you blind fool! Oh no!'' Or words to that effect.

RoboTalk is a very low-level language for a stack-based abstract machine. It's probably a little different from the programming languages you already know. Nonetheless, the instant visual (and audible, if you care to record some sounds for your 'bot!) feedback from your programs makes it a relatively easy language to learn, possibly even if it is your first programming language. RoboWar comes with a short RoboTalk tutorial. Various RoboWarriors have also contributed tutorials and tips-and-tricks archives. The following are known to me. Please email me if you know of any others!

  • Paul Hansen's site contains both a tutorial and a large collection of very useful code scraps.
  • Austin Barton's tutorial. Covers much the same ground as the tutorial that comes with the game, but when you're learning something new two sources are at least twice as good as one.
  • John Abbott's optimization tips. Short, but good to know.
  • Bryant Brandon's algorithm archive. More advanced stuff.
  • The Stilt Man's very helpful and extremely influential Theory of Robot Design page. (This one is really more about RoboWar than about RoboTalk, but this list wouldn't be complete without it!)
  • Chuk's Robowar webpage. The site is brand new, but strategies and code bits are promised for the near future.
    This page is primarily a record of my own experiences in trying to squeeze performance out of RoboWar robots. I'm not sure who will find it useful. New RoboTalk programmers probably should not worry about this stuff yet, and experienced RoboTalk programmers have probably figured it out for themselves by now. Still, in the hope that something here will turn out to be useful to you, here it is. Also, even for beginners, knowing about some of these techniques may make it easier to understand the whys and wherefores of other people's robots.

    Many of these optimizations will only save you one instruction, so the point of performing them may not be obvious. And indeed ``Premature optimization is the root of all evil,'' and don't you ever forget it! But especially in RoboWar, where so many events occur at chronon boundaries, very slight changes in the location of those chronon ticks in your code may have very large effects on your robot's behavior, even with a 50 speed CPU. So at some time or other something here may actually be useful to you.

    Stuff to keep in mind

    The following things can be stored in RoboTalk registers:

    Integers (of course)
    Addresses
    Register Names

    In addition, integers and addresses, but not (as far as I know) register names, can be arguments of arithmetic operations. So the objects you're computing with are not just X or Y coordinates, angles, and distances!

    Last Call Optimization, Continuation Passing and Other Tricks with Addresses

    In this section we consider the fun you can have manipulating addresses. In the next section we consider (more briefly) storing registers in registers.

    The techniques discussed here are mostly based on the following equivalences. First,

       labelA call
       {continue}
        
    is (except for the number of instructions involved) equivalent to this:
       labelB labelA jump
    labelB:
       {continue}
        

    Also, the operation return is identical to the operation jump (you've probably already noticed that the debugger systematically translates your returns to jumps). The distinction is just a convenience for the programmer, a reminder that the address jumped to was put on the stack a while ago, rather than immediately before executing the jump.
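
    If it helps to see this executed, here is a toy Python model of the relevant stack discipline. It is emphatically a sketch, not the real RoboWar interpreter: labels push their addresses, integers push themselves, call pushes the address of the next instruction before jumping, and return is implemented by the very same code as jump.

```python
# Hypothetical mini stack machine (NOT the real RoboWar engine), just to
# model the call/jump/return equivalences discussed in this section.

def run(program, labels):
    """Execute a tiny RoboTalk-like program; return a log of 'log' ops."""
    stack, pc, log = [], 0, []
    while pc < len(program):
        op = program[pc]
        if op == "halt":
            break
        elif op in ("jump", "return"):   # return IS jump: pop address, go there
            pc = stack.pop()
        elif op == "call":               # like jump, but first push the
            target = stack.pop()         # address of the next instruction
            stack.append(pc + 1)
            pc = target
        elif op == "log":                # pop and record (for testing only)
            log.append(stack.pop())
            pc += 1
        elif op in labels:               # a label pushes its address
            stack.append(labels[op])
            pc += 1
        else:                            # an integer pushes itself
            stack.append(op)
            pc += 1
    return log

# `sub call` ...
with_call = ["sub", "call", 1, "log", "halt",
             # sub:
             99, "log", "return"]
# ...is equivalent to `labelB sub jump` / labelB:
with_jump = ["labelB", "sub", "jump", 1, "log", "halt",
             # sub:
             99, "log", "return"]

log1 = run(with_call, {"sub": 5})
log2 = run(with_jump, {"sub": 6, "labelB": 3})
```

    Both programs produce the same log; the only difference is where the continuation address comes from.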

    Last Call

    This optimization is also referred to as Tail Recursion Optimization, and is crucial to implementing high level languages that use a lot of recursion (like Lisp or Prolog). It is meant to allow a recursive procedure to run forever without overflowing the stack. But it can be applied to RoboTalk programs as well, with the more modest goal of saving an instruction here and there.
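
    Since RoboTalk has no native recursion syntax, here is the idea in Python instead. The rewrite is done by hand, because Python, like RoboTalk, performs no tail call optimization on its own; the function names are invented for the example.

```python
def countdown_recursive(n):
    """The last act is a call to itself: every level adds a stack frame,
    so a large n overflows the (Python) call stack."""
    if n == 0:
        return "done"
    return countdown_recursive(n - 1)   # the tail call

def countdown_looped(n):
    """The same logic with the tail call turned into a jump back to the
    top: constant stack depth, so it can run effectively forever."""
    while n != 0:
        n -= 1
    return "done"
```

    The looped version happily handles a million iterations; the recursive one dies long before that.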

    Consider the following situation: A robot is in a subroutine, and the last thing it does before returning from the subroutine is to call another subroutine, as in:

       labelB call
       return
        
    As noted, this is (almost) equivalent to the following code, where labelA refers to the address to which the above return operation will send us.
       labelC labelB jump
    labelC:
       {labelA} jump
        
    When we are done with the subroutine at labelB we will jump to labelC and then immediately to labelA.

    The same end result would be achieved a little faster if we simply jumped to labelA directly on reaching the end of the routine at labelB. And we know we can do that because we know that labelA is sitting on the stack already when we call labelB. It just gets covered up by the return address generated by labelB call. So we can replace the call by a jump, and delete the return from our current subroutine, as follows.

       {labelA} labelB jump # not call
    #   return
        
    That saved us the return instruction. Whoopee!

    Note that the subroutine at labelB has not been affected by this change! It remains a ``true'' subroutine, ending in a return, and therefore callable as a subroutine from any other point in the code.

    Continuation Passing

    Another related trick is the following. Consider once again the expanded version of a call operation.

       labelA labelB jump
    labelA:
       {continue}
        
    It may well happen that the code we want to execute after the routine at labelB is not in fact located right after the jump operation. Then we can use this ``expanded call'' format, but with some other label in place of labelA.
       labelC labelB jump
    labelA:
       {labelC is our continuation now, not labelA}
        

    This is equivalent to

       labelB call
       labelC jump
        
    but one instruction quicker. And two instructions quicker than
       labelB call
       labelC call
       return
        
    You see: these optimizations can add up.

    As with Last Call, the subroutine at labelB is not affected by this trick: it remains a separate, reusable code module.

    This trick can be expanded to string together a series of subroutines, as in the following:

       stop dodge_north back_off_east jump
        
    Again we assume that at least back_off_east and dodge_north are subroutines, i.e., they end with a return operation. But ``called'' in this way the return at the end of back_off_east will bounce us to dodge_north, and the return at the end of dodge_north will bounce us to stop. (N.b.: if stop ends with a return, then we need to supply it with a return address somehow!)

    Written in this way, we could just as easily elsewhere issue the sequence of instructions

       stop dodge_north back_off_west jump
        
    or
       stop dodge_south back_off_east jump
        
    or whatever. That is, the fact that the mainline routine is jumping instead of calling does not affect the reusability of the subroutines jumped to.

    Warning! This won't work if any of your subroutines executes a dropall. Of course, in such a case it is no longer a true reusable subroutine, but you may want to keep this warning in mind in any case.
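
    The bouncing can be modeled in Python by making the stack an explicit list (a sketch only; the routine names come from the example above, and ``halt'' stands in for whatever return address stop would need):

```python
def make_routine(name, log):
    """Build a stand-in subroutine: do its (pretend) work, then perform
    its final return, i.e. jump to whatever is on top of the stack."""
    def routine(stack):
        log.append(name)        # the routine's real work would go here
        return stack.pop()      # the return at the end of the routine
    return routine

log = []
routines = {name: make_routine(name, log)
            for name in ("stop", "dodge_north", "back_off_east")}

# stop dodge_north back_off_east jump
stack = ["halt", "stop", "dodge_north"]   # pushed first, consumed last
target = "back_off_east"                  # the jump target
while target != "halt":
    target = routines[target](stack)
```

    Each return pops the next routine's address, so the three routines run in the reverse of the order they were pushed.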

    It's a Loop. It's a Subroutine. It's Both!

    This is a trick you see a lot in RoboWar, which allows you to use the same code for both a one-shot subroutine, called and returned from, and also for an infinite loop, exited only on an interrupt. This is of course particularly useful with aim loops. Sometimes you may just want to make a single sweep of the arena, other times you may prefer to sit in the aim loop until something moves into view.

    somewhere:
       80 range' setparam single_sweep call  # anyone nearby?
       ...
    elsewhere:
       600 range' setparam aim_loop jump
       ...
    aim_loop:
       aim_loop
    single_sweep:
       {rotate the turret or check some look/scan angles}
       return
        
    If you jump to aim_loop, the return behaves just like another jump to aim_loop. But if you call single_sweep, the return behaves like a ``true'' return, taking you back to the instruction following the call.
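
    Here is the dual behavior modeled in Python (again just a sketch of the stack discipline, with invented names): single_sweep ends in a return, i.e., it jumps to whatever the stack offers, while aim_loop pushes its own address first and falls through.

```python
def single_sweep(stack, log):
    """The shared tail: do one sweep, then return, i.e. jump to
    whatever address is on top of the stack."""
    log.append("sweep")
    return stack.pop()

def aim_loop(stack, log):
    """Push our own address, then fall through into single_sweep:
    its return now brings control straight back here."""
    stack.append("aim_loop")
    return single_sweep(stack, log)

# single_sweep call: behaves as a true subroutine
stack1, log1 = ["return address"], []
target1 = single_sweep(stack1, log1)

# jump aim_loop: would loop until an interrupt; we cut it off at 3 passes
stack2, log2 = [], []
target2 = "aim_loop"
for _ in range(3):
    target2 = aim_loop(stack2, log2)
```

    Called, the sweep returns to the caller; jumped to via aim_loop, the very same return keeps sending control back to aim_loop.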

    It's worth considering how to unroll such a ``return to self'' loop. For example, suppose you are in a certain mode where you are primarily interested in covering ground for a fixed number of chronons (let's say 5 chronons, for concreteness). You could just write sync sync sync sync sync, but it's nearly a capital crime to waste all those chronons. You could at least be continuing to track your opponent during that time, for example. So suppose you have a routine called trackWhileMoving, which is designed to fill one chronon. You could replace that string of sync's with something like

    move_for_five_chronons:
       label_A
       trackWhileMoving
       trackWhileMoving
       trackWhileMoving
       trackWhileMoving
       trackWhileMoving
       jump
    label_A:
       {...}
        

    On the assumption that trackWhileMoving ends with a return (i.e., a jump with the target address determined by the state of the stack at run time), this code will jump to trackWhileMoving, do the tracking thing, and then, when it hits the return, ``return'' to its own beginning. Each pass through the loop consumes one of the trackWhileMoving addresses remaining on the stack until, after the fifth pass, the return finally pops label_A and sends us there.
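
    The consumption of the stacked addresses can be watched in a few lines of Python (a sketch; track_while_moving is a stand-in for the real routine):

```python
def track_while_moving(log):
    """Stand-in for one chronon's worth of tracking work."""
    log.append("track")

log = []
# label_A, then five copies of trackWhileMoving's address, pushed in order
stack = ["label_A"] + ["trackWhileMoving"] * 5
target = stack.pop()                 # the jump at the end of the pushes
while target == "trackWhileMoving":
    track_while_moving(log)
    target = stack.pop()             # the return at the end of the routine
```

    Five addresses, five passes; the sixth pop finds label_A.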

    This technique is really an instance of continuation passing, derived from underlying code like the following.

    move_for_five_chronons:
       trackWhileMoving call
       trackWhileMoving call
       trackWhileMoving call
       trackWhileMoving call
       trackWhileMoving call
    label_A:
       {...}
        
    But with 3 fewer instructions.

    Finally, we can vary the time we consume using tricks like the following.

    somewhere:
       move_for_five_chronons call
    label_A:
       {...}
    
    elsewhere:
       move_for_ten_chronons call
    label_B:
       {...}
    
    move_for_ten_chronons:
       trackWhileMoving 
       trackWhileMoving 
       trackWhileMoving 
       trackWhileMoving 
       trackWhileMoving 
    move_for_five_chronons:
       trackWhileMoving 
       trackWhileMoving 
       trackWhileMoving 
       trackWhileMoving 
       trackWhileMoving 
    move_base:
       jump
        

    Observe that calling move_for_X_chronons puts the return address (either label_A or label_B) on the stack in exactly the right place.

    The reader who has read Bryant Brandon's Fast Looping code will recognize this as a specialization of a Brandon-esque fast loop with the offset from the base address move_base precalculated to be either 5 or 10. Any number of iterations of trackWhileMoving can be selected by jumping to move_base minus the desired number of iterations. But at the cost of the subtraction calculation.
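
    The offset arithmetic can be sketched the same way in Python (hypothetical layout: ten consecutive push instructions ending at move_base):

```python
def run_moves(n, log):
    """Enter the block of pushes n instructions above move_base, so
    exactly n copies of trackWhileMoving's address get pushed."""
    pushes = ["trackWhileMoving"] * 10   # ten consecutive push instructions
    move_base = len(pushes)              # address of the bare jump
    stack = ["label_A"] + pushes[move_base - n:move_base]
    target = stack.pop()                 # the jump at move_base
    while target == "trackWhileMoving":
        log.append("track")              # one chronon of tracking
        target = stack.pop()             # the routine's return
    return target
```

    Jumping to move_base minus n executes n pushes and hence n iterations, for any n from 0 to 10.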

    Multiple Entry Points

    The preceding code fragments offered some examples of a more general technique, which you will also often see in tournament robots, namely providing a subroutine with multiple entry points. For example, suppose that you start off a duel with a long aim loop, then, on spotting your opponent, you begin to track it using the look register to reacquire your target whenever it slips out of view. Once you start using the look register, you will need to execute the following code repeatedly to compensate.

       aim look + aim' store
       0 look' store
        

    So consider now a simple range interrupt handler like the following.

    pre_range_handler:
       aim look + aim' store
       0 look' store
    range_handler:
       {actually shoot}
        
    When you are doing all your searching with only the aim register, as is typically the case with monster humongous unrolled aim loops, you want your range interrupt handler to be set to range_handler: there is no reason to be messing with the look register if you can safely assume it is set to zero. But in your tracking routine, when you are using your look register on the assumption that the enemy is not far away from your current aim, you need to set the range interrupt handler to pre_range_handler.

    The multiple entry points to the main range handling logic allow you, in this example, to execute only as much code as you know you will need. Other uses I have seen are to add a ``preamble'' to drop extraneous values from the stack, or place values on the stack that are otherwise assumed to be there already (as in the combined loop/subroutine example above).
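
    In a language with named functions, the same trick is just one function head falling through into another. A Python sketch (the shoot action and the names are invented for the example):

```python
def range_handler(regs):
    """Inner entry point: assumes look is already folded into aim
    (or known to be zero), so it just shoots."""
    return "shoot at {}".format(regs["aim"])

def pre_range_handler(regs):
    """Outer entry point: compensate for look, then fall through
    into the inner handler."""
    regs["aim"] = (regs["aim"] + regs["look"]) % 360
    regs["look"] = 0
    return range_handler(regs)
```

    Code that knows look is zero enters at range_handler and skips the compensation; tracking code enters at pre_range_handler.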

    Jump Tables

    This is a technique I use all the time. The problem it addresses is the following. The usual way of making decisions in RoboTalk is to use one of the if family of operators. However, if you must decide among more than two cases, or if the cases you are deciding between have disjunctive descriptions (e.g., is my aim less than 90 or greater than 270?), then the amount of code between you and your next action can get very long. Often a single calculation can take the place of a large number of if statements, and you can use a jump table. That is, you calculate the value of a (small!) formula, add the result to a base address, and jump there. One very typical use is to calculate a heading based on the angle to an opponent. (See Bryant Brandon's Quadrants example, and also Grid Thingy.) If you just need to know whether you are looking basically horizontally or vertically, then you can use the well-known aim 1 tan trick. But if you then need to know, given a basically horizontal aim angle, whether it is on your left or your right side, or whether it is above or below you, then you must make another test. If you need to know both, then you must make two further tests. That's a lot of ifg statements to chain through!

    In the end, what you really need to know is in which 45 degree ``pie slice'' your opponent is located, and respond accordingly. You can use a jump table, as for instance in the following example.

    choose_direction: 
       aim 45 / 2 * go_table + jump
    go_table:
    go_table_0_44:    go_left jump
    go_table_45_89:   go_down jump
    go_table_90_134:  go_up jump
    go_table_135_179: go_left jump
    go_table_180_224: go_right jump
    go_table_225_269: go_up jump
    go_table_270_314: go_down jump
    go_table_315_359: go_right jump
        
    The 2 in the header refers to the size of a cell in the table. When you are first hacking together a robot, you can always assume that a jump table cell will just contain the two instructions label jump. Later on, when your code stabilizes, you will want to ``unfold'' the jumps and replace them with the actual code that is being jumped to. Of course, you must remember to replace that 2 with the new cell size! And of course all the cells must be the same size. That is usually the case anyway if you are using a jump table in the first place. Otherwise, pick the largest cell, and pad the other cells with nop. You may not have realized that nop is actually a very useful instruction, but you know it now.
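
    For comparison, the same dispatch in Python is an index into a list of handlers (a sketch; the direction labels follow the table above):

```python
def choose_direction(aim):
    """One division replaces a chain of ifg tests: the table is
    indexed by the 45-degree pie slice the aim angle falls in."""
    go_table = ["left", "down", "up", "left",     #   0- 44 ...  135-179
                "right", "up", "down", "right"]   # 180-224 ...  315-359
    return go_table[(aim % 360) // 45]
```

    However the table entries are filled in, the lookup cost is the same for every angle.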

    If you can spare the time to load the vector with your jump targets, then you can replace the above code with something like the following.

    choose_direction: 
       aim 45 / vrecall jump
        
    or, if you can't spare the bottom of the vector, but instead have to start your table at, say, location 10,
    choose_direction: 
       aim 45 / 10 + vrecall jump
        
    That's a lot faster! But it does require that you sit there doing nothing destructive while you load up the vector with jump addresses. If you use a lot of jump tables, that can be suicidal, but if you use only one, or use one more than any of the others, it can be worth it. Finally, observe that the time it takes to get through a jump table is independent of the size of the table. (It depends only on the size of the table cell, and the formula in the preamble.) So even if you decide that you want to do the same thing in almost all of the table entries, you may still want to use a table rather than a chain of if operations.

    Indirection

    This is a technique that is actually more useful before you optimize. During initial development it pays (up to a point) to maximize the generality of your routines. It keeps your code small (and your compile times short), and allows you to tune things in just one place. More importantly, you will be tempted to try things that you would just never do if you had to copy slight variations of the same code to eight or sixteen different places. Later on you will want to use Partial Application and Unfolding (alias Partial Evaluation; it's what you do when you unroll your aim loops, for example) to make things faster.

    The basic idea is simple enough, if a little mind-bending at first. It is legal to store a register name in a register. For example, the following code is allowed, and useful.

       speedy' s' store
        
    Now you can recover the value of speedy (or speedx, as the case may be) with ``s recall''. Remember, that's syntactic sugar for ``s' recall recall''. You can assign to whatever register is pointed to by s with code like ``0 s store''. If you have left store-warnings on, you will get one here. The missing apostrophe is not a typo! You need to turn store warnings off for this one. Again, that is syntactic sugar for ``0 s' recall store''. If ``s' recall'' leaves speedy' on the stack, then the stack evolves over time as follows.
       operation   stack afterwards
       0           0
       s'          0 s'
       recall      0 speedy'
       store       (empty; speedy now contains 0)
        
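
    With the registers modeled as a Python dictionary, the whole business is two nested look-ups (a sketch, not RoboWar itself):

```python
# The robot's registers, modeled as a dictionary (hypothetical values)
regs = {"speedx": 3, "speedy": -7, "s": None}

regs["s"] = "speedy"         # speedy' s' store : a register NAME in a register
value = regs[regs["s"]]      # s recall  =  s' recall recall : two look-ups
regs[regs["s"]] = 0          # 0 s store =  0 s' recall store
```

    Point s at speedx instead, and the very same two lines read and write speedx.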

    Partial Application

    As a matter of efficiency, you will eventually probably want to replace whatever code uses these s' recall recall sequences by two identical copies which differ only in that in one you have replaced this sequence with speedx' recall (speedx, for short), and in the other you have replaced it with speedy. This is called partial application, among other things. If you have a function and one of its arguments can only take a (small) finite number of distinct values (say, speedx' and speedy' and nothing else), then you can replace the definition of that function with copies in which that argument has been removed and replaced by an explicit mention of the relevant value.
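
    Python happens to ship this very notion as functools.partial; here is the speedx/speedy situation in that idiom (set_speed is an invented stand-in for the general routine):

```python
from functools import partial

def set_speed(regs, axis, value):
    """The general routine: which axis to set is a run-time argument."""
    regs[axis] = value

# Partial application: fix the axis argument once and for all, yielding
# two specialized routines with one parameter fewer.
set_speedx = partial(set_speed, axis="speedx")
set_speedy = partial(set_speed, axis="speedy")
```

    Each specialized routine no longer pays the cost of passing the axis around on every call.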

    This can really bloat your code! This is why wall huggers are always so huge: they have in effect partially applied a general wall handler to four or eight (clockwise vs. counterclockwise times the four walls) different sets of particular values. (Of course, the partially applied code is usually written from scratch in these cases; I don't know if anyone has ever written a generic wall handling routine for wall huggers. It would have a very large number of parameters!)

    But it can also save you a lot of instructions. It's a classic trade off that you learn from your very first look at a 3 or 4 degree resolution aim loop, unrolled. The essential principle is that, though there is now much more code overall, the length of any given path through that code is shorter. There's just a lot more paths.

    And speaking of unrolled aim loops...

    Partial Evaluation

    Also known as Unfolding, Unrolling (when applied to loops) and In-Lining. The basic idea is to replace a function call with the function definition, to avoid the overhead of the call. For example, you might start out building a robot with a subroutine like

    cruisingSpeedNorth:
   -5 speedy' store
       return
        
    This way, if you decide later that your cruising speed should be a little faster or a little slower (or you just want to know what would happen if...), then you only have to change the relevant numbers in one place.

    Then, once you're happy with the robot's behavior, you systematically replace all instances of

       cruisingSpeedNorth call
        
    with
   -5 speedy' store
        
    thereby saving yourself the cost of the call (and the cost of placing the called address on the stack) and the return. Obviously, this works with larger and more complicated routines. The primary limit is, as usual for most optimizations, the 5000 instruction limit. And your patience, which is probably exhausted sooner.
    Hassle Tom Cornell by mail. Or visit his Home Page.
    Last modified: Mon Sep 18 19:31:31 MEST 2000