How to Write Emulator – Opcodes Jumptables

How to efficiently store and execute many small functions, without using a single if statement or switch construct.

Opcodes

Redcode contains 17 opcodes which perform standard operations such as copying data, maths and conditional jumps. It completely lacks any form of input or output. Once a program is run, it takes no input and produces no output.

The simulator needs to take each opcode out of the core and perform the desired function. It seems logical that creating one function per opcode would be a good start, something like this:

Void opADD() { … }
Void opSUB() { … }
Void opMUL() { … }

Then, to make things easier later, an enum can be created to allow us to store information about our opcodes in variables

Enum opcodes { ADD, SUB, MUL }

Then, all we need is to somehow store our program as a series of opcodes in memory, and have some sort of large if statement to choose the appropriate function to call:

opcode = get_next_instruction_from_memory()
If (opcode == ADD) opADD()
Else if (opcode == SUB) opSUB()
… etc …Code language: PHP (php)

Except with 17 opcodes, this gets a little large and becomes very tedious to type out. So a more organised way might be to use a switch statement instead:

Switch (opcode):
	Case ADD: opADD()
	Case SUB: opSUB()
	… etc …Code language: PHP (php)

However both of these have the same issue which is that to choose the correct function, a test has to be performed. While today’s CPUs have no real problem with this, and the code being run is hardly complex, this is one of those situations where it feels like a shorter way must exist.

Jump Tables

Jump tables are nothing more than an array full of function pointers, each entry accessible by an integer index. Enums in C++ are nothing more than fancy mappings between magic words and integers.

Combining the two makes some rather short, but special code:

First we create an array of function pointers

Void(*)() opcodeTable[17]Code language: CSS (css)

Then we assign each entry in the array the address of the appropriate function

opcodeTable[ADD] = &opADD
opcodeTable[SUB] = &opSUB
… etc ...

Now we can simply get the instruction from memory and use it as an index into that array, calling the array entry as if it were a function.

opcode = get_next_instruction_from_memory()
opcodeTable[opcode]()

What’s the point?

Using jump tables has some advantages, compared to using if/switch statements.

Less bugs

For a start there’s less typing, which means less bugs. It’s my own experience that most of my bugs happen when I’m bored and not really paying attention. And nothing is more boring than typing out the same code 17 times, replacing three letters each time.

Easier to debug

The core code is two lines, not 17. Either all the code all works, or none of the code works. There won’t ever be a weird case where one opcode doesn’t execute. This is a simulator of a programmable system, we need to know whether a program being simulated itself contains bugs, or whether the simulator is faulty.

Knowing the mechanism for selecting the correct opcode is bug free means the only place that bugs could exist is in the individual opcode functions. And “why isn’t ADD working” is easier to locate and debug compared to “sometimes it doesn’t even run the ADD opcode”.

Easier to extend

It’s also easy to extend. Maybe I start with only implementing three opcodes, and over time add more. All I need to do is write the function, add an enum definition and put the function pointer in the array.