Lecture 6

Andreas Moshovos

Spring 2007

 

Simple Control Flow

 

We have seen how “straight-line” instruction sequencing works. Let us now see how the processor can implement simple control-flow (instruction sequencing) structures such as the if-then-else. Let’s use the following pseudo-C code as a starting point:

 

unsigned int a = 0x00000000;

unsigned int b = 0x11223344;

unsigned int c = 0x55667788;

 

if (b == 0)

then a = b + c;

else a = b - c;

 

 

An implementation of this program in assembly is as follows:

 

      .section .data

va:   .long 0x0

vb:   .long 0x11223344

vc:   .long 0x55667788

 

      .section .text

main:

      movia r11, va

      ldw   r9, 4(r11)

      beq   r9, r0, then

else:

      ldw   r10, 8(r11)

      sub   r8, r9, r10

      stw   r8, 0(r11)

      beq   r0, r0, after

then:

      ldw   r10, 8(r11)

      add   r8, r9, r10

      stw   r8, 0(r11)

 after:

 

There are two new instructions here. “sub” performs subtraction. The other is “beq”, which has the general form:

 

            beq rX, rY, label

 

B is for branch and eq is the “equals” condition. This instruction changes the execution flow depending on whether the condition is met.

Its operation is: compare the values of registers rX and rY, and if the condition is TRUE then PC = label (the destination address). When the branch changes the PC we call it a taken branch; otherwise we call it a non-taken branch. Non-taken branches fall through to the next instruction in memory. In the case of NIOS II, this amounts to updating PC with PC + 4, because all NIOS II instructions are four bytes long.
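
The PC update described above can be sketched in C. This is only a model of the behaviour, not how the hardware is built; the function name beq_next_pc and its parameters are hypothetical names chosen for this illustration.

```c
#include <stdint.h>

/* Hypothetical model of "beq rX, rY, label" placed at address pc.
   rx and ry are the values held in the two source registers and
   target is the address of the label. Returns the next PC. */
uint32_t beq_next_pc(uint32_t pc, uint32_t rx, uint32_t ry, uint32_t target)
{
    if (rx == ry) {
        return target;   /* taken branch: PC = destination */
    }
    return pc + 4;       /* non-taken branch: fall through */
}
```

With a branch at address 0x1000 and a label at 0x1020, equal register values give 0x1020 and different values give 0x1004.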

 

Returning to our previous example, the following flow diagram shows the various control flow sequences possible:

 

There are two possible paths through the code:

 

ABD and ACD. The first corresponds to executing the ELSE part while the second corresponds to executing the THEN part.

 

(*) More efficient code can be found in last year’s lecture notes. However, it uses the macros %hiadj() and %lo().

 

Other Branches:

 

There are other branches that test for different conditions. For example “blt r9, r10, label” tests whether r9’s value is less than r10’s. The instructions take the form:

 

Bcondition rX, rY, label

 

Where condition is the condition we are testing. Here are all the branches:

 

br → branch always / unconditional

blt → branch if rX < rY treating the numbers as signed. For example, 0xFFFFFFFF (-1) is less than 0x7FFFFFFF (2^31 - 1).

bge → branch if rX >= rY treating the numbers as signed. For example, 0xFFFFFFFF (-1) is greater than 0xFFFFFFFE (-2).

bne → branch if rX != rY (the values are different)

beq → branch if rX == rY

bltu → branch if rX < rY treating the numbers as unsigned. For example, 0x7FFFFFFF (2^31 - 1) is less than 0xFFFFFFFF (2^32 - 1).

bgeu → branch if rX >= rY treating the numbers as unsigned. For example, 0xFFFFFFFF (2^32 - 1) is greater than 0x7FFFFFFF (2^31 - 1).
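
To see why the signed and unsigned branches can disagree on the same bit patterns, here is a small C sketch. The function names are hypothetical, and the casts to int32_t assume the usual two's-complement behaviour that mainstream compilers provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* blt and bge interpret the 32 register bits as signed numbers.
   The cast assumes two's-complement conversion. */
bool blt_taken(uint32_t rx, uint32_t ry) { return (int32_t)rx < (int32_t)ry; }
bool bge_taken(uint32_t rx, uint32_t ry) { return (int32_t)rx >= (int32_t)ry; }

/* bltu and bgeu interpret the very same bits as unsigned numbers. */
bool bltu_taken(uint32_t rx, uint32_t ry) { return rx < ry; }
bool bgeu_taken(uint32_t rx, uint32_t ry) { return rx >= ry; }
```

For rX = 0xFFFFFFFF and rY = 0x7FFFFFFF, blt is taken (-1 is less than 2^31 - 1) while bltu is not (2^32 - 1 is larger than 2^31 - 1).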

 

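The NIOS II assembler also provides a set of pseudo-instructions that test for other conditions. The assembler translates these into the real branch instructions listed above. Here they are: bgt, bgtu, ble and bleu (greater than, and less than or equal, each in a signed and an unsigned variant).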
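
The missing conditions can be obtained from the real branches by swapping the operands, since rX > rY is the same test as rY < rX. The C sketch below models this idea for the signed case only. The function names simply mirror the mnemonics, and the operand swap is our understanding of how the assembler expands bgt and ble, so check the NIOS II reference manual before relying on it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Real instructions (signed versions only, for brevity). */
static bool blt(int32_t rx, int32_t ry) { return rx < ry; }
static bool bge(int32_t rx, int32_t ry) { return rx >= ry; }

/* Pseudo-instructions: the same comparison with the operands swapped. */
bool bgt(int32_t rx, int32_t ry) { return blt(ry, rx); }  /* rX >  rY */
bool ble(int32_t rx, int32_t ry) { return bge(ry, rx); }  /* rX <= rY */
```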

 

 

All aforementioned branches except for “br” are called conditional branches, as they test a condition and change the PC only if this condition is true. “br” is an unconditional branch. Note that “beq r0, r0, label” and “br label” have the same effect on the PC. If you take the computer architecture course you will understand that “br” is preferable from an implementation point of view since it does not read any registers: just by looking at the instruction we know that the PC will change. (In truth we can do the same with “beq r0, r0”, but it requires a bit more hardware to also check the source operand names.)

 

Comparison instructions:

 

NIOS II also has a set of comparison instructions that take the form:

 

cmpCONDITION rZ, rX, rY

 

They compare the value of rX with that of rY, testing for the CONDITION. rZ is set to 1 if the condition is met and to 0 otherwise.

For example, cmpeq r9, r10, r11 will set r9 to 1 if r10 and r11 have the same value; otherwise r9 will be set to 0.
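
A comparison instruction writes its outcome into a register instead of changing the PC. The C sketch below models three of them; each function returns the value that would be written to rZ. The function names mirror the mnemonics but are otherwise our own, and the signed cast assumes two's-complement conversion.

```c
#include <stdint.h>

/* Each function returns the value written to rZ: 1 if the condition
   holds for rX and rY, 0 otherwise. */
uint32_t cmpeq(uint32_t rx, uint32_t ry)  { return rx == ry ? 1u : 0u; }
uint32_t cmplt(uint32_t rx, uint32_t ry)  { return (int32_t)rx < (int32_t)ry ? 1u : 0u; }
uint32_t cmpltu(uint32_t rx, uint32_t ry) { return rx < ry ? 1u : 0u; }
```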

For each conditional branch there is also a corresponding cmp instruction:

 

cmplt, cmpge, cmpne, cmpeq, cmpltu, cmpgeu.

 

There are also corresponding pseudo-instructions:

 

cmpgt, cmpgtu, cmple, cmpleu.

 

Finally, there are variants of compare where the second source register is replaced by a 16-bit immediate. Please check the NIOS II reference manual for a complete list.

 

If-Then:

 

Let’s now try to implement the following if-then statement in assembly:

 

if (b == 0)

     a = a + 1;

 

Following the same methodology as described before we get this:

 

            .text

main:

     movia r11, va

     ldw        r9, 4(r11)

     beq        r9, r0, then

notthen:

     beq        r0, r0, after

then:

     ldw        r8, 0(r11)

     addi       r8, r8, 1

     stw       r8, 0(r11)

after:

 

While the code is correct it’s inefficient and awkward when the if condition is not met. When b is not zero, the “beq r9, r0, then” falls through into the immediately following instruction in memory. That’s why we need the “beq r0, r0, after”, which skips over the “then:” part. In this case, we execute two branches one after the other just to avoid executing the if body. The above code is an if-then-else with an empty else body. Instead we can use the following:

 

 

            .text

main:

     movia r11, va

     ldw        r9, 4(r11)

     bne        r9, r0, after

then:

     ldw        r8, 0(r11)

     addi       r8, r8, 1

     stw       r8, 0(r11)

after:

 

Notice that we reversed the condition that the first branch checks: instead of checking whether r9 is equal to zero, we check whether it is not equal. Then, instead of branching to the “then” part, we branch past the if body. This implementation avoids the two-branch problem we encountered before. It can be applied to any if-then and, as we will explain next, to any if-then-else. However, care must be taken to reverse the condition checked. For example, if the if statement checks for (a >= 10), the assembly code should check for (a < 10) and branch past the if body accordingly. The next section explains how this strategy can be applied to an if-then-else.
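
The reversed-condition layout can be written in C using goto, which makes the correspondence with the assembly explicit. This is a sketch for the (a >= 10) example only; the function name is hypothetical.

```c
/* if (a >= 10) a = a + 1;   laid out the way the assembly would be. */
int if_then_reversed(int a)
{
    if (a < 10) goto after;   /* reversed condition: skip the if body */
    a = a + 1;                /* then part */
after:
    return a;
}
```

Calling it with 10 returns 11, while calling it with 9 returns 9 unchanged, exactly as the original if statement would.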

 

If-Then-Else Revisited:

 

Returning to our if-then-else example we can equivalently write:

 

      .section .data

va:   .long 0x0

vb:   .long 0x11223344

vc:   .long 0x55667788

 

      .section .text

main:

      movia r11, va

      ldw   r9, 4(r11)

      bne   r9, r0, else

then:

      ldw   r10, 8(r11)

      add   r8, r9, r10

      stw   r8, 0(r11)

      br    after

else:

      ldw   r10, 8(r11)

      sub   r8, r9, r10

      stw   r8, 0(r11)

 after:

 

Here we reversed the order of the THEN and ELSE code sections and hence we also reversed the condition in our first branch. That is, we replaced the “beq r9, r0, then” with a “bne r9, r0, else”, which tests whether r9 (the value read from vb) is non-zero. Hence the first branch diverts execution to address “else” only when the value read from vb is non-zero. If the value read from vb is zero then the instruction following the branch is executed (the one at label then).
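
As a cross-check, here is the same control flow written in C with goto, following the original pseudo-C (a = b + c when b is zero, a = b - c otherwise). The function and label names are hypothetical.

```c
#include <stdint.h>

uint32_t if_then_else(uint32_t b, uint32_t c)
{
    uint32_t a;
    if (b != 0) goto else_part;   /* bne r9, r0, else */
    a = b + c;                    /* then part */
    goto after;                   /* br after */
else_part:
    a = b - c;                    /* else part */
after:
    return a;
}
```

Note that the unconditional jump at the end of the then part is still needed; without it execution would fall through into the else part.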

 

 

The following flow diagram shows the various control flow sequences possible:

 

 

 

There are two possible paths through the code:

 

ABD and ACD. The first corresponds to executing the then part while the second corresponds to executing the else part.

 

Encoding – Limitations

 

While in assembly a branch instruction takes a label as its last argument, in the actual machine code the destination is encoded relative to the location of the instruction, using a 16-bit displacement constant. That is, the destination is calculated as PC + 4 + displacement, where PC is the address where the branch instruction is placed in memory. The displacement is a 16-bit signed constant and thus can take values from -32768 to +32767. It must also be divisible by four, since all instructions are four bytes long and must appear at aligned memory addresses, so the largest usable displacement is +32764. The new PC can therefore be at most 32764 bytes before the branch or 32768 bytes after it. The assembler and/or the linker will complain if a branch is impossible to implement.
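
The calculation the assembler performs can be sketched in C as follows. This is a model under the rules stated above, not the assembler's actual code; the function name branch_displacement is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: computes the displacement for a branch placed at
   address pc whose label is at address target. Returns false when the
   branch cannot be encoded, in which case *imm16 is left untouched. */
bool branch_displacement(uint32_t pc, uint32_t target, int16_t *imm16)
{
    int64_t disp = (int64_t)target - ((int64_t)pc + 4);
    if (disp % 4 != 0) {
        return false;                 /* targets must be word aligned */
    }
    if (disp < -32768 || disp > 32767) {
        return false;                 /* does not fit in 16 signed bits */
    }
    *imm16 = (int16_t)disp;
    return true;
}
```

A branch to the instruction three words ahead yields 8, a branch to itself yields -4, and a label 40000 bytes away is rejected.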

 

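The encoding of branch instructions is as follows (this shows beq), with the fields listed from the most significant bit to the least significant:

      A (5 bits) | B (5 bits) | Imm16 (16 bits) | 0x26 (6 bits)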

 

 

Here A and B are register numbers (that’s why they are 5 bits long each), Imm16 is the displacement and 0x26 is the opcode, a unique number per branch type. 0x26 is for beq; that number is 0x0e for bge, for example.
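
Packing these fields into a 32-bit instruction word can be sketched in C. The sketch assumes the field order A, B, Imm16, opcode from the most significant bit down (5, 5, 16 and 6 bits); the names encode_branch and OP_BEQ are hypothetical.

```c
#include <stdint.h>

#define OP_BEQ 0x26u   /* opcode for beq, as given in the text */

/* Packs the four fields into one 32-bit word. Assumed layout:
   A in bits 31..27, B in bits 26..22, Imm16 in bits 21..6,
   opcode in bits 5..0. */
uint32_t encode_branch(uint32_t op, uint32_t a, uint32_t b, int16_t imm16)
{
    return ((a & 0x1Fu) << 27)
         | ((b & 0x1Fu) << 22)
         | ((uint32_t)(uint16_t)imm16 << 6)
         | (op & 0x3Fu);
}
```

Under these assumptions, “beq r9, r10, label” with a displacement of 8 encodes as 0x4A800226.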

 

For example, in the following code:

 

            beq  r9, r10, lala

     addi r9, r9, 1

     addi r10, r10, 2

lala:

     ldw r10, 0(r9)

 

The Imm16 field of the beq instruction will contain the value 8, whereas in the following code Imm16 will be 4:

 

            beq  r9, r10, lala

     addi r10, r10, 2

lala:

     ldw r10, 0(r9)

 

And in the following infinite loop, Imm16 is -4, or 0xFFFC:

 

lala:

beq  r0, r0, lala
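
The relation between -4 and 0xFFFC is plain 16-bit two's complement. The C sketch below shows both directions: storing a displacement in the field, and sign-extending the field again when the branch is taken. The function names are hypothetical, and the narrowing cast assumes two's-complement conversion.

```c
#include <stdint.h>

/* The 16-bit pattern stored in the instruction for a given displacement. */
uint16_t imm16_field(int16_t disp)
{
    return (uint16_t)disp;
}

/* Destination of a taken branch at address pc: the field is sign-extended
   to 32 bits and added to PC + 4. */
uint32_t branch_target(uint32_t pc, uint16_t field)
{
    int32_t disp = (int16_t)field;    /* sign-extend to 32 bits */
    return pc + 4 + (uint32_t)disp;
}
```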

 

The good news is that when you write in assembly, you don’t have to calculate the Imm16 value. The assembler will do that for you. You just write the target label.

 

Examples of other conditions being tested:

 

Let us now look at various examples of C conditions and how they can be implemented in NIOS II assembly:

 

if (a == b) then …

 

      .section .text

main:

      movia r11, va

      ldw   r9, 4(r11)

      ldw   r8, 0(r11)

      beq   r8, r9, then

      br    after

then:

      ...

after:

 

The above code goes through two branch instructions to skip the then part (beq and br). There is a more efficient implementation. It relies on testing the opposite condition:

 

      .section .text

main:

      movia r11, va

      ldw   r9, 4(r11)

      ldw   r8, 0(r11)

      bne   r8, r9, after

then:

      ...

after:

 

Here’s another example where we compare two different memory variables.

            if (a > b) then …

 

      .section .text

main:

      movia r11, va

      ldw   r9, 4(r11)

      ldw   r8, 0(r11)

      ble   r8, r9, after

then:

      ...

after:

           

Note that in the previous example we used a signed condition. We also used the reverse condition in the branch instruction, since we want to execute the instructions that follow (the then part) if a is greater than b. Thus we test for the opposite condition, which is less than or equal.

 

            if (va == 3) then …

 

      .section .text

main:

      movia r11, va

      ldw   r8, 0(r11)

      cmpeqi r9, r8, 0x3

      beq   r9, r0, after

then:

      ...

after:

           

Note that we use the cmpeqi (compare if equal with immediate) instruction. This instruction compares r8 with the constant 3 and sets r9 to 1 if the two values are equal (r8 contains the value 3) or to 0 otherwise. It does not access memory. The immediate argument is a 16-bit constant.

 

We then test if r9 holds the value zero. If it does, then r8 did not contain 3 and hence we branch over the then part.
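
The compare-then-branch sequence can be traced in C, with one statement per instruction. This is a sketch: the function name is hypothetical, and the return value only reports whether the then part is reached.

```c
#include <stdint.h>

/* Returns 1 if the then part is reached, 0 if it is skipped. */
int then_executes(uint32_t va)
{
    uint32_t r8 = va;                    /* ldw    r8, 0(r11)   */
    uint32_t r9 = (r8 == 3u) ? 1u : 0u;  /* cmpeqi r9, r8, 0x3  */
    if (r9 == 0u) goto after;            /* beq    r9, r0, after */
    return 1;                            /* then part */
after:
    return 0;
}
```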

 

The following implementation is equivalent (but r9 at the end contains 3 and not 0 or 1):

 

      .section .text

main:

      movia r11, va

      ldw   r8, 0(r11)

      addi r9, r0, 0x3

      bne   r8, r9, after

then:

      ...

after:

 

Let’s now look at a combined condition (&& = AND):

 

            if (va == 1 && vb == 2) then … else …

 

      .section .text

main:

      movia r11, va

 

      ldw   r8, 0(r11)

      cmpeqi r10, r8, 0x1

      beq r10, r0, else

 

      ldw   r9, 4(r11)

      cmpeqi r10, r9, 0x2

      beq   r10, r0, else

then:

      ...

      br after

else:

      ...

after:

           

Since all conditions must be met for the then part to get executed, we test the opposite of each condition and branch to the else part as soon as one of these opposite conditions is met. Only if none of the opposite conditions is met do we reach the then part.
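
The same branching pattern, written in C with goto, looks as follows. It is a sketch with hypothetical names; the function returns 1 when the then part runs and 0 when the else part runs.

```c
/* if (va == 1 && vb == 2) then ... else ... */
int and_example(int va, int vb)
{
    if (va != 1) goto else_part;   /* first condition fails: go to else */
    if (vb != 2) goto else_part;   /* second condition fails: go to else */
    return 1;                      /* then part: both conditions held */
else_part:
    return 0;                      /* else part */
}
```

Notice that vb is never examined when va is not 1, which matches the way C evaluates the && operator.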

 

Let’s now look at a combined condition (|| = OR):

 

            if (va == 1 || vb == 2) then … else …

 

      .section .text

main:

      movia r11, va

 

      ldw   r8, 0(r11)

      cmpeqi r10, r8, 0x1

      bne r10, r0, then

 

      ldw   r9, 4(r11)

      cmpeqi r10, r9, 0x2

      beq   r10, r0, else

then:

      ...

      br after

else:

      ...

after:

 

Since the then part is executed as long as at least one condition is met, we first test for the first condition as specified (not the opposite) and branch to the then part if it is met. If not, we proceed to test the opposite of the second condition and branch to the else part if it is met.
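
For comparison, here is the OR case written in C with goto. Again this is a sketch with hypothetical names; the function returns 1 when the then part runs and 0 when the else part runs.

```c
/* if (va == 1 || vb == 2) then ... else ... */
int or_example(int va, int vb)
{
    if (va == 1) goto then_part;   /* bne r10, r0, then */
    if (vb != 2) goto else_part;   /* beq r10, r0, else */
then_part:
    return 1;                      /* then part */
else_part:
    return 0;                      /* else part */
}
```

Only the last test uses the reversed condition, because it is the one that decides between falling through into the then part and jumping to the else part.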