Lecture 6
Andreas Moshovos
Spring 2005
Using the Assembly Programming Language to Write Programs
As we have discussed, instructions and data are represented internally as binary quantities. To make programming a machine easier, however, we commonly use a symbolic representation of instructions and data. This is the assembly programming language. Here we explain the conventions of the assembly language used for the 68k and the specific tools that you are going to be using for the Ultragizmo board. Different CPUs most likely use different assembly languages. Moreover, there may even be different assembly dialects for the same CPU.
Once you write an assembly program (which is a text representation of instructions and memory data values), you have to pass it through an assembler. The assembler is a program that parses your assembly program and translates it into its equivalent binary form. It is this binary representation that is then loaded into the computer’s memory and executed. Even for that “binary representation” there are specific file formats. The operating system (or “monitor” in the case of the Ultragizmo) running on the computer knows how to interpret these formats and how to load your program and data into memory before the program is executed.
On the Ultragizmo a simple text format called SREC is used to communicate the memory contents, including instructions and data. Briefly, in SREC data values are written in hexadecimal, with each hex digit encoded as an ASCII character. Each line has the form ADDRESS DATA, where ADDRESS is the starting address at which the data values encoded in DATA should be stored. (This is a simplified explanation; the actual format is a bit more complicated. You can look at the srec file that the assembler produces and then find a complete description on-line.)
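To give a feel for what "a bit more complicated" means, here is a minimal Python sketch of how one data line of a Motorola S-record file is put together. Besides the address and the data, each line carries a record type, a byte count and a checksum. The function name srec_data_record is made up for this illustration, and which record type (S1, S2 or S3) the Ultragizmo assembler actually emits is an assumption; check the srec file it produces.

```python
def srec_data_record(address: int, data: bytes) -> str:
    """Encode one S-record data line (hypothetical helper for illustration)."""
    # Assumption: pick the shortest record type whose address field fits.
    if address <= 0xFFFF:
        rec_type, addr_len = "S1", 2      # 16-bit address
    elif address <= 0xFFFFFF:
        rec_type, addr_len = "S2", 3      # 24-bit address
    else:
        rec_type, addr_len = "S3", 4      # 32-bit address

    # The byte count covers the address, the data and the 1-byte checksum.
    count = addr_len + len(data) + 1
    if count > 0xFF:
        raise ValueError("too many data bytes for a single record")

    body = bytes([count]) + address.to_bytes(addr_len, "big") + data
    # Checksum: ones' complement of the low byte of the sum of the body bytes.
    checksum = (~sum(body)) & 0xFF
    return rec_type + (body + bytes([checksum])).hex().upper()


# The long word $11223344 stored at address $20004:
print(srec_data_record(0x20004, bytes.fromhex("11223344")))
# prints S2080200041122334447
```

Reading the printed line from left to right: S2 is the record type, 08 the byte count, 020004 the address, 11223344 the data and 47 the checksum.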
Let’s see how we could express our example program in 68k assembly:
        ORG     $10000
start   move.l  $20004, d6
        add.l   $20008, d0
lala    move.l  d6, $20000
        ORG     $20000
VA      dc.l    $00000000
VB      dc.l    $11223344
VC      dc.l    $22334455
The “org $10000” is an assembler directive. That is, it *does not*
correspond to a 68k instruction or data. It just tells the assembler that
whatever follows should be placed starting from memory address $10000.
The next line is:
start move.l $20004, d6
The “move.l $20004, d6” is a textual representation
of our first instruction.
The “start” prefix defines a label. Wherever we use the word “start” in the
program, the assembler replaces it with the constant $10000, which is the
address where this instruction will be stored (since it follows the org
statement). Similarly, “lala” is later defined to correspond to the constant
$1000c (since each instruction is 6 bytes, as we have seen, lala is at
$10000 + 2 x 6 = $1000c).
Anything that appears in the first column of your
assembly text file is interpreted as a label definition.
The next two lines each define one instruction.
The binary representation of each instruction is placed immediately after the
previous instruction in memory. So, the first move will be placed starting at
address $10000, the add.l that follows will be placed at address
$10000 + sizeof(first move.l) = $10000 + 6 = $10006, and finally, the last
move.l will be placed immediately after the add.l, at $10006 + sizeof(add.l) =
$10006 + 6 = $1000c. The last move is also 6 bytes long (as we explained in
the previous lecture) and hence it will occupy addresses $1000c through $10011.
The second “org” statement directs the assembler to
now place whatever follows starting from address $20000. Without this ORG
directive, whatever follows would have been placed at address $10012
(immediately after the last move.l).
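The address arithmetic above is simply a running "location counter" that the assembler advances by the size of everything it places. The following Python sketch mimics that bookkeeping for the three instructions. The function name layout and the (label, size) representation are made up for this illustration, and the 6-byte instruction size is taken from the text; it holds for these three instructions, not for every 68k instruction.

```python
def layout(origin, statements):
    """Assign addresses to (label, size_in_bytes) pairs starting at origin.

    Returns the label table and the first free address after the last item.
    """
    labels = {}
    location = origin                 # the location counter
    for label, size in statements:
        if label is not None:
            labels[label] = location  # a label stands for the current address
        location += size              # the next item goes right after this one
    return labels, location


# Assumption from the text: each of these three instructions is 6 bytes long.
code = [("start", 6), (None, 6), ("lala", 6)]
labels, end = layout(0x10000, code)

for name, address in labels.items():
    print(f"{name} = ${address:x}")
print(f"next free address = ${end:x}")
# prints:
# start = $10000
# lala = $1000c
# next free address = $10012
```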
The next line is:
VA dc.l $00000000
The “VA” defines the label “VA” to be the constant
$20000 since this immediately follows the ORG $20000 directive.
The “dc.l” is an assembler directive that instructs
the assembler to interpret whatever follows as a long word constant. dc =
Define Constant. Hence this directive will result in the assembler placing the
value $0 in memory as a long word starting from address $20000.
The next line is:
VB dc.l $11223344
This defines VB to be the constant $20004 because it
immediately follows the previous statement, which ended at address $20003. As
with the previous line, this places a long word constant in memory.
The last line is similar to the previous two: it defines VC to be the constant $20008 and places a third long word constant in memory.
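To see what the three dc.l lines leave in memory byte by byte, here is a small Python sketch. It relies on the fact that the 68k is big-endian, that is, the most significant byte of a long word is stored at the lowest address. The function name place_longs and the dictionary used as "memory" are made up for this illustration.

```python
def place_longs(origin, values):
    """Store 32-bit values consecutively from origin, one byte per address."""
    memory = {}
    address = origin
    for value in values:
        # Big-endian: the most significant byte goes to the lowest address.
        for byte in value.to_bytes(4, "big"):
            memory[address] = byte
            address += 1
    return memory


memory = place_longs(0x20000, [0x00000000, 0x11223344, 0x22334455])

# The four bytes that make up VB:
for address in range(0x20004, 0x20008):
    print(f"${address:x}: {memory[address]:02x}")
# prints:
# $20004: 11
# $20005: 22
# $20006: 33
# $20007: 44
```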
We can now rewrite our program by utilizing labels
as opposed to direct constants:
        ORG     $10000
start   move.l  VB, d6
        add.l   VC, d0
lala    move.l  d6, VA
        ORG     $20000
VA      dc.l    $00000000
VB      dc.l    $11223344
VC      dc.l    $22334455
Note that now we do not refer to the variables using their absolute address. Instead we use the labels placed in front of them. This way we could use a different starting address for our data (by changing the parameter of the second ORG) without having to go and update all instruction references.
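The following Python sketch illustrates why labels make the data easy to move: the label table is rebuilt from the ORG value, and the instruction text itself never changes. The helper names data_symbols and resolve are made up for this illustration, and treating label names as case-sensitive is an assumption; a real assembler does considerably more than a text substitution.

```python
import re


def data_symbols(origin, names, size=4):
    """Addresses of consecutive long-word variables placed from origin."""
    return {name: origin + i * size for i, name in enumerate(names)}


def resolve(line, symbols):
    """Replace every known label in a line with its $hex address."""
    def swap(match):
        word = match.group(0)
        return f"${symbols[word]:x}" if word in symbols else word

    # Match identifiers, but not the hex digits inside a $... constant.
    return re.sub(r"(?<![\w$])[A-Za-z_]\w*", swap, line)


program = ["move.l VB, d6", "add.l VC, d0", "move.l d6, VA"]

# Same instruction text, two different parameters for the second ORG.
for origin in (0x20000, 0x30000):
    symbols = data_symbols(origin, ["VA", "VB", "VC"])
    print([resolve(line, symbols) for line in program])
# prints:
# ['move.l $20004, d6', 'add.l $20008, d0', 'move.l d6, $20000']
# ['move.l $30004, d6', 'add.l $30008, d0', 'move.l d6, $30000']
```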
General Notes:
Please read through the corresponding section of the Ultragizmo manual (assembly language) for detailed information about all assembly language directives. Here we discuss only some of them.
The dc directive can take a list of values as in:
dc.l $01, $02, $03, $04
Each value will be placed as a long word consecutively in memory. Thus the aforementioned DC will allocate 16 bytes in memory (four long words).
The dc directive also takes a size suffix. Besides long words (.l), we can use it for bytes and words (.b and .w respectively).
Use the $ prefix for hexadecimal numbers, no prefix for decimal numbers, the % prefix for binary numbers and a leading 0 for octal numbers.
You can also refer to ASCII values using single quotes as in ‘0’ (this is 48, the ASCII code for the character 0).
You can also use expressions such as ‘lala + 4’.
The DS directive takes the form DS number and it simply allocates memory space. It does not initialize this space. So DS 100 allocates 100 bytes.
Koko EQU $ffff is the equivalent of #define Koko 0xffff in C. It is used to define symbolic constants.
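The notes above on number prefixes and on the size of a dc list can be summarized in a short Python sketch. The function names parse_literal and dc_size are made up for this illustration, and the rules encoded are only the ones listed in these notes; expressions such as lala + 4 are not handled, and the manual remains the authority on what the assembler really accepts.

```python
SIZES = {"b": 1, "w": 2, "l": 4}   # bytes per element for .b, .w and .l


def parse_literal(text):
    """Value of a single constant written with the prefixes described above."""
    text = text.strip()
    if text.startswith("$"):                       # hexadecimal
        return int(text[1:], 16)
    if text.startswith("%"):                       # binary
        return int(text[1:], 2)
    if len(text) == 3 and text[0] == "'" and text[2] == "'":
        return ord(text[1])                        # ASCII code of one character
    if len(text) > 1 and text.startswith("0"):     # leading 0: octal
        return int(text, 8)
    return int(text, 10)                           # no prefix: decimal


def dc_size(suffix, operands):
    """Bytes allocated by a dc.<suffix> directive with a comma-separated list."""
    values = [item for item in operands.split(",") if item.strip()]
    return SIZES[suffix] * len(values)


print(parse_literal("$ffff"), parse_literal("%1010"),
      parse_literal("017"), parse_literal("'0'"))
# prints 65535 10 15 48
print(dc_size("l", "$01, $02, $03, $04"))
# prints 16
```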
Other examples:
The following code calculates a = 2 x b + c
        org     $20000
vb      dc.l    $12003200
vc      dc.l    $00223311
va      ds      4
        org     $10000
        move.l  vb, d0
        add.l   d0, d0
        add.l   vc, d0
        move.l  d0, va
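One way to check a short listing like this by hand is to trace it step by step. The Python sketch below does the same mechanically for the a = 2 x b + c program. It is a toy model written for this illustration, not a 68k simulator: it understands only move.l and add.l, identifies variables by name instead of by address, wraps results to 32 bits, and assumes registers start at zero (on real hardware their initial contents are not something to rely on).

```python
MASK = 0xFFFFFFFF   # long-word results wrap around at 32 bits


def run(program, memory):
    """Execute (mnemonic, source, destination) triples on a toy machine."""
    registers = {}

    def read(operand):
        if operand in memory:
            return memory[operand]
        return registers.get(operand, 0)   # assumption: registers start at 0

    def write(operand, value):
        if operand in memory:
            memory[operand] = value & MASK
        else:
            registers[operand] = value & MASK

    for mnemonic, source, destination in program:
        if mnemonic == "move.l":
            write(destination, read(source))
        elif mnemonic == "add.l":
            write(destination, read(destination) + read(source))
        else:
            raise ValueError(f"unsupported instruction: {mnemonic}")
    return memory


memory = {"vb": 0x12003200, "vc": 0x00223311, "va": 0}
program = [
    ("move.l", "vb", "d0"),   # d0 = b
    ("add.l", "d0", "d0"),    # d0 = b + b = 2 x b
    ("add.l", "vc", "d0"),    # d0 = 2 x b + c
    ("move.l", "d0", "va"),   # a = d0
]
result = run(program, memory)
print(f"va = ${result['va']:08x}")
# prints va = $24229711
```

Changing the program list is enough to trace other short sequences built from the same two instructions.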
What does this calculate?
        org     $20000
vb      dc.l    $12003200
vc      dc.l    $00223311
va      ds      4
        org     $10000
        move.l  vb, d0
        add.l   d0, d0
        add.l   vc, d0
        add.l   d0, d0
        add.l   d0, d0
        move.l  d0, va