1 Learning Outcomes

From our overview:

The assembler translates assembly code to machine modules. It translates pseudoinstructions to real instructions and produces an object file. The assembler uses assembly directives to produce the object file, which contains portions of an executable’s text segment, data segment, and more.

2 Object File

The final .o object file is a machine module in binary format. It is made up of the following pieces:

  1. Object File Header: size and position of other pieces of the object file. This is like the “table of contents”.

  2. Text Segment: machine code.

  3. Data Segment: binary representation of static data in the source file.

  4. Symbol Table: list of the file’s labels and static data that can be referenced by other files.

  5. Relocation Table: lines of code that must be fixed later (by the Linker).

  6. Debugging Information.

The Text Segment and Data Segment (recall the program memory layout) are translated into binary where possible. The last three items (Symbol Table, Relocation Table, Debugging Information) are used by the downstream Linker to resolve everything and create a single executable.
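To make the six pieces concrete, here is a minimal sketch of an object file as an in-memory record. The class name `ObjectFile`, its field names, and the sample contents are all hypothetical and chosen for illustration; real object file formats are considerably more detailed.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Hypothetical model of the pieces of an object file (not a real format)."""
    text: bytes = b""                                # machine code
    data: bytes = b""                                # static data
    symbols: dict = field(default_factory=dict)      # label -> (section, offset)
    relocations: list = field(default_factory=list)  # (section, offset, symbol)
    debug_info: list = field(default_factory=list)   # left empty in this sketch

    def header(self):
        """The 'table of contents': the size of each other piece."""
        return {
            "text_size": len(self.text),
            "data_size": len(self.data),
            "num_symbols": len(self.symbols),
            "num_relocations": len(self.relocations),
        }


obj = ObjectFile(
    text=bytes(8),                       # two 4-byte instruction slots (zeros)
    data=b"hi\x00",                      # a 3-byte static string
    symbols={"main": ("text", 0), "msg": ("data", 0)},
    relocations=[("text", 4, "printf")],  # to be fixed by the linker
)
print(obj.header())
```

The header is computed from the other pieces here, which mirrors its role of describing their size and position.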

We highly recommend reading this section and then seeing the example at the end of this chapter.

3 Text Segment

An example text segment is discussed in the example at the end of this chapter.

Arithmetic, Logical Instructions: For simple cases like add or sub, the instruction itself contains all the information needed to build its 32-bit machine code. This encompasses R-Format and some I-Format instructions.
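As an illustration of why these cases are simple, the sketch below packs an R-Format instruction into a 32-bit word using only the fields written in the instruction. It assumes the standard RV32I R-Format layout (funct7, rs2, rs1, funct3, rd, opcode); the helper name `encode_r` is hypothetical.

```python
def encode_r(funct7, rs2, rs1, funct3, rd, opcode=0x33):
    """Pack R-Format fields into one 32-bit word (assumes RV32I field layout)."""
    return (
        (funct7 << 25)
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (rd << 7)
        | opcode
    )


# add x5, x6, x7  and  sub x5, x6, x7 differ only in funct7.
add_word = encode_r(0x00, rs2=7, rs1=6, funct3=0x0, rd=5)
sub_word = encode_r(0x20, rs2=7, rs1=6, funct3=0x0, rd=5)

print(f"{add_word:#010x}")  # 0x007302b3
print(f"{sub_word:#010x}")  # 0x407302b3
```

No label or address appears anywhere in the computation, so the assembler can emit these words without consulting any table.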

PC-Relative Branches and Jumps, like beq/bne/etc. and jal: Once pseudoinstructions are replaced with real ones, all known PC-relative addressing within the object file can be computed. The offset to encode is determined by counting the number of half-words (2-byte units) between the current instruction and the target instruction.
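The half-word count can be sketched as a small calculation on byte addresses. The function name `branch_offset_halfwords` and the sample addresses are hypothetical; the sketch only assumes that both addresses are known within the same file.

```python
def branch_offset_halfwords(branch_addr, target_addr):
    """Number of half-words (2-byte units) from the branch to its target."""
    byte_offset = target_addr - branch_addr
    if byte_offset % 2 != 0:
        raise ValueError("branch and target must be a whole number of half-words apart")
    return byte_offset // 2


# Forward branch: instruction at 0x08, target three 4-byte instructions later.
forward = branch_offset_halfwords(0x08, 0x14)
# Backward branch: instruction at 0x10, target three 4-byte instructions earlier.
backward = branch_offset_halfwords(0x10, 0x04)
print(forward, backward)  # 6 -6
```

A negative result means the target is earlier in the program than the branch.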

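After replacing pseudoinstructions, the assembler performs two passes over the program to compute all offsets. The first pass records the position of every label; the second pass uses those positions to compute each offset.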

If the assembler only made one pass (from earlier instructions to later), “forward references” (labels to locations later in the program) would have unknown PC-relative offsets.
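A minimal sketch of the two-pass idea follows. It assumes every instruction occupies 4 bytes, that a label sits alone on its own line ending in a colon, and that a label reference is always the last operand; the function name `resolve_branch_offsets` and the sample program are hypothetical. Offsets are shown in bytes for readability.

```python
def resolve_branch_offsets(lines):
    """Replace label operands with PC-relative byte offsets using two passes."""
    # Pass 1: record the byte address of every label.
    labels = {}
    instructions = []
    pc = 0
    for raw in lines:
        line = raw.strip()
        if line.endswith(":"):
            labels[line[:-1]] = pc
        else:
            instructions.append((pc, line))
            pc += 4  # assumption: all instructions are 4 bytes

    # Pass 2: every label is now known, including forward references.
    resolved = []
    for pc, text in instructions:
        mnemonic, _, operands = text.partition(" ")
        ops = [op.strip() for op in operands.split(",")]
        if ops[-1] in labels:
            ops[-1] = str(labels[ops[-1]] - pc)
        resolved.append(f"{mnemonic} {', '.join(ops)}")
    return resolved


program = [
    "loop:",
    "beq x5, x0, done",   # forward reference: 'done' is defined later
    "addi x5, x5, -1",
    "jal x0, loop",       # backward reference
    "done:",
    "add x10, x0, x5",
]
for line in resolve_branch_offsets(program):
    print(line)
```

The `beq` refers to `done` before `done` has been seen, which is exactly the case a single pass could not resolve.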

4 Symbol Table

The symbol table is a list of labels (procedures) and data (like global arrays) in your file that could be used by other files. The Symbol Table is also used by the debugger gdb.

An example symbol table is discussed in the example at the end of this chapter.
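As a rough sketch of how such a table could be collected, the code below scans a source listing and keeps the labels marked with the `.globl` directive. It assumes every instruction or data item takes 4 bytes and that only `.globl` labels are recorded, which is a simplification; the function name `build_symbol_table` and the sample listing are hypothetical.

```python
def build_symbol_table(lines):
    """Map each exported label to (section, byte offset within that section)."""
    exported = set()
    defined = {}
    section = "text"
    offsets = {"text": 0, "data": 0}
    for raw in lines:
        line = raw.strip()
        if line in (".text", ".data"):
            section = line[1:]                  # switch the current section
        elif line.startswith(".globl"):
            exported.add(line.split()[1])       # label visible to other files
        elif line.endswith(":"):
            defined[line[:-1]] = (section, offsets[section])
        elif line:
            offsets[section] += 4               # assumption: 4 bytes per item
    return {name: loc for name, loc in defined.items() if name in exported}


source = [
    ".data",
    ".globl counter",
    "counter:",
    ".word 0",
    ".text",
    ".globl main",
    "main:",
    "addi x5, x0, 1",
    "helper:",            # not marked .globl, so not exported in this sketch
    "add x10, x0, x5",
]
symbol_table = build_symbol_table(source)
print(symbol_table)
```

Here `counter` is a piece of data and `main` is a procedure label, matching the two kinds of entries described above.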

5 Relocation Table

The relocation table is a “to-do list” of things to fix later (by the downstream linker). It contains placeholders for:

An example relocation table is discussed in the example at the end of this chapter.
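The "to-do list" idea can be sketched as follows: when a jump targets a label defined in this file, the offset is computed immediately; otherwise a placeholder is emitted and an entry is recorded for the linker. This is a simplified illustration that assumes 4-byte instructions and one label operand per instruction; the function name `assemble_with_relocations` and the entry fields `offset` and `symbol` are hypothetical.

```python
def assemble_with_relocations(instructions, local_labels):
    """instructions: list of (mnemonic, target_label) pairs.

    Returns (output, relocation_table), where output holds
    (mnemonic, pc_relative_offset) pairs and unresolved targets get 0.
    """
    output = []
    relocation_table = []
    for index, (mnemonic, target) in enumerate(instructions):
        pc = index * 4  # assumption: all instructions are 4 bytes
        if target in local_labels:
            output.append((mnemonic, local_labels[target] - pc))
        else:
            output.append((mnemonic, 0))  # placeholder for the linker to fix
            relocation_table.append({"offset": pc, "symbol": target})
    return output, relocation_table


output, relocation_table = assemble_with_relocations(
    [("jal", "helper"), ("jal", "printf")],
    local_labels={"helper": 8},   # 'printf' is defined in some other file
)
print(output)
print(relocation_table)
```

Each relocation entry records where the placeholder lives and which symbol it is waiting on, which is the information the linker needs to finish the job.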