1 Learning Outcomes
Implement a datapath that supports loads (I-Type) and stores (S-Type).
Explain why certain control signals can be marked as “don’t care” (*).
Implement partial loads and stores for the course project, and gain a basic understanding of the minimum requirements for the DMEM block.
🎥 Lecture Video
2 Building a Processor with DMEM access
Recall that load instructions are I-Type because they read a register, have an immediate, and write to a register a 32-bit value read from memory.
To support lw, we use a similar datapath to addi but instead compute an address with which to access DMEM.
RegFile: We read one register rs1 and write one register rd. The value to write is a word read from memory.
PC: We read from and write to PC. The value to write is PC + 4.
DMEM: We read the memory word at address R[rs1] + imm.
Loads (and stores) participate in the MEM phase of the five step process. We therefore introduce additional logic connecting DMEM to the ALU and the RegFile, as shown in Figure 1:

Figure 1: DMEM: Connect and use a mux before WB (Write Back).
DMEM: To read the memory at an address, we use the ALU to compute the address as alu = R[rs1] + imm. This readily reuses the circuitry for arithmetic and logical I-Type instructions.
Mux: We now include a mux after the ALU and DMEM that uses the control signal WBSel to select between two values for wdata, the data to write to R[rd]:
Arithmetic and logical R-Type or I-Type instructions: the output of the ALU (alu), which is now wired both into addr and into the new mux.
Load instructions: the output of DMEM (mem).
Figure 2: The lw datapath.
Instruction Fetch: Increment PC to next instruction (see R-Type datapath).
Instruction Decode: Fetch R[rs1] from RegFile and build the immediate imm for I-Type instructions (see I-Type datapath). Also configure control logic: set ImmSel to I-type immediates, RegWEn to 1, BSel to 1, ALUSel to Add, MemRW to Read, and WBSel to 0. After some delay, the immediate generator block updates its output signal imm to the appropriate sign-extended 32-bit immediate value, register value R[rs1] is read, and control signals are set.
Execute: Because the control line BSel=1 selects the generated immediate imm for ALU input B, our ALU computes R[rs1] + imm.
Memory: Read memory at address alu = R[rs1] + imm. After some delay, the output signal mem has the value Mem[R[rs1] + imm].
Write Back: Write DMEM output to the destination register by connecting the output of the WBSel mux to RegFile’s wdata input. Around the next rising clock edge, wdata, RegWEn, and rd should be held stable through the setup and hold time of RegFile.
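The five steps above can be sketched in software. The following is a minimal, illustrative Python model, not the course circuit: the function names, the dictionary-based memory, and the register list are all assumptions made for this example, while the signal names in the comments come from the text.

```python
# Illustrative model of the lw datapath. The Python structure is an assumption;
# only the signal names in the comments come from the text.

MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def execute_lw(regs, dmem, rs1, rd, imm12):
    """Simulate lw rd, imm(rs1); dmem maps word-aligned addresses to 32-bit words."""
    imm = sign_extend(imm12, 12)        # ImmSel = I: sign-extended immediate
    alu = (regs[rs1] + imm) & MASK32    # BSel = 1, ALUSel = Add
    mem = dmem.get(alu & ~0x3, 0)       # MemRW = Read (unwritten words read as 0 here)
    wb_sel = 0                          # WBSel = 0 selects mem rather than alu
    wdata = mem if wb_sel == 0 else alu
    if rd != 0:                         # RegWEn = 1; x0 stays hard-wired to zero
        regs[rd] = wdata
    return wdata

regs = [0] * 32
regs[2] = 0x100
dmem = {0x0FC: 0xDEADBEEF}
execute_lw(regs, dmem, rs1=2, rd=5, imm12=0xFFC)   # lw x5, -4(x2)
print(hex(regs[5]))
```

The only difference from an addi-style instruction in this model is the final mux: the ALU result becomes the memory address instead of the value written back.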
3 Tracing the Store Datapath
By contrast, store instructions are S-Type because they read two registers, have an immediate, and write to memory. Stores do not write data to registers.
We do not need to add additional blocks for stores, but we will need to:
Upgrade the Immediate Generator to support immediates in S-Type instructions; we encourage you to read that section afterwards.
Wire R[rs2] to wdata (DMEM input signal).
Figure 3: The sw datapath.
Instruction Fetch: Increment PC to next instruction (see R-Type datapath).
Instruction Decode: Fetch R[rs1] and R[rs2] from RegFile, and build the immediate imm for S-Type instructions (the immediate generator is introduced with the I-Type datapath). Also configure control logic (see below).
Execute: Because the control line BSel=1 selects the generated immediate imm for ALU input B, our ALU computes R[rs1] + imm.
Memory: Write memory at address alu = R[rs1] + imm by holding addr and wdata (DMEM input). Around the next rising clock edge, wdata (DMEM input), MemRW, and addr should be held stable through the setup and hold time of DMEM.
Write Back: (We don’t write to RegFile, so skip this.)
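As a rough software analogue of the store trace, the sketch below assembles an sw instruction, rebuilds its S-Type immediate from the two immediate fields (inst[31:25] and inst[11:7] in RV32I), and performs the write. The helper names and the dictionary-based memory are hypothetical and exist only for illustration.

```python
# Illustrative model of the sw datapath. Helper names and the dict-based
# memory are assumptions made for this example.

MASK32 = 0xFFFFFFFF

def encode_sw(rs1, rs2, imm):
    """Assemble sw rs2, imm(rs1) using the RV32I S-Type field layout."""
    imm &= 0xFFF
    return (((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
            | (0b010 << 12) | ((imm & 0x1F) << 7) | 0b0100011)

def s_type_imm(inst):
    """Rebuild the 12-bit S-Type immediate and sign-extend it."""
    imm = (((inst >> 25) & 0x7F) << 5) | ((inst >> 7) & 0x1F)
    return imm - 0x1000 if imm & 0x800 else imm

def execute_sw(regs, dmem, inst):
    """Simulate a full-word store; returns the computed address."""
    rs1 = (inst >> 15) & 0x1F
    rs2 = (inst >> 20) & 0x1F
    alu = (regs[rs1] + s_type_imm(inst)) & MASK32   # BSel = 1, ALUSel = Add
    dmem[alu & ~0x3] = regs[rs2] & MASK32           # MemRW = Write, wdata = R[rs2]
    return alu                                      # no write back to RegFile

regs = [0] * 32
regs[2] = 0x100
regs[5] = 0x12345678
dmem = {}
inst = encode_sw(rs1=2, rs2=5, imm=-8)              # sw x5, -8(x2)
execute_sw(regs, dmem, inst)
print(hex(inst), {hex(a): hex(w) for a, w in dmem.items()})
```

Note that the register list is never modified by `execute_sw`, mirroring the skipped Write Back step.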
4 Partial Loads and Stores: Course project details
This section discusses how to implement partial loads and stores. The details in this section are specific to the course project. We start by introducing additional functions of the course project’s DMEM block.
Recall from an earlier section that according to the RISC-V standard, loads and stores of words should specify word-aligned addresses. DMEM memory accesses in this course (and in the course project) therefore follow these conventions:
All DMEM memory accesses are word accesses. Addresses must be 32 bits wide and a multiple of 4.
Loads read a word (4 bytes) at a time.
Stores write at most 4 bytes at a time, specified by a bitmask.
Consider Figure 4, which shows each byte in memory, referred to by its 32-bit address.
In a single access, DMEM can access a set of 4 bytes in one of the red boxes in the diagram, e.g., bytes at memory addresses 0-1-2-3 or 12-13-14-15.
In a single access, DMEM cannot access a set of 4 bytes across two red boxes in the diagram, e.g., bytes at memory addresses 13-14-15-16. This would take two accesses.
Figure 4: Diagram of aligned memory (Red boxes around every 4 bytes).
4.1 Word-aligned DMEM
DMEM forces every memory access to use an address that is a multiple of 4 bytes. DMEM always zeros out the bottom 2 bits of any address you provide (i.e., it rounds down to the nearest multiple of 4), and then accesses 4 bytes starting at this modified address.
For example, if you give DMEM the address 19 = 0b010011, DMEM will zero out the bottom 2 bits to get 16 = 0b010000, and then start at this modified address to access 4 bytes (16-17-18-19).
You don’t need to zero out the bottom 2 bits yourself; the provided DMEM implementation will automatically do this for any address you provide.
Assume memory accesses in instructions do not cross word boundaries. All instructions should follow RV32I standard convention and will not cross a word boundary in memory (i.e., a single access will never cross a red line in Figure 4).
DMEM uses only the lower 16 bits. Due to Logisim size limitations, the memory unit only uses the lower 16 bits of the provided address, discarding the upper 16 bits. This means that the memory can only store 2^16 bytes (64 KiB) of data.
The provided tests will always set the upper 16 bits of addresses to 0, and any tests you write should avoid using the upper 16 bits when interacting with memory.
Partial Stores: See the section below for partial-store-specific details, e.g., unit tests.
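The addressing rules in this section amount to two bit masks. The sketch below is an illustrative model only; the function names are hypothetical, and it assumes the 16-bit truncation and the 2-bit alignment can be applied in either order, since they act on disjoint bits.

```python
# Illustrative model of how the DMEM described above interprets an address.
# Function names are hypothetical.

def dmem_word_address(addr):
    """Keep the low 16 bits, then clear the bottom 2 bits (round down to a multiple of 4)."""
    return addr & 0xFFFF & ~0x3

def bytes_accessed(addr):
    """List the 4 byte addresses touched by a single DMEM access."""
    base = dmem_word_address(addr)
    return [base, base + 1, base + 2, base + 3]

print(bytes_accessed(19))   # [16, 17, 18, 19]
```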
4.2 Partial Load
The partial_load.circ circuit in the course project is designed to take the 32 bits of data read from DMEM, extract the relevant data, then process the data to put into a 32-bit register. The signals for this subcircuit are in Table 1.
Table 1: Signals for the course project partial load subcircuit.
| Name | Direction | Bit Width | Description |
|---|---|---|---|
| Instruction | Input | 32 | The load instruction being executed. |
| MemAddress | Input | 32 | The memory address to read from.[2] |
| DataFromMem | Input | 32 | The data read from DMEM. |
| DataToReg | Output | 32 | The data to put in the register. |
Behavior: For loads, DMEM will always read 32 bits of memory starting at an address that is a multiple of 4 bytes. The partial load subcircuit then uses the instruction itself to determine what bytes to extract.
Example 1. Suppose we had a lb instruction on address 6 = 0b000110.
DMEM will read the 4 bytes at addresses 4-5-6-7.
We want just the byte at address 6 (again, a lb instruction).
The bottom 2 bits of the address 6 are 0b10, so we want the 2nd byte (zero-indexed, i.e., bits 16-23) of the data read from DMEM.
Example 2. Suppose we had a lh instruction on address 9 = 0b001001.
DMEM will read the 4 bytes at addresses 8-9-10-11.
We want the two bytes at addresses 9-10 (again, a lh instruction).
The bottom 2 bits of the address 9 are 0b01, so we want to start extracting at the 1st byte (zero-indexed) and extract two bytes, i.e., bits 8-23 of the data read from DMEM.
Table 2: If useful, all scenarios you should handle in the partial load subcircuit for the course project. SignExt means sign-extend.
| Instruction | MemAddress[1:0][2] | Value to put in DataToReg |
|---|---|---|
| lb | 00 | SignExt(DataFromMem[7:0]) |
| lb | 01 | SignExt(DataFromMem[15:8]) |
| lb | 10 | SignExt(DataFromMem[23:16]) |
| lb | 11 | SignExt(DataFromMem[31:24]) |
| lh | 00 | SignExt(DataFromMem[15:0]) |
| lh | 01 | SignExt(DataFromMem[23:8]) |
| lh | 10 | SignExt(DataFromMem[31:16]) |
| lw | 00 | DataFromMem |
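The scenarios in Table 2 collapse into one pattern: shift the word right by 8 times the bottom two address bits, then sign-extend the low 8, 16, or 32 bits. The Python sketch below is one possible software model of that behavior, not the course circuit. It assumes the instruction kind is read from funct3 in bits 14:12 (000 for lb, 001 for lh, 010 for lw in RV32I); the sample instruction words and the example data word are made up for illustration.

```python
# Illustrative software model of the partial load behavior in Table 2.
# The constants below are sample instruction words with rd = rs1 = x0, imm = 0.

LB, LH, LW = 0x00000003, 0x00001003, 0x00002003

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def partial_load(instruction, mem_address, data_from_mem):
    """Return DataToReg as an unsigned 32-bit value for lb, lh, and lw."""
    funct3 = (instruction >> 12) & 0x7
    size = {0b000: 1, 0b001: 2, 0b010: 4}.get(funct3)   # access size in bytes
    if size is None:
        raise ValueError("only lb, lh, and lw are modeled here")
    offset = mem_address & 0x3                          # MemAddress[1:0]
    if offset + size > 4:
        raise ValueError("access would cross a word boundary")
    field = data_from_mem >> (8 * offset)               # move the wanted bytes to bit 0
    return sign_extend(field, 8 * size) & 0xFFFFFFFF

word = 0x11F2A344                           # made-up word: byte 0 = 0x44, ..., byte 3 = 0x11
print(hex(partial_load(LB, 6, word)))       # byte 2 is 0xF2, so the result is 0xfffffff2
print(hex(partial_load(LH, 9, word)))       # bits 8-23 are 0xF2A3, so the result is 0xfffff2a3
```

The boundary check reflects the assumption stated earlier that accesses never cross a word boundary; a circuit would not need it.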
4.3 Partial Store
The partial_store.circ circuit in the course project is designed to take data from a register, process it, and store the relevant bytes to memory. The signals for this subcircuit are in Table 3.
Table 3: Signals for the course project partial store subcircuit.
| Signal Name | Direction | Bit Width | Description |
|---|---|---|---|
| Instruction | Input | 32 | The store instruction being executed. |
| MemAddress | Input | 32 | The memory address to store to.[2] |
| DataFromReg | Input | 32 | The data from the register. |
| MemWEn | Input | 1 | The control signal indicating whether writing to memory is enabled for this instruction. |
| DataToMem | Output | 32 | The data to store to memory. |
| MemWriteMask | Output | 4 | The write mask indicating whether each byte of DataToMem will be written to memory. |
Behavior: For stores, DMEM will write bitmasked data to memory at an address that is a multiple of 4 bytes. Each bit in the 4-bit write mask MemWriteMask corresponds to one of the 4 bytes of the word; if a bit in MemWriteMask is 0, DMEM will not store the corresponding byte of DataToMem to memory.
Example 1: Suppose we had a sb instruction on address 3 = 0b000011.
Of the word’s byte addresses 0-1-2-3, we actually want to just write one byte at address 3, because the bottom 2 bits of the address 3 are 0b11 (and it is an sb instruction).
Make a 32-bit value where bits 24-31 are the 8 bits we want to store to memory.[3]
Make a 4-bit writemask 0b1000, which says to only write the third byte (zero-indexed) to memory, leaving the other bytes in the memory word unchanged.
Example 2: Suppose we had a sh instruction on address 2 = 0b000010.
Of the word’s byte addresses 0-1-2-3, we actually want to write two bytes at addresses 2 and 3, because the bottom 2 bits of the address 2 are 0b10 (and it is an sh instruction).
Make a 32-bit value where bits 16-31 are the 16 bits we want to store to memory.[3]
Make a 4-bit writemask 0b1100, which says to only write the second and third bytes to memory, leaving the other bytes in the memory word unchanged.
Note that sh and sb instructions specify data as the lower bits of a 32-bit value, i.e., bottom 16 bits and bottom 8 bits, respectively. See Table 4.
Table 4: If useful, all scenarios you should handle in the partial store subcircuit for the course project. (Also recall no sign-extension for stores.)
| Instruction | MemAddress[1:0][2] | DataToMem[31:24] | DataToMem[23:16] | DataToMem[15:8] | DataToMem[7:0] | MemWriteMask |
|---|---|---|---|---|---|---|
| sb | 00 | 0 | 0 | 0 | DataFromReg[7:0] | 0001 |
| sb | 01 | 0 | 0 | DataFromReg[7:0] | 0 | 0010 |
| sb | 10 | 0 | DataFromReg[7:0] | 0 | 0 | 0100 |
| sb | 11 | DataFromReg[7:0] | 0 | 0 | 0 | 1000 |
| sh | 00 | 0 | 0 | DataFromReg[15:8] | DataFromReg[7:0] | 0011 |
| sh | 10 | DataFromReg[15:8] | DataFromReg[7:0] | 0 | 0 | 1100 |
| sw | 00 | DataFromReg[31:24] | DataFromReg[23:16] | DataFromReg[15:8] | DataFromReg[7:0] | 1111 |
Tips/reminders for the course project:
Recall that we should not write when our instruction is not a store. When the MemWEn control signal is 0, MemWriteMask should be set to 0b0000.
The bytes in DataToMem that aren’t being written to memory (i.e., where MemWriteMask is 0) can technically be any value[3], but Table 4 lists them as 0s. The unit test for this part also assumes that those bytes will be 0.
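Table 4 and the tips above follow a single pattern: keep the low 1, 2, or 4 bytes of the register value, shift them left by 8 times the bottom two address bits, and shift the write mask by the same number of byte positions. The Python sketch below models that pattern under stated assumptions. It is not the course circuit; it reads the store kind from funct3 in bits 14:12 (000 for sb, 001 for sh, 010 for sw in RV32I), and it returns a DataToMem of 0 when MemWEn is 0, which is a choice made here rather than something the text requires. It also accepts sh at offset 01, which Table 4 does not list.

```python
# Illustrative software model of the partial store behavior in Table 4.
# The constants below are sample instruction words with rs1 = rs2 = x0, imm = 0.

SB, SH, SW = 0x00000023, 0x00001023, 0x00002023

def partial_store(instruction, mem_address, data_from_reg, mem_wen):
    """Return (DataToMem, MemWriteMask)."""
    if not mem_wen:
        return 0, 0b0000                                # not a store: write nothing
    funct3 = (instruction >> 12) & 0x7
    size = {0b000: 1, 0b001: 2, 0b010: 4}.get(funct3)   # access size in bytes
    if size is None:
        raise ValueError("only sb, sh, and sw are modeled here")
    offset = mem_address & 0x3                          # MemAddress[1:0]
    if offset + size > 4:
        raise ValueError("access would cross a word boundary")
    field = data_from_reg & ((1 << (8 * size)) - 1)     # low bytes, no sign extension
    data_to_mem = field << (8 * offset)                 # unwritten bytes stay 0
    write_mask = ((1 << size) - 1) << offset            # one mask bit per written byte
    return data_to_mem, write_mask

data, mask = partial_store(SB, 3, 0xAABBCCDD, mem_wen=1)
print(hex(data), format(mask, "04b"))                   # 0xdd000000 1000
data, mask = partial_store(SH, 2, 0xAABBCCDD, mem_wen=1)
print(hex(data), format(mask, "04b"))                   # 0xccdd0000 1100
```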