NomNom
Project Proposal
1. Team Name and Members
Team name: NomNom
Team leader: James Key
Team member: Josh Caldwell
2. Design Overview
The design that we decided to implement focuses on providing a larger register file than would normally be feasible with a 16-bit wide instruction base. The register file will be able to accept a 4-bit wide address yielding a total of 16 registers for use in general purpose processing. This decision comes with certain sacrifices. The function portion of the instruction, which normally controls the ALU operation, was removed. Thus, all 16 instructions must be contained within the opcode.
Another design decision that followed from the wider register addresses concerns immediate operations. Leaving two registers (Rs and Rt) in the I-type instruction would limit the immediate value to 4 bits, which is unacceptably small. Thus, the Rt register address was removed from the I-format instructions, allowing the immediate to increase to 8 bits. The assembly functions were altered accordingly.
The register file contains 16 items as listed below:
| $zero |
| $at |
| $gp |
| $sp |
| $ra |
| $mt |
| $mr |
| $b |
| $a0 |
| $a1 |
| $s0 |
| $s1 |
| $s2 |
| $s3 |
| $t0 |
| $t1 |
Some of these registers are custom for our architecture and are listed below:
$mt: Memory transmit. Whenever a save word operation is performed, the contents of $mt are used to write to memory. This is necessary due to the lost Rt register in I-type instructions. This leads to save instructions typically requiring two instructions.
$mr: Memory receive. Whenever a load word operation is performed, the contents of the memory are loaded into the $mr register. This is necessary due to the lost Rt register in I-type instructions. This leads to load instructions typically requiring two instructions.
$b: Brancher. The sole branch instruction in our architecture compares this register to Rs to determine whether or not a branch is performed. A branch will typically be preceded by a value being loaded into this register for comparison.
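The implicit-register behaviour described above can be modelled in a few lines of Python. This is a minimal sketch, not the project's hardware description: the dictionary-based register file and memory, the 16-bit address mask, and the choice to read 0 from unwritten memory are all assumptions made for illustration. The branch formula is taken from the instruction table in section 2.2.

```python
MASK16 = 0xFFFF  # assumed 16-bit address space


def lw(regs, mem, rs, imm):
    """lw rs(imm): the loaded word always lands in $mr."""
    # Assumption: unwritten memory reads as 0.
    regs["$mr"] = mem.get((regs[rs] + imm) & MASK16, 0)


def sw(regs, mem, rs, imm):
    """sw rs(imm): the stored word always comes from $mt."""
    mem[(regs[rs] + imm) & MASK16] = regs["$mt"]


def beb(regs, pc, rs, bra_addr):
    """beb rs, BraAddr: branch when rs equals $b; returns the next PC."""
    return pc + 2 + bra_addr if regs[rs] == regs["$b"] else pc + 2


# A store followed by a load, each needing a second step to
# move the value into or out of the dedicated register.
regs = {"$a0": 0x10, "$t0": 0, "$mt": 0, "$mr": 0, "$b": 0}
mem = {}
regs["$mt"] = 0xFF            # li $mt, FFhex
sw(regs, mem, "$a0", 0)       # sw $a0(0)
lw(regs, mem, "$a0", 0)       # lw $a0(0)
regs["$t0"] = regs["$mr"]     # add $t0, $mr, $zero
```

The last two lines show why a load typically costs two instructions: the value is only usable elsewhere after it has been copied out of $mr.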
We have chosen to use a positive edge triggered clocking system for the majority of our architecture. The exception to this is that the reading from the register file will occur on the negative edge of the clock.
2.1. Instruction format
R-format:
| Opcode | Rs | Rt | Rd |
| 4 bits | 4 bits | 4 bits | 4 bits |
I-format:
| Opcode | Rs | Immediate |
| 4 bits | 4 bits | 8 bits |
J-format:
| Opcode | Jump |
| 4 bits | 12 bits |
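To make the three layouts concrete, the following Python sketch packs fields into 16-bit words. It assumes the fields are laid out most-significant-first in the order the tables list them, which the tables suggest but do not state. The helper names and the register numbers used in the examples are hypothetical.

```python
def _field(value, bits, name):
    """Return value unchanged if it fits in the given number of bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name}={value} does not fit in {bits} bits")
    return value


def encode_r(opcode, rs, rt, rd):
    """R-format: opcode | Rs | Rt | Rd, 4 bits each."""
    return (_field(opcode, 4, "opcode") << 12 | _field(rs, 4, "rs") << 8
            | _field(rt, 4, "rt") << 4 | _field(rd, 4, "rd"))


def encode_i(opcode, rs, imm):
    """I-format: opcode | Rs | 8-bit immediate."""
    return (_field(opcode, 4, "opcode") << 12 | _field(rs, 4, "rs") << 8
            | _field(imm, 8, "imm"))


def encode_j(opcode, target):
    """J-format: opcode | 12-bit jump field."""
    return _field(opcode, 4, "opcode") << 12 | _field(target, 12, "target")


# Hypothetical register numbers 9, 10 and 11, chosen only for the example.
word_r = encode_r(0b0001, rs=10, rt=11, rd=9)   # an add
word_i = encode_i(0b1010, rs=9, imm=120)        # a load immediate
word_j = encode_j(0b1110, target=0x123)         # a jump
print(f"{word_r:04X} {word_i:04X} {word_j:04X}")
```

The range check is where the trade-off shows up: with 4-bit register fields, an I-format instruction that kept Rt would leave only 4 bits for the immediate.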
2.2. Instructions
| Name | Mnemonic | Operation | Opcode | Format |
| No operation | nop | Nop | 0000 | N/A |
| Addition | add | add $s1, $s2, $s3; $s1 = $s2 + $s3 | 0001 | R |
| Subtract | sub | sub $s1, $s2, $s3; $s1 = $s2 - $s3 | 0010 | R |
| And | and | and $s1, $s2, $s3; $s1 = $s2 & $s3 | 0011 | R |
| Or | or | or $s1, $s2, $s3; $s1 = $s2 | $s3 | 0100 | R |
| Nor | nor | nor $s1, $s2, $s3; $s1 = ~($s2 | $s3) | 0101 | R |
| Shift left | sl | sl $s1, $s2, $s3; $s1 = $s2 << $s3 | 0110 | R |
| Shift right | sr | sr $s1, $s2, $s3; $s1 = $s2 >> $s3 | 0111 | R |
| Set less than | slt | slt $s1, $s2, $s3; $s1 = ($s2 < $s3)? 1 : 0 | 1000 | R |
| Set greater than | sgt | sgt $s1, $s2, $s3; $s1 = ($s2 > $s3)? 1 : 0 | 1001 | R |
| Load immediate | li | li $s1, 120; $s1 = 120 | 1010 | I |
| Load word | lw | lw $s1(24); $mr = M[$s1 + 24] | 1011 | I |
| Save word | sw | sw $s1(24); M[$s1 + 24] = $mt | 1100 | I |
| Branch if equal b | beb | beb $s1, BraAddr; PC = ($s1 == $b)? PC + 2 + BraAddr : PC + 2 | 1101 | I |
| Jump | j | j JumpAddr; PC = JumpAddr | 1110 | J |
| Jump and link | jal | jal JumpAddr; $ra = PC + 4; PC = JumpAddr | 1111 | J |
2.3. Assembly language and machine code for the test program
li $s0, 40hex
li $s1, 10hex
li $t1, 8
sl $s1, $s1, $t1
li $t1, 10hex
or $s1, $s1, $t1
li $s2, 0Fhex
li $s3, F0hex
li $t0, 00hex
li $a0, 10hex
li $a1, 05hex
BegWhile: li $b, 0
sgt $t1, $a1, $zero
beb $t1, EndWhile
li $t1, 1
sub $a1, $a1, $t1
lw $a0(0)
add $t0, $mr, $zero
li $t1, 1hex
li $b, 8
sl $t1, $t1, $b
li $b, 0
sgt $t1, $t0, $t1
beb $t1, Else
li $t1, 3
sr $s0, $s0, $t1
or $s1, $s1, $s0
li $t1, FFhex
li $b, 8
sl $mt, $t1, $b
sw $a0(0)
j EndIf
Else: li $t1, 2
sl $s2, $s2, $t1
nor $t1, $s2, $zero
and $s3, $s3, $t1
li $mt, FFhex
sw $a0(0)
EndIf: li $t1, 2
add $a0, $a0, $t1
j BegWhile
EndWhile: j $ra
3. Tasks and Schedule
James Key:
1st and 2nd week: Datapath
2nd week: Hazard detection
3rd and 4th week: Coding
4th week: Report
Josh Caldwell:
1st and 2nd week: Datapath
2nd week: Control unit
3rd and 4th week: Coding
4th week: Report
Whole time: Web Design
4. Planned Meetings
- Thursday March 31: 2-3pm
- Thursday April 7: 2-3pm
- Thursday April 14: 2-3pm
- Wednesday April 20: 5-?pm
5. Team Name
NomNom
http://www.youtube.com/watch?v=SMWi7CLoZ2Q
Progress
April 5: Finished setting up datapath. Now work on hazard detection/control unit is underway.
April 12: Finishing up hazard detection and the control unit.
April 20: Coding done, but debugging was never finished.
Project Final Report
1. Team Name
NomNom
2. Meetings (March 31 - April 20)
Total number of meetings: 5
| Name | Meetings attended | Absences | Total attendance time |
| Josh Caldwell | 5 | None | 10 hours |
| James Key | 5 | None | 15 hours |
3. Implementation Diagram
- A series of muxes added before the PC that allow branch and jump addresses to be used
- Second read address on register file has a mux that selects between the Rt field and the $b field
- Register file always outputs the contents of $mt to the ID/EX buffer
- Rd is not directly passed to ID/EX buffer. A mux is used to select a write destination from Rs, Rd, and $mr.
- Sign extend unit provided in ID stage
- The writeData for the Data Memory cell is always tied to the $mt line from the EX/MEM buffer, which subsequently is passed from the ID/EX buffer
- Branch logic added in the MEM stage
4. ALU Control
The ALU has a 4-bit control signal, ALUOp, that selects the operation to perform. ALUOp is set by the control unit depending on the instruction. For example, an instruction with opcode 4'b0010 sets ALUOp to 4'b0001 for a subtract. We have the following ALU operations: add, subtract, and, or, nor, shift left, shift right, a < b, and b < a. The zero bit is set high whenever the result of an operation is zero.
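A behavioural model of the ALU can be sketched in Python as follows. Only the subtract code (0001) is given in the text. The other ALUOp values are an assumption that follows the order in which the operations are listed. The unsigned comparisons, the logical right shift, and the 4-bit shift amount are also assumptions made for this sketch.

```python
MASK16 = 0xFFFF  # 16-bit datapath

# ALUOp -> operation. Only 0b0001 (subtract) is confirmed by the text;
# the remaining codes are assumed from the listed order.
ALU_OPS = {
    0b0000: lambda a, b: a + b,            # add
    0b0001: lambda a, b: a - b,            # subtract
    0b0010: lambda a, b: a & b,            # and
    0b0011: lambda a, b: a | b,            # or
    0b0100: lambda a, b: ~(a | b),         # nor
    0b0101: lambda a, b: a << (b & 0xF),   # shift left
    0b0110: lambda a, b: a >> (b & 0xF),   # shift right (logical, assumed)
    0b0111: lambda a, b: int(a < b),       # a < b (unsigned, assumed)
    0b1000: lambda a, b: int(b < a),       # b < a (unsigned, assumed)
}


def alu(alu_op, a, b):
    """Return (result, zero) with the result truncated to 16 bits."""
    a &= MASK16
    b &= MASK16
    result = ALU_OPS[alu_op](a, b) & MASK16
    return result, int(result == 0)


print(alu(0b0001, 5, 5))      # subtract with equal operands sets zero
print(alu(0b0101, 0x10, 8))   # shift left by 8
```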
5. Control Unit
The control unit has one input, instrControl, which is the instruction from the program memory. The unit provides the following outputs:
- RegWrite: controls if we will write a value into the register in the ID phase
- MemtoReg: controls if either the memory or the ALU result will be sent to the ID stage to be written
- Branch: helps decide if there will be a branch
- MemRead: decides if the memory will be read to load a word from memory
- MemWrite: set high if the memory will be written to store a word into memory
- RegDst: selects the write destination, which is Rs for a load immediate, Rd for an R-type, or $mr for a load word instruction, depending on how liRlw is set
- ALUSrc1: control whether we want to send Rs or zero to the ALU
- ALUSrc2: control if we want to send Rt or the sign-extended number to the ALU
- ALUOp: operation set to perform the correct arithmetic operation in the ALU
- doJmp: controls the mux before the PC that decides whether a jump address is loaded into the PC
- liRlw: indicates whether a load immediate, R-type, or load word instruction is going to the ID/EX buffer
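The write-destination selection behind RegDst can be illustrated with a short Python sketch. The opcodes come from the instruction table in the proposal. The function name is hypothetical, and the register number used for $mr (6, its position in the register list) is an assumption.

```python
LI = 0b1010                       # load immediate
LW = 0b1011                       # load word
R_TYPE = range(0b0001, 0b1010)    # add (0001) through sgt (1001)
MR = 6                            # assumed register number of $mr


def write_destination(opcode, rs, rd):
    """Return the register number to write, or None if nothing is written."""
    if opcode == LI:
        return rs      # I-format has no Rd, so li writes to Rs
    if opcode in R_TYPE:
        return rd      # R-type instructions write to Rd
    if opcode == LW:
        return MR      # loads always write to $mr
    return None        # nop, sw, beb and j write no register here


print(write_destination(LI, rs=9, rd=0))       # 9
print(write_destination(0b0001, rs=10, rd=9))  # 9
print(write_destination(LW, rs=8, rd=0))       # 6
```

jal is left out of this sketch because the text does not say how its write to $ra is routed.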
6. Forwarding Unit
The forwarding unit takes in IDEX_Rs, IDEX_Rt, EXMEM_Rd, EXMEM_RegWrite, MEMWB_Rd, and MEMWB_RegWrite. It outputs ForwardA and ForwardB, which are set to 2'b00 by default. When a data dependency is detected between the operation in execution and the previous operation located in the MEM stage, the corresponding forward signal (ForwardA/ForwardB) is set to 2'b10. This checks for a hazard in the execution stage, that is, one caused by an arithmetic operation.
When a data dependency is detected between the operation in execution and the earlier operation located in the WB stage, the corresponding forward signal (ForwardA/ForwardB) is set to 2'b01.
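The forwarding decision can be sketched in Python as below. Two points are assumptions borrowed from the usual textbook pipeline rather than from this report: ForwardA is taken to guard the Rs operand and ForwardB the Rt operand, and the MEM-stage match takes priority when both stages match. The sketch does not special-case $zero.

```python
def _forward(idex_reg, exmem_rd, exmem_regwrite, memwb_rd, memwb_regwrite):
    """Forward select for one ALU operand."""
    if exmem_regwrite and exmem_rd == idex_reg:
        return 0b10    # take the value from the EX/MEM buffer
    if memwb_regwrite and memwb_rd == idex_reg:
        return 0b01    # take the value from the MEM/WB buffer
    return 0b00        # no dependency: use the register file value


def forwarding_unit(idex_rs, idex_rt, exmem_rd, exmem_regwrite,
                    memwb_rd, memwb_regwrite):
    """Return (ForwardA, ForwardB)."""
    forward_a = _forward(idex_rs, exmem_rd, exmem_regwrite,
                         memwb_rd, memwb_regwrite)
    forward_b = _forward(idex_rt, exmem_rd, exmem_regwrite,
                         memwb_rd, memwb_regwrite)
    return forward_a, forward_b


# Rs depends on the instruction in MEM, Rt on the one in WB.
print(forwarding_unit(idex_rs=9, idex_rt=10, exmem_rd=9, exmem_regwrite=True,
                      memwb_rd=10, memwb_regwrite=True))
```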
7. Hazard Detection Unit
The hazard detection unit has the inputs IDEX_MemRead, IDEX_Rt, IFID_Rs, and IFID_Rt and the outputs stall and PCWrite. If IDEX_MemRead is high and IDEX_Rt equals IFID_Rs or IFID_Rt, then a stall signal is sent into a mux in the Instruction Decode stage that causes a “nop” when set high, and PCWrite is set low to make sure that the PC isn’t incremented.
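The stall condition above translates directly into a small Python function. This is a sketch of the stated condition only; the function name and the boolean return convention are choices made for illustration.

```python
def hazard_detection(idex_memread, idex_rt, ifid_rs, ifid_rt):
    """Return (stall, pc_write) for the load-use condition."""
    stall = bool(idex_memread) and (idex_rt == ifid_rs or idex_rt == ifid_rt)
    pc_write = not stall   # PC is held while the pipeline stalls
    return stall, pc_write


# A load in EX whose register matches a source of the instruction in ID.
print(hazard_detection(idex_memread=True, idex_rt=6, ifid_rs=6, ifid_rt=0))
# No load in EX, so matching register numbers do not cause a stall.
print(hazard_detection(idex_memread=False, idex_rt=6, ifid_rs=6, ifid_rt=0))
```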
8. Simulation Results
- Unsuccessful
9. Discussion
We were able to come up with and produce what seemed to be a coherent design. However, we were unable to synthesize our code to a form capable of simulation due to issues with Xilinx. We realize that this failure carries a hefty penalty on the grade for this project, but all of the necessary code for creating our architecture is attached.





