# Pipelined MIPS Processor in Verilog (Part-1)

Last time, I posted Verilog code for a 16-bit single-cycle MIPS processor, and there were several requests for Verilog code for a 32-bit 5-stage pipelined MIPS processor. The first problem with the single-cycle MIPS is that it wastes area: each functional unit is used only once per clock cycle. Another serious drawback is that the clock period is determined by the longest possible path in the processor. The pipelined MIPS addresses both problems: it keeps most functional units busy in every clock cycle and improves performance by increasing instruction throughput. However, the pipelined MIPS also faces challenges such as control and data hazards.

Today, a 32-bit 5-stage pipelined MIPS processor will be designed and implemented in Verilog.

Verilog code for the special modules that resolve hazards (the Forwarding Unit, Flush Control Unit, and Stall Control Unit) will also be provided. The Verilog code for the 32-bit pipelined MIPS processor is written mostly using structural modeling.

This project is quite long, so I will divide it into three parts (Part 1, Part 2, and Part 3).

Figure: Pipelined MIPS Design Flow

The design flow for the 32-bit pipelined MIPS processor is shown in the figure above. First, the instruction set of the MIPS processor is as follows:

1. ADD rd, rs, rt: Reg[rd] = Reg[rs] + Reg[rt].
2. BNE rs, rt, imm16: if (Reg[rs] != Reg[rt]) PC = PC + 4 + (Sign_ext(Imm16) << 2) else PC = PC + 4.
3. J target: PC = { PC[31:28], target, 00 }.
4. JR rs: PC = Reg[rs].
5. LW rt, imm16(rs): Reg[rt] = Mem[Reg[rs] + Sign_ext(Imm16)].
6. SLT rd, rs, rt: if (Reg[rs] < Reg[rt]) Reg[rd] = 0x00000001 else Reg[rd] = 0x00000000.
7. SUB rd, rs, rt: Reg[rd] = Reg[rs] - Reg[rt].
8. SW rt, imm16(rs): Mem[Reg[rs] + Sign_ext(Imm16)] = Reg[rt].
9. XORI rt, rs, imm16: Reg[rt] = Reg[rs] XOR Zero_ext(Imm16).
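
The next-PC rules for BNE, J, and JR in the list above can be summarized in a few lines of behavioral Verilog. This is only a sketch for checking your understanding: the module name `next_pc_sketch`, its port names, and the small testbench are hypothetical, and the project itself builds this logic structurally. The opcode and funct values are the standard MIPS encodings.

```
`timescale 1 ps / 100 fs
// Behavioral sketch only (hypothetical names, not part of the project code).
module next_pc_sketch(next_pc, pc, instr, rs_val, rt_val);
output [31:0] next_pc;
input  [31:0] pc, instr, rs_val, rt_val;

wire [31:0] pc4      = pc + 32'd4;
wire [31:0] sign_ext = {{16{instr[15]}}, instr[15:0]}; // Sign_ext(Imm16)
wire [5:0]  op       = instr[31:26];
wire [5:0]  funct    = instr[5:0];

wire is_bne = (op == 6'b000101);
wire is_j   = (op == 6'b000010);
wire is_jr  = (op == 6'b000000) && (funct == 6'b001000);

// J:   PC = { PC[31:28], target, 00 }
// JR:  PC = Reg[rs]
// BNE: PC = PC + 4 + (Sign_ext(Imm16) << 2) when Reg[rs] != Reg[rt]
assign next_pc = is_j                          ? {pc[31:28], instr[25:0], 2'b00} :
                 is_jr                         ? rs_val :
                 (is_bne && (rs_val != rt_val)) ? pc4 + (sign_ext << 2) :
                                                 pc4;
endmodule

// Minimal testbench: prints the next PC for one taken and one untaken BNE
module next_pc_sketch_tb();
reg  [31:0] pc, instr, rs_val, rt_val;
wire [31:0] next_pc;

next_pc_sketch dut(next_pc, pc, instr, rs_val, rt_val);

initial
begin
pc = 32'd24;
instr = 32'b00010110000100011111111111111100; // BNE $16, $17, -4
rs_val = 32'd1; rt_val = 32'd4;               // operands differ: branch taken
#1 $display("BNE taken:     next_pc = %0d", next_pc);
rt_val = 32'd1;                               // operands equal: fall through
#1 $display("BNE not taken: next_pc = %0d", next_pc);
$finish;
end
endmodule
```

The sketch follows the J definition exactly as written in the list, using the upper four bits of the current PC.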

From the instruction set architecture, the single-cycle datapath with the control unit of the MIPS processor is obtained, as shown below.

Figure: Single-Cycle MIPS Datapath with Control Unit

The Verilog code for the single-cycle MIPS processor datapath is presented first.

#### Verilog code for instruction memory:

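```
/* Instruction memory module.  Change the $readmemb line to have the name of the program you want to load */
// fpga4student.com: FPGA projects, Verilog Projects, VHDL projects
// Verilog project: 32-bit 5-stage Pipelined MIPS Processor in Verilog
// Instruction memory module
`timescale 1 ps / 100 fs

// NOTE: the module header, the address port, the memory read and the
// $readmemb call were missing from this listing and are reconstructed here.
// The module name InstructionMem is an assumption.
module InstructionMem(instruction, address);
input [31:0] address;          // byte address coming from the PC
output [31:0] instruction;
reg [31:0]instrmem[1023:0];    // 1024 words of 32 bits
reg [31:0] temp;

// Output buffers model a 1000 ps memory access delay
buf #1000 buf0(instruction[0],temp[0]),
buf1(instruction[1],temp[1]),
buf2(instruction[2],temp[2]),
buf3(instruction[3],temp[3]),
buf4(instruction[4],temp[4]),
buf5(instruction[5],temp[5]),
buf6(instruction[6],temp[6]),
buf7(instruction[7],temp[7]),
buf8(instruction[8],temp[8]),
buf9(instruction[9],temp[9]),
buf10(instruction[10],temp[10]),
buf11(instruction[11],temp[11]),
buf12(instruction[12],temp[12]),
buf13(instruction[13],temp[13]),
buf14(instruction[14],temp[14]),
buf15(instruction[15],temp[15]),
buf16(instruction[16],temp[16]),
buf17(instruction[17],temp[17]),
buf18(instruction[18],temp[18]),
buf19(instruction[19],temp[19]),
buf20(instruction[20],temp[20]),
buf21(instruction[21],temp[21]),
buf22(instruction[22],temp[22]),
buf23(instruction[23],temp[23]),
buf24(instruction[24],temp[24]),
buf25(instruction[25],temp[25]),
buf26(instruction[26],temp[26]),
buf27(instruction[27],temp[27]),
buf28(instruction[28],temp[28]),
buf29(instruction[29],temp[29]),
buf30(instruction[30],temp[30]),
buf31(instruction[31],temp[31]);

// Read the word selected by the byte address (word index = address / 4)
always @(address)
begin
temp = instrmem[address[11:2]];
end

// Load the program image; use $readmemh for a hexadecimal file
initial
begin
$readmemb("instr.txt", instrmem);
end

endmodule

// Testbench: steps through the first few word addresses
module instrmemstimulous();

reg [31:0] addr;
wire [31:0] instr;

InstructionMem instructionmemory(instr, addr);

initial
begin
$monitor("Mem Address=%h instruction=%b", addr, instr);
addr = 32'd0;
#10000 addr = 32'd4;
#10000 addr = 32'd8;
#10000;
$finish;
end

endmodule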
```

The instructions that you want to load into the instruction memory need to be saved in "instr.txt" in binary format. If you want to save them in hexadecimal format, replace the $readmemb call with $readmemh in the Verilog code of the instruction memory. Below is an example of the "instr.txt" file:
```
00111000000100000000000000000011
00111000000100010000000000000100
00001000000000000000000000000101
00111000000100000000000000000001
00111000000100010000000000000001
00000010001100001001000000100010
00010110000100011111111111111100
00000010000100011001100000100000
10101110010100110000000000010000
10001110010101000000000000010000
00000010000101001010100000101010
10001110010100110000000000010000
00111010010100110000000000000001
00111010101101010000000000000001
00000010101000000000000000001000
```
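
To read the binary words above more easily, you can split each one into its instruction fields in simulation. The following is a minimal sketch and not part of the original project: the module name `instr_decode_sketch` and the task name `show` are made up for illustration, and the field positions are those of the standard MIPS R-type and I-type formats.

```
`timescale 1 ps / 100 fs
// Hypothetical helper (not part of the project code): prints the standard
// MIPS fields of a few words copied from the instr.txt example.
module instr_decode_sketch();

task show;
input [31:0] w;
begin
$display("op=%b rs=%0d rt=%0d rd=%0d funct=%b imm=%h",
w[31:26], w[25:21], w[20:16], w[15:11], w[5:0], w[15:0]);
end
endtask

initial
begin
show(32'b00111000000100000000000000000011); // line 1: XORI $16, $0, 3
show(32'b00000010001100001001000000100010); // line 6: SUB  $18, $17, $16
show(32'b00010110000100011111111111111100); // line 7: BNE  $16, $17, -4
$finish;
end

endmodule
```

For I-type instructions such as XORI and BNE, the printed rd and funct values are just slices of the immediate and can be ignored.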

#### Verilog code for 32-bit Adder:

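```
`timescale 1 ps / 100 fs
// fpga4student.com: FPGA projects, Verilog Projects, VHDL projects
// Verilog project: 32-bit 5-stage Pipelined MIPS Processor in Verilog
// Verilog code for 32-bit adder
// NOTE: both module headers and the chain of 1-bit adders were missing from
// this listing and are reconstructed here. The names Add and adder1bit are
// assumptions.
module Add(S,A,B);
// The adder ignores overflow: the final carry C[31] is left unused
output [31:0] S;
input [31:0] A,B;
wire [31:0] C;

// Ripple-carry chain: bit 0 has no carry-in, bit i takes C[i-1]
adder1bit adder1bit0(S[0],C[0],A[0],B[0],1'b0);
genvar i;
generate
for (i = 1; i < 32; i = i + 1) begin : ripple
adder1bit adder1bit_i(S[i],C[i],A[i],B[i],C[i-1]);
end
endgenerate

endmodule

//---------------------------------------------------------------------------------------------------

`timescale 1 ps / 100 fs
module adder1bit(sum,cout,a,b,cin);
input   a,b,cin;
output  cout,sum;
wire c1,c2,c3;
// sum = a xor b xor cin
xor #(50) (sum,a,b,cin);
// carry out = a.b + cin.(a+b)
and #(50) and1(c1,a,b);
or #(50) or1(c2,a,b);
and #(50) and2(c3,c2,cin);
or #(50) or2(cout,c1,c3);
endmodule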
```
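
Before wiring the adder into the datapath, it can help to check the ripple-carry idea on a small width where every input pair can be tried. The sketch below is self-contained and uses hypothetical names (`fa_sketch`, `ripple_sketch`, `ripple_sketch_tb`); it reuses the same gate-level carry equation, carry out = a.b + cin.(a+b), on a 4-bit chain and counts mismatches against the built-in `+` operator.

```
`timescale 1 ps / 100 fs
// Self-contained sketch; all module names here are hypothetical.
module fa_sketch(sum, cout, a, b, cin);
output sum, cout;
input  a, b, cin;
wire c1, c2, c3;
xor #(50) x1(sum, a, b, cin);   // sum = a xor b xor cin
and #(50) a1(c1, a, b);
or  #(50) o1(c2, a, b);
and #(50) a2(c3, c2, cin);
or  #(50) o2(cout, c1, c3);     // cout = a.b + cin.(a+b)
endmodule

module ripple_sketch(S, A, B);
parameter N = 4;
output [N-1:0] S;
input  [N-1:0] A, B;
wire   [N:0]   C;
assign C[0] = 1'b0;             // no carry into bit 0
genvar i;
generate
for (i = 0; i < N; i = i + 1) begin : stage
fa_sketch fa(S[i], C[i+1], A[i], B[i], C[i]);
end
endgenerate
endmodule

// Exhaustive check of all 256 input pairs for the 4-bit chain
module ripple_sketch_tb();
reg  [3:0] A, B;
wire [3:0] S;
integer a, b, errors;

ripple_sketch #(4) dut(S, A, B);

initial
begin
errors = 0;
for (a = 0; a < 16; a = a + 1)
for (b = 0; b < 16; b = b + 1)
begin
A = a; B = b;
#1000; // wait well beyond the worst-case ripple delay of the chain
if (S !== ((a + b) & 15))
errors = errors + 1;
end
$display("mismatches: %0d", errors);
$finish;
end
endmodule
```

The wait of 1000 ps is chosen to be longer than the carry path through four stages of 50 ps gates; a 32-bit chain needs a proportionally longer settling time.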

#### Verilog code for register file:

```
`timescale 1 ps / 100 fs
// fpga4student.com: FPGA projects, Verilog Projects, VHDL projects
// Verilog project: 32-bit 5-stage Pipelined MIPS Processor in Verilog
// Register file
// NOTE: the read ports and both read multiplexer instances were truncated in
// this listing and are reconstructed here. The names ReadData1, ReadData2,
// ReadRegister1, ReadRegister2 and mux32x32to32 are assumptions.
module regfile(
ReadData1,
ReadData2,
WriteData,
ReadRegister1,
ReadRegister2,
WriteRegister,
RegWrite,
reset,
clk);

input [4:0] ReadRegister1,ReadRegister2,WriteRegister;
input [31:0] WriteData;
input RegWrite,reset, clk;
output [31:0] ReadData1, ReadData2;
wire [31:0] WriteEn;
wire [31:0] RegArray [0:31];
integer i;
//----Decoder Block
decoder Decoder1( WriteEn,RegWrite,WriteRegister);
// Register 0 always loads the constant zero
register reg0 (RegArray[0],32'b0,1'b1,1'b0, clk);
register reg1 (RegArray[1],WriteData,WriteEn[1],reset,clk);
register reg2 (RegArray[2],WriteData,WriteEn[2],reset,clk);
register reg3 (RegArray[3],WriteData,WriteEn[3],reset,clk);
register reg4 (RegArray[4],WriteData,WriteEn[4],reset,clk);
register reg5 (RegArray[5],WriteData,WriteEn[5],reset,clk);
register reg6 (RegArray[6],WriteData,WriteEn[6],reset,clk);
register reg7 (RegArray[7],WriteData,WriteEn[7],reset,clk);
register reg8 (RegArray[8],WriteData,WriteEn[8],reset,clk);
register reg9 (RegArray[9],WriteData,WriteEn[9],reset,clk);
register reg10 (RegArray[10],WriteData,WriteEn[10],reset,clk);
register reg11 (RegArray[11],WriteData,WriteEn[11],reset,clk);
register reg12 (RegArray[12],WriteData,WriteEn[12],reset,clk);
register reg13 (RegArray[13],WriteData,WriteEn[13],reset,clk);
register reg14 (RegArray[14],WriteData,WriteEn[14],reset,clk);
register reg15 (RegArray[15],WriteData,WriteEn[15],reset,clk);
register reg16 (RegArray[16],WriteData,WriteEn[16],reset,clk);
register reg17 (RegArray[17],WriteData,WriteEn[17],reset,clk);
register reg18 (RegArray[18],WriteData,WriteEn[18],reset,clk);
register reg19 (RegArray[19],WriteData,WriteEn[19],reset,clk);
register reg20 (RegArray[20],WriteData,WriteEn[20],reset,clk);
register reg21 (RegArray[21],WriteData,WriteEn[21],reset,clk);
register reg22 (RegArray[22],WriteData,WriteEn[22],reset,clk);
register reg23 (RegArray[23],WriteData,WriteEn[23],reset,clk);
register reg24 (RegArray[24],WriteData,WriteEn[24],reset,clk);
register reg25 (RegArray[25],WriteData,WriteEn[25],reset,clk);
register reg26 (RegArray[26],WriteData,WriteEn[26],reset,clk);
register reg27 (RegArray[27],WriteData,WriteEn[27],reset,clk);
register reg28 (RegArray[28],WriteData,WriteEn[28],reset,clk);
register reg29 (RegArray[29],WriteData,WriteEn[29],reset,clk);
register reg30 (RegArray[30],WriteData,WriteEn[30],reset,clk);
register reg31 (RegArray[31],WriteData,WriteEn[31],reset,clk);
//----32x32to32 Multiplexor1 Block----
mux32x32to32 Mux1(ReadData1,
RegArray[0],RegArray[1],RegArray[2],RegArray[3],RegArray[4],RegArray[5],RegArray[6],RegArray[7],
RegArray[8],RegArray[9],RegArray[10],RegArray[11],RegArray[12],RegArray[13],RegArray[14],RegArray[15],RegArray[16],RegArray[17],
RegArray[18], RegArray[19],RegArray[20],RegArray[21],RegArray[22],RegArray[23],RegArray[24],RegArray[25],RegArray[26],
RegArray[27],RegArray[28],RegArray[29],RegArray[30],RegArray[31],
ReadRegister1
);

//----32x32to32 Multiplexor2 Block----
mux32x32to32 Mux2(ReadData2,
RegArray[0],RegArray[1],RegArray[2],RegArray[3],RegArray[4],RegArray[5],RegArray[6],RegArray[7],
RegArray[8],RegArray[9],RegArray[10],RegArray[11],RegArray[12],RegArray[13],RegArray[14],RegArray[15],RegArray[16],RegArray[17],
RegArray[18], RegArray[19],RegArray[20],RegArray[21],RegArray[22],RegArray[23],RegArray[24],RegArray[25],RegArray[26],
RegArray[27],RegArray[28],RegArray[29],RegArray[30],RegArray[31],
ReadRegister2
);
endmodule

//------------DFF-------------------
module D_FF (q, d, reset, clk);
output q;
input d, reset, clk;
reg q; // Indicate that q is stateholding

// Non-blocking assignments avoid simulation races between flip-flops
always @(posedge clk or posedge reset)
if (reset)
q <= 0; // On reset, set to 0
else
q <= d; // Otherwise out = d
endmodule
// 1 bit register
module RegBit(BitOut, BitData, WriteEn,reset, clk);
output BitOut; // 1 bit of register
input BitData, WriteEn;
input reset,clk;
wire d,f1, f2; // input of D Flip-Flop
wire reset;
//assign reset=0;
and #(50) U1(f1, BitOut, (~WriteEn));
and #(50) U2(f2, BitData, WriteEn);
or  #(50) U3(d, f1, f2);
D_FF DFF0(BitOut, d, reset, clk);
endmodule

//32 bit register
module register(RegOut,RegIn,WriteEn,reset,clk);
output [31:0] RegOut;
input [31:0] RegIn;
input WriteEn,reset, clk;
RegBit bit31(RegOut[31],RegIn[31],WriteEn,reset,clk);
RegBit bit30(RegOut[30],RegIn[30],WriteEn,reset,clk);
RegBit bit29(RegOut[29],RegIn[29],WriteEn,reset,clk);
RegBit bit28(RegOut[28],RegIn[28],WriteEn,reset,clk);
RegBit bit27(RegOut[27],RegIn[27],WriteEn,reset,clk);
RegBit bit26(RegOut[26],RegIn[26],WriteEn,reset,clk);
RegBit bit25(RegOut[25],RegIn[25],WriteEn,reset,clk);
RegBit bit24(RegOut[24],RegIn[24],WriteEn,reset,clk);
RegBit bit23(RegOut[23],RegIn[23],WriteEn,reset,clk);
RegBit bit22(RegOut[22],RegIn[22],WriteEn,reset,clk);
RegBit bit21(RegOut[21],RegIn[21],WriteEn,reset,clk);
RegBit bit20(RegOut[20],RegIn[20],WriteEn,reset,clk);
RegBit bit19(RegOut[19],RegIn[19],WriteEn,reset,clk);
RegBit bit18(RegOut[18],RegIn[18],WriteEn,reset,clk);
RegBit bit17(RegOut[17],RegIn[17],WriteEn,reset,clk);
RegBit bit16(RegOut[16],RegIn[16],WriteEn,reset,clk);
RegBit bit15(RegOut[15],RegIn[15],WriteEn,reset,clk);
RegBit bit14(RegOut[14],RegIn[14],WriteEn,reset,clk);
RegBit bit13(RegOut[13],RegIn[13],WriteEn,reset,clk);
RegBit bit12(RegOut[12],RegIn[12],WriteEn,reset,clk);
RegBit bit11(RegOut[11],RegIn[11],WriteEn,reset,clk);
RegBit bit10(RegOut[10],RegIn[10],WriteEn,reset,clk);
RegBit bit9 (RegOut[9], RegIn[9], WriteEn,reset,clk);
RegBit bit8 (RegOut[8], RegIn[8], WriteEn,reset,clk);
RegBit bit7 (RegOut[7], RegIn[7], WriteEn,reset,clk);
RegBit bit6 (RegOut[6], RegIn[6], WriteEn,reset,clk);
RegBit bit5 (RegOut[5], RegIn[5], WriteEn,reset,clk);
RegBit bit4 (RegOut[4], RegIn[4], WriteEn,reset,clk);
RegBit bit3 (RegOut[3], RegIn[3], WriteEn,reset,clk);
RegBit bit2 (RegOut[2], RegIn[2], WriteEn,reset,clk);
RegBit bit1 (RegOut[1], RegIn[1], WriteEn,reset,clk);
RegBit bit0 (RegOut[0], RegIn[0], WriteEn,reset,clk);

endmodule

// Decoder
module decoder(WriteEn,RegWrite, WriteRegister);
input RegWrite;
input [4:0] WriteRegister;
output [31:0] WriteEn;
wire [31:0] OE; // Output Enable
dec5to32 dec(OE,WriteRegister);
assign WriteEn[0]=0;
and  #(50) gate1(WriteEn[1],OE[1],RegWrite);
and  #(50) gate2(WriteEn[2],OE[2],RegWrite);
and  #(50) gate3(WriteEn[3],OE[3],RegWrite);
and  #(50) gate4(WriteEn[4],OE[4],RegWrite);
and  #(50) gate5(WriteEn[5],OE[5],RegWrite);
and  #(50) gate6(WriteEn[6],OE[6],RegWrite);
and  #(50) gate7(WriteEn[7],OE[7],RegWrite);
and  #(50) gate8(WriteEn[8],OE[8],RegWrite);
and  #(50) gate9(WriteEn[9],OE[9],RegWrite);
and  #(50) gate10(WriteEn[10],OE[10],RegWrite);
and  #(50) gate11(WriteEn[11],OE[11],RegWrite);
and  #(50) gate12(WriteEn[12],OE[12],RegWrite);
and  #(50) gate13(WriteEn[13],OE[13],RegWrite);
and  #(50) gate14(WriteEn[14],OE[14],RegWrite);
and  #(50) gate15(WriteEn[15],OE[15],RegWrite);
and  #(50) gate16(WriteEn[16],OE[16],RegWrite);
and  #(50) gate17(WriteEn[17],OE[17],RegWrite);
and  #(50) gate18(WriteEn[18],OE[18],RegWrite);
and  #(50) gate19(WriteEn[19],OE[19],RegWrite);
and  #(50) gate20(WriteEn[20],OE[20],RegWrite);
and  #(50) gate21(WriteEn[21],OE[21],RegWrite);
and  #(50) gate22(WriteEn[22],OE[22],RegWrite);
and  #(50) gate23(WriteEn[23],OE[23],RegWrite);
and  #(50) gate24(WriteEn[24],OE[24],RegWrite);
and  #(50) gate25(WriteEn[25],OE[25],RegWrite);
and  #(50) gate26(WriteEn[26],OE[26],RegWrite);
and  #(50) gate27(WriteEn[27],OE[27],RegWrite);
and  #(50) gate28(WriteEn[28],OE[28],RegWrite);
and  #(50) gate29(WriteEn[29],OE[29],RegWrite);
and  #(50) gate30(WriteEn[30],OE[30],RegWrite);
and  #(50) gate31(WriteEn[31],OE[31],RegWrite);
endmodule
module andmore(g,a,b,c,d,e);
output g;
input a,b,c,d,e;
and #(50) and1(f1,a,b,c,d),
and2(g,f1,e);
endmodule
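
// 5-to-32 decoder
// NOTE: the module header, the inverters and the AND terms for outputs 1..31
// were missing from this listing and are reconstructed here. The port name
// Adr is an assumption.
module dec5to32(Out,Adr);
input [4:0] Adr; // Adr = address of the register
output [31:0] Out;
wire Nota,Notb,Notc,Notd,Note;
not #(50) Inv4(Nota, Adr[4]);
not #(50) Inv3(Notb, Adr[3]);
not #(50) Inv2(Notc, Adr[2]);
not #(50) Inv1(Notd, Adr[1]);
not #(50) Inv0(Note, Adr[0]);

andmore a0(Out[0],  Nota,Notb,Notc,Notd,Note); // 00000
// Outputs 1..31 follow the same pattern: an address bit is used directly
// where the output index has a 1 and inverted where it has a 0.
genvar n;
generate
for (n = 1; n < 32; n = n + 1) begin : dec
andmore a(Out[n],
((n >> 4) & 1) ? Adr[4] : Nota,
((n >> 3) & 1) ? Adr[3] : Notb,
((n >> 2) & 1) ? Adr[2] : Notc,
((n >> 1) & 1) ? Adr[1] : Notd,
(n & 1)        ? Adr[0] : Note);
end
endgenerate
endmodule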

//------------module multiplexor 32 to 1----------------
module mux32to1(Out, In , Select);
output Out;
input [31:0] In;
input [4:0] Select;
wire [31:0] OE,f; // OE = Output Enable
dec5to32 dec1(OE,Select);

and  #(50) g_0(f[0],OE[0],In[0]);
and  #(50) g_1(f[1],OE[1],In[1]);
and  #(50) g_2(f[2],OE[2],In[2]);
and  #(50) g_3(f[3],OE[3],In[3]);
and  #(50) g_4(f[4],OE[4],In[4]);
and  #(50) g_5(f[5],OE[5],In[5]);
and  #(50) g_6(f[6],OE[6],In[6]);
and  #(50) g_7(f[7],OE[7],In[7]);
and  #(50) g_8(f[8],OE[8],In[8]);
and  #(50) g_9(f[9],OE[9],In[9]);
and  #(50) g_10(f[10],OE[10],In[10]);
and  #(50) g_11(f[11],OE[11],In[11]);
and  #(50) g_12(f[12],OE[12],In[12]);
and  #(50) g_13(f[13],OE[13],In[13]);
and  #(50) g_14(f[14],OE[14],In[14]);
and  #(50) g_15(f[15],OE[15],In[15]);
and  #(50) g_16(f[16],OE[16],In[16]);
and  #(50) g_17(f[17],OE[17],In[17]);
and  #(50) g_18(f[18],OE[18],In[18]);
and  #(50) g_19(f[19],OE[19],In[19]);
and  #(50) g_20(f[20],OE[20],In[20]);
and  #(50) g_21(f[21],OE[21],In[21]);
and  #(50) g_22(f[22],OE[22],In[22]);
and  #(50) g_23(f[23],OE[23],In[23]);
and  #(50) g_24(f[24],OE[24],In[24]);
and  #(50) g_25(f[25],OE[25],In[25]);
and  #(50) g_26(f[26],OE[26],In[26]);
and  #(50) g_27(f[27],OE[27],In[27]);
and  #(50) g_28(f[28],OE[28],In[28]);
and  #(50) g_29(f[29],OE[29],In[29]);
and  #(50) g_30(f[30],OE[30],In[30]);
and  #(50) g_31(f[31],OE[31],In[31]);

or #(50) gate3(g3,f[0],f[1],f[2],f[3]);
or #(50) gate4(g4,f[4],f[5],f[6],f[7]);
or #(50) gate5(g5,f[8],f[9],f[10],f[11]);
or #(50) gate6(g6,f[12],f[13],f[14],f[15]);
or #(50) gate7(g7,f[16],f[17],f[18],f[19]);
or #(50) gate8(g8,f[20],f[21],f[22],f[23]);
or #(50) gate9(g9,f[24],f[25],f[26],f[27]);
or #(50) gate10(g10,f[28],f[29],f[30],f[31]);
or #(50) gate11(g11,g3,g4,g5,g6);
or #(50) gate12(g12,g7,g8,g9,g10);
or #(50) gate(Out,g11,g12);
endmodule

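//------------module multiplexor 32x32 to 32----------------
// NOTE: the module header, the select and output declarations and the per-bit
// multiplexer instances were missing from this listing and are reconstructed
// here. The names mux32x32to32, ReadData and ReadRegister are assumptions.
module mux32x32to32(ReadData, In0, In1,In2,In3,In4,In5,In6,In7,In8,In9,In10,In11,In12,In13,In14,In15,In16,In17,In18,In19,In20,In21,In22,In23,In24,In25,In26,In27,In28,In29,In30,In31, ReadRegister);
input [31:0] In0, In1,In2,In3,In4,In5,In6,In7,In8,In9,In10,In11,In12,In13,In14,In15,In16,In17,In18,In19,In20,In21,In22,In23,In24,In25,In26,In27,In28,In29,In30,In31;
input [4:0] ReadRegister;
output [31:0] ReadData;
reg [31:0] ArrayReg [0:31];
integer j;
// ArrayReg[j] gathers bit j of every input word, so bit j of the output
// is a 32-to-1 selection over ArrayReg[j]
always @(*)
begin
for (j=0;j<=31;j=j+1)
ArrayReg[j] = {In31[j], In30[j],In29[j],In28[j],In27[j],In26[j],In25[j],In24[j],In23[j],In22[j],In21[j],
In20[j],In19[j],In18[j],In17[j],In16[j],In15[j],In14[j],In13[j],In12[j],In11[j],
In10[j],In9[j],In8[j],In7[j],In6[j],In5[j],In4[j],In3[j],In2[j],In1[j],In0[j]};

end

// One 32-to-1 multiplexer per output bit
genvar k;
generate
for (k = 0; k < 32; k = k + 1) begin : bitmux
mux32to1 mux(ReadData[k], ArrayReg[k], ReadRegister);
end
endgenerate

endmodule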
```
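
The structural register file above is long, so a compact behavioral model is handy as a reference when debugging it in simulation. The sketch below is an assumption-laden equivalent and not the project's implementation: the module names `regfile_behav_sketch` and `regfile_behav_sketch_tb` are hypothetical, and the port names for the two read ports are guesses. It keeps the same ideas: one write port gated by RegWrite, two combinational read ports, and a register 0 that cannot be overwritten.

```
`timescale 1 ps / 100 fs
// Behavioral sketch for comparison only; names are hypothetical.
module regfile_behav_sketch(ReadData1, ReadData2, WriteData,
ReadRegister1, ReadRegister2, WriteRegister, RegWrite, reset, clk);
output [31:0] ReadData1, ReadData2;
input  [31:0] WriteData;
input  [4:0]  ReadRegister1, ReadRegister2, WriteRegister;
input         RegWrite, reset, clk;

reg [31:0] regs [0:31];
integer i;

// Asynchronous reset clears every register; a write happens on the rising
// clock edge and is ignored when the destination is register 0
always @(posedge clk or posedge reset)
begin
if (reset)
begin
for (i = 0; i < 32; i = i + 1)
regs[i] <= 32'b0;
end
else if (RegWrite && (WriteRegister != 5'd0))
regs[WriteRegister] <= WriteData;
end

// Combinational read ports
assign ReadData1 = regs[ReadRegister1];
assign ReadData2 = regs[ReadRegister2];
endmodule

// Testbench: writes register 5, then tries to write register 0
module regfile_behav_sketch_tb();
reg  [31:0] WriteData;
reg  [4:0]  ReadRegister1, ReadRegister2, WriteRegister;
reg         RegWrite, reset, clk;
wire [31:0] ReadData1, ReadData2;

regfile_behav_sketch dut(ReadData1, ReadData2, WriteData,
ReadRegister1, ReadRegister2, WriteRegister, RegWrite, reset, clk);

always #500 clk = ~clk;

initial
begin
clk = 0; reset = 1; RegWrite = 0;
WriteData = 32'b0; WriteRegister = 5'd0;
ReadRegister1 = 5'd5; ReadRegister2 = 5'd0;
#1000 reset = 0;
WriteRegister = 5'd5; WriteData = 32'hDEADBEEF; RegWrite = 1;
#1000 WriteRegister = 5'd0;   // attempted write to register 0
#1000 RegWrite = 0;
$display("reg5 = %h, reg0 = %h", ReadData1, ReadData2);
$finish;
end
endmodule
```

One difference to keep in mind: the structural version adds gate delays on the write-enable and read paths, while this model reads with zero delay, so compare values only after the structural outputs have settled.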
The next part continues with the ALU design and the Verilog code for the ALU.
Note that you need to go through all three parts (Part 1, Part 2, and Part 3) to fully understand the process of designing the pipelined MIPS processor and to collect all the Verilog code required to run it in simulation.

1. Hi sir,
   I executed the following program and I am getting error 10054: "can't open design file instr.txt".

   1. The instructions that you want to load into the instruction memory need to be saved in "instr.txt" in binary format. If you want to save them in hexadecimal format, replace the $readmemb call with $readmemh in the Verilog code of the instruction memory.

2. Hi sir,
   Can you please tell me how to provide the instruction memory data in Cadence? The Cadence tool can't accept the text file as input.

   1. Double-check your tool. It should support reading a text file from Verilog code.

3. OK sir, thank you. How do I write a constraint file for a RISC processor in Cadence?

4. Hi sir, I want code for a 64-bit 5-stage pipelined RISC processor with 32 instructions in double cycle. My requirements are:
   1. ALU 2. Control unit 3. Shift registers 4. Accumulator registers 5. Memory 6. I/O ports 7. Serial ports