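The Vivado project files are shown in the figure:

Launch Vivado and open the project, as shown:

Let Vivado upgrade the project to the current version automatically, as shown:

For now, select the part of the development board you have at hand, as shown:

A warning appears; ignore it for now and click OK:

The complete project contains the files shown in the figure below: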

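Top-down analysis of the convolution layer design
The figure shows one convolution layer of this project. It contains several convolution kernels (filters). The module takes the image matrix and the filter parameters as inputs and outputs the feature matrices extracted by the convolution.

Figure taken from the accompanying technical document "Hardware Documentation"
The schematic of the convolution layer is shown in the figure: filters is 2400 bits wide, image is 16384 bits wide, and the output of this layer is 75264 bits wide.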

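The design of the single-filter layer is shown in the figure. Its inputs are the image matrix image and a single filter filter; its output is the feature matrix produced by that filter.

Figure taken from the accompanying technical document "Hardware Documentation"
The schematic is shown in the figure: filter is 400 bits wide, image is 16384 bits wide, and the output is 12544 bits wide.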

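The convolution unit is shown in the figure. Its inputs are the filter filter and the part of the image image covered by the filter window; it computes the feature extracted from that window.

Figure taken from the accompanying technical document "Hardware Documentation"
The schematic is shown in the figure: filter is 400 bits wide, the image window image is 400 bits wide, and the output is 16 bits wide.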
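The widths quoted above all follow from the 16-bit element size. The following is a minimal sketch that recomputes them in simulation. The module name width_check and the parameter names are illustrative and not part of the project, and the filter count K = 6 is an assumption inferred from 2400 / 400.

```verilog
// Hypothetical helper (not part of the project): recomputes the bus widths.
module width_check;
    localparam DATA_WIDTH = 16;   // float16 element
    localparam H = 32, W = 32;    // input image size
    localparam F = 5;             // filter size
    localparam K = 6;             // number of filters (assumed: 2400 / 400)
    localparam OUT = H - F + 1;   // output pixels per row/column at stride 1

    initial begin
        // convolution unit: one filter, one window, one float16 result
        $display("conv unit   : filter=%0d window=%0d result=%0d",
                 F*F*DATA_WIDTH, F*F*DATA_WIDTH, DATA_WIDTH);
        // single-filter layer: one filter against the whole image
        $display("single layer: filter=%0d image=%0d output=%0d",
                 F*F*DATA_WIDTH, H*W*DATA_WIDTH, OUT*OUT*DATA_WIDTH);
        // multi-filter layer: K filters against the whole image
        $display("multi layer : filters=%0d image=%0d output=%0d",
                 K*F*F*DATA_WIDTH, H*W*DATA_WIDTH, K*OUT*OUT*DATA_WIDTH);
        $finish;
    end
endmodule
```

Run in any Verilog simulator, the three lines should reproduce 400/400/16, 400/16384/12544 and 2400/16384/75264.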

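The implementation of the convolution unit is shown in the figure: it is a multiply-accumulate operation. The convolution itself is a dot product, which boils down to multiplications and additions. In the figure, the inputs are the float16 values A and B and the output is a float16 result.

Figure taken from the accompanying technical document "Hardware Documentation"
The schematic is shown in the figure: the inputs floatA and floatB and the output result are all 16 bits wide.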
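Since every port carries float16 data, it helps to keep the bit layout in mind: 1 sign bit, 5 exponent bits with a bias of 15, and 10 mantissa bits. The sketch below splits one value into these fields. The module name float16_fields is illustrative only; 16'h4500 is the encoding of 5.0 that also appears in the multiplier testbench of this article.

```verilog
// Hypothetical helper (not part of the project): splits a float16 value into its fields.
module float16_fields;
    reg [15:0] x;
    initial begin
        x = 16'h4500; // encoding of 5.0
        // For normalized numbers:
        // value = (-1)^sign * 2^(exponent - 15) * (1 + mantissa / 1024)
        $display("sign=%b exponent=%0d (unbiased %0d) mantissa=%0d/1024",
                 x[15], x[14:10], $signed({1'b0, x[14:10]}) - 15, x[9:0]);
        $finish;
    end
endmodule
```

For 16'h4500 the fields are sign 0, exponent 17 (unbiased 2) and mantissa 256, i.e. 2^2 * (1 + 256/1024) = 5.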

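Bottom-up analysis of each module's function and implementation
As shown in the figure, the Processing Element consists of three units: FM (floatMult16), FADD (floatAdd16) and result_reg.

The complete top-level schematic of the convolution unit is shown in the figure. It processes one filter together with the image region covered by that filter (the window) and outputs a single float16 result.

The Single Filter Layer schematic is shown in the figure. It consists of 1 RF selector and 14 CUs (convolution units). This part convolves one filter with one image and outputs the complete feature map.
Role of the RF selector: it delivers the image data covered by the filter (the windows) to the 14 CUs. The input image has 32x32 elements of 16 bits each, the filter has 5x5 elements of 16 bits each, and the stride is 1, so one image yields 28x28 windows of 5x5x16 bits each. Because the 14 CUs work in parallel, the RF selector output is 14x5x5x16 = 5600 bits wide.
Why 14 CUs? The author's explanation is that the number of LUTs grows exponentially in the single-filter and multi-filter modules; after comparing experiments, the number of CUs was set to half the number of pixels in one row of the output feature map. For example, with a 32x32 input image and a 5x5 filter the output feature map is 28x28, so the number of CUs is 28/2 = 14.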
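The numbers in this paragraph can be recomputed with a few constants. This is a minimal sketch under the parameters stated above (32x32 image, 5x5 filter, stride 1, depth 1); the module name cu_schedule and the constant names are illustrative and do not appear in the project.

```verilog
// Hypothetical helper (not part of the project): CU count, RF selector width, passes.
module cu_schedule;
    localparam DATA_WIDTH = 16, W = 32, H = 32, F = 5, D = 1;
    localparam OUT_W    = W - F + 1;                  // windows per output row
    localparam OUT_H    = H - F + 1;                  // number of output rows
    localparam N_CU     = OUT_W / 2;                  // half an output row per pass
    localparam RF_WIDTH = N_CU * D * F * F * DATA_WIDTH;
    localparam PASSES   = OUT_H * (OUT_W / N_CU);     // two passes per output row

    initial begin
        $display("windows per image : %0d x %0d", OUT_H, OUT_W);
        $display("convolution units : %0d", N_CU);
        $display("RF selector width : %0d bits", RF_WIDTH);
        $display("passes per image  : %0d", PASSES);
        $finish;
    end
endmodule
```

With 14 CUs each output row of 28 pixels takes two passes, so a full 28x28 feature map takes 56 passes.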

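The Multi Filter Layer schematic is shown in the figure. It consists of 2 convLayerSingle instances, i.e. a parallelism of 2. As described above, the inputs of the Multi Filter Layer are the image and 6 filters, so the 6 filters are split into groups of 2 and fed to the convLayerSingle instances in 3 rounds; each round convolves 2 filters with the image.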
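The grouping into rounds can be sketched as follows. The slice expression mirrors the one used in convLayerMulti later in this article (filters[filterSet*2*D*F*F*DATA_WIDTH +: 2*D*F*F*DATA_WIDTH]); the module name filter_rounds and the constant PARALLEL are illustrative assumptions.

```verilog
// Hypothetical helper (not part of the project): which filter bits are used in which round.
module filter_rounds;
    localparam DATA_WIDTH = 16, D = 1, F = 5, K = 6;
    localparam PARALLEL    = 2;                    // convLayerSingle instances
    localparam FILTER_BITS = D * F * F * DATA_WIDTH; // bits per filter
    integer round;

    initial begin
        for (round = 0; round < K / PARALLEL; round = round + 1)
            $display("round %0d: filter slots %0d and %0d, bits [%0d +: %0d]",
                     round, PARALLEL*round, PARALLEL*round + 1,
                     round*PARALLEL*FILTER_BITS, PARALLEL*FILTER_BITS);
        $finish;
    end
endmodule
```

Each round selects an 800-bit slice of the 2400-bit filters bus, so three rounds cover all six filters.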

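Create a new project, as shown:

Enter the project name and location, as shown:

Choose to create an RTL project, as shown:

Click Next:

Click Next again:

Select the target device, as shown:

Finish creating the project:

Add a design source file, as shown:

Create the floatAdd16 file:

The file has been created:

Double-click the file to open it and enter the following code: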
module floatAdd16 (floatA,floatB,sum);
    input [15:0] floatA, floatB; // float16 inputs A and B
    output reg [15:0] sum; // float16 output
    reg sign; // sign bit of the result
    reg signed [5:0] exponent; // exponent of the result; signed so that a negative (underflowed) exponent can be detected
    reg [9:0] mantissa; // mantissa of the result
    reg [4:0] exponentA, exponentB; // exponents of inputs A and B
    reg [10:0] fractionA, fractionB, fraction; // fraction = {1,mantissa}: significand with the implicit leading 1
    reg [7:0] shiftAmount; // shift amount used to align the exponents before adding
    reg cout; // carry (or borrow) out of the significand addition/subtraction
    always @ (floatA or floatB) begin
        exponentA = floatA[14:10];
        exponentB = floatB[14:10];
        fractionA = {1'b1,floatA[9:0]};
        fractionB = {1'b1,floatB[9:0]};
        exponent = exponentA;
        if (floatA == 0) begin // special case (floatA = 0)
            sum = floatB;
        end else if (floatB == 0) begin // special case (floatB = 0)
            sum = floatA;
        end else if ((floatA[14:0] == floatB[14:0]) && (floatA[15] ^ floatB[15])) begin // special case (A = -B)
            sum = 0;
        end else begin
            if (exponentB > exponentA) begin // align the exponents so that A and B share the same exponent
                shiftAmount = exponentB - exponentA;
                fractionA = fractionA >> (shiftAmount);
                exponent = exponentB;
            end else if (exponentA > exponentB) begin
                shiftAmount = exponentA - exponentB;
                fractionB = fractionB >> (shiftAmount);
                exponent = exponentA;
            end
            if (floatA[15] == floatB[15]) begin // A and B have the same sign
                {cout,fraction} = fractionA + fractionB;
                if (cout == 1'b1) begin // carry out: shift right by one and bump the exponent
                    {cout,fraction} = {cout,fraction} >> 1;
                    exponent = exponent + 1;
                end
                sign = floatA[15];
            end else begin // A and B have different signs
                if (floatA[15] == 1'b1) begin // A is negative
                    {cout,fraction} = fractionB - fractionA; // B-A
                end else begin
                    {cout,fraction} = fractionA - fractionB; // A-B
                end
                sign = cout;
                if (cout == 1'b1) begin
                    fraction = -fraction; // the difference is negative: take its absolute value
                end
                // normalize fraction: shift left until the leading 1 is in bit 10 and adjust the exponent
                if (fraction [10] == 0) begin
                    if (fraction[9] == 1'b1) begin
                        fraction = fraction << 1;
                        exponent = exponent - 1;
                    end else if (fraction[8] == 1'b1) begin
                        fraction = fraction << 2;
                        exponent = exponent - 2;
                    end else if (fraction[7] == 1'b1) begin
                        fraction = fraction << 3;
                        exponent = exponent - 3;
                    end else if (fraction[6] == 1'b1) begin
                        fraction = fraction << 4;
                        exponent = exponent - 4;
                    end else if (fraction[5] == 1'b1) begin
                        fraction = fraction << 5;
                        exponent = exponent - 5;
                    end else if (fraction[4] == 1'b1) begin
                        fraction = fraction << 6;
                        exponent = exponent - 6;
                    end else if (fraction[3] == 1'b1) begin
                        fraction = fraction << 7;
                        exponent = exponent - 7;
                    end else if (fraction[2] == 1'b1) begin
                        fraction = fraction << 8;
                        exponent = exponent - 8;
                    end else if (fraction[1] == 1'b1) begin
                        fraction = fraction << 9;
                        exponent = exponent - 9;
                    end else if (fraction[0] == 1'b1) begin
                        fraction = fraction << 10;
                        exponent = exponent - 10;
                    end
                end
            end
            mantissa = fraction[9:0];
            if(exponent[5]==1'b1) begin // exponent is negative: flush the result to zero
                sum = 16'b0000000000000000;
            end
            else begin
                sum = {sign,exponent[4:0],mantissa}; // assemble the result
            end
        end
    end
endmodule
The result is shown in the figure:
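To make the alignment step concrete, the sketch below replays by hand what the same-sign branch of floatAdd16 does for 0.3 + 0.2 (16'h34CD + 16'h3266), the pair used in the testbench of this article. It is a stand-alone illustration and not part of the project; the module name align_example and its signal names are made up for this example.

```verilog
// Hypothetical walk-through (not part of the project) of the same-sign path of floatAdd16.
module align_example;
    reg [10:0] fracA, fracB, frac;
    reg [4:0]  expA, expB;
    reg [5:0]  exp;
    reg        cout;
    initial begin
        fracA = {1'b1, 10'h0CD}; expA = 5'd13; // fields of 16'h34CD (about 0.3)
        fracB = {1'b1, 10'h266}; expB = 5'd12; // fields of 16'h3266 (about 0.2)
        fracB = fracB >> (expA - expB);        // align B to the larger exponent
        exp   = expA;
        {cout, frac} = fracA + fracB;          // 12-bit sum, cout is the carry
        if (cout) begin                        // carry: renormalize
            {cout, frac} = {cout, frac} >> 1;
            exp = exp + 1;
        end
        $display("sum = %h", {1'b0, exp[4:0], frac[9:0]});
        $finish;
    end
endmodule
```

The aligned significands are 1229 and 819; their sum 2048 overflows 11 bits, so the result is shifted right once and the exponent becomes 14, giving 16'h3800 (0.5), the value expected by the testbench comment.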

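Run RTL analysis on the design, as shown:

After analysis, Vivado automatically generates the schematic, as shown:

Run synthesis on the design, as shown:

When synthesis completes, the following dialog pops up; simply close it:

Create the testbench, as shown:

Create the stimulus file and enter its name:

The file has been created:

Double-click the file to open it and enter the stimulus code:
`timescale 100 ns / 10 ps
module tb_floatAdd16();
    reg [15:0] floatA;
    reg [15:0] floatB;
    wire [15:0] sum;
    initial begin
        // A + B = 16'h3800 = 0.5
        #0
        floatA = 16'h34CD; // 0.3
        floatB = 16'h3266; // 0.2
        // A + B = 16'h34CD
        #10
        floatA = 16'h34CD;
        floatB = 16'h0000; // 0
        #10
        $stop;
    end
    floatAdd16 FADD
    (
        .floatA(floatA),
        .floatB(floatB),
        .sum(sum)
    );
endmodule
The result is shown in the figure:

Start the simulation as follows:

Run the simulation, as shown:

Adjust the waveform view to inspect the signals:

The simulation waveform is shown in the figure:

Close the simulation:

Click OK:

Create the floatMult16 file, as shown:

Double-click the file to open it and enter the following code: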
module floatMult16 (floatA,floatB,product);
    input [15:0] floatA, floatB; // float16 inputs A and B
    output reg [15:0] product; // float16 output
    reg sign; // sign bit of the result
    reg signed [5:0] exponent; // exponent of the result; signed so that a negative (underflowed) exponent can be detected
    reg [9:0] mantissa; // mantissa of the result
    reg [10:0] fractionA, fractionB; // fraction = {1,mantissa}: significand with the implicit leading 1
    reg [21:0] fraction; // product of the two significands
    always @ (floatA or floatB) begin
        if (floatA == 0 || floatB == 0) begin // A or B is zero
            product = 0;
        end else begin
            sign = floatA[15] ^ floatB[15]; // XOR of the input signs gives the sign of the product
            exponent = floatA[14:10] + floatB[14:10] - 5'd15 + 5'd2; // remove one bias; the extra +2 is taken back by the normalization below
            fractionA = {1'b1,floatA[9:0]}; // prepend the implicit leading 1
            fractionB = {1'b1,floatB[9:0]}; // prepend the implicit leading 1
            fraction = fractionA * fractionB; // multiply the significands
            // find the leading 1, shift it out and adjust the exponent accordingly
            if (fraction[21] == 1'b1) begin
                fraction = fraction << 1;
                exponent = exponent - 1;
            end else if (fraction[20] == 1'b1) begin
                fraction = fraction << 2;
                exponent = exponent - 2;
            end else if (fraction[19] == 1'b1) begin
                fraction = fraction << 3;
                exponent = exponent - 3;
            end else if (fraction[18] == 1'b1) begin
                fraction = fraction << 4;
                exponent = exponent - 4;
            end else if (fraction[17] == 1'b1) begin
                fraction = fraction << 5;
                exponent = exponent - 5;
            end else if (fraction[16] == 1'b1) begin
                fraction = fraction << 6;
                exponent = exponent - 6;
            end else if (fraction[15] == 1'b1) begin
                fraction = fraction << 7;
                exponent = exponent - 7;
            end else if (fraction[14] == 1'b1) begin
                fraction = fraction << 8;
                exponent = exponent - 8;
            end else if (fraction[13] == 1'b1) begin
                fraction = fraction << 9;
                exponent = exponent - 9;
            end else if (fraction[12] == 1'b1) begin // was 1'b0 in the original listing; 1'b1 matches the pattern of the other branches
                fraction = fraction << 10;
                exponent = exponent - 10;
            end
            mantissa = fraction[21:12];
            if(exponent[5]==1'b1) begin // exponent is negative: flush the result to zero
                product=16'b0000000000000000;
            end
            else begin
                product = {sign,exponent[4:0],mantissa}; // assemble the result
            end
        end
    end
endmodule
The result is shown in the figure:

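Set floatMult16 as the top module:

Close the previous analysis:

Run RTL analysis on the design, as shown:

After analysis, Vivado automatically generates the schematic, as shown:

Run synthesis on the design, as shown:

Create the testbench, as shown:

Double-click the file to open it and enter the stimulus code:
`timescale 100 ns / 10 ps
module tb_floatMult16();
    reg [15:0] floatA;
    reg [15:0] floatB;
    wire [15:0] product;
    initial begin
        // 4 * 5
        #0
        floatA = 16'b0100010000000000;
        floatB = 16'b0100010100000000;
        // 0.0004125 * 0
        #10
        floatA = 16'b0000111011000010;
        floatB = 16'b0000000000000000;
        #10
        $stop;
    end
    floatMult16 FM
    (
        .floatA(floatA),
        .floatB(floatB),
        .product(product)
    );
endmodule
The result is shown in the figure:

Set tb_floatMult16 as the top module for simulation:

Start the simulation as follows:

Add the signals to be observed, as shown:

Run the simulation, as shown:

The simulation waveform is shown in the figure:

Create the processingElement16 file, as shown:

Double-click the file to open it and enter the following code: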
module processingElement16(clk,reset,floatA,floatB,result);
    parameter DATA_WIDTH = 16; // data type: float16
    input clk, reset;
    input [DATA_WIDTH-1:0] floatA, floatB; // float16 inputs A and B
    output reg [DATA_WIDTH-1:0] result; // float16 output (accumulator)
    wire [DATA_WIDTH-1:0] multResult;
    wire [DATA_WIDTH-1:0] addResult;
    floatMult16 FM (floatA,floatB,multResult); // float16 multiplication
    floatAdd16 FADD (multResult,result,addResult); // float16 addition
    // Non-blocking assignments are used here because this is a clocked register
    always @ (posedge clk or posedge reset) begin
        if (reset == 1'b1) begin
            result <= 0; // result starts at 0
        end else begin
            result <= addResult; // result is updated with the new sum on every clock, i.e. it accumulates
        end
    end
endmodule
The result is shown in the figure:

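Close the previous analysis:

Set processingElement16 as the top module:

Run RTL analysis on the design, as shown:

After analysis, Vivado automatically generates the schematic, as shown:

Run synthesis on the design, as shown:

Create the testbench, as shown:

Double-click the file to open it and enter the stimulus code:
`timescale 100 ns / 10 ps
module tb_processingElement16();
    reg clk,reset;
    reg [15:0] floatA, floatB;
    wire [15:0] result;
    localparam PERIOD = 100;
    always
        #(PERIOD/2) clk = ~clk;
    initial begin
        #0
        clk = 1'b0;
        reset = 1;
        // A = 2 , B = 3
        floatA = 16'h4000;
        floatB = 16'h4200;
        #PERIOD
        reset = 0;
        #(2*PERIOD)
        $stop;
    end
    processingElement16 PE
    (
        .clk(clk),
        .reset(reset),
        .floatA(floatA),
        .floatB(floatB),
        .result(result)
    );
endmodule
The result is shown in the figure:

Set tb_processingElement16 as the top module for simulation:

Start the simulation as follows:

Run the simulation, as shown:

The simulation waveform is shown in the figure:

Create the convUnit file, as shown:

Double-click the file to open it and enter the following code: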
module convUnit(clk,reset,image,filter,result);
    parameter DATA_WIDTH = 16; // data width (float16)
    parameter D = 1; // filter depth
    parameter F = 5; // filter size
    input clk, reset;
    input [0:D*F*F*DATA_WIDTH-1] image, filter; // [0:399] image window and filter
    output [0:DATA_WIDTH-1] result; // [0:15] result
    reg [DATA_WIDTH-1:0] selectedInput1, selectedInput2;
    integer i;
    processingElement16 PE
    (
        .clk(clk),
        .reset(reset),
        .floatA(selectedInput1),
        .floatB(selectedInput2),
        .result(result)
    );
    // The convolution is calculated in a sequential process to save hardware
    // The result of the element wise matrix multiplication is finished after (F*F+2) cycles (2 cycles to reset the processing element and F*F cycles to accumulate the result of the F*F multiplications)
    always @ (posedge clk, posedge reset) begin
        if (reset == 1'b1) begin // reset
            i <= 0;
            selectedInput1 <= 0;
            selectedInput2 <= 0;
        end else if (i > D*F*F-1) begin // if the convolution is finished but we still wait for other blocks to finish, send zeros to the conv unit (in case of pipelining)
            selectedInput1 <= 0;
            selectedInput2 <= 0;
        end else begin // send one element of the image part and one element of the filter to be multiplied and accumulated
            selectedInput1 <= image[DATA_WIDTH*i+:DATA_WIDTH];
            selectedInput2 <= filter[DATA_WIDTH*i+:DATA_WIDTH];
            i <= i + 1;
        end
    end
endmodule
The result is shown in the figure:
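Note that image and filter are declared with an ascending range ([0:399]), so element 0 is the leftmost 16 bits of the bus and image[DATA_WIDTH*i +: DATA_WIDTH] walks through the elements from left to right. The sketch below shows this on a short vector; the module name ascending_slice and the test value are made up for illustration.

```verilog
// Hypothetical demo (not part of the project): indexed part-select on an ascending vector.
module ascending_slice;
    reg [0:47] v;  // ascending range: bit 0 is the leftmost (most significant) bit
    integer i;
    initial begin
        v = 48'hAAAA_BBBB_CCCC;
        for (i = 0; i < 3; i = i + 1)
            // v[16*i +: 16] selects bits 16*i .. 16*i+15
            $display("element %0d = %h", i, v[16*i +: 16]);
        $finish;
    end
endmodule
```

Element 0 is the leftmost group (aaaa), then bbbb, then cccc. The same ordering applies when a hexadecimal literal is assigned to the image or filter inputs in the testbenches.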

Set convUnit as the top module:

Close the previous analysis:

Run RTL analysis on the design, as shown:

After analysis, Vivado automatically generates the schematic, as shown:

Run synthesis on the design, as shown:

Create the testbench, as shown:

Double-click the file to open it and enter the stimulus code:
`timescale 100 ns / 10 ps
module tb_convUnit();
    reg clk, reset;
    reg [1*5*5*16-1:0] image, filter; // we test with a filter whose size is 1*5*5
    wire [15:0] result;
    localparam PERIOD = 100;
    always
        #(PERIOD/2) clk = ~clk;
    initial begin
        #0
        clk = 1'b0;
        reset = 1;
        // We test with an image part and a filter whose values are all 4 (16'h4400)
        // The expected result is 400 (25 * 4 * 4, 16'h5E40 in float16), generated after 25 clock cycles
        image = {25{16'h4400}};
        filter = {25{16'h4400}};
        #PERIOD
        reset = 0;
        #(27*PERIOD)
        $displayh(result);
        $stop;
    end
    convUnit
    #(
        .DATA_WIDTH(16),
        .D(1),
        .F(5)
    )
    UUT
    (
        .clk(clk),
        .reset(reset),
        .image(image),
        .filter(filter),
        .result(result)
    );
endmodule
The result is shown in the figure:

Set tb_convUnit as the top module for simulation:

Start the simulation as follows:

Run the simulation, as shown:

The simulation waveform is shown in the figure:

Create the convLayerSingle file, as shown:

Double-click the file to open it and enter the following code:
module convLayerSingle(clk,reset,image,filter,outputConv);
    parameter DATA_WIDTH = 16;
    parameter D = 1; // filter depth
    parameter H = 32; // input image height
    parameter W = 32; // input image width
    parameter F = 5; // filter size
    input clk, reset;
    input [0:D*H*W*DATA_WIDTH-1] image;
    input [0:D*F*F*DATA_WIDTH-1] filter;
    output reg [0:(H-F+1)*(W-F+1)*DATA_WIDTH-1] outputConv; // output of the module
    wire [0:((W-F+1)/2)*DATA_WIDTH-1] outputConvUnits; // output of the conv units and input to the row selector
    reg internalReset;
    wire [0:(((W-F+1)/2)*D*F*F*DATA_WIDTH)-1] receptiveField; // array of the matrices to be sent to conv units
    integer counter, outputCounter;
    //counter: number of clock cycles needed for the conv unit to finish
    //outputCounter: index to map the output of the conv units to the output of the module
    reg [5:0] rowNumber, column;
    //rowNumber: determines the row that is calculated by the conv units
    //column: determines if we are calculating the first or the second 14 pixels of the output row
    RFselector
    #(
        .DATA_WIDTH(DATA_WIDTH),
        .D(D),
        .H(H),
        .W(W),
        .F(F)
    ) RF
    (
        .image(image),
        .rowNumber(rowNumber),
        .column(column),
        .receptiveField(receptiveField)
    );
    genvar n;
    generate //generating n convolution units where n is half the number of pixels in one row of the output image
        for (n = 0; n < (H-F+1)/2; n = n + 1) begin
            convUnit
            #(
                .D(D),
                .F(F)
            ) CU
            (
                .clk(clk),
                .reset(internalReset),
                .image(receptiveField[n*D*F*F*DATA_WIDTH+:D*F*F*DATA_WIDTH]),
                .filter(filter),
                .result(outputConvUnits[n*DATA_WIDTH+:DATA_WIDTH])
            );
        end
    endgenerate
    always @ (posedge clk or posedge reset) begin
        if (reset == 1'b1) begin
            internalReset = 1'b1;
            rowNumber = 0;
            column = 0;
            counter = 0;
            outputCounter = 0;
        end else if (rowNumber < H-F+1) begin
            if (counter == D*F*F+2) begin //The conv unit finishes after 1*5*5+2 clock cycles
                outputCounter = outputCounter + 1;
                counter = 0;
                internalReset = 1'b1;
                if (column == 0) begin
                    column = (H-F+1)/2;
                end else begin
                    rowNumber = rowNumber + 1;
                    column = 0;
                end
            end else begin
                internalReset = 0;
                counter = counter + 1;
            end
        end
    end
    always @ (*) begin
        outputConv[outputCounter*((W-F+1)/2)*DATA_WIDTH+:((W-F+1)/2)*DATA_WIDTH] = outputConvUnits;
    end
endmodule
The result is shown in the figure:

Next, create the RFselector file:

Double-click the file to open it and enter the following code:
module RFselector(image,rowNumber, column,receptiveField);
    parameter DATA_WIDTH = 16;
    parameter D = 1; // filter depth
    parameter H = 32; // image height
    parameter W = 32; // image width
    parameter F = 5; // filter size
    input [0:D*H*W*DATA_WIDTH-1] image;
    input [5:0] rowNumber, column;
    output reg [0:(((W-F+1)/2)*D*F*F*DATA_WIDTH)-1] receptiveField;
    integer address, c, k, i;
    always @ (image or rowNumber or column) begin
        address = 0;
        if (column == 0) begin // first half of the output row
            for (c = 0; c < (W-F+1)/2; c = c + 1) begin
                for (k = 0; k < D; k = k + 1) begin
                    for (i = 0; i < F; i = i + 1) begin
                        receptiveField[address*F*DATA_WIDTH+:F*DATA_WIDTH] = image[rowNumber*W*DATA_WIDTH+c*DATA_WIDTH+k*H*W*DATA_WIDTH+i*W*DATA_WIDTH+:F*DATA_WIDTH];
                        address = address + 1;
                    end
                end
            end
        end else begin // second half of the output row
            for (c = (W-F+1)/2; c < (W-F+1); c = c + 1) begin
                for (k = 0; k < D; k = k + 1) begin
                    for (i = 0; i < F; i = i + 1) begin
                        receptiveField[address*F*DATA_WIDTH+:F*DATA_WIDTH] = image[rowNumber*W*DATA_WIDTH+c*DATA_WIDTH+k*H*W*DATA_WIDTH+i*W*DATA_WIDTH+:F*DATA_WIDTH];
                        address = address + 1;
                    end
                end
            end
        end
    end
endmodule
The result is shown in the figure:

Set convLayerSingle as the top module:

Close the previous analysis:

Run RTL analysis on the design, as shown:

After analysis, Vivado automatically generates the schematic, as shown:

Run synthesis on the design, as shown:

Create the testbench, named tb_convLayerSingle, as shown:

Double-click the file to open it and enter the stimulus code:
`timescale 1ns / 1ps
module tb_convLayerSingle();
    reg clk, reset;
    reg [1*32*32*16-1:0] image; //We test with a 1*32*32 image
    reg [1*5*5*16-1:0] filter; //We test with a 1*5*5 filter
    wire [1*28*28*16-1:0] outputConv;
    localparam PERIOD = 100;
    integer i, clkCounter;
    always
        #(PERIOD/2) clk = ~clk;
    always @ (posedge clk) begin
        clkCounter = clkCounter + 1;
    end
    initial begin
        #0
        clkCounter = 0;
        clk = 1'b0;
        reset = 1;
        //We test with the first image and the filters of the first layer of LeNet
image = 16384'h00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000326638fd3bf038fd32460000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000032063b773be83be83be83b6f000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000032c73b1f3bf03be83b7f3b4f3be833272606000000000000000000000000000000000000000000000000000000000000000000000000000000000000290533883b073be83bf03be83a5635453be83bf037a8000000000000000000000000000000000000000000000000000000000000000000000000000000000000391d3be83be83be83bf03be83be8360639ee3bf0393d0000000000000000000000000000000000000000000000000000000000000000000000000000000032663b773bf03bf039f637273bf03b2731e634f53c003945000000000000000000000000000000000000000000000000000000000000000000000000000032063b773be83be8399e2a0634b537982d45000000003bf03ba032460000000000000000000000000000000000000000000000000000000000000000000030c5392d3bf03b4f3a8735450000000000000000000000003bf03be8392d0000000000000000000000000000000000000000000000000000000000000000270739963be83b8834742cc52f070000000000000000000000003bf03be83a1e000000000000000000000000000000000000000000000000000000000000000033273be83be833e80000000000000000000000000000000000003bf03be83a1e00000000000000000000000000000000000000000000000000000000000000003a363bf039f600000000000000000000000000000000000000003c003bf03a2600000000000000000000000000000000000000000000000000000000000034c53bb83be837070000000000000000000000000
0000000000000003bf03be838a500000000000000000000000000000000000000000000000000000000000035553be83b372e46000000000000000000000000000000002707383c3bf039d62a0600000000000000000000000000000000000000000000000000000000000035553be83aff000000000000000000000000000000002707381c3be83b0f3474000000000000000000000000000000000000000000000000000000000000000035553be8388d00000000000000000000000000003206392d3be8396d00000000000000000000000000000000000000000000000000000000000000000000000035653bf03b0f00000000000000000000000037273b773bf03915000000000000000000000000000000000000000000000000000000000000000000000000000035553be83bd0389532062f47355539963b0f3bf03aff393d3307000000000000000000000000000000000000000000000000000000000000000000000000000035553be83be83be83b2f3abf3be83be83be83a2638140000000000000000000000000000000000000000000000000000000000000000000000000000000000002f073a3e3be83be83bf03be83be83b4f388d0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002e4638043be83bf03be8386c30a50000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000;
filter = 400'h346b33f83146351432de310e2cc624deb409b3a2b61ab4c8b679b63bb455b48d2b45b08b2bdbb4c0b536b4b9b598b810b521;
        #PERIOD
        reset = 0;
        #((56*28+1)*PERIOD)
        for (i = 28*28-1; i >=0; i = i - 1) begin
            $displayh(outputConv[i*16+:16]);
        end
        $stop;
    end
    convLayerSingle UUT
    (
        .clk(clk),
        .reset(reset),
        .image(image),
        .filter(filter),
        .outputConv(outputConv)
    );
endmodule
The result is shown in the figure:
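The wait time (56*28+1)*PERIOD in this testbench can be related to the counters in convLayerSingle: counter runs from 0 to D*F*F+2, so one pass of the 14 convolution units takes D*F*F+3 clock edges, and each of the 28 output rows needs two passes. The sketch below recomputes the total under that reading of the code; the module name conv_cycles and the constant names are illustrative.

```verilog
// Hypothetical helper (not part of the project): clock cycles needed by convLayerSingle.
module conv_cycles;
    localparam D = 1, F = 5, H = 32, W = 32;
    localparam CYCLES_PER_PASS = D*F*F + 3;     // counter counts 0 .. D*F*F+2
    localparam PASSES          = (H-F+1) * 2;   // two half-rows per output row
    localparam TOTAL           = PASSES * CYCLES_PER_PASS;

    initial begin
        $display("cycles per pass : %0d", CYCLES_PER_PASS);
        $display("passes          : %0d", PASSES);
        $display("total cycles    : %0d (the testbench waits one extra cycle)", TOTAL);
        $finish;
    end
endmodule
```

This gives 28 cycles per pass and 56 passes, which matches the factor 56*28 in the testbench; the same (D*F*F+3) term appears in the counter comparison of convLayerMulti.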

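Set tb_convLayerSingle as the top module for simulation:

Start the simulation as follows:

Run the simulation, as shown:

The simulation waveform is shown in the figure:

Create the convLayerMulti file, as shown:

Double-click the file to open it and enter the following code: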
module convLayerMulti(clk,reset,image,filters,outputConv);
    parameter DATA_WIDTH = 16;
    parameter D = 1; // input image depth
    parameter H = 32; // input image height
    parameter W = 32; // input image width
    parameter F = 5; // filter size
    parameter K = 6; // number of filters
    input clk, reset;
    input [0:D*H*W*DATA_WIDTH-1] image;
    input [0:K*D*F*F*DATA_WIDTH-1] filters;
    output reg [0:K*(H-F+1)*(W-F+1)*DATA_WIDTH-1] outputConv;
    reg [0:2*D*F*F*DATA_WIDTH-1] inputFilters; // the two filters processed in the current round
    wire [0:2*(H-F+1)*(W-F+1)*DATA_WIDTH-1] outputSingleLayers; // outputs of the two convLayerSingle instances
    reg internalReset;
    integer filterSet, counter, outputCounter;
    genvar i;
    generate
        for (i = 0; i < 2; i = i + 1) begin
            convLayerSingle #(
                .DATA_WIDTH(DATA_WIDTH),
                .D(D),
                .H(H),
                .W(W),
                .F(F)
            ) UUT
            (
                .clk(clk),
                .reset(internalReset),
                .image(image),
                .filter(inputFilters[i*D*F*F*DATA_WIDTH+:D*F*F*DATA_WIDTH]),
                .outputConv(outputSingleLayers[i*(H-F+1)*(W-F+1)*DATA_WIDTH+:(H-F+1)*(W-F+1)*DATA_WIDTH])
            );
        end
    endgenerate
    always @ (posedge clk or posedge reset) begin
        if (reset == 1'b1) begin
            internalReset = 1'b1;
            filterSet = 0;
            counter = 0;
            outputCounter = 0;
        end else if (filterSet < K/2) begin
            if (counter == ((((H-F+1)*(W-F+1))/((H-F+1)/2))*(D*F*F+3)+1)) begin
                outputCounter = outputCounter + 1;
                counter = 0;
                internalReset = 1'b1;
                filterSet = filterSet + 1;
            end else begin
                internalReset = 0;
                counter = counter + 1;
            end
        end
    end
    always @ (*) begin
        inputFilters = filters[filterSet*2*D*F*F*DATA_WIDTH+:2*D*F*F*DATA_WIDTH];
        outputConv[outputCounter*2*(H-F+1)*(W-F+1)*DATA_WIDTH+:2*(H-F+1)*(W-F+1)*DATA_WIDTH] = outputSingleLayers;
    end
endmodule
The result is shown in the figure:

Set convLayerMulti as the top module:

Close the previous analysis:

Run RTL analysis on the design, as shown:

After analysis, Vivado automatically generates the schematic, as shown:

Run synthesis on the design, as shown:

Create the testbench, as shown:

Double-click the file to open it and enter the stimulus code:
module tb_convLayerMulti();
    reg reset, clk;
    reg [1*32*32*16-1:0] image;
    reg [6*1*5*5*16-1:0] filters;
    wire [6*28*28*16-1:0] outputConv;
    localparam PERIOD = 100;
    integer i;
    always
        #(PERIOD/2) clk = ~clk;
    initial begin
        #0
        clk = 1'b0;
        reset = 1;
        //We test with the same 1*32*32 image as in tb_convLayerSingle and six 5*5 filters
        //The output consists of 4704 (6*28*28) float16 values
image = 16384'h00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000326638fd3bf038fd32460000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000032063b773be83be83be83b6f000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000032c73b1f3bf03be83b7f3b4f3be833272606000000000000000000000000000000000000000000000000000000000000000000000000000000000000290533883b073be83bf03be83a5635453be83bf037a8000000000000000000000000000000000000000000000000000000000000000000000000000000000000391d3be83be83be83bf03be83be8360639ee3bf0393d0000000000000000000000000000000000000000000000000000000000000000000000000000000032663b773bf03bf039f637273bf03b2731e634f53c003945000000000000000000000000000000000000000000000000000000000000000000000000000032063b773be83be8399e2a0634b537982d45000000003bf03ba032460000000000000000000000000000000000000000000000000000000000000000000030c5392d3bf03b4f3a8735450000000000000000000000003bf03be8392d0000000000000000000000000000000000000000000000000000000000000000270739963be83b8834742cc52f070000000000000000000000003bf03be83a1e000000000000000000000000000000000000000000000000000000000000000033273be83be833e80000000000000000000000000000000000003bf03be83a1e00000000000000000000000000000000000000000000000000000000000000003a363bf039f600000000000000000000000000000000000000003c003bf03a2600000000000000000000000000000000000000000000000000000000000034c53bb83be837070000000000000000000000000
0000000000000003bf03be838a500000000000000000000000000000000000000000000000000000000000035553be83b372e46000000000000000000000000000000002707383c3bf039d62a0600000000000000000000000000000000000000000000000000000000000035553be83aff000000000000000000000000000000002707381c3be83b0f3474000000000000000000000000000000000000000000000000000000000000000035553be8388d00000000000000000000000000003206392d3be8396d00000000000000000000000000000000000000000000000000000000000000000000000035653bf03b0f00000000000000000000000037273b773bf03915000000000000000000000000000000000000000000000000000000000000000000000000000035553be83bd0389532062f47355539963b0f3bf03aff393d3307000000000000000000000000000000000000000000000000000000000000000000000000000035553be83be83be83b2f3abf3be83be83be83a2638140000000000000000000000000000000000000000000000000000000000000000000000000000000000002f073a3e3be83be83bf03be83be83b4f388d0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002e4638043be83bf03be8386c30a50000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000;
filters[0*5*5*16+:5*5*16] = 400'h346b33f83146351432de310e2cc624deb409b3a2b61ab4c8b679b63bb455b48d2b45b08b2bdbb4c0b536b4b9b598b810b521;
filters[1*5*5*16+:5*5*16] = 400'h2beb2d7b319a3303349830989e6132afa8af343b345632da34043406345c30ebac9dacf7b0ec2464b26d2e3bb2c4b33cb203;
filters[2*5*5*16+:5*5*16] = 400'ha610312fad4522feac2330d832a4319ba5ecaac229349a10afb12f4aaeb8aadb2f99b26021bdac24a968aef7321c29c82d35;
filters[3*5*5*16+:5*5*16] = 400'h240634542fc9375033bf3851365635c4a3bd2b162aac2a602c7e31812d6a35d03782310c37c130e932e22624a6b8ab7da1f3;
filters[4*5*5*16+:5*5*16] = 400'ha99baabc2aa33113af6bb1db23c8aa0ab69ab575b6ebb60e16d4b1dfac5a31be2f9c2b2ab298b1b6b2cdae2db5c6b4f0af69;
filters[5*5*5*16+:5*5*16] = 400'h37fe37f0380b340434572f01309f31f32e76a6dd2aba9fa734cf303536562c91338e34322f47b1442217a6c2a8eba2a8addc;
        #(PERIOD)
        reset = 0;
        #(7*1457*PERIOD)
        for (i = 6*28*28-1; i >=0; i = i - 1) begin
            $displayh(outputConv[i*16+:16]);
        end
        $stop;
    end
    convLayerMulti UUT
    (
        .clk(clk),
        .reset(reset),
        .image(image),
        .filters(filters),
        .outputConv(outputConv)
    );
endmodule
The result is shown in the figure:

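Start the simulation as follows:

Run the simulation, as shown:

Create the integrationConv file, as shown:

Double-click the file to open it and enter the following code:
module integrationConv (clk,reset,CNNinput,Conv,ConvOutput);
    parameter DATA_WIDTH = 16;
    parameter ImgInW = 32;
    parameter ImgInH = 32;
    parameter ConvOut = 28;
    parameter Kernel = 5;
    parameter DepthC = 6;
    input clk, reset;
    input [ImgInW*ImgInH*DATA_WIDTH-1:0] CNNinput;
    input [Kernel*Kernel*DepthC*DATA_WIDTH-1:0] Conv;
    output [ConvOut*ConvOut*DepthC*DATA_WIDTH-1:0] ConvOutput;
    convLayerMulti C1
    (
        .clk(clk),
        .reset(reset),
        .image(CNNinput),
        .filters(Conv),
        .outputConv(ConvOutput)
    );
endmodule
The result is shown in the figure:

Set integrationConv as the top module:

Close the previous analysis:

Run RTL analysis on the design, as shown:

After analysis, Vivado automatically generates the schematic, as shown:

Run synthesis on the design, as shown:

As a small memento: the "Communication Travel Card" (通信行程卡) was officially taken offline at 00:00 on December 13, 2022.

I hope this article is helpful. If anything above is inaccurate, corrections are welcome.
Sharing determines how high you go; learning sets you apart.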