用SV写一个蒙哥马利模乘的参考模型
前言
往期推送过一个蒙哥马利算法的介绍,如果要实现蒙哥马利模乘的硬件模块,那么一个参考模型是必不可少的,这一期将利用SV实现一个简单的参考模型,这个参考模型可以直接用于功能仿真
根据以往推送中的运算流程进行建模
类的定义
class BN; rand bit [127:0] num [32:0]; string name; constraint c {num[32]==0;num[0][0]==1;}; function new(string name="A"); this.name = name; endfunction endclass : BN
定义一个大数的类,计算的位宽是4096,而使用的基是128bit也就是基为,数组大小定义为33,用于处理数运算时的溢出。
大数显示
// BN display function void BN_display(bit flag=0); string s; s={s,name,":"}; if(flag==1) s={s,$sformatf("%h",num[32])}; for (int i=1; i<32+1; i++) begin s={s,$sformatf("%h",num[32-i])}; end s={s,"\n"}; $display(s); endfunction : BN_display
默认不打印num[32]
大数移位
// BN_shift function void BN_shift(); for (int i=0; i<32; i++) begin num[i] = num[i+1]; end num[32] = 0; endfunction : BN_shift
蒙哥马利算法中有/b的操作,我们选取为基,那么这个操作就能使用移位实现,运算的大数为无符号数,最高为补0
大数乘法
// BN_mul function void BN_mul(bit [127:0] a,BN ans); bit [255:0] temp; bit [127:0] carry; for (int i = 0; i < 32; i++) begin temp = num[i] * a + carry; ans.num[i] = temp[127:0]; carry = temp>>128; end ans.num[32] = carry; endfunction : BN_mul
实现的是4096*128位的乘法,用于计算和
大数加法
// BN_add function automatic void BN_add(input BN a,b,ref BN c); bit [128:0] psum; bit carry; carry =0; for (int i = 0; i < 32+1; i++) begin psum = a.num[i] + b.num[i]+ carry; c.num[i] = psum; carry = psum[128]; end endfunction : BN_add
计算两个大数的加法
大数减法
利用取补码,然后相加实现
取补码
取反加一
// BN_com function void BN_com(ref BN c); BN temp,b; temp = new("temp"); b = new("b"); for (int i=0; i<32+1; i++) begin temp.num[i]=~this.num[i]; b.num[i]=0; end b.num[0]=1; BN_add(temp,b,c); endfunction : BN_com
实现减法
// BN_sub function automatic void BN_sub(input BN a,b,ref BN c); BN com; com =new("com"); b.BN_com(com); BN_add(a,com,c); endfunction : BN_sub
计算
计算原理如下图,下图有一处错误,循环变量中的和我们所求的并不是一个变量,这里指的是我们基的bit数
// BN_w function void BN_w(output bit [127:0] w); bit[127:0] t; bit[255:0] tt; t=1; tt=0; for (int i=0; i<127; i++) begin tt=t*t; t=tt[127:0]; tt=t*num[0]; t=tt[127:0]; $display("%h",t); end tt=0-t; w=tt[127:0]; endfunction
大数比较
最后输出时要检查r是否小于N
// BN_cmp return 1 when this >= a function bit BN_cmp(input BN a); for (int i=0; i<32+1; i++) begin if(num[32-i]<a.num[32-i]) return 0; if(num[32-i]>a.num[32-i]) return 1; end return 1; endfunction
蒙哥马利模乘
有了上面的函数,我们就能实现蒙哥马利模乘啦
// BN_sub function automatic void mont_mul(input BN x,y,N,ref BN r); BN temp1,temp2,temp3; bit [127:0] u,w; bit [255:0] ut; temp1=new("temp1"); temp2=new("temp2"); temp3=new("temp3"); N.BN_w(w); $display("%h",w); x.BN_display(); y.BN_display(); N.BN_display(); r.BN_display(); for (int i=0; i<32; i++) begin ut=y.num[i]*x.num[0]; u=ut[127:0]; ut=r.num[0]+u; u=ut[127:0]; ut=u*w; u=ut[127:0]; x.BN_mul(y.num[i], temp1); N.BN_mul(u, temp2); BN_add(r, temp1, temp3); BN_add(temp3, temp2, r); r.BN_shift(); end if(r.BN_cmp(N)) begin BN_sub(r,N,temp1); r.num=temp1.num; end $display("x*y*p^-1 mod N:"); r.BN_display(); endfunction : mont_mul
tb测试
顶层文件内容为
module tb; import BN_pkg::*; initial begin BN x,y,N,r; x=new("x"); y=new("y"); N=new("N"); r=new("r"); void'(x.randomize()); void'(y.randomize()); void'(N.randomize()); mont_mul(x,y,N,r); end endmodule
我们运行一次
那么如何验证是否正确呢,我们使用python检查一下,python自带了大数运算的库,可以自动的计算大数模乘。
下面的python代码用也模拟了一遍蒙哥马利模乘的流程,与直接模乘进行对比。蒙哥马利模乘的计算结果实际上是,所以要将结果再乘以一个才能进行对比。把上述打印出来的结果复制到python中运行,进行检查。
# tranfer long num to array def tranBN(bn,bn_a): bn_t=bn for x in range(0,32): bn_a[x] = bn_t % 0x100000000000000000000000000000000 bn_t = bn_t >> 128 # -n^(-1) mod 2^128 def get_w(n): nw=n % 0x100000000000000000000000000000000 w=1; for x in range(0,127): w=w*w % 0x100000000000000000000000000000000 w=w*nw % 0x100000000000000000000000000000000 # print("w:%x" %w) w=0x100000000000000000000000000000000-w return w # p mod n def get_p(): t=1 for x in range(0,4096): t=(t<<1) t=t%c return t if __name__ == '__main__': a = 0xdaaf6a1e33805c0f68e58a95cc39b4c64594dbc4f2b0ba88aae650f51e467e1c4f10ba102d77eb77c1547e0c40e6d7aeb05539c308ea01dafb6da336492100ab2cdd38a580091aaec64d74192431c00cce4f4c752498e88aaa5ccc010b2317db8e01c0660e1dc9ba01154024448965f8209721d39158422ef2e1817ac4240be53bfc0f05b7336e172e271c9e9fcd38057746bbe8f5bb1907ab681ae012395e78e531f5291340108b4f8b182614a29fa0c7a44032229fe3fb3af01a5577cf335f318c1ecc70b613e7532ab85dc087c618020e949640cb14a3dbf634fa0b48f0098c9e9ee4861a5e6193f2a9241e28d1f4d3c9a8f11c460943dbd7b7b06f18fe75454e20593388dcaa8b98aabe293987d22e2725251d6ebf2729cde05db076ed775b7f369d1f9e1109812960b8b76e333bcca8aaa98931c2937cadb68a4ffc6c54eff9a6bcb77da76dc02fcb83167105319dd5a25f19d6ef0b214927120635e665afe46f681259247978d4a6853bb3cac03bc554d07003496f6b8b624bfec45f4cfb24acded0aeb074e8f70df1813ebb26bd5fe26be2a627684d793a8a052e3a9476a1d9697dd9e27beb4db7ad01eb8b0a3b5c7717d716ebd30727cc7786a17f09b04d6b94a56d9f70ac514e026f42834486e6a0852cef4808c7222cc02e908f2ab225f93e316612d1fd67359987ab7a23be6348b73f6764e60de2edf72c500ebf b = 0x14b8f4973aacf4927678c823e8c5988fe76caf5b4241b1a32954ccb35729ff3573f617bc077c8d165ec5270c0b863fc231ae96dd5d933e9a98abdaf3d6e852e98149945ab1a9a90e38e07c3017c1273b18598d87b59a289de9d7c5bc5c6f64cccdbcbec42c289c8b1b799f8454cba6b89e5976a84c19217d64ddde5af42e37ab465928d068deaa3a0270b8d062dbe0b737667c3afd065871532081e72bc1f79e1d7ebd1fb933ec3555a8e986f949f72ca11bc2fbe4c704b20838c68b707d9f3db1d8ae45b44b6bd36a58bfbf7d565347a6c6e20130c84f1bad77f6251e81dfb6ffa9a508d64db7d2fe48b5e4ebe68e7c8d62cdf5ab1c2ca8c2d2e835a1423acbef65956c980dfb62b3a405b9efbc93283d5071c2129b831481c537cc5be8f1d2723f1168f797bde736c1f73054d7d0dc97538fba25bb3e38703934d8fc46ad22eb23ea409184c3dba8241efc92ce5a6728f4385da637bc23ef7acb506d0543804ae7d660926a82406f9d3206376d5454466ecde2246a125c99aebdf16743d55cfb1c4ab0fdb8387320d541a94e3c5aa6038466eaa18682a163d571db3214de448b3d4d7a632bc60f0a524a041cd6e72a75dbc9f6bb63743df3c3c0d4649a28bd0bbeee569182303a66b830a2273b8df05c712adadf2bcb75244a66826265da778e0c3b45a20d6c962fd203e708ff62dd29b9edd96f2afd2bfe92014968e439ef c = 0xa35f4906ed176bc241535c78955d02f4d0ac5376d736ae280077887200c758b7781b4432fa8baca2a81ad6fb0817051a00fccf8e15c63048681bcf8342b56433abd550affa489b289cd4f0482adce321c8cf4374ce15267692dfc8b0da108f4bb0e922d4a28402ef785c2516f6296486f8505ac3df05c0f953acce65e2dc5f1e59965ded73fa18ffb482ad1a2e5433d4df8211de12a3e7a71a1a084fed671fb11eeaf76f640c4fd549ea307b6622f798f027786e79232206de1507281d84c719209d408bc85f9ed2e1b82ecf72ff805a45221dc712c45a8dbc375e9b64227ec6b659a75fc5b5e051e776bcd9f4f6d82ebaff89a48c8494d6ed072372b846156af229994baab390ec57c00130255acc2cdf975783df4678153f0ca51b854425b1568b5b8b53239f50dd39fc53c3d41827a0687c435f6de5e98843def3fb7b0f7e701cdfb51517d6628392bd9291c16282556f5581766dd6a0a426a35312237399f93ad69502592c0f6d1864ba0b75600ee04cb406bcb833bc98527a0ac1249c6a918456b06f24611770c1708426b4d9041f7fe83be68fbc7018e461951d234ebf00227b4301911e24055c745203c888276f4db0c05f66514ae4e6b4bf4c8914e36c4a94bf57bf807dd40c7572d1a99c27d9f58af0877bb217c081d750d5edbe3c45eafc3ea6786560fa819873452cc8bffc7ab998ab70496b77fdadffb7e7262d t = 1 a_a=[0 for i in range(32)] b_a=[0 for i in range(32)] c_a=[0 for i in range(32)] tranBN(a,a_a) tranBN(b,b_a) tranBN(c,c_a) w=get_w(c) r = 0 # mont mul , get a*b*p^-1 mod c for x in range(0,32): r0 = r%0x100000000000000000000000000000000 u = (r0+b_a[x]*a_a[0])*w u = u % 0x100000000000000000000000000000000 r = r+b_a[x]*a+u*c # check r[127:0] == 0 if r%0x100000000000000000000000000000000 == 0: pass else: print("r[127:0]!=0!!!!") print(r%0x100000000000000000000000000000000) r = r >> 128 # print("%x"%r) if r>c: r=r-c print("a*b*p^-1 mod c:") print("%x"%r) t=get_p() # compare to python r_s=(a*b) % c r_m=(r*t) % c print("python result: \n%x" %r_s) print("mntmul result: \n%x" %r_m) if r_s==r_m: print("result match!") else: print("result dismatch!")
python中模拟蒙哥马利的结果与sv中一致
而python中模拟蒙哥马利和直接模乘结果也一致
注:结果太长,只截了一部分