NVDLA Compiler Analysis

这篇文章记录一下笔者剖析 NVDLA Compiler 工作机制的一些经过,在 NVDLA 的软件仓库中,Compiler与Runtime的源代码被放置在umd目录下,由同一个 Makefile 组织。

对于sw部分文档十分缺乏,在 NVDLA的页面里只有零星的介绍,而关于软件设计的细节、以及如何拓展设计都没有介绍。并且,Compiler这部分的代码充满了智慧,例如:

image-20210624203707954

还有一些突然停止维护了而没开发完的feature。并且由于其前端只有 Caffe 的 Parser,导致其端到端的推理仅可以支持一些比较弱的分类网络,但阅读代码理解其设计与工作的思路,学习其抽象语法树的设计还是一个比较有意义的工作。

1. Clion 结合 Docker 调试 Makefile Project

由于 Compiler 需要一些 Linux 的头文件,所以在 Mac 和 Windows 上不方便直接编译(笔者是 MacOS),为了不污染环境,我推荐使用 docker 新建一个容器,当然为了方便验证,我还是推荐直接使用之前文章里提到过的 image :

1
2
3
docker pull farhanaslam/nvdla 
# 做一个端口映射,毕竟22端口还是很容易冲突的
docker run -it -v /Users/wanglei/:/home -p 2222:22 --name "nvdlakit" wanglei/nvdla-kit

然后为其添加sshd服务,具体可以参考这篇博客

1
2
3
4
apt-get upgrade
apt-get install openssh-client
apt-get install openssh-server
/etc/init.d/ssh start

开启 root 的访问权限:

1
vim /etc/ssh/sshd_config

找到 PermitRootLogin将其设置为 yes 。

再用passwd命令设置一下Root的密码就好了。

大概一年之前,Clion 并不支持 Makefile 项目,在这之后其在插件里提供了部分对 Makefile 的支持,但是不能够进行远程调试,而在笔者尝试使用 Clion 远程调试 Makefile 项目的时候,JB的官方刚好发布了这个问题的解决方案,时机非常巧妙,官方的post title叫做 Full remote mode,根据配置就可以远程调试了,但是对于 Compiler,其在编译的过程中依赖环境变量TOP、其在编译之后的执行阶段又依赖环境变量LD_LIBRARY_PATH,所以我们需要对配置做一点小改动。

Preference|Build,Execution,Deployment|Makefile页面,为 options 加上 TOP 的路径。

image-20210624181927692

Configurations页面,加上 Environment variables 。

image-20210624182055447

再来,apps 的 Makefile 里没有编译出带调试信息的可执行文件,为MODULE_COMPILEFLAGS加上-g选项。这样,就可以单步调试了。

2. Compiler

首先,umd目录下面一共有若干个文件夹:

  • apps:compiler和runtime的外层逻辑,一般这里放的都是解析命令行参数之类的程序。
  • core:runtime和compiler的主要实现逻辑放在这里,是需要着重阅读的部分
  • externel:这个目录存放了umd用到的第三方库,需要说明的是protobuf主要用于caffe模型的解析,因为caffe的blob格式原始存储方式是使用了google开发的protobuf、用来压缩生成 Loadable 文件的 FlatBuffers等等。
  • make:若干通用的makefile规则。
  • port:主要是runtime的底层访问API,为了实现OS无关,做了这个隔离层,目前只实现了Linux下的。这层API包括内存访问,用户态和内核态的交互IOCTL,内存分配等。需要注意的是NVDLA的runtime部分用户态和内核态交互使用了Linux用于显卡抽线的DRM接口
  • utils:这个文件夹放了几个通用的模块,包括BitBinaryTree以及伙伴内存管理等。其中伙伴内存管理模块在compiler的tensor内存分配推导部分有用到

2.1 NVDLA Compiler 工作流程

在阅读代码的时候,可以打开官方已经写了的部分调试信息输出开关,在源代码里可以找到类似这样的声明:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
inline bool debugGraphDump() const { return false; }
inline bool debugClone() const { return false; }
inline bool debugOps() const { return false; }
inline bool debugGroupOps() const { return false; }
inline bool debugMathOptz() const { return false; }
inline bool debugWeights() const { return false; }
inline bool debugQuantization() const { return false; }
inline bool debugFuseSubEngineOps() const { return false; }
inline bool debugSurfaces() const { return false; }
inline bool debugBuffers() const { return false; }
inline bool debugCopyOutDebug() const { return false; }
inline bool debugMemoryLayout() const { return false; }
inline bool debugBinding() const { return false; }
inline bool debugDepGraph() const { return false; }
inline bool debugMemHazards() const { return false; }
inline bool debugRelocs() const { return false; }

false调成true,就可以输出一些信息,例如 debugGraphs 开关可以把几个关键的抽象语法树保存为 Json 格式的文件。

这里列出一下把所有的 inline bool debugxxx 开关都打开的log,有三千多行:

Console Output of Lenet5.

creating new wisdom context…
opening wisdom context…
parsing caffe network…
mark prob
Marking total 1 outputs
parsing calibration table…
attaching parsed network to the wisdom…
compiling profile “fast-math”… config “opendla_small”…
Invalid target config
Compiler profile: fast-math
compute precision NVDLA_PRECISION_INT8
network input data format NHWC
network input surface format NVDLA_IMG_A8B8G8R8
network output data format NCxHWx
network output surface format NVDLA_FEATURE_DATA_INT8
batch size 0
tensor scaling mode PER_TENSOR
quantization mode PER_KERNEL
raw weights of bias-0: -0.189939, -0.0122971, -0.127561, -0.0370565, 0.0295314, 0.00792045, -0.212149, -0.159978, -0.100598, -0.0833697, -0.234531, -0.257588, 0.211461, -0.181789, -0.289102, -0.106559, -0.24326, 0.238864, -0.0477089, 0.183563,
n-0:->n-0:dc-conv-0 n-1:bias-0
n-1:->n-2:pdp-0
raw weights of bias-1: 0.0266923, -0.0210181, -0.0988805, 0.0149911, -0.0189492, -0.0394126, -0.00789565, -0.0799653, -0.0767248, -0.0982602, -0.0398029, -0.0778087, -0.041269, 0.0100064, -0.0130392, 0.0253316, 0.0122957, -0.0409093, -0.00850784, 0.00141418, -0.0388875, -0.0170651, -0.0327771, -0.0181817, -0.086323, -0.0200653, -0.0210894, -0.0629572, -0.0672779, 0.0461048, -0.0657149, -0.0425482, 0.0145769, -0.0452918, -0.0231088, -0.106806, -0.0240538, -0.0203086, -0.0370518, -0.0241791, -0.0967711, 0.0139108, -0.0661912, -0.0684465, -0.0150072, 0.0310905, -0.0493356, -0.0828543, 0.007227, -0.0534207,
n-2:->n-3:dc-conv-1 n-4:bias-1
n-3:->n-5:pdp-1
raw weights of bias-2: -0.0160908, -0.00596529, 0.00571031, 0.0141845, 0.00457521, 0.000918757, 0.00182481, 0.00487062, -0.010953, 0.00405698, 0.00241628, 0.00215307, -0.00274916, 0.000693179, 0.00434156, -0.00366797, -0.000154131, 0.0017697, -0.00575291, 0.00485401, 0.0049358, 0.00053109, 0.0035042, 0.00115322, -0.00513705, 0.00220661, 0.0130855, -0.0040593, 0.00742773, -0.00207535, 0.00708569, 0.0130998, 0.00320843, 0.00807043, -0.00725761, 0.00397733, 0.000158773, -0.00426002, -0.00715113, 0.00737358, 0.000183469, 0.0121654, -0.00185334, -0.00478108, -0.00362065, 0.00572813, -0.000719333, 0.0062139, -0.0042254, 0.004558, 0.00368481, -0.00638158, -0.000542821, 0.0056826, -0.0025529, -0.000308287, 0.00540015, 0.0128624, -0.00784788, -0.0110209, 0.00165233, -0.00979184, 0.00627872, 0.00449704, 0.00551083, 0.0103216, -8.54422e-05, 0.0100825, -0.00381044, 0.00709125, -0.00162287, 0.0100035, 0.00153186, -0.00192266, 0.0177549, 0.0028109, -0.000116088, 0.00857696, 0.00336946, -0.00577133, 0.00734607, 0.00367282, 0.00290999, -0.00764302, -0.00435251, 0.0112651, -0.00474309, -0.00293007, -0.00337636, 0.00350733, 0.00413807, 0.00648092, -0.00520403, -0.00946726, -0.00299138, -0.0129122, 0.00296569, -0.00357285, 0.00962356, 0.00297598, -0.00135108, 0.00128038, -0.00179181, 0.010566, 0.0180151, -0.00460528, -0.000536081, -0.00138855, 0.00448689, 0.00400495, 0.00287102, 0.00166039, -0.00473274, 0.000981164, -0.00258576, -0.00459088, -0.00192138, 0.00659034, -0.00184609, 0.000764893, -0.00330961, -0.00467795, 0.00309991, 0.00156627, 0.00300166, -0.00358254, -0.00415914, 0.0100725, -0.0018276, 0.00587635, 0.00282938, 0.00114757, 0.00344898, -0.00504183, 0.0038491, -0.000436917, 0.0107106, 0.0101657, 0.000640607, 0.000200644, -0.00187112, -0.000905704, -0.00430365, -0.0114306, -0.00254204, 0.00999499, -0.00361041, -0.000814386, -0.00166425, 0.0036898, 0.00910051, -0.008081, -0.00621499, -0.00476039, 0.00352937, -0.00883984, 0.01267, 0.00553043, 0.00413786, 0.00378476, -0.000376468, -0.00935109, 0.00111507, 0.00449815, 0.00232352, -0.00526044, -0.00657905, 0.00202523, 0.00783183, 0.00331134, 0.0113301, 0.000284586, 0.00555074, -0.00561669, -0.00304549, 0.00232793, 0.00480359, 0.00417361, 0.00917487, -0.0106033, -0.00935452, 0.000472582, -0.00565795, 0.00265887, 0.00553717, 0.00830267, 0.00499382, -0.00830142, 0.00125424, 0.017598, 0.0110582, -0.00110636, 0.00206649, 0.00600223, 0.00580203, 0.0095547, -0.00347894, -0.000947946, -0.00558407, 0.00457049, -0.00695887, 0.00425796, 0.00692133, 0.0110334, 0.00080584, 0.00967977, -0.00152148, 0.000685625, 0.0111667, -0.0122356, -0.00163696, 0.00874573, -0.00747964, 9.53092e-05, 0.00125609, -0.00484556, 0.00964425, -0.0049331, -0.00674903, 0.00155267, 0.00779544, -0.00475451, -0.00446209, 0.00994591, -0.00069266, 0.0028227, -0.00315651, -0.00449883, 0.0124552, 0.00346543, -0.00341063, -0.00535677, -0.00169873, -0.000445362, -0.0113099, -0.000209029, -0.000552506, 0.00152527, 0.00539479, 0.01207, 0.00102972, 0.00209548, -0.000864239, 0.0142878, -0.00662105, -0.000244511, 0.0163849, 0.0115843, -0.000624234, 0.00149396, -0.00786256, 0.00504049, -0.00208822, -0.0053671, 0.00282316, 0.0123619, 0.00466023, -0.00132708, -0.00287518, 0.00934964, 0.0016545, -0.00141152, -0.00762542, 0.00133011, 0.00354479, -0.000554709, 0.00377187, -0.00410209, -0.0126422, 0.00330497, 0.00150753, 0.00671193, -0.000671467, -0.0020892, -0.00214926, 0.0089044, -0.00174577, -0.00241748, 0.00212694, 0.00360922, 0.000103615, -0.00459254, 0.0130479, 0.00234114, 0.00618787, 0.00378565, 0.00384775, -0.00576199, 0.00337918, -0.00326975, -0.00731954, -0.00448419, -0.00779377, -0.00449993, -0.00854937, 0.00736093, 0.00287959, 0.0059524, 0.000887351, -0.00268741, 0.00199675, -0.00819081, -0.00205734, 0.00159371, -0.00959275, -0.00528138, 0.00541374, -0.00262693, -0.00522113, 0.00781175, 0.0028028, 0.00183135, 0.00101005, 0.00178905, 0.00425291, 0.00325326, -8.29429e-05, -0.00223922, -0.0016051, -0.000555084, -0.00882739, 0.00275148, -0.0073338, -0.0130495, -0.00503639, -0.00335259, -0.00762874, -0.000773354, -0.0105282, -0.00604246, -0.0148792, 0.00509254, -0.00343132, 0.000884271, -0.00659211, -0.00150819, -0.00066668, -0.0123335, 0.00527709, 0.0030958, -0.000243152, 0.00163575, 0.00305789, -0.0032451, 0.00757511, -0.00441922, -0.00156038, -0.00333063, -0.00142264, -0.0023367, -0.000629858, -0.00742183, -0.00525475, 0.00106295, 0.00557893, -0.00735296, 0.0104845, 0.00501339, -8.00025e-05, -0.00386855, 0.0118741, 0.00171545, -0.00241817, -0.00286085, 0.0102796, 0.00680886, 0.00343526, -0.00503297, 0.00212876, 0.00683794, -0.003925, 0.000772909, 0.00182556, -0.0014057, 0.00537661, -0.00120791, 0.00606066, -0.00678337, -0.00536369, 0.00152798, 0.00113391, -0.00381443, 0.00313435, 0.00663192, -0.00171548, -0.00413377, -0.00378595, 0.00658897, 0.0115695, -0.00290412, 0.00122328, -0.00573779, -0.00666367, 0.00148788, -9.36032e-05, -0.00523711, 0.00100092, -0.00102205, -0.00262461, -0.00398723, -0.0089795, -0.00639421, -0.00406703, -0.00919813, -0.00157012, -0.00926236, 0.000345377, 0.0134298, 0.00564769, 0.0109816, 0.00467368, -0.00677339, -0.00471123, 0.00475499, -0.00101872, 0.00660738, 0.00300622, -0.00118807, 0.00205993, -0.00326071, 0.00502191, -0.010096, -0.0014445, 0.00273264, 0.00262441, -0.000504633, 0.0147181, -0.00540654, -0.00326369, -0.0115329, 0.002148, 0.00194639, -0.0101629, -0.00186599, 0.0116932, -0.000642768, -0.00464428, -0.00737856, 0.0024578, 0.00681272, 0.00416895, 0.00396947, -0.0062771, 0.000580598, -0.00101734, -0.00331908, -0.00147223, 0.0143072, 0.00945186, 0.00212467, 0.00702214, 0.00546126, 0.0012048, 0.00901155, -0.00802692, 0.0110263, -0.00302939, -0.000107664, 0.00674109, -0.00715094, -0.00416368, -0.00109759, -0.00181006, 0.00966231, -0.00492121, 0.00562308, 0.00988432, 0.00530393, -0.000590022, 0.00142536, -0.00840012, -0.00277301, -0.00480831, 0.00468205, 0.00824822, 0.0036084, 0.00174475, 0.00406262, 0.00370747, -0.00549466, 0.00743413, 0.0100927, 0.0071719, -0.00537823, 0.00443552, 0.00363082, 0.00574112, -0.0126982, -0.00116731, -0.00286545, 0.0194753, 0.0103246, 0.00858375, 0.000202983, -0.00622996, -0.00117932, -0.0201595, 0.0038894, 0.0100873, 0.00150142,
n-4:->n-6:fc-0 n-7:bias-2
n-5:->n-8:sdp-scale-0 n-9:act-0
raw weights of bias-3: -0.0285562, 0.102577, -0.0604402, -0.0525417, -0.00433865, 0.028257, -0.0173933, 0.0503867, 0.0181165, -0.0360676,
n-6:->n-10:fc-1 n-11:bias-3
n-7:->n-12:cpu-sm-0
Prototxt #chnls (C = 1) != Profile #chnls for input (NVDLA_IMG_A8B8G8R8: C = 4). Preferring #chnls from Profile for compiling.
::Edge setBindId edge=e-0 domain=0 id=0
EngineAST graph level input edge[0] is e-0
::Edge edge=e-0 bindid=0
input bind id: 0
::Edge setBindId edge=e-8 domain=1 id=0
EngineAST graph level output edge[0] is e-8
::Edge edge=e-8 bindid=0
output bind id: 0
dc-conv-0/n-0/conv1:
in e-0
out e-11
aux e-9
bias-0/n-1/conv1:
in e-11
out e-1
aux e-10
pdp-0/n-2/pool1:
in e-1
out e-2
dc-conv-1/n-3/conv2:
in e-2
out e-14
aux e-12
bias-1/n-4/conv2:
in e-14
out e-3
aux e-13
pdp-1/n-5/pool2:
in e-3
out e-4
fc-0/n-6/ip1:
in e-4
out e-17
aux e-15
bias-2/n-7/ip1:
in e-17
out e-5
aux e-16
sdp-scale-0/n-8/(No canonical node):
in e-5
out e-19
aux e-18
act-0/n-9/relu1:
in e-19
out e-6
fc-1/n-10/ip2:
in e-6
out e-22
aux e-20
bias-3/n-11/ip2:
in e-22
out e-7
aux e-21
cpu-sm-0/n-12/prob:
in e-7
out e-8
::Edge edge=e-9 bindable=0
::Edge edge=e-0 bindable=1
::Edge edge=e-10 bindable=0
::Edge edge=e-12 bindable=0
::Edge edge=e-13 bindable=0
::Edge edge=e-15 bindable=0
::Edge edge=e-16 bindable=0
::Edge edge=e-18 bindable=0
::Edge edge=e-20 bindable=0
::Edge edge=e-21 bindable=0
::Edge edge=e-11 bindable=0
::Edge edge=e-1 bindable=0
::Edge edge=e-2 bindable=0
::Edge edge=e-14 bindable=0
::Edge edge=e-3 bindable=0
::Edge edge=e-4 bindable=0
::Edge edge=e-17 bindable=0
::Edge edge=e-5 bindable=0
::Edge edge=e-19 bindable=0
::Edge edge=e-6 bindable=0
::Edge edge=e-22 bindable=0
::Edge edge=e-7 bindable=0
::Edge edge=e-8 bindable=1
printGraph: engine_ast::generateGraph
n-0:dc-conv-0/conv1: (in)e-0[tt-1], (Aux)e-9[tt-4], (out)e-11[tt-8],
n-1:bias-0/conv1: (Aux)e-10[tt-5], (in)e-11[tt-8], (out)e-1[tt-3],
n-2:pdp-0/pool1: (in)e-1[tt-3], (out)e-2[tt-3],
n-3:dc-conv-1/conv2: (in)e-2[tt-3], (Aux)e-12[tt-4], (out)e-14[tt-8],
n-4:bias-1/conv2: (Aux)e-13[tt-5], (in)e-14[tt-8], (out)e-3[tt-3],
n-5:pdp-1/pool2: (in)e-3[tt-3], (out)e-4[tt-3],
n-6:fc-0/ip1: (in)e-4[tt-3], (Aux)e-15[tt-4], (out)e-17[tt-8],
n-7:bias-2/ip1: (Aux)e-16[tt-5], (in)e-17[tt-8], (out)e-5[tt-3],
n-8:sdp-scale-0/: (in)e-5[tt-3], (Aux)e-18[tt-7], (out)e-19[tt-3],
n-9:act-0/relu1: (in)e-19[tt-3], (out)e-6[tt-3],
n-10:fc-1/ip2: (in)e-6[tt-3], (Aux)e-20[tt-4], (out)e-22[tt-8],
n-11:bias-3/ip2: (Aux)e-21[tt-5], (in)e-22[tt-8], (out)e-7[tt-3],
n-12:cpu-sm-0/prob: (in)e-7[tt-3], (out)e-8[tt-2],
::Edge edge=e-9 bindable=0
edge: e-9 tsd: tsd-0 registered
::Edge edge=e-0 bindable=1
::Edge edge=e-0 domain=0 bind_id=0
::Surface bindId(tsd-1, 0) -> 0
set bind id 0 for e-0 tsd-1
edge: e-0 tsd: tsd-1 registered
::Edge edge=e-10 bindable=0
edge: e-10 tsd: tsd-2 registered
::Edge edge=e-12 bindable=0
edge: e-12 tsd: tsd-3 registered
::Edge edge=e-13 bindable=0
edge: e-13 tsd: tsd-4 registered
::Edge edge=e-15 bindable=0
edge: e-15 tsd: tsd-5 registered
::Edge edge=e-16 bindable=0
edge: e-16 tsd: tsd-6 registered
::Edge edge=e-18 bindable=0
edge: e-18 tsd: tsd-7 registered
::Edge edge=e-20 bindable=0
edge: e-20 tsd: tsd-8 registered
::Edge edge=e-21 bindable=0
edge: e-21 tsd: tsd-9 registered
::Edge edge=e-11 bindable=0
edge: e-11 tsd: tsd-10 registered
::Edge edge=e-1 bindable=0
edge: e-1 tsd: tsd-11 registered
::Edge edge=e-2 bindable=0
edge: e-2 tsd: tsd-12 registered
::Edge edge=e-14 bindable=0
edge: e-14 tsd: tsd-13 registered
::Edge edge=e-3 bindable=0
edge: e-3 tsd: tsd-14 registered
::Edge edge=e-4 bindable=0
edge: e-4 tsd: tsd-15 registered
::Edge edge=e-17 bindable=0
edge: e-17 tsd: tsd-16 registered
::Edge edge=e-5 bindable=0
edge: e-5 tsd: tsd-17 registered
::Edge edge=e-19 bindable=0
edge: e-19 tsd: tsd-18 registered
::Edge edge=e-6 bindable=0
edge: e-6 tsd: tsd-19 registered
::Edge edge=e-22 bindable=0
edge: e-22 tsd: tsd-20 registered
::Edge edge=e-7 bindable=0
edge: e-7 tsd: tsd-21 registered
::Edge edge=e-8 bindable=1
::Edge edge=e-8 domain=1 bind_id=0
::Surface bindId(tsd-22, 1) -> 0
set bind id 0 for e-8 tsd-22
edge: e-8 tsd: tsd-22 registered
e-9 edge setting new surface format NVDLA_WEIGHT_DC_INT8
e-0 edge setting new surface format NVDLA_IMG_A8B8G8R8
e-10 edge setting new surface format NVDLA_BIAS_DATA_INT16
e-12 edge setting new surface format NVDLA_WEIGHT_DC_INT8
e-13 edge setting new surface format NVDLA_BIAS_DATA_INT16
e-15 edge setting new surface format NVDLA_WEIGHT_DC_INT8
e-16 edge setting new surface format NVDLA_BIAS_DATA_INT16
e-18 edge setting new surface format NVDLA_SCALE_DATA_INT16
e-20 edge setting new surface format NVDLA_WEIGHT_DC_INT8
e-21 edge setting new surface format NVDLA_BIAS_DATA_INT16
e-11 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-1 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-2 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-14 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-3 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-4 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-17 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-5 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-19 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-6 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-22 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-7 edge setting new surface format NVDLA_FEATURE_DATA_INT8
e-8 edge setting new surface format NVDLA_FEATURE_DATA_INT8
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-7 for tsd-7 for e-18 with NVDLA_SCALE_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-17 for tsd-17 for e-5 with NVDLA_FEATURE_DATA_INT8
tb-18 for tsd-18 for e-19 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-18 edge already has set surface format NVDLA_SCALE_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-5 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-19 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-7 for tsd-7 for e-18 with NVDLA_SCALE_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-17 for tsd-17 for e-5 with NVDLA_FEATURE_DATA_INT8
tb-18 for tsd-18 for e-19 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0

Try Merging: bias-2 & sdp-scale-0
Merging: Sucess
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-19 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-18 for tsd-18 for e-19 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-19 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-18 for tsd-18 for e-19 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0

Try Merging: dc-conv-0 & bias-0
Merging: Not Feasible

Try Merging: dc-conv-1 & bias-1
Merging: Not Feasible

Try Merging: fc-0 & bias-2
Merging: Not Feasible

Try Merging: bias-2 & act-0
Merging: Sucess
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0

Try Merging: fc-0 & bias-2
Merging: Not Feasible

Try Merging: fc-1 & bias-3
Merging: Not Feasible
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
rawBias/SiSw -0.189939 / ( 0.00784518 * 0.00345303 ) = -7011.49 -> -7011 -> -70112^-0
rawBias/Si
Sw -0.0122971 / ( 0.00784518 * 0.00345303 ) = -453.941 -> -454 -> -4542^-0
rawBias/Si
Sw -0.127561 / ( 0.00784518 * 0.00345303 ) = -4708.85 -> -4709 -> -47092^-0
rawBias/Si
Sw -0.0370565 / ( 0.00784518 * 0.00345303 ) = -1367.92 -> -1368 -> -13682^-0
rawBias/Si
Sw 0.0295314 / ( 0.00784518 * 0.00345303 ) = 1090.14 -> 1090 -> 10902^-0
rawBias/Si
Sw 0.00792045 / ( 0.00784518 * 0.00345303 ) = 292.379 -> 292 -> 2922^-0
rawBias/Si
Sw -0.212149 / ( 0.00784518 * 0.00345303 ) = -7831.37 -> -7831 -> -78312^-0
rawBias/Si
Sw -0.159978 / ( 0.00784518 * 0.00345303 ) = -5905.52 -> -5906 -> -59062^-0
rawBias/Si
Sw -0.100598 / ( 0.00784518 * 0.00345303 ) = -3713.51 -> -3714 -> -37142^-0
rawBias/Si
Sw -0.0833697 / ( 0.00784518 * 0.00345303 ) = -3077.55 -> -3078 -> -30782^-0
rawBias/Si
Sw -0.234531 / ( 0.00784518 * 0.00345303 ) = -8657.59 -> -8658 -> -86582^-0
rawBias/Si
Sw -0.257588 / ( 0.00784518 * 0.00345303 ) = -9508.72 -> -9509 -> -95092^-0
rawBias/Si
Sw 0.211461 / ( 0.00784518 * 0.00345303 ) = 7805.96 -> 7806 -> 78062^-0
rawBias/Si
Sw -0.181789 / ( 0.00784518 * 0.00345303 ) = -6710.65 -> -6711 -> -67112^-0
rawBias/Si
Sw -0.289102 / ( 0.00784518 * 0.00345303 ) = -10672.1 -> -10672 -> -106722^-0
rawBias/Si
Sw -0.106559 / ( 0.00784518 * 0.00345303 ) = -3933.57 -> -3934 -> -39342^-0
rawBias/Si
Sw -0.24326 / ( 0.00784518 * 0.00345303 ) = -8979.83 -> -8980 -> -89802^-0
rawBias/Si
Sw 0.238864 / ( 0.00784518 * 0.00345303 ) = 8817.55 -> 8818 -> 88182^-0
rawBias/Si
Sw -0.0477089 / ( 0.00784518 * 0.00345303 ) = -1761.15 -> -1761 -> -17612^-0
rawBias/Si
Sw 0.183563 / ( 0.00784518 * 0.00345303 ) = 6776.12 -> 6776 -> 67762^-0
rawBias/Si
Sw 0.0266923 / ( 0.0137246 * 0.00122683 ) = 1585.26 -> 1585 -> 15852^-0
rawBias/Si
Sw -0.0210181 / ( 0.0137246 * 0.00122683 ) = -1248.27 -> -1248 -> -12482^-0
rawBias/Si
Sw -0.0988805 / ( 0.0137246 * 0.00122683 ) = -5872.53 -> -5873 -> -58732^-0
rawBias/Si
Sw 0.0149911 / ( 0.0137246 * 0.00122683 ) = 890.325 -> 890 -> 8902^-0
rawBias/Si
Sw -0.0189492 / ( 0.0137246 * 0.00122683 ) = -1125.4 -> -1125 -> -11252^-0
rawBias/Si
Sw -0.0394126 / ( 0.0137246 * 0.00122683 ) = -2340.72 -> -2341 -> -23412^-0
rawBias/Si
Sw -0.00789565 / ( 0.0137246 * 0.00122683 ) = -468.924 -> -469 -> -4692^-0
rawBias/Si
Sw -0.0799653 / ( 0.0137246 * 0.00122683 ) = -4749.15 -> -4749 -> -47492^-0
rawBias/Si
Sw -0.0767248 / ( 0.0137246 * 0.00122683 ) = -4556.7 -> -4557 -> -45572^-0
rawBias/Si
Sw -0.0982602 / ( 0.0137246 * 0.00122683 ) = -5835.69 -> -5836 -> -58362^-0
rawBias/Si
Sw -0.0398029 / ( 0.0137246 * 0.00122683 ) = -2363.9 -> -2364 -> -23642^-0
rawBias/Si
Sw -0.0778087 / ( 0.0137246 * 0.00122683 ) = -4621.07 -> -4621 -> -46212^-0
rawBias/Si
Sw -0.041269 / ( 0.0137246 * 0.00122683 ) = -2450.97 -> -2451 -> -24512^-0
rawBias/Si
Sw 0.0100064 / ( 0.0137246 * 0.00122683 ) = 594.283 -> 594 -> 5942^-0
rawBias/Si
Sw -0.0130392 / ( 0.0137246 * 0.00122683 ) = -774.398 -> -774 -> -7742^-0
rawBias/Si
Sw 0.0253316 / ( 0.0137246 * 0.00122683 ) = 1504.45 -> 1504 -> 15042^-0
rawBias/Si
Sw 0.0122957 / ( 0.0137246 * 0.00122683 ) = 730.241 -> 730 -> 7302^-0
rawBias/Si
Sw -0.0409093 / ( 0.0137246 * 0.00122683 ) = -2429.61 -> -2430 -> -24302^-0
rawBias/Si
Sw -0.00850784 / ( 0.0137246 * 0.00122683 ) = -505.282 -> -505 -> -5052^-0
rawBias/Si
Sw 0.00141418 / ( 0.0137246 * 0.00122683 ) = 83.9884 -> 84 -> 842^-0
rawBias/Si
Sw -0.0388875 / ( 0.0137246 * 0.00122683 ) = -2309.53 -> -2310 -> -23102^-0
rawBias/Si
Sw -0.0170651 / ( 0.0137246 * 0.00122683 ) = -1013.5 -> -1013 -> -10132^-0
rawBias/Si
Sw -0.0327771 / ( 0.0137246 * 0.00122683 ) = -1946.64 -> -1947 -> -19472^-0
rawBias/Si
Sw -0.0181817 / ( 0.0137246 * 0.00122683 ) = -1079.82 -> -1080 -> -10802^-0
rawBias/Si
Sw -0.086323 / ( 0.0137246 * 0.00122683 ) = -5126.74 -> -5127 -> -51272^-0
rawBias/Si
Sw -0.0200653 / ( 0.0137246 * 0.00122683 ) = -1191.68 -> -1192 -> -11922^-0
rawBias/Si
Sw -0.0210894 / ( 0.0137246 * 0.00122683 ) = -1252.5 -> -1253 -> -12532^-0
rawBias/Si
Sw -0.0629572 / ( 0.0137246 * 0.00122683 ) = -3739.04 -> -3739 -> -37392^-0
rawBias/Si
Sw -0.0672779 / ( 0.0137246 * 0.00122683 ) = -3995.64 -> -3996 -> -39962^-0
rawBias/Si
Sw 0.0461048 / ( 0.0137246 * 0.00122683 ) = 2738.17 -> 2738 -> 27382^-0
rawBias/Si
Sw -0.0657149 / ( 0.0137246 * 0.00122683 ) = -3902.82 -> -3903 -> -39032^-0
rawBias/Si
Sw -0.0425482 / ( 0.0137246 * 0.00122683 ) = -2526.94 -> -2527 -> -25272^-0
rawBias/Si
Sw 0.0145769 / ( 0.0137246 * 0.00122683 ) = 865.723 -> 866 -> 8662^-0
rawBias/Si
Sw -0.0452918 / ( 0.0137246 * 0.00122683 ) = -2689.88 -> -2690 -> -26902^-0
rawBias/Si
Sw -0.0231088 / ( 0.0137246 * 0.00122683 ) = -1372.44 -> -1372 -> -13722^-0
rawBias/Si
Sw -0.106806 / ( 0.0137246 * 0.00122683 ) = -6343.22 -> -6343 -> -63432^-0
rawBias/Si
Sw -0.0240538 / ( 0.0137246 * 0.00122683 ) = -1428.56 -> -1429 -> -14292^-0
rawBias/Si
Sw -0.0203086 / ( 0.0137246 * 0.00122683 ) = -1206.13 -> -1206 -> -12062^-0
rawBias/Si
Sw -0.0370518 / ( 0.0137246 * 0.00122683 ) = -2200.51 -> -2201 -> -22012^-0
rawBias/Si
Sw -0.0241791 / ( 0.0137246 * 0.00122683 ) = -1436 -> -1436 -> -14362^-0
rawBias/Si
Sw -0.0967711 / ( 0.0137246 * 0.00122683 ) = -5747.25 -> -5747 -> -57472^-0
rawBias/Si
Sw 0.0139108 / ( 0.0137246 * 0.00122683 ) = 826.167 -> 826 -> 8262^-0
rawBias/Si
Sw -0.0661912 / ( 0.0137246 * 0.00122683 ) = -3931.1 -> -3931 -> -39312^-0
rawBias/Si
Sw -0.0684465 / ( 0.0137246 * 0.00122683 ) = -4065.05 -> -4065 -> -40652^-0
rawBias/Si
Sw -0.0150072 / ( 0.0137246 * 0.00122683 ) = -891.278 -> -891 -> -8912^-0
rawBias/Si
Sw 0.0310905 / ( 0.0137246 * 0.00122683 ) = 1846.47 -> 1846 -> 18462^-0
rawBias/Si
Sw -0.0493356 / ( 0.0137246 * 0.00122683 ) = -2930.05 -> -2930 -> -29302^-0
rawBias/Si
Sw -0.0828543 / ( 0.0137246 * 0.00122683 ) = -4920.73 -> -4921 -> -49212^-0
rawBias/Si
Sw 0.007227 / ( 0.0137246 * 0.00122683 ) = 429.213 -> 429 -> 4292^-0
rawBias/Si
Sw -0.0534207 / ( 0.0137246 * 0.00122683 ) = -3172.66 -> -3173 -> -31732^-0
rawBias/Si
Sw -0.0160908 / ( 0.0670743 * 0.000756184 ) = -317.243 -> -317 -> -3172^-0
rawBias/Si
Sw -0.00596529 / ( 0.0670743 * 0.000756184 ) = -117.611 -> -118 -> -1182^-0
rawBias/Si
Sw 0.00571031 / ( 0.0670743 * 0.000756184 ) = 112.584 -> 113 -> 1132^-0
rawBias/Si
Sw 0.0141845 / ( 0.0670743 * 0.000756184 ) = 279.66 -> 280 -> 2802^-0
rawBias/Si
Sw 0.00457521 / ( 0.0670743 * 0.000756184 ) = 90.2043 -> 90 -> 902^-0
rawBias/Si
Sw 0.000918757 / ( 0.0670743 * 0.000756184 ) = 18.1141 -> 18 -> 182^-0
rawBias/Si
Sw 0.00182481 / ( 0.0670743 * 0.000756184 ) = 35.9777 -> 36 -> 362^-0
rawBias/Si
Sw 0.00487062 / ( 0.0670743 * 0.000756184 ) = 96.0286 -> 96 -> 962^-0
rawBias/Si
Sw -0.010953 / ( 0.0670743 * 0.000756184 ) = -215.949 -> -216 -> -2162^-0
rawBias/Si
Sw 0.00405698 / ( 0.0670743 * 0.000756184 ) = 79.9869 -> 80 -> 802^-0
rawBias/Si
Sw 0.00241628 / ( 0.0670743 * 0.000756184 ) = 47.6391 -> 48 -> 482^-0
rawBias/Si
Sw 0.00215307 / ( 0.0670743 * 0.000756184 ) = 42.4497 -> 42 -> 422^-0
rawBias/Si
Sw -0.00274916 / ( 0.0670743 * 0.000756184 ) = -54.2021 -> -54 -> -542^-0
rawBias/Si
Sw 0.000693179 / ( 0.0670743 * 0.000756184 ) = 13.6666 -> 14 -> 142^-0
rawBias/Si
Sw 0.00434156 / ( 0.0670743 * 0.000756184 ) = 85.5977 -> 86 -> 862^-0
rawBias/Si
Sw -0.00366797 / ( 0.0670743 * 0.000756184 ) = -72.3172 -> -72 -> -722^-0
rawBias/Si
Sw -0.000154131 / ( 0.0670743 * 0.000756184 ) = -3.03883 -> -3 -> -32^-0
rawBias/Si
Sw 0.0017697 / ( 0.0670743 * 0.000756184 ) = 34.8912 -> 35 -> 352^-0
rawBias/Si
Sw -0.00575291 / ( 0.0670743 * 0.000756184 ) = -113.424 -> -113 -> -1132^-0
rawBias/Si
Sw 0.00485401 / ( 0.0670743 * 0.000756184 ) = 95.701 -> 96 -> 962^-0
rawBias/Si
Sw 0.0049358 / ( 0.0670743 * 0.000756184 ) = 97.3137 -> 97 -> 972^-0
rawBias/Si
Sw 0.00053109 / ( 0.0670743 * 0.000756184 ) = 10.4709 -> 10 -> 102^-0
rawBias/Si
Sw 0.0035042 / ( 0.0670743 * 0.000756184 ) = 69.0884 -> 69 -> 692^-0
rawBias/Si
Sw 0.00115322 / ( 0.0670743 * 0.000756184 ) = 22.7367 -> 23 -> 232^-0
rawBias/Si
Sw -0.00513705 / ( 0.0670743 * 0.000756184 ) = -101.281 -> -101 -> -1012^-0
rawBias/Si
Sw 0.00220661 / ( 0.0670743 * 0.000756184 ) = 43.5053 -> 44 -> 442^-0
rawBias/Si
Sw 0.0130855 / ( 0.0670743 * 0.000756184 ) = 257.992 -> 258 -> 2582^-0
rawBias/Si
Sw -0.0040593 / ( 0.0670743 * 0.000756184 ) = -80.0326 -> -80 -> -802^-0
rawBias/Si
Sw 0.00742773 / ( 0.0670743 * 0.000756184 ) = 146.444 -> 146 -> 1462^-0
rawBias/Si
Sw -0.00207535 / ( 0.0670743 * 0.000756184 ) = -40.9173 -> -41 -> -412^-0
rawBias/Si
Sw 0.00708569 / ( 0.0670743 * 0.000756184 ) = 139.701 -> 140 -> 1402^-0
rawBias/Si
Sw 0.0130998 / ( 0.0670743 * 0.000756184 ) = 258.273 -> 258 -> 2582^-0
rawBias/Si
Sw 0.00320843 / ( 0.0670743 * 0.000756184 ) = 63.2571 -> 63 -> 632^-0
rawBias/Si
Sw 0.00807043 / ( 0.0670743 * 0.000756184 ) = 159.116 -> 159 -> 1592^-0
rawBias/Si
Sw -0.00725761 / ( 0.0670743 * 0.000756184 ) = -143.09 -> -143 -> -1432^-0
rawBias/Si
Sw 0.00397733 / ( 0.0670743 * 0.000756184 ) = 78.4166 -> 78 -> 782^-0
rawBias/Si
Sw 0.000158773 / ( 0.0670743 * 0.000756184 ) = 3.13036 -> 3 -> 32^-0
rawBias/Si
Sw -0.00426002 / ( 0.0670743 * 0.000756184 ) = -83.9899 -> -84 -> -842^-0
rawBias/Si
Sw -0.00715113 / ( 0.0670743 * 0.000756184 ) = -140.991 -> -141 -> -1412^-0
rawBias/Si
Sw 0.00737358 / ( 0.0670743 * 0.000756184 ) = 145.377 -> 145 -> 1452^-0
rawBias/Si
Sw 0.000183469 / ( 0.0670743 * 0.000756184 ) = 3.61726 -> 4 -> 42^-0
rawBias/Si
Sw 0.0121654 / ( 0.0670743 * 0.000756184 ) = 239.851 -> 240 -> 2402^-0
rawBias/Si
Sw -0.00185334 / ( 0.0670743 * 0.000756184 ) = -36.5403 -> -37 -> -372^-0
rawBias/Si
Sw -0.00478108 / ( 0.0670743 * 0.000756184 ) = -94.2632 -> -94 -> -942^-0
rawBias/Si
Sw -0.00362065 / ( 0.0670743 * 0.000756184 ) = -71.3842 -> -71 -> -712^-0
rawBias/Si
Sw 0.00572813 / ( 0.0670743 * 0.000756184 ) = 112.935 -> 113 -> 1132^-0
rawBias/Si
Sw -0.000719333 / ( 0.0670743 * 0.000756184 ) = -14.1823 -> -14 -> -142^-0
rawBias/Si
Sw 0.0062139 / ( 0.0670743 * 0.000756184 ) = 122.513 -> 123 -> 1232^-0
rawBias/Si
Sw -0.0042254 / ( 0.0670743 * 0.000756184 ) = -83.3075 -> -83 -> -832^-0
rawBias/Si
Sw 0.004558 / ( 0.0670743 * 0.000756184 ) = 89.8649 -> 90 -> 902^-0
rawBias/Si
Sw 0.00368481 / ( 0.0670743 * 0.000756184 ) = 72.6492 -> 73 -> 732^-0
rawBias/Si
Sw -0.00638158 / ( 0.0670743 * 0.000756184 ) = -125.819 -> -126 -> -1262^-0
rawBias/Si
Sw -0.000542821 / ( 0.0670743 * 0.000756184 ) = -10.7022 -> -11 -> -112^-0
rawBias/Si
Sw 0.0056826 / ( 0.0670743 * 0.000756184 ) = 112.037 -> 112 -> 1122^-0
rawBias/Si
Sw -0.0025529 / ( 0.0670743 * 0.000756184 ) = -50.3327 -> -50 -> -502^-0
rawBias/Si
Sw -0.000308287 / ( 0.0670743 * 0.000756184 ) = -6.07815 -> -6 -> -62^-0
rawBias/Si
Sw 0.00540015 / ( 0.0670743 * 0.000756184 ) = 106.469 -> 106 -> 1062^-0
rawBias/Si
Sw 0.0128624 / ( 0.0670743 * 0.000756184 ) = 253.594 -> 254 -> 2542^-0
rawBias/Si
Sw -0.00784788 / ( 0.0670743 * 0.000756184 ) = -154.728 -> -155 -> -1552^-0
rawBias/Si
Sw -0.0110209 / ( 0.0670743 * 0.000756184 ) = -217.286 -> -217 -> -2172^-0
rawBias/Si
Sw 0.00165233 / ( 0.0670743 * 0.000756184 ) = 32.5771 -> 33 -> 332^-0
rawBias/Si
Sw -0.00979184 / ( 0.0670743 * 0.000756184 ) = -193.055 -> -193 -> -1932^-0
rawBias/Si
Sw 0.00627872 / ( 0.0670743 * 0.000756184 ) = 123.79 -> 124 -> 1242^-0
rawBias/Si
Sw 0.00449704 / ( 0.0670743 * 0.000756184 ) = 88.6631 -> 89 -> 892^-0
rawBias/Si
Sw 0.00551083 / ( 0.0670743 * 0.000756184 ) = 108.651 -> 109 -> 1092^-0
rawBias/Si
Sw 0.0103216 / ( 0.0670743 * 0.000756184 ) = 203.499 -> 203 -> 2032^-0
rawBias/Si
Sw -8.54422e-05 / ( 0.0670743 * 0.000756184 ) = -1.68457 -> -2 -> -22^-0
rawBias/Si
Sw 0.0100825 / ( 0.0670743 * 0.000756184 ) = 198.786 -> 199 -> 1992^-0
rawBias/Si
Sw -0.00381044 / ( 0.0670743 * 0.000756184 ) = -75.1261 -> -75 -> -752^-0
rawBias/Si
Sw 0.00709125 / ( 0.0670743 * 0.000756184 ) = 139.81 -> 140 -> 1402^-0
rawBias/Si
Sw -0.00162287 / ( 0.0670743 * 0.000756184 ) = -31.9964 -> -32 -> -322^-0
rawBias/Si
Sw 0.0100035 / ( 0.0670743 * 0.000756184 ) = 197.228 -> 197 -> 1972^-0
rawBias/Si
Sw 0.00153186 / ( 0.0670743 * 0.000756184 ) = 30.2019 -> 30 -> 302^-0
rawBias/Si
Sw -0.00192266 / ( 0.0670743 * 0.000756184 ) = -37.9069 -> -38 -> -382^-0
rawBias/Si
Sw 0.0177549 / ( 0.0670743 * 0.000756184 ) = 350.053 -> 350 -> 3502^-0
rawBias/Si
Sw 0.0028109 / ( 0.0670743 * 0.000756184 ) = 55.4193 -> 55 -> 552^-0
rawBias/Si
Sw -0.000116088 / ( 0.0670743 * 0.000756184 ) = -2.28878 -> -2 -> -22^-0
rawBias/Si
Sw 0.00857696 / ( 0.0670743 * 0.000756184 ) = 169.102 -> 169 -> 1692^-0
rawBias/Si
Sw 0.00336946 / ( 0.0670743 * 0.000756184 ) = 66.4318 -> 66 -> 662^-0
rawBias/Si
Sw -0.00577133 / ( 0.0670743 * 0.000756184 ) = -113.787 -> -114 -> -1142^-0
rawBias/Si
Sw 0.00734607 / ( 0.0670743 * 0.000756184 ) = 144.834 -> 145 -> 1452^-0
rawBias/Si
Sw 0.00367282 / ( 0.0670743 * 0.000756184 ) = 72.4128 -> 72 -> 722^-0
rawBias/Si
Sw 0.00290999 / ( 0.0670743 * 0.000756184 ) = 57.3731 -> 57 -> 572^-0
rawBias/Si
Sw -0.00764302 / ( 0.0670743 * 0.000756184 ) = -150.689 -> -151 -> -1512^-0
rawBias/Si
Sw -0.00435251 / ( 0.0670743 * 0.000756184 ) = -85.8135 -> -86 -> -862^-0
rawBias/Si
Sw 0.0112651 / ( 0.0670743 * 0.000756184 ) = 222.102 -> 222 -> 2222^-0
rawBias/Si
Sw -0.00474309 / ( 0.0670743 * 0.000756184 ) = -93.5142 -> -94 -> -942^-0
rawBias/Si
Sw -0.00293007 / ( 0.0670743 * 0.000756184 ) = -57.769 -> -58 -> -582^-0
rawBias/Si
Sw -0.00337636 / ( 0.0670743 * 0.000756184 ) = -66.568 -> -67 -> -672^-0
rawBias/Si
Sw 0.00350733 / ( 0.0670743 * 0.000756184 ) = 69.1502 -> 69 -> 692^-0
rawBias/Si
Sw 0.00413807 / ( 0.0670743 * 0.000756184 ) = 81.5858 -> 82 -> 822^-0
rawBias/Si
Sw 0.00648092 / ( 0.0670743 * 0.000756184 ) = 127.777 -> 128 -> 1282^-0
rawBias/Si
Sw -0.00520403 / ( 0.0670743 * 0.000756184 ) = -102.602 -> -103 -> -1032^-0
rawBias/Si
Sw -0.00946726 / ( 0.0670743 * 0.000756184 ) = -186.655 -> -187 -> -1872^-0
rawBias/Si
Sw -0.00299138 / ( 0.0670743 * 0.000756184 ) = -58.9778 -> -59 -> -592^-0
rawBias/Si
Sw -0.0129122 / ( 0.0670743 * 0.000756184 ) = -254.575 -> -255 -> -2552^-0
rawBias/Si
Sw 0.00296569 / ( 0.0670743 * 0.000756184 ) = 58.4711 -> 58 -> 582^-0
rawBias/Si
Sw -0.00357285 / ( 0.0670743 * 0.000756184 ) = -70.4418 -> -70 -> -702^-0
rawBias/Si
Sw 0.00962356 / ( 0.0670743 * 0.000756184 ) = 189.737 -> 190 -> 1902^-0
rawBias/Si
Sw 0.00297598 / ( 0.0670743 * 0.000756184 ) = 58.6741 -> 59 -> 592^-0
rawBias/Si
Sw -0.00135108 / ( 0.0670743 * 0.000756184 ) = -26.6377 -> -27 -> -272^-0
rawBias/Si
Sw 0.00128038 / ( 0.0670743 * 0.000756184 ) = 25.2438 -> 25 -> 252^-0
rawBias/Si
Sw -0.00179181 / ( 0.0670743 * 0.000756184 ) = -35.3272 -> -35 -> -352^-0
rawBias/Si
Sw 0.010566 / ( 0.0670743 * 0.000756184 ) = 208.318 -> 208 -> 2082^-0
rawBias/Si
Sw 0.0180151 / ( 0.0670743 * 0.000756184 ) = 355.183 -> 355 -> 3552^-0
rawBias/Si
Sw -0.00460528 / ( 0.0670743 * 0.000756184 ) = -90.797 -> -91 -> -912^-0
rawBias/Si
Sw -0.000536081 / ( 0.0670743 * 0.000756184 ) = -10.5693 -> -11 -> -112^-0
rawBias/Si
Sw -0.00138855 / ( 0.0670743 * 0.000756184 ) = -27.3764 -> -27 -> -272^-0
rawBias/Si
Sw 0.00448689 / ( 0.0670743 * 0.000756184 ) = 88.463 -> 88 -> 882^-0
rawBias/Si
Sw 0.00400495 / ( 0.0670743 * 0.000756184 ) = 78.9611 -> 79 -> 792^-0
rawBias/Si
Sw 0.00287102 / ( 0.0670743 * 0.000756184 ) = 56.6046 -> 57 -> 572^-0
rawBias/Si
Sw 0.00166039 / ( 0.0670743 * 0.000756184 ) = 32.7361 -> 33 -> 332^-0
rawBias/Si
Sw -0.00473274 / ( 0.0670743 * 0.000756184 ) = -93.3102 -> -93 -> -932^-0
rawBias/Si
Sw 0.000981164 / ( 0.0670743 * 0.000756184 ) = 19.3445 -> 19 -> 192^-0
rawBias/Si
Sw -0.00258576 / ( 0.0670743 * 0.000756184 ) = -50.9806 -> -51 -> -512^-0
rawBias/Si
Sw -0.00459088 / ( 0.0670743 * 0.000756184 ) = -90.5133 -> -91 -> -912^-0
rawBias/Si
Sw -0.00192138 / ( 0.0670743 * 0.000756184 ) = -37.8817 -> -38 -> -382^-0
rawBias/Si
Sw 0.00659034 / ( 0.0670743 * 0.000756184 ) = 129.934 -> 130 -> 1302^-0
rawBias/Si
Sw -0.00184609 / ( 0.0670743 * 0.000756184 ) = -36.3974 -> -36 -> -362^-0
rawBias/Si
Sw 0.000764893 / ( 0.0670743 * 0.000756184 ) = 15.0805 -> 15 -> 152^-0
rawBias/Si
Sw -0.00330961 / ( 0.0670743 * 0.000756184 ) = -65.2518 -> -65 -> -652^-0
rawBias/Si
Sw -0.00467795 / ( 0.0670743 * 0.000756184 ) = -92.2298 -> -92 -> -922^-0
rawBias/Si
Sw 0.00309991 / ( 0.0670743 * 0.000756184 ) = 61.1174 -> 61 -> 612^-0
rawBias/Si
Sw 0.00156627 / ( 0.0670743 * 0.000756184 ) = 30.8804 -> 31 -> 312^-0
rawBias/Si
Sw 0.00300166 / ( 0.0670743 * 0.000756184 ) = 59.1803 -> 59 -> 592^-0
rawBias/Si
Sw -0.00358254 / ( 0.0670743 * 0.000756184 ) = -70.633 -> -71 -> -712^-0
rawBias/Si
Sw -0.00415914 / ( 0.0670743 * 0.000756184 ) = -82.0011 -> -82 -> -822^-0
rawBias/Si
Sw 0.0100725 / ( 0.0670743 * 0.000756184 ) = 198.589 -> 199 -> 1992^-0
rawBias/Si
Sw -0.0018276 / ( 0.0670743 * 0.000756184 ) = -36.0328 -> -36 -> -362^-0
rawBias/Si
Sw 0.00587635 / ( 0.0670743 * 0.000756184 ) = 115.857 -> 116 -> 1162^-0
rawBias/Si
Sw 0.00282938 / ( 0.0670743 * 0.000756184 ) = 55.7838 -> 56 -> 562^-0
rawBias/Si
Sw 0.00114757 / ( 0.0670743 * 0.000756184 ) = 22.6254 -> 23 -> 232^-0
rawBias/Si
Sw 0.00344898 / ( 0.0670743 * 0.000756184 ) = 67.9997 -> 68 -> 682^-0
rawBias/Si
Sw -0.00504183 / ( 0.0670743 * 0.000756184 ) = -99.404 -> -99 -> -992^-0
rawBias/Si
Sw 0.0038491 / ( 0.0670743 * 0.000756184 ) = 75.8883 -> 76 -> 762^-0
rawBias/Si
Sw -0.000436917 / ( 0.0670743 * 0.000756184 ) = -8.6142 -> -9 -> -92^-0
rawBias/Si
Sw 0.0107106 / ( 0.0670743 * 0.000756184 ) = 211.168 -> 211 -> 2112^-0
rawBias/Si
Sw 0.0101657 / ( 0.0670743 * 0.000756184 ) = 200.426 -> 200 -> 2002^-0
rawBias/Si
Sw 0.000640607 / ( 0.0670743 * 0.000756184 ) = 12.6301 -> 13 -> 132^-0
rawBias/Si
Sw 0.000200644 / ( 0.0670743 * 0.000756184 ) = 3.95588 -> 4 -> 42^-0
rawBias/Si
Sw -0.00187112 / ( 0.0670743 * 0.000756184 ) = -36.8908 -> -37 -> -372^-0
rawBias/Si
Sw -0.000905704 / ( 0.0670743 * 0.000756184 ) = -17.8567 -> -18 -> -182^-0
rawBias/Si
Sw -0.00430365 / ( 0.0670743 * 0.000756184 ) = -84.8502 -> -85 -> -852^-0
rawBias/Si
Sw -0.0114306 / ( 0.0670743 * 0.000756184 ) = -225.364 -> -225 -> -2252^-0
rawBias/Si
Sw -0.00254204 / ( 0.0670743 * 0.000756184 ) = -50.1186 -> -50 -> -502^-0
rawBias/Si
Sw 0.00999499 / ( 0.0670743 * 0.000756184 ) = 197.06 -> 197 -> 1972^-0
rawBias/Si
Sw -0.00361041 / ( 0.0670743 * 0.000756184 ) = -71.1824 -> -71 -> -712^-0
rawBias/Si
Sw -0.000814386 / ( 0.0670743 * 0.000756184 ) = -16.0563 -> -16 -> -162^-0
rawBias/Si
Sw -0.00166425 / ( 0.0670743 * 0.000756184 ) = -32.8121 -> -33 -> -332^-0
rawBias/Si
Sw 0.0036898 / ( 0.0670743 * 0.000756184 ) = 72.7476 -> 73 -> 732^-0
rawBias/Si
Sw 0.00910051 / ( 0.0670743 * 0.000756184 ) = 179.425 -> 179 -> 1792^-0
rawBias/Si
Sw -0.008081 / ( 0.0670743 * 0.000756184 ) = -159.324 -> -159 -> -1592^-0
rawBias/Si
Sw -0.00621499 / ( 0.0670743 * 0.000756184 ) = -122.534 -> -123 -> -1232^-0
rawBias/Si
Sw -0.00476039 / ( 0.0670743 * 0.000756184 ) = -93.8552 -> -94 -> -942^-0
rawBias/Si
Sw 0.00352937 / ( 0.0670743 * 0.000756184 ) = 69.5847 -> 70 -> 702^-0
rawBias/Si
Sw -0.00883984 / ( 0.0670743 * 0.000756184 ) = -174.285 -> -174 -> -1742^-0
rawBias/Si
Sw 0.01267 / ( 0.0670743 * 0.000756184 ) = 249.801 -> 250 -> 2502^-0
rawBias/Si
Sw 0.00553043 / ( 0.0670743 * 0.000756184 ) = 109.037 -> 109 -> 1092^-0
rawBias/Si
Sw 0.00413786 / ( 0.0670743 * 0.000756184 ) = 81.5815 -> 82 -> 822^-0
rawBias/Si
Sw 0.00378476 / ( 0.0670743 * 0.000756184 ) = 74.6199 -> 75 -> 752^-0
rawBias/Si
Sw -0.000376468 / ( 0.0670743 * 0.000756184 ) = -7.4224 -> -7 -> -72^-0
rawBias/Si
Sw -0.00935109 / ( 0.0670743 * 0.000756184 ) = -184.365 -> -184 -> -1842^-0
rawBias/Si
Sw 0.00111507 / ( 0.0670743 * 0.000756184 ) = 21.9846 -> 22 -> 222^-0
rawBias/Si
Sw 0.00449815 / ( 0.0670743 * 0.000756184 ) = 88.6849 -> 89 -> 892^-0
rawBias/Si
Sw 0.00232352 / ( 0.0670743 * 0.000756184 ) = 45.8102 -> 46 -> 462^-0
rawBias/Si
Sw -0.00526044 / ( 0.0670743 * 0.000756184 ) = -103.714 -> -104 -> -1042^-0
rawBias/Si
Sw -0.00657905 / ( 0.0670743 * 0.000756184 ) = -129.712 -> -130 -> -1302^-0
rawBias/Si
Sw 0.00202523 / ( 0.0670743 * 0.000756184 ) = 39.9293 -> 40 -> 402^-0
rawBias/Si
Sw 0.00783183 / ( 0.0670743 * 0.000756184 ) = 154.411 -> 154 -> 1542^-0
rawBias/Si
Sw 0.00331134 / ( 0.0670743 * 0.000756184 ) = 65.286 -> 65 -> 652^-0
rawBias/Si
Sw 0.0113301 / ( 0.0670743 * 0.000756184 ) = 223.384 -> 223 -> 2232^-0
rawBias/Si
Sw 0.000284586 / ( 0.0670743 * 0.000756184 ) = 5.61086 -> 6 -> 62^-0
rawBias/Si
Sw 0.00555074 / ( 0.0670743 * 0.000756184 ) = 109.438 -> 109 -> 1092^-0
rawBias/Si
Sw -0.00561669 / ( 0.0670743 * 0.000756184 ) = -110.738 -> -111 -> -1112^-0
rawBias/Si
Sw -0.00304549 / ( 0.0670743 * 0.000756184 ) = -60.0444 -> -60 -> -602^-0
rawBias/Si
Sw 0.00232793 / ( 0.0670743 * 0.000756184 ) = 45.8973 -> 46 -> 462^-0
rawBias/Si
Sw 0.00480359 / ( 0.0670743 * 0.000756184 ) = 94.7071 -> 95 -> 952^-0
rawBias/Si
Sw 0.00417361 / ( 0.0670743 * 0.000756184 ) = 82.2863 -> 82 -> 822^-0
rawBias/Si
Sw 0.00917487 / ( 0.0670743 * 0.000756184 ) = 180.891 -> 181 -> 1812^-0
rawBias/Si
Sw -0.0106033 / ( 0.0670743 * 0.000756184 ) = -209.054 -> -209 -> -2092^-0
rawBias/Si
Sw -0.00935452 / ( 0.0670743 * 0.000756184 ) = -184.433 -> -184 -> -1842^-0
rawBias/Si
Sw 0.000472582 / ( 0.0670743 * 0.000756184 ) = 9.31736 -> 9 -> 92^-0
rawBias/Si
Sw -0.00565795 / ( 0.0670743 * 0.000756184 ) = -111.551 -> -112 -> -1122^-0
rawBias/Si
Sw 0.00265887 / ( 0.0670743 * 0.000756184 ) = 52.4219 -> 52 -> 522^-0
rawBias/Si
Sw 0.00553717 / ( 0.0670743 * 0.000756184 ) = 109.17 -> 109 -> 1092^-0
rawBias/Si
Sw 0.00830267 / ( 0.0670743 * 0.000756184 ) = 163.694 -> 164 -> 1642^-0
rawBias/Si
Sw 0.00499382 / ( 0.0670743 * 0.000756184 ) = 98.4575 -> 98 -> 982^-0
rawBias/Si
Sw -0.00830142 / ( 0.0670743 * 0.000756184 ) = -163.67 -> -164 -> -1642^-0
rawBias/Si
Sw 0.00125424 / ( 0.0670743 * 0.000756184 ) = 24.7284 -> 25 -> 252^-0
rawBias/Si
Sw 0.017598 / ( 0.0670743 * 0.000756184 ) = 346.96 -> 347 -> 3472^-0
rawBias/Si
Sw 0.0110582 / ( 0.0670743 * 0.000756184 ) = 218.023 -> 218 -> 2182^-0
rawBias/Si
Sw -0.00110636 / ( 0.0670743 * 0.000756184 ) = -21.8128 -> -22 -> -222^-0
rawBias/Si
Sw 0.00206649 / ( 0.0670743 * 0.000756184 ) = 40.7427 -> 41 -> 412^-0
rawBias/Si
Sw 0.00600223 / ( 0.0670743 * 0.000756184 ) = 118.339 -> 118 -> 1182^-0
rawBias/Si
Sw 0.00580203 / ( 0.0670743 * 0.000756184 ) = 114.392 -> 114 -> 1142^-0
rawBias/Si
Sw 0.0095547 / ( 0.0670743 * 0.000756184 ) = 188.379 -> 188 -> 1882^-0
rawBias/Si
Sw -0.00347894 / ( 0.0670743 * 0.000756184 ) = -68.5904 -> -69 -> -692^-0
rawBias/Si
Sw -0.000947946 / ( 0.0670743 * 0.000756184 ) = -18.6896 -> -19 -> -192^-0
rawBias/Si
Sw -0.00558407 / ( 0.0670743 * 0.000756184 ) = -110.095 -> -110 -> -1102^-0
rawBias/Si
Sw 0.00457049 / ( 0.0670743 * 0.000756184 ) = 90.1113 -> 90 -> 902^-0
rawBias/Si
Sw -0.00695887 / ( 0.0670743 * 0.000756184 ) = -137.2 -> -137 -> -1372^-0
rawBias/Si
Sw 0.00425796 / ( 0.0670743 * 0.000756184 ) = 83.9493 -> 84 -> 842^-0
rawBias/Si
Sw 0.00692133 / ( 0.0670743 * 0.000756184 ) = 136.46 -> 136 -> 1362^-0
rawBias/Si
Sw 0.0110334 / ( 0.0670743 * 0.000756184 ) = 217.533 -> 218 -> 2182^-0
rawBias/Si
Sw 0.00080584 / ( 0.0670743 * 0.000756184 ) = 15.8878 -> 16 -> 162^-0
rawBias/Si
Sw 0.00967977 / ( 0.0670743 * 0.000756184 ) = 190.845 -> 191 -> 1912^-0
rawBias/Si
Sw -0.00152148 / ( 0.0670743 * 0.000756184 ) = -29.9974 -> -30 -> -302^-0
rawBias/Si
Sw 0.000685625 / ( 0.0670743 * 0.000756184 ) = 13.5177 -> 14 -> 142^-0
rawBias/Si
Sw 0.0111667 / ( 0.0670743 * 0.000756184 ) = 220.161 -> 220 -> 2202^-0
rawBias/Si
Sw -0.0122356 / ( 0.0670743 * 0.000756184 ) = -241.235 -> -241 -> -2412^-0
rawBias/Si
Sw -0.00163696 / ( 0.0670743 * 0.000756184 ) = -32.2742 -> -32 -> -322^-0
rawBias/Si
Sw 0.00874573 / ( 0.0670743 * 0.000756184 ) = 172.43 -> 172 -> 1722^-0
rawBias/Si
Sw -0.00747964 / ( 0.0670743 * 0.000756184 ) = -147.468 -> -147 -> -1472^-0
rawBias/Si
Sw 9.53092e-05 / ( 0.0670743 * 0.000756184 ) = 1.8791 -> 2 -> 22^-0
rawBias/Si
Sw 0.00125609 / ( 0.0670743 * 0.000756184 ) = 24.765 -> 25 -> 252^-0
rawBias/Si
Sw -0.00484556 / ( 0.0670743 * 0.000756184 ) = -95.5345 -> -96 -> -962^-0
rawBias/Si
Sw 0.00964425 / ( 0.0670743 * 0.000756184 ) = 190.145 -> 190 -> 1902^-0
rawBias/Si
Sw -0.0049331 / ( 0.0670743 * 0.000756184 ) = -97.2603 -> -97 -> -972^-0
rawBias/Si
Sw -0.00674903 / ( 0.0670743 * 0.000756184 ) = -133.063 -> -133 -> -1332^-0
rawBias/Si
Sw 0.00155267 / ( 0.0670743 * 0.000756184 ) = 30.6122 -> 31 -> 312^-0
rawBias/Si
Sw 0.00779544 / ( 0.0670743 * 0.000756184 ) = 153.694 -> 154 -> 1542^-0
rawBias/Si
Sw -0.00475451 / ( 0.0670743 * 0.000756184 ) = -93.7393 -> -94 -> -942^-0
rawBias/Si
Sw -0.00446209 / ( 0.0670743 * 0.000756184 ) = -87.974 -> -88 -> -882^-0
rawBias/Si
Sw 0.00994591 / ( 0.0670743 * 0.000756184 ) = 196.092 -> 196 -> 1962^-0
rawBias/Si
Sw -0.00069266 / ( 0.0670743 * 0.000756184 ) = -13.6564 -> -14 -> -142^-0
rawBias/Si
Sw 0.0028227 / ( 0.0670743 * 0.000756184 ) = 55.652 -> 56 -> 562^-0
rawBias/Si
Sw -0.00315651 / ( 0.0670743 * 0.000756184 ) = -62.2334 -> -62 -> -622^-0
rawBias/Si
Sw -0.00449883 / ( 0.0670743 * 0.000756184 ) = -88.6983 -> -89 -> -892^-0
rawBias/Si
Sw 0.0124552 / ( 0.0670743 * 0.000756184 ) = 245.565 -> 246 -> 2462^-0
rawBias/Si
Sw 0.00346543 / ( 0.0670743 * 0.000756184 ) = 68.324 -> 68 -> 682^-0
rawBias/Si
Sw -0.00341063 / ( 0.0670743 * 0.000756184 ) = -67.2436 -> -67 -> -672^-0
rawBias/Si
Sw -0.00535677 / ( 0.0670743 * 0.000756184 ) = -105.613 -> -106 -> -1062^-0
rawBias/Si
Sw -0.00169873 / ( 0.0670743 * 0.000756184 ) = -33.4919 -> -33 -> -332^-0
rawBias/Si
Sw -0.000445362 / ( 0.0670743 * 0.000756184 ) = -8.7807 -> -9 -> -92^-0
rawBias/Si
Sw -0.0113099 / ( 0.0670743 * 0.000756184 ) = -222.984 -> -223 -> -2232^-0
rawBias/Si
Sw -0.000209029 / ( 0.0670743 * 0.000756184 ) = -4.12119 -> -4 -> -42^-0
rawBias/Si
Sw -0.000552506 / ( 0.0670743 * 0.000756184 ) = -10.8931 -> -11 -> -112^-0
rawBias/Si
Sw 0.00152527 / ( 0.0670743 * 0.000756184 ) = 30.0721 -> 30 -> 302^-0
rawBias/Si
Sw 0.00539479 / ( 0.0670743 * 0.000756184 ) = 106.363 -> 106 -> 1062^-0
rawBias/Si
Sw 0.01207 / ( 0.0670743 * 0.000756184 ) = 237.971 -> 238 -> 2382^-0
rawBias/Si
Sw 0.00102972 / ( 0.0670743 * 0.000756184 ) = 20.3018 -> 20 -> 202^-0
rawBias/Si
Sw 0.00209548 / ( 0.0670743 * 0.000756184 ) = 41.3141 -> 41 -> 412^-0
rawBias/Si
Sw -0.000864239 / ( 0.0670743 * 0.000756184 ) = -17.0392 -> -17 -> -172^-0
rawBias/Si
Sw 0.0142878 / ( 0.0670743 * 0.000756184 ) = 281.696 -> 282 -> 2822^-0
rawBias/Si
Sw -0.00662105 / ( 0.0670743 * 0.000756184 ) = -130.54 -> -131 -> -1312^-0
rawBias/Si
Sw -0.000244511 / ( 0.0670743 * 0.000756184 ) = -4.82075 -> -5 -> -52^-0
rawBias/Si
Sw 0.0163849 / ( 0.0670743 * 0.000756184 ) = 323.043 -> 323 -> 3232^-0
rawBias/Si
Sw 0.0115843 / ( 0.0670743 * 0.000756184 ) = 228.395 -> 228 -> 2282^-0
rawBias/Si
Sw -0.000624234 / ( 0.0670743 * 0.000756184 ) = -12.3073 -> -12 -> -122^-0
rawBias/Si
Sw 0.00149396 / ( 0.0670743 * 0.000756184 ) = 29.4547 -> 29 -> 292^-0
rawBias/Si
Sw -0.00786256 / ( 0.0670743 * 0.000756184 ) = -155.017 -> -155 -> -1552^-0
rawBias/Si
Sw 0.00504049 / ( 0.0670743 * 0.000756184 ) = 99.3777 -> 99 -> 992^-0
rawBias/Si
Sw -0.00208822 / ( 0.0670743 * 0.000756184 ) = -41.1711 -> -41 -> -412^-0
rawBias/Si
Sw -0.0053671 / ( 0.0670743 * 0.000756184 ) = -105.817 -> -106 -> -1062^-0
rawBias/Si
Sw 0.00282316 / ( 0.0670743 * 0.000756184 ) = 55.661 -> 56 -> 562^-0
rawBias/Si
Sw 0.0123619 / ( 0.0670743 * 0.000756184 ) = 243.726 -> 244 -> 2442^-0
rawBias/Si
Sw 0.00466023 / ( 0.0670743 * 0.000756184 ) = 91.8804 -> 92 -> 922^-0
rawBias/Si
Sw -0.00132708 / ( 0.0670743 * 0.000756184 ) = -26.1645 -> -26 -> -262^-0
rawBias/Si
Sw -0.00287518 / ( 0.0670743 * 0.000756184 ) = -56.6867 -> -57 -> -572^-0
rawBias/Si
Sw 0.00934964 / ( 0.0670743 * 0.000756184 ) = 184.336 -> 184 -> 1842^-0
rawBias/Si
Sw 0.0016545 / ( 0.0670743 * 0.000756184 ) = 32.62 -> 33 -> 332^-0
rawBias/Si
Sw -0.00141152 / ( 0.0670743 * 0.000756184 ) = -27.8293 -> -28 -> -282^-0
rawBias/Si
Sw -0.00762542 / ( 0.0670743 * 0.000756184 ) = -150.342 -> -150 -> -1502^-0
rawBias/Si
Sw 0.00133011 / ( 0.0670743 * 0.000756184 ) = 26.2243 -> 26 -> 262^-0
rawBias/Si
Sw 0.00354479 / ( 0.0670743 * 0.000756184 ) = 69.8886 -> 70 -> 702^-0
rawBias/Si
Sw -0.000554709 / ( 0.0670743 * 0.000756184 ) = -10.9366 -> -11 -> -112^-0
rawBias/Si
Sw 0.00377187 / ( 0.0670743 * 0.000756184 ) = 74.3657 -> 74 -> 742^-0
rawBias/Si
Sw -0.00410209 / ( 0.0670743 * 0.000756184 ) = -80.8764 -> -81 -> -812^-0
rawBias/Si
Sw -0.0126422 / ( 0.0670743 * 0.000756184 ) = -249.252 -> -249 -> -2492^-0
rawBias/Si
Sw 0.00330497 / ( 0.0670743 * 0.000756184 ) = 65.1604 -> 65 -> 652^-0
rawBias/Si
Sw 0.00150753 / ( 0.0670743 * 0.000756184 ) = 29.7222 -> 30 -> 302^-0
rawBias/Si
Sw 0.00671193 / ( 0.0670743 * 0.000756184 ) = 132.331 -> 132 -> 1322^-0
rawBias/Si
Sw -0.000671467 / ( 0.0670743 * 0.000756184 ) = -13.2386 -> -13 -> -132^-0
rawBias/Si
Sw -0.0020892 / ( 0.0670743 * 0.000756184 ) = -41.1904 -> -41 -> -412^-0
rawBias/Si
Sw -0.00214926 / ( 0.0670743 * 0.000756184 ) = -42.3746 -> -42 -> -422^-0
rawBias/Si
Sw 0.0089044 / ( 0.0670743 * 0.000756184 ) = 175.558 -> 176 -> 1762^-0
rawBias/Si
Sw -0.00174577 / ( 0.0670743 * 0.000756184 ) = -34.4194 -> -34 -> -342^-0
rawBias/Si
Sw -0.00241748 / ( 0.0670743 * 0.000756184 ) = -47.6628 -> -48 -> -482^-0
rawBias/Si
Sw 0.00212694 / ( 0.0670743 * 0.000756184 ) = 41.9346 -> 42 -> 422^-0
rawBias/Si
Sw 0.00360922 / ( 0.0670743 * 0.000756184 ) = 71.159 -> 71 -> 712^-0
rawBias/Si
Sw 0.000103615 / ( 0.0670743 * 0.000756184 ) = 2.04286 -> 2 -> 22^-0
rawBias/Si
Sw -0.00459254 / ( 0.0670743 * 0.000756184 ) = -90.546 -> -91 -> -912^-0
rawBias/Si
Sw 0.0130479 / ( 0.0670743 * 0.000756184 ) = 257.251 -> 257 -> 2572^-0
rawBias/Si
Sw 0.00234114 / ( 0.0670743 * 0.000756184 ) = 46.1575 -> 46 -> 462^-0
rawBias/Si
Sw 0.00618787 / ( 0.0670743 * 0.000756184 ) = 121.999 -> 122 -> 1222^-0
rawBias/Si
Sw 0.00378565 / ( 0.0670743 * 0.000756184 ) = 74.6374 -> 75 -> 752^-0
rawBias/Si
Sw 0.00384775 / ( 0.0670743 * 0.000756184 ) = 75.8618 -> 76 -> 762^-0
rawBias/Si
Sw -0.00576199 / ( 0.0670743 * 0.000756184 ) = -113.603 -> -114 -> -1142^-0
rawBias/Si
Sw 0.00337918 / ( 0.0670743 * 0.000756184 ) = 66.6235 -> 67 -> 672^-0
rawBias/Si
Sw -0.00326975 / ( 0.0670743 * 0.000756184 ) = -64.466 -> -64 -> -642^-0
rawBias/Si
Sw -0.00731954 / ( 0.0670743 * 0.000756184 ) = -144.311 -> -144 -> -1442^-0
rawBias/Si
Sw -0.00448419 / ( 0.0670743 * 0.000756184 ) = -88.4096 -> -88 -> -882^-0
rawBias/Si
Sw -0.00779377 / ( 0.0670743 * 0.000756184 ) = -153.661 -> -154 -> -1542^-0
rawBias/Si
Sw -0.00449993 / ( 0.0670743 * 0.000756184 ) = -88.7201 -> -89 -> -892^-0
rawBias/Si
Sw -0.00854937 / ( 0.0670743 * 0.000756184 ) = -168.558 -> -169 -> -1692^-0
rawBias/Si
Sw 0.00736093 / ( 0.0670743 * 0.000756184 ) = 145.127 -> 145 -> 1452^-0
rawBias/Si
Sw 0.00287959 / ( 0.0670743 * 0.000756184 ) = 56.7737 -> 57 -> 572^-0
rawBias/Si
Sw 0.0059524 / ( 0.0670743 * 0.000756184 ) = 117.357 -> 117 -> 1172^-0
rawBias/Si
Sw 0.000887351 / ( 0.0670743 * 0.000756184 ) = 17.4949 -> 17 -> 172^-0
rawBias/Si
Sw -0.00268741 / ( 0.0670743 * 0.000756184 ) = -52.9846 -> -53 -> -532^-0
rawBias/Si
Sw 0.00199675 / ( 0.0670743 * 0.000756184 ) = 39.3677 -> 39 -> 392^-0
rawBias/Si
Sw -0.00819081 / ( 0.0670743 * 0.000756184 ) = -161.489 -> -161 -> -1612^-0
rawBias/Si
Sw -0.00205734 / ( 0.0670743 * 0.000756184 ) = -40.5623 -> -41 -> -412^-0
rawBias/Si
Sw 0.00159371 / ( 0.0670743 * 0.000756184 ) = 31.4215 -> 31 -> 312^-0
rawBias/Si
Sw -0.00959275 / ( 0.0670743 * 0.000756184 ) = -189.13 -> -189 -> -1892^-0
rawBias/Si
Sw -0.00528138 / ( 0.0670743 * 0.000756184 ) = -104.127 -> -104 -> -1042^-0
rawBias/Si
Sw 0.00541374 / ( 0.0670743 * 0.000756184 ) = 106.737 -> 107 -> 1072^-0
rawBias/Si
Sw -0.00262693 / ( 0.0670743 * 0.000756184 ) = -51.7922 -> -52 -> -522^-0
rawBias/Si
Sw -0.00522113 / ( 0.0670743 * 0.000756184 ) = -102.939 -> -103 -> -1032^-0
rawBias/Si
Sw 0.00781175 / ( 0.0670743 * 0.000756184 ) = 154.016 -> 154 -> 1542^-0
rawBias/Si
Sw 0.0028028 / ( 0.0670743 * 0.000756184 ) = 55.2598 -> 55 -> 552^-0
rawBias/Si
Sw 0.00183135 / ( 0.0670743 * 0.000756184 ) = 36.1068 -> 36 -> 362^-0
rawBias/Si
Sw 0.00101005 / ( 0.0670743 * 0.000756184 ) = 19.9141 -> 20 -> 202^-0
rawBias/Si
Sw 0.00178905 / ( 0.0670743 * 0.000756184 ) = 35.2726 -> 35 -> 352^-0
rawBias/Si
Sw 0.00425291 / ( 0.0670743 * 0.000756184 ) = 83.8499 -> 84 -> 842^-0
rawBias/Si
Sw 0.00325326 / ( 0.0670743 * 0.000756184 ) = 64.1408 -> 64 -> 642^-0
rawBias/Si
Sw -8.29429e-05 / ( 0.0670743 * 0.000756184 ) = -1.63529 -> -2 -> -22^-0
rawBias/Si
Sw -0.00223922 / ( 0.0670743 * 0.000756184 ) = -44.1482 -> -44 -> -442^-0
rawBias/Si
Sw -0.0016051 / ( 0.0670743 * 0.000756184 ) = -31.646 -> -32 -> -322^-0
rawBias/Si
Sw -0.000555084 / ( 0.0670743 * 0.000756184 ) = -10.944 -> -11 -> -112^-0
rawBias/Si
Sw -0.00882739 / ( 0.0670743 * 0.000756184 ) = -174.04 -> -174 -> -1742^-0
rawBias/Si
Sw 0.00275148 / ( 0.0670743 * 0.000756184 ) = 54.2479 -> 54 -> 542^-0
rawBias/Si
Sw -0.0073338 / ( 0.0670743 * 0.000756184 ) = -144.592 -> -145 -> -1452^-0
rawBias/Si
Sw -0.0130495 / ( 0.0670743 * 0.000756184 ) = -257.282 -> -257 -> -2572^-0
rawBias/Si
Sw -0.00503639 / ( 0.0670743 * 0.000756184 ) = -99.2968 -> -99 -> -992^-0
rawBias/Si
Sw -0.00335259 / ( 0.0670743 * 0.000756184 ) = -66.0993 -> -66 -> -662^-0
rawBias/Si
Sw -0.00762874 / ( 0.0670743 * 0.000756184 ) = -150.407 -> -150 -> -1502^-0
rawBias/Si
Sw -0.000773354 / ( 0.0670743 * 0.000756184 ) = -15.2474 -> -15 -> -152^-0
rawBias/Si
Sw -0.0105282 / ( 0.0670743 * 0.000756184 ) = -207.573 -> -208 -> -2082^-0
rawBias/Si
Sw -0.00604246 / ( 0.0670743 * 0.000756184 ) = -119.132 -> -119 -> -1192^-0
rawBias/Si
Sw -0.0148792 / ( 0.0670743 * 0.000756184 ) = -293.356 -> -293 -> -2932^-0
rawBias/Si
Sw 0.00509254 / ( 0.0670743 * 0.000756184 ) = 100.404 -> 100 -> 1002^-0
rawBias/Si
Sw -0.00343132 / ( 0.0670743 * 0.000756184 ) = -67.6514 -> -68 -> -682^-0
rawBias/Si
Sw 0.000884271 / ( 0.0670743 * 0.000756184 ) = 17.4342 -> 17 -> 172^-0
rawBias/Si
Sw -0.00659211 / ( 0.0670743 * 0.000756184 ) = -129.969 -> -130 -> -1302^-0
rawBias/Si
Sw -0.00150819 / ( 0.0670743 * 0.000756184 ) = -29.7353 -> -30 -> -302^-0
rawBias/Si
Sw -0.00066668 / ( 0.0670743 * 0.000756184 ) = -13.1442 -> -13 -> -132^-0
rawBias/Si
Sw -0.0123335 / ( 0.0670743 * 0.000756184 ) = -243.166 -> -243 -> -2432^-0
rawBias/Si
Sw 0.00527709 / ( 0.0670743 * 0.000756184 ) = 104.042 -> 104 -> 1042^-0
rawBias/Si
Sw 0.0030958 / ( 0.0670743 * 0.000756184 ) = 61.0364 -> 61 -> 612^-0
rawBias/Si
Sw -0.000243152 / ( 0.0670743 * 0.000756184 ) = -4.79396 -> -5 -> -52^-0
rawBias/Si
Sw 0.00163575 / ( 0.0670743 * 0.000756184 ) = 32.2503 -> 32 -> 322^-0
rawBias/Si
Sw 0.00305789 / ( 0.0670743 * 0.000756184 ) = 60.2889 -> 60 -> 602^-0
rawBias/Si
Sw -0.0032451 / ( 0.0670743 * 0.000756184 ) = -63.9799 -> -64 -> -642^-0
rawBias/Si
Sw 0.00757511 / ( 0.0670743 * 0.000756184 ) = 149.35 -> 149 -> 1492^-0
rawBias/Si
Sw -0.00441922 / ( 0.0670743 * 0.000756184 ) = -87.1289 -> -87 -> -872^-0
rawBias/Si
Sw -0.00156038 / ( 0.0670743 * 0.000756184 ) = -30.7643 -> -31 -> -312^-0
rawBias/Si
Sw -0.00333063 / ( 0.0670743 * 0.000756184 ) = -65.6662 -> -66 -> -662^-0
rawBias/Si
Sw -0.00142264 / ( 0.0670743 * 0.000756184 ) = -28.0485 -> -28 -> -282^-0
rawBias/Si
Sw -0.0023367 / ( 0.0670743 * 0.000756184 ) = -46.0701 -> -46 -> -462^-0
rawBias/Si
Sw -0.000629858 / ( 0.0670743 * 0.000756184 ) = -12.4182 -> -12 -> -122^-0
rawBias/Si
Sw -0.00742183 / ( 0.0670743 * 0.000756184 ) = -146.328 -> -146 -> -1462^-0
rawBias/Si
Sw -0.00525475 / ( 0.0670743 * 0.000756184 ) = -103.602 -> -104 -> -1042^-0
rawBias/Si
Sw 0.00106295 / ( 0.0670743 * 0.000756184 ) = 20.957 -> 21 -> 212^-0
rawBias/Si
Sw 0.00557893 / ( 0.0670743 * 0.000756184 ) = 109.994 -> 110 -> 1102^-0
rawBias/Si
Sw -0.00735296 / ( 0.0670743 * 0.000756184 ) = -144.97 -> -145 -> -1452^-0
rawBias/Si
Sw 0.0104845 / ( 0.0670743 * 0.000756184 ) = 206.71 -> 207 -> 2072^-0
rawBias/Si
Sw 0.00501339 / ( 0.0670743 * 0.000756184 ) = 98.8435 -> 99 -> 992^-0
rawBias/Si
Sw -8.00025e-05 / ( 0.0670743 * 0.000756184 ) = -1.57732 -> -2 -> -22^-0
rawBias/Si
Sw -0.00386855 / ( 0.0670743 * 0.000756184 ) = -76.2719 -> -76 -> -762^-0
rawBias/Si
Sw 0.0118741 / ( 0.0670743 * 0.000756184 ) = 234.108 -> 234 -> 2342^-0
rawBias/Si
Sw 0.00171545 / ( 0.0670743 * 0.000756184 ) = 33.8216 -> 34 -> 342^-0
rawBias/Si
Sw -0.00241817 / ( 0.0670743 * 0.000756184 ) = -47.6763 -> -48 -> -482^-0
rawBias/Si
Sw -0.00286085 / ( 0.0670743 * 0.000756184 ) = -56.4042 -> -56 -> -562^-0
rawBias/Si
Sw 0.0102796 / ( 0.0670743 * 0.000756184 ) = 202.671 -> 203 -> 2032^-0
rawBias/Si
Sw 0.00680886 / ( 0.0670743 * 0.000756184 ) = 134.243 -> 134 -> 1342^-0
rawBias/Si
Sw 0.00343526 / ( 0.0670743 * 0.000756184 ) = 67.7291 -> 68 -> 682^-0
rawBias/Si
Sw -0.00503297 / ( 0.0670743 * 0.000756184 ) = -99.2294 -> -99 -> -992^-0
rawBias/Si
Sw 0.00212876 / ( 0.0670743 * 0.000756184 ) = 41.9703 -> 42 -> 422^-0
rawBias/Si
Sw 0.00683794 / ( 0.0670743 * 0.000756184 ) = 134.816 -> 135 -> 1352^-0
rawBias/Si
Sw -0.003925 / ( 0.0670743 * 0.000756184 ) = -77.3847 -> -77 -> -772^-0
rawBias/Si
Sw 0.000772909 / ( 0.0670743 * 0.000756184 ) = 15.2386 -> 15 -> 152^-0
rawBias/Si
Sw 0.00182556 / ( 0.0670743 * 0.000756184 ) = 35.9926 -> 36 -> 362^-0
rawBias/Si
Sw -0.0014057 / ( 0.0670743 * 0.000756184 ) = -27.7146 -> -28 -> -282^-0
rawBias/Si
Sw 0.00537661 / ( 0.0670743 * 0.000756184 ) = 106.005 -> 106 -> 1062^-0
rawBias/Si
Sw -0.00120791 / ( 0.0670743 * 0.000756184 ) = -23.8149 -> -24 -> -242^-0
rawBias/Si
Sw 0.00606066 / ( 0.0670743 * 0.000756184 ) = 119.491 -> 119 -> 1192^-0
rawBias/Si
Sw -0.00678337 / ( 0.0670743 * 0.000756184 ) = -133.74 -> -134 -> -1342^-0
rawBias/Si
Sw -0.00536369 / ( 0.0670743 * 0.000756184 ) = -105.75 -> -106 -> -1062^-0
rawBias/Si
Sw 0.00152798 / ( 0.0670743 * 0.000756184 ) = 30.1254 -> 30 -> 302^-0
rawBias/Si
Sw 0.00113391 / ( 0.0670743 * 0.000756184 ) = 22.356 -> 22 -> 222^-0
rawBias/Si
Sw -0.00381443 / ( 0.0670743 * 0.000756184 ) = -75.2048 -> -75 -> -752^-0
rawBias/Si
Sw 0.00313435 / ( 0.0670743 * 0.000756184 ) = 61.7964 -> 62 -> 622^-0
rawBias/Si
Sw 0.00663192 / ( 0.0670743 * 0.000756184 ) = 130.754 -> 131 -> 1312^-0
rawBias/Si
Sw -0.00171548 / ( 0.0670743 * 0.000756184 ) = -33.8221 -> -34 -> -342^-0
rawBias/Si
Sw -0.00413377 / ( 0.0670743 * 0.000756184 ) = -81.5008 -> -82 -> -822^-0
rawBias/Si
Sw -0.00378595 / ( 0.0670743 * 0.000756184 ) = -74.6434 -> -75 -> -752^-0
rawBias/Si
Sw 0.00658897 / ( 0.0670743 * 0.000756184 ) = 129.907 -> 130 -> 1302^-0
rawBias/Si
Sw 0.0115695 / ( 0.0670743 * 0.000756184 ) = 228.103 -> 228 -> 2282^-0
rawBias/Si
Sw -0.00290412 / ( 0.0670743 * 0.000756184 ) = -57.2574 -> -57 -> -572^-0
rawBias/Si
Sw 0.00122328 / ( 0.0670743 * 0.000756184 ) = 24.1181 -> 24 -> 242^-0
rawBias/Si
Sw -0.00573779 / ( 0.0670743 * 0.000756184 ) = -113.126 -> -113 -> -1132^-0
rawBias/Si
Sw -0.00666367 / ( 0.0670743 * 0.000756184 ) = -131.38 -> -131 -> -1312^-0
rawBias/Si
Sw 0.00148788 / ( 0.0670743 * 0.000756184 ) = 29.3349 -> 29 -> 292^-0
rawBias/Si
Sw -9.36032e-05 / ( 0.0670743 * 0.000756184 ) = -1.84547 -> -2 -> -22^-0
rawBias/Si
Sw -0.00523711 / ( 0.0670743 * 0.000756184 ) = -103.254 -> -103 -> -1032^-0
rawBias/Si
Sw 0.00100092 / ( 0.0670743 * 0.000756184 ) = 19.7339 -> 20 -> 202^-0
rawBias/Si
Sw -0.00102205 / ( 0.0670743 * 0.000756184 ) = -20.1506 -> -20 -> -202^-0
rawBias/Si
Sw -0.00262461 / ( 0.0670743 * 0.000756184 ) = -51.7465 -> -52 -> -522^-0
rawBias/Si
Sw -0.00398723 / ( 0.0670743 * 0.000756184 ) = -78.6118 -> -79 -> -792^-0
rawBias/Si
Sw -0.0089795 / ( 0.0670743 * 0.000756184 ) = -177.039 -> -177 -> -1772^-0
rawBias/Si
Sw -0.00639421 / ( 0.0670743 * 0.000756184 ) = -126.068 -> -126 -> -1262^-0
rawBias/Si
Sw -0.00406703 / ( 0.0670743 * 0.000756184 ) = -80.1851 -> -80 -> -802^-0
rawBias/Si
Sw -0.00919813 / ( 0.0670743 * 0.000756184 ) = -181.349 -> -181 -> -1812^-0
rawBias/Si
Sw -0.00157012 / ( 0.0670743 * 0.000756184 ) = -30.9562 -> -31 -> -312^-0
rawBias/Si
Sw -0.00926236 / ( 0.0670743 * 0.000756184 ) = -182.616 -> -183 -> -1832^-0
rawBias/Si
Sw 0.000345377 / ( 0.0670743 * 0.000756184 ) = 6.80942 -> 7 -> 72^-0
rawBias/Si
Sw 0.0134298 / ( 0.0670743 * 0.000756184 ) = 264.78 -> 265 -> 2652^-0
rawBias/Si
Sw 0.00564769 / ( 0.0670743 * 0.000756184 ) = 111.349 -> 111 -> 1112^-0
rawBias/Si
Sw 0.0109816 / ( 0.0670743 * 0.000756184 ) = 216.512 -> 217 -> 2172^-0
rawBias/Si
Sw 0.00467368 / ( 0.0670743 * 0.000756184 ) = 92.1457 -> 92 -> 922^-0
rawBias/Si
Sw -0.00677339 / ( 0.0670743 * 0.000756184 ) = -133.543 -> -134 -> -1342^-0
rawBias/Si
Sw -0.00471123 / ( 0.0670743 * 0.000756184 ) = -92.886 -> -93 -> -932^-0
rawBias/Si
Sw 0.00475499 / ( 0.0670743 * 0.000756184 ) = 93.7488 -> 94 -> 942^-0
rawBias/Si
Sw -0.00101872 / ( 0.0670743 * 0.000756184 ) = -20.0849 -> -20 -> -202^-0
rawBias/Si
Sw 0.00660738 / ( 0.0670743 * 0.000756184 ) = 130.27 -> 130 -> 1302^-0
rawBias/Si
Sw 0.00300622 / ( 0.0670743 * 0.000756184 ) = 59.2703 -> 59 -> 592^-0
rawBias/Si
Sw -0.00118807 / ( 0.0670743 * 0.000756184 ) = -23.4239 -> -23 -> -232^-0
rawBias/Si
Sw 0.00205993 / ( 0.0670743 * 0.000756184 ) = 40.6133 -> 41 -> 412^-0
rawBias/Si
Sw -0.00326071 / ( 0.0670743 * 0.000756184 ) = -64.2877 -> -64 -> -642^-0
rawBias/Si
Sw 0.00502191 / ( 0.0670743 * 0.000756184 ) = 99.0113 -> 99 -> 992^-0
rawBias/Si
Sw -0.010096 / ( 0.0670743 * 0.000756184 ) = -199.051 -> -199 -> -1992^-0
rawBias/Si
Sw -0.0014445 / ( 0.0670743 * 0.000756184 ) = -28.4796 -> -28 -> -282^-0
rawBias/Si
Sw 0.00273264 / ( 0.0670743 * 0.000756184 ) = 53.8764 -> 54 -> 542^-0
rawBias/Si
Sw 0.00262441 / ( 0.0670743 * 0.000756184 ) = 51.7425 -> 52 -> 522^-0
rawBias/Si
Sw -0.000504633 / ( 0.0670743 * 0.000756184 ) = -9.94928 -> -10 -> -102^-0
rawBias/Si
Sw 0.0147181 / ( 0.0670743 * 0.000756184 ) = 290.18 -> 290 -> 2902^-0
rawBias/Si
Sw -0.00540654 / ( 0.0670743 * 0.000756184 ) = -106.595 -> -107 -> -1072^-0
rawBias/Si
Sw -0.00326369 / ( 0.0670743 * 0.000756184 ) = -64.3465 -> -64 -> -642^-0
rawBias/Si
Sw -0.0115329 / ( 0.0670743 * 0.000756184 ) = -227.381 -> -227 -> -2272^-0
rawBias/Si
Sw 0.002148 / ( 0.0670743 * 0.000756184 ) = 42.3497 -> 42 -> 422^-0
rawBias/Si
Sw 0.00194639 / ( 0.0670743 * 0.000756184 ) = 38.3747 -> 38 -> 382^-0
rawBias/Si
Sw -0.0101629 / ( 0.0670743 * 0.000756184 ) = -200.371 -> -200 -> -2002^-0
rawBias/Si
Sw -0.00186599 / ( 0.0670743 * 0.000756184 ) = -36.7896 -> -37 -> -372^-0
rawBias/Si
Sw 0.0116932 / ( 0.0670743 * 0.000756184 ) = 230.541 -> 231 -> 2312^-0
rawBias/Si
Sw -0.000642768 / ( 0.0670743 * 0.000756184 ) = -12.6727 -> -13 -> -132^-0
rawBias/Si
Sw -0.00464428 / ( 0.0670743 * 0.000756184 ) = -91.566 -> -92 -> -922^-0
rawBias/Si
Sw -0.00737856 / ( 0.0670743 * 0.000756184 ) = -145.475 -> -145 -> -1452^-0
rawBias/Si
Sw 0.0024578 / ( 0.0670743 * 0.000756184 ) = 48.4578 -> 48 -> 482^-0
rawBias/Si
Sw 0.00681272 / ( 0.0670743 * 0.000756184 ) = 134.319 -> 134 -> 1342^-0
rawBias/Si
Sw 0.00416895 / ( 0.0670743 * 0.000756184 ) = 82.1946 -> 82 -> 822^-0
rawBias/Si
Sw 0.00396947 / ( 0.0670743 * 0.000756184 ) = 78.2616 -> 78 -> 782^-0
rawBias/Si
Sw -0.0062771 / ( 0.0670743 * 0.000756184 ) = -123.759 -> -124 -> -1242^-0
rawBias/Si
Sw 0.000580598 / ( 0.0670743 * 0.000756184 ) = 11.447 -> 11 -> 112^-0
rawBias/Si
Sw -0.00101734 / ( 0.0670743 * 0.000756184 ) = -20.0577 -> -20 -> -202^-0
rawBias/Si
Sw -0.00331908 / ( 0.0670743 * 0.000756184 ) = -65.4387 -> -65 -> -652^-0
rawBias/Si
Sw -0.00147223 / ( 0.0670743 * 0.000756184 ) = -29.0263 -> -29 -> -292^-0
rawBias/Si
Sw 0.0143072 / ( 0.0670743 * 0.000756184 ) = 282.079 -> 282 -> 2822^-0
rawBias/Si
Sw 0.00945186 / ( 0.0670743 * 0.000756184 ) = 186.352 -> 186 -> 1862^-0
rawBias/Si
Sw 0.00212467 / ( 0.0670743 * 0.000756184 ) = 41.8898 -> 42 -> 422^-0
rawBias/Si
Sw 0.00702214 / ( 0.0670743 * 0.000756184 ) = 138.448 -> 138 -> 1382^-0
rawBias/Si
Sw 0.00546126 / ( 0.0670743 * 0.000756184 ) = 107.674 -> 108 -> 1082^-0
rawBias/Si
Sw 0.0012048 / ( 0.0670743 * 0.000756184 ) = 23.7537 -> 24 -> 242^-0
rawBias/Si
Sw 0.00901155 / ( 0.0670743 * 0.000756184 ) = 177.671 -> 178 -> 1782^-0
rawBias/Si
Sw -0.00802692 / ( 0.0670743 * 0.000756184 ) = -158.258 -> -158 -> -1582^-0
rawBias/Si
Sw 0.0110263 / ( 0.0670743 * 0.000756184 ) = 217.393 -> 217 -> 2172^-0
rawBias/Si
Sw -0.00302939 / ( 0.0670743 * 0.000756184 ) = -59.7271 -> -60 -> -602^-0
rawBias/Si
Sw -0.000107664 / ( 0.0670743 * 0.000756184 ) = -2.12269 -> -2 -> -22^-0
rawBias/Si
Sw 0.00674109 / ( 0.0670743 * 0.000756184 ) = 132.906 -> 133 -> 1332^-0
rawBias/Si
Sw -0.00715094 / ( 0.0670743 * 0.000756184 ) = -140.987 -> -141 -> -1412^-0
rawBias/Si
Sw -0.00416368 / ( 0.0670743 * 0.000756184 ) = -82.0907 -> -82 -> -822^-0
rawBias/Si
Sw -0.00109759 / ( 0.0670743 * 0.000756184 ) = -21.64 -> -22 -> -222^-0
rawBias/Si
Sw -0.00181006 / ( 0.0670743 * 0.000756184 ) = -35.6869 -> -36 -> -362^-0
rawBias/Si
Sw 0.00966231 / ( 0.0670743 * 0.000756184 ) = 190.501 -> 191 -> 1912^-0
rawBias/Si
Sw -0.00492121 / ( 0.0670743 * 0.000756184 ) = -97.0259 -> -97 -> -972^-0
rawBias/Si
Sw 0.00562308 / ( 0.0670743 * 0.000756184 ) = 110.864 -> 111 -> 1112^-0
rawBias/Si
Sw 0.00988432 / ( 0.0670743 * 0.000756184 ) = 194.878 -> 195 -> 1952^-0
rawBias/Si
Sw 0.00530393 / ( 0.0670743 * 0.000756184 ) = 104.572 -> 105 -> 1052^-0
rawBias/Si
Sw -0.000590022 / ( 0.0670743 * 0.000756184 ) = -11.6328 -> -12 -> -122^-0
rawBias/Si
Sw 0.00142536 / ( 0.0670743 * 0.000756184 ) = 28.1022 -> 28 -> 282^-0
rawBias/Si
Sw -0.00840012 / ( 0.0670743 * 0.000756184 ) = -165.616 -> -166 -> -1662^-0
rawBias/Si
Sw -0.00277301 / ( 0.0670743 * 0.000756184 ) = -54.6724 -> -55 -> -552^-0
rawBias/Si
Sw -0.00480831 / ( 0.0670743 * 0.000756184 ) = -94.8001 -> -95 -> -952^-0
rawBias/Si
Sw 0.00468205 / ( 0.0670743 * 0.000756184 ) = 92.3108 -> 92 -> 922^-0
rawBias/Si
Sw 0.00824822 / ( 0.0670743 * 0.000756184 ) = 162.621 -> 163 -> 1632^-0
rawBias/Si
Sw 0.0036084 / ( 0.0670743 * 0.000756184 ) = 71.1428 -> 71 -> 712^-0
rawBias/Si
Sw 0.00174475 / ( 0.0670743 * 0.000756184 ) = 34.3992 -> 34 -> 342^-0
rawBias/Si
Sw 0.00406262 / ( 0.0670743 * 0.000756184 ) = 80.0982 -> 80 -> 802^-0
rawBias/Si
Sw 0.00370747 / ( 0.0670743 * 0.000756184 ) = 73.0961 -> 73 -> 732^-0
rawBias/Si
Sw -0.00549466 / ( 0.0670743 * 0.000756184 ) = -108.332 -> -108 -> -1082^-0
rawBias/Si
Sw 0.00743413 / ( 0.0670743 * 0.000756184 ) = 146.57 -> 147 -> 1472^-0
rawBias/Si
Sw 0.0100927 / ( 0.0670743 * 0.000756184 ) = 198.987 -> 199 -> 1992^-0
rawBias/Si
Sw 0.0071719 / ( 0.0670743 * 0.000756184 ) = 141.4 -> 141 -> 1412^-0
rawBias/Si
Sw -0.00537823 / ( 0.0670743 * 0.000756184 ) = -106.037 -> -106 -> -1062^-0
rawBias/Si
Sw 0.00443552 / ( 0.0670743 * 0.000756184 ) = 87.4501 -> 87 -> 872^-0
rawBias/Si
Sw 0.00363082 / ( 0.0670743 * 0.000756184 ) = 71.5848 -> 72 -> 722^-0
rawBias/Si
Sw 0.00574112 / ( 0.0670743 * 0.000756184 ) = 113.191 -> 113 -> 1132^-0
rawBias/Si
Sw -0.0126982 / ( 0.0670743 * 0.000756184 ) = -250.355 -> -250 -> -2502^-0
rawBias/Si
Sw -0.00116731 / ( 0.0670743 * 0.000756184 ) = -23.0145 -> -23 -> -232^-0
rawBias/Si
Sw -0.00286545 / ( 0.0670743 * 0.000756184 ) = -56.4949 -> -56 -> -562^-0
rawBias/Si
Sw 0.0194753 / ( 0.0670743 * 0.000756184 ) = 383.972 -> 384 -> 3842^-0
rawBias/Si
Sw 0.0103246 / ( 0.0670743 * 0.000756184 ) = 203.559 -> 204 -> 2042^-0
rawBias/Si
Sw 0.00858375 / ( 0.0670743 * 0.000756184 ) = 169.236 -> 169 -> 1692^-0
rawBias/Si
Sw 0.000202983 / ( 0.0670743 * 0.000756184 ) = 4.00198 -> 4 -> 42^-0
rawBias/Si
Sw -0.00622996 / ( 0.0670743 * 0.000756184 ) = -122.829 -> -123 -> -1232^-0
rawBias/Si
Sw -0.00117932 / ( 0.0670743 * 0.000756184 ) = -23.2514 -> -23 -> -232^-0
rawBias/Si
Sw -0.0201595 / ( 0.0670743 * 0.000756184 ) = -397.463 -> -397 -> -3972^-0
rawBias/Si
Sw 0.0038894 / ( 0.0670743 * 0.000756184 ) = 76.683 -> 77 -> 772^-0
rawBias/Si
Sw 0.0100873 / ( 0.0670743 * 0.000756184 ) = 198.88 -> 199 -> 1992^-0
rawBias/Si
Sw 0.00150142 / ( 0.0670743 * 0.000756184 ) = 29.6019 -> 30 -> 302^-0
rawBias/Si
Sw -0.0285562 / ( 0.0561515 * 0.00157656 ) = -322.574 -> -323 -> -3232^-0
rawBias/Si
Sw 0.102577 / ( 0.0561515 * 0.00157656 ) = 1158.73 -> 1159 -> 11592^-0
rawBias/Si
Sw -0.0604402 / ( 0.0561515 * 0.00157656 ) = -682.74 -> -683 -> -6832^-0
rawBias/Si
Sw -0.0525417 / ( 0.0561515 * 0.00157656 ) = -593.517 -> -594 -> -5942^-0
rawBias/Si
Sw -0.00433865 / ( 0.0561515 * 0.00157656 ) = -49.0098 -> -49 -> -492^-0
rawBias/Si
Sw 0.028257 / ( 0.0561515 * 0.00157656 ) = 319.194 -> 319 -> 3192^-0
rawBias/Si
Sw -0.0173933 / ( 0.0561515 * 0.00157656 ) = -196.477 -> -196 -> -1962^-0
rawBias/Si
Sw 0.0503867 / ( 0.0561515 * 0.00157656 ) = 569.174 -> 569 -> 5692^-0
rawBias/Si
Sw 0.0181165 / ( 0.0561515 * 0.00157656 ) = 204.646 -> 205 -> 2052^-0
rawBias/Si
Sw -0.0360676 / ( 0.0561515 * 0.00157656 ) = -407.424 -> -407 -> -407*2^-0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
bias-0 Si * Sw / So = 0.00784518 * 0.00345303 / 0.0137246 = 16557
2^-23
bias-1 Si * Sw / So = 0.0137246 * 0.00122683 / 0.0670743 = 16846* 2^-26
bias-2 Si * Sw / So = 0.0670743 * 0.000756184 / 0.0561515 = 30309* 2^-25
bias-3 Si * Sw / So = 0.0561515 * 0.00157656 / 0.141637 = 20972* 2^-25
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
translating weights for n-1 bias-dims kcrs = 1,20,1,1 and size= 20
translating weights for n-3 kernel-dims kcrs = 50,20,5,5 and size= 25000
translating weights for n-4 bias-dims kcrs = 1,50,1,1 and size= 50
translating weights for n-6 kernel-dims kcrs = 500,50,4,4 and size= 400000
translating weights for n-7 bias-dims kcrs = 1,500,1,1 and size= 500
translating weights for n-10 kernel-dims kcrs = 10,500,1,1 and size= 5000
translating weights for n-11 bias-dims kcrs = 1,10,1,1 and size= 10
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
(dc-conv-0) CONV_DIRECT
total b4d needed 1
total b4w needed 1
min b4w needed 1
reserved WMB bank 0
(dc-conv-0) FI + FW mode. Nothing to split
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
(pdp-0) maxflywidth 128 out width 12
(pdp-0) maxFlyingWidth >= output_width. No need to do hw/sw PDP splits
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
(dc-conv-1) CONV_DIRECT
total b4d needed 1
total b4w needed 1
min b4w needed 1
reserved WMB bank 0
(dc-conv-1) FI + FW mode. Nothing to split
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
(pdp-1) maxflywidth 128 out width 4
(pdp-1) maxFlyingWidth >= output_width. No need to do hw/sw PDP splits
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
(fc-0) CONV_DIRECT
total b4d needed 1
total b4w needed 13
min b4w needed 1
reserved WMB bank 0
(fc-0) FI + FW mode. Nothing to split
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
(fc-1) CONV_DIRECT
total b4d needed 1
total b4w needed 1
min b4w needed 1
reserved WMB bank 0
(fc-1) FI + FW mode. Nothing to split
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
printGraph: pree fuseSDPSubEngineOps
n-0:dc-conv-0/conv1: (in)e-0[1x4x28x28][tsd-1][tt-1], (Aux)e-9[20x20x5x1][tsd-0][tt-4], (out)e-11[1x20x24x24][tsd-10][tt-8],
n-1:bias-0/conv1: (Aux)e-10[1x20x1x1][tsd-2][tt-5], (in)e-11[1x20x24x24][tsd-10][tt-8], (out)e-1[1x20x24x24][tsd-11][tt-3],
n-2:pdp-0/pool1: (in)e-1[1x20x24x24][tsd-11][tt-3], (out)e-2[1x20x12x12][tsd-12][tt-3],
n-3:dc-conv-1/conv2: (in)e-2[1x20x12x12][tsd-12][tt-3], (Aux)e-12[50x20x5x5][tsd-3][tt-4], (out)e-14[1x50x8x8][tsd-13][tt-8],
n-4:bias-1/conv2: (Aux)e-13[1x50x1x1][tsd-4][tt-5], (in)e-14[1x50x8x8][tsd-13][tt-8], (out)e-3[1x50x8x8][tsd-14][tt-3],
n-5:pdp-1/pool2: (in)e-3[1x50x8x8][tsd-14][tt-3], (out)e-4[1x50x4x4][tsd-15][tt-3],
n-6:fc-0/ip1: (in)e-4[1x50x4x4][tsd-15][tt-3], (Aux)e-15[500x50x4x4][tsd-5][tt-4], (out)e-17[1x500x1x1][tsd-16][tt-8],
n-7:bias-2/ip1: (Aux)e-16[1x500x1x1][tsd-6][tt-5], (in)e-17[1x500x1x1][tsd-16][tt-8], (out)e-6[1x500x1x1][tsd-19][tt-3],
n-10:fc-1/ip2: (in)e-6[1x500x1x1][tsd-19][tt-3], (Aux)e-20[10x500x1x1][tsd-8][tt-4], (out)e-22[1x10x1x1][tsd-20][tt-8],
n-11:bias-3/ip2: (Aux)e-21[1x10x1x1][tsd-9][tt-5], (in)e-22[1x10x1x1][tsd-20][tt-8], (out)e-7[1x10x1x1][tsd-21][tt-3],
n-12:cpu-sm-0/prob: (in)e-7[1x10x1x1][tsd-21][tt-3], (out)e-8[1x10x1x1][tsd-22][tt-2],
e-9 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-0 edge already has set surface format NVDLA_IMG_A8B8G8R8
e-10 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-12 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-13 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-15 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-16 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-20 edge already has set surface format NVDLA_WEIGHT_DC_INT8
e-21 edge already has set surface format NVDLA_BIAS_DATA_INT16
e-11 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-1 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-2 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-14 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-3 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-4 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-17 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-6 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-22 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-7 edge already has set surface format NVDLA_FEATURE_DATA_INT8
e-8 edge already has set surface format NVDLA_FEATURE_DATA_INT8
tb-0 for tsd-0 for e-9 with NVDLA_WEIGHT_DC_INT8
tb-1 for tsd-1 for e-0 with NVDLA_IMG_A8B8G8R8
tb-2 for tsd-2 for e-10 with NVDLA_BIAS_DATA_INT16
tb-3 for tsd-3 for e-12 with NVDLA_WEIGHT_DC_INT8
tb-4 for tsd-4 for e-13 with NVDLA_BIAS_DATA_INT16
tb-5 for tsd-5 for e-15 with NVDLA_WEIGHT_DC_INT8
tb-6 for tsd-6 for e-16 with NVDLA_BIAS_DATA_INT16
tb-8 for tsd-8 for e-20 with NVDLA_WEIGHT_DC_INT8
tb-9 for tsd-9 for e-21 with NVDLA_BIAS_DATA_INT16
tb-10 for tsd-10 for e-11 with NVDLA_FEATURE_DATA_INT8
tb-11 for tsd-11 for e-1 with NVDLA_FEATURE_DATA_INT8
tb-12 for tsd-12 for e-2 with NVDLA_FEATURE_DATA_INT8
tb-13 for tsd-13 for e-14 with NVDLA_FEATURE_DATA_INT8
tb-14 for tsd-14 for e-3 with NVDLA_FEATURE_DATA_INT8
tb-15 for tsd-15 for e-4 with NVDLA_FEATURE_DATA_INT8
tb-16 for tsd-16 for e-17 with NVDLA_FEATURE_DATA_INT8
tb-19 for tsd-19 for e-6 with NVDLA_FEATURE_DATA_INT8
tb-20 for tsd-20 for e-22 with NVDLA_FEATURE_DATA_INT8
tb-21 for tsd-21 for e-7 with NVDLA_FEATURE_DATA_INT8
tb-22 for tsd-22 for e-8 with NVDLA_FEATURE_DATA_INT8
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
printGraph: post fuseSDPSubEngineOps
n-0:dc-conv-0/conv1: (in)e-0[1x4x28x28][tsd-1][tt-1], (Aux)e-9[20x20x5x1][tsd-0][tt-4], (out)e-11[1x20x24x24][tsd-10][tt-8],
n-1:bias-0/conv1: (Aux)e-10[1x20x1x1][tsd-2][tt-5], (in)e-11[1x20x24x24][tsd-10][tt-8], (out)e-1[1x20x24x24][tsd-11][tt-3],
n-2:pdp-0/pool1: (in)e-1[1x20x24x24][tsd-11][tt-3], (out)e-2[1x20x12x12][tsd-12][tt-3],
n-3:dc-conv-1/conv2: (in)e-2[1x20x12x12][tsd-12][tt-3], (Aux)e-12[50x20x5x5][tsd-3][tt-4], (out)e-14[1x50x8x8][tsd-13][tt-8],
n-4:bias-1/conv2: (Aux)e-13[1x50x1x1][tsd-4][tt-5], (in)e-14[1x50x8x8][tsd-13][tt-8], (out)e-3[1x50x8x8][tsd-14][tt-3],
n-5:pdp-1/pool2: (in)e-3[1x50x8x8][tsd-14][tt-3], (out)e-4[1x50x4x4][tsd-15][tt-3],
n-6:fc-0/ip1: (in)e-4[1x50x4x4][tsd-15][tt-3], (Aux)e-15[500x50x4x4][tsd-5][tt-4], (out)e-17[1x500x1x1][tsd-16][tt-8],
n-7:bias-2/ip1: (Aux)e-16[1x500x1x1][tsd-6][tt-5], (in)e-17[1x500x1x1][tsd-16][tt-8], (out)e-6[1x500x1x1][tsd-19][tt-3],
n-10:fc-1/ip2: (in)e-6[1x500x1x1][tsd-19][tt-3], (Aux)e-20[10x500x1x1][tsd-8][tt-4], (out)e-22[1x10x1x1][tsd-20][tt-8],
n-11:bias-3/ip2: (Aux)e-21[1x10x1x1][tsd-9][tt-5], (in)e-22[1x10x1x1][tsd-20][tt-8], (out)e-7[1x10x1x1][tsd-21][tt-3],
n-12:cpu-sm-0/prob: (in)e-7[1x10x1x1][tsd-21][tt-3], (out)e-8[1x10x1x1][tsd-22][tt-2],
::Edge edge=e-0 bindable=1
::Edge edge=e-8 bindable=1
::Edge edge=e-9 bindable=0
::Surface surface=tsd-0: bindable=0
::Edge edge=e-0 bindable=1
::Edge edge=e-10 bindable=0
::Surface surface=tsd-2: bindable=0
::Edge edge=e-12 bindable=0
::Surface surface=tsd-3: bindable=0
::Edge edge=e-13 bindable=0
::Surface surface=tsd-4: bindable=0
::Edge edge=e-15 bindable=0
::Surface surface=tsd-5: bindable=0
::Edge edge=e-16 bindable=0
::Surface surface=tsd-6: bindable=0
::Edge edge=e-20 bindable=0
::Surface surface=tsd-8: bindable=0
::Edge edge=e-21 bindable=0
::Surface surface=tsd-9: bindable=0
::Edge edge=e-11 bindable=0
::Surface surface=tsd-10: bindable=0
::Edge edge=e-1 bindable=0
::Surface surface=tsd-11: bindable=0
::Edge edge=e-2 bindable=0
::Surface surface=tsd-12: bindable=0
::Edge edge=e-14 bindable=0
::Surface surface=tsd-13: bindable=0
::Edge edge=e-3 bindable=0
::Surface surface=tsd-14: bindable=0
::Edge edge=e-4 bindable=0
::Surface surface=tsd-15: bindable=0
::Edge edge=e-17 bindable=0
::Surface surface=tsd-16: bindable=0
::Edge edge=e-6 bindable=0
::Surface surface=tsd-19: bindable=0
::Edge edge=e-22 bindable=0
::Surface surface=tsd-20: bindable=0
::Edge edge=e-7 bindable=0
::Surface surface=tsd-21: bindable=0
::Edge edge=e-8 bindable=1
annid=0 node=dc-conv-0.B0 deps=1
producer: [, , , , , , , , , , ]
consumer: [, dc-conv-1(annId:3):OP_PROGRAMMED, , bias-0(annId:1):OP_PROGRAMMED, , , , , , , ]
annid=1 node=bias-0.B0 deps=1
producer: [, dc-conv-0(annId:0):OP_PROGRAMMED, , , , , , , , , ]
consumer: [, , , bias-1(annId:4):OP_PROGRAMMED, pdp-0(annId:2):OP_COMPLETED, , , , , , ]
annid=2 node=pdp-0.B0 deps=1
producer: [, , , bias-0(annId:1):OP_COMPLETED, , , , , , , ]
consumer: [, dc-conv-1(annId:3):OP_COMPLETED, , , pdp-1(annId:5):OP_PROGRAMMED, , , , , , ]
annid=3 node=dc-conv-1.B0 deps=3
producer: [, dc-conv-0(annId:0):OP_PROGRAMMED, , , pdp-0(annId:2):OP_COMPLETED, , , , , , ]
consumer: [, fc-0(annId:6):OP_PROGRAMMED, , bias-1(annId:4):OP_PROGRAMMED, , , , , , , ]
annid=4 node=bias-1.B0 deps=2
producer: [, dc-conv-1(annId:3):OP_PROGRAMMED, , bias-0(annId:1):OP_PROGRAMMED, , , , , , , ]
consumer: [, , , bias-2(annId:7):OP_PROGRAMMED, pdp-1(annId:5):OP_COMPLETED, , , , , , ]
annid=5 node=pdp-1.B0 deps=2
producer: [, , , bias-1(annId:4):OP_COMPLETED, pdp-0(annId:2):OP_PROGRAMMED, , , , , , ]
consumer: [, fc-0(annId:6):OP_COMPLETED, , , , , , , , , ]
annid=6 node=fc-0.B0 deps=3
producer: [, dc-conv-1(annId:3):OP_PROGRAMMED, , , pdp-1(annId:5):OP_COMPLETED, , , , , , ]
consumer: [, fc-1(annId:8):OP_PROGRAMMED, , bias-2(annId:7):OP_PROGRAMMED, , , , , , , ]
annid=7 node=bias-2.B0 deps=2
producer: [, fc-0(annId:6):OP_PROGRAMMED, , bias-1(annId:4):OP_PROGRAMMED, , , , , , , ]
consumer: [, fc-1(annId:8):OP_COMPLETED, , bias-3(annId:9):OP_PROGRAMMED, , , , , , , ]
annid=8 node=fc-1.B0 deps=3
producer: [, fc-0(annId:6):OP_PROGRAMMED, , bias-2(annId:7):OP_COMPLETED, , , , , , , ]
consumer: [, , , bias-3(annId:9):OP_PROGRAMMED, , , , , , , ]
annid=9 node=bias-3.B0 deps=2
producer: [, fc-1(annId:8):OP_PROGRAMMED, , bias-2(annId:7):OP_PROGRAMMED, , , , , , , ]
consumer: [cpu-sm-0(annId:0):OP_COMPLETED, , , , , , , , , , ]
annid=0 node=cpu-sm-0.B0 deps=1
producer: [, , , bias-3(annId:9):OP_COMPLETED, , , , , , , ]
consumer: [, , , , , , , , , , ]
beginning resolveMemory phase
begin memory resolver pooling=1 reuse=1 greedy_eviction=1
local cvsram size=0 local sdram size=1073741824 global sdram size=536870912
node=dc-conv-0 anno=0.0 in=[tsd-1] aux=[tsd-0] io=[] out=[tsd-10]
tsd=tsd-10 (stream tensor)
::Surface surface=tsd-1: bindable=1
::Buffer buffer=tb-1 surface=tsd-1 bindable=1
::Surface surface=tsd-1: bindable=1
resolve placement/alloc for tsd=tsd-1/tb-1 aux=0 pooling=0
placed tsd-1/tb-1 batch-0 inside DRAM @0
::Surface surface=tsd-0: bindable=0
::Buffer buffer=tb-0 surface=tsd-0 bindable=0
::Surface surface=tsd-0: bindable=0
resolve placement/alloc for tsd=tsd-0/tb-0 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-0 @0 +2048 loc=1
[MEMTOOL] t = 0 dc-conv-0’s AUX tb-0-B0 ALLOC
placed tb-0 batch-0 inside GLOBAL_DRAM_POOL@0
tsd=tsd-0 set content pooled=1
done node=dc-conv-0 rc=0
node=bias-0 anno=0.1 in=[tsd-10] aux=[tsd-2] io=[] out=[tsd-11]
tsd=tsd-10 (stream tensor)
::Surface surface=tsd-2: bindable=0
::Buffer buffer=tb-2 surface=tsd-2 bindable=0
::Surface surface=tsd-2: bindable=0
resolve placement/alloc for tsd=tsd-2/tb-2 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-2 @4096 +40 loc=1
[MEMTOOL] t = 1 bias-0’s AUX tb-2-B0 ALLOC
placed tb-2 batch-0 inside GLOBAL_DRAM_POOL@4096
tsd=tsd-2 set content pooled=1
::Surface surface=tsd-11: bindable=0
::Buffer buffer=tb-11 surface=tsd-11 bindable=0
::Surface surface=tsd-11: bindable=0
resolve placement/alloc for tsd=tsd-11/tb-11 aux=0 pooling=1
LOCAL_DRAM_POOL alloc tb-11 @0 +18432 loc=1
[MEMTOOL] t = 2 bias-0’s OUTPUT tb-11-B0 ALLOC
placed tb-11 batch-0 inside LOCAL_DRAM_POOL@0
done node=bias-0 rc=0
node=pdp-0 anno=0.2 in=[tsd-11] aux=[] io=[] out=[tsd-12]
::Surface surface=tsd-12: bindable=0
::Buffer buffer=tb-12 surface=tsd-12 bindable=0
::Surface surface=tsd-12: bindable=0
resolve placement/alloc for tsd=tsd-12/tb-12 aux=0 pooling=1
LOCAL_DRAM_POOL alloc tb-12 @32768 +4608 loc=1
[MEMTOOL] t = 3 pdp-0’s OUTPUT tb-12-B0 ALLOC
placed tb-12 batch-0 inside LOCAL_DRAM_POOL@32768
done node=pdp-0 rc=0
node=dc-conv-1 anno=0.3 in=[tsd-12] aux=[tsd-3] io=[] out=[tsd-13]
tsd=tsd-13 (stream tensor)
deallocating tsd-11/tb-11@0x7fd126829010
LOCAL_DRAM_POOL dealloc tb-11 @ 0 @ 140536270983184
[MEMTOOL] t = 4 tb-11-B0 DEALLOC
::Surface surface=tsd-3: bindable=0
::Buffer buffer=tb-3 surface=tsd-3 bindable=0
::Surface surface=tsd-3: bindable=0
resolve placement/alloc for tsd=tsd-3/tb-3 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-3 @32768 +25088 loc=1
[MEMTOOL] t = 5 dc-conv-1’s AUX tb-3-B0 ALLOC
placed tb-3 batch-0 inside GLOBAL_DRAM_POOL@32768
tsd=tsd-3 set content pooled=1
done node=dc-conv-1 rc=0
node=bias-1 anno=0.4 in=[tsd-13] aux=[tsd-4] io=[] out=[tsd-14]
tsd=tsd-13 (stream tensor)
::Surface surface=tsd-4: bindable=0
::Buffer buffer=tb-4 surface=tsd-4 bindable=0
::Surface surface=tsd-4: bindable=0
resolve placement/alloc for tsd=tsd-4/tb-4 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-4 @8192 +100 loc=1
[MEMTOOL] t = 6 bias-1’s AUX tb-4-B0 ALLOC
placed tb-4 batch-0 inside GLOBAL_DRAM_POOL@8192
tsd=tsd-4 set content pooled=1
::Surface surface=tsd-14: bindable=0
::Buffer buffer=tb-14 surface=tsd-14 bindable=0
::Surface surface=tsd-14: bindable=0
resolve placement/alloc for tsd=tsd-14/tb-14 aux=0 pooling=1
LOCAL_DRAM_POOL alloc tb-14 @40960 +4096 loc=1
[MEMTOOL] t = 7 bias-1’s OUTPUT tb-14-B0 ALLOC
placed tb-14 batch-0 inside LOCAL_DRAM_POOL@40960
done node=bias-1 rc=0
node=pdp-1 anno=0.5 in=[tsd-14] aux=[] io=[] out=[tsd-15]
deallocating tsd-12/tb-12@0x7fd126831010
LOCAL_DRAM_POOL dealloc tb-12 @ 32768 @ 140536271015952
[MEMTOOL] t = 8 tb-12-B0 DEALLOC
::Surface surface=tsd-15: bindable=0
::Buffer buffer=tb-15 surface=tsd-15 bindable=0
::Surface surface=tsd-15: bindable=0
resolve placement/alloc for tsd=tsd-15/tb-15 aux=0 pooling=1
LOCAL_DRAM_POOL alloc tb-15 @45056 +1024 loc=1
[MEMTOOL] t = 9 pdp-1’s OUTPUT tb-15-B0 ALLOC
placed tb-15 batch-0 inside LOCAL_DRAM_POOL@45056
done node=pdp-1 rc=0
node=fc-0 anno=0.6 in=[tsd-15] aux=[tsd-5] io=[] out=[tsd-16]
tsd=tsd-16 (stream tensor)
deallocating tsd-14/tb-14@0x7fd126833010
LOCAL_DRAM_POOL dealloc tb-14 @ 40960 @ 140536271024144
[MEMTOOL] t = 10 tb-14-B0 DEALLOC
::Surface surface=tsd-5: bindable=0
::Buffer buffer=tb-5 surface=tsd-5 bindable=0
::Surface surface=tsd-5: bindable=0
resolve placement/alloc for tsd=tsd-5/tb-5 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-5 @524288 +400000 loc=1
[MEMTOOL] t = 11 fc-0’s AUX tb-5-B0 ALLOC
placed tb-5 batch-0 inside GLOBAL_DRAM_POOL@524288
tsd=tsd-5 set content pooled=1
done node=fc-0 rc=0
node=bias-2 anno=0.7 in=[tsd-16] aux=[tsd-6] io=[] out=[tsd-19]
tsd=tsd-16 (stream tensor)
::Surface surface=tsd-6: bindable=0
::Buffer buffer=tb-6 surface=tsd-6 bindable=0
::Surface surface=tsd-6: bindable=0
resolve placement/alloc for tsd=tsd-6/tb-6 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-6 @12288 +1000 loc=1
[MEMTOOL] t = 12 bias-2’s AUX tb-6-B0 ALLOC
placed tb-6 batch-0 inside GLOBAL_DRAM_POOL@12288
tsd=tsd-6 set content pooled=1
::Surface surface=tsd-19: bindable=0
::Buffer buffer=tb-19 surface=tsd-19 bindable=0
::Surface surface=tsd-19: bindable=0
resolve placement/alloc for tsd=tsd-19/tb-19 aux=0 pooling=1
LOCAL_DRAM_POOL alloc tb-19 @40960 +512 loc=1
[MEMTOOL] t = 13 bias-2’s OUTPUT tb-19-B0 ALLOC
placed tb-19 batch-0 inside LOCAL_DRAM_POOL@40960
done node=bias-2 rc=0
node=fc-1 anno=0.8 in=[tsd-19] aux=[tsd-8] io=[] out=[tsd-20]
tsd=tsd-20 (stream tensor)
deallocating tsd-15/tb-15@0x7fd126834010
LOCAL_DRAM_POOL dealloc tb-15 @ 45056 @ 140536271028240
[MEMTOOL] t = 14 tb-15-B0 DEALLOC
::Surface surface=tsd-8: bindable=0
::Buffer buffer=tb-8 surface=tsd-8 bindable=0
::Surface surface=tsd-8: bindable=0
resolve placement/alloc for tsd=tsd-8/tb-8 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-8 @16384 +5120 loc=1
[MEMTOOL] t = 15 fc-1’s AUX tb-8-B0 ALLOC
placed tb-8 batch-0 inside GLOBAL_DRAM_POOL@16384
tsd=tsd-8 set content pooled=1
done node=fc-1 rc=0
node=bias-3 anno=0.9 in=[tsd-20] aux=[tsd-9] io=[] out=[tsd-21]
tsd=tsd-20 (stream tensor)
::Surface surface=tsd-9: bindable=0
::Buffer buffer=tb-9 surface=tsd-9 bindable=0
::Surface surface=tsd-9: bindable=0
resolve placement/alloc for tsd=tsd-9/tb-9 aux=1 pooling=1
GLOBAL_DRAM_POOL alloc tb-9 @24576 +20 loc=1
[MEMTOOL] t = 16 bias-3’s AUX tb-9-B0 ALLOC
placed tb-9 batch-0 inside GLOBAL_DRAM_POOL@24576
tsd=tsd-9 set content pooled=1
::Surface surface=tsd-21: bindable=0
::Buffer buffer=tb-21 surface=tsd-21 bindable=0
::Surface surface=tsd-21: bindable=0
resolve placement/alloc for tsd=tsd-21/tb-21 aux=0 pooling=1
LOCAL_DRAM_POOL alloc tb-21 @45056 +32 loc=1
[MEMTOOL] t = 17 bias-3’s OUTPUT tb-21-B0 ALLOC
placed tb-21 batch-0 inside LOCAL_DRAM_POOL@45056
done node=bias-3 rc=0
node=cpu-sm-0 anno=1.0 in=[tsd-21] aux=[] io=[] out=[tsd-22]
::Surface surface=tsd-22: bindable=1
::Buffer buffer=tb-22 surface=tsd-22 bindable=1
::Surface surface=tsd-22: bindable=1
resolve placement/alloc for tsd=tsd-22/tb-22 aux=0 pooling=0
placed tsd-22/tb-22 batch-0 inside DRAM @0
done node=cpu-sm-0 rc=0
end memory resolver
printGraph: Final
n-0:dc-conv-0/conv1: (in)e-0[1x4x28x28][tsd-1][tt-1], (Aux)e-9[20x20x5x1][tsd-0][tt-4], (out)e-11[1x20x24x24][tsd-10][tt-8],
n-1:bias-0/conv1: (Aux)e-10[1x20x1x1][tsd-2][tt-5], (in)e-11[1x20x24x24][tsd-10][tt-8], (out)e-1[1x20x24x24][tsd-11][tt-3],
n-2:pdp-0/pool1: (in)e-1[1x20x24x24][tsd-11][tt-3], (out)e-2[1x20x12x12][tsd-12][tt-3],
n-3:dc-conv-1/conv2: (in)e-2[1x20x12x12][tsd-12][tt-3], (Aux)e-12[50x20x5x5][tsd-3][tt-4], (out)e-14[1x50x8x8][tsd-13][tt-8],
n-4:bias-1/conv2: (Aux)e-13[1x50x1x1][tsd-4][tt-5], (in)e-14[1x50x8x8][tsd-13][tt-8], (out)e-3[1x50x8x8][tsd-14][tt-3],
n-5:pdp-1/pool2: (in)e-3[1x50x8x8][tsd-14][tt-3], (out)e-4[1x50x4x4][tsd-15][tt-3],
n-6:fc-0/ip1: (in)e-4[1x50x4x4][tsd-15][tt-3], (Aux)e-15[500x50x4x4][tsd-5][tt-4], (out)e-17[1x500x1x1][tsd-16][tt-8],
n-7:bias-2/ip1: (Aux)e-16[1x500x1x1][tsd-6][tt-5], (in)e-17[1x500x1x1][tsd-16][tt-8], (out)e-6[1x500x1x1][tsd-19][tt-3],
n-10:fc-1/ip2: (in)e-6[1x500x1x1][tsd-19][tt-3], (Aux)e-20[10x500x1x1][tsd-8][tt-4], (out)e-22[1x10x1x1][tsd-20][tt-8],
n-11:bias-3/ip2: (Aux)e-21[1x10x1x1][tsd-9][tt-5], (in)e-22[1x10x1x1][tsd-20][tt-8], (out)e-7[1x10x1x1][tsd-21][tt-3],
n-12:cpu-sm-0/prob: (in)e-7[1x10x1x1][tsd-21][tt-3], (out)e-8[1x10x1x1][tsd-22][tt-2],
compiler targeting dla (fw) interface 0.12
compiler targeting emu (cpu) interface 0.0
(Pool) Memory list entry=1 size=925696 used=924288 domain=0 flags=3
content: tb-0 @ 0
content: tb-2 @ 4096
content: tb-3 @ 32768
content: tb-4 @ 8192
content: tb-5 @ 524288
content: tb-6 @ 12288
content: tb-8 @ 16384
content: tb-9 @ 24576

(Pool) Memory list entry=2 size=49152 used=46080 domain=0 flags=1

::Surface surface=tsd-0: bindable=0
::Buffer buffer=tb-0 surface=tsd-0 bindable=0
::Surface surface=tsd-0: bindable=0
::Surface surface=tsd-0: bindable=0
::Buffer buffer=tb-0 surface=tsd-0 bindable=0
::Surface surface=tsd-0: bindable=0
::Surface surface=tsd-1: bindable=1
::Buffer buffer=tb-1 surface=tsd-1 bindable=1
::Surface surface=tsd-1: bindable=1
::Surface surface=tsd-1: bindable=1
::Buffer buffer=tb-1 surface=tsd-1 bindable=1
::Surface surface=tsd-1: bindable=1
::Surface surface=tsd-1: bindable=1
::Buffer boundSurface(i=0) -> tsd-1
create tensor desc precision=1 category=1 sf=12
name : data
n,c,h,w : 1,4,28,28
data format : 2
data type : 4
data category: 0
pixel format : 12
pixel mapping: 0
strides : 1 128 0 00 0 0 0
::Buffer bindId(buffer=tb-1) [ ::Surface surface=tsd-1: bindable=1
tsd-1 bind_id=1
::Surface surface=tsd-1: bindable=1
::Surface bindId(tsd-1, 0) -> 0
]
(Bindable)(Buffer) Memory list entry for tbd=tb-1:0 : 0 size=3584 domain=0 flags=5
::Surface surface=tsd-10: bindable=0
::Buffer buffer=tb-10 surface=tsd-10 bindable=0
::Surface surface=tsd-10: bindable=0
::Surface surface=tsd-11: bindable=0
::Buffer buffer=tb-11 surface=tsd-11 bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-11: bindable=0
::Buffer buffer=tb-11 surface=tsd-11 bindable=0
::Surface surface=tsd-11: bindable=0
::Surface surface=tsd-12: bindable=0
::Buffer buffer=tb-12 surface=tsd-12 bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-12: bindable=0
::Buffer buffer=tb-12 surface=tsd-12 bindable=0
::Surface surface=tsd-12: bindable=0
::Surface surface=tsd-13: bindable=0
::Buffer buffer=tb-13 surface=tsd-13 bindable=0
::Surface surface=tsd-13: bindable=0
::Surface surface=tsd-14: bindable=0
::Buffer buffer=tb-14 surface=tsd-14 bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-14: bindable=0
::Buffer buffer=tb-14 surface=tsd-14 bindable=0
::Surface surface=tsd-14: bindable=0
::Surface surface=tsd-15: bindable=0
::Buffer buffer=tb-15 surface=tsd-15 bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-15: bindable=0
::Buffer buffer=tb-15 surface=tsd-15 bindable=0
::Surface surface=tsd-15: bindable=0
::Surface surface=tsd-16: bindable=0
::Buffer buffer=tb-16 surface=tsd-16 bindable=0
::Surface surface=tsd-16: bindable=0
::Surface surface=tsd-19: bindable=0
::Buffer buffer=tb-19 surface=tsd-19 bindable=0
::Surface surface=tsd-19: bindable=0
::Surface surface=tsd-19: bindable=0
::Buffer buffer=tb-19 surface=tsd-19 bindable=0
::Surface surface=tsd-19: bindable=0
::Surface surface=tsd-2: bindable=0
::Buffer buffer=tb-2 surface=tsd-2 bindable=0
::Surface surface=tsd-2: bindable=0
::Surface surface=tsd-2: bindable=0
::Buffer buffer=tb-2 surface=tsd-2 bindable=0
::Surface surface=tsd-2: bindable=0
::Surface surface=tsd-20: bindable=0
::Buffer buffer=tb-20 surface=tsd-20 bindable=0
::Surface surface=tsd-20: bindable=0
::Surface surface=tsd-21: bindable=0
::Buffer buffer=tb-21 surface=tsd-21 bindable=0
::Surface surface=tsd-21: bindable=0
::Surface surface=tsd-21: bindable=0
::Buffer buffer=tb-21 surface=tsd-21 bindable=0
::Surface surface=tsd-21: bindable=0
::Surface surface=tsd-22: bindable=1
::Buffer buffer=tb-22 surface=tsd-22 bindable=1
::Surface surface=tsd-22: bindable=1
::Surface surface=tsd-22: bindable=1
::Buffer buffer=tb-22 surface=tsd-22 bindable=1
::Surface surface=tsd-22: bindable=1
::Surface surface=tsd-22: bindable=1
::Buffer boundSurface(i=0) -> tsd-22
create tensor desc precision=1 category=3 sf=63
name : prob
n,c,h,w : 1,10,1,1
data format : 3
data type : 4
data category: 2
pixel format : 36
pixel mapping: 0
strides : 1 32 32 00 0 0 0
::Buffer bindId(buffer=tb-22) [ ::Surface surface=tsd-22: bindable=1
tsd-22 bind_id=1
::Surface surface=tsd-22: bindable=1
::Surface bindId(tsd-22, 1) -> 0
]
(Bindable)(Buffer) Memory list entry for tbd=tb-22:0 : 0 size=32 domain=0 flags=9
::Surface surface=tsd-3: bindable=0
::Buffer buffer=tb-3 surface=tsd-3 bindable=0
::Surface surface=tsd-3: bindable=0
::Surface surface=tsd-3: bindable=0
::Buffer buffer=tb-3 surface=tsd-3 bindable=0
::Surface surface=tsd-3: bindable=0
::Surface surface=tsd-4: bindable=0
::Buffer buffer=tb-4 surface=tsd-4 bindable=0
::Surface surface=tsd-4: bindable=0
::Surface surface=tsd-4: bindable=0
::Buffer buffer=tb-4 surface=tsd-4 bindable=0
::Surface surface=tsd-4: bindable=0
::Surface surface=tsd-5: bindable=0
::Buffer buffer=tb-5 surface=tsd-5 bindable=0
::Surface surface=tsd-5: bindable=0
::Surface surface=tsd-5: bindable=0
::Buffer buffer=tb-5 surface=tsd-5 bindable=0
::Surface surface=tsd-5: bindable=0
::Surface surface=tsd-6: bindable=0
::Buffer buffer=tb-6 surface=tsd-6 bindable=0
::Surface surface=tsd-6: bindable=0
::Surface surface=tsd-6: bindable=0
::Buffer buffer=tb-6 surface=tsd-6 bindable=0
::Surface surface=tsd-6: bindable=0
::Surface surface=tsd-8: bindable=0
::Buffer buffer=tb-8 surface=tsd-8 bindable=0
::Surface surface=tsd-8: bindable=0
::Surface surface=tsd-8: bindable=0
::Buffer buffer=tb-8 surface=tsd-8 bindable=0
::Surface surface=tsd-8: bindable=0
::Surface surface=tsd-9: bindable=0
::Buffer buffer=tb-9 surface=tsd-9 bindable=0
::Surface surface=tsd-9: bindable=0
::Surface surface=tsd-9: bindable=0
::Buffer buffer=tb-9 surface=tsd-9 bindable=0
::Surface surface=tsd-9: bindable=0
set symbol content name=tb-0 size=2048
(Surface) Address list entry for tsd=tsd-0/tb-0:0 -> 1 offset=0 size=2048
(Surface) Address list entry for tsd=tsd-1/tb-1:0 -> 3 offset=0 size=3584
(Surface) Address list entry for tsd=tsd-11/tb-11:0 -> 2 offset=0 size=18432
(Surface) Address list entry for tsd=tsd-12/tb-12:0 -> 2 offset=32768 size=4608
(Surface) Address list entry for tsd=tsd-14/tb-14:0 -> 2 offset=40960 size=4096
(Surface) Address list entry for tsd=tsd-15/tb-15:0 -> 2 offset=45056 size=1024
(Surface) Address list entry for tsd=tsd-19/tb-19:0 -> 2 offset=40960 size=512
set symbol content name=tb-2 size=40
(Surface) Address list entry for tsd=tsd-2/tb-2:0 -> 1 offset=4096 size=40
(Surface) Address list entry for tsd=tsd-21/tb-21:0 -> 2 offset=45056 size=32
(Surface) Address list entry for tsd=tsd-22/tb-22:0 -> 4 offset=0 size=32
set symbol content name=tb-3 size=25088
(Surface) Address list entry for tsd=tsd-3/tb-3:0 -> 1 offset=32768 size=25088
set symbol content name=tb-4 size=100
(Surface) Address list entry for tsd=tsd-4/tb-4:0 -> 1 offset=8192 size=100
set symbol content name=tb-5 size=400000
(Surface) Address list entry for tsd=tsd-5/tb-5:0 -> 1 offset=524288 size=400000
set symbol content name=tb-6 size=1000
(Surface) Address list entry for tsd=tsd-6/tb-6:0 -> 1 offset=12288 size=1000
set symbol content name=tb-8 size=5120
(Surface) Address list entry for tsd=tsd-8/tb-8:0 -> 1 offset=16384 size=5120
set symbol content name=tb-9 size=20
(Surface) Address list entry for tsd=tsd-9/tb-9:0 -> 1 offset=24576 size=20
emit discovered 2 tasks
the initial mem list size is 4 entries
the initial addr list size is 4 entries
task_id=0 has 10 op slots and 1 batches
address list task context at [5, 11)
data cube access by tsd:batch=tsd-1:0 id[offs]=3
::Surface surface=tsd-1: bindable=1
data cube access by tsd:batch=tsd-0:0 id[offs]=1
::Surface surface=tsd-0: bindable=0
Convolution node @ op_slot = 0 batch_id = 0
src data loc: 0
dst data loc: 2
post y extension: 1
in_precision 0
out_precision 0
pad_val 0
conv mode 0
data_reuse 0
weight_reuse 0
skip_data_rls 0
skip_wt_rls 0
eps 1
fetch_grain 1
data_format 12
pixel_mapping 0
batch 1
weight_format 0
b4d 1
b4w 1
batch_stride 0
release 28
post_extension 1
pixel_override 1
mean_format 0
stride-x 1
stride-y 1
pad-left 0
pad-top 0
pad-right 0
pad-bottom 0
dilationx-x 1
dilation-y 1
pra_truncate 0
inputwidthcsc 28
inputheightcsc 28
inputchannelcsc 4
kernelwidthcsc 1
kernelheightcsc 5
kernelchannelcsc 20
inputwidthcmac 24
inputheightcmac 24
bytesperkernel 100
offsetU 0
dependencyCount 1
src tsd:tsd-1
src addr=3
src size 3584
src width 28
src height 28
src channel 4
src linestride 128
src surfstride 0
dst tsd:tsd-10
dst addr=-1
dst size 18432
dst width 24
dst height 24
dst channel 20
dst linestride 768
dst surfstride 18432
wt tsd:tsd-0
weight addr=1
wt size 2048
wt width 1
wt height 5
wt channel 20
data cube access by tsd:batch=tsd-11:0 id[offs]=2
::Surface surface=tsd-11: bindable=0
data cube access by tsd:batch=tsd-2:0 id[offs]=1
::Surface surface=tsd-2: bindable=0
SDP bias node @ op_slot = 1 batch_id = 0
src precision 0
dst precision 0
x1 enable 1
x1 precision 1
x1 aluType 2
x1 type 2
x1 mode 1
x1 act 0
x1 shiftValue 0
x1 aluOperand 0
x1 mulOperand 1
x1 truncate 0
x2 enable 0
y enable 0
src tsd:tsd-10
dst tsd:tsd-11/tb-11:off= 0
bias tsd:tsd-2
dependencyCount1
conv_mode 0
src addr=-1
src type=2
src size 18432
src width 24
src height 24
src channel 20
src linestride 768
src surfstride 18432
bias addr=1
bias type=0
bias size 40
bias width 1
bias height 1
bias channel 20
bias linestride 64
bias surfstride 64
dst addr=2
dst type=0
dst size 18432
dst width 24
dst height 24
dst channel 20
dst linestride 768
dst surfstride 18432
out_cvt enable 1
out_cvt scale 16557
out_cvt offset 0
out_cvt truncate 23
data cube access by tsd:batch=tsd-11:0 id[offs]=2
::Surface surface=tsd-11: bindable=0
data cube access by tsd:batch=tsd-12:0 id[offs]=2
::Surface surface=tsd-12: bindable=0
PDP node @ op_slot = 2 batch_id = 0
pdp precision0
pdp pool mode1
src tsd:tsd-11
dst tsd:tsd-12
src addr=2
src type=0
dependencyCount1
splitNum 1
padLeft 0
padTop 0
padRight 0
padBottom 0
pool height 1
pool width 1
stride x 2
stride y 2
src size 18432
src width 24
src height 24
src channel 20
src linestride 768
src surfstride 18432
dst addr=2
dst type=0
dst size 4608
dst width 12
dst height 12
dst channel 20
dst linestride 384
dst surfstride 4608
data cube access by tsd:batch=tsd-12:0 id[offs]=2
::Surface surface=tsd-12: bindable=0
data cube access by tsd:batch=tsd-3:0 id[offs]=1
::Surface surface=tsd-3: bindable=0
Convolution node @ op_slot = 3 batch_id = 0
src data loc: 0
dst data loc: 2
post y extension: 0
in_precision 0
out_precision 0
pad_val 0
conv mode 0
data_reuse 0
weight_reuse 0
skip_data_rls 0
skip_wt_rls 0
eps 3
fetch_grain 1
data_format 36
pixel_mapping 0
batch 1
weight_format 0
b4d 1
b4w 1
batch_stride 0
release 12
post_extension 0
pixel_override 1
mean_format 0
stride-x 1
stride-y 1
pad-left 0
pad-top 0
pad-right 0
pad-bottom 0
dilationx-x 1
dilation-y 1
pra_truncate 0
inputwidthcsc 12
inputheightcsc 12
inputchannelcsc 20
kernelwidthcsc 5
kernelheightcsc 5
kernelchannelcsc 20
inputwidthcmac 8
inputheightcmac 8
bytesperkernel 500
offsetU 0
dependencyCount 3
src tsd:tsd-12
src addr=2
src size 4608
src width 12
src height 12
src channel 20
src linestride 384
src surfstride 4608
dst tsd:tsd-13
dst addr=-1
dst size 4096
dst width 8
dst height 8
dst channel 50
dst linestride 256
dst surfstride 2048
wt tsd:tsd-3
weight addr=1
wt size 25088
wt width 5
wt height 5
wt channel 20
data cube access by tsd:batch=tsd-14:0 id[offs]=2
::Surface surface=tsd-14: bindable=0
data cube access by tsd:batch=tsd-4:0 id[offs]=1
::Surface surface=tsd-4: bindable=0
SDP bias node @ op_slot = 4 batch_id = 0
src precision 0
dst precision 0
x1 enable 1
x1 precision 1
x1 aluType 2
x1 type 2
x1 mode 1
x1 act 0
x1 shiftValue 0
x1 aluOperand 0
x1 mulOperand 1
x1 truncate 0
x2 enable 0
y enable 0
src tsd:tsd-13
dst tsd:tsd-14/tb-14:off= 0
bias tsd:tsd-4
dependencyCount2
conv_mode 0
src addr=-1
src type=2
src size 4096
src width 8
src height 8
src channel 50
src linestride 256
src surfstride 2048
bias addr=1
bias type=0
bias size 100
bias width 1
bias height 1
bias channel 50
bias linestride 64
bias surfstride 64
dst addr=2
dst type=0
dst size 4096
dst width 8
dst height 8
dst channel 50
dst linestride 256
dst surfstride 2048
out_cvt enable 1
out_cvt scale 16846
out_cvt offset 0
out_cvt truncate 26
data cube access by tsd:batch=tsd-14:0 id[offs]=2
::Surface surface=tsd-14: bindable=0
data cube access by tsd:batch=tsd-15:0 id[offs]=2
::Surface surface=tsd-15: bindable=0
PDP node @ op_slot = 5 batch_id = 0
pdp precision0
pdp pool mode1
src tsd:tsd-14
dst tsd:tsd-15
src addr=2
src type=0
dependencyCount2
splitNum 1
padLeft 0
padTop 0
padRight 0
padBottom 0
pool height 1
pool width 1
stride x 2
stride y 2
src size 4096
src width 8
src height 8
src channel 50
src linestride 256
src surfstride 2048
dst addr=2
dst type=0
dst size 1024
dst width 4
dst height 4
dst channel 50
dst linestride 128
dst surfstride 512
data cube access by tsd:batch=tsd-15:0 id[offs]=2
::Surface surface=tsd-15: bindable=0
data cube access by tsd:batch=tsd-5:0 id[offs]=1
::Surface surface=tsd-5: bindable=0
FullyConnected node @ op_slot = 6 batch_id = 0
src data loc: 0
dst data loc: 2
post y extension: 0
in_precision 0
out_precision 0
pad_val 0
conv mode 0
data_reuse 0
weight_reuse 0
skip_data_rls 0
skip_wt_rls 0
eps 2
fetch_grain 1
data_format 36
pixel_mapping 0
batch 1
weight_format 0
b4d 1
b4w 13
batch_stride 0
release 4
post_extension 0
pixel_override 0
mean_format 0
stride-x 1
stride-y 1
pad-left 0
pad-top 0
pad-right 0
pad-bottom 0
dilationx-x 1
dilation-y 1
pra_truncate 0
inputwidthcsc 4
inputheightcsc 4
inputchannelcsc 50
kernelwidthcsc 4
kernelheightcsc 4
kernelchannelcsc 50
inputwidthcmac 1
inputheightcmac 1
bytesperkernel 800
offsetU 0
dependencyCount 3
src tsd:tsd-15
src addr=2
src size 1024
src width 4
src height 4
src channel 50
src linestride 128
src surfstride 512
dst tsd:tsd-16
dst addr=-1
dst size 512
dst width 1
dst height 1
dst channel 500
dst linestride 32
dst surfstride 32
wt tsd:tsd-5
weight addr=1
wt size 400000
wt width 4
wt height 4
wt channel 50
data cube access by tsd:batch=tsd-19:0 id[offs]=2
::Surface surface=tsd-19: bindable=0
data cube access by tsd:batch=tsd-6:0 id[offs]=1
::Surface surface=tsd-6: bindable=0
SDP bias node @ op_slot = 7 batch_id = 0
src precision 0
dst precision 0
x1 enable 1
x1 precision 1
x1 aluType 2
x1 type 2
x1 mode 1
x1 act 1
x1 shiftValue 0
x1 aluOperand 0
x1 mulOperand 1
x1 truncate 0
x2 enable 0
y enable 0
src tsd:tsd-16
dst tsd:tsd-19/tb-19:off= 0
bias tsd:tsd-6
dependencyCount2
conv_mode 0
src addr=-1
src type=2
src size 512
src width 1
src height 1
src channel 500
src linestride 32
src surfstride 32
bias addr=1
bias type=0
bias size 1000
bias width 1
bias height 1
bias channel 500
bias linestride 64
bias surfstride 64
dst addr=2
dst type=0
dst size 512
dst width 1
dst height 1
dst channel 500
dst linestride 32
dst surfstride 32
out_cvt enable 1
out_cvt scale 30309
out_cvt offset 0
out_cvt truncate 25
data cube access by tsd:batch=tsd-19:0 id[offs]=2
::Surface surface=tsd-19: bindable=0
data cube access by tsd:batch=tsd-8:0 id[offs]=1
::Surface surface=tsd-8: bindable=0
FullyConnected node @ op_slot = 8 batch_id = 0
src data loc: 0
dst data loc: 2
post y extension: 0
in_precision 0
out_precision 0
pad_val 0
conv mode 0
data_reuse 0
weight_reuse 0
skip_data_rls 0
skip_wt_rls 0
eps 4
fetch_grain 1
data_format 36
pixel_mapping 0
batch 1
weight_format 0
b4d 1
b4w 1
batch_stride 0
release 1
post_extension 0
pixel_override 0
mean_format 0
stride-x 1
stride-y 1
pad-left 0
pad-top 0
pad-right 0
pad-bottom 0
dilationx-x 1
dilation-y 1
pra_truncate 0
inputwidthcsc 1
inputheightcsc 1
inputchannelcsc 500
kernelwidthcsc 1
kernelheightcsc 1
kernelchannelcsc 500
inputwidthcmac 1
inputheightcmac 1
bytesperkernel 500
offsetU 0
dependencyCount 3
src tsd:tsd-19
src addr=2
src size 512
src width 1
src height 1
src channel 500
src linestride 32
src surfstride 32
dst tsd:tsd-20
dst addr=-1
dst size 32
dst width 1
dst height 1
dst channel 10
dst linestride 32
dst surfstride 32
wt tsd:tsd-8
weight addr=1
wt size 5120
wt width 1
wt height 1
wt channel 500
data cube access by tsd:batch=tsd-21:0 id[offs]=2
::Surface surface=tsd-21: bindable=0
data cube access by tsd:batch=tsd-9:0 id[offs]=1
::Surface surface=tsd-9: bindable=0
SDP bias node @ op_slot = 9 batch_id = 0
src precision 0
dst precision 0
x1 enable 1
x1 precision 1
x1 aluType 2
x1 type 2
x1 mode 1
x1 act 0
x1 shiftValue 0
x1 aluOperand 0
x1 mulOperand 1
x1 truncate 0
x2 enable 0
y enable 0
src tsd:tsd-20
dst tsd:tsd-21/tb-21:off= 0
bias tsd:tsd-9
dependencyCount2
conv_mode 0
src addr=-1
src type=2
src size 32
src width 1
src height 1
src channel 10
src linestride 32
src surfstride 32
bias addr=1
bias type=0
bias size 20
bias width 1
bias height 1
bias channel 10
bias linestride 64
bias surfstride 64
dst addr=2
dst type=0
dst size 32
dst width 1
dst height 1
dst channel 10
dst linestride 32
dst surfstride 32
out_cvt enable 1
out_cvt scale 20972
out_cvt offset 0
out_cvt truncate 25
set symbol content name=task-0-addr0 size=40
set symbol content name=task-0-dep_graph size=360
set symbol content name=task-0-op_list size=1160
set symbol content name=task-0-surf_list size=6440
set symbol content name=task-0-lut_list size=700
task address list (indices into global address list):
5
1
2
3
4
6
7
8
9
10
<>
gathered reloc entry: address id=3 writeId=8 useBase=1ed9130 originalOffset=1ed91a4 offset=74 interface=1 subInterface=4 relocType=1
gathered reloc entry: address id=3 writeId=8 useBase=1ed9130 originalOffset=1ed91a8 offset=78 interface=1 subInterface=4 relocType=2
task_id=1 has 1 op slots and 1 batches
address list task context at [11, 17)
::Surface surface=tsd-21: bindable=0
::Surface surface=tsd-22: bindable=1
Softmax node @ op_slot = 0 batch_id = 0
src addr=2[45056]
dst addr=4[0]
input scale factor 0.141637
output scale factor 0.00831604
src size=32
src format=0
src width=1
src height=1
src channel=10
dst size=32
dst format=0
dst width=1
dst height=1
dst channel=10
set symbol content name=task-1-addr0 size=256
set symbol content name=task-1-op_list size=24
set symbol content name=task-1-op_buf_list size=512
task address list (indices into global address list):
11
1
2
3
4
12
13
14
15
16
<>
gathered reloc entry: address id=4 writeId=13 useBase=1edf5f0 originalOffset=1edf702 offset=112 interface=2 subInterface=4 relocType=1
gathered reloc entry: address id=4 writeId=13 useBase=1edf5f0 originalOffset=1edf706 offset=116 interface=2 subInterface=4 relocType=2
profile insertLoadable saving loadable with name fast-math
profile getLoadable looked for loadable with name fast-math
profile getLoadable looked for loadable with name fast-math
closing wisdom context…

下图是我整理的,Compiler工作的主要路线图,其中蓝色标注的部分是编译部分的主体,在正式进行编译之前,编译器会把 CaffeModel 转换成 Network 对象;把输入的profileName 转换成 Profile 对象,例如指定 fast-math 那么在profile 对象里就会把一些优化的开关打开,把targetname,就是nv_small/nv_large 这种,会转换成TargetConfig对象,内有硬件相关的配置信息。

WorkFlow of  NVDLA Compiler

之后,通过 Network,生成一个标准的抽象语法树,再转换成与硬件相关的 engine_ast,engine_ast 就是编译器最核心的 IR,接下来的各种优化都是对该 IR 做变换。

cangraph2enggraph

在这里可以看到 engine_ast 已经包含了硬件信息,在左边的抽象语法树里,只有算子的信息,右边会把算子对应的映射到硬件上去,映射关系在 log 里也可以找到:

1
2
3
4
5
6
7
8
n-0:->n-0:dc-conv-0 n-1:bias-0 
n-1:->n-2:pdp-0
n-2:->n-3:dc-conv-1 n-4:bias-1
n-3:->n-5:pdp-1
n-4:->n-6:fc-0 n-7:bias-2
n-5:->n-8:sdp-scale-0 n-9:act-0
n-6:->n-10:fc-1 n-11:bias-3
n-7:->n-12:cpu-sm-0

例如,n-0 原来是卷积运算的op节点,在engine_ast 中,乘 weight 运算被映射到 Convolution Engine 上,bias 添加运算被映射到 SDP Engine 上。

2.2 main.c

在 main 函数里,编译器首先做的事情是对命令行的参数进行解析,不得不感叹这部分的逻辑用的居然是if、else语句来解析参数:

1
2
3
4
5
6
7
8
9
10
if (ii >= argc)
break;
const char* arg = argv[ii];

if (std::strcmp(arg, "-h") == 0) // help
{
// Print usage
showHelp = true;
break;
}

我觉得,改成类似 Caffe 的,使用 gtest 的思路不会更好吗?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
using caffe::Timer;
using caffe::vector;
using std::ostringstream;

DEFINE_string(gpu, "",
"Optional; run in GPU mode on given device IDs separated by ','."
"Use '-gpu all' to run on all available GPUs. The effective training "
"batch size is multiplied by the number of devices.");
DEFINE_string(solver, "",
"The solver definition protocol buffer text file.");
DEFINE_string(model, "",
"The model definition protocol buffer text file.");
DEFINE_string(phase, "",
"Optional; network phase (TRAIN or TEST). Only used for 'time'.");
DEFINE_int32(level, 0,
"Optional; network level.");

解析参数的过程中,会初始化TestAppArgs这个结构体,而在程序的开始,已经给其赋了初始值:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
static TestAppArgs defaultTestAppArgs =
{
/* .project = */ "OpenDLA",
/* .inputPath = */ "./",
/* .inputName = */ "",
/* .outputPath = */ "./",
/* .testname = */ "",
/* .testArgs = */ "",
/* .prototxt = */ "",
/* .caffemodel = */ "",
/* .cachemodel = */ "",
/* .profileName = */ "fast-math",
/* .profileFile = */ "",
/* .configtarget = */ TARGET_CONFIG_NAME,
/* .calibtable = */ "",
/* .quantizationMode = */ DEFAULT_QUANT_MODE,
/* .numBatches = */ DEFAULT_BATCH_SIZE,
/* .inDataFormat = */ DEFAULT_DATA_FMT,
/* .computePrecision = */ nvdla::DataType::INT8
};

初始化完成之后,再调用同样写在该文件的 LaunchTest 函数:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
NvDlaError launchTest(const TestAppArgs* appArgs)
{
NvDlaError e = NvDlaSuccess;
TestInfo testInfo;

PROPAGATE_ERROR_FAIL(testSetup(appArgs, &testInfo));

PROPAGATE_ERROR_FAIL(parseAndCompile(appArgs, &testInfo));

return NvDlaSuccess;

fail:
return e;
}

在 testSetup 函数里,主要是检验一下对应的文件是否真的存在等。

其中又有一个非常重要的结构体:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
struct TestInfo
{
// runtime
nvdla::IRuntime* runtime;
std::string inputLoadablePath;
NvU8 *inputHandle;
NvU8 *outputHandle;
NvU8 *pData;
bool dlaServerRunning;
NvS32 dlaRemoteSock;
NvS32 dlaServerSock;
NvU32 numInputs;
NvU32 numOutputs;
NvDlaImage* inputImage;
NvDlaImage* outputImage;
};

在该函数里,还有一个让人窒息的操作,给大家康康:

1
2
3
4
5
// Clear wisdomPath if any exist
removeCmd += "rm -rf " + wisdomPath;
ii = std::system(removeCmd.c_str()); // This is pretty awful
if (ii != 0)
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "system command failed: \"%s\"", removeCmd.c_str());

2.3 工厂设计模式

在 NVDLA 的软件栈代码里,用了很多工厂模式的设计,例如创建一个 Network 对象,需要调用 NetworkFactory 的 newNetwork 方法,其他的,像是 Tensor、Node、Layer 都有对应的工厂,来生产不同的 Tensor、 Node、Layer。

此外,针对 Network、Tensor、Node、Layer 等等对象,其都定义了对应的虚接口类,如 INetwork、ITensor、INode、ILayer、这些接口类里通过纯虚函数定义了继承的子类必须实现的函数。

比如举个 Network 的例子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
class Network : public INetwork
{
public: // externally facing

virtual ITensor* addInput(const char* name, Dims4 dimensions);

// virtual void markChanged(const ILayer*);
virtual bool markInput(ITensor * tensor);
virtual void markOutput(ITensor* tensor);

virtual IConvolutionLayer * addConvolution(ITensor* input, int numOutputs, int paddingValue,
Dims2 kernelSize, Dims2 tlPadding, Dims2 brPadding, Dims2 stride, Dims2 dilation,
Weights kernelWeights, Weights biasWeights, BiasMode biasmode, int numGroups);
virtual IFullyConnectedLayer * addFullyConnected(ITensor* input, int outputSize, Weights kernelWeights, Weights biasWeights, BiasMode biasMode);
virtual IActivationLayer * addActivation(ITensor* input, ActivationType type);
virtual IPoolingLayer * addPooling(ITensor* input, PoolingType type,
Dims2 windowSize, Dims2 stride, Dims2 tlPadding, Dims2 brPadding);
virtual ILRNLayer * addLRN(ITensor* input, int window, float alpha, float beta, float k);
virtual IScaleLayer * addScale(ITensor* input, ScaleMode mode, Weights shift, Weights scale, Weights power);
virtual IBatchNormLayer * addBatchNorm(ITensor* input, BatchNormMode mode, Weights mean, Weights variance, float epsilon);
virtual ISoftMaxLayer * addSoftMax(ITensor* input);
virtual IConcatenationLayer * addConcatenation(ITensor * const * inputs, int numInputs);
virtual ISliceLayer * addSlice(ITensor* input, int numOutputs);
virtual IDeconvolutionLayer * addDeconvolution(ITensor* input, int numOutputs, int paddingValue,
Dims2 kernelSize, Dims2 tlPadding, Dims2 brPadding, Dims2 stride, Dims2 dilation,
Weights kernelWeights, Weights biasWeights, BiasMode biasMode, int numGroups);
virtual IElementWiseLayer * addElementWise(ITensor* input0, ITensor* input1, ElementWiseOperation op);

virtual int getNumInputs() const;
virtual int getNumOutputs() const;
virtual int getNumLayers() const ;

virtual ILayer * getLayer(int index) const;
virtual ITensor * getOutput(int index) const;
virtual ITensor * getInput(int index) const;

virtual void setPoolingOutputDimensionsFormula (OutputDimensionsFormula* callback);
virtual void setConvolutionOutputDimensionsFormula (OutputDimensionsFormula* callback);
virtual void setDeconvolutionOutputDimensionsFormula(OutputDimensionsFormula* callback);

virtual OutputDimensionsFormula& getPoolingOutputDimensionsFormula() const;
virtual OutputDimensionsFormula& getConvolutionOutputDimensionsFormula() const;
virtual OutputDimensionsFormula& getDeconvolutionOutputDimensionsFormula() const;

virtual const std::vector<ITensor *>& getInputs() const;
virtual const std::vector<ILayer * >& getLayers() const;
virtual const std::vector<ITensor *>& getOutputs() const;

virtual NvU16 getFactoryType() const;


public: // internally facing
Network();
virtual ~Network();
virtual bool serializeTo(WisdomContainerEntry *) const;
virtual bool deserializeFrom(WisdomContainerEntry *);
virtual bool assignSymbols(Wisdom *);

protected:
friend class Wisdom;
friend class NetworkFactory;

void destroy();

private:

std::string newLayerName() const;
std::string newTensorName() const;

ITensor* addTensor(const std::string & s);
const ILayer* findLayer(const std::string& name) const;
bool checkNames(const char* name);

std::vector<ITensor *> mTensors;
std::vector<ILayer *> mLayers;
std::vector<ITensor *> mInputs;
std::vector<ITensor *> mOutputs;

// provides layer dimension caching. Layers can be mutated in any order and dimensions queried at any point.
// So mutating a layer trims this, and querying always refills the cache up to the queried layer
// mutable std::vector<Dims3> mDimensions;

// internal flags used by the builder that are not accessible through the API
// int mInternalBuildFlags{ InternalBuildFlags::kENABLE_GRAPH_OPTIMIZATIONS };
OutputDimensionsFormula* mConvDims, *mDeconvDims, *mPoolDims;
};

3. Caffe Parser

该部分是编译器的最前端,将 Caffemodel 和 Prototxt 转化成内部通用的 Network 对象。

image-20210629203635668

将模型转化成 Network 对象,搞不懂返回 IBlobNameToTensor 这个对象有什么用,后面也没有用到数据,只做判断解析是否完成用了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
const IBlobNameToTensor* CaffeParser::parse(const char* deployFile,
const char* modelFile,
INetwork * network)
{

CHECK_NULL_RET_NULL(deployFile);
CHECK_NULL_RET_NULL(modelFile);
assert(mDimsCallback == 0);

if (!mDimsCallback) {
mDimsCallback = new CaffeParserPoolingDimsCallback;
}
// 设置 Network 输出层长宽的计算公式(向上取整)
network->setPoolingOutputDimensionsFormula(mDimsCallback);

// this is used to deal with dropout layers which have different input and output
// 解析 Caffemodel 文件,提取权重数据等,转化到 mModel 对象中
mModel = new dc::NetParameter();
if (!readBinaryProto(mModel/*.get()*/, modelFile, mProtobufBufferSize))
{
gLogError << "Could not parse model file" << std::endl;
return 0;
}
// 解析 Prototxt 文件,转化到 mDeploy 对象中
mDeploy = new dc::NetParameter();
if (!readTextProto(mDeploy/*.get()*/, deployFile))
{
gLogError << "Could not parse deploy file" << std::endl;
return 0;
}

bool ok = true;
CaffeWeightFactory weights(*mModel/**mModel.get()*/,
false /*weightType == DataType::kHALF*/, mTmpAllocs);

mBlobNameToTensor = new BlobNameToTensor();

for (int i = 0; i < mDeploy->input_size(); i++)
{
Dims4 dims;
if (mDeploy->input_shape_size()) {
dims.n = (int)mDeploy->input_shape().Get(i).dim().Get(0);
dims.c = (int)mDeploy->input_shape().Get(i).dim().Get(1);
dims.h = (int)mDeploy->input_shape().Get(i).dim().Get(2);
dims.w = (int)mDeploy->input_shape().Get(i).dim().Get(3);
}
else { // deprecated, but still used in a lot of networks
dims.n = (int)mDeploy->input_dim().Get(i * 4 + 0);
dims.c = (int)mDeploy->input_dim().Get(i * 4 + 1);
dims.h = (int)mDeploy->input_dim().Get(i * 4 + 2);
dims.w = (int)mDeploy->input_dim().Get(i * 4 + 3);
}

ITensor* tensor = network->addInput(mDeploy->input().Get(0).c_str(), dims);
mBlobNameToTensor->add(mDeploy->input().Get(0), tensor);

}

for (int i = 0; i < mDeploy->layer_size() && ok; i++)
{
const dc::LayerParameter& layerMsg = mDeploy->layer(i);
if (layerMsg.has_phase() && layerMsg.phase() == dc::TEST) {
continue;
}

if (layerMsg.type() == "Dropout")
{
mBlobNameToTensor->add(layerMsg.top().Get(0),
mBlobNameToTensor->find(layerMsg.bottom().Get(0).c_str()));
continue;
}

if (layerMsg.type() == "Input")
{
const dc::InputParameter& p = layerMsg.input_param();
for (int i = 0; i < layerMsg.top_size(); i++)
{
const dc::BlobShape& shape = p.shape().Get(i);
Dims4 dims(shape.dim().Get(0), shape.dim().Get(1), shape.dim().Get(2), shape.dim().Get(3));
ITensor* tensor = network->addInput(layerMsg.top(i).c_str(), dims);
mBlobNameToTensor->add(layerMsg.top().Get(i), tensor);
}
continue;
}
if (layerMsg.type() == "Flatten")
{
ITensor* tensor = (*mBlobNameToTensor)[layerMsg.bottom().Get(0)];
(*mBlobNameToTensor)[layerMsg.top().Get(0)] = tensor;
std::cout << "Warning: Flatten layer ignored." << std::endl;
continue;
}

LayerParseFnMap::iterator v = gParseTable.find(layerMsg.type());

if (v == gParseTable.end())
{
gLogError << "could not parse layer type " << layerMsg.type() << std::endl;
ok = false;
}
else
{
ILayer* layer = (*v->second)(network, layerMsg, weights, mBlobNameToTensor);

if (layer == 0)
{
gLogError << "error: parsing layer type " << layerMsg.type() <<
" index " << i << std::endl;
ok = false;
}
else
{
layer->setName(layerMsg.name().c_str());
mBlobNameToTensor->add(layerMsg.top(0), layer->getOutput(0));
}
}
}

mBlobNameToTensor->setTensorNames();
return ok && weights.isOK() ? mBlobNameToTensor : 0;
}

确定最终网络的输出层是哪一层:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37

int CaffeParser::identifyOutputs(INetwork * network)
{
std::set< ITensor* > outputTensors;
std::set< ITensor* > inputTensors;

// 缓存每层的输入 Tensor 和输出 Tensor
for (int l = 0; l < network->getNumLayers(); ++l)
{
// 获取每一层
ILayer* layer = network->getLayer(l);
if (!layer)
return -1;
// 将输入 Tensor 存入inputTensors 里
for (int ii = 0; ii < layer->getNumInputs(); ++ii) {
inputTensors.insert(layer->getInput(ii));
}
// 将输出 Tensor 存入 OutputTensors 里
for (int oo = 0; oo < layer->getNumOutputs(); ++oo)
{
outputTensors.insert(layer->getOutput(oo));
}
}

for (std::set<ITensor*>::iterator oi = outputTensors.begin(); oi != outputTensors.end(); ++oi)
{
// oi 是遍历 outputTensors,如果这个 Tensors 不是输入的 Tensor 那他就是最后一层输出的 Tensor,这个逻辑没毛病。
// an output tensor which is not an input to any other layers is a network output tensor
if (inputTensors.find(*oi) == inputTensors.end())
{
network->markOutput(*oi);
gLogInfo << "mark " << (*oi)->getName() << std::endl;
}
}

return network->getNumOutputs();
}

4. parseTensorScales

然后是解析量化的 Scales Json 文件:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
NvDlaError parseTensorScales(const TestAppArgs* appArgs, TestInfo *i, nvdla::INetwork* network)
{
NvDlaError e = NvDlaSuccess;
NvDlaStatType stat;
std::string calibTableFile = /*i->calibTablesPath + "/" + */appArgs->calibTable;

PROPAGATE_ERROR_FAIL(NvDlaStat(calibTableFile.c_str(), &stat));

// populate the scaling factor/dynamic range of each of the tensors on the network
{
FILE* fp = fopen(calibTableFile.c_str(), "r");
char readBuffer[TEST_PARAM_FILE_MAX_SIZE] = {0};

rapidjson::Document doc;
rapidjson::FileReadStream inStr(fp, readBuffer, sizeof(readBuffer));

doc.ParseStream(inStr);
if (doc.HasParseError())
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "JSON parsing error: %s", GetParseError_En(doc.GetParseError()));
}

{
std::vector<nvdla::ILayer*> networkLayers = network->getLayers();
std::vector<nvdla::ITensor*> networkInputs = network->getInputs();

std::vector<nvdla::ILayer*>::iterator li = networkLayers.begin();
std::vector<nvdla::ITensor*>::iterator nii = networkInputs.begin();

// set scaling factor for the network input tensors
// 这里是对输入的数据进行量化,一般是 data 层
for (; nii != networkInputs.end(); ++nii)
{
NvF32 scale = 0.0f;
NvF32 min = 0.0f;
NvF32 max = 0.0f;
std::string tName = (*nii)->getName();
// 可以发现这里是根据每个 Layer 的 Name 来索引网络的 scale,也就是说 TensorRT 量化出来的那个是不能直接用的,详情见之前的博客
if (doc[tName.c_str()].HasMember("scale")) {
// 因为出来的都是 scale,所以总是会使用这个分支的
scale = doc[tName.c_str()]["scale"].GetFloat();
// min 和 max 都是根据 scale 来调整上下届,具体的量化算法是什么呢?
min = scale * -127.0f;
max = scale * 127.0f;
}
else if (doc[tName.c_str()].HasMember("min") && doc[tName.c_str()].HasMember("max")) {
min = doc[tName.c_str()]["min"].GetFloat();
max = doc[tName.c_str()]["max"].GetFloat();
}
else {
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "Atleast 1 of scale or min-max should be specified for %s\n", tName.c_str());
}

// set same dynamic range for all channels of the tensor (cIndex = -1)
PROPAGATE_ERROR_FAIL( (*nii)->setChannelDynamicRange(-1, min, max) );
const_cast<TestAppArgs*>(appArgs)->tensorScales.insert(std::pair<std::string, NvF32>(tName, scale));
}
// 这里是提取每一层Layer的Scale信息
for (; li != networkLayers.end(); ++li)
{
NvF32 scale = 0.0f;
NvF32 min = 0.0f;
NvF32 max = 0.0f;
std::string lName = (*li)->getName();
nvdla::ITensor* outTensor = (*li)->getOutput(0);

if (doc[lName.c_str()].HasMember("scale")) {
scale = doc[lName.c_str()]["scale"].GetFloat();
min = scale * -127.0f;
max = scale * 127.0f;
}
else if (doc[lName.c_str()].HasMember("min") && doc[lName.c_str()].HasMember("max")) {
min = doc[lName.c_str()]["min"].GetFloat();
max = doc[lName.c_str()]["max"].GetFloat();
}
else {
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "Atleast 1 of scale or min-max should be specified for %s\n", lName.c_str());
}

// set same dynamic range for all channels of the tensor (cIndex = -1)
PROPAGATE_ERROR_FAIL( outTensor->setChannelDynamicRange(-1, min, max) );
const_cast<TestAppArgs*>(appArgs)->tensorScales.insert(std::pair<std::string, NvF32>(lName, scale));
}
}

fclose(fp);
}

fail:
return e;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
NvDlaError Tensor::setChannelDynamicRange(NvS32 chnlIndx, NvF32 min, NvF32 max)
{
NvDlaError e = NvDlaSuccess;
// min/max = scale * +- 127 -> max(min,max)/127.0 = scale
NvF32 scaleFactor = std::max<NvF32>(std::fabs(min), std::fabs(max))/127.0;

// 输入的 chnlIndx 数不能大于 Tensor 的 Channel 数
if (chnlIndx >= mDimensions.c)
{
e = NvDlaError_BadParameter;
goto fail;
}
else if (chnlIndx == -1)
{
// clear existing scales and adjust vector capacity if need be before inserting
mChnlScales.clear();
// mChnlScales 是一个 Vector、reserve 是为其开辟了内存空间,但是这个 Vector 的 Size 没有改变
mChnlScales.reserve(mDimensions.c);
// As we can see, 如果指定 -1,那就是对 Tensor 的每个通道都使用同一个 scaleFactor
for (NvU32 cc = 0; cc < static_cast<NvU32>(mDimensions.c); ++cc)
{
mChnlScales.push_back(scaleFactor);
}
}
else
{
// adjust vector capacity before inserting
// 否则就只设置 chnlIndx 这个通道的 scaleFactor
if (mChnlScales.capacity() < (size_t)mDimensions.c)
{
mChnlScales.reserve(mDimensions.c);
}
mChnlScales.insert(mChnlScales.begin() + chnlIndx, scaleFactor);
}

fail:
return e;
}

5. setNetworkTransient

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
bool Wisdom::setNetworkTransient(INetwork *inetwork)
{
Network *network = NetworkFactory::priv(inetwork);
if ( !network ) {
gLogError << "unrecognized network presented to Wisdom" << endl;
return false;
}
//
// prior to serialization all objects need entries in the symbol table.
// since the network ultimately refers to them all, just triggering
// it's symbols to be resolved is sufficient to be able to deserialize
// it later.
// 设置符号表(symbol table),具体可以看 wisdom -> symbol table
bool ok = network->assignSymbols(m_container->wisdom_priv());
m_network = network;
return ok;
}

6.profile

根据 “fast-math” 配置到 i->wisdom->m_profiler 里去,也就是进行一个字符串到具体对象的转化;

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
NvDlaError Compiler::compileInternal(const char *tp_name, const char *target_config_name, ILoadable **peli, bool fullCompile)
{
NvDlaError e = NvDlaSuccess;
DLAInterface *target_dla = 0;
DLAInterface *dla_if = 0;
Profiler *profiler = 0;
ProfileFactory::ProfilePrivPair p_profile;
Profile *profile = 0;
TargetConfig *target_config = 0;
vector<engine_ast::Graph *> g;
NVDLA_UNUSED(target_dla);
NVDLA_UNUSED(dla_if);


LoadableFactory::LoadablePrivPair l(0, 0);

// && == with fullCompile or otherwise ok during compileCheck?

DumpCanonicalGraphJson dump_can;
DumpEngineGraphJson dump_eng;

if ( !m_wisdom )
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "No wisdom available.");
}

profiler = ProfilerFactory::priv(m_wisdom->getProfiler());
if ( !profiler )
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "No profiler available.");
}
// ????这里为什么要再来一下,总之很迷惑,这个在之前已经解析过了啊
profile = ProfileFactory::priv(profiler->getProfile(tp_name));
if ( !profile )
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "Couldn't find profile to compile.");
}
// 将 target_config_name 转换为 target_config 对象
target_config = TargetConfigFactory::priv(profiler->getTargetConfig(target_config_name));
if ( !target_config )
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "Couldn't find target config to compile.");
}

// 正式的编译在这里
PROPAGATE_ERROR_FAIL( compileInternal(profile, target_config, peli, fullCompile) );
fail:
return e;

}

7. Network 到 canonical_ast

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
//
// the following generates a 1:1 mapping with the Canonical graph input.
//
canonical_ast::Graph *canonical_ast::generateGraph(Network *network)
{
vector<canonical_ast::Edge *> input_edges;
vector<canonical_ast::Edge *> output_edges;

map<canonical_ast::Node *, Layer *, Graph::nodeCompareFn> node_layer;
map<canonical_ast::Node *, Layer *, Graph::nodeCompareFn>::iterator lni;

map<Tensor *, canonical_ast::Edge *> tensor_edge;
map<Tensor *, Tensor *> nw_tensor_to_can_tensor;
map<Tensor *, canonical_ast::Edge *>::iterator tei;

Graph *graph = new Graph();
// 下面这个循环把network中的inputTensor汇总成一个数组,并把Graph中input_edges数组大小设定成
// network中的inputTensor的数组大小
vector<Tensor *> network_inputs;
for (int ni = 0; ni < network->getNumInputs(); ++ni)
{
network_inputs.push_back(TensorFactory::priv(network->getInput(ni)));
}
input_edges.resize(network_inputs.size());

//下面这个循环把network中的outputTensor汇总成一个数组,并把Graph中outputedges数组大小设定成
//network中的outputTensor的数组大小
vector<Tensor *> network_outputs;
for (int ni = 0; ni < network->getNumOutputs(); ++ni)
{
network_outputs.push_back(TensorFactory::priv(network->getOutput(ni)));
}
output_edges.resize(network_outputs.size());

// gLogInfo << "canonical_ast::" << __func__ << " network shows " << network_inputs.size() << " inputs and " <<
// network_outputs.size() << " outputs" << endl;
//下面这个循环迭代network中的layer序列,根据每个layer的信息分别建立Graph中的相应的Node
for (int li = 0; li < network->getNumLayers(); li++)
{
ILayer *ilayer = network->getLayer(li);
Layer *layer = LayerFactory::priv(ilayer);
if ( !(ilayer && layer) )
{
gLogError << __func__ << " encountered null layer at network layer index=" << li << endl;
continue;
}
//根据network中的layer,建立相应的Node
canonical_ast::Node *can_node = newCanonicalNode(layer);
if ( !can_node )
{
delete graph; // blow up
graph = 0;
goto done;
}
can_node->setGraph(graph); // 这样可以从layer直接找到layer所在的graph
graph->insertNode(can_node); // 为 graph 插入node

can_node->setId(graph->nextNodeId()); // 设置 node 的id,n-0 n-1 这样
can_node->setName(layer->getName()); // 设置 node 的name

node_layer[can_node] = layer; // 设置 node 和 layer 的 map 关系

}

//
// Now all the layer nodes are in the graph.
// For each layer assemble the edges.
//
//现在,所有network中的layer都在graph中建立了相应的node,并且这个对应关系也记录在了node_layer的MAP中
//下面循环迭代这个MAP中的每一项
for (lni = node_layer.begin(); lni != node_layer.end(); ++lni)
{
canonical_ast::Node *node = lni->first;
Layer *l = lni->second;

size_t input_tensors = 0, output_tensors = 0, aux_input_tensors = 0;
vector<Tensor *> io_tensors, aux_tensors;
NVDLA_UNUSED(aux_input_tensors);
//针对network中当前迭代的这个layer,找出其全部inputTensors并加入io_tensors列表
for(int ii = 0, II = l->getNumInputs(); ii < II; ++ii)
{
Tensor *tensor = TensorFactory::priv(l->getInput(ii));
if ( !tensor )
{
gLogError << __func__ << " 3.<null>.i." << ii << endl;
continue;
}
io_tensors.push_back(tensor);
input_tensors++;
}
//针对network中当前迭代的这个layer,找出其全部outputTensors并加入io_tensors列表
for(int oo = 0, OO = l->getNumOutputs(); oo < OO; ++oo)
{
Tensor *tensor = TensorFactory::priv(l->getOutput(oo));
if ( ! tensor )
{
gLogError << __func__ << " 3.<null>.o." << oo << endl;
continue;
}
io_tensors.push_back(tensor);
output_tensors++;
}
//针对当前layer,迭代刚刚找到的全部iotensor的列表
for(size_t io = 0, IO = io_tensors.size(); io < IO; ++io)
{
Tensor *nw_tensor = io_tensors[io];
bool is_input = io < input_tensors;//根据当前tensor在列表中的位置判断是input还是output
// edge_side是个enum值,input=SECOND,output=FIRST
ast::EdgeSide edge_side( is_input ? ast::EdgeSideEnum::SECOND : ast::EdgeSideEnum::FIRST);
// edge_dir是个enum值,有单向双向和无方向三种,这里统一设定为单向
ast::EdgeDirection edge_dir(ast::EdgeDirectionEnum::DIRECTED);
// 在tensor_edge映射MAP中查找当前tensor的对应项
map<Tensor *, canonical_ast::Edge *>::iterator f = tensor_edge.find(nw_tensor);
canonical_ast::Edge *can_edge = 0;// graph中的edge
Tensor* can_tensor = 0;// graph中的tensor
if ( f == tensor_edge.end() ) //如果没有在MAP中找到对应项
{
can_edge = new canonical_ast::Edge(); //新建一个graph中的edge
can_edge->setGraph(graph); //把新建的edge的container设定为graph

can_tensor = nw_tensor->clone();//把network中的tensor复制到一个新的变量can_tensor
can_tensor->setNetwork(NULL); //由于这个新的tensor变量将加入graph所以其network指针清空,不在指向原来的network(这里是复制一份tensor,network中原来的tensor还在)
can_tensor->setTensorType(TensorType::kIO);//graph中的tensor设定为IO类型
can_edge->setId(graph->nextEdgeId()); //graph中edge的Id设定为string,e-0,e-1,e-2等
can_edge->setOriginalTensor(can_tensor);//graph中的edge的原始tensor设定为can_tensor,注意,这里的OriginalTensor指向的是从network中复制clone过来的一个副本,并不在network中,可以看出这里的包含关系,graph-->can_edge-->can_tensor
graph->insertEdge(can_edge);//把根据network中1个layer的iotensor新建的edge加入graph列表
//tensor_edge映射MAP加入nw中tensor到graph中edge映射
//nw_tensor_to_can_tensor映射MAP加入nw中tensor到graph中edge的tensor映射
tensor_edge[nw_tensor] = can_edge;
nw_tensor_to_can_tensor[nw_tensor] = can_tensor;
} else {
can_edge = f->second;
}
//把当前新建的edge加入到node的edge_side侧列表当中
graph->appendNodeToEdge(can_edge, edge_side, node);

// if this is an input node it could be one of the network inputs.
// if so keep track of it.
if ( is_input )
{
//迭代整个network的inputTensors列表
for ( size_t iti = 0; iti < network_inputs.size(); iti++)
{
//如果当前node对应的这个inputTensor在整个network的inputTensors列表当中
if ( nw_tensor == network_inputs[iti] )
{
// gLogInfo << " identified input edge: " << (int)iti << " tensor id " << tensor->getName() << endl;
input_edges[iti] = can_edge; //把当前edge加入graph的input_edges列表当中
can_tensor = nw_tensor_to_can_tensor[nw_tensor];
//设定当前tensor属性为INPUT
can_tensor->setTensorType(TensorType::kNW_INPUT);
break;
}
}
node->markInputEdge(can_edge); //告诉当前node,你的这个edge是一个网络inputedge
}
else
{
// 相对的,mark output
for ( size_t oti = 0; oti < network_outputs.size(); oti++)
{
if ( nw_tensor == network_outputs[oti] )
{
// gLogInfo << " identified output edge: " << (int)oti << " tensor id " << tensor->getName() << endl;
output_edges[oti] = can_edge;
can_tensor = nw_tensor_to_can_tensor[nw_tensor];
can_tensor->setTensorType(TensorType::kNW_OUTPUT);
break;
}
}
node->markOutputEdge(can_edge);
}
}
}

if ( input_edges.size() )
{
graph->setInputEdges(input_edges); //设定整个graph的inputedges队列为input_edges
}
if ( output_edges.size() )
{
graph->setOutputEdges(output_edges); //设定整个graph的outputedges队列为output_edges
}

graph->scoredOrdering()->generate(); // graph 计分牌生成,这部分比较复杂
graph->markClean(); // 清除graph的m_dirty脏标志,所有对graph的更改都要设定m_dirty为true

done:
return graph; // 把按照network生成的graph作为返回值返回
}

8. canonical_ast to engine_ast

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
//这个函数完成的是两个graph的转换,通过参数可以看到,输入不仅仅由can_graph,还有编译器的profile和编译目标配置target_config,说明转换后的graph应该反应部分硬件和编译选项的要求
engine_ast::Graph *engine_ast::generateGraph(Profile *profile, TargetConfig *target_config, canonical_ast::Graph *can_graph)
{
NvDlaError e = NvDlaSuccess;
vector<engine_ast::Edge *> input_edges;
vector<engine_ast::Edge *> output_edges;

vector<canonical_ast::Node *> can_edge_first_nodes, can_edge_second_nodes;
map<canonical_ast::Node *, engine_ast::Node *> can_to_eng_sink_node_map, can_to_eng_source_node_map;
map<canonical_ast::Edge *, engine_ast::Edge *> can_to_eng_edge_map;
vector<canonical_ast::Node *>::const_iterator f, begin, end;
vector<engine_ast::Node *> first_nodes, second_nodes;
engine_ast::Graph *eng_graph;

if ( !profile )
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "must associate profile with Engine AST generateGraph");
}

if ( !target_config )
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "must associate target_config with Engine AST generateGraph");
}
//编译目标是否支持批处理
if (target_config->isBatchModeCapable())
{
NvU32 numBatches = profile->multiBatchSize();
NvU32 maxBatches = target_config->maxBatchSize();

if (numBatches > maxBatches)
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadValue, "numbatches is greater than allowed maxbatches (%d)", maxBatches);
}
}
// 建立engine_graph对象,参数是profile和target_config
eng_graph = new engine_ast::Graph(profile, target_config);
if ( !eng_graph )
{
ORIGINATE_ERROR_FAIL(NvDlaError_InsufficientMemory, "Can't create a new Engine AST");
}
//初始化eng_graph的资源,主要是内存池和LutManager,内存池包括GLOBAL_DRAM_POOL,LOCAL_DRAM_POOL
//如果profile开启了SRAM,那么还有LOCAL_CVSRAM_POOL,这三个mempool的大小由profile参数指定
e = eng_graph->initGraphResources();
if (e != NvDlaSuccess)
{
delete eng_graph;
eng_graph = NULL;
ORIGINATE_ERROR_FAIL(NvDlaError_InsufficientMemory, "Couldn't initialize all graph resources");
}
// engine graph 访问计分板
eng_graph->setScoredOrdering( new ScoredDependencyOrdering(eng_graph) );
eng_graph->setOrdering(new DependencyOrdering(eng_graph->scoredOrdering()));

//
// create edges to mirror the canonical edges.
// 这里可以看语法树的转换,canonical_ast 的 edge 是被复制到 engine_ast 上的
for ( set<canonical_ast::Edge *>::iterator cei = can_graph->edges().begin(), CEI = can_graph->edges().end();
cei != CEI; ++cei )
{
//根据canonical_ast::Edge建立engine_ast::Edge对象
engine_ast::Edge* engine_edge = new engine_ast::Edge(*cei);
Tensor* engine_tensor = 0;
if ( !engine_edge )
{
delete eng_graph; // blow up
eng_graph = NULL;
ORIGINATE_ERROR_FAIL(NvDlaError_InsufficientMemory, "Couldn't transform canonical edge '%s' into engine edge", (*cei)->id().c_str());
}
//engine_tensor复制自can_tensor,前面讲过can_tensor其实是clone自network的tensor
engine_tensor = (*cei)->originalTensor()->clone();
engine_tensor->setDataFormat(nvdla::DataFormat::NCHW); // all tensors are NCHW unless otherwise specified
engine_tensor->setNetwork(NULL); // get rid of any connections back to the network builder 这里在创建can_tensor的时候已经变成 NULL 了。。

engine_edge->setGraph(eng_graph); //指定engine_edge的container为eng_graph
engine_edge->setId(eng_graph->nextEdgeId()); //设定engine_edge的Id,string类型,e-0,e-1等
engine_edge->setDataEdge(); //设定edge的type为DATA
engine_edge->setOriginalTensor(engine_tensor); //指定edge关联的tensor

can_to_eng_edge_map[*cei] = engine_edge; //建立can_edge和engine_edge的关联MAP
eng_graph->insertEdge(engine_edge); //把engine_edge加入eng_graph的edge列表

}
//如果没有指定multibatchsize,则根据network的input tensor的n指定推导multibatchsize
//如果指定了multibatchsize,那就按照multibatchsize来执行
if (profile->multiBatchSize() == 0)
{
// Patch up profile->multiBatchSize()
// The compiler should be querying this information from the network instead of the profile

// Collect the multibatch size of the network, based on the input tensor dimensions
// 设置输入的维度
for ( vector<canonical_ast::Edge *>::const_iterator cie = can_graph->inputEdges().begin();
cie != can_graph->inputEdges().end(); ++cie)
{
engine_ast::Edge *input_edge = can_to_eng_edge_map[*cie];
Dims4 networkDims = input_edge->originalTensor()->getDimensions();

//根据input_edge的tensor Dimension的n,设定profile的multibatchsize
PROPAGATE_ERROR_FAIL(profile->setMultiBatchSize(networkDims.n));
}
}

//
// create nodes to mirror the canonical nodes
// 迭代can_graph的所有nodes
for ( set<canonical_ast::Node *>::iterator cni = can_graph->nodes().begin(), CNI = can_graph->nodes().end();
cni != CNI; ++cni )
{
engine_ast::Graph::EdgeSequence engSrcEdges; //engine_graph的SrcEdges
engine_ast::Graph::EdgeSequence engSinkEdges; //engine_graph的SinkEdges
engine_ast::Graph::NodeSequence engNodes; //engine_graph的Nodes
canonical_ast::Graph::EdgeSequence canSrcEdges = can_graph->nodeEdges(*cni, ast::EdgeSideEnum::SECOND); //can_graph的当前node的inputedge的总和
canonical_ast::Graph::EdgeSequence canSinkEdges = can_graph->nodeEdges(*cni, ast::EdgeSideEnum::FIRST); //can_graph的当前node的outputedge的总和
canonical_ast::Graph::EdgeSequenceIterator cei;
// 这里就是复制一下Src、SinkEdge
// 找出所有canSrcEdges对应的engine_edge,放入engSrcEdges列表
for (cei = canSrcEdges.begin(); cei != canSrcEdges.end(); ++cei)
{
engSrcEdges.push_back(can_to_eng_edge_map[*cei]);
}
// 找出所有canSinkEdges对应的engine_edge,放入engSinkEdges列表
for (cei = canSinkEdges.begin(); cei != canSinkEdges.end(); ++cei)
{
engSinkEdges.push_back(can_to_eng_edge_map[*cei]);
}
//从当前的can_node转化出eng_nodes,之所以是end_nodes是因为一个can_node可以对应2,3个eng_nodes
//转换完毕是否把结果的engNodes挂在eng_graph上???需要详细看transformCanNode()函数代码
e = transformCanNode(eng_graph, *cni, engSrcEdges, engSinkEdges, engNodes);
if ( e != NvDlaSuccess )
{
delete eng_graph; // blow up
eng_graph = NULL;
ORIGINATE_ERROR_FAIL(e, "Couldn't transform canonical node '%s' into engine node", (*cni)->id().c_str());
}
//n-0:->n-0:dc-conv-0 n-1:bias-0
//n-1:->n-2:pdp-0
//n-2:->n-3:dc-conv-1 n-4:bias-1
//n-3:->n-5:pdp-1
//n-4:->n-6:fc-0 n-7:bias-2
//n-5:->n-8:sdp-scale-0 n-9:act-0
//n-6:->n-10:fc-1 n-11:bias-3
//n-7:->n-12:cpu-sm-0
//上面列出的就是transformCanNode()函数的转换结果,可以看到1个can_node有可能转换成2个eng_node
//是因为can_node是直接对那个network模型的node,而在engine中,一个network模型中的node有可能是
//需要2个engine前后协同计算才能得到结果,所有这里的eng_node其实已经是映射到硬件上的node了
if ( eng_graph->debugGraphDump() )
{
gLogInfo << (*cni)->id() << ":->";
for (vector<engine_ast::Node *>::iterator ni = engNodes.begin(); ni != engNodes.end(); ++ni)
{
gLogInfo << (*ni)->id() << ":" << (*ni)->name() << " ";
}
gLogInfo << std::endl;
}
}

//迭代can_graph的所有inputEdges
for ( vector<canonical_ast::Edge *>::const_iterator cie = can_graph->inputEdges().begin();
cie != can_graph->inputEdges().end(); ++cie)
{
//找出can_graph的首个inputEdge对应的eng_edge
engine_ast::Edge *first_edge = can_to_eng_edge_map[can_graph->inputEdges().front()];
//当前迭代的can_edge对应的eng_edge
engine_ast::Edge *input_edge = can_to_eng_edge_map[*cie];
//当前eng_edge对应的tensor格式设定为profile指定的InputDataFormat
input_edge->originalTensor()->setDataFormat(profile->networkInputDataFormat());
// 要求所有的inputedge的multibatch参数n必须一致
// Determine if multibatch parameters are consistent for all input tensors
if (first_edge->originalTensor()->getDimensions().n != input_edge->originalTensor()->getDimensions().n)
{
ORIGINATE_ERROR_FAIL(NvDlaError_BadValue, "Input tensor multibatch dimensions mismatch: %d != %d", first_edge->originalTensor()->getDimensions().n, input_edge->originalTensor()->getDimensions().n);
}

Dims4 networkDims = input_edge->originalTensor()->getDimensions();
//拿所有inputedge的multibatch参数n和profile指定的multibatch参数进行比较,如果不一致
//则以profile指定的参数为准,并把inputedge中的tensor变量的networkDims.n更新为profile指定的值
if ( networkDims.n != (NvS32)profile->multiBatchSize() )
{
gLogWarning << "Overriding input multibatch size from " << networkDims.n << " to " << profile->multiBatchSize() << endl;
networkDims.n = profile->multiBatchSize();
input_edge->originalTensor()->setDimensions(networkDims);
}

// if it is IMG input format, ensure #chnls match between model and profile params
// 如果profile指定的输入IMG tensor的channel数与network提供的networkDims.c不一致
// 则以profile设定的input tensor的channel值为准,同时更新engine_graph的inputedge对应的tensor
// 的networkDims.c的值
if ( profile->networkInputSurfaceFormat().category() == surface::SurfaceCategoryEnum::IMG &&
networkDims.c != profile->networkInputSurfaceFormat().channelsPerAtom())
{
gLogWarning << "Prototxt #chnls (C = "
<< networkDims.c
<< ") != Profile #chnls for input ("
<< profile->networkInputSurfaceFormat().c_str()
<< ": C = "
<< (int)profile->networkInputSurfaceFormat().channelsPerAtom()
<< "). Preferring #chnls from Profile for compiling."
<< endl;
networkDims.c = profile->networkInputSurfaceFormat().channelsPerAtom();
input_edge->originalTensor()->setDimensions(networkDims);

// copy the tensor scales and offsets to the extra channel if any
if (input_edge->originalTensor()->getChannelScales().size())
{
NvF32 tensorScale = input_edge->originalTensor()->getChannelScales().at(0);
std::vector<NvF32> channelScales;
for (NvU32 cc = 0; cc < (NvU32)networkDims.c; ++cc)
{
channelScales.push_back(tensorScale);
}
input_edge->originalTensor()->setChannelScales(channelScales);
}

if (input_edge->originalTensor()->getChannelOffsets().size())
{
NvF32 tensorOffset = input_edge->originalTensor()->getChannelOffsets().at(0);
std::vector<NvF32> channelOffsets;
for (NvU32 cc = 0; cc < (NvU32)networkDims.c; ++cc)
{
channelOffsets.push_back(tensorOffset);
}
input_edge->originalTensor()->setChannelOffsets(channelOffsets);
}
}
// 这个bindid好像只是整个图的input和output的edge才设定,这个函数只是设定两个变量而已
// m_bindDomain = bindDomain; m_bindId = id; bingDomain有input output debug三种
// 这个bindid也是随着inputedge的增加顺序往后排
input_edge->setBindId(input_edges.size(), IOD_Input);
if ( eng_graph->debugBinding() )
{
gLogInfo << "EngineAST graph level input edge[" << input_edges.size() << "] is " << input_edge->id() << endl;
gLogInfo << "input bind id: " << input_edge->bindId() << endl;
}

input_edges.push_back( input_edge );
};

// 设定整个eng_graph的inputedge列表为input_edges
if ( input_edges.size() )
{
eng_graph->setInputEdges(input_edges);
}

//按照以上处理inputedge的方法,处理所有的outputedges
for ( vector<canonical_ast::Edge *>::const_iterator coe = can_graph->outputEdges().begin();
coe != can_graph->outputEdges().end(); ++coe)
{
engine_ast::Edge *output_edge = can_to_eng_edge_map[*coe];
output_edge->originalTensor()->setDataFormat(profile->networkOutputDataFormat());

Dims4 networkDims = output_edge->originalTensor()->getDimensions();
if ( networkDims.n != (NvS32)profile->multiBatchSize() )
{
gLogWarning << "Overriding output multibatch size from " << networkDims.n << " to " << profile->multiBatchSize() << endl;
networkDims.n = profile->multiBatchSize();
output_edge->originalTensor()->setDimensions(networkDims);
}

output_edge->setBindId(output_edges.size(), IOD_Output);
if ( eng_graph->debugBinding() )
{
gLogInfo << "EngineAST graph level output edge[" << output_edges.size() << "] is " << output_edge->id() << endl;
gLogInfo << "output bind id: " << output_edge->bindId() << endl;
}

output_edges.push_back( output_edge );
};
//设定整个eng_graph的outputedge列表为output_edges

if ( output_edges.size() )
{
eng_graph->setOutputEdges(output_edges);
}
// 打印所有eng_node的name,编号,以及对应的can_node的name
// 同时打印每个eng_node的所有input output aux类型的edge
//libnvdla<3> dc-conv-0/n-0/conv1:
//libnvdla<3> in e-0
//libnvdla<3> out e-11
//libnvdla<3> aux e-9
//libnvdla<3> bias-0/n-1/conv1:
//libnvdla<3> in e-11
//libnvdla<3> out e-1
//libnvdla<3> aux e-10
// cache input/output/aux edges of each node into their respective data ports
if ( eng_graph->debugGraphDump() )
{
engine_ast::Graph::NodeSet engineNodes = eng_graph->nodes();
engine_ast::Graph::NodeSetIterator eni = engineNodes.begin();
for ( ; eni != engineNodes.end(); ++eni)
{
typedef std::vector<Edge*>::const_iterator ESI;

std::string canNodeName;
if ((*eni)->canonicalNode() == NULL)
{
canNodeName = "(No canonical node)";
}
else
{
canNodeName = (*eni)->canonicalNode()->name();
}
gLogInfo << (*eni)->name() << "/" << (*eni)->id() << "/"
<< canNodeName << ":" << endl;
for (ESI ii = (*eni)->inputEdges().begin(); ii != (*eni)->inputEdges().end(); ++ii)
gLogInfo << "\tin " << (*ii)->id() << endl;
for (ESI ii = (*eni)->outputEdges().begin(); ii != (*eni)->outputEdges().end(); ++ii)
gLogInfo << "\tout " << (*ii)->id() << endl;
for (ESI ii = (*eni)->auxEdges().begin(); ii != (*eni)->auxEdges().end(); ++ii)
gLogInfo << "\taux " << (*ii)->id() << endl;
}
}

eng_graph->ordering()->generate();
eng_graph->markClean();

// force N = 1 for all non-Aux tensors represented by non-bindable edges;
// until we allow contiguous non-bindable tensors for multi-batch
{
engine_ast::Graph::EdgeSequence engineEdges = eng_graph->orderedEdges();
for (engine_ast::Graph::EdgeSequenceIterator eei = engineEdges.begin(); eei != engineEdges.end(); ++eei)
{
if (!(*eei)->bindable() && !(*eei)->isAuxEdge() && (*eei)->originalTensor())
{
Dims4 nonBindableTensorDims = (*eei)->originalTensor()->getDimensions();
if ( eng_graph->debugGraphDump() )
{
if (nonBindableTensorDims.n != 1)
gLogInfo << "Forcing batch size '1' for non-bindable non-aux edge " << (*eei)->id() << endl;
}
nonBindableTensorDims.n = 1;
(*eei)->originalTensor()->setDimensions(nonBindableTensorDims);
}
}
}
return eng_graph;

fail:
return NULL;
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
NvDlaError engine_ast::transformCanNode
(
engine_ast::Graph* engGraph,
canonical_ast::Node *canNode,
engine_ast::Graph::EdgeSequence engSrcEdges,
engine_ast::Graph::EdgeSequence engSinkEdges,
engine_ast::Graph::NodeSequence& transformedEngNodes
)
{
NvDlaError e = NvDlaSuccess;
switch (canNode->canonicalOpType().v())
{
case canonical_ast::CONVOLUTION:
PROPAGATE_ERROR_FAIL(transformCanConvOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::FULLY_CONNECTED:
PROPAGATE_ERROR_FAIL(transformCanFCOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::ACTIVATION:
PROPAGATE_ERROR_FAIL(transformCanActOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::POOLING:
PROPAGATE_ERROR_FAIL(transformCanPoolingOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::LRN:
PROPAGATE_ERROR_FAIL(transformCanLRNOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::SCALE:
PROPAGATE_ERROR_FAIL(transformCanScaleOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::BATCH_NORM:
PROPAGATE_ERROR_FAIL(transformCanBNOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::SOFTMAX:
PROPAGATE_ERROR_FAIL(transformCanSoftMaxOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::DECONVOLUTION:
PROPAGATE_ERROR_FAIL(transformCanDeconvOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::CONCATENATION:
PROPAGATE_ERROR_FAIL(transformCanConcatOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::ELEMENTWISE:
PROPAGATE_ERROR_FAIL(transformCanEWOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
case canonical_ast::SPLIT:
PROPAGATE_ERROR_FAIL(transformCanSplitOp(engGraph, canNode, engSrcEdges, engSinkEdges, transformedEngNodes)); break;
default:
ORIGINATE_ERROR_FAIL(NvDlaError_BadParameter, "Unexpected canonical node '%s' of type '%s' ", canNode->id().c_str(), canNode->canonicalOpType().c_str());
}
fail:
return e;
}

4. Compiler 量化原理

在C程序中,可以用宏代码提高执行效率。宏代码本身不是函数,但使用起来象函数。预处理器用复制宏代码的方式代替函数调用,省去了参数压栈、生成汇编语言的CALL调用、返回参数、执行return等过程,从而提高了速度。 但使用宏代码最大的缺点是容易出错,预处理器在复制宏代码时常常产生意想不到的边际效应。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
#define PRECISION_SWITCH(modelPrec, computePrec, retVal, func, ...)     \
switch(modelPrec) { \
case nvdla::DataType::INT8: \
switch(computePrec) { \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT8: \
retVal = func<NvS8, NvS8>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT16: \
retVal = func<NvS8, NvS16>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_FP16: \
retVal = func<NvS8, half_float::half>(__VA_ARGS__); break; \
default: \
REPORT_ERROR(NvDlaError_NotSupported, "Don't support %d", computePrec); \
}; break; \
case nvdla::DataType::INT16: \
switch(computePrec) { \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT8: \
retVal = func<NvS16, NvS8>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT16: \
retVal = func<NvS16, NvS16>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_FP16: \
retVal = func<NvS16, half_float::half>(__VA_ARGS__); break; \
default: \
REPORT_ERROR(NvDlaError_NotSupported, "Don't support %d", computePrec); \
}; break; \
case nvdla::DataType::HALF: \
switch(computePrec) { \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT8: \
retVal = func<half_float::half, NvS8>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT16: \
retVal = func<half_float::half, NvS16>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_FP16: \
retVal = func<half_float::half, half_float::half>(__VA_ARGS__); break; \
default: \
REPORT_ERROR(NvDlaError_NotSupported, "Don't support %d", computePrec); \
}; break; \
case nvdla::DataType::FLOAT: \
switch(computePrec) { \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT8: \
retVal = func<NvF32, NvS8>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_INT16: \
retVal = func<NvF32, NvS16>(__VA_ARGS__); break; \
case surface::SurfacePrecisionEnum::NVDLA_PRECISION_FP16: \
retVal = func<NvF32, half_float::half>(__VA_ARGS__); break; \
default: \
REPORT_ERROR(NvDlaError_NotSupported, "Don't support %d", computePrec); \
}; break; \
default: \
REPORT_ERROR(NvDlaError_NotSupported, "Don't support %d", modelPrec); \
}

具体的量化套路在 preProcessAuxData 和 quantizeAuxData 两个函数中:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
template <typename MP, typename CP>
static std::vector<NvF32> perKernelQuantizeWts
(
Weights highPrecWts,
NvS32 G, NvS32 K, NvS32 C, NvS32 RS, NvS32 kStride, NvS32 cStride,
NvS8* quantizedWts
)
{
std::vector<NvF32> filterScales;
NvF32 max = std::numeric_limits<NvF32>::lowest(); // 先把 max 初始化为 FLoat 32 的最小值
const MP* origWts = reinterpret_cast<const MP*>(const_cast<void*>(highPrecWts.values));
// 获取所有Tensor的最大值
for (NvS32 g = 0; g < G; g++)
{
NvS32 gOffset = g * K * C * RS;
for (NvS32 k = 0; k < K; k++)
{
for (NvS32 c = 0; c < C; c++)
{
for (NvS32 rs = 0; rs < RS; rs++)
// max = std::max(max, std::abs(origWts[gOffset + k * kStride + c * cStride + rs] * inScale));
max = std::max<NvF32>(max, std::fabs(origWts[gOffset + k * kStride + c * cStride + rs]));
}
}
}

NvF32 scale = max / 127, invScale = 1 / scale;
// 开始量化
for (NvS32 g = 0; g < G; g++)
{
NvS32 gOffset = g * K * C * RS;
for (NvS32 k = 0; k < K; k++)
{
for (NvS32 c = 0; c < C; c++)
{
for (NvS32 rs = 0; rs < RS; rs++)
{
NvS32 index = gOffset + k * kStride + c * cStride + rs;
// quantizedWts[index] = int8_t(std::floor(origWts[index] * inScale * invScale + 0.5f));

// quantizedWts[index] = static_cast<NvS8>(std::floor(origWts[index] * invScale + 0.5f));
NvS32 int32Weight = static_cast<NvS32>(std::floor(origWts[index] * invScale + 0.5f));
quantizedWts[index] = static_cast<NvS8>(std::max(std::min(int32Weight, static_cast<NvS32>(std::numeric_limits<NvS8>::max())),
static_cast<NvS32>(std::numeric_limits<NvS8>::lowest())));
}
}
filterScales.push_back(scale);
}
}
return filterScales;
}

$$
Scale = \frac{max}{127} \
invScale = \frac{1}{Scale} = \frac{127}{max} \
Tensor_{int32} = floor(Tensor_{float32} * invScale + 0.5) = floor(\frac{Tensor_{float32}}{max} * 127 + 0.5) \
Tensor_{int8} = max(min(Tensor_{int32}, 127), -128)
$$

待解决的问题

  1. 量化前后的数据输出对比。
  2. symbol table (符号表)有什么用?
  3. Graph 计分牌机制?
  4. raw weights of bias 到底是权重数据还是bias数据
ChinaDA 6th 2021 会议报告:EDA、专用处理器与新型架构设计技术 NASGuard: A Novel Accelerator Architecture for Robust Neural Architecture Search (NAS) Networks

Comments

Your browser is out-of-date!

Update your browser to view this website correctly. Update my browser now

×