Class: dlhdl.Workflow
Package: dlhdl
Compile workflow object
compile compiles the dlhdl.Workflow object and generates the parameters for deploying the network on the target device.
compile(Name,Value) compiles the dlhdl.Workflow object and generates the parameters for deploying the network on the target device, with additional options specified by one or more Name,Value pair arguments.
The function returns two matrices. One matrix describes the layers of the network. The Conv Controller (Scheduling) and FC Controller (Scheduling) modules in the deep learning processor IP use this matrix to schedule the convolution and fully connected layer operations. The second matrix contains the weights, biases, and inputs of the neural network. This information is loaded into DDR memory and used by the Generic Convolution Processor and the Generic FC Processor in the deep learning processor.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name-value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
InputFrameNumberLimit — Maximum input frame number limit. Parameter that specifies the maximum input frame number limit used to calculate the DDR memory access allocation.
Example: 'InputFrameNumberLimit',30
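A minimal sketch of the two calling syntaxes (assuming hW is an existing dlhdl.Workflow object, as created in the examples below):

dn = hW.compile;                              % compile with default options
dn = hW.compile('InputFrameNumberLimit',30);  % limit the input frame number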
Compile dlhdl.Workflow object — Compile the dlhdl.Workflow object for deployment to the Intel® Arria® 10 SoC development kit, which uses single data types.
Create a dlhdl.Workflow object and then use the compile function to deploy the pretrained network to the target hardware.
snet = vgg19;
hT = dlhdl.Target('Intel');
hW = dlhdl.Workflow('network', snet, 'Bitstream', 'arria10soc_single', 'Target', hT);
hW.compile
Once the code is executed, the result is:
hW.compile
    offset_name              offset_address    allocated_space
    _______________________  ______________    _________________
    "InputDataOffset"        "0x00000000"      "24.0 MB"
    "OutputResultOffset"     "0x01800000"      "4.0 MB"
    "SystemBufferOffset"     "0x01c00000"      "52.0 MB"
    "InstructionDataOffset"  "0x05000000"      "20.0 MB"
    "ConvWeightDataOffset"   "0x06400000"      "276.0 MB"
    "FCWeightDataOffset"     "0x17800000"      "472.0 MB"
    "EndOffset"              "0x35000000"      "Total: 848.0 MB"
ans =
struct with fields:
Operators: [1×1 struct]
LayerConfigs: [1×1 struct]
NetConfigs: [1×1 struct]
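The fields of the returned struct can be inspected individually. A hedged sketch, using the field names from the display above:

dn = hW.compile;
dn.NetConfigs    % network-level configuration parameters
dn.LayerConfigs  % per-layer configuration parameters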
Create a dlhdl.Workflow object and then use the compile function with the optional InputFrameNumberLimit argument to deploy the pretrained network to the target hardware.
snet = alexnet;
hT = dlhdl.Target('Xilinx');
hW = dlhdl.Workflow('network', snet, 'Bitstream', 'zcu102_single', 'Target', hT);
hW.compile('InputFrameNumberLimit',30);
The result of running the code is:
### Compiling network for Deep Learning FPGA prototyping ...
### Targeting FPGA bitstream zcu102_single ...
### The network includes the following layers:
1 'data' Image Input 227×227×3 images with 'zerocenter' normalization (SW Layer)
2 'conv1' Convolution 96 11×11×3 convolutions with stride [4 4] and padding [0 0 0 0] (HW Layer)
3 'relu1' ReLU ReLU (HW Layer)
4 'norm1' Cross Channel Normalization cross channel normalization with 5 channels per element (HW Layer)
5 'pool1' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0] (HW Layer)
6 'conv2' Grouped Convolution 2 groups of 128 5×5×48 convolutions with stride [1 1] and padding [2 2 2 2] (HW Layer)
7 'relu2' ReLU ReLU (HW Layer)
8 'norm2' Cross Channel Normalization cross channel normalization with 5 channels per element (HW Layer)
9 'pool2' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0] (HW Layer)
10 'conv3' Convolution 384 3×3×256 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
11 'relu3' ReLU ReLU (HW Layer)
12 'conv4' Grouped Convolution 2 groups of 192 3×3×192 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
13 'relu4' ReLU ReLU (HW Layer)
14 'conv5' Grouped Convolution 2 groups of 128 3×3×192 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
15 'relu5' ReLU ReLU (HW Layer)
16 'pool5' Max Pooling 3×3 max pooling with stride [2 2] and padding [0 0 0 0] (HW Layer)
17 'fc6' Fully Connected 4096 fully connected layer (HW Layer)
18 'relu6' ReLU ReLU (HW Layer)
19 'drop6' Dropout 50% dropout (HW Layer)
20 'fc7' Fully Connected 4096 fully connected layer (HW Layer)
21 'relu7' ReLU ReLU (HW Layer)
22 'drop7' Dropout 50% dropout (HW Layer)
23 'fc8' Fully Connected 1000 fully connected layer (HW Layer)
24 'prob' Softmax softmax (SW Layer)
25 'output' Classification Output crossentropyex with 'tench' and 999 other classes (SW Layer)
3 Memory Regions created.
Skipping: data
Compiling leg: conv1>>pool5 ...
Compiling leg: conv1>>pool5 ... complete.
Compiling leg: fc6>>fc8 ...
Compiling leg: fc6>>fc8 ... complete.
Skipping: prob
Skipping: output
Creating Schedule...
.......
Creating Schedule...complete.
Creating Status Table...
......
Creating Status Table...complete.
Emitting Schedule...
......
Emitting Schedule...complete.
Emitting Status Table...
........
Emitting Status Table...complete.
### Allocating external memory buffers:
    offset_name              offset_address    allocated_space
    _______________________  ______________    _________________
    "InputDataOffset"        "0x00000000"      "24.0 MB"
    "OutputResultOffset"     "0x01800000"      "4.0 MB"
    "SchedulerDataOffset"    "0x01c00000"      "4.0 MB"
    "SystemBufferOffset"     "0x02000000"      "28.0 MB"
    "InstructionDataOffset"  "0x03c00000"      "4.0 MB"
    "ConvWeightDataOffset"   "0x04000000"      "16.0 MB"
    "FCWeightDataOffset"     "0x05000000"      "224.0 MB"
    "EndOffset"              "0x13000000"      "Total: 304.0 MB"
### Network compilation complete.
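After compilation completes, a typical next step is to deploy the compiled network to the target board with the deploy method of dlhdl.Workflow. A hedged sketch of that step (it is not part of this example's output):

hW.deploy;  % program the bitstream and load the compiled network onto the target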
dagnet network object — Create a dlhdl.Workflow object with resnet18 as the network for deployment to a Xilinx® Zynq® UltraScale+™ MPSoC ZCU102 board, which uses single data types.
snet = resnet18;
hTarget = dlhdl.Target('Xilinx');
hW = dlhdl.Workflow('N', snet, 'B', 'zcu102_single', 'T', hTarget);
Call the compile function on hW.
hW.compile
Calling the compile function returns:
### Compiling network for Deep Learning FPGA prototyping ...
### Targeting FPGA bitstream zcu102_single ...
### The network includes the following layers:
1 'data' Image Input 224×224×3 images with 'zscore' normalization (SW Layer)
2 'conv1' Convolution 64 7×7×3 convolutions with stride [2 2] and padding [3 3 3 3] (HW Layer)
3 'bn_conv1' Batch Normalization Batch normalization with 64 channels (HW Layer)
4 'conv1_relu' ReLU ReLU (HW Layer)
5 'pool1' Max Pooling 3×3 max pooling with stride [2 2] and padding [1 1 1 1] (HW Layer)
6 'res2a_branch2a' Convolution 64 3×3×64 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
7 'bn2a_branch2a' Batch Normalization Batch normalization with 64 channels (HW Layer)
8 'res2a_branch2a_relu' ReLU ReLU (HW Layer)
9 'res2a_branch2b' Convolution 64 3×3×64 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
10 'bn2a_branch2b' Batch Normalization Batch normalization with 64 channels (HW Layer)
11 'res2a' Addition Element-wise addition of 2 inputs (HW Layer)
12 'res2a_relu' ReLU ReLU (HW Layer)
13 'res2b_branch2a' Convolution 64 3×3×64 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
14 'bn2b_branch2a' Batch Normalization Batch normalization with 64 channels (HW Layer)
15 'res2b_branch2a_relu' ReLU ReLU (HW Layer)
16 'res2b_branch2b' Convolution 64 3×3×64 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
17 'bn2b_branch2b' Batch Normalization Batch normalization with 64 channels (HW Layer)
18 'res2b' Addition Element-wise addition of 2 inputs (HW Layer)
19 'res2b_relu' ReLU ReLU (HW Layer)
20 'res3a_branch2a' Convolution 128 3×3×64 convolutions with stride [2 2] and padding [1 1 1 1] (HW Layer)
21 'bn3a_branch2a' Batch Normalization Batch normalization with 128 channels (HW Layer)
22 'res3a_branch2a_relu' ReLU ReLU (HW Layer)
23 'res3a_branch2b' Convolution 128 3×3×128 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
24 'bn3a_branch2b' Batch Normalization Batch normalization with 128 channels (HW Layer)
25 'res3a' Addition Element-wise addition of 2 inputs (HW Layer)
26 'res3a_relu' ReLU ReLU (HW Layer)
27 'res3a_branch1' Convolution 128 1×1×64 convolutions with stride [2 2] and padding [0 0 0 0] (HW Layer)
28 'bn3a_branch1' Batch Normalization Batch normalization with 128 channels (HW Layer)
29 'res3b_branch2a' Convolution 128 3×3×128 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
30 'bn3b_branch2a' Batch Normalization Batch normalization with 128 channels (HW Layer)
31 'res3b_branch2a_relu' ReLU ReLU (HW Layer)
32 'res3b_branch2b' Convolution 128 3×3×128 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
33 'bn3b_branch2b' Batch Normalization Batch normalization with 128 channels (HW Layer)
34 'res3b' Addition Element-wise addition of 2 inputs (HW Layer)
35 'res3b_relu' ReLU ReLU (HW Layer)
36 'res4a_branch2a' Convolution 256 3×3×128 convolutions with stride [2 2] and padding [1 1 1 1] (HW Layer)
37 'bn4a_branch2a' Batch Normalization Batch normalization with 256 channels (HW Layer)
38 'res4a_branch2a_relu' ReLU ReLU (HW Layer)
39 'res4a_branch2b' Convolution 256 3×3×256 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
40 'bn4a_branch2b' Batch Normalization Batch normalization with 256 channels (HW Layer)
41 'res4a' Addition Element-wise addition of 2 inputs (HW Layer)
42 'res4a_relu' ReLU ReLU (HW Layer)
43 'res4a_branch1' Convolution 256 1×1×128 convolutions with stride [2 2] and padding [0 0 0 0] (HW Layer)
44 'bn4a_branch1' Batch Normalization Batch normalization with 256 channels (HW Layer)
45 'res4b_branch2a' Convolution 256 3×3×256 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
46 'bn4b_branch2a' Batch Normalization Batch normalization with 256 channels (HW Layer)
47 'res4b_branch2a_relu' ReLU ReLU (HW Layer)
48 'res4b_branch2b' Convolution 256 3×3×256 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
49 'bn4b_branch2b' Batch Normalization Batch normalization with 256 channels (HW Layer)
50 'res4b' Addition Element-wise addition of 2 inputs (HW Layer)
51 'res4b_relu' ReLU ReLU (HW Layer)
52 'res5a_branch2a' Convolution 512 3×3×256 convolutions with stride [2 2] and padding [1 1 1 1] (HW Layer)
53 'bn5a_branch2a' Batch Normalization Batch normalization with 512 channels (HW Layer)
54 'res5a_branch2a_relu' ReLU ReLU (HW Layer)
55 'res5a_branch2b' Convolution 512 3×3×512 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
56 'bn5a_branch2b' Batch Normalization Batch normalization with 512 channels (HW Layer)
57 'res5a' Addition Element-wise addition of 2 inputs (HW Layer)
58 'res5a_relu' ReLU ReLU (HW Layer)
59 'res5a_branch1' Convolution 512 1×1×256 convolutions with stride [2 2] and padding [0 0 0 0] (HW Layer)
60 'bn5a_branch1' Batch Normalization Batch normalization with 512 channels (HW Layer)
61 'res5b_branch2a' Convolution 512 3×3×512 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
62 'bn5b_branch2a' Batch Normalization Batch normalization with 512 channels (HW Layer)
63 'res5b_branch2a_relu' ReLU ReLU (HW Layer)
64 'res5b_branch2b' Convolution 512 3×3×512 convolutions with stride [1 1] and padding [1 1 1 1] (HW Layer)
65 'bn5b_branch2b' Batch Normalization Batch normalization with 512 channels (HW Layer)
66 'res5b' Addition Element-wise addition of 2 inputs (HW Layer)
67 'res5b_relu' ReLU ReLU (HW Layer)
68 'pool5' Global Average Pooling Global average pooling (HW Layer)
69 'fc1000' Fully Connected 1000 fully connected layer (HW Layer)
70 'prob' Softmax softmax (SW Layer)
71 'ClassificationLayer_predictions' Classification Output crossentropyex with 'tench' and 999 other classes (SW Layer)
### Optimizing series network: Fused 'nnet.cnn.layer.BatchNormalizationLayer' into 'nnet.cnn.layer.Convolution2DLayer'
5 Memory Regions created.
Skipping: data
Compiling leg: conv1>>pool1 ...
Compiling leg: conv1>>pool1 ... complete.
Compiling leg: res2a_branch2a>>res2a_branch2b ...
Compiling leg: res2a_branch2a>>res2a_branch2b ... complete.
Compiling leg: res2b_branch2a>>res2b_branch2b ...
Compiling leg: res2b_branch2a>>res2b_branch2b ... complete.
Compiling leg: res3a_branch2a>>res3a_branch2b ...
Compiling leg: res3a_branch2a>>res3a_branch2b ... complete.
Compiling leg: res3a_branch1 ...
Compiling leg: res3a_branch1 ... complete.
Compiling leg: res3b_branch2a>>res3b_branch2b ...
Compiling leg: res3b_branch2a>>res3b_branch2b ... complete.
Compiling leg: res4a_branch2a>>res4a_branch2b ...
Compiling leg: res4a_branch2a>>res4a_branch2b ... complete.
Compiling leg: res4a_branch1 ...
Compiling leg: res4a_branch1 ... complete.
Compiling leg: res4b_branch2a>>res4b_branch2b ...
Compiling leg: res4b_branch2a>>res4b_branch2b ... complete.
Compiling leg: res5a_branch2a>>res5a_branch2b ...
Compiling leg: res5a_branch2a>>res5a_branch2b ... complete.
Compiling leg: res5a_branch1 ...
Compiling leg: res5a_branch1 ... complete.
Compiling leg: res5b_branch2a>>res5b_branch2b ...
Compiling leg: res5b_branch2a>>res5b_branch2b ... complete.
Compiling leg: pool5 ...
Compiling leg: pool5 ... complete.
Compiling leg: fc1000 ...
Compiling leg: fc1000 ... complete.
Skipping: prob
Skipping: ClassificationLayer_predictions
Creating Schedule...
...........................
Creating Schedule...complete.
Creating Status Table...
..........................
Creating Status Table...complete.
Emitting Schedule...
..........................
Emitting Schedule...complete.
Emitting Status Table...
............................
Emitting Status Table...complete.
### Allocating external memory buffers:
    offset_name              offset_address    allocated_space
    _______________________  ______________    _________________
    "InputDataOffset"        "0x00000000"      "24.0 MB"
    "OutputResultOffset"     "0x01800000"      "4.0 MB"
    "SchedulerDataOffset"    "0x01c00000"      "4.0 MB"
    "SystemBufferOffset"     "0x02000000"      "28.0 MB"
    "InstructionDataOffset"  "0x03c00000"      "4.0 MB"
    "ConvWeightDataOffset"   "0x04000000"      "52.0 MB"
    "FCWeightDataOffset"     "0x07400000"      "4.0 MB"
    "EndOffset"              "0x07800000"      "Total: 120.0 MB"
### Network compilation complete.
ans =
struct with fields:
weights: [1×1 struct]
instructions: [1×1 struct]
registers: [1×1 struct]
syncInstructions: [1×1 struct]
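The fields of this struct hold the artifacts that are written to the target during deployment. A hedged sketch of inspecting them, using the field names from the display above:

dn = hW.compile;
dn.weights        % weights, biases, and inputs loaded into DDR memory
dn.instructions   % scheduling instructions used by the deep learning processor IP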