compile

Class: dlhdl.Workflow
Package: dlhdl

Compile workflow object

Syntax

Description

compile compiles the dlhdl.Workflow object and generates the parameters for deploying the network on the target device.

compile(Name,Value) compiles the dlhdl.Workflow object and generates the parameters for deploying the network on the target device, with additional options specified by one or more Name,Value pair arguments.

The function returns two matrices. One matrix describes the layers of the network. The Conv Controller (Scheduling) and FC Controller (Scheduling) modules in the deep learning processor IP use this matrix to schedule the convolution and fully connected layer operations. The second matrix contains the weights, biases, and inputs of the neural network. This information is loaded into DDR memory and used by the Generic Convolution Processor and Generic FC Processor in the deep learning processor.

Input Arguments

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

InputFrameNumberLimit: parameter that specifies the maximum number of input frames, used to calculate the DDR memory access allocation.

Example: 'InputFrameNumberLimit',30

Examples

Compile the dlhdl.Workflow object for deployment to the Intel® Arria® 10 SoC development kit, which uses single data types.

Create a dlhdl.Workflow object and then use the compile function to prepare the pretrained network for deployment to the target hardware.

snet = vgg19;
hT = dlhdl.Target('Intel');
hW = dlhdl.Workflow('network', snet, 'Bitstream', 'arria10soc_single','Target',hT);
hW.compile

After the code runs, the result is:

  hW.compile
          offset_name          offset_address     allocated_space 
    _______________________    ______________    _________________

    "InputDataOffset"           "0x00000000"     "24.0 MB"        
    "OutputResultOffset"        "0x01800000"     "4.0 MB"         
    "SystemBufferOffset"        "0x01c00000"     "52.0 MB"        
    "InstructionDataOffset"     "0x05000000"     "20.0 MB"        
    "ConvWeightDataOffset"      "0x06400000"     "276.0 MB"       
    "FCWeightDataOffset"        "0x17800000"     "472.0 MB"       
    "EndOffset"                 "0x35000000"     "Total: 848.0 MB"


ans = 

  struct with fields:

       Operators: [1×1 struct]
    LayerConfigs: [1×1 struct]
      NetConfigs: [1×1 struct]
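
As a sanity check on the offset table above, each offset_address appears to be the running sum of the preceding allocated_space values, with 1 MB taken as 2^20 bytes. The following MATLAB sketch reproduces the addresses from the sizes alone. The variable names are illustrative, and the contiguous-layout rule is an assumption inferred from the printed values, not something the toolbox documentation states here.

```matlab
% Region names and sizes (in MB) copied from the table printed by hW.compile.
names   = ["InputDataOffset" "OutputResultOffset" "SystemBufferOffset" ...
           "InstructionDataOffset" "ConvWeightDataOffset" "FCWeightDataOffset"];
sizesMB = [24 4 52 20 276 472];

% Assumption: regions are laid out back to back, so each region starts
% where the previous one ends. Start offsets are then the running sum of
% the preceding sizes, converted to bytes.
bytesPerMB = 2^20;
startBytes = [0 cumsum(sizesMB(1:end-1))] * bytesPerMB;
endBytes   = sum(sizesMB) * bytesPerMB;

% Format as 8-digit lowercase hexadecimal, matching the compile output.
offsets   = compose("0x%08x", startBytes.');
endOffset = compose("0x%08x", endBytes);

disp(table(names.', offsets, 'VariableNames', {'offset_name','offset_address'}))
fprintf('EndOffset %s, total %.1f MB\n', endOffset, sum(sizesMB));
```

Substituting the sizes from the other tables on this page into sizesMB gives the same kind of check for those outputs.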

 

  1. Create a dlhdl.Workflow object and then use the compile function with the optional InputFrameNumberLimit argument to prepare the pretrained network for deployment to the target hardware.

    snet = alexnet;
    hT = dlhdl.Target('Xilinx');
    hW = dlhdl.Workflow('network', snet, 'Bitstream', 'zcu102_single','Target',hT);
    hW.compile('InputFrameNumberLimit',30);
  2. The result of running the code is:

    ### Compiling network for Deep Learning FPGA prototyping ...
    ### Targeting FPGA bitstream zcu102_single ...
    ### The network includes the following layers:
    
         1   'data'     Image Input                   227×227×3 images with 'zerocenter' normalization                                  (SW Layer)
         2   'conv1'    Convolution                   96 11×11×3 convolutions with stride [4  4] and padding [0  0  0  0]               (HW Layer)
         3   'relu1'    ReLU                          ReLU                                                                              (HW Layer)
         4   'norm1'    Cross Channel Normalization   cross channel normalization with 5 channels per element                           (HW Layer)
         5   'pool1'    Max Pooling                   3×3 max pooling with stride [2  2] and padding [0  0  0  0]                       (HW Layer)
         6   'conv2'    Grouped Convolution           2 groups of 128 5×5×48 convolutions with stride [1  1] and padding [2  2  2  2]   (HW Layer)
         7   'relu2'    ReLU                          ReLU                                                                              (HW Layer)
         8   'norm2'    Cross Channel Normalization   cross channel normalization with 5 channels per element                           (HW Layer)
         9   'pool2'    Max Pooling                   3×3 max pooling with stride [2  2] and padding [0  0  0  0]                       (HW Layer)
        10   'conv3'    Convolution                   384 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]              (HW Layer)
        11   'relu3'    ReLU                          ReLU                                                                              (HW Layer)
        12   'conv4'    Grouped Convolution           2 groups of 192 3×3×192 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        13   'relu4'    ReLU                          ReLU                                                                              (HW Layer)
        14   'conv5'    Grouped Convolution           2 groups of 128 3×3×192 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        15   'relu5'    ReLU                          ReLU                                                                              (HW Layer)
        16   'pool5'    Max Pooling                   3×3 max pooling with stride [2  2] and padding [0  0  0  0]                       (HW Layer)
        17   'fc6'      Fully Connected               4096 fully connected layer                                                        (HW Layer)
        18   'relu6'    ReLU                          ReLU                                                                              (HW Layer)
        19   'drop6'    Dropout                       50% dropout                                                                       (HW Layer)
        20   'fc7'      Fully Connected               4096 fully connected layer                                                        (HW Layer)
        21   'relu7'    ReLU                          ReLU                                                                              (HW Layer)
        22   'drop7'    Dropout                       50% dropout                                                                       (HW Layer)
        23   'fc8'      Fully Connected               1000 fully connected layer                                                        (HW Layer)
        24   'prob'     Softmax                       softmax                                                                           (SW Layer)
        25   'output'   Classification Output         crossentropyex with 'tench' and 999 other classes                                 (SW Layer)
    
    3 Memory Regions created.
    
    Skipping: data
    Compiling leg: conv1>>pool5 ...
    Compiling leg: conv1>>pool5 ... complete.
    Compiling leg: fc6>>fc8 ...
    Compiling leg: fc6>>fc8 ... complete.
    Skipping: prob
    Skipping: output
    Creating Schedule...
    .......
    Creating Schedule...complete.
    Creating Status Table...
    ......
    Creating Status Table...complete.
    Emitting Schedule...
    ......
    Emitting Schedule...complete.
    Emitting Status Table...
    ........
    Emitting Status Table...complete.
    
    ### Allocating external memory buffers:
    
              offset_name          offset_address     allocated_space 
        _______________________    ______________    _________________
    
        "InputDataOffset"           "0x00000000"     "24.0 MB"        
        "OutputResultOffset"        "0x01800000"     "4.0 MB"         
        "SchedulerDataOffset"       "0x01c00000"     "4.0 MB"         
        "SystemBufferOffset"        "0x02000000"     "28.0 MB"        
        "InstructionDataOffset"     "0x03c00000"     "4.0 MB"         
        "ConvWeightDataOffset"      "0x04000000"     "16.0 MB"        
        "FCWeightDataOffset"        "0x05000000"     "224.0 MB"       
        "EndOffset"                 "0x13000000"     "Total: 304.0 MB"
    
    ### Network compilation complete.
     

  1. Create a dlhdl.Workflow object with resnet18 as the network for deployment to a Xilinx® Zynq® UltraScale+™ MPSoC ZCU102 board that uses single data types.

    snet = resnet18;
    hTarget = dlhdl.Target('Xilinx');
    hW = dlhdl.Workflow('Network',snet,'Bitstream','zcu102_single','Target',hTarget);
  2. Call the compile function on hW.

    hW.compile

    Calling the compile function returns:

    ### Compiling network for Deep Learning FPGA prototyping ...
    ### Targeting FPGA bitstream zcu102_single ...
    ### The network includes the following layers:
    
         1   'data'                              Image Input              224×224×3 images with 'zscore' normalization                          (SW Layer)
         2   'conv1'                             Convolution              64 7×7×3 convolutions with stride [2  2] and padding [3  3  3  3]     (HW Layer)
         3   'bn_conv1'                          Batch Normalization      Batch normalization with 64 channels                                  (HW Layer)
         4   'conv1_relu'                        ReLU                     ReLU                                                                  (HW Layer)
         5   'pool1'                             Max Pooling              3×3 max pooling with stride [2  2] and padding [1  1  1  1]           (HW Layer)
         6   'res2a_branch2a'                    Convolution              64 3×3×64 convolutions with stride [1  1] and padding [1  1  1  1]    (HW Layer)
         7   'bn2a_branch2a'                     Batch Normalization      Batch normalization with 64 channels                                  (HW Layer)
         8   'res2a_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
         9   'res2a_branch2b'                    Convolution              64 3×3×64 convolutions with stride [1  1] and padding [1  1  1  1]    (HW Layer)
        10   'bn2a_branch2b'                     Batch Normalization      Batch normalization with 64 channels                                  (HW Layer)
        11   'res2a'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        12   'res2a_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        13   'res2b_branch2a'                    Convolution              64 3×3×64 convolutions with stride [1  1] and padding [1  1  1  1]    (HW Layer)
        14   'bn2b_branch2a'                     Batch Normalization      Batch normalization with 64 channels                                  (HW Layer)
        15   'res2b_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
        16   'res2b_branch2b'                    Convolution              64 3×3×64 convolutions with stride [1  1] and padding [1  1  1  1]    (HW Layer)
        17   'bn2b_branch2b'                     Batch Normalization      Batch normalization with 64 channels                                  (HW Layer)
        18   'res2b'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        19   'res2b_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        20   'res3a_branch2a'                    Convolution              128 3×3×64 convolutions with stride [2  2] and padding [1  1  1  1]   (HW Layer)
        21   'bn3a_branch2a'                     Batch Normalization      Batch normalization with 128 channels                                 (HW Layer)
        22   'res3a_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
        23   'res3a_branch2b'                    Convolution              128 3×3×128 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        24   'bn3a_branch2b'                     Batch Normalization      Batch normalization with 128 channels                                 (HW Layer)
        25   'res3a'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        26   'res3a_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        27   'res3a_branch1'                     Convolution              128 1×1×64 convolutions with stride [2  2] and padding [0  0  0  0]   (HW Layer)
        28   'bn3a_branch1'                      Batch Normalization      Batch normalization with 128 channels                                 (HW Layer)
        29   'res3b_branch2a'                    Convolution              128 3×3×128 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        30   'bn3b_branch2a'                     Batch Normalization      Batch normalization with 128 channels                                 (HW Layer)
        31   'res3b_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
        32   'res3b_branch2b'                    Convolution              128 3×3×128 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        33   'bn3b_branch2b'                     Batch Normalization      Batch normalization with 128 channels                                 (HW Layer)
        34   'res3b'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        35   'res3b_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        36   'res4a_branch2a'                    Convolution              256 3×3×128 convolutions with stride [2  2] and padding [1  1  1  1]  (HW Layer)
        37   'bn4a_branch2a'                     Batch Normalization      Batch normalization with 256 channels                                 (HW Layer)
        38   'res4a_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
        39   'res4a_branch2b'                    Convolution              256 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        40   'bn4a_branch2b'                     Batch Normalization      Batch normalization with 256 channels                                 (HW Layer)
        41   'res4a'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        42   'res4a_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        43   'res4a_branch1'                     Convolution              256 1×1×128 convolutions with stride [2  2] and padding [0  0  0  0]  (HW Layer)
        44   'bn4a_branch1'                      Batch Normalization      Batch normalization with 256 channels                                 (HW Layer)
        45   'res4b_branch2a'                    Convolution              256 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        46   'bn4b_branch2a'                     Batch Normalization      Batch normalization with 256 channels                                 (HW Layer)
        47   'res4b_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
        48   'res4b_branch2b'                    Convolution              256 3×3×256 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        49   'bn4b_branch2b'                     Batch Normalization      Batch normalization with 256 channels                                 (HW Layer)
        50   'res4b'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        51   'res4b_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        52   'res5a_branch2a'                    Convolution              512 3×3×256 convolutions with stride [2  2] and padding [1  1  1  1]  (HW Layer)
        53   'bn5a_branch2a'                     Batch Normalization      Batch normalization with 512 channels                                 (HW Layer)
        54   'res5a_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
        55   'res5a_branch2b'                    Convolution              512 3×3×512 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        56   'bn5a_branch2b'                     Batch Normalization      Batch normalization with 512 channels                                 (HW Layer)
        57   'res5a'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        58   'res5a_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        59   'res5a_branch1'                     Convolution              512 1×1×256 convolutions with stride [2  2] and padding [0  0  0  0]  (HW Layer)
        60   'bn5a_branch1'                      Batch Normalization      Batch normalization with 512 channels                                 (HW Layer)
        61   'res5b_branch2a'                    Convolution              512 3×3×512 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        62   'bn5b_branch2a'                     Batch Normalization      Batch normalization with 512 channels                                 (HW Layer)
        63   'res5b_branch2a_relu'               ReLU                     ReLU                                                                  (HW Layer)
        64   'res5b_branch2b'                    Convolution              512 3×3×512 convolutions with stride [1  1] and padding [1  1  1  1]  (HW Layer)
        65   'bn5b_branch2b'                     Batch Normalization      Batch normalization with 512 channels                                 (HW Layer)
        66   'res5b'                             Addition                 Element-wise addition of 2 inputs                                     (HW Layer)
        67   'res5b_relu'                        ReLU                     ReLU                                                                  (HW Layer)
        68   'pool5'                             Global Average Pooling   Global average pooling                                                (HW Layer)
        69   'fc1000'                            Fully Connected          1000 fully connected layer                                            (HW Layer)
        70   'prob'                              Softmax                  softmax                                                               (SW Layer)
        71   'ClassificationLayer_predictions'   Classification Output    crossentropyex with 'tench' and 999 other classes                     (SW Layer)
    
    ### Optimizing series network: Fused 'nnet.cnn.layer.BatchNormalizationLayer' into 'nnet.cnn.layer.Convolution2DLayer'
    5 Memory Regions created.
    
    Skipping: data
    Compiling leg: conv1>>pool1 ...
    Compiling leg: conv1>>pool1 ... complete.
    Compiling leg: res2a_branch2a>>res2a_branch2b ...
    Compiling leg: res2a_branch2a>>res2a_branch2b ... complete.
    Compiling leg: res2b_branch2a>>res2b_branch2b ...
    Compiling leg: res2b_branch2a>>res2b_branch2b ... complete.
    Compiling leg: res3a_branch2a>>res3a_branch2b ...
    Compiling leg: res3a_branch2a>>res3a_branch2b ... complete.
    Compiling leg: res3a_branch1 ...
    Compiling leg: res3a_branch1 ... complete.
    Compiling leg: res3b_branch2a>>res3b_branch2b ...
    Compiling leg: res3b_branch2a>>res3b_branch2b ... complete.
    Compiling leg: res4a_branch2a>>res4a_branch2b ...
    Compiling leg: res4a_branch2a>>res4a_branch2b ... complete.
    Compiling leg: res4a_branch1 ...
    Compiling leg: res4a_branch1 ... complete.
    Compiling leg: res4b_branch2a>>res4b_branch2b ...
    Compiling leg: res4b_branch2a>>res4b_branch2b ... complete.
    Compiling leg: res5a_branch2a>>res5a_branch2b ...
    Compiling leg: res5a_branch2a>>res5a_branch2b ... complete.
    Compiling leg: res5a_branch1 ...
    Compiling leg: res5a_branch1 ... complete.
    Compiling leg: res5b_branch2a>>res5b_branch2b ...
    Compiling leg: res5b_branch2a>>res5b_branch2b ... complete.
    Compiling leg: pool5 ...
    Compiling leg: pool5 ... complete.
    Compiling leg: fc1000 ...
    Compiling leg: fc1000 ... complete.
    Skipping: prob
    Skipping: ClassificationLayer_predictions
    Creating Schedule...
    ...........................
    Creating Schedule...complete.
    Creating Status Table...
    ..........................
    Creating Status Table...complete.
    Emitting Schedule...
    ..........................
    Emitting Schedule...complete.
    Emitting Status Table...
    ............................
    Emitting Status Table...complete.
    
    ### Allocating external memory buffers:
    
              offset_name          offset_address     allocated_space 
        _______________________    ______________    _________________
    
        "InputDataOffset"           "0x00000000"     "24.0 MB"        
        "OutputResultOffset"        "0x01800000"     "4.0 MB"         
        "SchedulerDataOffset"       "0x01c00000"     "4.0 MB"         
        "SystemBufferOffset"        "0x02000000"     "28.0 MB"        
        "InstructionDataOffset"     "0x03c00000"     "4.0 MB"         
        "ConvWeightDataOffset"      "0x04000000"     "52.0 MB"        
        "FCWeightDataOffset"        "0x07400000"     "4.0 MB"         
        "EndOffset"                 "0x07800000"     "Total: 120.0 MB"
    
    ### Network compilation complete.
    
    
    ans = 
    
      struct with fields:
    
                 weights: [1×1 struct]
            instructions: [1×1 struct]
               registers: [1×1 struct]
        syncInstructions: [1×1 struct]
Introduced in R2020b