Stereo Disparity

This example shows how to generate a CUDA® MEX function from a MATLAB® function that computes the stereo disparity of two images.

Third-Party Prerequisites

Required

This example generates CUDA MEX and has the following third-party requirements.

CUDA-enabled NVIDIA® GPU and a compatible driver. For half-precision code generation, the GPU device must have a minimum compute capability of 6.0.

Optional

For non-MEX builds, such as static libraries, dynamic libraries, or executables, this example has the following additional requirements.

NVIDIA toolkit.
Environment variables for the compilers and libraries. For more information, see Third-Party Hardware and Setting Up the Prerequisite Products.

Verify GPU Environment

To verify that the compilers and libraries needed to run this example are set up correctly, use the coder.checkGpuInstall function.

envCfg = coder.gpuEnvConfig('host');
envCfg.BasicCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

Stereo Disparity Calculation

The stereoDisparity.m entry-point function takes two images and returns the stereo disparity map computed from them.

type stereoDisparity

%% Modified Algorithm for Stereo Disparity Block Matching
% In this implementation, instead of finding shifted image, indices are 
% mapped accordingly to save memory and some processing. RGBA column major 
% packed data is used as input for compatibility with CUDA intrinsics. 
% Convolution is performed using separable filters (horizontal and then 
% vertical).

function [out_disp] = stereoDisparity(img0,img1) %#codegen

%   Copyright 2017-2019 The MathWorks, Inc.

% GPU code generation pragma
coder.gpu.kernelfun;

%% Stereo Disparity Parameters
% |WIN_RAD| is the radius of the window to be operated. |min_disparity| is 
% the minimum disparity level the search continues for. |max_disparity| is 
% the maximum disparity level the search continues for.
WIN_RAD = 8;
min_disparity = -16;
max_disparity = 0;

%% Image Dimensions for Loop Control
% Four channels (RGBA) are packed per pixel, so nChannels is 4.
[imgHeight,imgWidth]=size(img0);
nChannels = 4;
imgHeight = imgHeight/nChannels;

%% Store the Raw Differences
diff_img = zeros([imgHeight+2*WIN_RAD,imgWidth+2*WIN_RAD],'int32');

% Store the minimum cost
min_cost = zeros([imgHeight,imgWidth],'int32');
min_cost(:,:) = 99999999;

% Store the final disparity
out_disp = zeros([imgHeight,imgWidth],'int16');

%% Filters for Aggregating the Differences
% |filt_h| is the horizontal filter used in separable convolution.
% |filt_v| is the vertical filter used in separable convolution, which
% operates on the output of the row convolution.
filt_h = ones([1 17],'int32');
filt_v = ones([17 1],'int32');

% Main Loop that runs for all the disparity levels. This loop is
% expected to run on CPU.
for d=min_disparity:max_disparity
    
    % Find the difference matrix for the current disparity level. Expect
    % this to generate a Kernel function.
    coder.gpu.kernel;
    for colIdx=1:imgWidth+2*WIN_RAD
        coder.gpu.kernel;
        for rowIdx=1:imgHeight+2*WIN_RAD
            % Row index calculation.
            ind_h = rowIdx - WIN_RAD;
            
            % Column indices calculation for left image.
            ind_w1 = colIdx - WIN_RAD;
            
            % Column indices calculation for right image.
            ind_w2 = colIdx + d - WIN_RAD;
            
            % Border clamping for row Indices.
            if ind_h <= 0
                ind_h = 1;
            end
            if ind_h > imgHeight
                ind_h = imgHeight;
            end
            
            % Border clamping for column indices for left image.
            if ind_w1 <= 0
                ind_w1 = 1;
            end
            if ind_w1 > imgWidth
                ind_w1 = imgWidth;
            end
            
            % Border clamping for column indices for right image.
            if ind_w2 <= 0
                ind_w2 = 1;
            end
            if ind_w2 > imgWidth
                ind_w2 = imgWidth;
            end
            
            % In this step, the sum of absolute differences (SAD) is
            % computed across the four channels.
            tDiff = int32(0);
            for chIdx = 1:nChannels
                tDiff = tDiff + abs(int32(img0((ind_h-1)*(nChannels)+chIdx,ind_w1))-int32(img1((ind_h-1)*(nChannels)+chIdx,ind_w2)));
            end
            
            % Store the SAD cost into a matrix.
            diff_img(rowIdx,colIdx) = tDiff;
        end
    end
    
    % Aggregate the differences using separable convolution. Expect this
    % to generate two kernels that use shared memory. The first kernel is
    % the convolution with the horizontal filter, and the second kernel
    % applies the column-wise convolution to its output.
    cost_v = conv2(diff_img,filt_h,'valid');
    cost = conv2(cost_v,filt_v,'valid');
    
    % This part updates the min_cost matrix by comparing its values with
    % the cost at the current disparity level.
    for ll=1:imgWidth
        for kk=1:imgHeight
            % load the cost
            temp_cost = int32(cost(kk,ll));
            
            % Compare against the minimum cost available and store the
            % disparity value.
            if min_cost(kk,ll) > temp_cost
                min_cost(kk,ll) = temp_cost;
                out_disp(kk,ll) = abs(d) + 8;
            end
            
        end
    end
    
end
end
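
As a hedged aside on the aggregation step: the two conv2 calls with filt_h and filt_v should be equivalent to a single 17-by-17 box filter, and the 2*WIN_RAD padding of diff_img is what brings the 'valid' output back to the image size. The sketch below checks both properties on a small random matrix. The image size and the use of double instead of int32 are illustrative assumptions, not part of the shipped example.

```matlab
% Illustrative sizes only (assumption); the example derives these from img0.
WIN_RAD = 8;
imgHeight = 24;
imgWidth = 34;

% Stand-in for diff_img: padded by WIN_RAD on every side.
rng(0);
diff_img = randi([0 1020],[imgHeight+2*WIN_RAD, imgWidth+2*WIN_RAD]);

% Separable filters, as in stereoDisparity.m (17 = 2*WIN_RAD+1).
filt_h = ones(1,2*WIN_RAD+1);
filt_v = ones(2*WIN_RAD+1,1);

% Two 1-D passes versus one 2-D box filter.
twoPass = conv2(conv2(diff_img,filt_h,'valid'),filt_v,'valid');
onePass = conv2(diff_img,ones(2*WIN_RAD+1),'valid');

% The 'valid' output shrinks each dimension by 2*WIN_RAD, undoing the padding.
sameSize = isequal(size(twoPass),[imgHeight imgWidth]);
maxErr = max(abs(twoPass(:)-onePass(:)));
```

The separable form needs 17 + 17 = 34 multiply-adds per output pixel instead of 17 * 17 = 289, which is presumably why the example splits the filter.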

Read Images and Pack Data Into RGBA Packed Column-Major Order

img0 = imread('scene_left.png');
img1 = imread('scene_right.png');

[imgRGB0] = pack_rgbData(img0);
[imgRGB1] = pack_rgbData(img1);
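
The pack_rgbData helper is not listed in this example. Judging only from how stereoDisparity.m indexes its inputs, with row (ind_h-1)*nChannels+chIdx, the packed layout appears to interleave the four channel values of each pixel along the rows. The following is a minimal sketch of one way to produce that layout. The variable names, the tiny stand-in image, and the zero-filled alpha channel are assumptions, and the real helper may differ.

```matlab
% Tiny stand-in for an RGB image (assumption; the example reads a PNG file).
img = randi([0 255],[3 5 3],'uint8');
[h,w,~] = size(img);

% Append an alpha channel. Zero fill is an assumption made for illustration.
rgba = cat(3,img,zeros(h,w,'uint8'));

% Move the channel dimension first so that it varies fastest, then fold it
% into the rows: packed((r-1)*4+c, col) equals rgba(r,col,c).
packed = reshape(permute(rgba,[3 1 2]),[4*h w]);
```

With this layout, size(packed,1)/4 recovers the image height, which matches the imgHeight/nChannels step in the entry-point function.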

Left Image

Right Image

Generate GPU Code

cfg = coder.gpuConfig('mex');
codegen -config cfg -args {imgRGB0, imgRGB1} stereoDisparity;

Code generation successful: To view the report, open('codegen/mex/stereoDisparity/html/report.mldatx').

Run Generated MEX and Show the Output Disparity

out_disp = stereoDisparity_mex(imgRGB0,imgRGB1);
imagesc(out_disp);

Half Precision

The computations in this example can also be performed in half-precision floating-point numbers by using the stereoDisparityHalfPrecision.m entry-point function. Generating and executing code with half-precision data types requires a CUDA compute capability of 6.0 or higher. Set the ComputeCapability property of the code configuration object to '6.0'. For half precision, the memory allocation (malloc) mode for CUDA code generation must be set to 'Discrete'.

cfg.GpuConfig.ComputeCapability = '6.0';
cfg.GpuConfig.MallocMode = 'Discrete';

The standard imread command represents the RGB channels of an image as integers, one per channel for each pixel. The integers range from 0 to 255. Simply casting the inputs to the half type can cause overflow during the convolutions. To avoid this, scale the images to values between 0 and 1.
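
To make the overflow risk concrete, the following rough bound compares the largest aggregated cost against the largest finite half-precision value, 65504. It assumes the worst case in which every channel difference in the 17-by-17 window is at its maximum. The variable names are illustrative and do not come from the example files.

```matlab
halfMax = 65504;   % largest finite IEEE half-precision value
nChannels = 4;     % RGBA, as in stereoDisparity.m
winSize = 17;      % 2*WIN_RAD+1

% Worst-case cost after both convolutions: per-pixel SAD times window area.
maxCostRaw = nChannels*255*winSize^2;   % inputs in the range 0 to 255
maxCostScaled = nChannels*1*winSize^2;  % inputs scaled to the range 0 to 1

rawOverflows = maxCostRaw > halfMax;
scaledFits = maxCostScaled < halfMax;
```

Under these assumptions the unscaled bound is 294780, well above the half-precision range, while the scaled bound is 1156.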

img0 = imread('scene_left.png');
img1 = imread('scene_right.png');

[imgRGB0] = half(pack_rgbData(img0))/255;
[imgRGB1] = half(pack_rgbData(img1))/255;

Generate CUDA MEX for the Function

Generate code for the stereoDisparityHalfPrecision.m function.

codegen -config cfg -args {imgRGB0, imgRGB1} stereoDisparityHalfPrecision;

Code generation successful: To view the report, open('codegen/mex/stereoDisparityHalfPrecision/html/report.mldatx').

See Also

Functions

codegen | coder.checkGpuInstall | coder.gpu.constantMemory | coder.gpu.kernel | coder.gpu.kernelfun | gpucoder.matrixMatrixKernel | gpucoder.stencilKernel

Objects

coder.CodeConfig | coder.EmbeddedCodeConfig | coder.gpuConfig | coder.gpuEnvConfig
