Монокулярная визуальная одновременная локализация и картография

Скрипт Open Live Script

Визуальная одновременная локализация и картография (vSLAM), относится к процессу вычисления положения и ориентации камеры относительно ее среды, одновременно сопоставляя среду. Процесс использует только визуальные входные параметры от камеры. Приложения для vSLAM включают дополненную реальность, робототехнику и автономное управление автомобилем.

В этом примере показано, как обработать данные изображения от монокулярной камеры, чтобы создать карту внутренней среды и оценить траекторию камеры. Пример использует ORB-SLAM [1], который является основанным на функции vSLAM алгоритмом.

Чтобы ускорить расчеты, можно включить параллельные вычисления из диалогового окна Computer Vision Toolbox Preferences. Чтобы открыть настройки Computer Vision Toolbox™, на вкладке Home, в разделе Environment, нажимают Preferences. Затем выберите Computer Vision Toolbox.

Глоссарий

Следующие термины часто используются в этом примере:

Ключевые кадры: подмножество видеокадров, которые содержат сигналы для локализации и отслеживания. Два последовательных ключевых кадра обычно включают достаточное визуальное изменение.
Точки карты: список 3-D точек, которые представляют карту среды, восстановленной от ключевых кадров.
График Covisibility: график, состоящий из ключевого кадра как узлы. Два ключевых кадра соединяются ребром, если они совместно используют общие точки карты. Вес ребра является количеством разделяемых точек карты.
Существенный График: подграф covisibility графика, содержащего только ребра с высоким весом, i.e. более разделяемые точки карты.
База данных распознавания: база данных раньше распознавала, посетили ли место в прошлом. База данных хранит визуальное отображение слова к изображению на основе входного набора признаков. Это используется, чтобы искать изображение, которое визуально похоже на изображение запроса.

Обзор ORB-SLAM

Конвейер ORB-SLAM включает:

Инициализация карты: ORB-SLAM запускается путем инициализации карты 3-D точек от двух видеокадров. 3-D точки и относительное положение камеры вычисляются с помощью триангуляции на основе 2D соответствий функции ORB.
Отслеживание: Если карта инициализируется для каждой новой системы координат, положение камеры оценивается путем соответствия с функциями в текущей системе координат к функциям в последнем ключевом кадре. Предполагаемое положение камеры усовершенствовано путем отслеживания локальной карты.
Локальное Отображение: текущая система координат используется, чтобы создать новые 3-D точки карты, если она идентифицирована как ключевой кадр. На данном этапе свяжитесь, корректировка используется, чтобы минимизировать ошибки перепроекции путем корректировки положения камеры и 3-D точек.
Закрытие цикла: Циклы обнаруживаются для каждого ключевого кадра путем сравнения его против всех предыдущих ключевых кадров с помощью подхода набора признаков. Если закрытие цикла обнаруживается, график положения оптимизирован, чтобы совершенствовать положения камеры всех ключевых кадров.

Загрузите и исследуйте входную последовательность изображений

Данные, используемые в этом примере, от Сравнительный тест RGB-D TUM [2]. Можно загрузить данные на временную директорию с помощью веб-браузера или путем выполнения следующего кода:

baseDownloadURL = 'https://vision.in.tum.de/rgbd/dataset/freiburg3/rgbd_dataset_freiburg3_long_office_household.tgz'; 
dataFolder      = fullfile(tempdir, 'tum_rgbd_dataset', filesep); 
options         = weboptions('Timeout', Inf);
tgzFileName     = [dataFolder, 'fr3_office.tgz'];
folderExists    = exist(dataFolder, 'dir');

% Create a folder in a temporary directory to save the downloaded file
if ~folderExists  
    mkdir(dataFolder); 
    disp('Downloading fr3_office.tgz (1.38 GB). This download can take a few minutes.') 
    websave(tgzFileName, baseDownloadURL, options); 
    
    % Extract contents of the downloaded file
    disp('Extracting fr3_office.tgz (1.38 GB) ...') 
    untar(tgzFileName, dataFolder); 
end

Создайте imageDatastore объект смотреть изображения RGB.

imageFolder   = [dataFolder,'rgbd_dataset_freiburg3_long_office_household/rgb/'];
imds          = imageDatastore(imageFolder);

% Inspect the first image
currFrameIdx = 1;
currI = readimage(imds, currFrameIdx);
himage = imshow(currI);

Сопоставьте инициализацию

Конвейер ORB-SLAM запускается путем инициализации карты, которая содержит 3-D мировые точки. Этот шаг крайне важен и оказывает значительное влияние на точность итогового результата SLAM. Начальные соответствия характерной точки ORB найдены с помощью matchFeatures между парой изображений. После того, как соответствия найдены, две модели геометрического преобразования используются, чтобы установить инициализацию карты:

Homography: Если сцена является плоской, homography, проективное преобразование является лучшим выбором описать соответствия характерной точки.
Основная Матрица: Если сцена является неплоской, основная матрица должна использоваться вместо этого.

homography и основная матрица могут быть вычислены с помощью estimateGeometricTransform2D и estimateFundamentalMatrix, соответственно. Модель, которая приводит к меньшей ошибке перепроекции, выбрана, чтобы оценить относительное вращение и перевод между двумя системами координат с помощью relativeCameraPose. Поскольку изображения RGB взяты монокулярной камерой, которая не предоставляет информацию о глубине, относительный перевод может только быть восстановлен до определенного масштабного коэффициента.

Учитывая относительное положение камеры и совпадающие характерные точки в двух изображениях, 3-D местоположения совпадающих точек определяются с помощью triangulate функция. Триангулированная точка карты допустима, когда она расположена перед обеими камерами, когда ее ошибка перепроекции является низкой, и когда параллакс двух представлений точки является достаточно большим.

% Set random seed for reproducibility
rng(0);

% Create a cameraIntrinsics object to store the camera intrinsic parameters.
% The intrinsics for the dataset can be found at the following page:
% https://vision.in.tum.de/data/datasets/rgbd-dataset/file_formats
% Note that the images in the dataset are already undistorted, hence there
% is no need to specify the distortion coefficients.
focalLength    = [535.4, 539.2];    % in units of pixels
principalPoint = [320.1, 247.6];    % in units of pixels
imageSize      = size(currI,[1 2]);  % in units of pixels
intrinsics     = cameraIntrinsics(focalLength, principalPoint, imageSize);

% Detect and extract ORB features
scaleFactor = 1.2;
numLevels   = 8;
[preFeatures, prePoints] = helperDetectAndExtractFeatures(currI, scaleFactor, numLevels); 

currFrameIdx = currFrameIdx + 1;
firstI       = currI; % Preserve the first frame 

isMapInitialized  = false;

% Map initialization loop
while ~isMapInitialized && currFrameIdx < numel(imds.Files)
    currI = readimage(imds, currFrameIdx);

    [currFeatures, currPoints] = helperDetectAndExtractFeatures(currI, scaleFactor, numLevels); 
    
    currFrameIdx = currFrameIdx + 1;
    
    % Find putative feature matches
    indexPairs = matchFeatures(preFeatures, currFeatures, 'Unique', true, ...
        'MaxRatio', 0.9, 'MatchThreshold', 40);
    
    preMatchedPoints  = prePoints(indexPairs(:,1),:);
    currMatchedPoints = currPoints(indexPairs(:,2),:);

    % If not enough matches are found, check the next frame
    minMatches = 100;
    if size(indexPairs, 1) < minMatches
        continue
    end
    
    preMatchedPoints  = prePoints(indexPairs(:,1),:);
    currMatchedPoints = currPoints(indexPairs(:,2),:);
    
    % Compute homography and evaluate reconstruction
    [tformH, scoreH, inliersIdxH] = helperComputeHomography(preMatchedPoints, currMatchedPoints);

    % Compute fundamental matrix and evaluate reconstruction
    [tformF, scoreF, inliersIdxF] = helperComputeFundamentalMatrix(preMatchedPoints, currMatchedPoints);
    
    % Select the model based on a heuristic
    ratio = scoreH/(scoreH + scoreF);
    ratioThreshold = 0.45;
    if ratio > ratioThreshold
        inlierTformIdx = inliersIdxH;
        tform          = tformH;
    else
        inlierTformIdx = inliersIdxF;
        tform          = tformF;
    end

    % Computes the camera location up to scale. Use half of the 
    % points to reduce computation
    inlierPrePoints  = preMatchedPoints(inlierTformIdx);
    inlierCurrPoints = currMatchedPoints(inlierTformIdx);
    [relOrient, relLoc, validFraction] = relativeCameraPose(tform, intrinsics, ...
        inlierPrePoints(1:2:end), inlierCurrPoints(1:2:end));
    
    % If not enough inliers are found, move to the next frame
    if validFraction < 0.9 || numel(size(relOrient))==3
        continue
    end
    
    % Triangulate two views to obtain 3-D map points
    relPose = rigid3d(relOrient, relLoc);
    minParallax = 1; % In degrees
    [isValid, xyzWorldPoints, inlierTriangulationIdx] = helperTriangulateTwoFrames(...
        rigid3d, relPose, inlierPrePoints, inlierCurrPoints, intrinsics, minParallax);
    
    if ~isValid
        continue
    end
    
    % Get the original index of features in the two key frames
    indexPairs = indexPairs(inlierTformIdx(inlierTriangulationIdx),:);
    
    isMapInitialized = true;
    
    disp(['Map initialized with frame 1 and frame ', num2str(currFrameIdx-1)])
end % End of map initialization loop

Map initialized with frame 1 and frame 26

if isMapInitialized
    close(himage.Parent.Parent); % Close the previous figure
    % Show matched features
    hfeature = showMatchedFeatures(firstI, currI, prePoints(indexPairs(:,1)), ...
        currPoints(indexPairs(:, 2)), 'Montage');
else
    error('Unable to initialize map.')
end

Сохраните начальные ключевые кадры и сопоставьте точки

После того, как карта инициализируется с помощью двух систем координат, можно использовать imageviewset, worldpointset и helperViewDirectionAndDepth сохранить эти два ключевых кадра и соответствующие точки карты:

imageviewset хранит ключевые кадры и их атрибуты, такие как дескрипторы ORB, характерные точки и положения камеры и связи между ключевыми кадрами, такими как соответствия характерных точек и относительные положения камеры. Это также создает и обновляет график положения. Абсолютные положения камеры и относительные положения камеры ребер одометрии хранятся как rigid3d объекты. Относительные положения камеры ребер закрытия цикла хранятся как affine3d объекты.
worldpointset хранит 3-D положения точек карты и 3-D в 2D соответствия проекции: которые сопоставляют точки, наблюдаются в ключевом кадре и какие ключевые кадры наблюдают точку карты.
helperViewDirectionAndDepth хранит другие атрибуты точек карты, такие как среднее направление представления, представительные дескрипторы ORB и область значений расстояния, на котором может наблюдаться точка карты.

% Create an empty imageviewset object to store key frames
vSetKeyFrames = imageviewset;

% Create an empty worldpointset object to store 3-D map points
mapPointSet   = worldpointset;

% Create a helperViewDirectionAndDepth object to store view direction and depth 
directionAndDepth = helperViewDirectionAndDepth(size(xyzWorldPoints, 1));

% Add the first key frame. Place the camera associated with the first 
% key frame at the origin, oriented along the Z-axis
preViewId     = 1;
vSetKeyFrames = addView(vSetKeyFrames, preViewId, rigid3d, 'Points', prePoints,...
    'Features', preFeatures.Features);

% Add the second key frame
currViewId    = 2;
vSetKeyFrames = addView(vSetKeyFrames, currViewId, relPose, 'Points', currPoints,...
    'Features', currFeatures.Features);

% Add connection between the first and the second key frame
vSetKeyFrames = addConnection(vSetKeyFrames, preViewId, currViewId, relPose, 'Matches', indexPairs);

% Add 3-D map points
[mapPointSet, newPointIdx] = addWorldPoints(mapPointSet, xyzWorldPoints);

% Add observations of the map points
preLocations  = prePoints.Location;
currLocations = currPoints.Location;
preScales     = prePoints.Scale;
currScales    = currPoints.Scale;

% Add image points corresponding to the map points in the first key frame
mapPointSet   = addCorrespondences(mapPointSet, preViewId, newPointIdx, indexPairs(:,1));

% Add image points corresponding to the map points in the second key frame
mapPointSet   = addCorrespondences(mapPointSet, currViewId, newPointIdx, indexPairs(:,2));

Совершенствуйте и визуализируйте начальную реконструкцию

Совершенствуйте начальную реконструкцию с помощью bundleAdjustment, это оптимизирует и положения камеры и точки мира, чтобы минимизировать полные ошибки перепроекции. После улучшения обновляются атрибуты точек карты включая 3-D местоположения, направление представления и область значений глубины. Можно использовать helperVisualizeMotionAndStructure визуализировать точки карты и местоположения камеры.

% Run full bundle adjustment on the first two key frames
tracks       = findTracks(vSetKeyFrames);
cameraPoses  = poses(vSetKeyFrames);

[refinedPoints, refinedAbsPoses] = bundleAdjustment(xyzWorldPoints, tracks, ...
    cameraPoses, intrinsics, 'FixedViewIDs', 1, ...
    'PointsUndistorted', true, 'AbsoluteTolerance', 1e-7,...
    'RelativeTolerance', 1e-15, 'MaxIteration', 50);

% Scale the map and the camera pose using the median depth of map points
medianDepth   = median(vecnorm(refinedPoints.'));
refinedPoints = refinedPoints / medianDepth;

refinedAbsPoses.AbsolutePose(currViewId).Translation = ...
    refinedAbsPoses.AbsolutePose(currViewId).Translation / medianDepth;
relPose.Translation = relPose.Translation/medianDepth;

% Update key frames with the refined poses
vSetKeyFrames = updateView(vSetKeyFrames, refinedAbsPoses);
vSetKeyFrames = updateConnection(vSetKeyFrames, preViewId, currViewId, relPose);

% Update map points with the refined positions
mapPointSet   = updateWorldPoints(mapPointSet, newPointIdx, refinedPoints);

% Update view direction and depth 
directionAndDepth = update(directionAndDepth, mapPointSet, vSetKeyFrames.Views, newPointIdx, true);

% Visualize matched features in the current frame
close(hfeature.Parent.Parent);
featurePlot   = helperVisualizeMatchedFeatures(currI, currPoints(indexPairs(:,2)));

% Visualize initial map points and camera trajectory
mapPlot       = helperVisualizeMotionAndStructure(vSetKeyFrames, mapPointSet);

% Show legend
showLegend(mapPlot);

Отслеживание

Процесс отслеживания выполняется с помощью каждой системы координат и определяет, когда вставить новый ключевой кадр. Чтобы упростить этот пример, мы отключим процесс отслеживания, если закрытие цикла найдено.

% ViewId of the current key frame
currKeyFrameId   = currViewId;

% ViewId of the last key frame
lastKeyFrameId   = currViewId;

% ViewId of the reference key frame that has the most co-visible 
% map points with the current key frame
refKeyFrameId    = currViewId;

% Index of the last key frame in the input image sequence
lastKeyFrameIdx  = currFrameIdx - 1; 

% Indices of all the key frames in the input image sequence
addedFramesIdx   = [1; lastKeyFrameIdx];

isLoopClosed     = false;

Каждый кадр обрабатывается можно следующим образом:

Функции ORB извлечены для каждой новой системы координат и затем соответствующие (использование matchFeatures), с функциями в последнем ключевом кадре, которые знали соответствующие 3-D точки карты.
Оцените положение камеры с алгоритмом Перспективного n значений с помощью estimateWorldCameraPose.
Учитывая положение камеры, спроектируйте точки карты, наблюдаемые последним ключевым кадром в текущую систему координат, и ищите соответствия функции с помощью matchFeaturesInRadius.
С 3-D к 2D соответствию в текущей системе координат совершенствуйте положение камеры путем выполнения корректировки пакета только для движения с помощью bundleAdjustmentMotion.
Спроектируйте локальные точки карты в текущую систему координат, чтобы искать больше соответствий функции с помощью matchFeaturesInRadius и совершенствуйте положение камеры снова с помощью bundleAdjustmentMotion.
Последний шаг отслеживания должен решить, является ли текущая система координат новым ключевым кадром. Если текущая система координат является ключевым кадром, продолжите к Локальному процессу Отображения. В противном случае начните Отслеживать для следующей системы координат.

% Main loop
while ~isLoopClosed && currFrameIdx < numel(imds.Files)  
    currI = readimage(imds, currFrameIdx);

    [currFeatures, currPoints] = helperDetectAndExtractFeatures(currI, scaleFactor, numLevels);

    % Track the last key frame
    % mapPointsIdx:   Indices of the map points observed in the current frame
    % featureIdx:     Indices of the corresponding feature points in the 
    %                 current frame
    [currPose, mapPointsIdx, featureIdx] = helperTrackLastKeyFrame(mapPointSet, ...
        vSetKeyFrames.Views, currFeatures, currPoints, lastKeyFrameId, intrinsics, scaleFactor);
    
    % Track the local map
    % refKeyFrameId:      ViewId of the reference key frame that has the most 
    %                     co-visible map points with the current frame
    % localKeyFrameIds:   ViewId of the connected key frames of the current frame
    [refKeyFrameId, localKeyFrameIds, currPose, mapPointsIdx, featureIdx] = ...
        helperTrackLocalMap(mapPointSet, directionAndDepth, vSetKeyFrames, mapPointsIdx, ...
        featureIdx, currPose, currFeatures, currPoints, intrinsics, scaleFactor, numLevels);
    
    % Check if the current frame is a key frame. 
    % A frame is a key frame if both of the following conditions are satisfied:
    %
    % 1. At least 20 frames have passed since the last key frame or the 
    %    current frame tracks fewer than 100 map points
    % 2. The map points tracked by the current frame are fewer than 90% of 
    %    points tracked by the reference key frame
    isKeyFrame = helperIsKeyFrame(mapPointSet, refKeyFrameId, lastKeyFrameIdx, ...
        currFrameIdx, mapPointsIdx);
    
    % Visualize matched features
    updatePlot(featurePlot, currI, currPoints(featureIdx));
    
    if ~isKeyFrame
        currFrameIdx = currFrameIdx + 1;
        continue
    end
    
    % Update current key frame ID
    currKeyFrameId  = currKeyFrameId + 1;

Локальное отображение

Локальное отображение выполняется для каждого ключевого кадра. Когда новый ключевой кадр будет определен, добавьте его в ключевые кадры и обновите атрибуты точек карты, наблюдаемых новым ключевым кадром. Гарантировать тот mapPointSet содержит как можно меньше выбросов, допустимая точка карты должна наблюдаться по крайней мере в 3 ключевых кадрах.

Новые точки карты создаются путем триангулирования характерных точек ORB в текущем ключевом кадре и его связанных ключевых кадрах. Для каждой несопоставленной характерной точки в текущем ключевом кадре ищите соответствие с другими несопоставленными точками в связанных ключевых кадрах с помощью matchFeatures. Локальная корректировка пакета совершенствовала положение текущего ключевого кадра, положения связанных ключевых кадров и все точки карты, наблюдаемые в этих ключевых кадрах.

    % Add the new key frame 
    [mapPointSet, vSetKeyFrames] = helperAddNewKeyFrame(mapPointSet, vSetKeyFrames, ...
        currPose, currFeatures, currPoints, mapPointsIdx, featureIdx, localKeyFrameIds);
    
    % Remove outlier map points that are observed in fewer than 3 key frames
    [mapPointSet, directionAndDepth, mapPointsIdx] = helperCullRecentMapPoints(mapPointSet, ...
        directionAndDepth, mapPointsIdx, newPointIdx);
    
    % Create new map points by triangulation
    minNumMatches = 20;
    minParallax   = 3;
    [mapPointSet, vSetKeyFrames, newPointIdx] = helperCreateNewMapPoints(mapPointSet, vSetKeyFrames, ...
        currKeyFrameId, intrinsics, scaleFactor, minNumMatches, minParallax);
    
    % Update view direction and depth
    directionAndDepth = update(directionAndDepth, mapPointSet, vSetKeyFrames.Views, ...
        [mapPointsIdx; newPointIdx], true);
    
    % Local bundle adjustment
    [mapPointSet, directionAndDepth, vSetKeyFrames, newPointIdx] = helperLocalBundleAdjustment( ...
        mapPointSet, directionAndDepth, vSetKeyFrames, ...
        currKeyFrameId, intrinsics, newPointIdx); 
    
    % Visualize 3D world points and camera trajectory
    updatePlot(mapPlot, vSetKeyFrames, mapPointSet);

Закрытие цикла

Шаг закрытия цикла берет текущий ключевой кадр, обработанный локальным процессом отображения, и пытается обнаружить и замкнуть круг. Обнаружение цикла выполняется с помощью подхода сумок слов. Визуальный словарь, представленный как bagOfFeatures объект создается оффлайн с дескрипторами SURF, извлеченными из большого набора изображений в наборе данных путем вызова:

bag = bagOfFeatures(imds,'CustomExtractor', @helperSURFFeatureExtractorFunction);

где imds imageDatastore объект, хранящий учебные изображения и helperSURFFeatureExtractorFunction функция экстрактора функции SURF. Смотрите Извлечение Изображений с Мешком Визуальных Слов для получения дополнительной информации.

Процесс закрытия цикла инкрементно создает базу данных, представленную как invertedImageIndex объект, который хранит визуальное отображение слова к изображению на основе мешка функций SURF. Кандидаты цикла идентифицированы путем запроса изображений в базе данных, которые визуально похожи на текущий ключевой кадр с помощью evaluateImageRetrieval. Система координат ключа-кандидата допустима, если она не соединяется с последним ключевым кадром, и три из его соседних ключевых кадров являются кандидатами цикла.

Когда допустимый кандидат цикла будет найден, используйте estimateGeometricTransform3D вычислить относительное положение между кандидатом цикла система координат и текущим ключевым кадром. Относительное положение представляет 3-D преобразование подобия, сохраненное в affine3d объект. Затем добавьте связь цикла с относительным положением и обновите mapPointSet и vSetKeyFrames.

    % Initialize the loop closure database
    if currKeyFrameId == 3
        % Load the bag of features data created offline
        bofData         = load('bagOfFeaturesData.mat');
    
        % Initialize the place recognition database
        loopCandidates  = [1; 2];
        loopDatabase    = indexImages(subset(imds, loopCandidates), bofData.bof);
        
    % Check loop closure after some key frames have been created    
    elseif currKeyFrameId > 20
        
        % Minimum number of feature matches of loop edges
        loopEdgeNumMatches = 50;
        
        % Detect possible loop closure key frame candidates
        [isDetected, validLoopCandidates] = helperCheckLoopClosure(vSetKeyFrames, currKeyFrameId, ...
            loopDatabase, currI, loopCandidates, loopEdgeNumMatches);
        
        if isDetected 
            % Add loop closure connections
            [isLoopClosed, mapPointSet, vSetKeyFrames] = helperAddLoopConnections(...
                mapPointSet, vSetKeyFrames, validLoopCandidates, currKeyFrameId, ...
                currFeatures, currPoints, loopEdgeNumMatches);
        end
    end
    
    % If no loop closure is detected, add the image into the database
    if ~isLoopClosed
        addImages(loopDatabase,  subset(imds, currFrameIdx), 'Verbose', false);
        loopCandidates= [loopCandidates; currKeyFrameId]; %#ok<AGROW>
    end
    
    % Update IDs and indices
    lastKeyFrameId  = currKeyFrameId;
    lastKeyFrameIdx = currFrameIdx;
    addedFramesIdx  = [addedFramesIdx; currFrameIdx]; %#ok<AGROW>
    currFrameIdx    = currFrameIdx + 1;
end % End of main loop

Creating an inverted image index using Bag-Of-Features.
-------------------------------------------------------

Encoding images using Bag-Of-Features.
--------------------------------------

* Encoding 2 images...done.
Finished creating the image index.

Loop edge added between keyframe: 6 and 204

Наконец, оптимизация графика положения подобия выполняется по существенному графику в vSetKeyFrames откорректировать дрейф положений камеры. Существенный график создается внутренне путем удаления связей с меньше, чем minNumMatches соответствия в covisibility графике. После оптимизации графика положения подобия обновите 3-D местоположения точек карты с помощью оптимизированных положений и связанных шкал.

% Optimize the poses
minNumMatches      = 30;
[vSetKeyFramesOptim, poseScales] = optimizePoses(vSetKeyFrames, minNumMatches, 'Tolerance', 1e-16);

% Update map points after optimizing the poses
mapPointSet = helperUpdateGlobalMap(mapPointSet, directionAndDepth, ...
    vSetKeyFrames, vSetKeyFramesOptim, poseScales);

updatePlot(mapPlot, vSetKeyFrames, mapPointSet);

% Plot the optimized camera trajectory
optimizedPoses  = poses(vSetKeyFramesOptim);
plotOptimizedTrajectory(mapPlot, optimizedPoses)

% Update legend
showLegend(mapPlot);

Сравните с основной истиной

Можно сравнить оптимизированную траекторию камеры с основной истиной, чтобы оценить точность ORB-SLAM. Загруженные данные содержат groundtruth.txt файл, который хранит основную истину положения камеры каждой системы координат. Данные были сохранены в форме MAT-файла. Можно также вычислить среднеквадратичную ошибку (RMSE) оценок траектории.

% Load ground truth 
gTruthData = load('orbslamGroundTruth.mat');
gTruth     = gTruthData.gTruth;

% Plot the actual camera trajectory 
plotActualTrajectory(mapPlot, gTruth(addedFramesIdx), optimizedPoses);

% Show legend
showLegend(mapPlot);

% Evaluate tracking accuracy
helperEstimateTrajectoryError(gTruth(addedFramesIdx), optimizedPoses);

Absolute RMSE for key frame trajectory (m): 0.25809

Это завершает обзор того, как создать карту внутренней среды и оценить траекторию камеры с помощью ORB-SLAM.

Вспомогательные Функции

Короткие функции помощника включены ниже. Большая функция включена в отдельные файлы.

helperAddLoopConnections добавьте связи между текущим ключевым кадром и допустимым кандидатом цикла.

helperAddNewKeyFrame добавьте ключевые кадры в набор ключевого кадра.

helperCheckLoopClosure обнаружьте кандидатов цикла ключевые кадры путем получения визуально подобных изображений из базы данных.

helperCreateNewMapPoints создайте новые точки карты триангуляцией.

helperLocalBundleAdjustment совершенствуйте положение текущего ключевого кадра и карту окружающей сцены.

helperSURFFeatureExtractorFunction реализует извлечение признаков SURF, используемое в bagOfFeatures.

helperTrackLastKeyFrame оцените текущее положение камеры путем отслеживания последнего ключевого кадра.

helperTrackLocalMap совершенствуйте текущее положение камеры путем отслеживания локальной карты.

helperViewDirectionAndDepth сохраните среднее направление представления и предсказанную глубину точек карты

helperVisualizeMatchedFeatures покажите совпадающие функции в системе координат.

helperVisualizeMotionAndStructure покажите точки карты и траекторию камеры.

helperDetectAndExtractFeatures обнаружьте и извлеките и функции ORB из изображения.

function [features, validPoints] = helperDetectAndExtractFeatures(Irgb, ...
    scaleFactor, numLevels, varargin)

numPoints   = 1000;

% In this example, the images are already undistorted. In a general
% workflow, uncomment the following code to undistort the images.
%
% if nargin > 3
%     intrinsics = varargin{1};
% end
% Irgb  = undistortImage(Irgb, intrinsics);

% Detect ORB features
Igray  = im2gray(Irgb);

points = detectORBFeatures(Igray, 'ScaleFactor', scaleFactor, 'NumLevels', numLevels);

% Select a subset of features, uniformly distributed throughout the image
points = selectUniform(points, numPoints, size(Igray, 1:2));

% Extract features
[features, validPoints] = extractFeatures(Igray, points);
end

helperHomographyScore вычислите homography и оцените реконструкцию.

function [H, score, inliersIndex] = helperComputeHomography(matchedPoints1, matchedPoints2)

[H, inliersLogicalIndex] = estimateGeometricTransform2D( ...
    matchedPoints1, matchedPoints2, 'projective', ...
    'MaxNumTrials', 1e3, 'MaxDistance', 4, 'Confidence', 90);

inlierPoints1 = matchedPoints1(inliersLogicalIndex);
inlierPoints2 = matchedPoints2(inliersLogicalIndex);

inliersIndex  = find(inliersLogicalIndex);

locations1 = inlierPoints1.Location;
locations2 = inlierPoints2.Location;
xy1In2     = transformPointsForward(H, locations1);
xy2In1     = transformPointsInverse(H, locations2);
error1in2  = sum((locations2 - xy1In2).^2, 2);
error2in1  = sum((locations1 - xy2In1).^2, 2);

outlierThreshold = 6;

score = sum(max(outlierThreshold-error1in2, 0)) + ...
    sum(max(outlierThreshold-error2in1, 0));
end

helperFundamentalMatrixScore вычислите основную матрицу и оцените реконструкцию.

function [F, score, inliersIndex] = helperComputeFundamentalMatrix(matchedPoints1, matchedPoints2)

[F, inliersLogicalIndex]   = estimateFundamentalMatrix( ...
    matchedPoints1, matchedPoints2, 'Method','RANSAC',...
    'NumTrials', 1e3, 'DistanceThreshold', 0.01);

inlierPoints1 = matchedPoints1(inliersLogicalIndex);
inlierPoints2 = matchedPoints2(inliersLogicalIndex);

inliersIndex  = find(inliersLogicalIndex);

locations1    = inlierPoints1.Location;
locations2    = inlierPoints2.Location;

% Distance from points to epipolar line
lineIn1   = epipolarLine(F', locations2);
error2in1 = (sum([locations1, ones(size(locations1, 1),1)].* lineIn1, 2)).^2 ...
    ./ sum(lineIn1(:,1:2).^2, 2);
lineIn2   = epipolarLine(F, locations1);
error1in2 = (sum([locations2, ones(size(locations2, 1),1)].* lineIn2, 2)).^2 ...
    ./ sum(lineIn2(:,1:2).^2, 2);

outlierThreshold = 4;

score = sum(max(outlierThreshold-error1in2, 0)) + ...
    sum(max(outlierThreshold-error2in1, 0));

end

helperTriangulateTwoFrames триангулируйте две системы координат, чтобы инициализировать карту.

function [isValid, xyzPoints, inlierIdx] = helperTriangulateTwoFrames(...
    pose1, pose2, matchedPoints1, matchedPoints2, intrinsics, minParallax)

[R1, t1]   = cameraPoseToExtrinsics(pose1.Rotation, pose1.Translation);
camMatrix1 = cameraMatrix(intrinsics, R1, t1);

[R2, t2]   = cameraPoseToExtrinsics(pose2.Rotation, pose2.Translation);
camMatrix2 = cameraMatrix(intrinsics, R2, t2);

[xyzPoints, reprojectionErrors, isInFront] = triangulate(matchedPoints1, ...
    matchedPoints2, camMatrix1, camMatrix2);

% Filter points by view direction and reprojection error
minReprojError = 1;
inlierIdx  = isInFront & reprojectionErrors < minReprojError;
xyzPoints  = xyzPoints(inlierIdx ,:);

% A good two-view with significant parallax
ray1       = xyzPoints - pose1.Translation;
ray2       = xyzPoints - pose2.Translation;
cosAngle   = sum(ray1 .* ray2, 2) ./ (vecnorm(ray1, 2, 2) .* vecnorm(ray2, 2, 2));

% Check parallax
isValid = all(cosAngle < cosd(minParallax) & cosAngle>0);
end

helperIsKeyFrame проверяйте, является ли система координат ключевым кадром.

function isKeyFrame = helperIsKeyFrame(mapPoints, ...
    refKeyFrameId, lastKeyFrameIndex, currFrameIndex, mapPointsIndices)

numPointsRefKeyFrame = numel(findWorldPointsInView(mapPoints, refKeyFrameId));

% More than 20 frames have passed from last key frame insertion
tooManyNonKeyFrames = currFrameIndex > lastKeyFrameIndex + 20;

% Track less than 100 map points
tooFewMapPoints     = numel(mapPointsIndices) < 100;

% Tracked map points are fewer than 90% of points tracked by
% the reference key frame
tooFewTrackedPoints = numel(mapPointsIndices) < 0.9 * numPointsRefKeyFrame;

isKeyFrame = (tooManyNonKeyFrames || tooFewMapPoints) && tooFewTrackedPoints;
end

helperCullRecentMapPoints отходы недавно добавленные точки карты.

function [mapPointSet, directionAndDepth, mapPointsIdx] = helperCullRecentMapPoints(mapPointSet, directionAndDepth, mapPointsIdx, newPointIdx)
outlierIdx    = setdiff(newPointIdx, mapPointsIdx);
if ~isempty(outlierIdx)
    mapPointSet   = removeWorldPoints(mapPointSet, outlierIdx);
    directionAndDepth = remove(directionAndDepth, outlierIdx);
    mapPointsIdx  = mapPointsIdx - arrayfun(@(x) nnz(x>outlierIdx), mapPointsIdx);
end
end

helperEstimateTrajectoryError вычислите ошибку отслеживания.

function rmse = helperEstimateTrajectoryError(gTruth, cameraPoses)
locations       = vertcat(cameraPoses.AbsolutePose.Translation);
gLocations      = vertcat(gTruth.Translation);
scale           = median(vecnorm(gLocations, 2, 2))/ median(vecnorm(locations, 2, 2));
scaledLocations = locations * scale;

rmse = sqrt(mean( sum((scaledLocations - gLocations).^2, 2) ));
disp(['Absolute RMSE for key frame trajectory (m): ', num2str(rmse)]);
end

helperUpdateGlobalMap обновите 3-D местоположения точек карты после оптимизации графика положения

function [mapPointSet, directionAndDepth] = helperUpdateGlobalMap(...
    mapPointSet, directionAndDepth, vSetKeyFrames, vSetKeyFramesOptim, poseScales)
%helperUpdateGlobalMap update map points after pose graph optimization
posesOld     = vSetKeyFrames.Views.AbsolutePose;
posesNew     = vSetKeyFramesOptim.Views.AbsolutePose;
positionsOld = mapPointSet.WorldPoints;
positionsNew = positionsOld;
indices     = 1:mapPointSet.Count;

% Update world location of each map point based on the new absolute pose of 
% the corresponding major view
for i = 1: mapPointSet.Count
    majorViewIds = directionAndDepth.MajorViewId(i);
    poseNew = posesNew(majorViewIds).T;
    poseNew(1:3, 1:3) = poseNew(1:3, 1:3) * poseScales(majorViewIds);
    tform = posesOld(majorViewIds).T \ poseNew;
    positionsNew(i, :) = positionsOld(i, :) * tform(1:3,1:3) + tform(4, 1:3);
end
mapPointSet = updateWorldPoints(mapPointSet, indices, positionsNew);
end

Ссылка

[1] Mur-Artal, Рауль, Хосе Мария Мартинес Монтьель и Хуан Д. Тардос. "ORB-SLAM: универсальная и точная монокулярная система SLAM". Транзакции IEEE на Робототехнике 31, № 5, стр 1147-116, 2015.

[2] Штурм, Юрген, Николас Энджелхард, Феликс Эндрес, Вольфрам Бергард и Дэниел Кремерс. "Сравнительный тест для оценки RGB-D системы SLAM". В Продолжениях Международной конференции IEEE/RSJ по вопросам Интеллектуальных Роботов и Систем, стр 573-580, 2012.

Документация