This example shows how to use the fitrauto function to automatically try a selection of regression model types with different hyperparameter values, given training predictor and response data. By default, the function uses Bayesian optimization to select and assess models. If your training data set contains many observations, you can use an asynchronous successive halving algorithm (ASHA) instead. After the optimization is complete, fitrauto returns the model, trained on the entire data set, that is expected to best predict the responses for new data. Check the model performance on test data.
Load the sample data set NYCHousing2015, which includes 10 variables with information on the sales of properties in New York City in 2015. This example uses some of these variables to analyze the sale prices.
load NYCHousing2015
Instead of loading the sample data set NYCHousing2015, you can download the data from the NYC Open Data website and import the data as follows.
folder = 'Annualized_Rolling_Sales_Update';
ds = spreadsheetDatastore(folder,"TextType","string","NumHeaderLines",4);
ds.Files = ds.Files(contains(ds.Files,"2015"));
ds.SelectedVariableNames = ["BOROUGH","NEIGHBORHOOD","BUILDINGCLASSCATEGORY","RESIDENTIALUNITS", ...
    "COMMERCIALUNITS","LANDSQUAREFEET","GROSSSQUAREFEET","YEARBUILT","SALEPRICE","SALEDATE"];
NYCHousing2015 = readall(ds);
Preprocess the data set to choose the predictor variables of interest. Some of the preprocessing steps match those in the example Train Linear Regression Model.

First, change the variable names to lowercase for readability.
NYCHousing2015.Properties.VariableNames = lower(NYCHousing2015.Properties.VariableNames);
Next, remove samples with certain problematic values. For example, keep only those samples where at least one of the area measurements grosssquarefeet or landsquarefeet is nonzero. Assume that a saleprice of $0 indicates an ownership transfer without a cash consideration, and remove the samples with that saleprice value. Assume that a yearbuilt value of 1500 or less is a typo, and remove the corresponding samples.
NYCHousing2015(NYCHousing2015.grosssquarefeet == 0 & NYCHousing2015.landsquarefeet == 0,:) = [];
NYCHousing2015(NYCHousing2015.saleprice == 0,:) = [];
NYCHousing2015(NYCHousing2015.yearbuilt <= 1500,:) = [];
Convert the saledate variable, specified as a datetime array, into two numeric columns MM (month) and DD (day), and remove the saledate variable. Ignore the year values because all samples are for the year 2015.
[~,NYCHousing2015.MM,NYCHousing2015.DD] = ymd(NYCHousing2015.saledate);
NYCHousing2015.saledate = [];
The numeric values in the borough variable indicate the names of the boroughs. Change the variable to a categorical variable using the borough names.
NYCHousing2015.borough = categorical(NYCHousing2015.borough,1:5, ...
    ["Manhattan","Bronx","Brooklyn","Queens","Staten Island"]);
The neighborhood variable has 254 categories. Remove this variable for simplicity.
NYCHousing2015.neighborhood = [];
Convert the buildingclasscategory variable to a categorical variable, and explore the variable by using the wordcloud function.
NYCHousing2015.buildingclasscategory = categorical(NYCHousing2015.buildingclasscategory);
wordcloud(NYCHousing2015.buildingclasscategory);
Assume that you are interested only in one-, two-, and three-family dwellings. Find the sample indices for these dwellings and delete the other samples. Then, change the buildingclasscategory variable to an ordinal categorical variable with integer-valued category names.
idx = ismember(string(NYCHousing2015.buildingclasscategory), ...
    ["01 ONE FAMILY DWELLINGS","02 TWO FAMILY DWELLINGS","03 THREE FAMILY DWELLINGS"]);
NYCHousing2015 = NYCHousing2015(idx,:);
NYCHousing2015.buildingclasscategory = categorical(NYCHousing2015.buildingclasscategory, ...
    ["01 ONE FAMILY DWELLINGS","02 TWO FAMILY DWELLINGS","03 THREE FAMILY DWELLINGS"], ...
    ["1","2","3"],'Ordinal',true);
The buildingclasscategory variable now indicates the number of families in each dwelling.
Explore the response variable saleprice by using the summary function.
s = summary(NYCHousing2015);
s.saleprice
ans = struct with fields:
Size: [24972 1]
Type: 'double'
Description: ''
Units: ''
Continuity: []
Min: 1
Median: 515000
Max: 37000000
NumMissing: 0
Create a histogram of the saleprice variable.
histogram(NYCHousing2015.saleprice)
Because the distribution of saleprice values is right-skewed, with all values greater than 0, log transform the saleprice variable.
NYCHousing2015.saleprice = log(NYCHousing2015.saleprice);
Similarly, log transform the grosssquarefeet and landsquarefeet variables. Add a value of 1 before taking the logarithm of each variable, in case the variable is equal to 0.
NYCHousing2015.grosssquarefeet = log(1 + NYCHousing2015.grosssquarefeet);
NYCHousing2015.landsquarefeet = log(1 + NYCHousing2015.landsquarefeet);
Partition the data set into a training set and a test set by using cvpartition. Use approximately 80% of the observations for the model selection and hyperparameter tuning process, and the other 20% to test the performance of the final model returned by fitrauto.
rng("default") % For reproducibility of the partition c = cvpartition(length(NYCHousing2015.saleprice),"Holdout",0.2); trainData = NYCHousing2015(training(c),:); testData = NYCHousing2015(test(c),:);
Identify and remove the outliers of saleprice, grosssquarefeet, and landsquarefeet from the training data by using the isoutlier function.
[priceIdx,priceL,priceU] = isoutlier(trainData.saleprice);
trainData(priceIdx,:) = [];
[grossIdx,grossL,grossU] = isoutlier(trainData.grosssquarefeet);
trainData(grossIdx,:) = [];
[landIdx,landL,landU] = isoutlier(trainData.landsquarefeet);
trainData(landIdx,:) = [];
Remove the outliers of saleprice, grosssquarefeet, and landsquarefeet from the test data by using the same lower and upper thresholds computed on the training data.
testData(testData.saleprice < priceL | testData.saleprice > priceU,:) = [];
testData(testData.grosssquarefeet < grossL | testData.grosssquarefeet > grossU,:) = [];
testData(testData.landsquarefeet < landL | testData.landsquarefeet > landU,:) = [];
Find an appropriate regression model for the data in trainData by using fitrauto. By default, fitrauto uses Bayesian optimization to select models and their hyperparameter values, and computes the log(1 + valLoss) value for each model, where valLoss is the cross-validation mean squared error (MSE). fitrauto provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.
Specify to run the Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Because of the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. Because of the complexity of the optimization, this process can take some time, especially for larger data sets.
bayesianOptions = struct("UseParallel",true);
[bayesianMdl,bayesianResults] = fitrauto(trainData,"saleprice", ...
    "HyperparameterOptimizationOptions",bayesianOptions);
Warning: Data set has more than 10000 observations. Because ASHA optimization often finds good solutions faster than Bayesian optimization for data sets with many observations, try specifying the 'Optimizer' field value as 'asha' in the 'HyperparameterOptimizationOptions' value structure.
Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 6).
Copying objective function to workers...
Done copying objective function to workers.
Learner types to explore: ensemble, svm, tree
Total iterations (MaxObjectiveEvaluations): 90
Total time (MaxTime): Inf

[Iterative display of the 90 optimization iterations. For each iteration, the table lists the evaluation result, log(1 + valLoss), the time for training and validation in seconds, the observed and estimated minimum log(1 + valLoss), the learner type (ensemble, svm, or tree), and the hyperparameter values.]
__________________________________________________________
Optimization completed.
Total iterations: 90
Total elapsed time: 2007.7289 seconds
Total time for training and validation: 9697.8458 seconds

Best observed learner is an ensemble model with:
    Method: LSBoost
    NumLearningCycles: 254
    MinLeafSize: 330
Observed log(1 + valLoss): 0.17736
Time for training and validation: 113.2282 seconds

Best estimated learner (returned model) is an ensemble model with:
    Method: LSBoost
    NumLearningCycles: 209
    MinLeafSize: 12
Estimated log(1 + valLoss): 0.17741
Estimated time for training and validation: 98.5745 seconds

Documentation for fitrauto display
The Total elapsed time value shows that the Bayesian optimization took a while to run (over 30 minutes).
The final model returned by fitrauto corresponds to the best estimated learner. Before returning the model, the function retrains it using the entire training data set (trainData), the listed Learner (or model) type, and the displayed hyperparameter values.
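If you want to confirm what was retrained, a minimal sketch (assuming the returned model behaves like a standard regression model object that supports the usual display functions) is to show its class and properties:

% Sketch: inspect the model that fitrauto returned; here it is assumed to be
% a compact regression ensemble, matching the best estimated learner above
disp(class(bayesianMdl))
disp(bayesianMdl)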
When fitrauto with Bayesian optimization takes a long time to run because of the number of observations in your training set, consider using fitrauto with ASHA optimization instead. Given that trainData contains more than 10,000 observations, try using fitrauto with ASHA optimization to automatically find an appropriate regression model. When you use fitrauto with ASHA optimization, the function randomly chooses several models with different hyperparameter values and trains them on a small subset of the training data. If the log(1 + valLoss) value for a particular model is promising, where valLoss is the cross-validation MSE, the model is promoted and trained on a larger subset of the training data. This process repeats, and successful models are trained on progressively larger amounts of data. By default, fitrauto provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.
Specify to run the ASHA optimization in parallel. Note that ASHA optimization often has more iterations than Bayesian optimization by default. If you have a time constraint, you can specify the MaxTime field of the HyperparameterOptimizationOptions structure to limit the number of seconds fitrauto runs, as sketched below.
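For example, a hedged sketch of adding such a limit (the 300-second value and the timedOptions name are hypothetical, and this call is not run as part of this example):

% Sketch (hypothetical limit): stop the ASHA optimization after about 5 minutes
timedOptions = struct("Optimizer","asha","UseParallel",true,"MaxTime",300);
% [timedMdl,timedResults] = fitrauto(trainData,"saleprice", ...
%     "HyperparameterOptimizationOptions",timedOptions);

The example below runs without a time limit.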
ashaOptions = struct("Optimizer","asha","UseParallel",true);
[ashaMdl,ashaResults] = fitrauto(trainData,"saleprice", ...
    "HyperparameterOptimizationOptions",ashaOptions);
Copying objective function to workers...
Warning: Files that have already been attached are being ignored. To see which files are attached see the 'AttachedFiles' property of the parallel pool.
Done copying objective function to workers.
Learner types to explore: ensemble, svm, tree
Total iterations (MaxObjectiveEvaluations): 340
Total time (MaxTime): Inf

[Iterative display of the ASHA optimization iterations. For each iteration, the table lists the evaluation result, log(1 + valLoss), the time for training and validation in seconds, the observed minimum log(1 + valLoss), the training set size, the learner type (ensemble, svm, or tree), and the hyperparameter values. The display is truncated here.]
__________________________________________________________
Optimization completed.
Total iterations: 340
Total elapsed time: 409.0068 seconds
Total time for training and validation: 2210.5101 seconds

Best observed learner is an ensemble model with:
    Method: LSBoost
    NumLearningCycles: 209
    MinLeafSize: 13
Observed log(1 + valLoss): 0.17714
Time for training and validation: 93.7536 seconds

Documentation for fitrauto display
The Total elapsed time value shows that the ASHA optimization took less time to run than the Bayesian optimization (under 10 minutes).
The final model returned by fitrauto corresponds to the best observed learner. Before returning the model, the function retrains it using the entire training data set (trainData), the listed Learner (or model) type, and the displayed hyperparameter values.
Evaluate the performance of the returned bayesianMdl and ashaMdl models on the test set testData. For each model, compute the test set mean squared error (MSE), and take a log transform of the MSE to match the values in the verbose display of fitrauto. Smaller MSE (and log-transformed MSE) values indicate better performance.
bayesianTestMSE = loss(bayesianMdl,testData,"saleprice");
bayesianTestError = log(1 + bayesianTestMSE)
bayesianTestError = 0.1796
ashaTestMSE = loss(ashaMdl,testData,"saleprice");
ashaTestError = log(1 + ashaTestMSE)
ashaTestError = 0.1794
For each model, compare the predicted test set response values to the true response values. Plot the predicted sale price along the vertical axis and the true sale price along the horizontal axis. Points on the reference line indicate correct predictions. A good model produces predictions that are scattered near the line. Use a 1-by-2 tiled layout to compare the results for the two models.
bayesianTestPredictions = predict(bayesianMdl,testData);
ashaTestPredictions = predict(ashaMdl,testData);
tiledlayout(1,2)
nexttile
plot(testData.saleprice,bayesianTestPredictions,".")
hold on
plot(testData.saleprice,testData.saleprice) % Reference line
hold off
xlabel(["True Sale Price","(log transformed)"])
ylabel(["Predicted Sale Price","(log transformed)"])
title("Bayesian Optimization Model")
nexttile
plot(testData.saleprice,ashaTestPredictions,".")
hold on
plot(testData.saleprice,testData.saleprice) % Reference line
hold off
xlabel(["True Sale Price","(log transformed)"])
ylabel(["Predicted Sale Price","(log transformed)"])
title("ASHA Optimization Model")
Based on the log-transformed MSE values and the prediction plots, the bayesianMdl and ashaMdl models perform similarly well on the test set.
For each model, use box plots to compare the distribution of predicted and true sale prices by borough. Create the box plots by using the boxchart function. Each box plot displays the median, the lower and upper quartiles, any outliers (computed using the interquartile range), and the minimum and maximum values that are not outliers. In particular, the line inside each box is the sample median, and the circular markers indicate outliers.

For each borough, compare the red box chart (showing the distribution of predicted prices) to the blue box chart (showing the distribution of true prices). Similar distributions for the predicted and true sale prices indicate good predictions. Use a 1-by-2 tiled layout to compare the results for the two models.
tiledlayout(1,2)
nexttile
boxchart(testData.borough,testData.saleprice)
hold on
boxchart(testData.borough,bayesianTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Borough")
ylabel(["Sale Price","(log transformed)"])
title("Bayesian Optimization Model")
nexttile
boxchart(testData.borough,testData.saleprice)
hold on
boxchart(testData.borough,ashaTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Borough")
ylabel(["Sale Price","(log transformed)"])
title("ASHA Optimization Model")
For both models, the predicted median sale price closely matches the median true sale price in each borough. The predicted sale prices seem to vary less than the true sale prices.
For each model, display box plots that compare the distribution of predicted and true sale prices by the number of families in a dwelling. Use a 1-by-2 tiled layout to compare the results for the two models.
tiledlayout(1,2)
nexttile
boxchart(testData.buildingclasscategory,testData.saleprice)
hold on
boxchart(testData.buildingclasscategory,bayesianTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Number of Families in Dwelling")
ylabel(["Sale Price","(log transformed)"])
title("Bayesian Optimization Model")
nexttile
boxchart(testData.buildingclasscategory,testData.saleprice)
hold on
boxchart(testData.buildingclasscategory,ashaTestPredictions)
hold off
legend(["True Sale Prices","Predicted Sale Prices"])
xlabel("Number of Families in Dwelling")
ylabel(["Sale Price","(log transformed)"])
title("ASHA Optimization Model")
For both models, the predicted median sale price closely matches the median true sale price for each dwelling type. The predicted sale prices seem to vary less than the true sale prices.
For each model, plot a histogram of the test set residuals, and check that they are normally distributed. (Recall that the sale prices are log transformed.) Use a 1-by-2 tiled layout to compare the results for the two models.
bayesianTestResiduals = testData.saleprice - bayesianTestPredictions;
ashaTestResiduals = testData.saleprice - ashaTestPredictions;
tiledlayout(1,2)
nexttile
histogram(bayesianTestResiduals)
title("Test Set Residuals (Bayesian)")
nexttile
histogram(ashaTestResiduals)
title("Test Set Residuals (ASHA)")
Although the histograms are slightly left-skewed, both are approximately symmetric about 0.
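If you want a numeric check to accompany the histograms, a minimal sketch (using the residual vectors computed above) is to report the mean and median of each residual distribution; values near 0 support the visual impression of approximate symmetry:

% Sketch: summarize the test set residuals numerically
fprintf("Bayesian residuals: mean = %.4f, median = %.4f\n", ...
    mean(bayesianTestResiduals), median(bayesianTestResiduals))
fprintf("ASHA residuals:     mean = %.4f, median = %.4f\n", ...
    mean(ashaTestResiduals), median(ashaTestResiduals))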
fitrauto | boxchart | histogram | BayesianOptimization