Classify Text Data Using Deep Learning

This example shows how to classify text descriptions of weather reports using a deep learning long short-term memory (LSTM) network.

Text data is naturally sequential. A piece of text is a sequence of words, which might have dependencies between them. To learn and use long-term dependencies to classify sequence data, use an LSTM neural network. An LSTM network is a type of recurrent neural network (RNN) that can learn long-term dependencies between time steps of sequence data.

To input text to an LSTM network, first convert the text data into numeric sequences. You can achieve this using a word encoding, which maps documents to sequences of numeric indices. For better results, also include a word embedding layer in the network. Word embeddings map the words in a vocabulary to numeric vectors rather than scalar indices. These embeddings capture semantic details of the words, so that words with similar meanings have similar vectors. They also model relationships between words through vector arithmetic. For example, the relationship "king is to queen as man is to woman" is described by the equation king - man + woman = queen.
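
The two ideas in the preceding paragraph can be illustrated with a minimal base-MATLAB sketch. It is not the toolbox workflow used later in this example: the two documents, the variable names, and the hand-picked two-dimensional "embedding" vectors are all hypothetical and chosen only so that the arithmetic is easy to follow. Real embeddings are learned and have many more dimensions.

```matlab
% Toy word encoding (illustrative only, not the toolbox API).
docs = ["large tree down"; "tree down near road"];   % hypothetical documents

% Split each document into a row of word tokens.
tokens = arrayfun(@(d) split(d)', docs, 'UniformOutput', false);

% Build the vocabulary in order of first appearance.
vocab = unique([tokens{:}], 'stable');

% Replace every word with its index in the vocabulary.
sequences = cellfun(@(t) arrayfun(@(w) find(vocab == w), t), ...
    tokens, 'UniformOutput', false);

% Toy embedding: hand-picked 2-D vectors (assumed axes: royalty, maleness).
words = ["king" "queen" "man" "woman"];
emb = [1 1; 1 0; 0 1; 0 0];

% king - man + woman, then find the closest word vector.
query = emb(1,:) - emb(3,:) + emb(4,:);
[~, idx] = min(vecnorm(emb - query, 2, 2));
nearest = words(idx);
```

With a word encoding, each word is a single index that carries no notion of similarity. With an embedding, each word is a vector, so distances and differences between words become meaningful.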

There are four steps in training and using the LSTM network in this example:

  • Import and preprocess the data.

  • Convert the words to numeric sequences using a word encoding.

  • Create and train an LSTM network with a word embedding layer.

  • Classify new text data using the trained LSTM network.

Import Data

Import the weather reports data. This data contains labeled text descriptions of weather events. To import the text data as strings, specify the text type to be 'string'.

filename = "weatherReports.csv";
data = readtable(filename,'TextType','string');
head(data)
ans=8×16 table
            Time             event_id          state              event_type         damage_property    damage_crops    begin_lat    begin_lon    end_lat    end_lon                                                                                             event_narrative                                                                                             storm_duration    begin_day    end_day    year       end_timestamp    
    ____________________    __________    ________________    ___________________    _______________    ____________    _________    _________    _______    _______    _________________________________________________________________________________________________________________________________________________________________________________________________    ______________    _________    _______    ____    ____________________

    22-Jul-2016 16:10:00    6.4433e+05    "MISSISSIPPI"       "Thunderstorm Wind"       ""                "0.00K"         34.14        -88.63     34.122     -88.626    "Large tree down between Plantersville and Nettleton."                                                                                                                                                  00:05:00          22          22       2016    22-Jul-0016 16:15:00
    15-Jul-2016 17:15:00    6.5182e+05    "SOUTH CAROLINA"    "Heavy Rain"              "2.00K"           "0.00K"         34.94        -81.03      34.94      -81.03    "One to two feet of deep standing water developed on a street on the Winthrop University campus after more than an inch of rain fell in less than an hour. One vehicle was stalled in the water."       00:00:00          15          15       2016    15-Jul-0016 17:15:00
    15-Jul-2016 17:25:00    6.5183e+05    "SOUTH CAROLINA"    "Thunderstorm Wind"       "0.00K"           "0.00K"         35.01        -80.93      35.01      -80.93    "NWS Columbia relayed a report of trees blown down along Tom Hall St."                                                                                                                                  00:00:00          15          15       2016    15-Jul-0016 17:25:00
    16-Jul-2016 12:46:00    6.5183e+05    "NORTH CAROLINA"    "Thunderstorm Wind"       "0.00K"           "0.00K"         35.64        -82.14      35.64      -82.14    "Media reported two trees blown down along I-40 in the Old Fort area."                                                                                                                                  00:00:00          16          16       2016    16-Jul-0016 12:46:00
    15-Jul-2016 14:28:00    6.4332e+05    "MISSOURI"          "Hail"                    ""                ""              36.45        -89.97      36.45      -89.97    ""                                                                                                                                                                                                      00:07:00          15          15       2016    15-Jul-0016 14:35:00
    15-Jul-2016 16:31:00    6.4332e+05    "ARKANSAS"          "Thunderstorm Wind"       ""                "0.00K"         35.85         -90.1     35.838     -90.087    "A few tree limbs greater than 6 inches down on HWY 18 in Roseland."                                                                                                                                    00:09:00          15          15       2016    15-Jul-0016 16:40:00
    15-Jul-2016 16:03:00    6.4343e+05    "TENNESSEE"         "Thunderstorm Wind"       "20.00K"          "0.00K"        35.056       -89.937      35.05     -89.904    "Awning blown off a building on Lamar Avenue. Multiple trees down near the intersection of Winchester and Perkins."                                                                                     00:07:00          15          15       2016    15-Jul-0016 16:10:00
    15-Jul-2016 17:27:00    6.4344e+05    "TENNESSEE"         "Hail"                    ""                ""             35.385        -89.78     35.385      -89.78    "Quarter size hail near Rosemark."                                                                                                                                                                      00:05:00          15          15       2016    15-Jul-0016 17:32:00

Remove the rows of the table that have empty reports.

idxEmpty = strlength(data.event_narrative) == 0;
data(idxEmpty,:) = [];

The goal of this example is to classify events by the label in the event_type column. To divide the data into classes, convert these labels to categorical.

data.event_type = categorical(data.event_type);

View the distribution of the classes in the data using a histogram. To make the labels easier to read, increase the width of the figure.

f = figure;
f.Position(3) = 1.5*f.Position(3);

h = histogram(data.event_type);
xlabel("Class")
ylabel("Frequency")
title("Class Distribution")

The classes of the data are imbalanced, with many classes containing few observations. When the classes are imbalanced in this way, the network might converge to a less accurate model. To prevent this problem, remove any classes that appear fewer than ten times.

Get the frequency counts of the classes and the class names from the histogram.

classCounts = h.BinCounts;
classNames = h.Categories;

Find the classes containing fewer than ten observations.

idxLowCounts = classCounts < 10;
infrequentClasses = classNames(idxLowCounts)
infrequentClasses = 1×8 cell array
    {'Freezing Fog'}    {'Hurricane'}    {'Lakeshore Flood'}    {'Marine Dense Fog'}    {'Marine Strong Wind'}    {'Marine Tropical Depression'}    {'Seiche'}    {'Sneakerwave'}

Remove these infrequent classes from the data. Use removecats to remove the unused categories from the categorical data.

idxInfrequent = ismember(data.event_type,infrequentClasses);
data(idxInfrequent,:) = [];
data.event_type = removecats(data.event_type);
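
The same filtering can also be done without plotting a histogram, by reading the counts directly from the categorical array with countcats and categories. The following is a minimal sketch on a small hypothetical label vector. The labels and the minCount threshold of 2 are assumptions for illustration; the example itself uses a threshold of ten.

```matlab
% Hypothetical labels: "Seiche" appears only once.
labels = categorical(["Hail"; "Hail"; "Hail"; "Flood"; "Flood"; "Seiche"]);
minCount = 2;                        % assumed threshold for this toy data

% Count the observations per category, in the order given by categories.
counts = countcats(labels);
names = categories(labels);

% Identify the rare classes and drop their observations.
rareClasses = names(counts < minCount);
keep = ~ismember(labels, rareClasses);
labels = labels(keep);

% Remove the now-unused categories so they do not remain as empty classes.
labels = removecats(labels);
```

Calling removecats matters because deleting observations does not delete the category itself, and an empty category would otherwise still count as a class.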

The data is now sorted into classes of reasonable size. The next step is to partition it into sets for training, validation, and testing. Partition the data into a training partition and a held-out partition for validation and testing. Specify the holdout percentage to be 30%.

cvp = cvpartition(data.event_type,'Holdout',0.3);
dataTrain = data(training(cvp),:);
dataHeldOut = data(test(cvp),:);

Partition the held-out set again to get a validation set. Specify the holdout percentage to be 50%. This results in a partitioning of 70% training observations, 15% validation observations, and 15% test observations.

cvp = cvpartition(dataHeldOut.event_type,'HoldOut',0.5);
dataValidation = dataHeldOut(training(cvp),:);
dataTest = dataHeldOut(test(cvp),:);
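
The 70/15/15 proportions follow from applying the second holdout to the 30% that was held out by the first. A short sketch of that arithmetic, with variable names chosen here for illustration:

```matlab
% Fractions used by the two successive holdout partitions.
holdoutFraction = 0.3;        % first split: 30% held out
testShareOfHeldOut = 0.5;     % second split: half of the held-out set

% Resulting fractions of the full data set.
fracTrain = 1 - holdoutFraction;
fracValidation = holdoutFraction*(1 - testShareOfHeldOut);
fracTest = holdoutFraction*testShareOfHeldOut;

fprintf("train %.0f%%, validation %.0f%%, test %.0f%%\n", ...
    100*[fracTrain fracValidation fracTest]);
```

Because cvpartition stratifies by the class labels it is given, the actual counts per set can differ slightly from these fractions after rounding within each class.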

Extract the text data and labels from the partitioned tables.

textDataTrain = dataTrain.event_narrative;
textDataValidation = dataValidation.event_narrative;
textDataTest = dataTest.event_narrative;
YTrain = dataTrain.event_type;
YValidation = dataValidation.event_type;
YTest = dataTest.event_type;

To check that you have imported the data correctly, visualize the training text data using a word cloud.

figure
wordcloud(textDataTrain);
title("Training Data")

Preprocess Text Data

Create a function that tokenizes and preprocesses the text data. The function preprocessText, listed at the end of the example, performs these steps:

  1. Tokenize the text using tokenizedDocument.

  2. Convert the text to lowercase using lower.

  3. Erase the punctuation using erasePunctuation.
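
As a rough sketch of what these three steps do, the snippet below applies them in the same order to a single report taken from the table shown earlier. It assumes Text Analytics Toolbox is available and is not the listing of preprocessText itself, which appears at the end of the example.

```matlab
% One report from the imported data, used here as sample input.
str = "Large tree down between Plantersville and Nettleton.";

% Step 1: tokenize the text into a tokenizedDocument.
documents = tokenizedDocument(str);

% Step 2: convert all tokens to lowercase.
documents = lower(documents);

% Step 3: erase punctuation, which drops the trailing period token.
documents = erasePunctuation(documents);

% Inspect the result as a plain string and count the remaining tokens.
cleaned = joinWords(documents);
numTokens = doclength(documents);
```

Tokenizing first means the later steps operate on tokens rather than on raw text, so punctuation is removed as whole tokens instead of character by character.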

Preprocess the training data and the validation data using the preprocessText function.

documentsTrain = preprocessText(textDataTrain);
documentsValidation = preprocessText(textDataValidation)
documentsValidation = 
  4218×1 tokenizedDocument:

      5 tokens: quarter size hail near rosemark
      7 tokens: large tree down on powerlines in caruthersville
      6 tokens: three trees down on hwy 224
      7 tokens: heat indices of 110 degrees or higher
      9 tokens: numerous trees were reported down in the greenback area
      9 tokens: several large tree branches were blown down in osage
     12 tokens: a tree fell onto a car four miles west southwest of knoxville
      7 tokens: two trees were reported in tellico plains
     21 tokens: wind gusts of 40 to 45 mph were common across buffalo county during the morning and early afternoon of april 2nd
     77 tokens: strong southerly gradient winds affected the nashville metro during the afternoon hours on april 6 a peak wind gust of 49 mph 43 knots was measured at the nashville international airport asos at 145 pm cdt a few trees were blown down across davidson county including a tree blown down in front of the blair school of music on the vanderbilt university campus and a large tree blown down in the yard of a home in hermitage
     36 tokens: wind gusts of 40 to 50 mph were common across crawford county during the morning and early afternoon of april 2nd the highest recorded wind gust was 49 mph by a mesonet station near mt sterling
     31 tokens: snowfall amounts between 19 and 32 inches were reported across prince william county snowfall totaled up to 300 inches near bull run and 185 inches of snow was reported near dumfries
     16 tokens: wind driven hail resulted in numerous holes in siding on the south side of a house
     15 tokens: snowfall amounts were estimated to be between 24 and 36 inches based on observations nearby
     33 tokens: snowfall amounts were reported to be between 18 and 30 inches across southern fauquier county a snowfall report of 300 inches was received in opal and 180 inches of snow fell near bealeton
     10 tokens: three to eight inches of snow fell across suffolk county
      9 tokens: approximately nine inches of snow fell in bristol county
     14 tokens: between 2 and 5 inches of snow were reported over the 12 hour period
     13 tokens: trained spotters reported between 02 and 04 inches of ice around the county
     23 tokens: the white river at newport remained above flood stage from december and fell below flood stage during the evening hours on the 9th
    118 tokens: strong high pressure developed across south central arizona including the greater phoenix area during the day on july 22nd leading to excessive heat over the lower deserts the official high temperature at phoenix reached to 112 degrees the heat proved to be deadly according to a report from local broadcast media a 12 year old boy was rushed to the hospital after losing consciousness from heat stroke or heat exhaustion during an afternoon hike the boy had been hiking at the apache wash trailhead located about halfway between deer valley airport and cave creek the boy later died an excessive heat warning had been in effect for the area since noon continuing on and into the next day
     13 tokens: the blackhall mountain snotel site elevation 9820 ft estimated six inches of snow
     13 tokens: the battle mountain snotel site elevation 7440 ft estimated 17 inches of snow
      7 tokens: trees were blown down on persimmon road
     16 tokens: two to five inches of rain fell across central and eastern portions of scotts bluff county
     13 tokens: one to two feet of water covered highway 2692 from mitchell to scottsbluff
     13 tokens: the intersection of highway 97 and highway vv was closed due to flooding
      7 tokens: quarter size hail was reported at federal
    127 tokens: the national weather service baltimore washington weather forecast office has confirmed a waterspout and tornado struck the potomac river moving into st mary s county just south of beauvue on tuesday february 24 2016 a national weather service ground survey along with radar analysis concluded the tappahannock virginia tornado which created a 30 mile path of damage across the middle peninsula and northern neck of virginia crossed the potomac and traveled a mile into st mary s county maryland before dissipating most of the 65 mile path in maryland was over the potomac river the national weather service classified the storm once onshore as an ef0 peak winds were estimated at 65 mph the path width was approximately 75 yards no damage was reported over the water
     14 tokens: the east santa barbara channel buoy reported a thunderstorm wind gust of 34 knots
     19 tokens: there was a report via social media of 73 inches of lake enhanced snow in 24 hours at ironwood
     27 tokens: lake enhanced snow totals over an 18hour period included six inches near harvey and five inches just south of marquette in the higher terrain of sands township
     19 tokens: twoday storm total lake effect snow accumulation included 14 inches at lac la belle and 11 inches at phoenix
     10 tokens: trees were blown down around highway 37 near fort gaines
     13 tokens: the webber springs snotel site elevation 9250 ft estimated 15 inches of snow
     10 tokens: just over two inches of rain fell in 24 hours
      8 tokens: quarter size hail was reported at fort laramie
     12 tokens: between 2 and 4 inches of snow was reported around the county
     13 tokens: estimated wind gusts of 60 mph were reported 17 miles westnorthwest of hemingford
      9 tokens: nickel to quarter size hail was reported at potter
      9 tokens: just over two inches of rain fell with thunderstorms
     73 tokens: the poteau river near poteau rose above its flood stage of 24 feet at 100 pm cst on december 27th the river crested at 3144 feet at 615 am cst on the 28th resulting in major flooding extensive flooding of cropland occurred many county roads were inundated by flood water the river remained in flood through the end of december 2015 finally falling below flood stage at 1245 pm cst on january 3rd
     10 tokens: fdk reported reduced visibilities of one quarter mile or less
     10 tokens: hwy reported reduced visibilities of one quarter mile or less
     11 tokens: heavy rain fell from thunderstorms on the morning hours of 731
     20 tokens: total rainfall from thunderstorms was six and three quarter inches with over four inches falling in a couple of hours
     14 tokens: between 1 and 3 inches of snow were reported over the 12 hour period
      9 tokens: between 1 and 4 inches of snow was reported
     10 tokens: just under three inches of heavy rain fell from thunderstorms
     21 tokens: a large tree was taken down by thunderstorm wind gusts and knocked off several wires as well onto old topton rd
     28 tokens: heavy rainfall over the solimar burn scar resulted in a significant mud and debris flow on highway 101 multiple lanes were closed due to mud and debris flows
     40 tokens: northerly winds gusting to near 50 mph combined with existing snow cover to cause areas of blowing snow with visibilities lowering to below a half mile at times new snow of less than a half inch fell during this time
     24 tokens: a light glaze of freezing rain caused hazardous travel conditions over portions of fulton county several roads were closed due to the icy conditions
      9 tokens: a tree was blown down just south of bonifay
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
      7 tokens: trees were blown down on beechwood drive
     10 tokens: nyg reported reduced visibilities of one quarter mile or less
     10 tokens: mrb reported reduced visibilities of one quarter mile or less
      9 tokens: between 1 and 4 inches of snow was reported
     12 tokens: ping pong ball size hail was reported two miles south of cheyenne
      5 tokens: heavy rain fell from thunderstorms
     13 tokens: trained spotters reported between 02 and 04 inches of ice around the county
     36 tokens: a tree was blown down at avery and spring street in st augustine the time of damage was based on radar the cost of damage was estimated for the event to be included in storm data
     41 tokens: snow amounts reported by spotters were 10 inches at victor and 9 inches in driggs snotel amounts were the following 25 inches at black bear 13 inches at island park 16 inches at phillips bench and 19 inches at white elephant
     11 tokens: harell road was closed at forbes street due to high water
      9 tokens: golf ball size hail was reported southwest of wheatland
      9 tokens: between 5 and 8 inches of snow was reported
      9 tokens: almost three inches of heavy rain fell with thunderstorms
     31 tokens: a lightning strike hit the pulaski county 911 center several of the computer systems equipment and radios inside the building were damaged or destroyed by the lightning strike time was estimated
     14 tokens: the east santa barbara channel buoy reported a thunderstorm wind gust of 34 knots
     13 tokens: a 24 hour storm total rainfall of 550 inches was reported near northview
     15 tokens: a tree was blown down onto a residence in the cottondale area damage was estimated
      8 tokens: trees were blown down in the youngstown area
     14 tokens: the roof was partially blown off of lighthouse church near sylvester damage was estimated
      9 tokens: a tree was blown down on spring creek road
     53 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms including one in mitchell
      5 tokens: heavy rain fell from thunderstorms
      5 tokens: heavy rain fell from thunderstorms
     13 tokens: the blackhall mountain snotel site elevation 9820 ft estimated 15 inches of snow
      9 tokens: there was a water rescue reported on cherokee avenue
     13 tokens: trained spotters reported between 02 and 04 inches of ice around the county
     14 tokens: the wind sensor at the torrington airport measured a peak gust of 63 mph
      9 tokens: between 5 and 8 inches of snow was reported
     58 tokens: snow began during the evening hours on the 22nd then continued heavy at times through the 23rd before ending early on the 24th snowfall totals included 277 inches in metuchen 240 inches in east brunswick 230 inches in perth amboy 190 inches in woodbridge 180 inches in milltown 170 inches in highland park and 160 inches in cheesequake
     10 tokens: quarter size hail was reported seven miles west of carpenter
     39 tokens: snowfall of 1 to 4 inches combined with southeast winds gusting around 30 mph at times to produce areas of blowing snow with visibilities lowering locally to below one mile an accumulation of 30 inches was reported at everly
     17 tokens: quarter size hail was reported 4 miles southeast of kentwood the report was relayed by broadcast media
      8 tokens: golfball size hail was reported in downtown franklinton
     17 tokens: a porch roof was blown off a home at louisiana highway 447 and courtney road in walker
     12 tokens: just under four inches of rain fell in 24 hours from thunderstorms
     13 tokens: the webber springs snotel site elevation 9250 ft estimated 17 inches of snow
      7 tokens: trees were blown down on thames street
      9 tokens: between 1 and 6 inches of snow was reported
     11 tokens: four and a half inches of heavy rain fell from thunderstorms
     17 tokens: one quarter to one half of an inch of freezing rain accrual was reported across the county
     20 tokens: a tree was blown down on highway 189 about 5 miles outside of elba power lines were also blown down
      8 tokens: a tree was blown down on highway 162
      8 tokens: trees were blown down on harvey mill road
     18 tokens: a tree was blown down onto a house near the 3200 block of crawfordville highway damage was estimated
     11 tokens: highway j was flooded and there was a high water rescue
      9 tokens: between one and two inches of snow was reported
      9 tokens: between one and two inches of snow was reported
     23 tokens: meteorologist from the 26th operational weather squadron at barksdale air force base reported halfdollar size hail on highway 171 northwest of grand cane
     40 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours a gust to 56 mph was measured at storm lake airport the high winds caused spotty power line and traffic light damage
     12 tokens: a wind gust of 60 mph was measured at wunderground site kflpanam37
     12 tokens: snowfall amounts of up to 2 inches were observed across the county
     24 tokens: heavy rain and snowmelt combined to cause minor flooding on the kennebec river at skowhegan flood stage 35000 cfs which crested at 35168 cfs
     27 tokens: a tenth of an inch of freezing rain was reported in dillon a large tree limb was down on hwy 301 south near the church of god
     13 tokens: the wydot sensor at bordeaux measured a peak wind gust of 61 mph
     17 tokens: a spotter reported visibility of 300 yards at el toro rd and aliso creek in aliso viejo
     44 tokens: lake effect snow showers accumulated to between 2 and 6 inches during the evening hours of january 3rd through midmorning january 4th heaviest across northern and eastern portions of the county reduced visibilities and slick roadways led to a few accidents and school delays
     42 tokens: lake effect snow showers accumulated to between 2 and 5 inches during the evening hours of january 3rd through midmorning january 4th heaviest across western portions of the county reduced visibilities and slick roadways led to a few accidents and school delays
     28 tokens: a couple tenths of an inch of freezing rain accrual was reported across the county in addition snowfall sleet amounts of around one half of an inch fell
     11 tokens: a 59 mph wind gust was measured at the cleveland awos
      8 tokens: a tree was blown down on morris road
     11 tokens: a tree fell on a home around 2273 highway 15 south
     46 tokens: lake effect snow showers accumulated to between 2 and 4 inches during the late evening hours of january 3rd through midmorning january 4th heaviest across northwest portions of the county reduced visibilities and slick roadways led to a few accidents and school delays across the region
     14 tokens: a 24 hour storm total rainfall of 381 inches was reported near ash grove
     19 tokens: quarter size hail fell at the intersection of north street and highway 224 on the north side of nacogdoches
     14 tokens: flash flooding washed out portions of county road 19 north of carter canyon road
     16 tokens: a tree was blown down onto county road 5 near the intersection with county road 245
     13 tokens: several trees uprooted along highway 231 between the cities of cleveland and oneonta
     13 tokens: several trees uprooted and power pole downed causing structural damage to a building
     40 tokens: northerly winds gusting to near 50 mph combined with existing snow cover to cause areas of blowing snow with visibilities lowering to below a half mile at times new snow of less than a half inch fell during this time
     29 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours the high winds caused spotty power line and traffic light damage
     15 tokens: law enforcement reported a funnel cloud near the intersection of us 98 and conners highway
     18 tokens: flooding reported in plaza del caribe flood waters reached the doors of the vehicles in the parking lot
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
     26 tokens: woodland high school lunchroom roof lifted off along with numerous trees uprooted and power lines downed in the town of woodland trees uprooted across randolph county
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
     49 tokens: temperatures reaching daytime highs of 95 to 100 after a period of cool weather were accompanied by humid conditions with the heat index rising to a little above 100 degrees an unknown number of people suffered from heat stress heat exhaustion or dehydration as reported by hospital emergency rooms
     20 tokens: thunderstorm winds caused tree damage including several branches blown down tree debris damaged power lines and caused a power outage
     49 tokens: reports of slideoffs and accidents along with school delays were common on january 12th due to snow and blowing snow snow accumulations across the country generally ranged between 2 and 3 inches the accumulating snow combined with temperatures falling into the teens and reduced visibilities created difficult driving conditions
     49 tokens: reports of slideoffs and accidents along with school delays were common on january 12th due to snow and blowing snow snow accumulations across the country generally ranged between 2 and 3 inches the accumulating snow combined with temperatures falling into the teens and reduced visibilities created difficult driving conditions
     23 tokens: six utility poles bent on 4000n between 6000e and 8000e and a half dozen trees downed radar indications are this was a microburst
     13 tokens: a tree was blown down along shell point road and spring creek highway
      5 tokens: heavy rain fell from thunderstorms
     19 tokens: trees were taken down at the intersection of 202 and route 10 in morris plains due to thunderstorm winds
     10 tokens: there were 2 reports of trees down in quitman county
      7 tokens: trees were blown down on highway 216
      8 tokens: trees were blown down along val del road
     15 tokens: a public report indicated pea to penny sized hail near majors field in greenville tx
     49 tokens: reports of slideoffs and accidents along with school delays were common on january 12th due to snow and blowing snow snow accumulations across the country generally ranged between 1 and 3 inches the accumulating snow combined with temperatures falling into the teens and reduced visibilities created difficult driving conditions
     11 tokens: golf ball size hail was reported six miles west of hemingford
     25 tokens: heavy rain and snow melt combined to cause minor flooding on the presumpscot river at westbrook flood stage 150 ft which crested at 1717 ft
     13 tokens: trees and wires down on western ave in morristown due to thunderstorm winds
      5 tokens: heavy rain fell from thunderstorms
     10 tokens: trees and power lines were blown down on highway 52
     23 tokens: downed tree blocking the road along the 500 block of ridgewood road radar estimated winds in excess of 60 mph in the vicinity
     17 tokens: a tree was blown down on county road 58 near the border of franklin and gulf counties
     12 tokens: a tree was blown down at courtney grade road near puckett road
     10 tokens: a tree was taken down due to thunderstorm wind gusts
     36 tokens: near lake darby 34 inches of snow was measured a social media post from north of grove city showed that 3 inches of snow fell there the port columbus international airport recorded 23 inches of snow
      5 tokens: reported by ew8188 in frederick
     13 tokens: the squaw peak raws recorded a gust to 77 mph at 130239 pst
      9 tokens: hail from a thunderstorm was estimated at 75 inches
     21 tokens: liberty county dispatch reported power lines down at the intersection of highway 84 and leroy coffer highway due to gusty winds
     24 tokens: liberty county dispatch reported a tree and a power line down at the intersection of highway 17 and phillips road due to gusty winds
     16 tokens: the juniper creek raws recorded wind chill temperatures ranging from 16f to 24f during this interval
     22 tokens: the flynn prairie raws recorded several gusts exceeding 57 mph during this interval the peak gust was 67 mph at 170713 pst
     13 tokens: dutch harbor asos experienced a peak gust to 82 knots during this time
     10 tokens: several large trees were taken down due to thunderstorm winds
      6 tokens: thunderstorm winds took down numerous trees
     19 tokens: three inchesof snow was measured 5 miles south of heath a spotter measured 26 inches of snow in johnstown
     17 tokens: northeast of troy a spotter measured 28 inches of snow southwest of town odot measured 2 inches
      5 tokens: reported by dw3148 falling waters
     10 tokens: w99 reported reduced visibilities of one quarter mile or less
     17 tokens: the umpqua offshore buoy indicated heavy swell that likely generated high surf along the southern oregon coast
     17 tokens: the port orford buoy indicated heavy swell that likely generated high surf along the southern oregon coast
     17 tokens: the port orford buoy indicated heavy swell that likely generated high surf along the southern oregon coast
     15 tokens: offshore buoys indicated heavy swell that likely generated high surf along the southern oregon coast
     16 tokens: a power line was blown down at highway 71 and industrial road monetary damage was estimated
      8 tokens: weatherflow measured thunderstorm wind gust of 64 mph
     14 tokens: an nws employee measured hail to the size of 125 inches during a thunderstorm
     16 tokens: in plymouth warren avenue was closed at the entrance to plymouth beach due to coastal flooding
     30 tokens: a coop observer reported 4 inches of snow new snow in waterville wa the new snow fell between noon on the 19th and 8am on the 20th of january 2015
     12 tokens: several pictures of golf ball size hail was posted to social media
     15 tokens: a measured wind gust of 56 knots occurred with a thunderstorm at a weatherflow site
     25 tokens: em reported 56 inches of snow in marysville early thursday morning the southwest half of marshall county received 57 inches of snow from this storm
    100 tokens: isolated thunderstorms developed and moved north across the greater phoenix metropolitan area during the late afternoon and early evening hours on july 18th one of the stronger storms moved across phoenix sky harbor airport and generated gusty and damaging winds according to a county official with the city of phoenix at 1825mst damaging thunderstorm outflow winds pulled the roof off of an apartment complex located at the intersection of 26th street and van buren street the apartment complex was about 1 mile northwest of the airport peak wind gusts were estimated to be about 70 knots no injuries were reported
      8 tokens: near circleville an inch of snow was measured
      7 tokens: trees were blown down east of somerset
      8 tokens: three inches of snow was measured near wapakoneta
     63 tokens: a coop observer reported 143 inches of new snow from this passing storm system other new snow amounts associated with this storm system include 82 inches at mazama and 5 inches 12 miles northwest of entiat wa the snow started near 10 pm wednesday reached warning criteria amounts near 10am on thursday the 21st before diminishing substantially early friday morning of january 222016
     11 tokens: two and a half inches of snow was measured near greenville
     14 tokens: a report from east of pickerington showed that 2 inches of snow had fallen
     23 tokens: heavy snotel amounts were 9 inches at dollarhide 11 inches at galena summit 9 inches at hyndman and 10 inches at swede peak
     15 tokens: snowfall amounts were estimated to be between 20 and 30 inches based on reports nearby
      9 tokens: just over two inches of rain fell from thunderstorms
     11 tokens: there were numerous reports of downed trees in the haysi vicinity
     16 tokens: broadcast media reported a tree blown down on a home 2 miles north of mead wa
     35 tokens: two weather stations along the south washington coast reported a few hours of sustained winds between 49 and 51 mph in the early morning a peak gust of 61 mph was measured at cape disappointment
     14 tokens: a member of the public reported 6 inches of snow in east wenatchee wa
     14 tokens: snowfall totaled up to 243 inches near keyser and 235 inches near short gap
     18 tokens: thunderstorm winds caused tree damage including several branches blown down the winds were accompanied by penny size hail
     10 tokens: northwest of chillicothe a half inch of snow was measured
     13 tokens: spotters in trenton and southeast of oxford both measured 2 inches of snow
     47 tokens: numerous trees were downed from a storm that would go on to produce a tornado at nugget lake see separate entry for the tornado local law enforcement officials also indicated a trailer house was blown over at highway 10 and 490th avenue in the town of salem
     11 tokens: snow accumulated 2 to 5 inches including 40 inches near pipestone
     21 tokens: snow amounts up to 15 inches were measured across warren county cocorahs station mcminnville 85 ese measured 14 inches of snow
      9 tokens: trees and wires taken down due to thunderstorm winds
     32 tokens: areas of both low visibilities from fog and icy surfaces from freezing drizzle combined to make travel hazardous from the night of january 6th to the early daylight hours of january 7th
     12 tokens: a coop observer reported 71 inches of snow in holden village wa
     41 tokens: measurements and estimates of 4 to 8 inches of snow were received across pottawatomie county the heaviest snow occurred across the northwest half of the county 6 inches was measured by coop in blaine in the northern parts of the county
     32 tokens: areas of both low visibilities from fog and icy surfaces from freezing drizzle combined to make travel hazardous from the night of january 6th to the early daylight hours of january 7th
     32 tokens: areas of both low visibilities from fog and icy surfaces from freezing drizzle combined to make travel hazardous from the night of january 6th to the early daylight hours of january 7th
     12 tokens: several trees and power wires taken down due to thunderstorm wind gusts
     18 tokens: snow accumulated 2 to 5 inches over the southeastern part of yankton county including 32 inches at yankton
      6 tokens: law enforcement reported several trees down
     12 tokens: power lines were blown down on the 4100 block of jordan ave
     12 tokens: local media reported trees and power lines were down in oakland township
     10 tokens: five people died in las vegas of heat related causes
     10 tokens: a woman died of heat related causes in death valley
      7 tokens: eight homes were flooded in dolan springs
     18 tokens: over 9 inches of snow was reported in alamo the heavy wet snow resulted in scattered power outages
      5 tokens: emergency management reported trees down
      8 tokens: a trained spotter reported trees and wires down
      5 tokens: the public reported trees down
     23 tokens: a tree was down on south shore road near old forge in the town of webb blocking the roadway due to thunderstorm winds
      8 tokens: the sacramento wash flooded the oatman topock highway
     11 tokens: a wind gust to 65 mph was reported at joplin 3sw
     37 tokens: frequent wind gusts of 40 to 50 mph resulted in numerous trees down across vance county including on homes cars and power lines numerous customers lost power in vance county as a result of the strong winds
     28 tokens: the stuart airport awos ksua recorded peak wind gusts of 35 knots as a strong thunderstorm crossed the coast and continued across the intracoastal and nearshore atlantic waters
     11 tokens: one to three inches of snowfall with some light ice accretion
      6 tokens: emergency management reported numerous trees down
     16 tokens: highway 95 was impassable from vidal junction to mile marker 24 due to flooding and debris
     35 tokens: usaf wind tower 0300 recorded a peak gust of 41 knots from the southwest as a strong thunderstorm exited merritt island and continued across the banana river barrier island and into the nearshore atlantic waters
     32 tokens: the awos at the new smryna beach airport kevb reported winds up to 38 knots from the westsouthwest as a strong thunderstorm exited the coast and continued over the nearshore atlantic waters
     37 tokens: the vero beach airport asos kvrb measured a gust to 34 knots from the southsouthwest as a line of strong thunderstorms exited the mainland and continued rapidly east across the intracoastal waterways barrier islands and nearshore atlantic
     14 tokens: eight inches of snow was reported in barryton 73 inches was reported in sylvester
     12 tokens: fourteen inches of snow fell in riverdale ten inches fell in alma
     14 tokens: the interstate 80 at grassey sensor reported a peak wind gust of 58 mph
     11 tokens: quarter size hail was reported near p highway near rocky point
     34 tokens: an estimated 35 inches of rain fell causing water to flow over roadways at highway 132 and hayward stabe road and rupe imo and skeleton wood and imo and wheat capital and highway 132
     14 tokens: thunderstorm winds snapped a one to two foot diameter tree at chaparral high school
     10 tokens: mud and debris were on interstate 15 at exit 64
     12 tokens: a wind gust to 74 mph was reported at gallatin gateway 16se
     58 tokens: torrential rainfall of 12 to 15 inches caused widespread flash flooding across the county the heavy rains caused at least 8 dams to breach in cumberland county numerous roads were closed due to flooding including portions of interstate 95 numerous homes and businesses were flooded as well with numerous water rescues from people trapped in homes and vehicles
     19 tokens: measured wind gusts of 40 to 45 mph knocked down isolated tree limbs that resulted in isolated power outages
     15 tokens: a tree was reported down on black hollow road in arlington due to thunderstorm winds
     16 tokens: a trained weather spotter observed pennysized hail falling near state roads 50 and 429 in ocoee
     24 tokens: a foot of snow was reported in comstock park there were numerous reports of ten to eleven inches of snow across southern kent county
     22 tokens: local emergency management relayed a report of a tree down southwest of somerset shingles were also blown off of a roof nearby
      8 tokens: carpet barn road was closed due to flooding
     10 tokens: highway e near barker creek was closed due to flooding
      6 tokens: flash flooding covered kelso cima road
     24 tokens: approximately 10 vehicles were stuck in flood waters at david drive and river drive sections of needles highway near capri road also washed away
     20 tokens: a light pole was blown down on the neil street onramp to westbound i74 in champaign at around 1400 cst
      6 tokens: lightning set fire to a house
     13 tokens: street flooding was reported at orange street and market street near hogans creek
     10 tokens: a trained spotter measured a wind gust of 70 mph
      9 tokens: several large tree limbs were blown down in gainesville
     15 tokens: an nws employee reported heavy freezing rain causing very icy conditions along the glenn highway
     39 tokens: frequent wind gusts of 30 to 40 mph resulted in multiple reports of trees down across person county including on homes cars and power lines some customers lost power in person county as a result of the strong winds
     15 tokens: a wind gust to 67 mph was reported at bynum 13w the dellwo mcscn site
     57 tokens: a short tornado track was determined along cr53 just north of its intersection with cr26 this was a concentrated area of damage with large trees uprooted and snapped near a residence one of the trees had a small amount of debarking with large limbs removed this tornado was rated ef0 with max winds estimated at 85 mph
     19 tokens: numerous trees were blown down in the area along with numerous power outages reported by the wiregrass electric coop
     14 tokens: a severe thunderstorm producing winds estimated near 60 mph knocked down trees near karthus
     13 tokens: wires were reported down on route 41 in sheffield due to thunderstorm winds
     10 tokens: golf ball size hail fell 1 mile south of alanreed
     14 tokens: trees were blown down and roofs and siding were damaged in the laughlin area
     18 tokens: winds caused isolated damage removing the roof from a trailer home a wall came down with the roof
     86 tokens: this was the second tornado to develop in northwest houston county spawned by the same parent thunderstorm after initially developing in houston county the tornado crossed into extreme southeast dale county before moving back into houston county in the murphy mill road area there was a small area of ef1 damage along murphy mill road where many large diameter pine trees were snapped and uprooted this tornado likely lifted before reaching us highway 231 this tornado was rated ef1 with max winds estimated near 100 mph
     14 tokens: bar pilot dispatcher reported a brief waterspout in the columbia river no damage reported
     12 tokens: public reported thor road flooding near track road just south of pelion
      7 tokens: penny size hail was reported via mping
     33 tokens: usaf wind tower 1007 recorded a peak wind gust over 35 knots near playalinda beach as a strong squall line exited the peninsula and raced eastward across the intracoastal and nearshore coastal waters
     23 tokens: the asos at vero beach airport kvrb recorded peak winds of 38 knots as a strong squall line passed by and continued offshore
     11 tokens: estimated wind gust of 60 mph reported north of hazel green
     10 tokens: the wind gust was measured by a davis weather system
     17 tokens: a few dime to quarter sized hailstones fell along with brief heavy rain and very strong winds
     10 tokens: a large tree was downed in dartmouth blocking reed road
     16 tokens: a tree was snapped off at its base and the fordville scale house was blown down
     17 tokens: the grand canyon airport asos measured a peak wind gust of 59 mph at 207 pm mst
     21 tokens: trees were toppled and power lines brought down by wind gusts estimated at up to 60 mph the time is estimated
     93 tokens: weather observers across cumberland county reported snowfall amounts of 3 to 5 inches winds gusting to between 45 and 55 mph created whiteout conditions from 1000 to 1400 cst snowcovered roads and poor visibility due to falling and blowing snow contributed to numerous traffic accidents across the county especially on i57 a fatal traffic accident occurred on il130 south of greenup when a semi truck collided with another vehicle a 57 yearold male in the vehicle was killed in addition many trees and power lines were blown down resulting in scattered power outages
     11 tokens: local media relayed a report of roof damage to a home
     19 tokens: heavy rainfall over southern sections of alexandria produced flooded roadways some roadways had 2 feet of water over them
     39 tokens: strong north winds behind a cold front pushed the tide levels to or below 1 mllw for 2 tide cycles at sabine pass the tide fell to a lowest level of 19 mllw during the morning of the 24th
     21 tokens: two to three inches of snow and gusty southeast winds up to 25 mph created snow covered roads and hazardous travel
     68 tokens: weather observers across edgar county reported snowfall amounts of 4 to 6 inches winds gusting to between 40 and 50 mph created whiteout conditions from 1000 to 1300 cst snowcovered roads and poor visibility due to falling and blowing snow contributed to numerous traffic accidents across the county especially on us150 and us36 in addition many trees and power lines were blown down resulting in scattered power outages
     11 tokens: a picutre of quarter size hail was received through social media
     13 tokens: multiple power poles were knocked down along patton road relayed via social media
      8 tokens: power lines were knocked down on huntsville road
     20 tokens: a large tree was knocked down and blocking the road on mt olive drive at the intersection of section road
     14 tokens: a tree was knocked down along al 277 in stevenson time estimated by radar
      7 tokens: funnel cloud reported did not touch down
      9 tokens: strong winds hit the grand forks air force base
      8 tokens: a tree was knocked down onto a home
     21 tokens: a 30 by 40 foot section of metal roofing was blown onto the intersection of miller and gray roads in gurley
     21 tokens: two to three inches of snow and gusty southeast winds up to 25 mph created snow covered roads and hazardous travel
     21 tokens: two to three inches of snow and gusty southeast winds up to 25 mph created snow covered roads and hazardous travel
      8 tokens: trees were knocked down on paint hollow road
      7 tokens: trees were knocked down on bellview road
      7 tokens: trees were knocked down on blanche road
     12 tokens: large trees were downed by severe storm winds in the spring area
     22 tokens: there was street flooding in the town of coldspring there was also water inundating highway 59 south of the town of goodrich
      8 tokens: trees were blown down across county road 65
     22 tokens: a social media post from haydenville showed that 7 inches of snow fell there the cooperative observer in laurelville measured 4 inches
     10 tokens: winds damaged a lightweight tin roof fences and utility poles
     42 tokens: polk county fire rescue reported that multiple 911 calls were received of a tornado briefly touching down in the lake wales area a few trees were found knocked over and two power poles were partial damaged but no structural damage was reported
      8 tokens: several trees uprooted in the town of vincent
     43 tokens: a spotter west of hebron reported 4 inches of snow in that area another near union had 32 inches while a third spotter and broadcast media reported 3 inches fell near burlington and francisville respectively the cvg airport recorded 27 inches of snow
     10 tokens: several trees uprooted in and near the cedar bluff community
     14 tokens: six large trees were knocked down along cr 23 between red bay and vina
     41 tokens: the cocorahs observer southwest of bethel measured 5 inches of snow a spotter north of williamsburg measured 45 inches of snow a nws employee in goshen had 4 inches of accumulation while the odot county garage near amelia had 35 inches
     25 tokens: the airport at kcvg measured a 47 mph gust as did a cwop station in burlington numerous trees were blown down causing significant power outages
     10 tokens: a peak wind of 52 kt 60 mph was reported
     25 tokens: the wind sensor at the rawlins airport measured sustained winds of 40 mph or higher with a peak gust of 60 mph at 151253 mst
     13 tokens: several trees uprooted along highway 43 between tierece road and old fayette road
     18 tokens: the observer near new carlisle measured 2 inches of snow another observer north of springfield measured an inch
     12 tokens: several trees uprooted and power lines downed in the coates bend community
     40 tokens: a public report southeast of washington court house showed that 6 inches of snow fell there a social media post from new martinsburg had 5 inches of snow while the cooperative observer south of washington court house measured 4 inches
     10 tokens: the cooperative observer near alpine measured 38 inches of snow
     37 tokens: a nws employee near ogden measured 3 inches of snow another employee north of wilmington and the nws office south of town both measured 23 inches while the odot county garage measured 13 inches west of burtonville
     12 tokens: the odot county garage west of springfield measured an inch of snow
     12 tokens: numerous trees uprooted and power lines downed in the city of wetumpka
     16 tokens: the nedor sensor at dalton on highway 385 measured sustained winds of 40 mph or higher
     17 tokens: the nedor sensor at interstate 80 mile post 50 measured sustained winds of 40 mph or higher
      9 tokens: trace amounts of ice were reported around the county
     11 tokens: this wind gust was measured at a lavaca bay mesonet site
      3 tokens: no damage reported
     16 tokens: the public estimated 075 inch hail in wind point and relayed their report via social media
     17 tokens: a home weather station near new port richey measured a wind gust to 48 knots 55 mph
     20 tokens: rainfall totals generally ranged from 5 to 9 inches across the county franklin airport fkn reported 878 inches of rain
     32 tokens: snow melt and around an inch of rainfall produced an ice jam on the kennebec river at augusta flood stage 120 ft resulting in minor flooding and a crest of 1435 ft
     21 tokens: blizzard conditions were estimated based on observations nearby snowfall reports between 19 and 39 inches were received across southeastern montgomery county
     23 tokens: the patrick air force base awos kcof recorded a peak gust of 34 knots from the northwest as a strong thunderstorm moved offshore
     16 tokens: flash flooding was reported at stevens and hazelwood in borger barricades were setup in those locations
     18 tokens: a home weather station located on indian shores beach measured a wind gust to 39 knots 45 mph
     15 tokens: a home weather station near belleair measured a wind gust of 38 knots 44 mph
     90 tokens: torrential rainfall of 8 to 12 inches caused widespread flash flooding across the county additional heavy rainfall upstream caused moderate flooding along the cape river basin flooding damaged approximately 744 structures throughout the county resulting in 91 million in property damage numerous streets and roads were reported flooded including interstate 95 near dunn with several washouts reported on secondary roads the flooding resulted in 1 direct fatality a 74 year old man died when he drove past a barricade near carolina drive and was swept away into a flooded creek
     12 tokens: visibility was estimated to be around onequarter mile based on observations nearby
     17 tokens: a usgs rain gauge near lakewood ranch measured 752 inches of rain in a 6hr time period
     34 tokens: rainfall totals generally ranged from 3 to 6 inches across the county stampers reported 527 inches of rain healys 1 sse reported 426 inches of rain remlik 1 n reported 371 inches of rain
     58 tokens: heavy rainfall of 7 to 10 inches caused widespread flash flooding across the county roads all throughout the county were closed due to flooding numerous homes and businesses were flooded as well with numerous water rescues from people trapped in homes and vehicles flooding damaged approximately 2503 structures throughout the county resulting in 655 million in property damage
    149 tokens: torrential rainfall of 10 to 14 inches caused widespread flash flooding across the county additional rainfall upstream caused alltime record major flooding along the black river near tomahawk flooding damaged approximately 657 structures throughout the county resulting in 41 million in property damage and and at least 25 million in crop damage numerous roads were flooded all througout the county us 701 was flooded going into both newton grove and garland and nc 24 was flooded between turkey and clinton nc 24 was closed at the county line in autryville with water flowing over the bridge bonnetsville road between salemburg and the avenue was washed out washedout areas were also on edmond matthis road bass lake road mount moriah church road five bridge road fleet cooper road and numerous others numerous homes and businesses were flooded as well with numerous water rescues from people trapped in homes and vehicles
     44 tokens: heavy rainfall of 9 to 12 inches caused widespread flash flooding across the county numerous roads were closed due to flooding numerous homes and businesses were flooded as well flooding damaged approximately 433 structures throughout the county resulting in 31 million in property damage
    197 tokens: torrential rainfall of 9 to 12 inches caused widespread flash flooding across the county additional 5 to 6 inches of rainfall upstream caused alltime record major flooding along the neuse river basin flooding damaged approximately 1160 structures throughout the county resulting in 247 million in property damage and 20 million in crop damage numerous streets and roads were reported flooded causing sinkholes to form including a large sinkhole at mile marker 334 on interstate 40 the flooding resulted in 4 direct fatalities a 19 year old female died when her car was swept away by flood waters into hannah creek on interstate 95 at mile marker 83 near four oaks a 30 year old male died when his vehicle was swept off the road when attempting to drive through flood waters on cornwallis road near nc highway 42 a 67 year old male died when his vehicle was swept away when attempting to go across a floodcovered bridge on highway 210 near galilee road a 51 year old male died when he was …
     21 tokens: the tidal gauge at annapolis indicated moderate flooding water levels through the storm drains approached businesses on dock street in annapolis
     18 tokens: heavy rainfall of 5 to 6 inches caused widespread flash flooding across the county with numerous road closures
    130 tokens: torrential rainfall of 9 to 12 inches caused widespread flash flooding across the county additional heavy rainfall upstream caused major flooding along the tar river basin and along contentnea creek flooding damaged approximately 1174 structures throughout the county resulting in 323 million in property damage and 20 million in crop damage numerous streets and roads were reported flooded with several washouts reported on secondary roads the flooding resulted in 2 direct fatalities a 51 year old female died when the car she was driving was swept off the road in rushing floodwaters along nc highway 581 between renfro road and rock ridge a 65year old male died when his car was swept away by swift water in a creek near the 6400 block of good news church road near saratoga
     41 tokens: rainfall totals generally ranged from 5 to 11 inches across the county benns church 1 wsw reported 1038 inches of rain smithfield reported 883 inches of rain comet reported 870 inches of rain carrollton 2 ese reported 668 inches of rain
     27 tokens: rainfall totals generally ranged from 7 to 10 inches across the county norfolk international airport orf reported 924 inches of rain norview reported 910 inches of rain
     22 tokens: heavy rainfall caused street flooding in rhinelander mainly west of the wisconsin river in the area of davenport street and maple street
     28 tokens: rainfall totals generally ranged from 3 to 6 inches across the county mollusk 1 se reported 398 inches of rain kilmarnock 1 sw reported 311 inches of rain
     30 tokens: rainfall totals generally ranged from 1 inch to 3 inches across the county louisa 1 nnw reported 151 inches of rain zion crossroads 1 nne reported 114 inches of rain
     12 tokens: rainfall totals generally ranged from 2 to 4 inches across the county
     36 tokens: the mount pleasant police department reported longpoint road near needlerush parkway closed due to saltwater flooding at 748 am est a maximum tide level of 771 ft mllw was recorded at the charleston harbor tide gauge
      9 tokens: trace amounts of ice were reported around the county
     11 tokens: severe storm winds caused tree damage in the town of deanville
     18 tokens: there were numerous reports of trees and power lines down throughout dickenson county especially between clincho and haysi
     37 tokens: wind chills of 35 to 40 below zero were common across olmsted county on the morning of january 17th the lowest recorded wind chill by the automated weather observing equipment at the rochester airport was 41 below
     12 tokens: there were a few trees down in the county including in grundy
     12 tokens: flash flooding was reported along john b carter road southeast of fayetteville
     19 tokens: blizzard conditions were reported at reagan national airport snowfall reports were between 18 and 26 inches across arlington county
      5 tokens: thunderstorm winds damaged a fence
     10 tokens: just over two inches of rain fell due to thunderstorms
     13 tokens: several roads closed due to flash flooding with some debris washed into roadways
     11 tokens: a trained spotter estimated 60 mile per hour winds in bridgeport
     20 tokens: a public report of quarter size hail in crossroads was relayed by broadcast media event time was estimated by radar
      8 tokens: snowfall totaled up to 225 inches near dayton
     13 tokens: thunderstorm winds caused tree damage including a large tree blown across a road
     12 tokens: a brief waterspout over northern sarasota bay was reported by the public
     15 tokens: thunderstorm winds blew the roof off a mobile home and also blew down power lines
     14 tokens: a wind gust of 58 mph was recorded at the judith gap dot site
     11 tokens: windows were knocked out at the tom steed reservoir bait shop
     10 tokens: a gust of 61 mph was recorded across the area
     12 tokens: a wind gust of 59 mph was recorded at the baker airport
     11 tokens: thunderstorm winds destroyed two grain bins and damaged a light pole
     70 tokens: a nws survery crew found 25 homes that sustained damage mainly to pool cages roofs garages and carports numerous tree limbs were snapped at the top of the trees with a few being uprooted a few business signs in the area were damaged or destroyed sporadic damage was found along the 3 mile path likely indicating the tornado may have lifted off the ground a time or two before dissipating
     11 tokens: branches were reported down in the northern end of pocahontas county
      9 tokens: the butler awos reported a wind chill of 12
     32 tokens: two to six inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county an isolated report or two exceeded 7 inches
     27 tokens: less than an inch to two inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county
     13 tokens: the asos at columbia metro airport reported a wind gust of 51 mph
      9 tokens: almost three inches of heavy rainfall fell with thunderstorms
      7 tokens: several trees downed due to thunderstorm winds
     60 tokens: between 18 and 30 inches of snow fell fell near the sierra crest and in the higher elevations south of lake tahoe at lake level periods of rain or rain mixed with snow cut down totals greatly with only 9 inches of snow in tahoma and just under 6 inches at the south lake tahoe airport and in tahoe city
     18 tokens: nickel to quarter sized hail fell and nearly covered the ground over two inches of rain also fell
     11 tokens: two and a half inches of rain fell due to thunderstorms
      7 tokens: over two inches fell due to thunderstorms
      8 tokens: several trees taken down due to thunderstorm winds
      9 tokens: hail with a thunderstorm was measured at 75 inches
      9 tokens: hail was measured at 34 inch from a thunderstorm
     28 tokens: snowfall amounts of 6 to 7 inches were measured above the 5000 foot level wind gusts of 25 to 35 mph produced areas of blowing and drifting snow
     26 tokens: snowfall amounts of 6 to 10 inches were measured across the area wind gusts of 20 to 35 mph produced areas of blowing and drifting snow
     27 tokens: snowfall amounts of 1 to 3 inches were measured across the area wind gusts of 20 to 30 mph produced some areas of blowing and drifting snow
     17 tokens: county comms reported multiple trees and power lines blown down near highway 74 and old fort rd
     31 tokens: heavy rain resulted in flash flooding at a couple of locations in asheboro colony road and the intersection of patton avenue and thomas street were briefly closed due to high water
     14 tokens: two trees were blown down at a residence approximately 4 miles northnortheast of enfield
      8 tokens: one tree was reported down on morganton road
     24 tokens: four to six inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county
     24 tokens: the wydot sensor at dana ridge measured sustained winds of 40 mph or higher with a peak gust of 60 mph at 291430 mst
      7 tokens: several trees downed due to thunderstorm winds
     11 tokens: trees and wires downed on centerville road due to thunderstorm winds
     38 tokens: snowfall amounts between 25 and 38 inches were received across frederick county snowfall totaled up to 38 inches near gainesboro a snowfall report of 350 inches was received near stephens city and 245 inches was reported in middletown
      8 tokens: quarter to ping pong ball sized hail fell
      7 tokens: flooding was reported on mirror lake drive
     12 tokens: lightning struck a tree which fell on a house damaging several rooms
     32 tokens: two to six inches of snow fell across the region the larger totals were in higher elevations and in northern sections of the county an isolated report or two exceeded 7 inches
     10 tokens: several trees taken down due to thunderstorm winds in bridgeton
     10 tokens: numerous trees taken down due to thunderstorm winds in fairton
     10 tokens: several trees down on straughn mill road near interstate 295
      9 tokens: almost three inches of rain was measured with thunderstorms
      7 tokens: trees taken down by thunderstorm wind gusts
      6 tokens: a house was struck by lightning
     10 tokens: a 63 mph wind gust was measured from a thunderstorm
      9 tokens: a funnel cloud was observed at 9148 centreville road
     13 tokens: public reported heavy rainfall of 211 inches so far beginning time radar estimated
     13 tokens: power pole and wires taken down due to thunderstorm winds trees also downed
      8 tokens: hail was estimated at 1 inch in diameter
     37 tokens: the department of highways relayed a report of flash flooding at highway 20 four miles west of loup loup summit a debris flow went across the road roughly 6 miles east of twisp wa on highway 20
      8 tokens: trees and wires downed on bunker hill road
     12 tokens: a 53 mph thunderstorm wind gust was measured by a weatherflow site
     19 tokens: severe thunderstorm wind gusts around 60 mph downed trees along south mt pleasant avenue between monroeville and highway 84
      9 tokens: severe thunderstorm wind gusts downed trees along oakley road
     12 tokens: numerous wires were reported down at route 27 at davils mill rd
     15 tokens: fd reported a tree blown down on a home causing significant damage on lakeside loop
     18 tokens: one shallow rooted oak tree was blown over wind speeds were estimated to be 60 miles per hour
     22 tokens: snow accumulated 3 to 6 inches including 60 inches near pukwana the snow caused slippery roads which resulted in a few accidents
     48 tokens: a bow echo producing winds estimated at 80 mph produced a corridor of wind damage along and north of straughn school road which is northeast of andalusia numerous trees were uprooted with power lines also downed a tree fell onto a home on country drive causing considerable damage
     11 tokens: a tree fell and damaged utility equipment off of mcdaniel road
     36 tokens: social media reports of at least a couple of dozen trees blown down across far northern iredell county with one on a house causing a brief entrapment the roof of a gas station was also damaged
     47 tokens: county comms and highway patrol reported multiple trees blown down across roads in southwest greenwood county from the intersection of alexander and briarwood rd south to just north and east of bradley part of a roof was reported to be damaged on breezewood rd east of bradley
     39 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours a gust to 63 mph was measured near wessington springs the high winds caused spotty power line and traffic light damage
     15 tokens: a few trees were blown down in the cranfield liberty road area south of cranfield
      9 tokens: public reported quarter size hail on sam dee rd
     63 tokens: the stream gauge on potomac river at point of rocks reached flood stage the gauge peaked at 16 feet at 0015 est the parking lots at both the mckimmey and brunswick boat ramps began to flood flooding of an agricultural field adjacent to the mckimmey boat ramp occurred about half the lower parking lot of the point of rocks boat ramp also flooded
     10 tokens: old charles town road was closed near the opequon creek
     17 tokens: spiky hail around ping pong ball size was reported near the intersection of highways 82 and 319
     39 tokens: light snow began around noon on january 17th then continued through the afternoon hours storm totals included 23 inches near little egg harbor 15 inches in berkeley township 13 inches in brick township and 10 inches in jackson township
      6 tokens: several roads flooded in the area
     20 tokens: dallas center fire department reported hail just under ping pong ball in size mixed with larger amounts of smaller hail
     39 tokens: westerly winds behind a cold front reach sustained speed of 40 to 45 mph for a few hours a gust to 54 mph was measured at le mars the high winds caused spotty power line and traffic light damage
     23 tokens: trees and power lines were blown down in ocilla in addition a house fire resulted from a downed power line damage was estimated
     11 tokens: trees damaged an suv and a mobile home damage was estimated
     10 tokens: spotter reported brief 34 inch hail off old river rd
     10 tokens: spotters reported around half of an inch across the county
     39 tokens: while lingering light snow after a blizzard produced little additional accumulation continuing strong north to northwest winds produced blowing and drifting of the heavy new snowpack through the morning hours difficult to impossible travel conditions slowly began to ease
     12 tokens: a large cedar tree about two feet in diameter was reported down
     23 tokens: multiple large trees uprooted and blown onto power lines resulting in toppled power lines all which resulted in blockage of the entire roadway
     12 tokens: a large tree was reported down across calhoun st in west baltimore
     44 tokens: county comms and public via social media reported multiple trees blown down in the uptown and central city area the damage was centered in the elizabeth neighborhood where multiple trees fell on vehicles and one tree fell on an apartment building along greenway ave
     13 tokens: the wydot sensor at strouss hill measured peak wind gusts of 58 mph
     27 tokens: the wydot sensor at interstate 80 mile post 249 measured sustained winds of 40 mph or higher with a peak gust of 60 mph at 151355 mst

View the first few preprocessed training documents.

documentsTrain(1:5)
ans = 
  5×1 tokenizedDocument:

     7 tokens: large tree down between plantersville and nettleton
    37 tokens: one to two feet of deep standing water developed on a street on the winthrop university campus after more than an inch of rain fell in less than an hour one vehicle was stalled in the water
    13 tokens: nws columbia relayed a report of trees blown down along tom hall st
    13 tokens: media reported two trees blown down along i40 in the old fort area
    14 tokens: a few tree limbs greater than 6 inches down on hwy 18 in roseland

Convert Document to Sequences

To input the documents into an LSTM network, use a word encoding to convert the documents into sequences of numeric indices.

To create a word encoding, use the wordEncoding function.

enc = wordEncoding(documentsTrain);
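
As a minimal sketch of what a word encoding does, the following code maps words to indices and back on a tiny made-up corpus. The names docsDemo, encDemo, idx, and words are illustrative and not part of the original example. The sketch assumes that the Text Analytics Toolbox functions word2ind and ind2word are available for wordEncoding objects.

```matlab
% Hypothetical two-document corpus (not taken from the weather data set).
docsDemo = tokenizedDocument(["a tree fell on a house"; "hail fell"]);

% Build the encoding: each unique word is assigned one integer index.
encDemo = wordEncoding(docsDemo);

% Look up the indices of two words, then map the indices back to words.
idx = word2ind(encDemo,["tree" "hail"]);
words = ind2word(encDemo,idx);
```

The round trip returns the original words, and encDemo.NumWords reports the vocabulary size, which is the value passed to the word embedding layer later in this example.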

The next conversion step is to pad and truncate the documents so that they all have the same length. The trainingOptions function provides options to pad and truncate input sequences automatically. However, these options are not well suited for sequences of word vectors. Instead, pad and truncate the sequences manually. If you left-pad and truncate the sequences of word vectors, then the training might improve.

To pad and truncate the documents, first choose a target length, and then truncate the documents that are longer than it and left-pad the documents that are shorter than it. For best results, the target length should be short without discarding large amounts of data. To find a suitable target length, view a histogram of the training document lengths.

documentLengths = doclength(documentsTrain);
figure
histogram(documentLengths)
title("Document Lengths")
xlabel("Length")
ylabel("Number of Documents")

Most of the training documents have fewer than 75 tokens. Use this as your target length for truncation and padding.
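
As a complement to reading the histogram by eye, a target length can also be estimated numerically. The following sketch picks the smallest length that keeps a chosen percentage of documents untruncated. The vector lengthsDemo and the 90 percent threshold are hypothetical values chosen for illustration, not figures from the original example.

```matlab
% Hypothetical document lengths (in tokens); with real data, use
% the output of doclength(documentsTrain) instead.
lengthsDemo = [7 37 13 13 14 63 10 17 39 6];

% Percentage of documents that should fit without truncation.
coveragePercent = 90;

% Smallest length that covers at least that percentage of documents.
sortedLengths = sort(lengthsDemo);
k = ceil(numel(sortedLengths)*coveragePercent/100);
targetLengthDemo = sortedLengths(k);

% Fraction of documents that are not truncated at this length.
fractionKept = mean(lengthsDemo <= targetLengthDemo);
```

A lower threshold gives shorter sequences and less padding, at the cost of truncating more documents.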

Convert the documents to sequences of numeric indices using doc2sequence. To truncate or left-pad the sequences to have length 75, set the 'Length' option to 75.

XTrain = doc2sequence(enc,documentsTrain,'Length',75);
XTrain(1:5)
ans = 5×1 cell array
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}
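
To illustrate what left-padding and truncation do to a single index sequence, here is a minimal sketch in plain MATLAB. The helper fitLength is hypothetical and is not part of any toolbox. The sketch assumes a padding value of 0 and truncation on the right; consult the doc2sequence documentation for the behavior it actually applies.

```matlab
% Hypothetical helper: left-pad with zeros or truncate on the right
% so that the row vector s has exactly n elements.
fitLength = @(s,n) [zeros(1,max(0,n-numel(s))) s(1:min(n,numel(s)))];

shortSeq = [4 9 2];          % shorter than the target length
longSeq = [5 1 8 3 7 2 6 4]; % longer than the target length

paddedSeq = fitLength(shortSeq,6);    % zeros are added on the left
truncatedSeq = fitLength(longSeq,6);  % trailing elements are dropped
```

With left-padding, the real tokens sit at the end of the sequence, so they are the last inputs the LSTM sees before it emits its final output.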

Convert the validation documents to sequences using the same options.

XValidation = doc2sequence(enc,documentsValidation,'Length',75);

Create and Train LSTM Network

Define the LSTM network architecture. To input sequence data into the network, include a sequence input layer and set the input size to 1. Next, include a word embedding layer of dimension 100 and the same number of words as the word encoding. Next, include an LSTM layer and set the number of hidden units to 180. To use the LSTM layer for a sequence-to-label classification problem, set the output mode to 'last'. Finally, add a fully connected layer with the same size as the number of classes, a softmax layer, and a classification layer.

inputSize = 1;
embeddingDimension = 100;
numWords = enc.NumWords;
numHiddenUnits = 180;
numClasses = numel(categories(YTrain));

layers = [ ...
    sequenceInputLayer(inputSize)
    wordEmbeddingLayer(embeddingDimension,numWords)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]
layers = 
  6x1 Layer array with layers:

     1   ''   Sequence Input          Sequence input with 1 dimensions
     2   ''   Word Embedding Layer    Word embedding layer with 100 dimensions and 16954 unique words
     3   ''   LSTM                    LSTM with 180 hidden units
     4   ''   Fully Connected         39 fully connected layer
     5   ''   Softmax                 softmax
     6   ''   Classification Output   crossentropyex
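
To get a rough sense of the model size, the learnable parameters of each layer can be counted by hand. The sketch below uses the sizes shown in the layer display above and makes two assumptions: the embedding stores one vector per vocabulary word with no extra out-of-vocabulary vector, and the LSTM layer has four gates, each with input weights, recurrent weights, and a bias. The variable names are illustrative.

```matlab
% Sizes taken from the layer display above.
embedDim = 100;      % embedding dimension
vocabSize = 16954;   % unique words in the encoding
hiddenUnits = 180;   % LSTM hidden units
classCount = 39;     % output classes

% Word embedding: one embedDim-vector per word (assumed, no OOV vector).
numEmbedding = embedDim*vocabSize;

% LSTM: 4 gates, each with input weights, recurrent weights, and a bias.
numLSTM = 4*hiddenUnits*(embedDim + hiddenUnits + 1);

% Fully connected: weight matrix plus bias vector.
numFC = classCount*hiddenUnits + classCount;

numTotal = numEmbedding + numLSTM + numFC;
```

Under these assumptions the embedding accounts for about 1.7 million of roughly 1.9 million parameters, so the vocabulary size dominates the size of the model.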

Specify the training options. Set the solver to 'adam', train for 10 epochs, and set the gradient threshold to 1. Set the initial learn rate to 0.01. To monitor the training progress, set the 'Plots' option to 'training-progress'. Specify the validation data using the 'ValidationData' option. To suppress verbose output, set 'Verbose' to false.

By default, trainNetwork uses a GPU if one is available (requires Parallel Computing Toolbox™ and a CUDA® enabled GPU with compute capability 3.0 or higher). Otherwise, it uses the CPU. To specify the execution environment manually, use the 'ExecutionEnvironment' name-value pair argument of trainingOptions. Training on a CPU can take significantly longer than training on a GPU.

options = trainingOptions('adam', ...
    'MaxEpochs',10, ...    
    'GradientThreshold',1, ...
    'InitialLearnRate',0.01, ...
    'ValidationData',{XValidation,YValidation}, ...
    'Plots','training-progress', ...
    'Verbose',false);
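
If you want to force a particular device, the 'ExecutionEnvironment' argument mentioned above can be added to the same call. The following is a minimal sketch that requests CPU training; the variable name optionsCPU is illustrative, and the validation and plotting options are omitted to keep it short.

```matlab
% Same solver settings as above, but explicitly request the CPU.
optionsCPU = trainingOptions('adam', ...
    'MaxEpochs',10, ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.01, ...
    'ExecutionEnvironment','cpu', ...
    'Verbose',false);
```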

Train the LSTM network using the trainNetwork function.

net = trainNetwork(XTrain,YTrain,layers,options);

Test LSTM Network

To test the LSTM network, first prepare the test data in the same way as the training data. Then make predictions on the preprocessed test data using the trained LSTM network net.

Preprocess the test data using the same steps as the training documents.

textDataTest = lower(textDataTest);
documentsTest = tokenizedDocument(textDataTest);
documentsTest = erasePunctuation(documentsTest);

Convert the test documents to sequences using doc2sequence with the same options as when creating the training sequences.

XTest = doc2sequence(enc,documentsTest,'Length',75);
XTest(1:5)
ans = 5×1 cell array
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}
    {1×75 double}

Classify the test documents using the trained LSTM network.

YPred = classify(net,XTest);

Calculate the classification accuracy. The accuracy is the proportion of labels that the network predicts correctly.

accuracy = sum(YPred == YTest)/numel(YPred)
accuracy = 0.8691
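
The accuracy calculation can be checked on a small hand-made case. The sketch below uses hypothetical label vectors (YTestDemo and YPredDemo are not from the data set) and also lists which observations were misclassified, which is often a useful next step after looking at the overall number.

```matlab
% Hypothetical true and predicted labels for four reports.
YTestDemo = categorical(["Hail" "Flash Flood" "Hail" "Thunderstorm Wind"]);
YPredDemo = categorical(["Hail" "Flash Flood" "Thunderstorm Wind" "Thunderstorm Wind"]);

% Proportion of predictions that match the true labels.
accuracyDemo = sum(YPredDemo == YTestDemo)/numel(YPredDemo);

% Indices of the observations that the predictions got wrong.
missedIdx = find(YPredDemo ~= YTestDemo);
```

Here three of the four predictions match, and the third observation is the one that was misclassified.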

Predict Using New Data

Classify the event type of three new weather reports. Create a string array containing the new weather reports.

reportsNew = [ ...
    "Lots of water damage to computer equipment inside the office."
    "A large tree is downed and blocking traffic outside Apple Hill."
    "Damage to many car windshields in parking lot."];

Preprocess the text data using the same preprocessing steps as the training documents.

documentsNew = preprocessText(reportsNew);

Convert the text data to sequences using doc2sequence with the same options as when creating the training sequences.

XNew = doc2sequence(enc,documentsNew,'Length',75);

Classify the new sequences using the trained LSTM network.

[labelsNew,score] = classify(net,XNew);
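
The second output of classify holds one row of class scores per observation. As a minimal sketch of how such a matrix relates to the predicted labels, the code below uses a hypothetical score matrix and a hypothetical list of three class names; with the trained network, the columns would correspond to the classes of the classification layer.

```matlab
% Hypothetical class names and scores for two reports (rows sum to 1).
classNamesDemo = ["Flash Flood" "Hail" "Thunderstorm Wind"];
scoresDemo = [0.70 0.10 0.20
              0.05 0.15 0.80];

% The predicted class is the column with the highest score in each row.
[topScore,topIdx] = max(scoresDemo,[],2);
topLabel = classNamesDemo(topIdx);
```

The value in topScore can serve as a rough confidence indicator, for example to flag reports whose highest score is low.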

Show the weather reports with their predicted labels.

[reportsNew string(labelsNew)]
ans = 3×2 string array
    "Lots of water damage to computer equipment inside the office."      "Flash Flood"      
    "A large tree is downed and blocking traffic outside Apple Hill."    "Thunderstorm Wind"
    "Damage to many car windshields in parking lot."                     "Hail"             

Preprocessing Function

The function preprocessText performs these steps:

  1. Tokenize the text using tokenizedDocument.

  2. Convert the text to lowercase using lower.

  3. Erase the punctuation using erasePunctuation.

function documents = preprocessText(textData)

% Tokenize the text.
documents = tokenizedDocument(textData);

% Convert to lowercase.
documents = lower(documents);

% Erase punctuation.
documents = erasePunctuation(documents);

end
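
To see the effect of these three steps on one sentence, the following sketch applies them inline to a made-up report and joins the resulting tokens back into a string. The input text and the variable names are illustrative; joinWords is assumed to be available from Text Analytics Toolbox.

```matlab
% Hypothetical input with mixed case and punctuation.
textDemo = "A large tree is DOWNED!";

% Same steps as preprocessText: tokenize, lowercase, erase punctuation.
docDemo = tokenizedDocument(textDemo);
docDemo = lower(docDemo);
docDemo = erasePunctuation(docDemo);

% Join the remaining tokens with spaces for inspection.
cleanedText = joinWords(docDemo);
```

The exclamation mark is removed as its own token, so the cleaned document has five tokens.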
