NOAA / National Oceanic and Atmospheric Administration - Global Systems Laboratory Publications Directory

Abstract

One of the goals of the National Oceanic and Atmospheric Administration (NOAA) is to provide timely and reliable weather forecasts that support important decisions when and where people need them, whether for safety, emergencies, or planning day-to-day activities. Satellite data are essential as initial conditions in Numerical Weather Prediction (NWP) models for areas that lack in-situ observations, such as spans of ocean or remote areas of land. Currently only about 7% of all received satellite data are selected for use, and an even smaller percentage of those are ever assimilated into NWP models. Machine learning can greatly reduce the computational and time costs of satellite data selection. We study various machine learning approaches for processing orders of magnitude more satellite data in significantly less time, allowing a greater quantity and a more intelligent selection of data to be used for assimilation. With more satellites scheduled to launch in the coming years, machine learning can be applied to better select Regions of Interest (ROI) in the far larger volume of satellite data that will be received. This paper discusses the background of machine learning methods as applied to weather forecasting and the challenges of creating a “labeled dataset” for training and testing. In the training stage of supervised machine learning, labeled data mark each ROI as either true or false so that the model learns which signatures in the satellite data to identify. The authors selected cyclones, including tropical cyclones and mid-latitude lows, as the ROI for their machine learning work and created a dataset of true/false ROI labels from Global Forecast System (GFS) reanalysis data. A dataset like this does not yet exist and, given the large number of samples needed, it was decided that the labeling was best automated. This was done by developing a program, similar to the National Centers for Environmental Prediction (NCEP) tropical cyclone tracker by Marchok, that identifies cyclones based on their physical characteristics. We will discuss the methods and challenges of creating this dataset, its use in our current supervised machine learning model, and its use in future work on events such as convection initiation.
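The automated labeling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' program and not the NCEP tracker: it assumes a single gridded field of mean sea level pressure, flags grid points that are local minima below a pressure threshold, and marks each tile of the grid true or false depending on whether it contains such a point. The function names, the 1005 hPa threshold, the neighborhood radius, and the tile size are all hypothetical choices made for illustration, and longitude wraparound is ignored.

```python
from typing import List, Tuple

Grid = List[List[float]]


def find_pressure_minima(mslp: Grid, threshold_hpa: float = 1005.0,
                         radius: int = 1) -> List[Tuple[int, int]]:
    """Return (row, col) points that are strictly lower than every neighbor
    within `radius` and below `threshold_hpa`. Both defaults are assumptions."""
    rows, cols = len(mslp), len(mslp[0])
    centers = []
    for r in range(rows):
        for c in range(cols):
            value = mslp[r][c]
            if value >= threshold_hpa:
                continue  # not deep enough to count as a candidate low
            neighbors = [
                mslp[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            if all(value < n for n in neighbors):
                centers.append((r, c))
    return centers


def label_tiles(mslp: Grid, tile: int, **kwargs) -> List[List[bool]]:
    """Split the grid into tile-by-tile boxes and label a box True
    when it contains at least one candidate cyclone center."""
    rows, cols = len(mslp), len(mslp[0])
    n_tile_rows = (rows + tile - 1) // tile
    n_tile_cols = (cols + tile - 1) // tile
    labels = [[False] * n_tile_cols for _ in range(n_tile_rows)]
    for r, c in find_pressure_minima(mslp, **kwargs):
        labels[r // tile][c // tile] = True
    return labels


# Synthetic 6x6 pressure field in hPa (made-up values, not GFS data).
field = [[1012.0] * 6 for _ in range(6)]
field[1][1] = 998.0    # a synthetic low, deep enough to be flagged
field[4][4] = 1008.0   # a local minimum, but above the threshold
labels = label_tiles(field, tile=3)
print(labels)  # [[True, False], [False, False]]
```

A tracker used in practice would likely weigh more than one physical field and follow centers through time; the sketch only shows how an automated rule can turn gridded data into true/false ROI labels for supervised training.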

Article / Publication Data

Active/Online

YES

Available Metadata

Fiscal Year

2018

Publication Name

Recorded Presentation

Published On

January 10, 2018

Publisher Name

Link in DOI

Type

Conference Report

Institutions

Not available

Authors

Authors who have authored or contributed to this publication.

Christina E. Kumler - lead GSL

Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder

NOAA/Global Systems Laboratory

Yu-Ju Lee - second GSL

Other

Jebb Q. Stewart - third GSL

Federal

Mark W. Govett - fourth GSL

Federal

Brian J. Etherton - fifth GSL

Other

Lidia Trailovic - sixth GSL

Other

Improving Satellite Observation Utilization For Model Initialization With Machine Learning: An Introduction and Tackling The “Labeled Dataset” Challenge For Cyclones Around The World
