1) Abstract

The main idea is to transform the output of an NWP (Numerical Weather Prediction) model and weather observations into training examples for a supervised machine learning algorithm that determines the coefficients of a forecast correction function. The forecast correction function is a complex nonlinear function (an artificial neural network) that takes the current NWP forecast and the latest weather observations and returns a corrected forecast.

Figure 1 illustrates the direction of data flow in the forecast correction system.

 

 

                                                     Figure 1. Data processing pipeline

 

Main features of the proposed approach:

1) Outputs at multiple grid points of the NWP model and observations from multiple weather stations are used to create training examples. This allows the forecast correction function to take two-dimensional weather patterns into account.

2) Weather observations from different sources (WMO observation stations, local wireless sensor networks, personal weather stations) can be combined to achieve greater forecast precision.

 

Main results:

1) A software system that performs forecast correction has been implemented.

2) Numerical experiments with real-world meteorological data have been carried out. The results of those experiments show a significant improvement in forecast accuracy over the baseline.

 

2) Training examples for the supervised machine learning algorithm

The raw meteorological data needed by the forecast correction system consists of three groups of time series:

1. Observations in the location for which we want to get a corrected forecast

2. Observations in neighbouring locations

3. Meteorological parameters forecasted by the NWP model

These time series are transformed into pairs of input and output vectors. Each input vector contains the parameters forecasted by the NWP model at time t and observations of parameters at time t - t_delta (e.g., t_delta may be 86400 seconds, i.e. 24 hours). Each output vector contains an observation at time t of the parameter for which we want to get a corrected forecast.
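As an illustration, this transformation can be sketched as follows. This is a simplified Python sketch, not the actual Lua preprocessor; the dictionary-based series layout, the function name, and the 24-hour t_delta are assumptions made for the example.

```python
# Illustrative sketch: turning the three groups of time series into
# (input, output) training pairs. Each series maps a UNIX time to a
# list of parameter values at that time.

T_DELTA = 86400  # seconds; observations taken one day before forecast time

def make_training_pairs(nwp_forecast, local_obs, neighbour_obs, times):
    pairs = []
    for t in times:
        t_obs = t - T_DELTA
        # Skip times where any required record is missing (data cleansing).
        if (t in nwp_forecast and t in local_obs
                and t_obs in local_obs and t_obs in neighbour_obs):
            # Input: NWP parameters at time t plus observations at t - t_delta.
            x = nwp_forecast[t] + local_obs[t_obs] + neighbour_obs[t_obs]
            # Output: observed value (at time t) of the corrected parameter,
            # assumed here to be the first parameter of the local station.
            y = [local_obs[t][0]]
            pairs.append((x, y))
    return pairs
```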

 


                                          Figure 2. Training examples

Figure 2 shows a simplified set of training examples.

 

In order to capture two-dimensional weather patterns, parameters from multiple grid points of the NWP model and observations from multiple weather stations are taken into account.

 

                                 Figure 3. Positions of NWP grid points and observation stations

Figure 3 shows possible positions of NWP grid points and observation stations. Each arrow represents the fact that the input vector contains meteorological parameters from that location.

 

The resulting n-dimensional input vector is a concatenation of meteorological parameters from l weather stations and m NWP grid points, where

n = l * (number of observation parameters) + m * (number of NWP parameters).
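A minimal sketch of this concatenation (illustrative Python, with the parameter counts taken from the experiment in section 5.2: l = 2 stations with 6 parameters each, m = 4 grid points with 5 parameters each):

```python
# Illustrative sketch of assembling the n-dimensional input vector.
# The zero-valued parameters are placeholders, used only to check sizes.

def build_input_vector(station_params, grid_params):
    """station_params: list of l per-station parameter lists;
       grid_params: list of m per-grid-point parameter lists."""
    vec = []
    for params in station_params + grid_params:
        vec.extend(params)
    return vec

stations = [[0.0] * 6 for _ in range(2)]  # l = 2, 6 observation parameters
grid = [[0.0] * 5 for _ in range(4)]      # m = 4, 5 NWP parameters
x = build_input_vector(stations, grid)
assert len(x) == 2 * 6 + 4 * 5 == 32
```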

 

                                          Figure 4. Input vector

 

3) Forecast correction function and machine learning algorithm

The forecast correction function is a feedforward artificial neural network with multiple hidden layers. By the universal approximation theorem, such a network can approximate any continuous function to arbitrary accuracy, provided there are enough neurons in its hidden layers.

 

                                                 Figure 5. Artificial neural network

Each line in Figure 5 represents a coefficient (a connection weight).

 

The machine learning algorithm is an optimization process that finds the coefficients minimizing the mean squared error of the neural network's output. The optimization is carried out using the stochastic gradient descent algorithm.
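The training step can be sketched as follows. This is an illustrative Python/NumPy sketch of stochastic gradient descent on a small tanh network, not the Torch-based trainer used in the system; the layer sizes and learning rate are arbitrary.

```python
# Illustrative sketch: one hidden tanh layer, mean squared error,
# plain stochastic gradient descent on a single (x, y) example per step.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 32, 16, 1
W1 = rng.normal(0, 0.1, (n_hidden, n_in)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_out, n_hidden)); b2 = np.zeros(n_out)
lr = 0.01  # learning rate (illustrative value)

def forward(x):
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2, h

def sgd_step(x, y):
    """One SGD update; returns the squared error before the update."""
    global W1, b1, W2, b2
    y_hat, h = forward(x)
    err = y_hat - y                    # gradient of MSE w.r.t. y_hat (up to 2x)
    grad_W2 = np.outer(err, h)
    grad_b2 = err
    dh = (W2.T @ err) * (1 - h ** 2)   # backpropagation through tanh
    grad_W1 = np.outer(dh, x)
    grad_b1 = dh
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    return float(np.mean(err ** 2))
```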

 

After optimization, the coefficients of the neural network contain information about two-dimensional weather patterns that affect forecast accuracy. A contrived example of such a pattern: when there is a strong south-westerly wind, the air pressure is higher than 1010 hPa, and the humidity is higher than 90%, the temperature forecasted by the NWP model is usually 2 degrees lower than the real value. All information about weather patterns is "hidden" inside the neural network coefficients.

 

4) Software implementation

The current software implementation of the forecast correction system consists of 8 components:

 

1. GFS (Global Forecast System) forecast history downloader - a program that downloads outputs of the GFS model in GRIB2 format from the http://www.noaa.gov/ site and saves a subset of meteorological parameters in a relational database. This program is implemented in Haskell.

 

2. FMI's open data downloader - a program that downloads weather observations from stations in Finland using the Finnish Meteorological Institute's XML API and saves them in a relational database. This program is implemented in Haskell.

 

3. Current GFS forecast downloader - a program that downloads the latest output of the GFS model in GRIB2 format from the http://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/ site and saves a subset of meteorological parameters in a relational database. This program is implemented in Haskell.

 

4. Relational database - a database that stores the meteorological information in the form of time series.

 

5. Data preprocessor - a program responsible for transforming the raw weather data (time series) into training examples for the supervised machine learning algorithm. It performs data cleansing and merges all time series into a set of training examples. This program is implemented in Lua.

 

6. Neural network trainer - a program that trains the neural network using stochastic gradient descent. Its output is a binary file with the neural network coefficients. This program is implemented in Lua using the Torch scientific computing framework.

 

7. Neural network - a program that calculates the output of the neural network given its coefficients and an input vector. This program is implemented in Lua using the Torch scientific computing framework.

 

8. HTTP server - a program that generates a dynamic web page with the corrected GFS forecast:
https://alterozoom.com/meteo/gfs_correction
The server is implemented in Haskell.

 

 

                                                       Figure 6. Software components

 

5) Numerical experiments

5.1) Reduction of error

The relative reduction of the root-mean-square error (RMSE) of a forecast is used as the criterion of efficiency of the proposed method. The reduction of error is calculated as follows:

(RMSE_NWP - RMSE_NN) / RMSE_NWP

where RMSE_NWP is the root-mean-square error of a forecast calculated using only the meteorological parameters at the NWP grid point closest to the location of interest (the baseline forecast), and RMSE_NN is the root-mean-square error of a forecast calculated using the forecast correction function (a neural network trained on meteorological parameters at multiple NWP grid points and observation stations).
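This criterion can be sketched in code (illustrative Python; the function names are our own):

```python
# Illustrative sketch of the efficiency criterion: relative RMSE reduction
# of the corrected forecast over the baseline NWP forecast.
import math

def rmse(predictions, observations):
    """Root-mean-square error between a forecast and the observations."""
    n = len(observations)
    return math.sqrt(sum((p - o) ** 2
                         for p, o in zip(predictions, observations)) / n)

def error_reduction(nwp_pred, nn_pred, observations):
    """(RMSE_NWP - RMSE_NN) / RMSE_NWP; positive means the correction helps."""
    rmse_nwp = rmse(nwp_pred, observations)
    rmse_nn = rmse(nn_pred, observations)
    return (rmse_nwp - rmse_nn) / rmse_nwp
```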

 

5.2) Data sources

1. Observations in the location for which we want to get a corrected forecast:
2 years of observations at Helsinki Kumpula (temperature, air pressure, wind speed, wind direction, gust speed, humidity)

2. Observations in neighbouring locations:
2 years of observations at Helsinki Kaisaniemi (temperature, air pressure, wind speed, wind direction, gust speed, humidity)

 

3. Meteorological parameters forecasted by the NWP model:
Output of the Global Forecast System (GFS) at the following coordinates (latitude, longitude):
60.5, 24.5
60.5, 25.0
60.0, 24.5
60.0, 25.0
Meteorological parameters: temperature, air pressure, humidity, U component of wind speed, V component of wind speed

 

The input vector size is 32 (2 stations * 6 observation parameters + 4 grid points * 5 NWP parameters).
The output vector size is 1.

 

5.3) Neural network training

A cross-validation technique is used to assess the generalization ability of the neural network and to determine when to stop the iterations of stochastic gradient descent. Pairs of input and output vectors are split randomly into two subsets: "training" examples (90% of all data) and "testing" (or validation) examples (10% of all data). After each iteration of stochastic gradient descent, two values are calculated: the "training" reduction of forecast error and the "testing" reduction of forecast error.
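The split and a stopping rule of this kind can be sketched as follows (illustrative Python; the patience threshold and function names are assumptions, not taken from the actual trainer):

```python
# Illustrative sketch: random 90/10 split of the example pairs, plus a
# simple early-stopping rule that halts SGD when the "testing" error
# reduction has not improved for `patience` iterations.
import random

def split_examples(pairs, test_fraction=0.1, seed=0):
    """Return (training, testing) subsets after a seeded random shuffle."""
    pairs = pairs[:]
    random.Random(seed).shuffle(pairs)
    n_test = max(1, int(len(pairs) * test_fraction))
    return pairs[n_test:], pairs[:n_test]

def should_stop(test_reductions, patience=20):
    """Stop when the best 'testing' reduction is older than `patience` iters."""
    if len(test_reductions) <= patience:
        return False
    best = max(test_reductions)
    return test_reductions.index(best) < len(test_reductions) - patience
```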

 

                   Figure 7. "Training" and "testing"  reduction of temperature forecast error (in Helsinki Kumpula)

 

In Figure 7 the green line is the "training" forecast error reduction and the blue line is the "testing" forecast error reduction; the horizontal axis shows iterations. The "training" reduction of forecast error keeps growing after 200 iterations while the "testing" reduction remains constant. This means that after 200 iterations the neural network starts to overfit the data, and the stochastic gradient descent iterations should be stopped.

 

5.4) Results

"Testing" forecast error reduction is calculated for three meteorological parameters : temperature,

air pressure, humidity.  Each experiment is carried out with two configurations of neural network :

1)The simplest possible neural network with one 1 linear layer (this essentially is a multiple linear regression).

2)Three - layer neural network with hyperbolic tangent nonlinearity.

 

Temperature forecast error reduction (higher is better):

1) One-layer network: 0.3824

2) Three-layer network: 0.4913

Humidity forecast error reduction (higher is better):

1) One-layer network: 0.2596

2) Three-layer network: 0.3250

Pressure forecast error reduction (higher is better):

1) One-layer network: 0.9239

2) Three-layer network: 0.9136

 
