Global Data Assimilation System (GDAS1) Archive Information
Archive began: December 1, 2004
NOAA-Air Resources Laboratory 1315 East-West Highway Silver Spring, MD 20910 (301-713-0295)
GDAS ARCHIVE OVERVIEW The National Weather Service's National Centers for Environmental Prediction (NCEP) runs a series of computer analyses and forecasts operationally. One of the operational systems is the GDAS (Global Data Assimilation System). Information on this model can be found on the NCEP website. At NOAA's Air Resources Laboratory (ARL), NCEP model output is used for air quality transport and dispersion modeling. ARL archives both EDAS (Eta Data Assimilation System) and GDAS output using a 1-byte packing routine. Both archives contain basic fields such as the u- and v-wind components, temperature, and humidity. However, the archives differ in horizontal and vertical resolution, as well as in the specific fields provided by NCEP.
ORIGIN OF THE DATA The 3-hourly archive data come from NCEP's GDAS. The GDAS is run 4 times a day, i.e., at 00, 06, 12, and 18 UTC. Model output is for the analysis time and the 3-, 6-, and 9-hour forecasts. NCEP post-processing of the GDAS converts the data from spectral coefficient form to 1-degree latitude-longitude (360 by 181) grids and from sigma levels to mandatory pressure levels. Model output is in GRIB format. See the NCEP website for current information on the GDAS. ARL saves the successive analyses and 3-hour forecasts, four times each day, to produce a continuous data archive. Some fields, such as precipitation and surface fluxes, are not available at the analysis time; these are therefore taken from the 6-hour forecast files.
ARL PROCESSING ARL processing converts NCEP's 1-degree GRIB output using a one-byte character packing method described below. The ARL archiving program then produces a 3-hourly, global, 1-degree latitude-longitude dataset on pressure surfaces. The data are put into weekly files and made available online on the ARL server for access via ftp. Each 7-day archive file is about 600 MB. If processing for a cycle fails, data from the 6-h and 9-h forecasts are used to fill in the missing times.
DATA DESCRIPTION The archive data file contains the data in synoptic time sequence, without any missing records: missing data are represented by nulls, with the forecast hour set to -1. It is therefore possible to position randomly to any point within a data file. Each file contains data for one week, except for files containing data past the 28th of the month. At each time period, an index record is always the first record, followed by the surface data, and then all data at each pressure level from the ground up. GDAS1 data are available in files named gdas1.mmmyy.w#, where mmm is the month (e.g. jul), yy is the two-digit year (e.g. 05), and # is the week of the month:
#=1 - days 1-7
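The naming rule can be sketched in a few lines of Python. The helper name `gdas1_filename` is hypothetical. Only the mapping of week 1 to days 1-7 is stated above; extending it in 7-day steps, with days past the 28th falling into a fifth file, is an assumption based on the one-week-per-file rule.

```python
from datetime import date

# Lowercase three-letter month abbreviations, as in the "jul" example.
MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def gdas1_filename(day: date) -> str:
    """Return the weekly archive file name expected to hold the given day.

    Assumption: weeks advance in 7-day steps (1-7 -> w1, 8-14 -> w2, ...),
    so days 29 and later map to w5.
    """
    week = (day.day - 1) // 7 + 1
    return f"gdas1.{MONTHS[day.month - 1]}{day.year % 100:02d}.w{week}"


print(gdas1_filename(date(2005, 7, 4)))
```

For 4 July 2005 this yields a name of the form gdas1.jul05.w1, which matches the pattern described above.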
Data Grid The data are on a 360 by 181 latitude-longitude grid. The lower-left corner (1,1) is (0W, 90S). The upper-right corner (360,181) is (1W, 90N). In Table 1, the data grid is identified by the model that produced the data, a grid identification number, the number of X and Y grid points, the pole position (latitude and longitude) of the grid projection, a reference latitude and longitude, the grid spacing (km) that is true at the reference point, the orientation with respect to the reference longitude, the angle between the axis and the cone, and a point on the grid given in grid units and in latitude and longitude. The given pole position results in the lower-left grid point having a value of (1,1).
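The corner coordinates imply a simple mapping between grid indices and geographic position. The Python sketch below assumes uniform 1-degree spacing, with the i index increasing eastward from 0 degrees longitude and the j index increasing northward from 90S. The function names are hypothetical and are not part of any ARL software.

```python
NX, NY = 360, 181  # grid dimensions given in the text


def grid_to_latlon(i: int, j: int) -> tuple[float, float]:
    """Convert 1-based grid indices to (latitude, longitude east) in degrees."""
    if not (1 <= i <= NX and 1 <= j <= NY):
        raise ValueError("grid index out of range")
    lat = -90.0 + (j - 1)   # j = 1 is 90S, j = 181 is 90N
    lon_east = float(i - 1)  # i = 1 is 0 degrees, i = 360 is 359E (1W)
    return lat, lon_east


def latlon_to_grid(lat: float, lon: float) -> tuple[int, int]:
    """Return the 1-based indices of the grid point nearest to (lat, lon).

    Longitude may be given as -180..180 or 0..360 degrees east.
    """
    i = int(round(lon % 360.0)) % NX + 1  # wrap 360 back to column 1
    j = int(round(lat + 90.0)) + 1
    return i, j


print(grid_to_latlon(360, 181))
print(latlon_to_grid(90.0, -1.0))
```

The upper-right corner (360,181) maps to 90N and 359E, which is the same meridian as 1W.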
Table 1. Data Grid Specifications
Table 2. Meteorological Fields contained in the GDAS Archive. Accumulated and averaged fields are 6-h accumulations/averages at 00, 06, 12, and 18 UTC, and 3-h accumulations/averages at 03, 09, 15, and 21 UTC.
* geopotential meters
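The accumulation rule in the Table 2 caption can be expressed as a small helper, which may be useful when converting accumulated fields to rates. This is only a sketch; the function name is hypothetical.

```python
def accumulation_hours(utc_hour: int) -> int:
    """Return the accumulation/averaging period (hours) for a record time.

    Per the Table 2 caption: 6 h at 00, 06, 12, 18 UTC and
    3 h at 03, 09, 15, 21 UTC.
    """
    if utc_hour not in range(0, 24, 3):
        raise ValueError("archive times are 3-hourly (00, 03, ..., 21 UTC)")
    return 6 if utc_hour % 6 == 0 else 3


print([accumulation_hours(h) for h in range(0, 24, 3)])
```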
Meteorological Fields and Vertical Structure The archived data files contain only some of the fields produced by the model at NCEP. These fields were selected according to their relevance for transport and dispersion studies and to disk space limitations. In Table 2, the fields are identified by a description, the units, and a unique four-character identification label that is written to the header label (see Data Grid Unpacking Procedure in a later section) of each record. Data order in the file is given by a two-character code. The first character indicates whether it is a surface (or single) level variable (S) or an upper level variable (U). The second character is a digit giving the order in which that variable appears in the file. The upper level GDAS data are output on 23 pressure surfaces. Table 3 gives the level number corresponding to each data level, which is also written to each header label.
Table 3. Description of Vertical Levels
Missing Data Missing data are written as an array of nulls with a forecast hour of -1 in the header label. The associated field label may be either "NULL" or the label given in Table 2.
Definition File The definition file given in Appendix A summarizes the grid specifications and data fields. The format is such that the first 20 characters are the dummy ID field followed by the data. Much of the information is written into the index record of each time period.
Record 1 consists of a four-character string that identifies the source of the meteorological data.
The key to reading the meteorological files is decoding the ASCII index record, which is the first record of each time period. The first 50 characters of the index record contain the same "header" information as the other records in the given time period. Its four-character label is "INDX". The format for this record is given below. The variables are described in the discussion of the Definition File above.
Format of the Index Record
END LEVEL AND VARIABLE LOOPS
Data Grid Unpacking NCEP typically saves its model output in GRIB format. However, at ARL the data are stored in a more compact form that can be used directly on a variety of computing platforms with direct access I/O.
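Because every record has the same length (one byte per grid point plus a 50-byte header label), direct access reduces to a byte-offset calculation. The following Python sketch assumes the 360 by 181 grid of this archive; `record_offset` and `read_record` are hypothetical helper names, and the demonstration uses an in-memory buffer rather than a real archive file.

```python
import io

NX, NY = 360, 181                      # grid dimensions of this archive
LABEL_BYTES = 50                       # ASCII header label on every record
RECORD_BYTES = NX * NY + LABEL_BYTES   # 65160 + 50 = 65210 bytes


def record_offset(n: int) -> int:
    """Byte offset of the n-th record (1-based) from the start of the file."""
    if n < 1:
        raise ValueError("record numbers start at 1")
    return (n - 1) * RECORD_BYTES


def read_record(stream, n: int) -> tuple[bytes, bytes]:
    """Seek to record n and return (header label, packed data bytes)."""
    stream.seek(record_offset(n))
    raw = stream.read(RECORD_BYTES)
    if len(raw) != RECORD_BYTES:
        raise EOFError(f"record {n} is past the end of the file")
    return raw[:LABEL_BYTES], raw[LABEL_BYTES:]


# Demonstration on a synthetic two-record buffer (not real GDAS data).
fake_file = io.BytesIO(b"A" * RECORD_BYTES + b"B" * RECORD_BYTES)
label, data = read_record(fake_file, 2)
print(RECORD_BYTES, len(label), len(data))
```

With a real archive, the same functions would be applied to a file opened in binary mode, for example with `open(path, "rb")`.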
The data array is packed and stored as one-byte characters. To preserve as much data precision as possible, the differences between the values at adjacent grid points are saved and packed rather than the actual values. The grid is then reconstructed by summing the differences, starting from the first value, which is stored in unpacked ASCII form in the header record as the value at grid point (1,1). To illustrate the process, assume that a grid of real data, R, of dimensions i,j is laid out as in the example below.
1,j      2,j      ....   i-1,j      i,j
1,j-1    2,j-1    ....   i-1,j-1    i,j-1
....     ....     ....   ....       ....
1,2      2,2      ....   i-1,2      i,2
1,1      2,1      ....   i-1,1      i,1
The packed value, P, is then given by
$P_{i,j} = (R_{i,j} - R_{i-1,j}) \cdot 2^{7-N},$
where the scaling exponent
$N = \ln(dR_{\max}) / \ln 2.$
The value of dRmax is the maximum difference between any two adjacent grid points over the entire array. It is computed from the differences along each i index, holding j constant. The difference at index (1,j) is computed from index (1,j-1), and at (1,1) the difference is always zero. The packed values are one-byte unsigned integers, where values from 0 to 126 represent -127 to -1, 127 represents zero, and values from 128 to 254 represent 1 to 127. The length of each record in bytes is then equal to the number of array elements plus 50 bytes for the header label information. The 50-byte label field precedes each packed data field and contains the following ASCII data:
*Forecast hour is -1 for missing data.
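The difference packing described above can be illustrated with a short round trip in Python. This is a minimal sketch, not the ARL implementation. It makes three assumptions that the text does not state: N is rounded up to an integer so that every scaled difference fits in one byte; each difference is taken against the previously reconstructed value so that rounding errors do not accumulate; and codes are clamped to the 0 to 254 range. The function names and the sample values are hypothetical.

```python
import math


def pack(grid):
    """Pack grid[j][i] (row 0 = j index 1) into one byte per point.

    Returns (packed bytes, integer exponent N, value at grid point (1,1)).
    """
    ny, nx = len(grid), len(grid[0])
    # dRmax: largest difference between adjacent points; the first point
    # of each row is compared with the first point of the row below.
    drmax = 0.0
    for j in range(ny):
        for i in range(nx):
            prev = grid[j][i - 1] if i > 0 else grid[max(j - 1, 0)][0]
            drmax = max(drmax, abs(grid[j][i] - prev))
    # Assumption: integer exponent, rounded up so |difference| * scale < 128.
    nexp = math.floor(math.log2(drmax)) + 1 if drmax > 0 else 0
    scale = 2.0 ** (7 - nexp)
    first = grid[0][0]
    packed = bytearray()
    row_start = first
    for j in range(ny):
        rold = row_start
        for i in range(nx):
            code = int((grid[j][i] - rold) * scale + 127.5)  # 127 means zero
            code = min(254, max(0, code))
            packed.append(code)
            rold += (code - 127) / scale  # value the reader will reconstruct
            if i == 0:
                row_start = rold
    return bytes(packed), nexp, first


def unpack(packed, nx, ny, nexp, first):
    """Rebuild the grid by summing the scaled differences from (1,1)."""
    scale = 2.0 ** (7 - nexp)
    grid, row_start, k = [], first, 0
    for j in range(ny):
        rold, row = row_start, []
        for i in range(nx):
            rold += (packed[k] - 127) / scale
            if i == 0:
                row_start = rold
            row.append(rold)
            k += 1
        grid.append(row)
    return grid


sample = [[280.0, 281.5, 283.25],   # made-up values for illustration
          [279.0, 280.0, 284.0]]
packed, nexp, first = pack(sample)
restored = unpack(packed, 3, 2, nexp, first)
print(list(packed), nexp, restored)
```

In this sketch the precision of the restored field is 2**(N-7) in the units of the data, so fields with small differences between neighbouring points are stored more precisely than fields with large ones.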
Sample Program A sample Fortran 90 program is available from the ARL ftp server (ftp://arlftp.arlhq.noaa.gov/pub/archives/utility/chk_data.f). It can be used to unpack and read the first few elements of the data array for each record of an ARL packed meteorological file.
Appendix A. Definition File - GDAS.CFG