Chapter 9

DATA MANAGEMENT AND INTEGRATION POLICY




Introduction

The development and maintenance of a comprehensive and accurate database is a critical step in meeting the scientific objectives of CEPEX. This chapter provides an overview of the data management objectives, pertinent data sets, field data integration, the CEPEX Central Archive (CCA), schedules and data protocol, and data publications.

All data management activities will be coordinated between the Center for Clouds, Chemistry, and Climate (C4) at Scripps Institution of Oceanography (SIO) and UCAR's Office of Field Project Support (OFPS). The CEPEX data management objectives have both a real-time and a final processed component. The real-time component will involve data ingest in the field, with limited quality assurance, for operational support as well as for specific scientific analysis using the SIO-C4 data system. These data collected in real time will have degraded resolution because of communication restrictions and processing limitations. Final data sets will have higher resolution and will be subject to more rigorous quality assurance following the field-phase. This final processing will be accomplished jointly by SIO and OFPS, with establishment and distribution of the CCA performed by OFPS through NCAR in Boulder, Colorado.

Data Management Objectives

The objectives of CEPEX data management are as follows:

CEPEX Data Sets

Surface

The surface data collected for CEPEX will consist of standard measurements from the operational island meteorological stations and ocean buoys in the CEPEX region. Measurements of radiation, boundary-layer flux, precipitation, and radar reflectivity will be recorded aboard the Vickers. Abbreviated data reports from the Vickers will also be received in real time at Fiji via INMARSAT (International Maritime Satellite). Higher-resolution data (e.g., from ISS stations and the Vickers) will be collected from the respective countries and agencies involved following the field-phase and will be used in the final data sets.

Upper-air

The upper-air data collected for CEPEX will consist of standard/supplemental rawinsonde and profiler data from the existing island stations in the CEPEX region. The real-time data will be reduced in vertical resolution, although the final data sets will consist of the higher vertical resolution (i.e., 10-sec or 50-m) data received from the sites following the field-phase. Special ozone and frostpoint hygrometer soundings will be taken onboard the Vickers. Abbreviated sounding reports from the Vickers will also be received and archived in real time at Fiji. The higher-resolution soundings archived onboard will be used in the final data sets.
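As a rough illustration of what reducing vertical resolution could look like, the sketch below thins a high-resolution sounding to a coarser height spacing. The function name `thin_by_height`, the 250-m spacing, and the sample values are assumptions made for illustration only; the procedure actually used to reduce the real-time reports is not specified here.

```python
def thin_by_height(levels, spacing_m):
    """Keep the first level, then every level at least spacing_m above the last kept one.

    levels: list of (height_m, value) tuples sorted by increasing height.
    """
    kept = []
    for height_m, value in levels:
        if not kept or height_m - kept[-1][0] >= spacing_m:
            kept.append((height_m, value))
    return kept


# Hypothetical 50-m sounding from the surface to 500 m (values are placeholders).
full_resolution = [(h, 300.0 - 0.0065 * h) for h in range(0, 501, 50)]

# Illustrative coarser spacing for a real-time subset (250 m is an assumption).
real_time_subset = thin_by_height(full_resolution, spacing_m=250)
kept_heights = [h for h, _ in real_time_subset]
```

The full-resolution list is left untouched, which mirrors the plan to keep the higher-resolution soundings for the final data sets.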

Aircraft

A subset of aircraft data from the ER-2, the Learjet, and the P-3 will be collected and archived in Fiji following each flight. Some data will not be available until the day after each flight. The subsets will have degraded resolution (temporal or spatial), a limited number of parameters or products, or both. This is especially true for the P-3, which will not operate from Fiji and will transmit a limited data set (selected parameters) from various remote P-3 waypoints. Following the field program, each individual investigator will be responsible for processing, quality assuring, and providing the entire data set from his/her instruments in a mutually agreed upon standard format.

Satellite

Satellite data will be collected in both real time and backup modes during CEPEX. The primary data sources in Fiji will be satellite imagery from the on-site SeaSpace GMS/AVHRR ingest and display workstations. Backup archival arrangements will include GMS/AVHRR imagery archived by the BOM in Melbourne, Australia and by Aeromet in Kwajalein, Marshall Islands as well as NOAA.

DMSP data will be archived using several methods. First, OLS 2.5-km resolution imagery will be archived by the National Snow and Ice Data Center in Boulder, Colorado. A request for the 0.5-km resolution imagery has been submitted to the U.S. Air Force, but may not be approved due to military constraints. Second, SSM/I moisture data will be archived by NASA/Marshall Space Flight Center in Huntsville, Alabama, as part of NASA's WetNET program. Data from the TOPEX/POSEIDON satellite will be routinely collected, processed, archived, and made available by the NASA/Jet Propulsion Laboratory in Pasadena, California. (TOPEX: Topographic Ocean Experiment; POSEIDON is the French satellite program.)

Model

Available model products will be collected in real time and archived in various formats for CEPEX. Primarily, the analysis and forecast fields will be obtained for the European Centre for Medium-Range Weather Forecasts (ECMWF), United States NMC, New Zealand Meteorological Service, and Australian TAPS models. The primary real-time source for these products will be McIDAS. Final archiving of model products will be performed by NCAR (ECMWF and NMC products), the New Zealand Meteorological Service, and the Australian BOM.

Data Collection

The PIs of the proposal will have individual responsibilities in collecting data as follows:

The archival of these data will remain the responsibility of UCAR and SIO-C4.

Data Integration

The data integration process consists of three stages:
  1. field data integration
  2. integration of data from primary platforms
  3. fully integrated data sets
Data integration will be conducted at SIO-C4. This integration process will be directed by V. Ramanathan, with concurrence by a steering committee initially consisting of: S. Williams, W. Collins, A. Heymsfield, and F. Valero. The steering committee will have to certify and approve the integrated data before it can be used for any of the purposes mentioned herein. The lead programmers for this effort are S. Diggs and E. Boer of C4.

Field data integration. Field data integration and analyses for assessing data quality and the attainment of mission objectives are required. Informed decisions in flight mission planning depend critically upon information derived from such in-field, quick-look, integrated data sets and their analyses.

The real-time data collected in the field will consist of the following: (1) standard GTS data stream (surface, buoy, upper-air reports); (2) satellite imagery from the on-site SeaSpace AVHRR and GMS systems with forecast products from the Australian BOM McIDAS; (3) AVHRR imagery from the Tiros satellites; (4) INMARSAT reports from the Vickers; (5) a subset of aircraft data from the ER-2, Learjet, and P-3; and (6) miscellaneous forecast products and data collected by the Fiji Meteorological Service, Australian BOM, New Zealand Meteorological Service, and National Weather Service (NWS) in Hawaii. Subsets of these data will be reformatted and processed using the SIO-C4 system and archived on-site. These data sets will later be used along with the complete data sets in the final processing following the field-phase. The following subsection describes the data sets and archival in greater detail.

During the CEPEX field-phase, instrument PIs (with the exception of F. Valero) are responsible for making quick-look data available within 12 to 18 hours after acquisition or flight completion, for use in assessing data quality and for preliminary analyses for mission assessment and planning. The RAMS package of F. Valero (see Table 6) consists of 13 independent instruments on two different platforms (the ER-2 and the Learjet). Considerable time (approximately 12 hours) is required just to load the data into computers, and it would take approximately 48 hours to produce even quick-look data. Thus, quick-look radiation data will be provided in about 48 hours. CEPEX is particularly interested in detailed analyses of the following two quantities:

  1. Water vapor from dropsondes. Collocated dropsondes with upsondes from the Vickers, special launches from island stations, and vertical profiles from a cryogenic hygrometer (Learjet) will be intercompared to assess the performance of the dropsondes.
  2. Cloud cluster sampling. The success of CEPEX depends upon an adequate sampling of various super cloud clusters. For this purpose, an effort will be made to employ AVHRR and VISSR (GMS) data along the flight track to estimate the number of independent clear, cloudy, and anvil samples obtained by the ER-2, the Learjet, and the P-3. AVHRR data will also be used to estimate the greenhouse effect and cloud forcing along the flight track, to get a crude estimate of the statistics of these quantities. These estimates will be compared with quick-look RAMS data for spot checks.

Integration of data from the primary platforms. The primary platforms are the ER-2, the Learjet, the P-3, and the Vickers. The data from these platforms have to be integrated to yield the estimates of the physical quantities shown in the ovals in Figure 27. Several steps are involved in this process.

  1. Instruments within each platform have to be integrated to yield a desired quantity. An example of how the ER-2 radiation data are integrated to obtain the column greenhouse effect (Ga) and cloud forcing (Cl) is shown in Figure 28. Because the computation of Ga and Cl requires the identification of whether a scene is clear or overcast, auxiliary narrow-band instruments (TDDR or NFOVR; see Tables 6 and 7) are required, as shown in Figure 27. In the case of the Learjet (not shown), microphysics instruments will be used to identify the scene.
  2. Derived quantities (e.g., Ga and Cl) from the different platforms have to be integrated. For example, Cl from the Learjet has to be subtracted from Cl from the ER-2 to obtain anvil heating. Likewise, Ga and Cl from the Vickers and the P-3 have to be integrated with ER-2 data to obtain the f parameter (see Table 1). Another challenging example is water vapor, which is measured by the Learjet, the ER-2, the Vickers, and the P-3. Data from all of these platforms have to be integrated to map out the vertical distribution of water vapor along the equatorial Pacific.
  3. The final step is to obtain the means (over all of the flight legs) of the derived and primary quantities. In the case of water vapor, this process will involve the integration of all of the available platforms (see Figure 27).
The final product will consist of derived and directly observed quantities for individual flight or ship tracks, as well as a representative time mean with statistics of variability. It is anticipated that during this period the instrument PIs will critically examine their data for both quality and instrument malfunction or drift, and that data integration will aid this process.

At the end of this stage, an estimate of the west-to-east variations of Ga, cloud forcing, water-vapor distribution, and evaporation along the flight tracks will be obtained. These data will be provided to the data-integration steering committee and to the various instrument PIs.
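The subtraction in step 2 and the averaging in step 3 can be sketched as follows. This is a minimal illustration, not the project's processing code: the function names, the dictionary keys, and the per-leg cloud forcing values are all hypothetical, and units are taken to be W m^-2.

```python
from statistics import mean, pstdev


def anvil_heating(cl_er2, cl_learjet):
    """Anvil heating as Cl from the ER-2 minus Cl from the Learjet."""
    return cl_er2 - cl_learjet


def leg_summary(values):
    """Mean over flight legs plus a simple variability statistic.

    pstdev (population standard deviation) is one possible choice of
    variability measure; the source does not name a specific statistic.
    """
    return {"n": len(values), "mean": mean(values), "std": pstdev(values)}


# Hypothetical cloud forcing per flight leg (placeholder numbers).
legs = [
    {"cl_er2": 60.0, "cl_learjet": 20.0},
    {"cl_er2": 50.0, "cl_learjet": 20.0},
    {"cl_er2": 70.0, "cl_learjet": 20.0},
]

heating_per_leg = [anvil_heating(leg["cl_er2"], leg["cl_learjet"]) for leg in legs]
summary = leg_summary(heating_per_leg)
```

Keeping the per-leg values alongside the summary matches the stated final product: quantities for individual tracks as well as a representative mean with variability statistics.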

Fully integrated data sets. A schematic of this integration is shown in Figure 29. First, GMS data will be used to collocate the various platforms and to identify the cloud type (low, middle, or high) and spatial scale (super-clusters, cumulus scale, etc.) from each of the platforms for each flight or ship track. Data from the aircraft will be used to calibrate other, longer-term platforms to obtain seasonal or synoptic means. For example, the P-3 data will be used to validate evaporation estimated from the buoy data; buoy data will then be used to get seasonal means. Likewise, GMS and AVHRR radiances will be calibrated with RAMS on the ER-2; satellite data will then be used to get column Ga and Cl.
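One simple way to picture calibrating satellite radiances against aircraft measurements is an ordinary least-squares line fitted to collocated pairs. The sketch below assumes a linear relation between satellite radiance and aircraft flux, which is an assumption made for illustration; the calibration method is not stated here, and the function names and sample numbers are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two collocated pairs of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apply_calibration(radiance, slope, intercept):
    """Convert a satellite radiance to an estimated flux with the fitted line."""
    return slope * radiance + intercept


# Hypothetical collocated samples: satellite radiance vs. aircraft flux.
satellite_radiance = [10.0, 20.0, 30.0, 40.0]
aircraft_flux = [25.0, 45.0, 65.0, 85.0]

slope, intercept = fit_linear(satellite_radiance, aircraft_flux)
estimated_flux = apply_calibration(25.0, slope, intercept)
```

Once fitted on the flight-track collocations, the same coefficients could be applied to satellite data away from the aircraft, which is the role the text assigns to the longer-term platforms.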

The overall assessment of the thermostat hypothesis depends upon the representativeness of the along-the-flight-track data for synoptic and seasonal means. CEPEX will rely heavily upon the integration of flight data with satellite data to obtain the spatial and temporal averages of the quantities. For radiation fluxes, the aircraft data will serve as in-situ truth for satellite data. The satellite radiances will be calibrated with the ER-2 and Learjet radiation fluxes to obtain broadband long-wave and solar fluxes from AVHRR and GMS. The satellite-derived broadband fluxes will be used to obtain synoptic maps of the column greenhouse effect and cloud forcing. With respect to evaporation, the P-3 data will be put into bins according to the following: (1) warm ocean (SST > Tc): clear or disturbed with deep convective and anvil cloud category; (2) cold ocean (SST < Tc): clear or disturbed. The satellite data will be used to obtain synoptic means of these categories and for summing the evaporation fluxes for each of these categories. Similar procedures will be adopted for the surface radiation fluxes.
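The binning of evaporation data described above can be sketched as a grouping by ocean regime and sky condition. The threshold value, field layout, and sample numbers below are placeholders chosen for illustration, since no value of Tc is given here. The text defines warm as SST > Tc and cold as SST < Tc; treating SST equal to Tc as cold is an assumption of this sketch.

```python
from collections import defaultdict


def ocean_regime(sst, tc):
    """Warm ocean if SST > Tc; otherwise cold (SST == Tc handling is an assumption)."""
    return "warm" if sst > tc else "cold"


def bin_evaporation(samples, tc):
    """Group evaporation values by (regime, sky condition) and average each bin.

    samples: iterable of (sst, condition, evaporation) tuples, where
    condition is "clear" or "disturbed".
    """
    bins = defaultdict(list)
    for sst, condition, evaporation in samples:
        bins[(ocean_regime(sst, tc), condition)].append(evaporation)
    return {key: sum(vals) / len(vals) for key, vals in bins.items()}


# Hypothetical samples: (SST in K, sky condition, evaporation flux).
samples = [
    (301.0, "clear", 100.0),
    (302.0, "disturbed", 140.0),
    (299.0, "clear", 80.0),
    (301.5, "clear", 120.0),
]

TC_PLACEHOLDER_K = 300.0  # illustrative threshold only, not taken from the source
bin_means = bin_evaporation(samples, TC_PLACEHOLDER_K)
```

Satellite-derived frequencies of each category could then weight these bin means to form synoptic sums, as the paragraph above outlines.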

CEPEX Central Archive

Each CEPEX observation satisfies a specific requirement in the design to test the thermostat hypothesis—the central goal of CEPEX. The attainment of this goal depends critically on the timely integration of each CEPEX data set into the CCA.

Individual investigators will submit their reduced and documented data to the CCA located at NCAR as soon as possible, but by no later than October 1993. No integrated data sets will be distributed from the CCA until this time. For investigators with larger data sets (e.g., F. Valero), mutually agreed upon case studies will be processed first and submitted to the CCA by the 8 October deadline. The full data sets will be delivered to the CCA no later than April 1994 (funding permitting). Reduced data are observations converted to the physical quantities directly sensed by the instrument with quality-control inspection, flagging of bad data, and documentation of the quality-control process. Documentation or metadata includes calibration, quality, and navigation information (which describes the conversion to physical units), the conditions of observation, the location and time of the observation (as well as instrument specifications), field performance evaluations, and data tape format description. Upon mutual agreement of the CEPEX individual investigators, the submission of "special analysis period" data may be requested earlier.

Data Protocol

These guidelines, schedules, and protocols are intended to encourage the orderly and efficient analyses, interpretation, and publication of the scientific results obtained through the CEPEX project. CEPEX individual investigators may wish to apply their observations to questions related to the primary scientific objectives and beyond. CEPEX encourages the broadest application of CEPEX and related data (e.g., TOGA-COARE and ARM) to scientific studies. Prior to the release of data from the CCA, the following data protocol for CEPEX individual investigators will be in effect.

Data sharing. CEPEX individual investigators will have free access to all data acquired during the project. The normal vehicle for data dissemination will be a transfer of data via the CCA; however, direct transfers of data between individual investigators are also encouraged.

Data release. CEPEX individual investigators may release their data to whomever they wish. However, they may not release the data of other individual investigators without consent.

Consent of individual investigator. No unpublished data of another individual investigator may be used in a publication or presentation without the participation or consent of that investigator.

Collaborative investigations. Any data sets resulting from collaborative investigations involving CEPEX individual investigators will be made available to the CCA. This includes all collaborative efforts both within and outside the CEPEX individual investigator group.

Data Publication

CEPEX operations summary and data inventory document. Following the field-phase, OFPS will produce a summary of operations conducted in the field, detailed descriptions of the various data sets collected during CEPEX, data access procedures, and an inventory of all data collected. This document will be produced during the final data processing and completed by early October 1993.

CEPEX synopsis paper. A synopsis of CEPEX field-phase key operational activities and preliminary results will be prepared by project personnel and key individual investigators for publication in one or more appropriate journals. The paper(s) will be designed to be a quick-look publication to inform the scientific community at an early stage of the CEPEX mission and to highlight particularly interesting observations. The paper(s) will be submitted by November 1993.

CEPEX special issue. Results from the CEPEX field-phase may be published in a special issue of an appropriate journal. The special issue will contain (1) an overview paper co-authored by project personnel and key individual investigators and (2) a collection of papers on CEPEX results. The special issue decision will be made by the CEPEX individual investigators.
