DATA MANAGEMENT AND INTEGRATION POLICY
The development and maintenance of a comprehensive and accurate database is a critical step in
meeting the scientific objectives of CEPEX. This chapter provides an overview of the data
management objectives, pertinent data sets, field data integration, the CEPEX Central Archive
(CCA), schedules and data protocol, and data publications.
All data management activities will be coordinated between the Center for Clouds, Chemistry, and
Climate (C4) at Scripps Institution of Oceanography and UCAR's Office of Field Project Support
(OFPS). CEPEX data management has both a real-time and a final processed component. The
real-time component will involve ingesting data in the field with limited quality assurance, for
operational support as well as specific scientific analysis using the SIO-C4 data system.
These data collected in real time will have degraded resolution because of communication
restrictions and processing limitations. Final data sets will have higher resolution and will be
subject to more rigorous quality assurance following the field-phase. This final processing will be
accomplished jointly by SIO and OFPS, with the establishment of the CCA and distribution from it
performed by the OFPS through NCAR in Boulder, Colorado.
Data Management Objectives
The objectives of CEPEX data management are as follows:
- Organize the collection of all data of interest to CEPEX, including both data collected by
special observing platforms and the most accurate sources of pertinent conventional data.
- Provide real-time access to data for forecasting and field operations.
- Prepare field data integration and analyses in a timely manner.
- Identify and coordinate validation procedures used by various participants in the
preparation of research data sets.
- Ensure the long-term and easy availability of CEPEX data through the establishment of a
central data archive.
CEPEX Data Sets
Surface. The surface data collected for CEPEX will consist of standard measurements from the
operational island meteorological stations and ocean buoys in the CEPEX region. Measurements
of radiation, boundary-layer flux, precipitation, and radar reflectivity will be recorded aboard the
Vickers. Abbreviated data reports from the Vickers will also be received in real time at Fiji via
INMARSAT (International Maritime Satellite). Higher-resolution data (e.g., from ISS stations and the
Vickers) will be collected from the respective countries and agencies involved following the
field-phase and will be used in the final data sets.
Upper-air. The upper-air data collected for CEPEX will consist of standard/supplemental
rawinsonde and profiler data from the existing island stations in the CEPEX region. The real-time
data will be reduced in vertical resolution, although the final data sets will consist of the higher
vertical resolution (i.e., 10-sec or 50-m) data received from the sites following the field-phase.
Special ozone and frostpoint hygrometer soundings will be taken onboard the Vickers.
Abbreviated sounding reports from the Vickers will also be received and archived in real time at
Fiji. The higher-resolution soundings archived onboard will be used in the final data sets.
Aircraft. A subset of aircraft data from the ER-2, the Learjet, and the P-3 will be collected and
archived in Fiji following each flight. Some data will not be available until the day after each
flight. The subsets will have degraded resolution (temporal or spatial), a limited number of
parameters or products, or both. This is especially true for the P-3, which will not operate from
Fiji and will transmit a limited data set (selected parameters) from various remote P-3 waypoints.
Following the field program, each individual investigator will be responsible for processing,
quality assuring, and providing the entire data set from his/her instruments in a mutually agreed
upon standard format.
Satellite. Satellite data will be collected in both real time and backup modes during CEPEX. The
primary data sources in Fiji will be satellite imagery from the on-site SeaSpace GMS/AVHRR
ingest and display workstations. Backup archival arrangements will include GMS/AVHRR
imagery archived by the BOM in Melbourne, Australia and by Aeromet in Kwajalein, Marshall
Islands as well as NOAA.
DMSP data will be archived using several methods. First, OLS 2.5-km resolution imagery will be
archived by the National Snow and Ice Data Center in Boulder, Colorado. A request for the 0.5-
km resolution imagery has been submitted to the U.S. Air Force, but may not be approved due to
military constraints. Second, SSM/I moisture data will be archived by NASA/Marshall Space
Flight Center in Huntsville, Alabama, as part of NASA's WetNET program. Data from the
TOPEX/POSEIDON satellite will be routinely collected, processed, archived, and available from
the NASA/Jet Propulsion Laboratory in Pasadena, California. (TOPEX: Topographic Ocean
Experiment; POSEIDON is the French satellite program.)
Model. Available model products will be collected in real time and archived in various formats for
CEPEX. Primarily, the analysis and forecast fields will be obtained for the European Centre for
Medium-Range Weather Forecasts (ECMWF), United States NMC, New Zealand
Meteorological Service, and Australian TAPS models. The primary real-time source for these
products will be McIDAS. Final archiving of model products will be performed by NCAR
(ECMWF and NMC products), the New Zealand Meteorological Service, and the Australian BOM.
The PIs of the proposal will have individual responsibilities in collecting data as follows:
- radiation fluxes from the ER-2 and the Learjet: F. Valero
- water-vapor data from the ER-2: J. Anderson
- lidar on the ER-2: J. Spinhirne
- cloud microphysics and state parameters from the Learjet: L. Rose and A. Heymsfield
- temperature and water vapor from dropsondes: S. Williams and S. Sherwood
- evaporation fluxes, radiation fluxes, microphysical parameters, and water-vapor data
from the P-3: R. Grossman and P. Flatau
- satellite data: W. Collins
- buoy data: M. McPhaden and D. Cutchin
- balloon data from the R/V Vickers: S. Oltmans, P. Crutzen, and D. Kley
- SST and radiation fluxes from the R/V Vickers: H. Grassl and W. Emery
- meteorological data from the R/V Vickers: D. Cutchin
- radar data from the R/V Vickers: S. Rutledge
- long-wave spectral data from the R/V Vickers: D. Lubin
- DOE-ARM site in the western Pacific: D. Cutchin
- island ISS and radiosondes: S. Williams
The archival of these data will remain the responsibility of UCAR and SIO-C4.
The data integration process consists of three stages:
- field data integration
- integration of data from primary platforms
- fully integrated data sets
Data integration will be conducted at SIO-C4. This integration process will be directed by V.
Ramanathan, with concurrence by a steering committee initially consisting of S. Williams, W.
Collins, A. Heymsfield, and F. Valero. The steering committee will have to certify and approve
the integrated data before they can be used for any of the purposes mentioned herein. The lead
programmers for this effort are S. Diggs and E. Boer of C4.
Field data integration. Field data integration and analyses for assessing data quality and the
attainment of mission objectives are required. Informed decisions in flight mission planning
depend critically upon information derived from such in-field, quick-look, integrated data sets and
analyses.
The real-time data collected in the field will consist of the following: (1) standard GTS data stream
(surface, buoy, upper-air reports); (2) satellite imagery from the on-site SeaSpace AVHRR and
GMS systems with forecast products from the Australian BOM McIDAS; (3) AVHRR imagery
from the Tiros satellites; (4) INMARSAT reports from the Vickers; (5) a subset of aircraft data
from the ER-2, Learjet, and P-3; and (6) miscellaneous forecast products and data collected by the
Fiji Meteorological Service, Australian BOM, New Zealand Meteorological Service, and National
Weather Service (NWS) in Hawaii. Subsets of these data will be reformatted and processed using
the SIO-C4 system and archived on-site. These data sets will later be used along with the complete
data sets in the final processing following the field-phase. The following subsection describes the
data sets and archival in greater detail.
During the CEPEX field-phase, instrument PIs (with the exception of F. Valero) are responsible
for making quick-look data available within 12 to 18 hours after acquisition or flight completion,
for use in assessing data quality and for preliminary analyses for mission assessment and planning.
The RAMS package of F. Valero (see Table 6) consists of 13 independent instruments on two
different platforms (the ER-2 and the Learjet). Considerable time (approximately 12 hours) is
required just to load the data into computers, and it would take approximately 48 hours to produce
even quick-look data. Thus, quick-look radiation data will be provided in about 48 hours.
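The quick-look turnaround rule above can be illustrated with a minimal sketch. It assumes the upper limit of the 12-to-18-hour window as the deadline for most instruments and 48 hours for the RAMS package; the function name, the instrument labels, and the example flight time are hypothetical and not part of the CEPEX plan.

```python
from datetime import datetime, timedelta

# Turnaround limits taken from the text: up to 18 hours for most
# instruments, about 48 hours for the RAMS radiation package.
DEFAULT_LIMIT_HOURS = 18
RAMS_LIMIT_HOURS = 48


def quick_look_due(acquisition_end: datetime, instrument: str) -> datetime:
    """Return the latest time by which quick-look data should be available."""
    is_rams = instrument.upper() == "RAMS"
    hours = RAMS_LIMIT_HOURS if is_rams else DEFAULT_LIMIT_HOURS
    return acquisition_end + timedelta(hours=hours)


# Hypothetical flight completion time, used only for illustration.
landing = datetime(1993, 3, 10, 6, 0)
print(quick_look_due(landing, "lidar"))  # 1993-03-11 00:00:00
print(quick_look_due(landing, "RAMS"))   # 1993-03-12 06:00:00
```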
CEPEX is particularly interested in detailed analyses of the following two quantities:
- Water vapor from dropsondes. Dropsondes will be intercompared with collocated upsondes from
the Vickers, special launches from island stations, and vertical profiles from a cryogenic
hygrometer (Learjet) to assess the performance of the dropsondes.
- Cloud cluster sampling. The success of CEPEX depends upon an adequate
sampling of various super cloud clusters. For this purpose, an effort will be made to employ
AVHRR and VISSR (GMS) data along the flight track to estimate the number of independent
clear, cloudy, and anvil samples obtained by the ER-2, the Learjet, and the P-3. AVHRR data
will also be used to estimate the greenhouse effect and cloud forcing along the flight track, to get a
crude estimate of the statistics of these quantities. These estimates will be compared with quick-
look RAMS data for spot checks.
Integration of data from the primary platforms. The primary platforms are the ER-2, the Learjet,
the P-3, and the Vickers. The data from these platforms have to be integrated to yield the estimates
of the physical quantities shown in the ovals in Figure 27. Several steps are involved in this
integration:
- Instruments within each platform have to be integrated to yield a desired quantity. An
example of how the ER-2 radiation data are integrated to obtain the column greenhouse effect (Ga)
and cloud forcing (Cl) is shown in Figure 28. Because the computation of Ga and Cl requires the
identification of whether a scene is clear or overcast, auxiliary narrow-band instruments (TDDR or
NFOVR; see Tables 6 and 7) are required, as shown in Figure 27. In the case of the Learjet (not
shown), microphysics instruments will be used to identify the scene.
- Derived quantities (e.g., Ga and Cl) from the different platforms have to be integrated.
For example, Cl from the Learjet has to be subtracted from Cl from the ER-2 to obtain anvil
heating. Likewise, Ga and Cl from the Vickers and the P-3 have to be integrated with ER-2 data to
obtain the f parameter (see Table 1). Another challenging example is water vapor, which is
measured by the Learjet, the ER-2, the Vickers, and the P-3. Data from all of these platforms have
to be integrated to map out the vertical distribution of water vapor along the equatorial Pacific.
- The final step is to obtain the means (over all of the flight legs) of the derived and
primary quantities. In the case of water vapor, this process will involve the integration of all of the
available platforms (see Figure 27).
The final product will consist of derived and directly observed quantities for individual flight or
ship tracks, as well as a representative time mean with statistics of variability. It is anticipated that
during this period the instrument PIs will critically examine their data both for quality and
instrument malfunction or drift and that data integration will aid this process.
At the end of this stage, an estimate of the west to east variations of Ga, cloud forcing, water-vapor
distribution, and evaporation along the flight tracks will be obtained. These data will be provided
to the data-integration steering committee and to the various instrument PIs.
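As an illustration of the cross-platform step described above (Cl from the Learjet subtracted from Cl from the ER-2, followed by a mean over flight legs with a measure of variability), the following is a minimal sketch. It assumes that the two platforms have already been collocated leg by leg; the function names and all numerical values are hypothetical and are not CEPEX measurements.

```python
from statistics import mean, stdev


def anvil_heating(cl_er2, cl_learjet):
    """Subtract Learjet cloud forcing from ER-2 cloud forcing, leg by leg."""
    if len(cl_er2) != len(cl_learjet):
        raise ValueError("legs must be collocated one-to-one")
    return [er2 - lear for er2, lear in zip(cl_er2, cl_learjet)]


def leg_summary(values):
    """Mean over all legs plus a simple variability statistic."""
    spread = stdev(values) if len(values) > 1 else 0.0
    return {"mean": mean(values), "stdev": spread}


# Hypothetical cloud-forcing values (W m^-2) for three collocated legs.
cl_er2 = [60.0, 75.0, 90.0]
cl_learjet = [40.0, 50.0, 70.0]

heating = anvil_heating(cl_er2, cl_learjet)
summary = leg_summary(heating)
print(heating)  # [20.0, 25.0, 20.0]
print(summary)
```

The sample standard deviation is only one possible choice for the "statistics of variability"; the plan does not specify which statistic will be used.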
Fully integrated data sets. A schematic of this integration is shown in Figure 29. First, GMS data
will be used to collocate the various platforms and to identify the cloud type (low, middle, or high)
and spatial scale (super-clusters, cumulus scale, etc.) from each of the platforms for each flight or
ship track. Data from the aircraft will be used to calibrate other, longer-term platforms to obtain
seasonal or synoptic means. For example, the P-3 data will be used to validate evaporation
estimated from the buoy data; buoy data will then be used to get seasonal means. Likewise, GMS
and AVHRR radiances will be calibrated with RAMS on the ER-2; satellite data will then be used
to get column Ga and Cl.
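The calibration of satellite radiances against aircraft measurements can be sketched as follows. The plan does not state the functional form of the calibration, so this example assumes a simple least-squares straight line between collocated satellite radiances and aircraft broadband fluxes; the function names and all numbers are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two collocated pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("satellite radiances must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrated_flux(radiance, slope, intercept):
    """Apply the fitted calibration to a satellite radiance."""
    return slope * radiance + intercept


# Hypothetical collocated pairs: satellite radiance (arbitrary units)
# and aircraft broadband flux (W m^-2).
sat_radiance = [50.0, 60.0, 70.0, 80.0]
aircraft_flux = [150.0, 175.0, 200.0, 225.0]

slope, intercept = fit_linear(sat_radiance, aircraft_flux)
print(slope, intercept)                         # 2.5 25.0
print(calibrated_flux(65.0, slope, intercept))  # 187.5
```

Once fitted on the flight-track pairs, the same coefficients would be applied to satellite radiances away from the flight track, which is what allows the longer-term platforms to supply seasonal or synoptic means.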
The overall assessment of the thermostat hypothesis depends upon the representativeness of the
along-the-flight-track data for synoptic and seasonal means. CEPEX will rely heavily upon the
integration of flight data with satellite data to obtain the spatial and temporal averages of the
quantities. For radiation fluxes, the aircraft data will serve as in-situ truth for satellite data. The
satellite radiances will be calibrated with the ER-2 and Learjet radiation fluxes to obtain broadband
long-wave and solar fluxes from AVHRR and GMS. The satellite-derived broadband fluxes will
be used to obtain synoptic maps of the column greenhouse effect and cloud forcing. With respect
to evaporation, the P-3 data will be put into bins according to the following: (1) warm ocean (SST
> Tc): clear or disturbed with deep convective and anvil cloud category; (2) cold ocean (SST < Tc):
clear or disturbed. The satellite data will be used to obtain synoptic means of these categories and
for summing the evaporation fluxes for each of these categories. Similar procedures will be
adopted for the surface radiation fluxes.
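The binning of P-3 evaporation data described above can be sketched as follows. The threshold Tc is not given a value in this chapter, so it is passed in as a parameter. The treatment of SST exactly equal to Tc, the use of satellite-derived fractional occurrence as weights, and all sample values are assumptions made for illustration only.

```python
from collections import defaultdict


def bin_key(sst, scene, tc):
    """Assign a sample to (ocean, scene); SST equal to Tc is treated as cold."""
    ocean = "warm" if sst > tc else "cold"
    return (ocean, scene)


def bin_means(samples, tc):
    """Mean evaporation flux for each (ocean, scene) category."""
    sums = defaultdict(lambda: [0.0, 0])
    for sst, scene, flux in samples:
        if scene not in ("clear", "disturbed"):
            raise ValueError(f"unknown scene type: {scene}")
        acc = sums[bin_key(sst, scene, tc)]
        acc[0] += flux
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def weighted_mean(means, fractions):
    """Combine bin means using satellite-derived category fractions."""
    total = sum(fractions[key] for key in means)
    return sum(means[key] * fractions[key] for key in means) / total


# Hypothetical samples: (SST in K, scene type, evaporation flux in W m^-2).
samples = [
    (302.0, "clear", 120.0),
    (302.5, "clear", 130.0),
    (303.0, "disturbed", 100.0),
    (298.0, "clear", 90.0),
    (297.0, "disturbed", 110.0),
]
means = bin_means(samples, tc=300.0)  # tc value is hypothetical

# Hypothetical fraction of the region occupied by each category.
fractions = {
    ("warm", "clear"): 0.4,
    ("warm", "disturbed"): 0.2,
    ("cold", "clear"): 0.3,
    ("cold", "disturbed"): 0.1,
}
print(means)
print(weighted_mean(means, fractions))
```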
CEPEX Central Archive
Each CEPEX observation satisfies a specific requirement in the design to test the thermostat
hypothesis, the central goal of CEPEX. The attainment of this goal depends critically on the
timely integration of each CEPEX data set into the CCA.
Individual investigators will submit their reduced and documented data to the CCA located at
NCAR as soon as possible, but by no later than October 1993. No integrated data sets will be
distributed from the CCA until this time. For investigators with larger data sets (i.e., F. Valero),
mutually agreed upon case studies will be processed first and submitted to the CCA by the 8
October deadline. The full data sets will be delivered to the CCA no later than April 1994 (funding
permitting). Reduced data are observations converted to the physical quantities directly sensed by
the instrument with quality-control inspection, flagging of bad data, and documentation of the
quality-control process. Documentation or metadata includes calibration, quality, and navigation
information (which describes the conversion to physical units), the conditions of observation, the
location and time of the observation (as well as instrument specifications), field performance
evaluations, and data tape format description. Upon mutual agreement of the CEPEX individual
investigators, the submission of "special analysis period" data may be requested earlier.
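The definition of reduced data above (physical quantities with bad data flagged, accompanied by documentation) can be pictured with a minimal sketch. The class names, field names, and the range check used as the quality-control test are hypothetical; the plan does not prescribe a record layout or a specific quality-control method.

```python
from dataclasses import dataclass


@dataclass
class Documentation:
    calibration: str      # how raw values were converted to physical units
    conditions: str       # conditions of observation
    instrument_spec: str  # instrument specifications
    tape_format: str      # data tape format description


@dataclass
class ReducedObservation:
    time_utc: str
    latitude: float
    longitude: float
    value: float
    units: str
    bad: bool = False  # set by quality control; bad data are flagged, not deleted


def flag_out_of_range(observations, lower, upper):
    """Flag values outside a plausible range (one simple quality-control check)."""
    for obs in observations:
        obs.bad = not (lower <= obs.value <= upper)
    return observations


# Hypothetical records and documentation, for illustration only.
doc = Documentation("lab calibration, hypothetical", "clear sky", "example sensor", "ASCII")
obs = [
    ReducedObservation("1993-03-10T02:00:00Z", -2.0, 170.0, 301.5, "K"),
    ReducedObservation("1993-03-10T02:00:10Z", -2.0, 170.1, 9999.0, "K"),
]
flag_out_of_range(obs, lower=250.0, upper=320.0)
print([o.bad for o in obs])  # [False, True]
```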
These guidelines, schedules, and protocols are intended to encourage the orderly and efficient
analyses, interpretation, and publication of the scientific results obtained through the CEPEX
project. CEPEX individual investigators may wish to apply their observations to questions related
to the primary scientific objectives and beyond. CEPEX encourages the broadest application of
CEPEX and related data (e.g., TOGA-COARE and ARM) to scientific studies. Prior to the release
of data from the CCA, the following data protocol for CEPEX individual investigators will be in
effect:
Data sharing. CEPEX individual investigators will have free access to all data acquired during the
project. The normal vehicle for data dissemination will be a transfer of data via the CCA; however,
direct transfers of data between individual investigators are also encouraged.
Data release. CEPEX individual investigators may release their data to whomever they wish.
However, they may not release the data of other individual investigators without consent.
Consent of individual investigator. No unpublished data of another individual investigator may be
used in a publication or presentation without the participation or consent of that investigator.
Collaborative investigations. Any data sets resulting from collaborative investigations involving
CEPEX individual investigators will be made available to the CCA. This includes all collaborative
efforts both within and outside the CEPEX individual investigator group.
CEPEX operations summary and data inventory document. Following the field-phase, OFPS will
produce a summary of operations conducted in the field, detailed descriptions of the various data
sets collected during CEPEX, data access procedures, and an inventory of all data collected. This
document will be produced during the final data processing and completed by early October 1993.
CEPEX synopsis paper. A synopsis of CEPEX field-phase key operational activities and
preliminary results will be prepared by project personnel and key individual investigators for
publication in an appropriate journal(s). The paper(s) will be designed to be a quick-look
publication to inform the scientific community at an early stage of the CEPEX mission and to
highlight particularly interesting observations. The paper(s) will be submitted by November 1993.
CEPEX special issue. Results from the CEPEX field-phase may be published in a special issue of
an appropriate journal. The special issue will contain (1) an overview paper co-authored by project
personnel and key individual investigators and (2) a collection of papers on CEPEX results. The
special issue decision will be made by the CEPEX individual investigators.