Report of the Sun-Earth Connection Study Team
of the Space Science Data Systems Technical Working Group
July 1998, River Bend Workshop
Richard Bogart (Stanford University)
Robert Hanisch (Space Telescope Science Institute)
Joseph King (National Space Science Data Center)
Roger Pyle (Bartol Research Institute/University of Delaware)
David Sibeck (The Johns Hopkins University Applied Physics Laboratory)
Ray Walker (University of California Los Angeles)
David Winningham (Southwest Research Institute)
Executive Summary
This report describes the system architecture
and organization of a data service to manage the data associated with the
Sun-Earth Connection theme within the context of the NASA Space Science
Data System. We propose a distributed service in which the data sets are
managed at a number of sites by scientists actively involved in data analysis.
These data providers will be grouped under three service groups: Solar
Physics, Terrestrial Environment Imaging, and In Situ Space
Physics. The data providers will also be organized into three cross-cutting
user views: The Sun as a Star, The Sun in Space, and The
Earth in Space. The organization of the proposed data service will consist
of a "thin" management layer that will provide a unified budget and reporting
path, and a governing council that will ensure interoperability and sensitivity
to interdisciplinary needs. The overall budget for the fully organized
and operational data service is estimated at $5 million per year.
1 Introduction
Scientific discoveries provide the true return
from NASA's investment in space exploration. To maximize this return, researchers
must be able to exploit the rich database of observations from NASA's past,
present, and future missions. Successful data use requires easy location
of and access to active and historical archives, as well as the information
and tools necessary to interpret the observations. Functions are being
incorporated within the NASA Space
Science Data System (SSDS) designed to provide common search and access
tools across the range of space science. An essential component of the
SSDS will be a system designed to manage the abundance of data associated
with the Sun-Earth
Connection (SEC) theme, with an emphasis on enabling correlative studies
both within the theme and within the SSDS. This report recommends a structure
for such a service and describes the relation of that service to broader-level
SSDS activities.
It is both natural (in view of limited
budgets) and desirable (for cross-discipline compatibility) that the nascent
Sun-Earth Connection Data Service (SECDS) will make use of, and build upon,
the tools and organizational structure developed by the more mature astrophysics
data environment, the Planetary Data System (PDS), and other existing systems.
These systems have demonstrated that there are many benefits to improving
community organization to a greater degree than has been the case within
the Sun-Earth Connection theme. Their use of common data systems and formats
greatly facilitates the delivery of data. Nevertheless, it is clear that
any successful data system must evolve from existing SEC functions and
services, rather than being imposed upon the research community. It is
also a central tenet of the SSDS that data are best managed by scientists
actively engaged in their analysis.
There are some problems in the SEC data
environment that are serious impediments to scientific research. These
problems include: 1) publicly inaccessible data sets; 2) data sets that
are only available in formats which are ineffective for scientific analysis;
3) data set documentation that does not support independent data use; 4)
the excessively wide range of data formats in use; and 5) difficulty of
locating data in today's distributed data environment. While some important
data sets and services are currently conveniently accessible to the SEC
community from a number of sites, we believe the establishment of an SECDS
is important. The SECDS will be established to: 1) ensure accessibility
to a broad suite of data sets; 2) promote interoperable search and use
of data and other services across multiple SEC disciplines and sites; 3)
improve the interface through which mission data flow to public-access
sites; and 4) promote interoperability across the entire space science
domain through participation in and adherence to the standards of SSDS.
These recommendations are an outgrowth
of the Community Wide
Workshop on NASA's Space Physics Data System held at Rice University
in June 1993, and are guided by the recommendations
of the Task Group on Science Data Management. These recommendations
are also responsive to NASA's Science
Information Services Study Team preliminary report. They are a direct
result of the efforts of the SSDS
Technical Working Group (SSDS TWG) to increase interoperability between
the various space science disciplines.
1.1 General Requirements
The SECDS will serve the needs of scientists
in and across the traditional disciplines of Solar Physics, Cosmic and
Heliospheric Physics, and Magnetosphere, Ionosphere, Thermosphere, and
Mesosphere Physics. From a user viewpoint, it will be capable of supporting
research along three major scientific themes: The Sun as a Star (including
helio- and asteroseismology, solar and stellar activity, and luminosity
variations); The Sun in Space (including studies of coronal heating, the
solar wind, the heliosphere, interplanetary energetic particle populations,
and the interplanetary magnetic field); and The Earth in Space (including
the Earth's magnetosphere and upper atmosphere and their response to the
changing space environment). From a data management viewpoint, the SECDS
will be organized by general data classification: Solar Physics, Terrestrial
Environment Imaging, and In Situ Space Physics (see Figure
1).
The SECDS is also designed to assist scientists
in more diverse fields who need to analyze or correlate data arising from
research in different traditional disciplines. It must assume responsibility
for all data sets of scientific interest resulting from NASA missions and
investigations within the Sun-Earth Connection Theme and the former Space
Physics Division, including data from such projects as the International
Solar Terrestrial Physics Program (SOHO, Polar, Wind, Equator-S, Geotail),
Yohkoh, TRACE, Ulysses, ACE, FAST, TIMED, SMM, UARS, KPVT, IMP8, and Voyager
1 & 2 missions. The SECDS should also strive to provide full access
to and interoperability with other relevant data archives, including both
non-NASA space missions and ground-based observatories. In cases where
such data are important to Office of Space Science (OSS) research and are
at risk of becoming inaccessible, SECDS should endeavor to preserve, curate,
and archive the data in accessible form.
The SECDS will function as an integral
component of the SSDS with the aim of developing a level of connection
and interoperability useful to scientists involved in cross-disciplinary
research. Particular attention should be paid to coordination with closely
related disciplines in other fields, such as stellar activity, stellar
winds, asteroseismology, particle physics, planetary magnetospheres and
atmospheres, and cosmic ray sources.
1.2 Services
The SECDS will provide the following primary
services through a distributed data system architecture (described in Section
3):
- Direct and rapid electronic access to data
sets from all existing and future missions. Since the data sets are currently
held at a large number of locations, the SECDS must provide a catalog of
data set holdings within the community. It is expected that data distribution
via the Internet will continue to increase in popularity; however, provisions
must also be made for delivery of substantial data sets on off-line media
as appropriate.
- Arrangements for the permanent (deep) archiving
and suitable curation of all scientifically valuable data sets obtained
by NASA missions and NASA-funded investigations. As resources permit, the
SECDS will also incorporate data sets obtained by non-NASA funded domestic
and foreign agencies.
- Coordination of the development and review
of mission data management plans for NASA missions, taking account of community
needs. SECDS will identify problems with these plans and will report on
the status of these plans to NASA Headquarters. SECDS will provide guidelines
for data management plans, documentation, formats, media, and archiving
that support standardization within SEC and, where possible, across the
solar-terrestrial, astrophysics, and planetary science disciplines.
- Development of standards related to data formats,
media, and documentation through work with community members. Information
and expertise on data standards will be provided to both data providers
and users of archived data.
- Assistance in the development of search tools
and browse-level data products for access to data catalogs and relevant
information, such as spacecraft trajectories, observing times, and object
locations, within the framework of the SSDS.
- Restoration of previously obtained data sets
relevant to the SEC theme and support for the development of value-added
products. SECDS will identify data sets requiring restoration and/or enhanced
levels of accessibility on the basis of provider and user interest.
- Coordination of activities involving public
outreach and education with the relevant data centers.
2 Scientific User View of the System
The scientific user view of SECDS reflects
scientific interests that may lie in accessing cross-disciplinary data
sets. This suggests that a user interface presenting three cross-cutting
thematic categories would be useful. We propose that the thematic categories
The Sun as a Star, The Sun in Space, and The Earth in Space
be implemented at the SECDS coordination level and overseen by the Project
Scientist and the Science Members of the Coordinating Council. Each category
will consist of a mapping of relevant data sets from areas covered by the
three Service Groups (see sections 3 and 4 for details concerning these
entities).
As seen in Figure 1,
each Service Group is responsible for providing a portion of each thematic
view of the data. The exact mechanism by which each thematic view is presented
will be determined by the Service Groups.
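One way to picture the mapping described above is as a small lookup table in which each data set records its Service Group and the thematic views it contributes to. The sketch below is illustrative only: the data set names and their theme assignments are hypothetical placeholders, not assignments made by this report, and the actual mechanism remains for the Service Groups to determine.

```python
from collections import defaultdict

# Hypothetical entries: (data set name, Service Group, thematic views).
# The names and assignments are placeholders for illustration only.
DATA_SETS = [
    ("example_helioseismology_set", "Solar Physics", ["The Sun as a Star"]),
    ("example_solar_wind_set", "In Situ Space Physics",
     ["The Sun in Space", "The Earth in Space"]),
    ("example_auroral_image_set", "Terrestrial Environment Imaging",
     ["The Earth in Space"]),
]


def build_thematic_views(data_sets):
    """Group data sets by thematic view, keeping track of the Service Group."""
    views = defaultdict(list)
    for name, group, themes in data_sets:
        for theme in themes:
            views[theme].append((group, name))
    return dict(views)


views = build_thematic_views(DATA_SETS)
```

In this arrangement a single data set can appear under several thematic views while still being managed by exactly one Service Group.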
3 Data Management System Architecture
The overall system architecture of the SECDS
is designed to accomplish two goals: 1) to provide users with rapid access
to well-documented Sun-Earth Connection data, and 2) to provide efficient
management of the SECDS. To this end, the SECDS will consist of three levels
that represent different management, user service, and data provider responsibilities
(see Figure 2).
The first level in the SECDS architecture
will consist of a "thin" Management Office, an Advisory Committee, and
a Coordinating Council. These three bodies are described in detail in section
4.
The next level of the SECDS will be composed
of service groups organized by scientific discipline or data type. We propose
that there be three groups at this level that are data-oriented and reflect
similar collection procedures and data structures. These groups will be
organized around Solar Physics, Terrestrial Environment Imagery (auroral
and energetic neutral imaging, for example), and In Situ Space Physics
(e.g., interplanetary measurements and magnetospheric, ionospheric, and
atmospheric data sets). The responsibilities of these three Service Groups
are given in section 3.1.
The third level in this system architecture
will consist of a dynamically evolving set of Data Providers that will
accomplish specified tasks including data set management and the development
of software tools. These Providers may supply data to one or more Service
Groups, depending on their function. There may also be Support Providers
at this level, which will be chartered to provide specified software tools
or support functions to a particular Service Group. The Providers' responsibilities
are discussed in section 3.2.
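The three levels can be summarized as a simple data structure. The following is a minimal sketch, assuming hypothetical class and method names (`Provider`, `ServiceGroup`, `DataService`, `attach`, `groups_served_by`) and a placeholder Provider name; it shows only the relationships stated above, in particular that one Provider may supply more than one Service Group.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Level III: a Data Provider or Support Provider (hypothetical model)."""
    name: str
    kind: str = "data"  # "data" or "support"


@dataclass
class ServiceGroup:
    """Level II: a discipline- or data-type-oriented Service Group."""
    name: str
    providers: list = field(default_factory=list)

    def attach(self, provider):
        # A Provider may be attached to more than one Service Group.
        if provider not in self.providers:
            self.providers.append(provider)


@dataclass
class DataService:
    """Level I bodies plus the Service Groups they coordinate."""
    level_one: tuple = ("Management Office", "Advisory Committee",
                        "Coordinating Council")
    groups: dict = field(default_factory=dict)

    def add_group(self, name):
        self.groups[name] = ServiceGroup(name)
        return self.groups[name]

    def groups_served_by(self, provider):
        return [g.name for g in self.groups.values() if provider in g.providers]


secds = DataService()
solar = secds.add_group("Solar Physics")
imaging = secds.add_group("Terrestrial Environment Imagery")
in_situ = secds.add_group("In Situ Space Physics")

shared = Provider("hypothetical_provider_a")  # placeholder name
solar.attach(shared)
in_situ.attach(shared)
```

Here `secds.groups_served_by(shared)` lists both Service Groups that the placeholder Provider supplies.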
3.1 Level II Service Groups
The primary role of each Level II Service
Group is to identify, acquire, validate, and deliver scientific data sets
in its area of responsibility to scientific users and to provide for their
permanent archive in a timely and cost-effective manner.
3.1.1 Data Identification and
Acquisition
An integral part of the data identification
and acquisition process will be the pre- and post-launch interactions between
the Level II Service Groups and spaceflight project personnel. These groups
will work to ensure the orderly and timely flow of well-documented and
standardized data into SECDS archives. To accomplish this task, scientific
and technical experts within the Level II Service Groups will interface
with project personnel starting early in the project data management planning
phases. Data identification and acquisition topics that will be discussed
include 1) a Project Data Management Plan (PDMP), 2) relevant standards
and guidelines for the preparation and archiving of data and supporting
material, and 3) tools and services available through SECDS and elsewhere
that will be useful in data preparation and archiving and also for reaching
the project's science objectives.
PDMPs will be reviewed by Service Group
personnel for adherence to SECDS guidelines and standards. These personnel
will also review (and arrange for reviews of) data and supporting material
to be archived as those products are first created and judged ready for
archiving by Data Providers (project or PI level) and iterate with Data
Providers as needed to ensure that the products are correct, complete,
comprehensible, and standardized. Some current or older projects may be
too budget-constrained to provide the best organized and annotated still-reversible
data set and supporting material. In these cases, Service Group personnel
will consult with project-level and/or instrument PI personnel to define
a data preparation/archiving activity that represents an affordable effort
and yields the data products with the best benefit-to-cost ratio in light of
anticipated future demand. Data prepared for archiving must be documented sufficiently
to support independent use.
Service Group and project personnel will
explore the feasibility and cost effectiveness of providing public access
to SECDS-adherent data and supporting material from project facilities
while those facilities exist. In many cases, Service Group personnel will
also work directly with instrument Principal Investigators (PIs) concerning
the archiving and public accessibility of data sets and supporting material
created at PI sites rather than at central project facilities. Such PI-specific
efforts (data products, supporting materials, schedules and pathways for
archiving, etc.) should also be addressed in the PDMP.
In interactions with both project-level
and PI-level personnel, a key role of Service Group personnel will be to
explain the available tools and services. These tools and services will
aid in satisfying requirements and may be useful for project and PI data
management and analysis.
3.1.2 Data Preparation
Level II Service Groups identify potential
Level III Data Providers, work with them to develop mutually agreeable
data formats and media, and sponsor (either funded or unfunded) the preparation
by the Providers of data sets in these formats and media, along with documentation
sufficient to interpret the observations. The organization and requirements
of the Level II Service Groups must be sufficiently flexible to be responsive
both to the needs of new missions and to evolving trends in information
technology as they affect the scientific community.
Service Groups validate data sets and accompanying
metadata as they are produced and ingested.
3.1.3 Data Access
Rapid access to well-documented data (including
the results of relevant models) is the ultimate goal of the SECDS. With
advice from the community and in response to user requests, the SECDS will
provide for the most rapid possible access to the digital data within the
constraint of available resources. The locations of the data repositories
and the means of access may vary with both data set and time, in response
to community needs.
3.1.4 Value-Added Services
A distributed, dynamically updated, on-line,
searchable catalog of data sets, metadata, software, and relevant models
will be an essential feature of the SECDS. The catalog must be consistent
with those maintained by the Planetary and Astrophysics communities (i.e.,
for integration into the SSDS) and must be compatible with the search engines
that these communities employ. The catalogs will be automatically updated
by the Service Groups and Data Providers using distributed data base technologies,
reflecting newly available resources and changes in the status of existing
resources. Service Groups, in consultation with each other, the SECDS management,
and SSDS representatives, will identify keywords suitable for describing
SECDS resources to members of both the SECDS and broader SSDS communities.
In particular, these catalogs must be developed in parallel with the 'search'
and 'browse' functions described below, so as to provide a comprehensive
view of the data holdings in the system. The catalog must provide a short
but complete description including keywords, information concerning the
time period covered, and pointers to contact persons and bibliographies
concerning the resources. Finally, the catalogs (or subsections of them)
must be downloadable by interested users or service providers.
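As an illustration of the catalog requirements above, the sketch below models one catalog entry with the fields the text calls for (a short description, keywords, time period covered, contacts, and bibliography) and shows how a subsection of the catalog could be exported for download. The field names, the `export_subset` helper, the JSON export format, and the sample entries are all assumptions made for this example; the report does not prescribe a schema.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import date


@dataclass
class CatalogEntry:
    """One catalog record (hypothetical field names)."""
    resource_id: str
    description: str
    keywords: list
    start: date  # first day covered by the resource
    end: date    # last day covered by the resource
    contacts: list = field(default_factory=list)
    bibliography: list = field(default_factory=list)


def export_subset(entries, keyword):
    """Return a JSON string holding the entries that carry the given keyword."""
    selected = [e for e in entries if keyword in e.keywords]
    # default=str turns date objects into ISO-style strings.
    return json.dumps([asdict(e) for e in selected], default=str, indent=2)


# Placeholder entries for illustration only.
entries = [
    CatalogEntry("example-001", "Hypothetical magnetometer record",
                 ["magnetic field", "in situ"],
                 date(1995, 1, 1), date(1996, 12, 31),
                 contacts=["contact@example.org"]),
    CatalogEntry("example-002", "Hypothetical auroral image set",
                 ["aurora", "imaging"],
                 date(1996, 3, 1), date(1997, 3, 1)),
]

subset = export_subset(entries, "aurora")
```

A plain-text export of this kind is one simple way to make catalog subsections downloadable by users or other service providers.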
Service Groups, and others, will provide
search interfaces which serve as primary entry points for potential data
users. By querying the search interfaces with appropriate keywords, users
obtain lists of all links to resources matching the query. Possible search
criteria include, but are not limited to, time interval, spacecraft, instrument/data
type, and region of space. It must be possible to use the interface iteratively
to isolate the data set of interest. Lower level search engines must be
able to forward unsuccessful or incomplete requests to other elements of
the distributed search facilities transparently. The search interface should
also provide information concerning available tools, value-added products,
contact points, and bibliographies for selected data sets.
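The search behavior described above, filtering by criteria and transparently forwarding unsuccessful requests, can be sketched as follows. This is a minimal illustration under stated assumptions: the `Resource` and `SearchNode` classes, their fields, and the example link are hypothetical, only a few of the possible criteria are modeled, and forwarding is triggered only when the local search returns nothing (incomplete requests are not modeled).

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Resource:
    link: str
    spacecraft: str
    data_type: str
    start: date
    end: date


@dataclass
class SearchNode:
    """One element of the distributed search facility (hypothetical model)."""
    name: str
    holdings: list = field(default_factory=list)
    peers: list = field(default_factory=list, repr=False, compare=False)

    def local_search(self, spacecraft=None, data_type=None, start=None, end=None):
        hits = []
        for r in self.holdings:
            if spacecraft and r.spacecraft != spacecraft:
                continue
            if data_type and r.data_type != data_type:
                continue
            # Keep resources whose coverage overlaps the requested interval.
            if start and r.end < start:
                continue
            if end and r.start > end:
                continue
            hits.append(r.link)
        return hits

    def search(self, _visited=None, **criteria):
        visited = _visited if _visited is not None else set()
        visited.add(self.name)
        hits = self.local_search(**criteria)
        if hits:
            return hits
        # Nothing found locally: forward the request to peers not yet asked.
        for peer in self.peers:
            if peer.name not in visited:
                hits = peer.search(_visited=visited, **criteria)
                if hits:
                    return hits
        return []


solar = SearchNode("solar")
in_situ = SearchNode("in_situ", holdings=[
    Resource("http://example.org/wind/mfi", "Wind", "magnetic field",
             date(1995, 1, 1), date(1997, 12, 31)),  # placeholder link
])
solar.peers.append(in_situ)
in_situ.peers.append(solar)

links = solar.search(spacecraft="Wind",
                     start=date(1996, 1, 1), end=date(1996, 6, 30))
```

The `visited` set keeps a forwarded request from circulating indefinitely between nodes that list each other as peers. A user can call `search` repeatedly with narrower criteria to isolate the data set of interest.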
Service Groups may produce value-added
products and tools such as browse (low resolution) parameters and on-line
plotting routines suitable for both standard and interdisciplinary scientific
research, or they may commission Data Providers to undertake these tasks.
The browser should be able to view both the entire data set and selected
short intervals when appropriate. These tools should be applicable across
the browseable data sets held by the SECDS. Service Groups must ensure
that efforts to develop tools do not duplicate those already being used
in the community. Tools should be written as system-independent software
to facilitate maximum portability. Such software products must be registered
in the searchable catalog.
Service Groups will identify general-purpose
tools and disseminate knowledge about them to the community and to the
other Service Groups. Under some circumstances, with the approval of the
Coordinating Council, they may establish Support Providers to develop general-purpose
software.
3.1.5 Standards
Service Groups will develop and/or apply standards
for user interfaces, directories, data formats, documentation, node connectivity,
distribution and archive media. They will set expectations for Data Providers
and develop criteria for evaluating their success. Proposed standards should
be considered and approved by the Coordinating Council.
3.1.6 Compliance
The Level II Service Groups will primarily
be responsible for facilitating data access for the scientific community.
This will involve encouraging and facilitating adherence to data preparation
and archiving standards and schedules. SECDS Service Groups will periodically
report to the Coordinating Council (and, through it, to NASA Headquarters)
on the level of compliance with expected standards and schedules by spaceflight
projects and other NASA-funded Data Providers. Ultimately, it is NASA's
responsibility to provide incentives for compliance for both NASA and non-NASA
official Data Providers.
3.1.7 Communications
Service Groups maintain links to each other
and a distributed system of multiple access points to the SECDS.
Service Groups may keep records of user
names and data set requests in order to establish usage rates for metric
reporting, contact users concerning improvements in service or updates
to databases, and identify needs for improvements in service.
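Usage rates for metric reporting could be derived from such records along the following lines. This is only a sketch: the log layout (user name, data set identifier, ISO date) and the sample records are hypothetical, and the report does not define which metrics are to be reported.

```python
from collections import Counter

# Hypothetical request log: (user name, data set identifier, ISO date).
REQUEST_LOG = [
    ("user_a", "dataset_1", "1998-07-01"),
    ("user_b", "dataset_1", "1998-07-02"),
    ("user_a", "dataset_2", "1998-07-15"),
    ("user_a", "dataset_1", "1998-08-03"),
]


def usage_by_month(log):
    """Count requests per (data set, YYYY-MM) pair."""
    counts = Counter()
    for _user, dataset, day in log:
        counts[(dataset, day[:7])] += 1
    return counts


def distinct_users(log):
    """Map each data set to the set of users who requested it."""
    users = {}
    for user, dataset, _day in log:
        users.setdefault(dataset, set()).add(user)
    return users


monthly = usage_by_month(REQUEST_LOG)
users = distinct_users(REQUEST_LOG)
```

The per-data-set user sets also give a Service Group the list of people to contact when a database is updated.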
3.2 Level III Data and Support Providers
Level III Data Providers will provide data
and technological expertise to the SECDS. Typical Data Provider tasks include
ingesting data from both active and past missions into the SECDS, entering
and updating the characteristics of these data sets into the distributed
data base (Figure 3), and providing verified data
to users. Under certain circumstances, Support Providers may be established
to prepare tools for the SECDS. Existing data centers and archives will
be assimilated into the SECDS as appropriate, and new Providers may be
created to respond to specific needs. Level III Data Providers are best
viewed as typically transient extensions of Level II Service Groups. Data
Providers receive their charter and funding through Service Groups.
All Data Providers will be required to
hand the data off to the Service Group prior to the conclusion of the Data
Provider's activities. Upon selection, each Provider will prepare a data
transfer plan. The data must adhere to SECDS standards for data documentation
and be on archivable media. The Service Group will make sure that the data
transfer plan adheres to SECDS standards and that the data are in a form
that the Service Group can make accessible to the community after the relationship
with the Provider formally ends. The Service Group has the responsibility
for data access after the Data Provider's activities end. This does not
necessarily mean that the Service Group will provide access to the data
directly. Its job is to make sure the data are accessible.
Not all Data Providers need to receive
funding from NASA or the SECDS. Other US government agencies or foreign
entities may sponsor the formation of Data Providers. If these Data Providers
subscribe to SECDS and SSDS data policies, they may associate with Service
Groups to become part of the SECDS.
4 Management Structure
4.1 Overview
The management of the SECDS will reflect the
system's distributed architecture. The Management Office, a small, efficient
organization consisting of the Project Scientist, the Project Manager,
and clerical support, will have overall responsibility for the SECDS. The
Management Office will be responsible for the fiscal management of the
project and coordination of the science activities. A separate Coordinating
Council will set SECDS policy and an Advisory Committee will provide oversight
of the SECDS activities.
4.2 Governance
4.2.1 Project Scientist
The SECDS Project Scientist, an individual
cognizant and experienced in data management and use, will have overall
management responsibility for the SECDS. The SECDS Project Scientist will
work to facilitate communication between the scientific community and the
SECDS, will represent SECDS at scientific and project meetings, and will
chair the Coordinating Council. This will be a full-time position. The
Project Scientist should be an active research scientist in a field related
to the Sun-Earth Connection. It is anticipated that the Project Scientist
will spend ~50% time on SECDS management and facilitation activities and
~50% on scientific research.
4.2.2 Project Manager
The Project Manager will have the day-to-day
management responsibility for the SECDS. Under the direction of the Project
Scientist and with the assistance of the Service Groups, the Project Manager
will prepare the annual budget for the project. He or she will oversee
contract negotiations with the Service Groups and Data Providers. The Project
Manager will be responsible for coordinating the activities of the separate
SECDS bodies and for the day-to-day interaction with NASA Headquarters.
He or she will complete and submit project financial reports and will coordinate
the submission of financial reports from the Groups or Providers. This
will be a full-time position.
4.2.3 Clerical Support
A half-time clerical position will be provided
to support the Project Scientist and Project Manager.
4.2.4 Coordinating Council
The Coordinating Council will consist of the
Project Scientist (Chair), the Chief Scientists of the Service Groups,
the Project Manager, and three members of the scientific community. These
community members will represent the scientific themes (The Sun as a Star,
The Sun in Space and The Earth in Space) in Sun-Earth Connection research.
They will have the important task of assuring that the needs of their disciplines
are met by SECDS. They will be appointed by the Project Scientist and will
be compensated for their work on the Coordinating Council. The Council
will be responsible for setting SECDS policy within the overall guidelines
of NASA and SSDS policy. They will make financial decisions based on the
budget submitted by the Project Manager. They will also set priorities
for data ingestion (mission activities and data restoration projects) and
for development of standards and tools for data management and access.
The Coordinating Council must approve Data Provider funding. It is anticipated
that the Coordinating Council will meet 3-4 times per year.
4.2.5 SECDS Advisory Committee
The SECDS Advisory Committee will provide
high-level oversight of the activities of the Data Service. It will regularly
review the performance of SECDS and report to the Program Manager, Mr.
J. Bredekamp. This committee will consist of members of the scientific
community appointed through the Program Manager. It will meet twice per
year.
4.2.6 Service Group and Data
Provider Management
The chief scientists of the SECDS Service
Groups will participate in the overall management of SECDS by their membership
on the SECDS Coordinating Council. They will define and manage the ongoing
activities of the Groups. In addition, the Group scientists will be responsible
for the performance of the associated Data Providers. They will define
upcoming work and prepare budgets for the Management Office. The Group
scientists will prepare bimonthly reports of the Groups' activities and
accomplishments and deliver them to the Management Office. It is anticipated
that each of the Service Groups will establish its own advisory structure.
4.3 Selection of Service Groups and
Data Providers
The Management Office shall be competitively
selected via the standard NASA peer review process of proposals submitted
in response to a Request for Proposal (RFP). We strongly recommend that
the Management Office be at a site within the active SEC science community
recognized for its scientific, technical and management capabilities.
The Service Groups and Data Providers (many
of which are expected to reside in universities and other institutions
outside NASA) will be competitively selected. An initial SECDS RFP will
be issued specifying that three Service Groups be selected (Solar Physics,
Terrestrial Environment Imagery, and In Situ Space Physics). The responding
proposals can specify Data Providers needed to acquire the data, models,
and necessary software. Individual proposals may be for the Service Groups
only or for combinations of Service Group and Data Providers. The participants
will be selected by using the standard NASA peer review process. They will
be recompeted every 5 years. The specific relationship between the Service
Groups and Data Providers will be determined by individual contracts.
Service Groups may solicit proposals from
time to time for specified purposes. They will also annually review unsolicited
proposals for Data Providers supporting specific data sets or tools. The
SECDS Coordinating Council will have final authority to approve or disapprove
the Data Providers.
4.4 Interface to NASA
The SECDS will be funded solely by NASA OSS
for the benefit of the OSS research endeavor and to meet the OSS requirement
to ensure the public availability of SEC mission data. The SECDS is responsible
to NASA for the effective use of public funds in pursuit of its goals.
The SECDS Project Manager will submit to NASA bimonthly reports on SECDS
activities and accomplishments. Under the direction of the Project Scientist,
he or she will develop and provide to NASA budgets for future years. The
SECDS Project Scientist will arrange for Headquarters-requested reviews
of any parts of the SECDS (in addition to hosting regular SECDS Advisory
Committee reviews).
Responsibility of the SECDS to the broader
Space Science community and the SSDS is ensured via the oversight of the
ISSOMOWG and SSDS TWG.
5 Funding Level
Given the breadth of activities and scope
of data holdings associated with the SEC theme, and based on a comparison
with the other NASA OSS data systems, the full cost of SECDS activities
is estimated to be approximately $5 million per year. This includes the
costs associated with data services currently funded through active missions.
An initial funding level of $1.5 million/year should suffice for the organization
and operations of the SECDS Service Groups and Data Providers and the Coordinating
Council, ramping up to the $5 million/year level as Data Providers are
incorporated into the system.
Acknowledgments
The participants gratefully acknowledge the
logistical support of Southwest Research Institute, particularly the input
and support of Richard Murphy and Cyndi Farmer.
List of Acronyms
ACE -- Advanced Composition Explorer
FAST -- Fast Auroral Snapshot
IMP -- Interplanetary Monitoring Platform
ISSOMOWG -- Integrated Space Science Operations
Mission Operations Working Group
KPVT -- Kitt Peak Vacuum Telescope
NASA -- National Aeronautics and Space Administration
OSS -- Office of Space Science
PDS -- Planetary Data System
RFP -- Request for Proposal
SEC -- Sun-Earth Connection
SECDS -- Sun-Earth Connection Data Service
SMM -- Solar Maximum Mission
SOHO -- Solar and Heliospheric Observatory
SSDS -- Space Science Data System
TIMED -- Thermosphere, Ionosphere, Mesosphere
Energetics and Dynamics
TRACE -- Transition Region and Coronal Explorer
TWG -- Technical Working Group
UARS -- Upper Atmosphere Research Satellite