Sun-Earth Connection Data Availability Catalog

Questions and comments on these notes and this catalog are invited. Last revision of these notes is 01/10/2003.

Table of Contents

  • Goals
  • Introduction
  • Mission Overview Summary Matrix
  • Mission Summary Matrices
  • Summary Reports
  • Updates and Additions
  • Important Definitions
  • Appendix A: Updating the Catalog
  • Appendix B: Database Specifics
  • Mission Fields
  • Dataset Fields

    Goals

    At the request of the NASA Headquarters Office of Space Science (OSS) we have undertaken a joint effort to define and build a catalog of "Data Availability for Current Sun-Earth Connection (SEC) Missions." This catalog is an attempt to summarize the kinds and extent of data now being made available in current SEC programs. This catalog and associated reports will then serve as a focused information source for OSS and SEC interests in this topic and, secondarily, as an alternative way for the NASA science community to locate data of interest from current missions. The format of these reports is specifically laid out to encourage appropriate data access practices and to highlight possible current problem areas.

    Introduction

    The catalog report and entry pages are accessible at the URL:

  • http://spdf.gsfc.nasa.gov/SPD

    The highest-level page allows access to either reports about the data or forms to populate and update the information in the catalog. There are two basic kinds of display pages:

    Updates to specific data products may be initiated from within the summary pages. Update information entered by users will be acknowledged by a returned page but is manually reviewed before updates become active in the displayed reports (typically within one working day).

    Mission Overview Summary Matrix

    Information in the "Mission Overview" summary matrix includes:

    PDMP status is classified as "signed", "draft", "none" (does not exist), "*" (unknown or unpopulated status) or "other". Where online copies or other relevant information exist, the status for that project is a link to that information.

    Mission Summary Matrices

    The layout of the mission summary matrices is designed to encourage good overall practices in making SEC data available. The model is not optimally applicable to all missions but is reasonable for most providers. The matrices should not be used on their own for any purpose of evaluation; they should always be read together with the more descriptive information in the catalog.

    Information in the mission summary matrices includes:

    Data products are presently divided into three categories:

    For each data product, six critical attributes are identified:

    Presently, valid values for these fields are generally "Y" (yes), "N" (no), and "*" (unknown or value not yet entered). "Public?" has the special value "R" (by request only) where appropriate. "Usable?" has the special value "D" (difficult to use) where neither a simple "Yes" nor a simple "No" quite captures the sense of that data product. Each "Y" for "Exists?" is linked to the section of the "one-pager" for that investigation/data product. A few "N" values are also so linked where data products have been defined but do not yet exist and/or exist with severe limitations.
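
    The value rules above can be sketched as a small validity check. This is only an illustration, not the catalog's actual implementation: the function name and the dictionary layout are assumptions, and only the attribute names "Public?", "Usable?" and "Exists?" and the flag letters come from the text.

```python
# Flag values described in the notes: shared by every attribute.
COMMON_VALUES = {"Y", "N", "*"}

# Extra values allowed only for specific attributes (per the notes).
SPECIAL_VALUES = {
    "Public?": {"R"},   # by request only
    "Usable?": {"D"},   # difficult to use
}


def is_valid_flag(field: str, value: str) -> bool:
    """Hypothetical helper: True if `value` is allowed for attribute `field`."""
    allowed = COMMON_VALUES | SPECIAL_VALUES.get(field, set())
    return value in allowed


print(is_valid_flag("Public?", "R"))   # "R" is allowed for "Public?"
print(is_valid_flag("Exists?", "R"))   # but not for "Exists?"
```

    Keeping the special values in a per-attribute table means a new special value would need only one new entry rather than a change to the check itself.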

    In the mission summary matrix displays, these are shortened to six columns under each data category, headed by the abbreviations E, P, U, C, A, and Arc. The values displayed in the matrices represent the "maximum" of the values for individual data products where more than one data product in a given category exists. For example, three distinct digital high-resolution data products for an investigation with "Public?" status flags of "Y", "R" and "N" would populate the "Public?" field for digital high-resolution data on the mission matrix as "Y."
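
    A minimal sketch of this "maximum" rule follows. The ranking used here is an assumption: the notes confirm only that "Y" outranks "R" and "N", so the relative order of "R", "D", "N" and "*" below is a guess, and the function name is hypothetical.

```python
# Assumed ranking of flag values, lowest to highest.
# Only "Y" > "R" and "Y" > "N" is stated in the notes; the rest is a guess.
RANK = {"*": 0, "N": 1, "R": 2, "D": 2, "Y": 3}


def matrix_value(flags):
    """Collapse the flags of several data products into one matrix cell."""
    if not flags:
        return "*"  # assumption: no products is shown as unknown/unpopulated
    return max(flags, key=RANK.__getitem__)


# The example from the notes: "Public?" flags of three high-resolution products.
print(matrix_value(["Y", "R", "N"]))
```

    "R" and "D" share a rank because, per the notes, they belong to different attributes ("Public?" and "Usable?") and so should never be compared with each other.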

    Some data products may be referenced more than once, where they contain significant data from more than one investigation or where they are mission-wide data products to be reflected both on the mission line and for all included investigations. The prototype example of such a product is the ISTP Key Parameters data product. Some data products will be entered both as graphical and digital data (if the data content is available in both ways).

    Please see the section below on "Important Definitions" for further discussion on how the above fields are defined and how valid values are intended to be assigned.

    Summary Reports

    The summary format provides more in-depth information about data products from a specific mission or investigation, and includes hot links to all known sources of data. The pages are intended to start with a very short summary of the investigation's intrinsic data capabilities, followed by a section for each class of data. Within a class, each described data product includes information about the data's time span, whether documentation and software are available, as well as general comments. Links to the data source and known alternative sources are given. Links also now exist to update pages (see below) for investigations and for individual data products directly from those sections.

    Updates and Additions

    Updates or additions to the catalog may be entered via forms linked to various pages, or by sending the appropriate information through the "feedback" page of this catalog or by direct e-mail to the staff.

    When a mission, investigation or data product is selected for entry/update, a mission/investigation update page is generated with the current catalog information pre-populated. When an update is submitted via the update forms, an acknowledgment page for the update request is returned to the originator. However, the catalog and underlying database are NOT directly updated, nor will changes be seen, before a minimal manual screening by the staff here is performed (usually within one working day). We apologize for this inconvenience, but we need to reasonably ensure that updates are valid and come from credible sources. The catalog has been structured so that updates and additions are only infrequently required after initial population.
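
    The submit-then-review flow described above can be sketched as follows. This is an illustration under stated assumptions, not the catalog's real code: the class, method names and acknowledgment text are all hypothetical, and the staff's manual screening is stood in for by a callback.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical model: updates are queued until staff review them."""
    active: dict = field(default_factory=dict)   # what the reports display
    pending: list = field(default_factory=list)  # submitted, not yet screened

    def submit(self, key, changes):
        """Queue an update and return an acknowledgment; reports are unchanged."""
        self.pending.append((key, changes))
        return f"Update request for {key} received; pending staff review."

    def review(self, approve):
        """Apply each pending update accepted by the reviewer callback."""
        for key, changes in self.pending:
            if approve(key, changes):
                self.active.setdefault(key, {}).update(changes)
        self.pending.clear()


catalog = Catalog()
print(catalog.submit("ExampleProduct", {"Public?": "Y"}))
print(catalog.active)                        # still empty before review
catalog.review(lambda key, changes: True)    # stand-in for manual screening
print(catalog.active)
```

    Separating the pending queue from the active entries is what keeps unreviewed submissions out of the displayed reports.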

    Among the most important update fields is the "certification date" by the Project Scientist or Principal Investigator (or his/her agent) as appropriate. This field is intended to establish that our information has been reviewed by an authoritative source and found generally sufficient as of a given date. As the standards of the catalog evolve, as more information becomes available or as the information in the catalog is reviewed, further updates may be made by the staff here in a few cases. It is our intention that PIs will be informed of substantive changes and that their "certification" stamp will be removed when such changes are made. Full deletion of data products once initially defined can only be done by the system administrator.

    Important Definitions

    Many fields in the catalog (a) should have self-evident definitions and (b) have been generally pre-populated for known missions, investigations and data products. Please see Appendix B for a full list of definitions.

    A basic question is what constitutes a "reportable" data product in the context of this information summary. The test is very much at the discretion of the data provider. Products reported should have significant value in representing that investigation's data; they are typically of greatest interest when comprehensive and current in time coverage. Lower-resolution products will typically be more time-continuous. Non-time-continuous products should have clear special value (highest resolution, or special processing adding unique and important value). Providers should aggregate descriptions of related data products where sensible. Providers should initially classify and characterize products as they think the community would find fair. The definitions and standards are being kept loose so providers can use good judgment within an organized framework (i.e. organized enough to do "matrix" summaries) without being forced into too-narrow definitions.

    Beyond this question of "what's a data product?", the fields requiring further discussion are the data product categories and the data product characteristics.

    Data Product Categories: Multiple data product categories are defined to separate different levels of data products being made accessible and to try to encourage the data-providing community to supply a more appropriate range of data products generally. The definitions of the product categories are also deliberately kept loose.

    The distinction between "graphic" and "digital" products should be self-evident. "Full-telemetry-resolution" products are those that nominally preserve the full data content of what is sent from the spacecraft.

    The prototype example of a "digital low-resolution" data product is the ISTP "Key Parameters" (KPs). Thus, low-resolution digital data are approximately "1-minute average" field, plasma and particle data and "few minute" average imager data, though this varies by investigation. Data products are more or less time-continuous for times of instrument operation. Such data include some range of typical/important parameters from the investigation, including some calibration to geophysical units. Averaging of spectral data into broader energy bands, and of images to a useful but less than full intrinsic resolution, is typical. Typical research uses of a low-resolution digital data product include characterization of data on larger physical or time scales and/or setting a context for understanding higher-resolution data from another investigation.

    One kind of high-resolution data product is continuous for times of instrument operation at "medium" or "higher" time resolution (e.g. spin resolution) with a reasonably complete set of important physical parameters at reasonably detailed energy, spatial etc. resolution. A second kind of digital high-resolution product is data rendered to physical/calibrated units at the highest possible instrument resolution but possibly generated only for a set of times ("events", perhaps for only 5-10% of the total instrument time coverage) of special interest. Both kinds of products are highly desirable.

    Data Product Characteristics: In addition to free-form text fields holding very brief English-language descriptions of data product characteristics, there are also six important characteristics of any given data product.

    The question of "existence" is addressed above in the discussion of what kinds of data products should be included. Data products that are only a sampling of very out-of-date data should probably not be included.

    Data are "public" when they are available to all users without any requirement for specific approval of their subsequent use by the Principal Investigator / data provider. Data may be supplied with many caveats and still be public. Data products classified as public for this set of summaries should be produced and released in a timely way. The characterization "public?" includes a separate value for data to be made available "by request" as opposed to directly accessible to all users.

    Data "usability" is an attempt to capture whether an appropriate combination of documentation about the format, meaning and appropriate use of the data (including software and any other ancillary materials) exists and is readily available, so that a given data product can actually be independently understood and correctly used for research purposes. This characterization has been introduced to flag that some data products are presently being held archivally to ensure their preservation but do not yet have any reasonable set of materials to make the data usable by anyone outside the Investigator team. Other data products have nominal format descriptions but very limited description of what is actually being measured. The special value "difficult" exists for this field to characterize data that may have a nominally complete description but are complex enough that real understanding and use of the data is highly non-trivial. Much "full-telemetry-resolution" data will be classified as "difficult" (at best) to use.

    Data are "citable" if their use (without further consultation with the provider) is not explicitly disallowed. However, (a) data may be described as "citable with caution", including the recommendation that the data provider be consulted for research use as a good practice, and (b) the data user is always ultimately responsible for ensuring that the data have been reasonably cross-checked to be reliable and are not being accidentally misinterpreted or misused.

    Data are "electronically Accessible" when they can be retrieved more or less routinely via network links. Online data is clearly the base concept. But electronically accessible data in this context may be maintained in robotic, batch-operation systems that promote or deliver data on request within hours or a day normally. Electronically accessible data may be maintained in manual tape libraries so long as there is an established and generally routine/reliable procedure for promoting/delivering data in a timely way. Electronically-accessible data will typically be supplied at no cost to users.

    Data are considered formally "Archived" when submitted (or being submitted on an ongoing basis) to a recognized NASA (e.g. NSSDC, SDAC) or other Agency center. Missions or data providers who feel their data are being archivally held but not at such a center should be sure to indicate what facility they consider to be the archival holder of their data at this time.



    Curator:

    Howard Leckner, leckner@mail630.gsfc.nasa.gov, (301)286-9270
    QSS, Code 632,
    NASA/Goddard Space Flight Center, Greenbelt, MD 20771

    NASA Official: R. E. McGuire,
    Head, Space Physics Data Facility
    (Code 632, NASA/GSFC),
    Robert.E.McGuire@gsfc.nasa.gov, (301)286-7794.

    Last Modified: 03/24/97 GCG.