ISTP Metadata Guidelines: Variable Attributes

Variable attributes are linked with each individual variable, and provide additional information about each variable. A standard set of these attributes is very important, for this is where the information can be stored in a commonly defined manner. Note that CDF attributes are case-sensitive and must exactly follow what is shown here (ISTP attribute names must be capitalized). Additional Variable attributes can be defined but they must start with a letter and can otherwise contain letters, numbers and the underscore character (no other special characters allowed). The variable attributes can be listed in any order.

The following table lists all the attributes and the type of variables for which they are needed. If a type of variable is not listed, then that attribute need not be defined for that particular variable. However, if a given variable has an attribute that is not needed, it will be ignored in most ISTP compliant applications. (RV is record or time varying)

See Alphabetical list of Variable Attribute Definitions.

Attribute Need Notes
CATDESC Required Required for all variables
DEPEND_0 Required data, RV support_data, and RV metadata
DEPEND_1 Required data of the following form:
  • 1D spectrogram
  • 1D stack_plot
  • 2D spectrogram
  • image
DEPEND_2 Required data of the following form:
  • 2D spectrogram
  • image
DEPEND_3 Required data of the following form:
  • 3D spectrogram
DISPLAY_TYPE Required data
FIELDNAM Required All
FILLVAL Required data, RV support_data, and RV metadata
FORMAT Required all not using FORM_PTR
FORM_PTR Required 1D data, support_data, and metadata not using FORMAT
LABLAXIS Required data of the following form:
  • image
  • scalar time_series
  • 1D spectrogram
Also needed for support_data that does not utilize LABL_PTR_X
LABL_PTR_1 Required data of the following form:
  • 1D time_series
  • 2D spectrogram
Also needed for 1D and 2D support_data without a LABLAXIS
LABL_PTR_2 Required data of the following form:
  • 2D spectrogram
  • 3D spectrogram
Also needed for 2D support_data without a LABLAXIS
LABL_PTR_3 Required data of the following form:
  • 3D spectrogram
UNITS Required data and support_data not using UNIT_PTR
UNIT_PTR Required 1D data and support_data not using UNITS
VALIDMIN Required data and RV support_data
VALIDMAX Required data and RV support_data
VAR_TYPE Required all
SCALETYP Recommended data not using SCALE_PTR and support_data
SCAL_PTR Recommended multidimensional data not using SCALETYP
VAR_NOTES Recommended all
AVG_TYPE Optional data or RV support_data
DELTA_PLUS_VAR Optional data
DELTA_MINUS_VAR Optional data
DICT_KEY Optional All
MONOTON Optional Epoch data
SCALEMIN Optional data and RV support_data
SCALEMAX Optional data and RV support_data
V_PARENT Optional data
LIMITS_WARN_MIN Optional data
LIMITS_WARN_MAX Optional data
LIMITS_NOMINAL_MIN Optional data
LIMITS_NOMINAL_MAX Optional data
VARIABLE_PURPOSE Optional data
DERIVN cluster required for derived variables
sig_digits cluster recommended data
SI_conv cluster recommended data

Variable attributes for time documentation (not needed for predefined CDF_TIME_TT2000)

Attribute Need Notes
TIME_BASE Recommended (Important for netCDF files and clarity) fixed (0AD, 1900, 1970 (POSIX), J2000 (used by CDF_TIME_TT2000), 4714 BC (Julian)) or flexible (provider-defined)
TIME_SCALE Recommended TT (same as TDT, used by CDF_TIME_TT2000), TAI (same as IAT, TT-32.184s), UTC (includes leap seconds), TDB (same as SPICE ET), EME1950 [default: UTC]
LEAP_SECONDS_INCLUDED Recommended for UTC only comma-delimited list (within brackets) of leap seconds
REFERENCE_POSITION Optional Topocenter (local), Geocenter , rotating Earth geoid (used by CDF_TIME_TT2000)
UNITS Optional SI measurement unit: s, ms(milliseconds for EPOCH variables), ns(nanoseconds for CDF_TIME_TT2000), ps(picoseconds for EPOCH16)
RESOLUTION Optional using ISO8601 relative time format, for example: "1s" = 1 second. Resolution provides the smallest change in time that is measured.
ABSOLUTE_ERROR Optional Absolute or systematic error, in same units as Units attribute.
RELATIVE_ERROR Optional Relative or random error, in same units as Units attribute - to specify the accuracy of the time stamps relative to each other.
BIN_LOCATION Optional relative position of time stamp to the data measurement bin, with 0.0 at the beginning of time bin and 1.0 at the end. Default is 0.5 for the time at the center of the data measurement.

Variable Attribute Definitions in alphabetical order

ABSOLUTE_ERROR --- optional

Absolute or systematic error, in same units as Units attribute.

AVG_TYPE --- optional

sets up useful default conditions: different techniques appropriate to averaging different types of data. If this attribute is not present, standard average, i.e., simple arithmetic mean, is assumed. The value of this attribute can be used with application software. The valid options are listed below:

BIN_LOCATION --- optional

relative position of time stamp to the data measurement bin, with 0.0 at the beginning of time bin and 1.0 at the end. Default is 0.5 for the time at the center of the data measurement. Since clock readings are usually truncated, the real value may be closer to 0.0.

CATDESC --- required for all variables

(catalog description) is an approximately 80-character string which is a textual description of the variable and includes a description of what the variable depends on. This information needs to be complete enough that users can select variables of interest based only on this value. (see CDAWeb www-based interface). Examples:

DELTA_PLUS_VAR and DELTA_MINUS_VAR --- optional

are included to point to a variable (or variables) which stores the uncertainty in (or range of) the original variable's value. The uncertainty (or range) is stored as a (+/-) on the value of the original variable. For many variables in ISTP, the original variable will be at the center of the interval so that only one value (or one set of values) of uncertainty (or range) will need to be defined. In this case, DELTA_PLUS_VAR, and DELTA_MINUS_VAR will point to the same variable. See example. The value of the attribute must be a variable in the same CDF data set.

DEPEND_0 --- required for time-varying variables

explicitly ties a data variable to the time variable on which it depends. All variables which change with time must have a DEPEND_0 attribute defined. The value of DEPEND_0 is 'Epoch', the time ordering parameter for ISTP. Different time resolution data can be supported in a single CDF data set by defining the variables Epoch, Epoch_1, Epoch_2, etc. each representing a different time resolution. These are "attached" appropriately to the variables in the CDF data set via the attribute DEPEND_0. The value of the attribute must be a variable in the same CDF data set. See example.

DEPEND_1, DEPEND_2, etc --- required for dimensional variables as shown in table above. (1D time series data variables do not need a DEPEND_1 defined.)

ties a dimensional data variable to a support_data variable on which the i-th dimension of the data variable depends. The number of DEPEND attributes must match the dimensionality of the variable, i.e., a one-dimensional variable must have a DEPEND_1, a two-dimensional variable must have a DEPEND_1 and a DEPEND_2 attribute, etc. The value of the attribute must be a variable in the same CDF data set. See example.

DERIVN --- Cluster required for derived variables

A text string identifying the derivation of the variable, possibly including a function/algorithm name or journal reference. Most derived variables will not be unique, and this information is essential if the product is to be compared/validated elsewhere.

DICT_KEY --- optional

comes from a data dictionary keyword list and describes the variable to which it is attached. The ISTP standard dictionary keyword list is described in ISTP Dictionary Keywords.

DISPLAY_TYPE --- required for data variables

tells automated software what type of plot to make and what associated variables in the CDF are required in order to do so. Some valid values are listed below:

FIELDNAM --- required for all variables

holds a character string (up to 30 characters) which describes the variable. It can be used to label a plot either above or below the axis, or can be used as a data listing heading. Therefore, consideration should be given to the use of upper and lower case letters where the appearance of the output plot or data listing heading will be affected.

FILLVAL --- required for time varying variables

is the number inserted in the CDF in place of data values that are known to be bad or missing. Fill data are always non-valid data. The ISTP standard fill values are listed below. Fill values are automatically supplied in the ISTP CDHF ICSS environment (ICSS_KP_FILL_VALUES.INC) for key parameters produced at the CDHF. The FILLVAL data type must match the data type of the variable. For key parameters produced outside of the CDHF, the values below should be used. In addition, the CDF library has special cases for the FILLVAL and PADVALUE numbers for the three time variable data types, where the display string is purposefully set to a recognizable string rather than the actual stored number:

Time data type Variable type Stored number Input/Output string
TT2000 FILLVAL -9223372036854775808LL 9999-12-31:23:59:59.999999999
PADVALUE -9223372036854775807LL 0000-01-01:00:00:00.000000000
EPOCH FILLVAL -1.0E31 9999-12-31:23:59:59.999
PADVALUE 0.0 0000-01-01:00:00:00.000
EPOCH16 FILLVAL -1.0E31, -1.0E31 9999-12-31:23:59:59.999999999999
PADVALUE 0.0 0000-01-01:00:00:00.000000000000

FORMAT --- required if not using FORM_PTR

is the output format used when extracting data values out to a file or screen (using CDFlist). The magnitude and the number of significant figures needed should be carefully considered. A good check is to consider it with respect to the values of VALIDMIN and VALIDMAX attributes. The output should be in Fortran format.

FORM_PTR --- required if not using FORMAT

has as its value a variable which stores the character strings (up to 20 characters per character string) representing the desired output format for the original variable. FORM_PTR is used instead of FORMAT. The value of the attribute must be a variable in the same CDF data set.

LABLAXIS --- required if not using LABL_PTR_1

should be a short character string (approximately 10 characters, but preferably 6 characters - more only if absolutely required for clarity) which can be used to label a y-axis for a plot or to provide a heading for a data listing.

LABL_PTR_1, LABL_PTR_2, etc. --- required if not using LABLAXIS

is used to label a dimensional variable when one value of LABLAXIS is not sufficient to describe the variable or to label all the axes. LABL_PTR_i is used instead of LABLAXIS, where i can take on any value from 1 to n where n is the total number of dimensions of the original variable. The value of LABL_PTR_1 is a variable which will contain the short character strings which describe the first dimension of the original variable. The actual labels should be short as described above for LABLAXIS. The value of the attribute must be a variable in the same CDF data set. See example.

LEAP_SECONDS_INCLUDED --- recommended for UTC only

comma-delimited list (within brackets) of leap seconds included in the form of a lists of ISO8601 times when each leap second was added, appended with the size of the leap second in ISO8601 relative time (+/- time, most commonly: "+1s") [default: standard list of leap seconds up to time of data]. Leap_Seconds_Included is needed to account for time scales that don't have all 34 (in 2009) leap seconds and for the clocks in various countries that started using leap seconds at different times. The full list is required to handle the equally or more common case where a time scale starts at a specific UTC but continues on without leap seconds in TAI mode; this is basically what missions that don't add leap seconds are doing.

$ cat tai-utc.dat | awk 'ORS="," { val = $7 - prev } {prev = $7} { print $1$2"01+" val "s" }'

LEAP_SECONDS_INCLUDED="1961JAN01+1.42282s,1961AUG01-0.05s,1962JAN01+0.47304s,1963NOV01+0.1s,1964JAN01+1.29427s,1964APR01+0.1s,1964SEP01+0.1s,1965JAN01+0.1s,1965MAR01+0.1s,1965JUL01+0.1s,1965SEP01+0.1s,1966JAN01+0.47304s,1968FEB01-0.1s,1972JAN01+5.78683s,1972JUL01+1s,1973JAN01+1s,1974JAN01+1s,1975JAN01+1s,1976JAN01+1s,1977JAN01+1s,1978JAN01+1s,1979JAN01+1s,1980JAN01+1s,1981JUL01+1s,1982JUL01+1s,1983JUL01+1s,1985JUL01+1s,1988JAN01+1s,1990JAN01+1s,1991JAN01+1s,1992JUL01+1s,1993JUL01+1s,1994JUL01+1s,1996JAN01+1s,1997JUL01+1s,1999JAN01+1s,2006JAN01+1s,2009JAN01+1s"

LIMITS_WARN_MIN and MAX --- optional

Values which define the limits where damage is likely to occur for values outside these values (often referred to as red limits). Visualization software can use these attributes for indicating limits on plots or other warnings. The values data type must match the data type of the variable.

LIMITS_NOMINAL_MIN and MAX --- optional

Values which define the range of nominal operations and where values outside the range of these values should be flagged as warnings (often referred to as yellow limits). Visualization software can use these attributes for indicating limits on plots or other warnings. The range of LIMITS_NOMINAL_MIN and LIMITS_NOMINAL_MAX fall within the range of LIMITS_WARN_MIN and LIMITS_WARN_MAX. Yellow limits are often set a certain percentage away from the red limits to give the operator a chance to respond before the red limits are reached. The values data type must match the data type of the variable.

MONOTON --- optional

Indicates whether the variable is monotonically increasing or monotonically decreasing. Use of MONOTON is strongly recommended for the Epoch time variable, and can significantly increase the performance speed on retrieval of data. Valid values: INCREASE, DECREASE.

RELATIVE_ERROR --- optional

Relative or random error, in same units as Units attribute - to specify the accuracy of the time stamps relative to each other. This is usually much smaller than Absolute_Error.

REFERENCE_POSITION --- optional

Topocenter (local), Geocenter , rotating Earth geoid (used by CDF_TIME_TT2000). Reference_Position is optional metadata to account for time variance with position in the gravity wells and with relative velocity. While we could use a combined TimeSystem attribute that defines mission-specific time scales where needed, such as UTC-at-STEREO-B, it's cleaner to keep them separate as Time_Scale=UTC and Reference_Position=STEREO-B.

RESOLUTION --- optional

Using ISO8601 relative time format, for example: "1s" = 1 second. Resolution provides the smallest change in time that is measured.

SCALEMIN and SCALEMAX --- optional

are values which can be based on the actual values of data found in the CDF data set or on the probable uses of the data, {\em e.g.}, plotting multiple files at the same scale. Visualization software can use these attributes as defaults for plotting. The values must match the data type of the variable.

SCALETYP --- recommended for non-linear scales if not using SCAL_PTR

indicates whether the variable should have a linear or a log scale as a default. If this attribute is not present, linear scale is assumed.

SCAL_PTR --- recommended for non-linear scales if not using SCALETYP

is used for dimensional variables when one value of SCALTYP is not sufficient. SCAL_PTR is used {\em instead of} SCALTYP, and will point to a variable which will be of the same dimensionality as the original variable. The allowed values are linear and log. The value of the attribute must be a variable in the same CDF data set.

sig_digits --- Cluster recommended

This attribute provides the number of significant digits or other measure of data accuracy in a TBD manner. It is to allow compression software to optimise the number of digits to retain, and users to assess the accuracy of products. This operation is subject to the deliberations of the 'network traffic report' Task Group, DS-CFC-TN-0001, on compression algorithms and implementation. Restrictions on data compression may also influence the format and choice of data type used by the CDF generation software.

SI_conversion --- Cluster recommended

The conversion factor to SI units. This is the factor that the variable must be multiplied by in order to turn it to generic SI units. It will copntain two text fields separated by the delimiter >. The first is the conversion and the second is the standard unit that it converts to. For example the magnetic field for FGM will be in nT, and to convert to Tesla the value of SI_conv will be '1.0e-9>Tesla'. The use of text allows this attribute to be parsed and the value must be extracted in software.

TIME_BASE --- recommended for Time variables (important for netCDF files and clarity)

fixed (0AD, 1900, 1970 (POSIX), J2000 (used by CDF_TIME_TT2000), 4714 BC (Julian)) or flexible (provider-defined)

TIME_SCALE --- optional

TT (same as TDT, used by CDF_TIME_TT2000), TAI (same as IAT, TT-32.184s), UTC (includes leap seconds), TDB (same as SPICE ET), EME1950 [default: UTC]

UNITS --- required if not using UNIT_PTR (optional for time variables)

is a character string (no more than 20 characters, but preferably 6 characters) representing the units of the variable, e.g., nT for magnetic field. If the standard abbreviation used is short then the units value can be added to a data listing heading or plot label. Use a blank character, rather than "None" or "unitless", for variables that have no units (e.g., a ratio or a direction cosine).

For CDF_TIME_TT2000: SI measurement unit: s, ms(milliseconds for EPOCH variables), ns(nanoseconds for CDF_TIME_TT2000), ps(picoseconds for EPOCH16)

UNIT_PTR --- required if not using UNITS

has as its value a variable which stores the character strings (up to 20 characters per character string) representing the units of the original variable, which can be added to a data listing heading or plot label. Use a blank character, rather than "None" or "unitless", for variables that have no units (e.g., a ratio or a direction cosine). If this attribute is used, then UNITS is not used. The value of the attribute must be a variable in the same CDF data set.

VALIDMIN and VALIDMAX --- required for time varying data and support_data

hold values which are, respectively, the minimum and maximum values for a particular variable that are expected over the lifetime of the mission. The values must match the data type of the variable.

VAR_NOTES --- optional

holds ancilliary information about the variable and can be any length.

VAR_TYPE --- required for all variables

identifies a variable as either

VARIABLE_PURPOSE --- optional

are a list of tags/keywords separated by commas that indicate probable uses of the variable and its function. Software can use these attributes to find the primary variables in the dataset, find variables with a common function, indicate variables suitable for specific purposes such as summary plots or educational displays, etc. Tags could indicate a common geophysical quantity to enable matching several variables of the same kind. For instance, all magnetic field variables could be tagged with VARIABLE_PURPOSE="Magnetic_Field" even though they have different coordinate systems or cadences. Software could use this tag to identify variables with a common theme for easier distinguishing between groups of variables and selecting between them. The values are always in a character string. Suggested tags/keywords include:

V_PARENT --- optional for use with derived variables

identifies the "attached" variable which stores the parent variable(s) of a derived variable. The "attached" variable can be dimensional and sized to hold as many parents as necessary. The syntax of each entry would be: logical_file_id>variable_name.


Return to Top

Return to ISTP Metadata Guidelines

CDF home page