Data Model#
Modules in the data_model section provide functionality for reading, writing, and validating data.
Data products ingested or produced by simtools generally follow the CTAO data model.
data_reader#
Helper module for reading standardized simtools data products.
- data_model.data_reader.read_table_from_file(file_name, schema_file=None, validate=False, metadata_file=None)[source]#
Read astropy table from file and optionally validate it against a schema.
Metadata is read from the metadata file or from the metadata section of the data file. The schema used for validation can be given as an argument; otherwise it is determined from the metadata associated with the file.
- Parameters:
- file_name: str or Path
Name of file to be read.
- schema_file: str or Path
Name of schema file to be used for validation.
- validate: bool
Validate data against schema (if true).
- metadata_file: str or Path
Name of metadata file to be read.
- Returns:
- astropy Table
Table read from file.
- Raises:
- FileNotFoundError
If file does not exist.
- data_model.data_reader.read_value_from_file(file_name, schema_file=None, validate=False)[source]#
Read value from file and optionally validate it against a schema.
The data is expected to follow the convention used for simulation model parameters in the simulation model database: a single value stored in the ‘value’ field (with optional units in the ‘units’ field). Metadata is read from the metadata file or from the metadata section of the data file. The schema used for validation can be given as an argument; otherwise it is determined from the metadata associated with the file.
- Parameters:
- file_name: str or Path
Name of file to be read.
- schema_file: str or Path
Name of schema file to be used for validation.
- validate: bool
Validate data against schema (if true).
- Returns:
- astropy quantity or str
Value read from file. If units are given, return an astropy quantity, otherwise a string. Return None if no value is found in the file.
- Raises:
- FileNotFoundError
If file does not exist.
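The return convention described above can be illustrated with a minimal sketch. It is not the simtools implementation: the helper name extract_value is hypothetical, the input is a JSON string instead of a file, and a (number, unit) tuple stands in for the astropy quantity so that the example needs only the standard library.

```python
import json


def extract_value(text):
    """Return the value stored in a model-parameter-style JSON document.

    Hypothetical helper: mirrors the documented convention only.
    """
    data = json.loads(text)
    value = data.get("value")
    if value is None:
        # No value found in the document.
        return None
    units = data.get("units")
    if units:
        # Stand-in for an astropy quantity: a (number, unit string) pair.
        return (float(value), units)
    # No units given: return the value as a string.
    return str(value)


with_units = extract_value('{"value": 2.5, "units": "m"}')
without_units = extract_value('{"value": "example"}')
missing = extract_value('{"units": "m"}')
```

In this sketch the three calls cover the three documented outcomes: a value with units, a plain string, and None.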
format_checkers#
Custom format checkers for jsonschema validation.
- data_model.format_checkers.check_array_element(element)[source]#
Validate array elements for jsonschema.
- data_model.format_checkers.check_array_triggers_name(name)[source]#
Validate array trigger names for jsonschema.
- data_model.format_checkers.check_astropy_unit(unit_string)[source]#
Validate astropy units (including dimensionless) for jsonschema.
metadata_collector#
Metadata collector for simtools.
This should be the only module in simtools with knowledge of the implementation of the observatory metadata model.
- class data_model.metadata_collector.MetadataCollector(args_dict, metadata_file_name=None, data_model_name=None, observatory='cta', clean_meta=True)[source]#
Collects metadata to describe the current simtools activity and its data products.
Collect metadata from command line configuration, input data, environment, and schema descriptions. Depends on the CTAO top-level metadata definition.
Two dictionaries store two different types of metadata:
- top_level_meta: metadata for the current activity
- input_metadata: metadata from input data
- Parameters:
- args_dict: dict
Command line parameters
- metadata_file_name: str
Name of metadata file (only required when args_dict is None)
- data_model_name: str
Name of data model parameter
- observatory: str
Name of observatory (default: “cta”)
- clean_meta: bool
Clean metadata from None values and empty lists (default: True)
- clean_meta_data(meta_dict)[source]#
Clean metadata dictionary from None values and empty lists.
- Parameters:
- meta_dict: dict
Metadata dictionary.
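As an illustration of this kind of cleaning, the following sketch removes None values and empty lists from a nested dictionary. The function name clean_meta and the example keys are hypothetical, and the recursive treatment of nested dictionaries and list elements is an assumption, not a statement about how clean_meta_data is implemented.

```python
def clean_meta(meta):
    """Return a copy of meta without None values and empty lists.

    Hypothetical sketch; recursion into nested containers is an assumption.
    """
    if isinstance(meta, dict):
        cleaned = {}
        for key, value in meta.items():
            value = clean_meta(value)
            if value is None or value == []:
                continue  # drop None values and empty lists
            cleaned[key] = value
        return cleaned
    if isinstance(meta, list):
        items = [clean_meta(item) for item in meta]
        return [item for item in items if item is not None and item != []]
    return meta


example = {
    "contact": {"name": "A. Person", "email": None},
    "context": {"document": [], "notes": [None, "checked"]},
    "version": "1.0.0",
}
cleaned = clean_meta(example)
```

The sketch returns a new dictionary and leaves the input untouched; empty dictionaries are kept.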
- get_data_model_schema_dict()[source]#
Return data model schema dictionary.
- Returns:
- dict
Data model schema dictionary.
- get_data_model_schema_file_name()[source]#
Return data model schema file name.
The schema file name is taken (in this order) from the command line, from the metadata file, from the data model name, or from the input metadata file.
- Returns:
- str
Name of schema file.
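The lookup order described for the schema file name amounts to taking the first source that provides a value. A minimal sketch of that pattern follows; the helper first_defined, the source labels, and the file names are hypothetical and only illustrate the precedence.

```python
def first_defined(candidates):
    """Return (source, value) for the first candidate whose value is not None."""
    for source, value in candidates:
        if value is not None:
            return source, value
    return None, None


# Candidates listed in the documented order of precedence (values are made up).
candidates = [
    ("command line", None),
    ("metadata file", None),
    ("data model name", "example_parameter.schema.yml"),
    ("input metadata", "other.schema.yml"),
]
source, schema_file = first_defined(candidates)
```

Here the first two sources provide nothing, so the name derived from the data model name is used and the input metadata is not consulted.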
metadata_model#
Definition of metadata model for input to and output of simtools.
Follows the CTAO top-level data model definition for:
- data products submitted to SimPipe (‘input’)
- data products generated by SimPipe (‘output’)
- data_model.metadata_model.get_default_metadata_dict(schema_file=None, observatory='CTA')[source]#
Return metadata schema with default values.
Follows the CTA Top-Level Data Model.
- Parameters:
- schema_file: str
Schema file (jsonschema format) used for validation.
- observatory: str
Observatory name.
- Returns:
- dict
Reference schema dictionary.
model_data_writer#
Model data writer module.
- class data_model.model_data_writer.ModelDataWriter(product_data_file=None, product_data_format=None, output_path=None, use_plain_output_path=True, args_dict=None)[source]#
Writer for simulation model data and metadata.
- Parameters:
- product_data_file: str
Name of output file.
- product_data_format: str
Format of output file.
- output_path: str or Path
Path to output file.
- use_plain_output_path: bool
Use plain output path.
- args_dict: dict
Dictionary with configuration parameters.
- static dump(args_dict, output_file=None, metadata=None, product_data=None, validate_schema_file=None)[source]#
Write model data and metadata (as static method).
- Parameters:
- args_dict: dict
Dictionary with configuration parameters (including output file name and path).
- output_file: str or Path
Name of output file (args_dict[“output_file”] is used if this parameter is not set).
- metadata: dict
Metadata to be written.
- product_data: astropy Table
Model data to be written.
- validate_schema_file: str
Schema file used in validation of output data.
- static dump_model_parameter(parameter_name, value, instrument, model_version, output_file, output_path=None, use_plain_output_path=False, metadata_input_dict=None)[source]#
Generate DB-style model parameter dict and write it to a JSON file.
- Parameters:
- parameter_name: str
Name of the parameter.
- value: any
Value of the parameter.
- instrument: str
Name of the instrument.
- model_version: str
Version of the model.
- output_file: str
Name of output file.
- output_path: str or Path
Path to output file.
- use_plain_output_path: bool
Use plain output path.
- metadata_input_dict: dict
Input to metadata collector.
- Returns:
- dict
Validated parameter dictionary.
- get_validated_parameter_dict(parameter_name, value, instrument, model_version)[source]#
Get validated parameter dictionary.
- Parameters:
- parameter_name: str
Name of the parameter.
- value: any
Value of the parameter.
- instrument: str
Name of the instrument.
- model_version: str
Version of the model.
- Returns:
- dict
Validated parameter dictionary.
- static prepare_data_dict_for_writing(data_dict)[source]#
Prepare data dictionary for writing to a JSON file.
Ensure sim_telarray-style lists are written as strings. Replace “None” with “null” in the unit field.
- Parameters:
- data_dict: dict
Dictionary with lists.
- Returns:
- dict
Dictionary with lists converted to strings.
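A rough sketch of this kind of preparation is shown below. It is not the simtools code: the function name prepare_for_json is hypothetical, and joining list entries with single spaces is an assumption about what a sim_telarray-style list string looks like.

```python
def prepare_for_json(data_dict):
    """Return a copy of data_dict with lists converted to strings.

    Hypothetical sketch; space-separated output is an assumption.
    """
    prepared = {
        key: " ".join(str(item) for item in value) if isinstance(value, list) else value
        for key, value in data_dict.items()
    }
    unit = prepared.get("unit")
    if isinstance(unit, str):
        # Replace the string "None" with "null" in the unit field.
        prepared["unit"] = unit.replace("None", "null")
    return prepared


prepared = prepare_for_json({"value": [1.0, 2.5], "unit": ["m", None]})
```

In this sketch the list conversion runs first, so a None entry in a unit list becomes the text “None” and is then replaced by “null”.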
- validate_and_transform(product_data_table=None, product_data_dict=None, validate_schema_file=None, is_model_parameter=False)[source]#
Validate product data using jsonschema given in metadata.
If necessary, transform product data to match schema.
- Parameters:
- product_data_table: astropy Table
Model data to be validated.
- product_data_dict: dict
Model data to be validated.
- validate_schema_file: str
Schema file used in validation of output data.
- is_model_parameter: bool
True if data describes a model parameter.
- write(product_data=None, metadata=None)[source]#
Write model data and metadata.
- Parameters:
- product_data: astropy Table
Model data to be written.
- metadata: dict
Metadata to be written.
- Raises:
- FileNotFoundError
If data writing was not successful.
- write_dict_to_model_parameter_json(file_name, data_dict)[source]#
Write dictionary to model-parameter-style json file.
- Parameters:
- file_name: str
Name of output file.
- data_dict: dict
Data dictionary.
- Raises:
- FileNotFoundError
If data writing was not successful.
- write_metadata_to_yml(metadata, yml_file=None, keys_lower_case=False)[source]#
Write model metadata file (yaml file format).
- Parameters:
- metadata: dict
Metadata to be stored.
- yml_file: str
Name of output file.
- keys_lower_case: bool
Write yaml keys in lower case.
- Returns:
- str
Name of output file.
- Raises:
- FileNotFoundError
If yml_file is not found.
- TypeError
If yml_file is not defined.
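The effect of an option like keys_lower_case can be sketched as a key transformation applied before the metadata is serialized. The helper lower_case_keys and the example keys are hypothetical, and applying the conversion recursively to nested dictionaries and lists is an assumption; the YAML writing step itself is left out.

```python
def lower_case_keys(data):
    """Return a copy of data with all dictionary keys in lower case.

    Hypothetical sketch; values are left unchanged.
    """
    if isinstance(data, dict):
        return {str(key).lower(): lower_case_keys(value) for key, value in data.items()}
    if isinstance(data, list):
        return [lower_case_keys(item) for item in data]
    return data


metadata = {"CTA": {"PRODUCT": {"ID": "abc", "DATA": [{"CATEGORY": "SIM"}]}}}
lowered = lower_case_keys(metadata)
```

Only keys are changed in this sketch; string values such as "SIM" keep their original case.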
validate_data#
Validation of data using schema.
- class data_model.validate_data.DataValidator(schema_file=None, data_file=None, data_table=None, data_dict=None, check_exact_data_type=True)[source]#
Validate data for type and units against a schema describing the data; convert or transform the data where necessary.
Data can be of table or dict format (internally, all data is converted to astropy tables).
- Parameters:
- schema_file: Path
Schema file describing input data and transformations.
- data_file: Path
Input data file.
- data_table: astropy.table
Input data table.
- data_dict: dict
Input data dict.
- check_exact_data_type: bool
Check for exact data type (default: True).
- validate_and_transform(is_model_parameter=False)[source]#
Validate data and data file.
- Parameters:
- is_model_parameter: bool
True if the data describes a model parameter (adds some data preparation).
- Returns:
- data: dict or astropy.table
Data dict or table.
- Raises:
- TypeError
If no data or data table is available.