Data Model#

Modules in the data_model section provide functionality for reading, writing, and validating data. Data products ingested or produced by simtools generally follow the CTAO data model.

data_reader#

Helper module for reading standardized simtools data products.

data_model.data_reader.read_table_from_file(file_name, schema_file=None, validate=False, metadata_file=None)[source]#

Read astropy table from file and validate against schema.

Metadata is read from the metadata file or from the metadata section of the data file. The schema for validation can be given as an argument or is determined from the metadata associated with the file.

Parameters:
file_name: str or Path

Name of file to be read.

schema_file: str or Path

Name of schema file to be used for validation.

validate: bool

Validate data against schema (if true).

metadata_file: str or Path

Name of metadata file to be read.

Returns:
astropy Table

Table read from file.

Raises:
FileNotFoundError

If file does not exist.

data_model.data_reader.read_value_from_file(file_name, schema_file=None, validate=False)[source]#

Read value from file and validate against schema.

The data is expected to follow the convention used for simulation model parameters in the simulation model database: a single value stored in the ‘value’ field, with optional units in the ‘units’ field. Metadata is read from the metadata file or from the metadata section of the data file. The schema for validation can be given as an argument or is determined from the metadata associated with the file.

Parameters:
file_name: str or Path

Name of file to be read.

schema_file: str or Path

Name of schema file to be used for validation.

validate: bool

Validate data against schema (if true).

Returns:
astropy quantity or str

Value read from file. If units are given, return an astropy quantity, otherwise a string. Return None if no value is found in the file.

Raises:
FileNotFoundError

If file does not exist.
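
The value/unit convention described for read_value_from_file can be illustrated with a minimal standard-library sketch. This is not the simtools implementation: the helper name read_value_sketch is hypothetical, the field names ('value', 'units') are taken from the description above, and a (value, units) tuple stands in for an astropy quantity so that the example has no third-party dependencies.

```python
import json
import tempfile
from pathlib import Path


def read_value_sketch(file_name):
    """Return the 'value' entry of a JSON file, paired with its units if given.

    Hypothetical helper for illustration only; not part of simtools.
    """
    path = Path(file_name)
    if not path.exists():
        raise FileNotFoundError(f"File not found: {path}")
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)
    value = data.get("value")
    if value is None:
        return None  # no value stored in the file
    units = data.get("units")  # field name as given in the description above
    if units:
        return (value, units)  # stand-in for an astropy quantity
    return str(value)


# Write a small example file and read it back (file name is hypothetical).
with tempfile.TemporaryDirectory() as tmp_dir:
    example_file = Path(tmp_dir) / "example_parameter.json"
    example_file.write_text(json.dumps({"value": 12.5, "units": "m"}), encoding="utf-8")
    with_units = read_value_sketch(example_file)

    example_file.write_text(json.dumps({"value": "abc"}), encoding="utf-8")
    without_units = read_value_sketch(example_file)
```

In this sketch a file with units yields a pair, a file without units yields a string, and a file without a 'value' entry yields None, mirroring the three return cases listed above.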

format_checkers#

Custom format checkers for jsonschema validation.

data_model.format_checkers.check_array_element(element)[source]#

Validate array elements for jsonschema.

data_model.format_checkers.check_array_triggers_name(name)[source]#

Validate array trigger names for jsonschema.

data_model.format_checkers.check_astropy_unit(unit_string)[source]#

Validate astropy units (including dimensionless) for jsonschema.

data_model.format_checkers.check_astropy_unit_of_length(unit_string)[source]#

Validate that the unit string is an astropy unit of length.

data_model.format_checkers.check_astropy_unit_of_time(unit_string)[source]#

Validate that the unit string is an astropy unit of time.

metadata_collector#

Metadata collector for simtools.

This should be the only module in simtools with knowledge of the implementation of the observatory metadata model.

class data_model.metadata_collector.MetadataCollector(args_dict, metadata_file_name=None, data_model_name=None, observatory='cta', clean_meta=True)[source]#

Collects metadata to describe the current simtools activity and its data products.

Collect metadata from command line configuration, input data, environment, and schema descriptions. Depends on the CTAO top-level metadata definition.

Two dictionaries store two different types of metadata:

  • top_level_meta: metadata for the current activity

  • input_metadata: metadata from input data

Parameters:
args_dict: dict

Command line parameters

metadata_file_name: str

Name of metadata file (only required when args_dict is None)

data_model_name: str

Name of data model parameter

observatory: str

Name of observatory (default: “cta”)

clean_meta: bool

Clean metadata from None values and empty lists (default: True)

clean_meta_data(meta_dict)[source]#

Clean metadata dictionary from None values and empty lists.

Parameters:
meta_dict: dict

Metadata dictionary.
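
The cleaning step can be pictured with a short standard-library sketch. It is an illustration under assumptions, not the method's actual code: the function name clean_meta_sketch is hypothetical, and both the recursion into nested dictionaries and the return of a new dictionary (rather than in-place modification) are assumptions.

```python
def clean_meta_sketch(meta):
    """Return a copy of meta without None values and without empty lists.

    Hypothetical helper; nested dictionaries are cleaned recursively (assumption).
    """
    cleaned = {}
    for key, value in meta.items():
        if isinstance(value, dict):
            cleaned[key] = clean_meta_sketch(value)
        elif isinstance(value, list):
            # Drop None entries; skip the key entirely if nothing remains.
            items = [item for item in value if item is not None]
            if items:
                cleaned[key] = items
        elif value is not None:
            cleaned[key] = value
    return cleaned


# Example metadata with made-up keys and values.
meta = {
    "contact": {"name": "A. Person", "email": None},
    "tags": [],
    "version": "1.0",
}
cleaned = clean_meta_sketch(meta)
```

Here the None-valued 'email' entry and the empty 'tags' list are removed, while all other entries are kept unchanged.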

collect_meta_data()[source]#

Collect and verify product metadata for each main-level metadata type.

get_data_model_schema_dict()[source]#

Return data model schema dictionary.

Returns:
dict

Data model schema dictionary.

get_data_model_schema_file_name()[source]#

Return data model schema file name.

The schema file name is taken (in this order) from the command line, from the metadata file, from the data model name, or from the input metadata file.

Returns:
str

Name of schema file.
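
The lookup order described above amounts to taking the first source that provides a schema file name. A minimal sketch of that precedence rule, with a hypothetical helper name and made-up file names (the real method reads these values from its own configuration and metadata):

```python
def first_defined(*candidates):
    """Return the first candidate that is not None, or None if all are unset."""
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return None


# Candidates listed in the documented order of precedence (values are made up).
schema_file = first_defined(
    None,                          # command line: not given in this example
    "schema_from_metadata.yml",    # metadata file
    "schema_from_model_name.yml",  # data model name
    None,                          # input metadata file: not given
)
```

Because the command line does not supply a name in this example, the entry from the metadata file is selected, even though a later source also provides one.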

get_site(from_input_meta=False)[source]#

Get the site entry from metadata, either from the collected metadata or from the input metadata.

Parameters:
from_input_meta: bool

Get site from input metadata (default: False)

Returns:
str

Site name

get_top_level_metadata()[source]#

Return top level metadata dictionary (with updated activity end time).

Returns:
dict

Top level metadata dictionary.

metadata_model#

Definition of metadata model for input to and output of simtools.

Follows CTAO top-level data model definition.

  • data products submitted to SimPipe (‘input’)

  • data products generated by SimPipe (‘output’)

data_model.metadata_model.get_default_metadata_dict(schema_file=None, observatory='CTA')[source]#

Return metadata schema with default values.

Follows the CTA Top-Level Data Model.

Parameters:
schema_file: str

Schema file (jsonschema format) used for validation

observatory: str

Observatory name

Returns:
dict

Reference schema dictionary.

data_model.metadata_model.validate_schema(data, schema_file)[source]#

Validate dictionary against schema.

Parameters:
data: dict

Dictionary to be validated.

schema_file: dict

Schema used for validation.

Raises:
jsonschema.exceptions.ValidationError

if validation fails

model_data_writer#

Model data writer module.

class data_model.model_data_writer.ModelDataWriter(product_data_file=None, product_data_format=None, output_path=None, use_plain_output_path=True, args_dict=None)[source]#

Writer for simulation model data and metadata.

Parameters:
product_data_file: str

Name of output file.

product_data_format: str

Format of output file.

output_path: str or Path

Path to output file.

use_plain_output_path: bool

Use plain output path.

args_dict: dict

Dictionary with configuration parameters.

static dump(args_dict, output_file=None, metadata=None, product_data=None, validate_schema_file=None)[source]#

Write model data and metadata (as static method).

Parameters:
args_dict: dict

Dictionary with configuration parameters (including output file name and path).

output_file: string or Path

Name of output file (args_dict[“output_file”] is used if this parameter is not set).

metadata: dict

Metadata to be written.

product_data: astropy Table

Model data to be written

validate_schema_file: str

Schema file used in validation of output data.

static dump_model_parameter(parameter_name, value, instrument, model_version, output_file, output_path=None, use_plain_output_path=False, metadata_input_dict=None)[source]#

Generate DB-style model parameter dict and write it to json file.

Parameters:
parameter_name: str

Name of the parameter.

value: any

Value of the parameter.

instrument: str

Name of the instrument.

model_version: str

Version of the model.

output_file: str

Name of output file.

output_path: str or Path

Path to output file.

use_plain_output_path: bool

Use plain output path.

metadata_input_dict: dict

Input to metadata collector.

Returns:
dict

Validated parameter dictionary.

get_validated_parameter_dict(parameter_name, value, instrument, model_version)[source]#

Get validated parameter dictionary.

Parameters:
parameter_name: str

Name of the parameter.

value: any

Value of the parameter.

instrument: str

Name of the instrument.

model_version: str

Version of the model.

Returns:
dict

Validated parameter dictionary.

static prepare_data_dict_for_writing(data_dict)[source]#

Prepare data dictionary for writing to json file.

Ensure that sim_telarray-style lists are written as strings. Replace “None” with “null” in the unit field.

Parameters:
data_dict: dict

Dictionary with lists.

Returns:
dict

Dictionary with lists converted to strings.
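
A rough sketch of this kind of preparation is given below. It rests on several assumptions that are not confirmed by the description: the helper name prepare_for_json_sketch is hypothetical, the keys 'value' and 'unit' are assumed, list elements are assumed to be joined with single spaces, and the string "None" is mapped to Python None so that the json module writes it as null.

```python
import json


def prepare_for_json_sketch(data_dict):
    """Return a copy with list values joined into a string and a "None" unit set to None.

    Hypothetical helper for illustration only; not part of simtools.
    """
    prepared = dict(data_dict)
    if isinstance(prepared.get("value"), list):
        # Assumption: elements are joined with single spaces.
        prepared["value"] = " ".join(str(item) for item in prepared["value"])
    if prepared.get("unit") == "None":
        # json.dumps serializes Python None as JSON null.
        prepared["unit"] = None
    return prepared


# Example dictionary with made-up content.
prepared = prepare_for_json_sketch(
    {"parameter": "example", "value": [1.0, 2.5, 3.0], "unit": "None"}
)
json_text = json.dumps(prepared)
```

The input dictionary is left untouched; only the returned copy carries the string-valued list and the null unit.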

validate_and_transform(product_data_table=None, product_data_dict=None, validate_schema_file=None, is_model_parameter=False)[source]#

Validate product data using jsonschema given in metadata.

If necessary, transform product data to match schema.

Parameters:
product_data_table: astropy Table

Model data to be validated.

product_data_dict: dict

Model data to be validated.

validate_schema_file: str

Schema file used in validation of output data.

is_model_parameter: bool

True if data describes a model parameter.

write(product_data=None, metadata=None)[source]#

Write model data and metadata.

Parameters:
product_data: astropy Table

Model data to be written

metadata: dict

Metadata to be written.

Raises:
FileNotFoundError

if data writing was not successful.

write_dict_to_model_parameter_json(file_name, data_dict)[source]#

Write dictionary to model-parameter-style json file.

Parameters:
file_name: str

Name of output file.

data_dict: dict

Data dictionary.

Raises:
FileNotFoundError

if data writing was not successful.

write_metadata_to_yml(metadata, yml_file=None, keys_lower_case=False)[source]#

Write model metadata file (yaml file format).

Parameters:
metadata: dict

Metadata to be stored

yml_file: str

Name of output file.

keys_lower_case: bool

Write yaml keys in lower case.

Returns:
str

Name of output file

Raises:
FileNotFoundError

If yml_file not found.

TypeError

If yml_file is not defined.
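
The effect of keys_lower_case can be illustrated without any YAML library. The sketch below only shows a key transformation of the kind the option name suggests; the helper name is hypothetical, and applying it recursively to nested dictionaries and lists is an assumption.

```python
def lower_case_keys_sketch(data):
    """Return a copy of data with all dictionary keys in lower case.

    Hypothetical helper; values are left unchanged.
    """
    if isinstance(data, dict):
        return {
            str(key).lower(): lower_case_keys_sketch(value)
            for key, value in data.items()
        }
    if isinstance(data, list):
        return [lower_case_keys_sketch(item) for item in data]
    return data


# Example metadata with made-up keys and values.
metadata = {"CTA": {"PRODUCT": {"ID": "abc"}, "CONTACT": [{"NAME": "A. Person"}]}}
lowered = lower_case_keys_sketch(metadata)
```

Only keys are changed; string values such as "A. Person" keep their original capitalization.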

validate_data#

Validation of data using schema.

class data_model.validate_data.DataValidator(schema_file=None, data_file=None, data_table=None, data_dict=None, check_exact_data_type=True)[source]#

Validate data types and units against a schema; convert or transform data where required.

Data can be of table or dict format (internally, all data is converted to astropy tables).

Parameters:
schema_file: Path

Schema file describing input data and transformations.

data_file: Path

Input data file.

data_table: astropy.table

Input data table.

data_dict: dict

Input data dict.

check_exact_data_type: bool

Check for exact data type (default: True).

validate_and_transform(is_model_parameter=False)[source]#

Validate data and data file.

Parameters:
is_model_parameter: bool

Data describes a model parameter (applies additional data preparation).

Returns:
data: dict or astropy.table

Data dict or table

Raises:
TypeError

if no data or data table is available

validate_data_file()[source]#

Open data file and read data from file.

Doing this successfully is understood as file validation.

validate_parameter_and_file_name()[source]#

Validate that file name and key ‘parameter_name’ in data dict are the same.
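
The consistency check can be sketched as a comparison between the file stem and the 'parameter_name' entry. This is a simplified illustration: the helper name and the example paths are hypothetical, the comparison against the stem (file name without directory and extension) is an assumption, and the real method may report a mismatch differently, for example by raising an exception.

```python
from pathlib import Path


def names_match_sketch(file_name, data_dict):
    """Return True if the file stem equals data_dict['parameter_name'].

    Hypothetical helper for illustration only; not part of simtools.
    """
    return Path(file_name).stem == data_dict.get("parameter_name")


# Example paths and parameter names are made up.
matching = names_match_sketch(
    "output/example_parameter.json", {"parameter_name": "example_parameter"}
)
mismatch = names_match_sketch(
    "output/other_name.json", {"parameter_name": "example_parameter"}
)
```

The first call compares equal names; the second shows the case in which the file name and the stored parameter name disagree.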