Data Model#

Modules in the data_model section provide functionality for reading, writing, and validating data. Data products ingested or produced by simtools generally follow the CTA data model.

data_reader#

data_model.data_reader.read_table_from_file(file_name, schema_file=None, validate=False, metadata_file=None)[source]#

Read an astropy table from a file and optionally validate it against a schema. Metadata is read from the metadata file or from the metadata section of the data file. The schema used for validation can be given as an argument; otherwise it is determined from the metadata associated with the file.

Parameters:
file_name: str or Path

Name of file to be read.

schema_file: str or Path

Name of schema file to be used for validation.

validate: bool

Validate data against schema (if true).

metadata_file: str or Path

Name of metadata file to be read.

Returns:
astropy Table

Table read from file.

Raises:
FileNotFoundError

If file does not exist.

data_model.data_reader.read_value_from_file(file_name, schema_file=None, validate=False)[source]#

Read a value from a file and optionally validate it against a schema. The data is expected to follow the convention used for simulation model parameters in the simulation model database: a single value stored in the ‘value’ field, with optional units in the ‘units’ field. Metadata is read from the metadata file or from the metadata section of the data file. The schema used for validation can be given as an argument; otherwise it is determined from the metadata associated with the file.

Parameters:
file_name: str or Path

Name of file to be read.

schema_file: str or Path

Name of schema file to be used for validation.

validate: bool

Validate data against schema (if true).

Returns:
astropy quantity or str

Value read from file. If units are given, return an astropy quantity, otherwise a string. Return None if no value is found in the file.

Raises:
FileNotFoundError

If file does not exist.
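
The ‘value’/‘units’ convention described above can be illustrated with a small standard-library sketch. It is not the simtools implementation: the helper name read_value_sketch is hypothetical, the file is assumed to be JSON, and a plain (value, units) tuple stands in for the astropy quantity that the real function is documented to return.

```python
import json
import tempfile
from pathlib import Path


def read_value_sketch(file_name):
    """Hypothetical stand-in for the documented return behaviour.

    Returns (value, units) if units are given, a string if not, and None
    if the file has no 'value' entry. The real function is documented to
    return an astropy quantity instead of a tuple.
    """
    path = Path(file_name)
    if not path.exists():
        raise FileNotFoundError(f"File not found: {path}")
    with path.open(encoding="utf-8") as handle:
        data = json.load(handle)  # assumption: the file is JSON
    value = data.get("value")
    if value is None:
        return None
    units = data.get("units")
    if units:
        return (value, units)
    return str(value)


# Small demonstration with temporary files (all names are made up).
with tempfile.TemporaryDirectory() as tmp_dir:
    with_units = Path(tmp_dir) / "with_units.json"
    with_units.write_text(json.dumps({"value": 2.5, "units": "m"}), encoding="utf-8")

    without_units = Path(tmp_dir) / "without_units.json"
    without_units.write_text(json.dumps({"value": "abc"}), encoding="utf-8")

    no_value = Path(tmp_dir) / "no_value.json"
    no_value.write_text(json.dumps({"units": "m"}), encoding="utf-8")

    result_with_units = read_value_sketch(with_units)
    result_without_units = read_value_sketch(without_units)
    result_no_value = read_value_sketch(no_value)
```

The three results mirror the three documented cases: a value with units, a value without units, and a file without a value.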

metadata_collector#

Metadata collector for simtools.

This should be the only module in simtools with knowledge of the implementation of the metadata model.

class data_model.metadata_collector.MetadataCollector(args_dict, metadata_file_name=None, data_model_name=None)[source]#

Collects and combines metadata describing the current simtools activity and its data products. Collects as much metadata as possible from the command line configuration, input data, environment, and schema descriptions. Depends on the CTAO top-level metadata definition.

Parameters:
args_dict: dict

Command line parameters

metadata_file_name: str

Name of metadata file (only required when args_dict is None)

data_model_name: str

Name of data model parameter

collect_meta_data()[source]#

Collect and verify product metadata from different sources.

get_data_model_schema_dict()[source]#

Return data model schema dictionary.

Returns:
dict

Data model schema dictionary.

get_data_model_schema_file_name()[source]#

Return data model schema file name. The schema file name is taken (in this order) from the command line, from the metadata file, from the data model name, or from the input metadata file.

Returns:
str

Name of schema file.
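
The lookup order described above amounts to a "first source that is set wins" rule. The following is a minimal sketch of that rule only; the function name and its arguments are hypothetical, and each argument is assumed to hold a schema file name already derived from the corresponding source (how simtools derives a file name from the data model name is not shown here).

```python
def resolve_schema_file_name(
    from_command_line=None,
    from_metadata_file=None,
    from_data_model_name=None,
    from_input_metadata=None,
):
    """Return the first schema file name that is set, in the documented order."""
    candidates = (
        from_command_line,
        from_metadata_file,
        from_data_model_name,
        from_input_metadata,
    )
    for candidate in candidates:
        if candidate:  # skip None and empty strings
            return candidate
    return None


# The metadata file entry wins here because no command line value is set.
chosen = resolve_schema_file_name(
    from_metadata_file="from_metadata.schema.yml",
    from_data_model_name="from_model_name.schema.yml",
)
nothing_set = resolve_schema_file_name()
```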

get_site(from_input_meta=False)[source]#

Get the site entry from the metadata, either from the collected metadata or from the input metadata.

Parameters:
from_input_meta: bool

Get site from input metadata (default: False)

Returns:
str

Site name

metadata_model#

Definition of the metadata model for input to and output of simtools. Follows the CTAO top-level data model definition and covers:

  • data products submitted to SimPipe (‘input’)

  • data products generated by SimPipe (‘output’)

data_model.metadata_model.get_default_metadata_dict(schema_file=None, observatory='CTA')[source]#

Return the metadata schema with default values. Follows the CTA Top-Level Data Model.

Parameters:
schema_file: str

Schema file (jsonschema format) used for validation

observatory: str

Observatory name

Returns:
dict

Reference schema dictionary.

data_model.metadata_model.validate_schema(data, schema_file)[source]#

Validate dictionary against schema.

Parameters:
data: dict

Dictionary to be validated.

schema_file: dict

Schema used for validation.

Raises:
jsonschema.exceptions.ValidationError

if validation fails

model_data_writer#

Model data writer module.

class data_model.model_data_writer.ModelDataWriter(product_data_file=None, product_data_format=None, args_dict=None)[source]#

Writer for simulation model data and metadata.

Parameters:
product_data_file: str

Name of output file.

product_data_format: str

Format of output file.

args_dict: dict

Dictionary with configuration parameters.

static dump(args_dict, output_file=None, metadata=None, product_data=None, validate_schema_file=None)[source]#

Write model data and metadata (as static method).

Parameters:
args_dict: dict

Dictionary with configuration parameters (including output file name and path).

output_file: str or Path

Name of output file (args_dict["output_file"] is used if this parameter is not set).

metadata: dict

Metadata to be written.

product_data: astropy Table

Model data to be written

validate_schema_file: str

Schema file used in validation of output data.

validate_and_transform(product_data=None, validate_schema_file=None)[source]#

Validate product data using jsonschema given in metadata.

If necessary, transform product data to match schema.

Parameters:
product_data: astropy Table

Model data to be validated

validate_schema_file: str

Schema file used in validation of output data.

write(product_data=None, metadata=None)[source]#

Write model data and metadata.

Parameters:
product_data: astropy Table

Model data to be written

metadata: dict

Metadata to be written.

Raises:
FileNotFoundError

if data writing was not successful.

static write_dict_to_model_parameter_json(file_name, data_dict)[source]#

Write dictionary to model-parameter-style json file.

Parameters:
file_name: str

Name of output file.

data_dict: dict

Dictionary to be written.

Raises:
FileNotFoundError

if data writing was not successful.
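
As a rough illustration of writing a dictionary to a JSON file, and of how a FileNotFoundError can arise when writing, here is a standard-library sketch. It is not the simtools implementation: write_dict_sketch and the dictionary contents are made up for this example, and the exact layout of a model-parameter-style file is not reproduced.

```python
import json
import tempfile
from pathlib import Path


def write_dict_sketch(file_name, data_dict):
    """Write a dictionary to a JSON file (hypothetical helper)."""
    path = Path(file_name)
    # open() raises FileNotFoundError if the parent directory does not exist.
    with path.open("w", encoding="utf-8") as handle:
        json.dump(data_dict, handle, indent=4)
        handle.write("\n")


example_dict = {"parameter": "example_parameter", "value": 2.5, "units": "m"}

with tempfile.TemporaryDirectory() as tmp_dir:
    output_file = Path(tmp_dir) / "example_parameter.json"
    write_dict_sketch(output_file, example_dict)
    # Read the file back to confirm the content survived the round trip.
    round_trip = json.loads(output_file.read_text(encoding="utf-8"))

    # Writing into a directory that does not exist fails.
    try:
        write_dict_sketch(Path(tmp_dir) / "missing_dir" / "out.json", example_dict)
        missing_dir_error = False
    except FileNotFoundError:
        missing_dir_error = True
```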

write_metadata_to_yml(metadata, yml_file=None, keys_lower_case=False)[source]#

Write model metadata file (yaml file format).

Parameters:
metadata: dict

Metadata to be stored

yml_file: str

Name of output file.

keys_lower_case: bool

Write yaml keys in lower case.

Returns:
str

Name of output file

Raises:
FileNotFoundError

If yml_file is not found.

TypeError

If yml_file is not defined.
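
The effect of keys_lower_case can be pictured with a small key-conversion sketch. This is an assumption-laden illustration, not the simtools code: the helper lower_case_keys is hypothetical, the example metadata keys are made up, and whether the real writer also converts keys in nested dictionaries and lists is assumed here rather than confirmed by the text above.

```python
def lower_case_keys(metadata):
    """Return a copy of the metadata with all dictionary keys in lower case.

    Assumption: nested dictionaries and lists are converted recursively.
    Values are left unchanged.
    """
    if isinstance(metadata, dict):
        return {str(key).lower(): lower_case_keys(value) for key, value in metadata.items()}
    if isinstance(metadata, list):
        return [lower_case_keys(item) for item in metadata]
    return metadata


example_metadata = {
    "EXAMPLE": {
        "PRODUCT": {"ID": "ABC-123"},
        "CONTACT": [{"NAME": "Jane Doe"}],
    }
}
lowered = lower_case_keys(example_metadata)
```

Only the keys change; values such as "ABC-123" keep their original case.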

validate_data#

class data_model.validate_data.DataValidator(schema_file=None, data_file=None, data_table=None, data_dict=None, check_exact_data_type=True)[source]#

Validate data for type and units following a schema description; convert or transform data if required.

Data can be of table or dict format (internally, all data is converted to astropy tables).

Parameters:
schema_file: Path

Schema file describing input data and transformations.

data_file: Path

Input data file.

data_table: astropy.table

Input data table.

data_dict: dict

Input data dict.

check_exact_data_type: bool

Check for exact data type (default: True).

validate_and_transform(is_model_parameter=False)[source]#

Data and data file validation.

Parameters:
is_model_parameter: bool

Set to True if the data is a model parameter (adds some data preparation steps).

Returns:
data: dict or astropy.table

Data dict or table

Raises:
TypeError

If no data dict or data table is available.

validate_data_file()[source]#

Open data file and read data from file (doing this successfully is understood as file validation).

validate_parameter_and_file_name()[source]#

Validate that file name and key ‘parameter_name’ in data dict are the same.
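
The consistency check described above can be sketched as a comparison between the ‘parameter_name’ entry and the file name. This is a guess at the idea, not the simtools implementation: the helper name is hypothetical, the parameter name is made up, and comparing against the file name without directory and extension is an assumption.

```python
from pathlib import Path


def parameter_name_matches_file(data_dict, file_name):
    """Return True if 'parameter_name' equals the file name stem.

    Assumption: the comparison ignores the directory and the file extension.
    """
    return data_dict.get("parameter_name") == Path(file_name).stem


example_dict = {"parameter_name": "example_parameter", "value": 2.5}

matching = parameter_name_matches_file(example_dict, "data/example_parameter.json")
not_matching = parameter_name_matches_file(example_dict, "data/other_parameter.json")
```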