Dataset

group harp_dataset

The harp_dataset module contains everything related to HARP datasets.

A Dataset contains a list of references to HARP products together with optional metadata on each product. The primary reference to a product is the value of the ‘source_product’ global attribute of a HARP product.

Typedefs

typedef struct harp_dataset_struct harp_dataset

HARP Dataset typedef

Functions

int harp_dataset_new(harp_dataset **new_dataset)

Create a new HARP dataset. The dataset will be initialized with zero product metadata elements.

Parameters:

new_dataset – Pointer to the C variable where the new HARP dataset will be stored.

Returns:

  • 0, Success.

  • -1, Error occurred (check harp_errno).

void harp_dataset_delete(harp_dataset *dataset)

Delete HARP dataset.

Parameters:

dataset – Pointer to the dataset to free.

void harp_dataset_print(harp_dataset *dataset, int (*print)(const char*, ...))

Print HARP dataset.

Parameters:
  • dataset – Pointer to the dataset to print.

  • print – Pointer to the function that should be used for printing.
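The print parameter has the same signature as printf, so printf itself can be passed to write to standard output. As an illustration of a custom callback, the sketch below collects the formatted text in memory instead. The names capture_buffer, capture_length and capture_print are hypothetical and not part of HARP.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical capture buffer; the names and the size are illustrative only. */
static char capture_buffer[4096];
static size_t capture_length = 0;

/* Same signature as printf: int (*)(const char *, ...).
 * Appends the formatted text to capture_buffer and returns the number of
 * characters appended, or -1 if the text does not fit. */
int capture_print(const char *format, ...)
{
    va_list args;
    size_t available = sizeof(capture_buffer) - capture_length;
    int written;

    va_start(args, format);
    written = vsnprintf(capture_buffer + capture_length, available, format, args);
    va_end(args);

    if (written < 0 || (size_t)written >= available)
    {
        /* Restore the terminator so the buffer keeps its previous content. */
        capture_buffer[capture_length] = '\0';
        return -1;
    }
    capture_length += (size_t)written;

    return written;
}
```

With a valid dataset, harp_dataset_print(dataset, printf) would print to standard output, while harp_dataset_print(dataset, capture_print) would leave the same text in capture_buffer.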

int harp_dataset_import(harp_dataset *dataset, const char *path, const char *options)

Import metadata for products into the dataset. If path is a directory then all files (recursively) from that directory are added to the dataset. If path references a .pth file then the file paths from that text file (one per line) are imported. These file paths can be absolute or relative and can point to files, directories, or other .pth files. If path references a product file then that file is added to the dataset. Trying to add a file that is not supported by HARP will result in an error. Directories and files whose names start with a ‘.’ will be ignored.

Note that datasets cannot have multiple entries with the same ‘source_product’ value. Therefore, for each product where the dataset already contained an entry with the same ‘source_product’ value, the metadata of that entry is replaced with the new metadata (instead of adding a new entry to the dataset or raising an error).
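The replace-instead-of-add rule described above can be modelled with a small stand-alone table. This is only a sketch of the semantics: model_dataset, model_find and model_import are hypothetical names, the fixed capacity and the integer used as metadata are simplifications, and none of it reflects how harp_dataset is implemented.

```c
#include <string.h>

#define MODEL_CAPACITY 16

/* Hypothetical stand-in for a dataset: parallel arrays of keys and metadata.
 * This is NOT the layout of harp_dataset. */
typedef struct
{
    long num_products;
    const char *source_product[MODEL_CAPACITY];
    int metadata_version[MODEL_CAPACITY];       /* placeholder for real metadata */
} model_dataset;

/* Return the index of source_product, or -1 if there is no such entry. */
long model_find(const model_dataset *dataset, const char *source_product)
{
    long i;

    for (i = 0; i < dataset->num_products; i++)
    {
        if (strcmp(dataset->source_product[i], source_product) == 0)
        {
            return i;
        }
    }

    return -1;
}

/* Insert-or-replace: an existing entry keeps its position and only its
 * metadata is overwritten; an unknown source_product is appended.
 * Returns 0 on success and -1 when the table is full. */
int model_import(model_dataset *dataset, const char *source_product, int metadata_version)
{
    long index = model_find(dataset, source_product);

    if (index < 0)
    {
        if (dataset->num_products == MODEL_CAPACITY)
        {
            return -1;
        }
        index = dataset->num_products++;
        dataset->source_product[index] = source_product;
    }
    dataset->metadata_version[index] = metadata_version;

    return 0;
}
```

Importing the same source_product twice in this model leaves a single entry that holds the metadata of the second import.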

Parameters:
  • dataset – Dataset into which to import the metadata.

  • path – Path to either a directory containing product files, a .pth file, or a single product file.

  • options – Ingestion module specific options (optional); should be specified as a semicolon-separated string of key=value pairs; only used for product files that are not already in HARP format.

Returns:

  • 0, Success.

  • -1, Error occurred (check harp_errno).
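Since options is a single string of key=value pairs separated by semicolons, it can be convenient to assemble it from individual settings. The helper below is a minimal sketch of that; append_option is a hypothetical name, and the keys and values used with it are placeholders rather than real ingestion options.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append "key=value" to the NUL-terminated string in
 * buffer (of total capacity size), inserting ';' when the buffer already
 * holds an option. Returns 0 on success, or -1 if the result does not fit,
 * in which case the buffer is left unchanged. */
int append_option(char *buffer, size_t size, const char *key, const char *value)
{
    size_t length = strlen(buffer);
    size_t available = size - length;
    int written;

    written = snprintf(buffer + length, available, "%s%s=%s", length > 0 ? ";" : "", key, value);
    if (written < 0 || (size_t)written >= available)
    {
        /* Undo the partial write. */
        buffer[length] = '\0';
        return -1;
    }

    return 0;
}
```

Starting from an empty string, appending the placeholder pairs first=a and second=b gives "first=a;second=b", which has the form expected for the options argument.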

int harp_dataset_get_index_from_source_product(harp_dataset *dataset, const char *source_product, long *index)

Look up the index of source_product in the given dataset.

Parameters:
  • dataset – Dataset to get index in.

  • source_product – Source product reference.

  • index – Pointer to the C variable where the index in the dataset for the product is returned.

Returns:

  • 0, Success.

  • -1, Error occurred (check harp_errno).

int harp_dataset_has_product(harp_dataset *dataset, const char *source_product)

Test if dataset contains an entry with the specified source product reference.

Parameters:
  • dataset – Dataset in which to find the product.

  • source_product – Source product reference.

Returns:

  • 0, Dataset does not contain a product with the specified source product reference.

  • 1, Dataset contains a product with the specified source product reference.

int harp_dataset_add_product(harp_dataset *dataset, const char *source_product, harp_product_metadata *metadata)

Add a product reference to a dataset.

Parameters:
  • dataset – Dataset in which to add a new entry.

  • source_product – The source product reference of the new entry.

  • metadata – The product metadata of the new entry (can be NULL); the dataset is the new owner of metadata.

Returns:

  • 0, Success.

  • -1, Error occurred (check harp_errno).

int harp_dataset_prefilter(harp_dataset *dataset, const char *operations)

Filter products in dataset based on operations. Remove any entries from the dataset that can already be discarded based on filters at the start of the operations string. This includes comparisons against datetime/datetime_start/datetime_stop and collocate_left/collocate_right operations. The filters will be matched against the metadata in the dataset. The datetime_start and datetime_stop attributes will be used for the datetime filters and the source_product attribute for the collocation filters.

Parameters:
  • dataset – Dataset that should be filtered.

  • operations – Operations to execute; should be specified as a semicolon-separated string of operations.

Returns:

  • 0, Success.

  • -1, Error occurred (check harp_errno).
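The idea behind the datetime part of prefiltering can be sketched independently of HARP: a product whose time coverage lies entirely outside the requested range cannot contribute any samples, so it can be dropped using metadata alone. The code below is a conceptual model under that assumption. model_metadata, model_can_discard and model_prefilter are hypothetical names, and the actual matching rules used by HARP may differ.

```c
/* Hypothetical per-product metadata: only the time coverage is modelled.
 * Both values use the same (arbitrary) time unit. */
typedef struct
{
    double datetime_start;
    double datetime_stop;
} model_metadata;

/* Return 1 when no sample of the product can satisfy
 * lower <= datetime <= upper, so the product never needs to be opened. */
int model_can_discard(const model_metadata *metadata, double lower, double upper)
{
    return metadata->datetime_stop < lower || metadata->datetime_start > upper;
}

/* Compact the array in place, keeping only products that may still match.
 * Returns the number of remaining products. */
long model_prefilter(model_metadata *metadata, long num_products, double lower, double upper)
{
    long kept = 0;
    long i;

    for (i = 0; i < num_products; i++)
    {
        if (!model_can_discard(&metadata[i], lower, upper))
        {
            metadata[kept++] = metadata[i];
        }
    }

    return kept;
}
```

A product that only partly overlaps the range is kept in this model, because the metadata alone cannot tell which of its samples fall inside the range.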

struct harp_dataset_struct
#include <harp.h>

HARP Dataset struct