logsnarf.schema module
- class logsnarf.schema.Schema(schema_file, default_tz=<UTC>)
Bases: object
The Schema class represents a BigQuery JSON schema.
Objects of this class are able to:
load and verify schema files, which should contain a JSON representation of a list of fields as defined by https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#TableFieldSchema
parse JSON strings, coercing fields to the types defined by the schema where possible and appropriate
validate Python objects against the BigQuery schema
- Parameters:
schema_file (file) – File-like object containing the BigQuery JSON schema.
default_tz (datetime.tzinfo) – Timezone to use on date strings that don’t contain TZ information.
- Raises:
ValueError – if the schema file doesn’t contain valid JSON
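As an illustration of the input described above, the following minimal sketch builds a schema file in memory and reads it with the standard library only. The field names are invented for the example, and the Schema(...) call appears only in a comment because logsnarf itself is not imported here.

```python
import io
import json

# A hypothetical schema: a JSON list of TableFieldSchema-style dicts.
SCHEMA_JSON = """
[
  {"name": "timestamp", "type": "TIMESTAMP", "mode": "REQUIRED"},
  {"name": "message", "type": "STRING", "mode": "NULLABLE"},
  {"name": "request", "type": "RECORD", "mode": "NULLABLE",
   "fields": [{"name": "status", "type": "INTEGER", "mode": "NULLABLE"}]}
]
"""

# The constructor is documented as taking a file-like object.
schema_file = io.StringIO(SCHEMA_JSON)
fields = json.load(schema_file)
field_names = [field["name"] for field in fields]

# Invalid JSON surfaces as ValueError (json.JSONDecodeError subclasses it).
try:
    json.load(io.StringIO("[{not json"))
    bad_json_rejected = False
except ValueError:
    bad_json_rejected = True

# With logsnarf installed, the documented constructor would be called as:
#   schema = logsnarf.schema.Schema(io.StringIO(SCHEMA_JSON))
```

Using io.StringIO keeps the example independent of the file system; a file opened with open() would serve the same purpose.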
- ignore_fields = ['table', '_sha1']
Fields in this list are permitted even if they aren’t part of the schema. In Logsnarf we use this for the table field, which tells us which table a log line belongs in; we remove it from the entry before upload.
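The handling described above might look roughly like the following sketch. Both helper functions are hypothetical and are not part of logsnarf; only the list contents come from the documented default.

```python
IGNORE_FIELDS = ["table", "_sha1"]  # mirrors the documented default


def unknown_fields(entry, schema_names, ignore=IGNORE_FIELDS):
    """Hypothetical helper: names that are neither in the schema nor ignored."""
    return sorted(
        name for name in entry
        if name not in schema_names and name not in ignore
    )


def split_table_field(entry):
    """Hypothetical helper: return (table, copy of entry without 'table')."""
    cleaned = dict(entry)
    table = cleaned.pop("table", None)
    return table, cleaned


entry = {"table": "weblogs", "message": "GET /", "pid": 42}
extras = unknown_fields(entry, schema_names={"message"})
table, upload_row = split_table_field(entry)
```

Here "table" passes the unknown-field check because it is in the ignore list, while "pid" does not.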
- loads(json_string)
Deserialize json_string into a python object.
This applies all schema checks and post-processors.
- Parameters:
json_string (string|bytes) – UTF-8 encoded string containing a JSON document.
- Returns:
The JSON document as a python object
- Return type:
- Raises:
logsnarf.errors.ValidationError – if the JSON is valid, but does not contain a document that conforms to the BigQuery schema.
ValueError – if the string does not contain a valid JSON document.
- registerPostprocessor(fn)
Register a post processor.
Registers a function to be called on the result of every JSON object decoded by the Schema object.
- Parameters:
fn (callable) – A callable that takes one argument, the decoded JSON object, and returns the new version of that object.
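A post-processor of the shape described above could be sketched as follows. The apply_postprocessors function is a stand-in for whatever the Schema object does internally, and running the callables in registration order is an assumption.

```python
import json


def add_message_length(obj):
    """Example post-processor: takes the decoded object, returns a new version."""
    updated = dict(obj)
    updated["message_length"] = len(updated.get("message", ""))
    return updated


def drop_empty(obj):
    """Example post-processor: remove keys whose value is '' or None."""
    return {key: value for key, value in obj.items() if value not in ("", None)}


def apply_postprocessors(obj, postprocessors):
    # Stand-in for the Schema's internal loop; the order is an assumption.
    for fn in postprocessors:
        obj = fn(obj)
    return obj


decoded = json.loads('{"message": "hello", "host": ""}')
result = apply_postprocessors(decoded, [add_message_length, drop_empty])

# With logsnarf: schema.registerPostprocessor(add_message_length)
```

Each callable returns the object rather than relying on in-place mutation, which matches the documented contract of returning the new version.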
- setFieldValidator(field_name, fn)
Override the validator for a particular field in the schema.
- Parameters:
field_name (str) – The name of the field whose validator is replaced. To refer to a field of a subrecord, use dotted notation, e.g. recordfield.subrecord.item
fn (callable) – A callable that receives the root object and the current value of the field, and returns the new value. If the value is invalid, it should raise errors.ValidationError
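A validator with the documented two-argument signature might be written as in this sketch. The ValidationError class is a local stand-in for logsnarf.errors.ValidationError, and get_dotted is a hypothetical helper that only illustrates how a dotted name addresses a subrecord field.

```python
class ValidationError(Exception):
    """Stand-in for logsnarf.errors.ValidationError."""


def validate_status(root, value):
    """Validator: receives the root object and the field value, returns the new value."""
    try:
        status = int(value)
    except (TypeError, ValueError):
        raise ValidationError("status is not an integer: %r" % (value,))
    if not 100 <= status <= 599:
        raise ValidationError("status out of range: %d" % status)
    return status


def get_dotted(obj, dotted_name):
    """Hypothetical helper: walk nested dicts along a dotted field name."""
    for part in dotted_name.split("."):
        obj = obj[part]
    return obj


root = {"request": {"status": "404"}}
coerced = validate_status(root, get_dotted(root, "request.status"))

# With logsnarf: schema.setFieldValidator("request.status", validate_status)
```

The root argument goes unused here; it is available for validators that need to consult sibling fields.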
- setObjectLoadHook(fn)
Set the object load hook used by json.loads.
- Parameters:
fn (callable) – A callable that takes a non-literal, decoded JSON object and returns an updated version of that object.
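The hook mechanism referred to above is the object_hook parameter of the standard library's json.loads. The sketch below uses it directly; lowercase_keys is an example hook invented for illustration.

```python
import json


def lowercase_keys(obj):
    """Called by json.loads for every decoded JSON object (dict), innermost first."""
    return {key.lower(): value for key, value in obj.items()}


doc = json.loads(
    '{"Host": "a", "Request": {"Status": 200}}',
    object_hook=lowercase_keys,
)

# With logsnarf: schema.setObjectLoadHook(lowercase_keys)
```

The hook is invoked only for JSON objects, not for strings, numbers, or arrays, which is what "non-literal" refers to.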
- toUnixTimestamp(_parent, value)
Validator for TIMESTAMP fields.
- Parameters:
- Returns:
validated value
- Return type:
- Raises:
logsnarf.errors.ValidationError – if value is not, and cannot be converted to, a unix timestamp.
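The accepted input formats are not listed here, so the following is only a sketch of what such a validator could do. It assumes that numbers are passed through and that ISO 8601 strings are parsed, with a default timezone applied when the string carries none. The function name, the default_tz keyword, and the ValidationError class are all stand-ins.

```python
from datetime import datetime, timezone


class ValidationError(Exception):
    """Stand-in for logsnarf.errors.ValidationError."""


def to_unix_timestamp(_parent, value, default_tz=timezone.utc):
    """Sketch: coerce a number or ISO 8601 string to a unix timestamp (float)."""
    if isinstance(value, bool):
        # bool is a subclass of int, but is not a sensible timestamp.
        raise ValidationError("not a timestamp: %r" % (value,))
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        try:
            parsed = datetime.fromisoformat(value)
        except ValueError:
            raise ValidationError("unparseable timestamp: %r" % (value,))
        if parsed.tzinfo is None:
            # No TZ information in the string: fall back to the default.
            parsed = parsed.replace(tzinfo=default_tz)
        return parsed.timestamp()
    raise ValidationError("not a timestamp: %r" % (value,))


naive = to_unix_timestamp(None, "2020-01-01T00:00:00")
aware = to_unix_timestamp(None, "2020-01-01T01:00:00+01:00")
```

Both calls refer to the same instant, so they yield the same number: the first relies on the UTC default, the second carries its own offset.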
- validateJSON(root_obj)
Validate that an object matches the BigQuery schema.
- This involves:
ensuring all fields in the object are known
ensuring all required fields are present
running the field validators on each field
- Parameters:
root_obj (dict) – the object (dict) to validate against the schema.
- Returns:
validated object
- Return type:
- Raises:
logsnarf.errors.ValidationError – if the object is not valid against the schema
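The three checks listed for validateJSON can be sketched for a flat object as follows. The SCHEMA mapping, its validators, and the ValidationError class are stand-ins invented for the example; nested records are left out for brevity.

```python
class ValidationError(Exception):
    """Stand-in for logsnarf.errors.ValidationError."""


# Hypothetical flat schema: field name -> mode and (root, value) validator.
SCHEMA = {
    "timestamp": {"mode": "REQUIRED", "validator": lambda root, v: float(v)},
    "message": {"mode": "NULLABLE", "validator": lambda root, v: str(v)},
}
IGNORE_FIELDS = ["table", "_sha1"]


def validate(root_obj):
    # 1. Every field must be known to the schema or explicitly ignored.
    for name in root_obj:
        if name not in SCHEMA and name not in IGNORE_FIELDS:
            raise ValidationError("unknown field: %s" % name)
    # 2. Every required field must be present.
    for name, spec in SCHEMA.items():
        if spec["mode"] == "REQUIRED" and name not in root_obj:
            raise ValidationError("missing required field: %s" % name)
    # 3. Run the validator for each schema field that is present.
    validated = dict(root_obj)
    for name, spec in SCHEMA.items():
        if name in validated:
            validated[name] = spec["validator"](validated, validated[name])
    return validated


row = validate({"timestamp": "1.5", "message": "hi", "table": "logs"})
```

The validators run last, so they only ever see objects whose overall shape has already been accepted.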
- static validateSchemaField(field)
Validate a field of a schema.
For clarity, this is implemented with asserts. During normal schema validation, the resulting AssertionError is wrapped in a ValidationError by validateSchema.
- Parameters:
field (dict) – The field to validate.
- Raises:
AssertionError – if the field is invalid.
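The assert-then-wrap pattern described above might look like this sketch. The particular checks, the type and mode sets (a subset of BigQuery's legacy type names), and the check_schema wrapper are assumptions made for illustration and do not reproduce the library's actual rules.

```python
class ValidationError(Exception):
    """Stand-in for logsnarf.errors.ValidationError."""


# Assumed subset of BigQuery field types and modes, for illustration only.
KNOWN_TYPES = {"STRING", "INTEGER", "FLOAT", "BOOLEAN", "TIMESTAMP", "RECORD"}
KNOWN_MODES = {"NULLABLE", "REQUIRED", "REPEATED"}


def check_field(field):
    """Validate one schema field with plain asserts."""
    assert isinstance(field, dict), "field must be a dict"
    assert "name" in field, "field needs a name"
    assert field.get("type") in KNOWN_TYPES, "unknown type: %r" % field.get("type")
    assert field.get("mode", "NULLABLE") in KNOWN_MODES, "unknown mode"
    if field["type"] == "RECORD":
        assert field.get("fields"), "RECORD field needs subfields"
        for subfield in field["fields"]:
            check_field(subfield)


def check_schema(fields):
    """Wrap assertion failures in a ValidationError, as the text describes."""
    try:
        for field in fields:
            check_field(field)
    except AssertionError as exc:
        raise ValidationError("invalid schema: %s" % exc)


check_schema([{"name": "message", "type": "STRING", "mode": "NULLABLE"}])
```

One caveat with this style is that Python strips assert statements when run with the -O flag, so checks written this way are skipped in optimized mode.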