Data Quality
QuickStart Module
A tutorial showing you how to ensure the data in your collections is of high quality
Using Types to define quality rules
When you define a type, there are various aspects of it which then controls data quality as new data is written.
Value type
When you define a property on a type, you specify the type of data that will be stored in that property.
The type definition below defines 3 properties:
- name is of type String
- temperature and humidity are of type Timeseries
sensor = type
    name as String() not null
    temperature as timeseries()
    humidity as timeseries()
end
Any data written using type=sensor that had those 3 properties would have to conform to the defined types or else it would be rejected.
Not null directive
Adding the directive not null to a property definition ensures that no null values can be written to this property.
Default directive
Adding a default directive is an alternative way of dealing with null values, in this case if the value on the document is null, the default value will be used instead.
The default value can be:
- A statically defined value
- A value looked up from another document
Example of a statically defined default value
If the name property is missing or null, it is replaced with the value Undefined
sensor = type
    name as String() default "Undefined"
end
Example of a looked up default value
If the name property is missing or null, it is replaced with the value looked up using the property name in the type property from the object sensor_metadata
sensor = type
    type as String() not null
    name as String() default ${object:"sensor_metadata"}[type].name
end
The sensor_metadata would look something like this:
{
  "types": ["A", "B"],
  "A": {
    "name": "Type A"
  },
  "B": {
    "name": "Type B"
  }
}
Value constraints
You can add constraints to a type which check the value or values of properties to ensure they fulfill a condition.
Constraints can be configured, upon failure, to either:
- Reject the whole document (default)
- Mark the document with a Failed status by adding the directive markafter thecheckdirective
Example of using the mark directive
sensor = type
    type as String() not null
    name as String() not null
    constraint type_valid check mark (type = "test")
end
Example of value in a list
This example checks to see that the value in the type property is one of "temperature", "humidity" or "both"
sensor = type
    type as String() not null
    name as String() not null
    constraint type_valid check (type in ["temperature","humidity","both"])
end
Example of using a regex expression
The following using a regex expression to validate that an input value matches a certain patter
sensor = type
    type as String() not null
    name as String() not null
    zip as String() not null
    constraint zip_valid check mark (zip matches "^\d{5}(-\d{4})?$")
end
Example of looked up constraint
The following constraint checks to see that the value in the type property exists in a types array in a sensor_metadata object.
sensor = type
    type as String() not null
    name as String() default ${object:"sensor_metadata"}[type].name
    
    constraint type_valid check (type in ${object:"sensor_metadata"}.types)
end