Skip to main content

Data Quality

QuickStart Module

A tutorial showing you how to ensure the data in your collections is of high quality

Using Types to define quality rules

When you define a type, there are various aspects of it which then controls data quality as new data is written.

Value type

When you define a property on a type, you specify the type of data that will be stored in that property.

The type definition below defines 3 properties:

  • name is of type String
  • temperature and humidity are of type Timeseries
sensor = type
name as String() not null
temperature as timeseries()
humidity as timeseries()
end

Any data written using type=sensor that had those 3 properties would have to conform to the defined types or else it would be rejected.

Not null directive

Adding the directive not null to a property definition ensures that no null values can be written to this property.

Default directive

Adding a default directive is an alternative way of dealing with null values, in this case if the value on the document is null, the default value will be used instead.

The default value can be:

  • A statically defined value
  • A value looked up from another document

Example of a statically defined default value

If the name property is missing or null, it is replaced with the value Undefined

sensor = type
name as String() default "Undefined"
end

Example of a looked up default value

If the name property is missing or null, it is replaced with the value looked up using the property name in the type property from the object sensor_metadata

sensor = type
type as String() not null
name as String() default ${object:"sensor_metadata"}[type].name
end

The sensor_metadata would look something like this:

{
"types": ["A", "B"],
"A": {
"name": "Type A"
},
"B": {
"name": "Type B"
}
}

Value constraints

You can add constraints to a type which check the value or values of properties to ensure they fulfill a condition.

Constraints can be configured, upon failure, to either:

  • Reject the whole document (default)
  • Mark the document with a Failed status by adding the directive mark after the check directive

Example of using the mark directive

sensor = type
type as String() not null
name as String() not null

constraint type_valid check mark (type = "test")
end

Example of value in a list

This example checks to see that the value in the type property is one of "temperature", "humidity" or "both"

sensor = type
type as String() not null
name as String() not null

constraint type_valid check (type in ["temperature","humidity","both"])
end

Example of using a regex expression

The following using a regex expression to validate that an input value matches a certain patter

sensor = type
type as String() not null
name as String() not null
zip as String() not null

constraint zip_valid check mark (zip matches "^\d{5}(-\d{4})?$")
end

Example of looked up constraint

The following constraint checks to see that the value in the type property exists in a types array in a sensor_metadata object.

sensor = type
type as String() not null
name as String() default ${object:"sensor_metadata"}[type].name

constraint type_valid check (type in ${object:"sensor_metadata"}.types)
end