Structured Commands

A list of commands that create structures or perform more complex tasks.

Introduction

A structured command simplifies the creation of the more complex elements in OpenDataDSL. A structure generally takes the form:

varname = structure
    (config)*
    (statement)*
end

action

Command to create actions that can be used in workflows

Syntax

varname = (action|gateway) in "category"
  (comment)? 
  (actionInput|actionOutput)* 
  (actionExit)? 
  (statement)* 
end

actionInput: in varname as declaredType ("desc")? (optional)?
actionOutput: out varname as declaredType ("desc")?
actionExit: exit "name" (, "name")*

Description

A workflow action is a small block of code that performs a specific task and can be used in your own custom workflows. An action can have input and output data and one or more exit transitions, which can be configured in a workflow.

Explanation of the syntax

action or gateway - This command can create actions or gateways:
- An action takes optional inputs, performs a task and creates optional outputs
- A gateway takes an input and, based on that input, takes a specific exit transition
The category is a string that is used to place the action in a specific category
The comment on the first line of the action is used as the description of the action
Action inputs can be marked with the optional directive, which makes that input optional
The action exits are simply string exit names, such as "ok" or "failed"

Examples

A simple send batch action

test_send_batch = action in "loaders"
    // Send a batch of data to the server to be updated
    in batch as Object "The batch to upload"
    exit "ok","failed"
    
    on error ignore
    send input.batch
    if error
        return "failed"
    else
        return "ok"
    end
end

// Save the action to the server
save ${action:test_send_batch}

An action to read an object from the object service

read_object = action in "general"
    // Read an object
    in id as Scalar "The ID of the object to read"
    out obj as Object "The ODSL Object"
    exit "ok", "failed" 

    on error ignore 
    output.obj=${object:input.id}
    if error
        print error
        return "failed"
    else
        return "ok"
    end        
end
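
The control flow of the read_object action above can be modelled outside OpenDataDSL. The following Python sketch is only an illustration: OBJECT_STORE and the function signature are hypothetical stand-ins for the object service and the action runtime, not part of OpenDataDSL.

```python
from types import SimpleNamespace

# Hypothetical in-memory stand-in for the object service
OBJECT_STORE = {"WIDGET_1": {"name": "Widget 1"}}

def read_object(inputs, outputs):
    """Model of the read_object action: returns the name of the exit taken."""
    try:
        # Equivalent of: output.obj=${object:input.id}
        outputs.obj = OBJECT_STORE[inputs.id]
    except KeyError as error:
        # Equivalent of the "if error" branch
        print(error)
        return "failed"
    return "ok"

found = SimpleNamespace()
status_found = read_object(SimpleNamespace(id="WIDGET_1"), found)

missing = SimpleNamespace()
status_missing = read_object(SimpleNamespace(id="NOPE"), missing)
```

The returned string is always one of the declared exit names, which is what lets a workflow route on it.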

aggregate

Create an aggregation pipeline to group and summarise data

Syntax

varname = aggregate service 
  (pipelineItem)* 
end

pipelineItem: (pipelineMatch|pipelineGroup|pipelineSort|NL|comment);
pipelineMatch: match (condition)+;
pipelineGroup: group assign (, assign)*;
pipelineSort: sort sortItem (, sortItem)*;
sortItem: (assign|varname (asc|desc)?);

Description

The aggregate command creates an aggregation pipeline to group and summarise data from any service.

Explanation of the syntax

The match pipeline item allows you to filter out data using a condition clause
The group pipeline item groups or summarises data using one or more of the following summarising functions:
- count() - counts the number of occurrences
The sort pipeline item sorts the results according to the specified field and asc(ending) or desc(ending) order

Examples

The following example summarises the status of process executions filtered for a specific service

summary = aggregate ${exec}
    match service="ETL"
    group _id="$status", qty=count()
    sort qty desc
end
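
To make the three pipeline stages concrete, here is a rough Python model of what the example above computes. The execution records are invented for illustration, and the shape of the result rows is an assumption based on the field names in the group item.

```python
from collections import Counter

# Hypothetical execution records; field names follow the example above
executions = [
    {"service": "ETL", "status": "success"},
    {"service": "ETL", "status": "failed"},
    {"service": "ETL", "status": "success"},
    {"service": "REPORT", "status": "success"},
]

# match service="ETL": keep only the records that satisfy the condition
matched = [e for e in executions if e["service"] == "ETL"]

# group _id="$status", qty=count(): one row per distinct status
counts = Counter(e["status"] for e in matched)
grouped = [{"_id": status, "qty": qty} for status, qty in counts.items()]

# sort qty desc: largest count first
summary = sorted(grouped, key=lambda row: row["qty"], reverse=True)
print(summary)
```

Each stage consumes the output of the previous one, which is why the order of the pipeline items matters.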

find

Used to search the database

Syntax

varname = find 
  (top n)?
  (unique field from | profile profilename (for range)? from)?
  (activevar|avservice)
  (where (condition)+)?
  (end)?

Layout

The find command can be used in a single line or multi-line format for better readability, e.g.

// Single line search of private audit records
records = find ${audit} where timestamp > ${date:"today"} and timestamp < ${date:"tomorrow"}

// Multi-line version of the same search
records = find ${audit} where 
    timestamp > ${date:"today"} 
    and timestamp < ${date:"tomorrow"}
end

Options

top

The top option allows you to find a smaller sample of data, e.g.

// Retrieve the first 15 objects
objects = find top 15 ${object} where dataset="ARGUS_DEL"

unique

The unique option allows you to get a list of unique values for a specific field in a resource
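
As a rough illustration of what the top and unique options return, the following Python sketch applies the same ideas to a plain list. The sample objects and the helper names find_top and find_unique are hypothetical, and the ordering of results returned by a real service is not specified here.

```python
import itertools

# Hypothetical objects standing in for the contents of a service
objects = [
    {"id": "A", "dataset": "ARGUS_DEL"},
    {"id": "B", "dataset": "ECB_FX"},
    {"id": "C", "dataset": "ARGUS_DEL"},
    {"id": "D", "dataset": "ARGUS_DEL"},
]

def find_top(items, n, predicate):
    """Return at most n items that satisfy the predicate."""
    matches = (item for item in items if predicate(item))
    return list(itertools.islice(matches, n))

def find_unique(items, field):
    """Return the distinct values of one field, in first-seen order."""
    return list(dict.fromkeys(item[field] for item in items))

top_two = find_top(objects, 2, lambda o: o["dataset"] == "ARGUS_DEL")
datasets = find_unique(objects, "dataset")
```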

profile

The profile option is only used when searching through objects. It allows you to search objects, but return a list of data objects linked to the object where the profile name matches the passed in profilename. This option also allows you to specify a range, which can be a single date or a range of dates in the following syntax:

Single date as a Date or a String, e.g. "2021-07-16"
Single date using the date active variable service, e.g. ${date:"yesterday"}
From a date, e.g. from("2021-01-01")
The last number of days, e.g. last(3)
A range of dates using between, e.g. between("2021-01-01",${date:"yesterday"})

condition

The condition option is used to filter the results of the find command. The syntax of the conditions is as follows:

expression (<|<=|>|>=|=|==|!=|like|intersects|within) expression
    | condition (and|or) condition
    | ( condition )

Examples of conditions

category = "extractors"
timestamp > "2020-11-03T12:23:40"
timestamp > ${date:"today"} and timestamp < ${date:"tomorrow"}
name like "ch"
location within Sphere([ 51.72961, 0.47612 ], 20 / 3963.2)
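
The 20 / 3963.2 argument in the last condition reads like a radius expressed in radians: a distance of 20 miles divided by an approximate Earth radius of 3963.2 miles. That interpretation, and the assumption that coordinates are given as latitude then longitude, are not stated in this section. Under those assumptions, a within test against a sphere can be sketched in Python as follows.

```python
import math

EARTH_RADIUS_MILES = 3963.2  # assumed meaning of the divisor in the condition

def angular_distance(a, b):
    """Great-circle angle in radians between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    # Haversine formula
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(math.sqrt(h))

def within_sphere(point, centre, radius_radians):
    """True if the point lies within the given angular radius of the centre."""
    return angular_distance(point, centre) <= radius_radians

centre = (51.72961, 0.47612)
radius = 20 / EARTH_RADIUS_MILES  # 20 miles expressed in radians

nearby = within_sphere((51.75, 0.50), centre, radius)
far_away = within_sphere((48.8566, 2.3522), centre, radius)  # roughly Paris
```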

Description

The find command is a powerful way of searching a resource for the data you require. It returns a virtual list of items that match the specified conditions.

Examples

Find all

In its simplest form, the find command can be used to list all items from a service, e.g.

// Get a list of all public calendars
calendars = find ${calendar:public}

Simple filtering

// Get a list of public actions for a specific category
pactions = find ${action:public} where category = "extractors"

Filtering by date

// Get a list of private audit records for today
records = find ${audit} where timestamp > ${date:"today"} and timestamp < ${date:"tomorrow"}

A list of data using the object profile option

data = find profile SPOT for yesterday from ${object:public} where dataset == "ECB_FX"

Various date range queries

// Date ranges in a get query
data = ${data:"#ECB_FX.EURZAR:SPOT"} for ${date:"yesterday"}
data = ${data:"#ECB_FX.EURZAR:SPOT"} for yesterday
data = ${data:"#ECB_FX.EURZAR:SPOT"} for last(3)
data = ${data:"#ECB_FX.EURZAR:SPOT"} for from(yesterday)
data = ${data:"#ECB_FX.EURZAR:SPOT"} for between("2021-05-01",yesterday)

Geospatial

// Define a polygon and search for objects within it
london = Polygon([[51.5386, -0.4956],[51.6445, -0.0753],[51.5205, 0.1753],[51.3479, -0.1163],[51.5386, -0.4956]])
items = find ${object:"TestGeometry"} where location within london

function

Create your own user definable function (UDF)

Syntax

function name ( ((byref)? param (, (byref)? param)*)? )
  (comment)?
  (statement)*
end

Description

The function command allows you to create a custom function which can be used in your OpenDataDSL scripts. You can call the function from the same script, or write functions in one script and import them into another.

The function is called using the name of the function and the parameters passed in the same order as declared.

To return a value from a function, assign it to a variable inside the function that has the same name as the function.

Documentation

Any comment added above the function becomes the function description, which is visible when hovering over a call to the function.

Multi-line comments can also include '@ variables' which can describe parameters and information about the script itself:

@category - when placed in a comment block at the top of the script, this is used as the category for the file which is used in the GUI to provide categorised lists of scripts.
@param - this is used to provide a description of a function parameter

Parameter modifiers

By default, parameters are passed 'by value', which means that changing those values in the function does not change the variable that was passed in. Adding the byref parameter modifier means that the value is passed 'by reference' so the actual variable itself is passed into the function.
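
The difference between the two passing modes can be illustrated in Python. This is only a model of the semantics described above: Python itself always passes object references, so a deep copy is used here to emulate 'by value', and the helper names are hypothetical.

```python
import copy

def call_by_value(func, argument):
    """Pass a deep copy, so the function cannot change the caller's variable."""
    return func(copy.deepcopy(argument))

def call_by_reference(func, argument):
    """Pass the object itself, so changes are visible to the caller."""
    return func(argument)

def add_total(batch):
    # Modifies whatever object it was given
    batch["total"] = sum(batch["values"])

by_value = {"values": [1, 2, 3]}
call_by_value(add_total, by_value)      # caller's variable is untouched

by_ref = {"values": [1, 2, 3]}
call_by_reference(add_total, by_ref)    # caller's variable gains "total"
```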

Example of script comment:

/**
 * @category report
 * Functions for creating reports
 */

Example of function comment:

/**
 * Bootstrap an input curve
 * to create a monthly arbitrage-free curve
 * @param input The input curve
 */

Examples

An example that creates a bootstrapped and shaped curve

/**
 * Create an arbitrage free monthly curve from the input curve and use simple shaping
 * @param input The input curve
 */
function bootstrapAndShape(input)
    boot = bootstrapCurve(input)
    bootstrapAndShape = shape(boot)
end

transform

Used in an ETL process to map data from an input format to output objects

Syntax

name = transform inputVariable into (declaredType|object|rows) as varname 
  (comment)? 
  transformOptions 
  transform 
end

transformOptions: (transformCreate)? (transformUnique)? (on error ignore)? (transformIgnore)?

transformCreate: create with varname (, varname)* (transformClear)?
transformUnique: unique var=value
transformIgnore: ignore condition (, condition)*
transformClear: clear varname (, varname)*

transform: (assign|method|transformIf|print|comment|NL)*
transformIf: if condition
  transform (elseif condition transform)* 
  (else transform)?
end

Description

The transform command creates a transformer that can be used to convert an input object, usually read from a file or web URL, into one or more output objects.

Explanation of the syntax

The inputVariable is the input object that you want to transform from
The into declaration can be one of:
- A named public or private declaredType
- A generic object
- A rows object, which is effectively rows of properties, like a CSV file or spreadsheet
The as varname part defines the temporary object variable holding the current value as the input variable is iterated through
The comment on the first row of the transformer is used as the description of the transformer
The options section at the top of the transformer can contain the following:
- create - this defines the column or property name in the input variable whose change in value triggers the creation of a new output object, e.g. if the input variable has a name property which is different for each element that you want to create an object for, you would use the option: create with name.
  - clear - this works with create and allows you to clear some property values each time a new object is created, so that property values are not carried over from a previous object when they are not supplied.
- unique - this defines the unique id for the output variables and enables you to concatenate properties, clean up variable names etc.
- on error ignore - this allows the transformer to complete, ignoring any transformation errors - sometimes this is necessary when formats cause execution errors
- ignore - this allows you to skip elements that match any supplied condition, for example if a property is null, which would cause issues, you can specify that condition here

Examples

The following example creates a transformer and then runs the transformer using input data from an xml file on the ECB web site.

// Create the transformer
ECB_FX = transform xml into #ForeignExchange as fx
    create with Cube
    unique id = "ECB_FX_EUR" + fx.currency
    SPOT = TimeSeries(xml.Cube.Cube.time, "BUSINESS", fx.rate)
    category = "Foreign Exchange"
    product = "ECB_FX"
    provider = "European Central Bank"
    model = "EUR" + fx.currency
    description = "European Central Bank Euro FX rates EUR/" + fx.currency
    base = "EUR"
    currency = fx.currency
end

// Test
xml=${xml:"https://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml"}
models = ECB_FX.run(xml)
for model in models
    print model.id
next

The following example shows some static JSON content being transformed in 2 different ways and also introduces using metadata to decorate the output results.

// The static input JSON
json = `{
    timestamp: "12Oct2020",
    data: [
        {
            identity: "A",
            value: 1.2
        },
        {
            identity: "B",
            value: 1.5
        },
        {
            identity: "C",
            value: 999
        }
    ]
}`

// Convert to an object
input = ${json:json}

// Create a type to convert the JSON into
example = type
    name as String()
    description as String()
    value as Scalar()
end

// Create some metadata to decorate the output data
metadata = Object()
metadata.A = "A description"
metadata.B = "B description"

// A transformer to convert the JSON into the example type
tx = transform input into example as x
    create with data
    unique name = x.identity
    on error ignore
    ignore x.identity == "C"
    name = x.identity
    description = metadata.get(x.identity)
    value = x.value
end

// Run the transformer and output the results as JSON
result = tx.run(input)
print json(result)

// Another transformer that outputs the results as rows
rowtx = transform input into rows as x
    create with data
    unique name = x.identity
    on error ignore
    ignore x.identity == "C"
    name = x.identity
    description = metadata.get(x.identity)
    value = x.value
end

// Run the transformer and print out the results and size of the results
rowresult = rowtx.run(input)
print json(rowresult)
print rowresult.size
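
The effect of the create, ignore and unique options in the transformers above can be approximated in Python. This sketch assumes that create with data produces one output per element of the data array and that the unique value acts as a key for the outputs. The actual output format of a transformer run is not shown in this section, so the result structure here is illustrative only.

```python
# The same static input as the example above, as a Python dict
source = {
    "timestamp": "12Oct2020",
    "data": [
        {"identity": "A", "value": 1.2},
        {"identity": "B", "value": 1.5},
        {"identity": "C", "value": 999},
    ],
}
metadata = {"A": "A description", "B": "B description"}

def run_transform(source, metadata):
    results = {}
    for x in source["data"]:          # create with data
        if x["identity"] == "C":      # ignore x.identity == "C"
            continue
        key = x["identity"]           # unique name = x.identity
        results[key] = {
            "name": x["identity"],
            "description": metadata.get(x["identity"]),
            "value": x["value"],
        }
    return list(results.values())

result = run_transform(source, metadata)
print(len(result))
```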

type

Create a custom data type for your private data

Syntax

declaredType = (versioned)? type (extends declaredType)?
  (comment)?
  (typeProperty | typeExpression | typeMethod )*
end

typeProperty: propertyName as type ( (qualifier)? )
  (matches regex)?
  (default defaultValue)?

typeExpression: propertyName as expression

typeMethod: propertyName as function( (param (,param)*)? )
  functionBody
end

Description

An explanation of the syntax:

declaredType is the id of the type
The versioned option makes all objects of this type versioned
The extends option allows you to extend another type, creating a specialisation of that type
The comment on the first line after the declaration becomes the description of this type
A property on the type can be one of:
- property
- expression
- method
The type of the property can be one of:
- dimension - This is a special property type that is used by GUI applications, e.g. the web portal, to filter objects based on the value of this property. A dimension is generally used for a property with few distinct values, e.g. country or source
- string - A simple string property
- number - A generic number, can be integer or decimal
- scalar - A generic scalar value, can be string, numeric or boolean
- date - A date or datetime value
- boolean - A true/false value
- list - A list of objects - the type of object can be restricted using a qualifier
- object - A generic object
- duration - A timespan value, e.g. 4 days
- geometry - A geometric shape
- timeseries - A timeseries value
- curve - A curve value
- specific type - Any public or private type declaration
The property can have a qualifier used to limit the type, e.g. if the type is a list and you want to restrict the list elements to dates

Versioning

If you specify the versioned option when defining the type, then all objects of this type will be versioned. This means that any update to an object which changes the value of a property causes a new version to be created and the existing version to be archived. More details can be found in the data versioning documentation.
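
The versioning rule described above (a changed property creates a new version and archives the old one) can be sketched as a small in-memory model. The VersionedStore class and the integer version numbers are assumptions made for illustration and say nothing about how OpenDataDSL stores versions internally.

```python
class VersionedStore:
    """Keeps the current version of each object and archives replaced ones."""

    def __init__(self):
        self.current = {}
        self.archive = {}

    def save(self, object_id, properties):
        existing = self.current.get(object_id)
        if existing is not None:
            if existing["properties"] == properties:
                return existing["version"]  # nothing changed, no new version
            # A property changed: archive the existing version
            self.archive.setdefault(object_id, []).append(existing)
            version = existing["version"] + 1
        else:
            version = 1
        self.current[object_id] = {"version": version, "properties": dict(properties)}
        return version

store = VersionedStore()
v1 = store.save("CABLE_1", {"price": 10.0, "length": 2})
v_same = store.save("CABLE_1", {"price": 10.0, "length": 2})  # identical update
v2 = store.save("CABLE_1", {"price": 12.5, "length": 2})      # price changed
```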

Extending

By default, all types extend the #Object type which has the following properties defined:

name - string
description - string
classification - dimension
geolocation - geometry

However, you can elect to extend any public or your own private type in order to create a specialised version of that type, which will give you a type hierarchy. A simple example of why you would want to do this is shown below:

Type Hierarchy

The diagram below shows a simple type hierarchy. Starting with #Object, we create a generic Person type which adds the properties PhoneNumber and Email. We then extend the Person type into 2 specialised types, Student and Professor, which add some more specific properties.

With this type hierarchy, we can add Student objects and Professor objects, but we can also list all Person objects, showing all students and professors. So if you knew someone's email address and searched in Person, it would find the person irrespective of whether they are a student or a professor.
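
The same idea can be expressed with ordinary class inheritance. In this Python sketch the course and department properties, the sample people and the find_by_email helper are hypothetical. It only illustrates that a search on the base type also matches objects of the specialised types.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    email: str
    phone_number: str = ""

@dataclass
class Student(Person):
    course: str = ""        # hypothetical specialised property

@dataclass
class Professor(Person):
    department: str = ""    # hypothetical specialised property

people = [
    Student(name="Ann", email="ann@example.org", course="Physics"),
    Professor(name="Bob", email="bob@example.org", department="Maths"),
]

def find_by_email(items, base_type, email):
    """Search all objects of base_type, including its specialisations."""
    return [p for p in items if isinstance(p, base_type) and p.email == email]

as_person = find_by_email(people, Person, "bob@example.org")
as_student = find_by_email(people, Student, "bob@example.org")
```

Searching in Person finds Bob even though he was stored as a Professor, while searching in Student does not.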

For more examples and in-depth descriptions, see the section Data Modelling.

Examples

Example of a type hierarchy:

Widget = versioned type
    price as Number()
end

Cable = versioned type extends Widget
    length as Number()
end

Switch = versioned type extends Widget
    ports as Number()
end

workflow

Create a workflow

Syntax

varname = workflow in "category"
  (comment)? 
  (actionInput|actionOutput)* 
  (actionExit)?
  workflowStart (phase|workflowEvent|NL|comment)*
end

workflowStart: WF_START
  workflowTransition
end
workflowBody: (workflowEvent|workflowAction|workflowGateway|workflowWorkflow)
workflowEvent: WF_EVENT varname as "string"
  (assign|comment|returnstmt|workflowTransition)* 
end
workflowAction: WF_ACTION varname input
  (assign|comment|workflowTransition)* 
end
workflowGateway: WF_GATEWAY varname input
  (assign|comment|workflowTransition)* 
end
workflowWorkflow: WF_WORKFLOW varname input
  (assign|comment|workflowTransition)* 
end
workflowTransition: "name" -> connection

// Workflow phases
phase: phase "name" (retries INT)? (delay INT TIMEUNIT)? (then reschedule)? (external)?
  (statement|workflowBody)* 
end
reschedule: reschedule INT TIMEUNIT
fail: fail expression
abort: abort expression

Description

The configuration of a workflow is best done using the workflow GUI in the web portal (Not available yet), but it can also be done in OpenDataDSL code.

Anatomy of a workflow

A workflow has some input, output and exit configuration at the start - just like an action. The input information is passed in via a process or as an object if running the workflow manually.

Workflow Blocks

All workflows have building blocks in them:

WF_START - There must be exactly 1 of these. It indicates the start point and contains only a transition to the first workflow element that is called
WF_ACTION - This block configures a workflow action. You can:
- define the action transition routing, i.e. the route to take given the transition information when the action completes
- assign the action input variables from the workflow input or any previous action outputs
- run the action using the input variables
WF_GATEWAY - This block configures a workflow gateway. It is configured in the same way as an action block and is generally used to route workflows according to an expression
WF_WORKFLOW - This block configures a sub-workflow. It is configured in the same way as an action block. Note that any workflow can be used as a sub-workflow
WF_EVENT - This is generally used as an end point of a workflow and is used to return the transition information back to the calling application

Workflow Phase

A workflow can nest its WF_ACTION, WF_GATEWAY and WF_WORKFLOW blocks in a phase.

It is recommended that you use phases in a workflow for the following reasons:

It breaks a workflow into distinct sections which get reported in real-time whilst the workflow is executing
It allows you to time sections of the workflow
Each phase allows for custom configuration of retries, retry delay and rescheduling

Examples

wf_xml_extract = workflow in "data-loaders"
    // Extract some data
    in url as Scalar
    exit "success", "failed"
    
    WF_START
        "ok" -> act_extract_xml
    end

    phase "EXTRACT"
        WF_ACTION act_extract_xml ai
            "ok" -> stopok
            "failed" -> stopfailed
            ai.url = input.url
            result = ${action:"#extract_xml"}.run(ai, output)    
        end
    end

    WF_EVENT stopok as "success"
        return "ok"
    end
    WF_EVENT stopfailed as "failed"
       return "failed"
    end
end

A line-by-line breakdown of this workflow:

Define a workflow called wf_xml_extract in the category data-loaders
Set a description for this workflow
Define an input variable called url which is a Scalar
Define exit transitions for the workflow as success and failed
Define the workflow start point
Transition to the action named act_extract_xml (the transition name is ignored)
Define a workflow phase called EXTRACT
Define an action block called act_extract_xml with an input variable called ai
Route the "ok" transition to stopok
Route the "failed" transition to stopfailed
Set the url on the action input to be the input url (passed in by the process)
Run the #extract_xml action passing in the ai variable and the global output variable
Define the stopok event as a success transition for the whole workflow
Define the stopfailed event as a failed transition for the whole workflow
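
The routing described in this breakdown behaves like a small state machine: start at the first block, run it, follow the named transition, and stop when an event is reached. The Python sketch below models that flow. The extract_xml function is a hypothetical stand-in for the #extract_xml action, and the dictionary layout is an illustration, not how OpenDataDSL represents workflows.

```python
def extract_xml(action_input, output):
    """Hypothetical stand-in for the #extract_xml action."""
    if action_input.get("url", "").startswith("http"):
        output["data"] = "<xml/>"
        return "ok"
    return "failed"

WORKFLOW = {
    "start": "act_extract_xml",                      # WF_START transition target
    "actions": {
        "act_extract_xml": {                         # WF_ACTION block
            "run": extract_xml,
            "transitions": {"ok": "stopok", "failed": "stopfailed"},
        },
    },
    "events": {                                      # WF_EVENT blocks
        "stopok": ("success", "ok"),
        "stopfailed": ("failed", "failed"),
    },
}

def run_workflow(workflow, workflow_input):
    output = {}
    current = workflow["start"]
    while current not in workflow["events"]:
        block = workflow["actions"][current]
        action_input = {"url": workflow_input["url"]}    # ai.url = input.url
        transition = block["run"](action_input, output)
        current = block["transitions"][transition]       # follow the route
    exit_name, returned = workflow["events"][current]
    return exit_name, returned, output

good = run_workflow(WORKFLOW, {"url": "https://example.org/feed.xml"})
bad = run_workflow(WORKFLOW, {"url": "not-a-url"})
```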
