Aggregating data

QuickStart Module

This quickstart module gives an overview of aggregating data using the MongoDB aggregation pipeline.

Using ODSL to create aggregation pipelines

An aggregation pipeline consists of an array of stages that are executed in sequence. Each stage transforms the documents passed to it by the previous stage, and the final stage produces the output array of documents.
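As a rough illustration of stages running one after another, the following Python sketch models each stage as a function from a list of documents to a list of documents. The helper names (match, limit, run_pipeline) and the sample documents are hypothetical. They are not part of ODSL or MongoDB and only mimic the idea.

```python
def match(**criteria):
    """Build a stage that keeps documents whose fields equal all given values."""
    def stage(docs):
        return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]
    return stage


def limit(n):
    """Build a stage that keeps only the first n documents."""
    def stage(docs):
        return docs[:n]
    return stage


def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        docs = stage(docs)
    return docs


# Made-up documents shaped loosely like zip code records.
sample = [
    {"city": "ALBANY", "state": "NY", "pop": 9374},
    {"city": "ALBANY", "state": "GA", "pop": 30000},
    {"city": "ALBANY", "state": "NY", "pop": 120},
]

result = run_pipeline(sample, [match(state="NY", city="ALBANY"), limit(1)])
print(result)
```

Swapping the order of the two stages changes the result, which is why the order of stages in a pipeline matters.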

You can construct a MongoDB aggregation pipeline in ODSL using the aggregate command. Here is a simple example:

aggregate ${object:"m101:sample_training.zips"}
match state = "NY" and city = "ALBANY"

Aggregation Stages

The following MongoDB Aggregation Stages are fully supported within the ODSL language:

Any other pipeline stage can be used in ODSL by writing it in JSON format, for example:

aggregate ${object}
{ "$collStats": { "storageStats": { } } }


Below is a set of examples showing a variety of aggregations, aggregation stages and functions.

Match and count

aggregate ${object:"m101:sample_training.zips"}
match state = "NY" and city = "ALBANY" and pop > 5000
count "zips"
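For readers who know the native MongoDB syntax, the example above plausibly corresponds to the pipeline below, written here as Python data. This assumes that the ODSL and operator maps onto the implicit AND of a single $match document, which this page does not confirm. The to_json helper is hypothetical.

```python
import json

# Assumed native equivalent of the ODSL match/count example above.
pipeline = [
    {"$match": {"state": "NY", "city": "ALBANY", "pop": {"$gt": 5000}}},
    {"$count": "zips"},
]


def to_json(stages):
    """Render the pipeline as the JSON text MongoDB tools accept."""
    return json.dumps(stages, indent=2)


print(to_json(pipeline))
```

With a driver such as PyMongo, a list like this can be passed to the aggregate method of a collection.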

Using group and sort

aggregate ${object:"m101:sample_training.companies"}
group _id="$founded_year", num=sum(1)
sort num asc
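To show what the group and sort stages compute, here is a minimal in-memory sketch in Python. The group_and_sort function and the sample companies are hypothetical; the real work is done by MongoDB on the server.

```python
from collections import Counter


def group_and_sort(companies):
    """Count companies per founded_year, then order by the count, ascending."""
    # Documents without founded_year are grouped under None,
    # mirroring how MongoDB groups missing fields under a null _id.
    counts = Counter(c.get("founded_year") for c in companies)
    rows = [{"_id": year, "num": n} for year, n in counts.items()]
    return sorted(rows, key=lambda r: r["num"])


# Made-up sample data.
companies = [
    {"name": "A", "founded_year": 2004},
    {"name": "B", "founded_year": 2007},
    {"name": "C", "founded_year": 2007},
    {"name": "D", "founded_year": 2007},
    {"name": "E", "founded_year": 2004},
    {"name": "F"},
]

print(group_and_sort(companies))
```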

Using addFields and project

aggregate ${object:"m101:sample_training.companies"}
match ipo != null
addFields symbol="$ipo.stock_symbol"
project name, symbol
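The effect of these three stages can be sketched in plain Python as follows. This is only an illustration of the semantics, assuming that ipo is an embedded document; the function name and sample data are hypothetical.

```python
def add_symbol_and_project(companies):
    """Keep companies with an ipo, copy ipo.stock_symbol to symbol, project fields."""
    rows = []
    for company in companies:
        ipo = company.get("ipo")
        if ipo is None:
            # A not-equal-to-null match drops both null and missing ipo values.
            continue
        enriched = dict(company)
        if "stock_symbol" in ipo:
            enriched["symbol"] = ipo["stock_symbol"]
        # A projection keeps _id by default, plus the named fields that exist.
        rows.append({k: enriched[k] for k in ("_id", "name", "symbol") if k in enriched})
    return rows


# Made-up sample data.
sample_companies = [
    {"_id": 1, "name": "Acme", "ipo": {"stock_symbol": "ACM"}, "city": "Leeds"},
    {"_id": 2, "name": "Bolt", "ipo": None},
    {"_id": 3, "name": "Core"},
]

print(add_symbol_and_project(sample_companies))
```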

Using geoNear, sort and limit

lands_end = Point([-5.712946243042564, 50.0692015134188])
aggregate ${object:"m101:sample_geospatial.shipwrecks"}
geoNear near=lands_end, distanceField="distanceFromLandsEndKm", spherical=true, distanceMultiplier=0.000156786
project name, distanceFromLandsEndKm
sort distanceFromLandsEndKm asc
limit 5
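To make the distance calculation concrete, the sketch below computes great-circle distances in kilometres with the haversine formula and returns the nearest places. It is not how MongoDB implements geoNear. The function names, the mean Earth radius of 6371.0088 km and the sample places are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def haversine_km(a, b):
    """Great-circle distance between two (longitude, latitude) pairs in degrees."""
    lon1, lat1 = (math.radians(x) for x in a)
    lon2, lat2 = (math.radians(x) for x in b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest(origin, places, n=5):
    """Attach a distance to each place, sort ascending and keep the first n."""
    ranked = [
        {"name": p["name"], "distance_km": haversine_km(origin, p["coordinates"])}
        for p in places
    ]
    ranked.sort(key=lambda r: r["distance_km"])
    return ranked[:n]


lands_end = (-5.712946243042564, 50.0692015134188)

# Made-up places, not real shipwreck records.
places = [
    {"name": "far", "coordinates": (-4.0, 51.0)},
    {"name": "near", "coordinates": (-5.7, 50.07)},
    {"name": "mid", "coordinates": (-5.5, 50.1)},
]

for row in nearest(lands_end, places, n=2):
    print(row["name"], round(row["distance_km"], 1))
```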

Using lookup

aggregate ${object:"m101:sample_mflix.movies"}
lookup from "comments" localField "_id" foreignField "movie_id" as "comments"
match comments > size(0)
limit 1
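The lookup stage behaves like a left outer join. The Python sketch below mimics it in memory and reads the match line above as "keep movies that have at least one comment", which is an interpretation rather than something this page states. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def lookup_comments(movies, comments, limit=1):
    """Join comments onto movies by movie_id, keep movies with comments, limit."""
    by_movie = defaultdict(list)
    for comment in comments:
        by_movie[comment.get("movie_id")].append(comment)

    # Every movie gets a comments array, empty when nothing matched.
    joined = [{**movie, "comments": by_movie.get(movie["_id"], [])} for movie in movies]

    with_comments = [m for m in joined if len(m["comments"]) > 0]
    return with_comments[:limit]


# Made-up sample data.
movies = [{"_id": 1, "title": "X"}, {"_id": 2, "title": "Y"}]
comments = [{"movie_id": 2, "text": "great"}]

print(lookup_comments(movies, comments))
```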

Using bucket

aggregate ${object:"m101:sample_training.companies"}
match founded_year > 1980 and number_of_employees != null
bucket "$number_of_employees" boundaries [ 0, 20, 50, 100, 500, 1000, Infinity ]
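The bucket stage assigns each document to the range whose lower boundary is inclusive and whose upper boundary is exclusive, then by default outputs a count per bucket keyed by the lower boundary. A minimal Python sketch of that rule, using a hypothetical bucket_counts function and made-up values:

```python
import math
from bisect import bisect_right
from collections import Counter

BOUNDARIES = [0, 20, 50, 100, 500, 1000, math.inf]


def bucket_counts(values, boundaries=BOUNDARIES):
    """Count values per bucket; each bucket is identified by its lower boundary."""
    counts = Counter()
    for value in values:
        index = bisect_right(boundaries, value) - 1
        if index < 0 or index >= len(boundaries) - 1:
            # MongoDB also raises an error here unless a default bucket is given.
            raise ValueError(f"{value} falls outside the boundaries")
        counts[boundaries[index]] += 1
    return [{"_id": lower, "count": counts[lower]} for lower in sorted(counts)]


print(bucket_counts([5, 20, 49, 150, 2000]))
```

Buckets that receive no documents do not appear in the output.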

Using bucketAuto

aggregate ${object:"m101:sample_training.companies"}
match founded_year > 1980 and number_of_employees != null
bucketAuto "$number_of_employees" buckets 5

Using unwind

aggregate ${object:"m101:sample_training.companies"}
match number_of_employees>100000
project name, city="$"
unwind "$city"
match city="Tokyo"
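The unwind stage emits one output document per element of an array field. The Python sketch below illustrates this on made-up documents where city holds a list of city names; the unwind function is hypothetical and follows the default MongoDB behaviour of dropping documents whose field is missing, null or an empty array.

```python
def unwind(docs, field):
    """Emit one copy of each document per element of the given array field."""
    out = []
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # missing or null: no output document
        if isinstance(value, list):
            # An empty list produces no output documents.
            for item in value:
                out.append({**doc, field: item})
        else:
            out.append(dict(doc))  # a non-array value passes through unchanged
    return out


# Made-up sample data.
docs = [
    {"name": "A", "city": ["Tokyo", "Osaka"]},
    {"name": "B", "city": []},
    {"name": "C"},
    {"name": "D", "city": "Tokyo"},
]

unwound = unwind(docs, "city")
in_tokyo = [d for d in unwound if d["city"] == "Tokyo"]
print(in_tokyo)
```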