Wednesday, 28 December 2011

Esper Aggregation Examples



I've been playing with more Esper stuff. Here are some quick, useful aggregation examples I've found. Again, if you're looking for the context of these posts, check my 'esper' label.




Aggregating Events into a Single Event

If you have multiple events and wish to aggregate them into a single event, you can use aggregation functions with a window. The window defines the span of time, or number of events, over which the aggregate is evaluated. Aggregation is usually only useful within such a bounded span (there are, however, exceptions).

Given the following data:

{"itemName"=>"test","price"=>100}
{"itemName"=>"test","price"=>200}
{"itemName"=>"test","price"=>300}

We want to return the sum of all these prices every 5 seconds. This can be done with a query that uses a time_batch window and the aggregate 'sum' function:

select sum(price) from OrderEvent.win:time_batch(5 sec)

Which will return the following if the input data is pushed into the stream in the first 5 seconds:

{"sum(price)"=>600}
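
To make the batching concrete, here is a minimal plain-Ruby sketch of what the query computes. It is not Esper code: the `time_batch_sums` helper and the `:at` field (an arrival time in seconds since the stream started) are assumptions made up purely for illustration.

```ruby
# Emulates `select sum(price) from OrderEvent.win:time_batch(5 sec)` in plain Ruby.
# Hypothetical helper: `:at` is an assumed arrival time in seconds.
def time_batch_sums(events, batch_seconds, batches)
  (0...batches).map do |i|
    # Collect the events whose arrival time falls into the i-th batch.
    window = events.select { |e| (e[:at] / batch_seconds).floor == i }
    # An empty batch yields nil rather than 0.
    total = window.empty? ? nil : window.sum { |e| e["price"] }
    { "sum(price)" => total }
  end
end

events = [
  { "itemName" => "test", "price" => 100, :at => 0.5 },
  { "itemName" => "test", "price" => 200, :at => 1.0 },
  { "itemName" => "test", "price" => 300, :at => 4.2 },
]

results = time_batch_sums(events, 5, 3)
results.each { |r| p r }
```

With all three sample events arriving inside the first five seconds, the first batch sums to 600.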

You may notice, however, that afterwards it keeps emitting an event like this every 5 seconds:

{"sum(price)"=>nil}

You can correct this by filtering on the result of the aggregate function. A 'where' clause can't do this, since it is applied to individual events before they are aggregated; for aggregates you use the 'having' clause instead:

select sum(price) from OrderEvent.win:time_batch(5 sec) having sum(price) >= 0

Now an event is only emitted when the sum of the prices is 0 or more. Empty batches, whose sum is nil, fail that comparison, so the nil events are no longer returned.
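
A rough plain-Ruby sketch of that filtering step follows. It assumes SQL-style null handling, where a comparison against nil is never true; the `having_at_least` helper is a made-up name and not part of Esper.

```ruby
# Emulates `having sum(price) >= 0` over a list of per-batch results.
# Hypothetical helper; assumes a comparison against nil counts as "not true".
def having_at_least(batches, threshold)
  batches.select do |batch|
    total = batch["sum(price)"]
    !total.nil? && total >= threshold
  end
end

batches = [
  { "sum(price)" => 600 }, # batch containing the three sample events
  { "sum(price)" => nil }, # empty batch
  { "sum(price)" => nil }, # empty batch
]

emitted = having_at_least(batches, 0)
emitted.each { |b| p b }
```

One side effect worth knowing: a batch whose prices sum to a negative number would also be dropped by `>= 0`. If that matters, testing for null directly (`having sum(price) is not null`) may be a closer fit.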

Other examples of aggregate functions are provided in the documentation.


Selecting Distinct Events from a Stream

Often you want to pick out the distinct events from a series of events within a window (usually either a time window or a fixed number of events).

Let's say you have a series of events:


{"id" => 4, "itemName"=>"test", "price"=>100}
{"id" => 5, "itemName"=>"test", "price"=>100}
{"id" => 6, "itemName"=>"test", "price"=>300}
{"id" => 7, "itemName"=>"test", "price"=>300}
{"id" => 8, "itemName"=>"foo", "price"=>300}
{"id" => 9, "itemName"=>"test", "price"=>500}


You can use the distinct keyword in the select clause to pluck out only the distinct combinations of the selected properties:


select distinct itemName, price from OrderEvent.win:time_batch(1 sec)


Which would return:


{"itemName"=>"test", "price"=>100}
{"itemName"=>"test", "price"=>300}
{"itemName"=>"foo", "price"=>300}
{"itemName"=>"test", "price"=>500}
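
Note that distinct applies to the selected properties taken together (itemName and price), and id plays no part because it is not selected. Here is a small plain-Ruby sketch of the same de-duplication over one batch. It is not Esper code, and `select_distinct` is a hypothetical helper used only to illustrate the idea.

```ruby
# Emulates `select distinct itemName, price` over the events in one batch.
# Hypothetical helper: projects the chosen fields, then drops duplicate rows.
def select_distinct(events, fields)
  events.map { |e| e.slice(*fields) }.uniq
end

batch = [
  { "id" => 4, "itemName" => "test", "price" => 100 },
  { "id" => 5, "itemName" => "test", "price" => 100 },
  { "id" => 6, "itemName" => "test", "price" => 300 },
  { "id" => 7, "itemName" => "test", "price" => 300 },
  { "id" => 8, "itemName" => "foo",  "price" => 300 },
  { "id" => 9, "itemName" => "test", "price" => 500 },
]

rows = select_distinct(batch, ["itemName", "price"])
rows.each { |r| p r }
```

The six input events collapse to four rows, since ids 4 and 5 share the same itemName and price, as do ids 6 and 7.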






