Generating Complex Events from a Partitioned Stream

In my last entry I looked at a monitoring use-case and used CQL to generate complex monitoring events that aggregated a series of underlying, simple monitoring events. The idea was that every three events coming from a particular machine room should produce an outgoing complex monitoring event. In this entry I would like to remove the restriction, made until now, that all monitoring events come from the same machine room.

So, suppose that we have monitoring events coming from many machine rooms. It's somewhat unrealistic, after all, that a large company would have a single machine room hosting all of its computing resources. The events coming from the different machine rooms are all mixed together in a single inbound stream of events, called the 'alerts' stream. The key to handling this more general input stream is the 'partition by' feature of CQL. However, 'partition by' has some important limitations that one should be aware of. Here's an example of a query that uses it:

istream (
  SELECT alerts.machineRoom as machineRoom,
         recent.administrator as administrator,
         sum(alerts.severity) as totalSeverity
  FROM alerts [partition by machineRoom rows 3] as alerts,
       alerts [partition by machineRoom rows 1] as recent
  WHERE recent.machineRoom = alerts.machineRoom
  GROUP BY alerts.machineRoom, recent.administrator
  HAVING COUNT(*) = 3
)




The query above includes the machine room name and system administrator, in addition to the overall severity, in the generated complex alert event. Note the use of 'partition by' in the definition of the event windows in the FROM clause. 'Partition by' converts a single input stream into what can be thought of logically as a set of input streams by partitioning it based on a set of stream attributes. In this case, partitioning is done based on the machineRoom attribute, since we want to aggregate alerts coming from the same machine room. This makes each partition behave as though it were a stream of events coming from a single machine room. The WHERE clause makes sure that the most recent event from each machine room is joined with the most recent three events from that same machine room. The joined relation is then grouped by machineRoom and administrator so that the total severity for the machine room can be calculated.

If we run the query above on the following input stream:

[AlertEvent: machineRoom-mr10, severity-2 administrator-Hoyong]
[AlertEvent: machineRoom-mr11, severity-3 administrator-Robin]
[AlertEvent: machineRoom-mr10, severity-7 administrator-Alex]
[AlertEvent: machineRoom-mr11, severity-5 administrator-Cedric]
[AlertEvent: machineRoom-mr10, severity-1 administrator-Eric]
[AlertEvent: machineRoom-mr11, severity-7 administrator-Thorick]
[AlertEvent: machineRoom-mr10, severity-4 administrator-Anand]
[AlertEvent: machineRoom-mr10, severity-10 administrator-Seth]
[AlertEvent: machineRoom-mr10, severity-1 administrator-Andy]


We would like to see the following output:

[ComplexEvent: machineRoom-mr10 totalSeverity-10.0 administrator-Eric]
[ComplexEvent: machineRoom-mr11 totalSeverity-15.0 administrator-Thorick]
[ComplexEvent: machineRoom-mr10 totalSeverity-15.0 administrator-Andy]
 

But instead we see this:

[ComplexEvent: machineRoom-mr10 totalSeverity-10.0 administrator-Eric]
[ComplexEvent: machineRoom-mr11 totalSeverity-15.0 administrator-Thorick]
[ComplexEvent: machineRoom-mr10 totalSeverity-12.0 administrator-Anand]
[ComplexEvent: machineRoom-mr10 totalSeverity-15.0 administrator-Seth]
[ComplexEvent: machineRoom-mr10 totalSeverity-15.0 administrator-Andy]
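The observed output can be reproduced with a short Python sketch (illustrative only, not OCEP/CQL code; the function and variable names are my own) that models each partition's row window as a buffer of the last three events:

```python
from collections import deque, defaultdict

# Illustrative model of [partition by machineRoom rows 3]: one window per
# machine room holding the last 3 events. The window slides one event at a
# time, so once it is full, EVERY new event yields an aggregate row.
def non_sliding_partitions(events, rows=3):
    windows = defaultdict(lambda: deque(maxlen=rows))
    out = []
    for room, severity, admin in events:
        windows[room].append((severity, admin))
        if len(windows[room]) == rows:
            total = sum(s for s, _ in windows[room])
            out.append((room, float(total), admin))  # admin of most recent event
    return out

alerts = [
    ("mr10", 2, "Hoyong"), ("mr11", 3, "Robin"),  ("mr10", 7, "Alex"),
    ("mr11", 5, "Cedric"), ("mr10", 1, "Eric"),   ("mr11", 7, "Thorick"),
    ("mr10", 4, "Anand"),  ("mr10", 10, "Seth"),  ("mr10", 1, "Andy"),
]

for row in non_sliding_partitions(alerts):
    print(row)
```

Running this prints five rows, ending with the Anand, Seth, and Andy rows for mr10, matching the output above.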
 

Why the extra events? The problem is that the individual window partitions are not sliding in units of three events, as we would like. Thus, after the first three events for a particular machine room arrive, every subsequent event from that same machine room causes the generation of a complex alert.  One is tempted to fix this by doing something like the following:

 
istream (
  SELECT alerts.machineRoom as machineRoom,
         recent.administrator as administrator,
         sum(alerts.severity) as totalSeverity
  FROM alerts [partition by machineRoom rows 3 slide 3] as alerts,
       alerts [partition by machineRoom rows 1] as recent
  WHERE recent.machineRoom = alerts.machineRoom
  GROUP BY alerts.machineRoom, recent.administrator
  HAVING COUNT(*) = 3
)
 

Note the addition of the slide keyword in the window definition. Using the slide keyword, we were able to achieve the desired behavior in the single machine room case. However, CQL does not currently support slide in combination with partition by, so the query above will not even compile. How can we fix it? The trick is to use another feature of CQL: the MATCH_RECOGNIZE clause. This clause provides a general, and very powerful, pattern matching facility. Here is what the MATCH_RECOGNIZE query for our current use-case looks like:


 

SELECT T.machineRoom, T.totalSeverity, T.administrator
FROM alerts
MATCH_RECOGNIZE (
  PARTITION BY machineRoom
  MEASURES
    machineRoom as machineRoom,
    sum(severity) as totalSeverity,
    administrator as administrator
  PATTERN (A*? B)
  DEFINE
    B as count(*) = 3
) as T
 

Notice that this query also partitions the incoming alerts stream based on machineRoom, just like our previous attempts. The MEASURES clause defines the three values computed by the pattern match. These are the same as before: the machine room, the administrator of the most recent alert, and the total severity. The PATTERN clause defines the pattern being matched; in this case, the pattern consists of any sequence of three events. The pattern is applied to a partition of the original alerts stream, so in effect it attempts to match any three events from the same machine room. The DEFINE clause is where we specify the 'three events' constraint: 'count(*)' counts all events that are part of the match. Note also that, by default, freestanding references to stream attributes such as 'machineRoom' in the MEASURES clause refer to the most recent event matched by the pattern. All events that are part of the same match have the same machine room (because of the partition by). For the same reason, 'administrator' refers to the administrator of the most recent event, which may differ from the administrator of other events in the same match, just as in our previous examples. The 'sum(severity)' is computed over all events that are part of the match.


 


One final thing to note about our pattern is the use of the '*?' quantifier. This quantifier has the same meaning as the '*' quantifier: it matches zero or more events. However, '*?' is reluctant, in that it attempts to match the minimum number of events first before going on to attempt a larger match. Had we instead used the '*' quantifier, the match would never have terminated, because '*' is greedy and would attempt to match the largest possible number of events (in this case, all events in the stream) before producing a match. It's important to realize, when thinking about patterns and pattern matching in CQL, that conditions such as A and B are matched independently. In other words, the 'count(*) = 3' condition is not considered when matching A, but only when matching B. In our sample query we count on the fact that 'A*?' will match two events and then allow B to match the third event, producing the overall match of three events.
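The per-partition, tumbling behavior that MATCH_RECOGNIZE gives us here can be sketched in Python (illustrative only, not OCEP/CQL code; the reluctant 'A*?' followed by 'B as count(*) = 3' is modeled simply as "collect three events, emit, reset"):

```python
from collections import defaultdict

# Illustrative model of the MATCH_RECOGNIZE query: per machineRoom
# partition, events accumulate until count(*) = 3 (B matches); the match
# emits its MEASURES and the matched events are consumed, so partitions
# tumble in units of three instead of sliding one event at a time.
def match_three(events):
    partial = defaultdict(list)  # in-progress match per partition
    out = []
    for room, severity, admin in events:
        partial[room].append((severity, admin))
        if len(partial[room]) == 3:              # B matches: count(*) = 3
            total = sum(s for s, _ in partial[room])
            out.append((room, float(total), admin))  # MEASURES: most recent event
            partial[room].clear()                # matched events are consumed
    return out

alerts = [
    ("mr10", 2, "Hoyong"), ("mr11", 3, "Robin"),  ("mr10", 7, "Alex"),
    ("mr11", 5, "Cedric"), ("mr10", 1, "Eric"),   ("mr11", 7, "Thorick"),
    ("mr10", 4, "Anand"),  ("mr10", 10, "Seth"),  ("mr10", 1, "Andy"),
]

for row in match_three(alerts):
    print(row)
```

This produces exactly the three complex events we wanted in the first place: Eric (10.0), Thorick (15.0), and Andy (15.0).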


 


Pattern matching in CQL is a very powerful feature. As we saw in this entry, it can help make up for some current limitations in the CQL language. With that power, however, comes complexity and a good deal of new semantics, even for developers who are familiar with traditional SQL. It's worth studying the documentation and trying some simple pattern matching examples to get the hang of things before jumping into a complicated use-case.


 

