How to create a “heatmap” graph network visualization

What we’re after

@CyberSiftIO we’ve been going through an exercise of adding “confidence levels” to our visualizations. In other words, how confident is the CyberSift engine that an alert really is an anomaly/outlier? The above screenshot shows one of the ways we visualize the output from this exercise. Each blue node is an internal PC/Server, while the other nodes are detected anomalies – ranging from green (low confidence) to red (high confidence). This heatmap-style visualization immediately allows an analyst to focus on those anomalies that really matter. Without the different colors, there may be too many alerts to investigate, but with the heatmap colors and analyst immediately figures out that best start looking at the deep orange alert on the top right corner. In this post we’ll outline how we built the below visualization.

WhatsApp Image 2017-10-16 at 16.54.21.jpeg


The toolset

Here’s the libraries we used:

  • ReactJS as our base framework (optional – in reality any JS framework could be used, we just like how easy and structured ReactJS makes everything)
  • CytoscapeJS as our graph network visualization library
  • D3.js for some helper functions

 


The coding

We’ll assume you have a basic ReactJS app up and running (if not… use create-react-app). The first order of the day is to format our data in a way that CytoscapeJS expects it. In this particular case, this means building an array of objects. Assuming our array of objects is going to be called cystoscape_elementswe first loop through the internal nodes (light blue ones) and push a data object onto this array:

The most important thing to note in the above is that we add an “extra” object attribute named “backgroundColor” and set to “internal“, which we’ll later use for styling these nodes with the appropriate light blue color.


Next, we need to append our external nodes to the cytoscape_elements object. However unlike in our above code, these nodes need to be given a different color depending on their “confidence rating”. Let’s assume that the confidence rating can range from 0 (low outlier score) to 1 (high outlier score). We need to convert this range into a color palette. Fortunately, D3.js allows you to do exactly that, in a simple way:

In the above code, we defined a linear scale with a domain (possible number values) between 0 and 1. In the last line, we map the domain to a custom color range. The first RGB value is the start color which maps to the numerical value of “0”, the middle is the “pivot” value, and the last is the end color which maps to a numerical value of “1”.

Once we have this color scale, we can push external nodes to our array using a similar strategy as above:

Note how  as before, we define a new object attribute called “confidence”, and we subsequently populate this attribute with our previously defined color scale to convert the numerical confidence to an actual RGB color. We’ll use this attribute later to style the node.


Next, we add the “edges” in the graph connecting the nodes together in a pretty straightforward fashion:

Note the use of javascript type coercion (from int to string) as our id, and note the javascript scope of allowing the inner forEach function to increment the id_counter variable.


Last, we put it all together:

In the above code, note the lines:

  • Line 12: CytoscapeJS uses a system of “selectors” which allow you to style different elements of the graph network. In this case, we’re interested in nodes.
  • Line 16: We set the background-color CSS attribute to the value of our “confidence” attribute in the data object (if present) using the notation below (NB: this would be my entire takeaway from this article…)
'data(attribute)'
  • Line 32: We again use selectors to style the internal nodes with a light blue. Recall that internal nodes had a data attribute of “backgroundColor” set to “src“, and we leverage this in the selector:
'node[backgroundColor = "src"]'

The above code in plain English says: select those nodes whose “backgroundColor is set to “src

 


Wrap Up

As you can see, the most important points in this article relate to how to effectively use CytoscapeJS custom data attributes along with CytoscapeJS selectors. Together, they allow you to create very striking visualizations that really communicate a point efficiently to your target audience.

Advertisements

Elasticsearch REST API: JEST upsert

I’ve already written about tips and tricks when using the Elasticsearch Java API. The Elasticsearch REST API has been going from strength to strength, and it seems that going forward the Elasticsearch team will focus more on the REST API than the native JAVA client. At the time of writing however, the official java REST library doesn’t seem to have support for the abstraction of the bulk API, so I followed some advice and looked into the JEST library.

The only snag with the Jest library is that when it comes to bulk operations, the documentation only gives examples of scripted updates. The Elasticsearch update API also allows for updates using partial documents. Jest supports this functionality, but I couldn’t find good documentation for this. Here-under is an example for anyone looking for this:

The important points:

  • You can still use the official java elasticsearch client’s “XContentFactory.jsonBuilder” library to more easily build your JSON objects.
  • The trick is in line 26 above:

jsonBuilder().startObject().startObject(doc”)

This creates a nested object with “doc” as the inner JSON object, as outlined by the elasticsearch documentation:

{
    "doc" : {
        "name" : "new_name"
    }
}

The first “startObject()” creates the outer curly brackets, while the second startObject(“doc”) creates the inner “doc” object.

  • We add content to the JSON object in lines 27-29
  • Just like we had to use two startObject() calls, we need to close the object with two endObject() calls as shown in line 31

The rest of the snippet deals with the actual bulk update. We pass the object we just created into an Update Builder, which gives us a “Bulkable Object” that we can pass on to the jest bulk processor. The snippet is taken from a larger program where it resides in a loop – which explains the if/else clause in lines 37-48; it’s important to “flush” the bulk service every so often. The native java client would to this automatically – so far in Jest you need to account for this yourself