David Vassallo's Blog

If at first you don't succeed; call it version 1.0

Category Archives: Security

Getting started with Neo4J and security data analysis

During a recent study module for an MSc I am undertaking, we discussed the importance of continuous monitoring of data sources as part of a sound security defensive strategy. This led me down a very interesting path, eventually culminating in my discovery of an entire sub-discipline of security that many refer to as “security visualization”. There are plenty of examples of this discipline on the internet; one of the best resources I found is “secviz”, located here:

http://secviz.org/

My focus during this was actually on using graph databases to visualize security data, which in its raw form is simply CSV logs from our corporate firewall. I chose to explore graph databases via Neo4J, an extremely popular open source graph database. This article is intended as a quick reference or high-level overview for anyone who’d like to get started in this field, and also as a reminder to my future self of what I did, so I can pick it up again sometime soon…

Getting Neo4J installed on Ubuntu was child’s play. Copy/pasting the Debian install script provided on the Neo4J site was all it took. Getting a quick overview of Cypher (Neo4J’s declarative query language) is also easy for those coming from a SQL background. What did require thorough reading was the procedure for importing CSVs into Neo4J, found here:

http://docs.neo4j.org/chunked/stable/cypherdoc-importing-csv-files-with-cypher.html

The firewall logs I used in this study came from a Palo Alto firewall, specifically the threat logs. These logs give the source (attacker) and destination (target) IP address of an attack, as well as the name of the attack. So the first step is to open the raw CSV file and whittle it down to the important columns, ending up with the following (IP addresses changed to protect the innocent):

 

Source          Destination   Threat
192.168.7.6     192.168.1.2   Microsoft RPC Endpoint Mapper(30845)
192.168.9.5     192.168.9.4   SSH2 Login Attempt(31914)
95.169.160.24   1.2.3.4       DNS ANY Request(34842)
202.43.161.114  1.2.3.5       Win32.Conficker.C p2p(12544)
195.190.146.40  5.4.7.8       Suspicious or malformed HTTP Referer field(35554)
192.68.7.106    192.168.21.2  Microsoft Windows SMB Negotiate Request(35364)
192.16.2.200    92.168.123.3  Microsoft RPC Endpoint Mapper(30845)
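
The whittling step can also be scripted rather than done by hand. The following is only a rough sketch: the raw column names in KEEP and the sample row (including its timestamp and action) are assumptions made for illustration, so check them against the header row of your own export.

```python
import csv
import io

# Hypothetical raw column names mapped to the names used in this article;
# check the header row of your own export before relying on them.
KEEP = {
    "Source address": "Source",
    "Destination address": "Destination",
    "Threat/Content Name": "Threat",
}


def whittle(raw_file, out_file):
    """Copy only the columns in KEEP, renaming them for the import step."""
    reader = csv.DictReader(raw_file)
    writer = csv.DictWriter(out_file, fieldnames=list(KEEP.values()))
    writer.writeheader()
    for row in reader:
        writer.writerow({new: row[old] for old, new in KEEP.items()})


# Tiny in-memory stand-in for a raw export with extra columns (made-up row).
raw = io.StringIO(
    "Receive Time,Source address,Destination address,Threat/Content Name,Action\n"
    "2014/06/01 10:00:00,192.168.9.5,192.168.9.4,SSH2 Login Attempt(31914),alert\n"
)
slim = io.StringIO()
whittle(raw, slim)
print(slim.getvalue())
```

For a real file, pass two open file handles (opened with newline="") instead of the StringIO objects.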

– Warning –

The logs can be quite voluminous, and I didn’t pay much attention to the various optimizations that can and should be done (like indexing), so some queries can take a significant amount of time to execute.

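The first step is to import that CSV file into Neo4J. The Cypher code I used was:
// Index the address property so the MERGE lookups below stay fast.
// Run this as its own statement, before the import.
CREATE INDEX ON :IP(address);
// Import the CSV, committing in batches to keep memory use down.
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///home/davidvassallo/Downloads/log.csv" AS csvLine
// MERGE reuses a matching node if one exists, otherwise creates it.
MERGE (source:IP { address: csvLine.Source })
MERGE (destination:IP { address: csvLine.Destination })
MERGE (attack:Threat { attack: csvLine.Threat })
// One pair of relationships is created per log line.
CREATE (source)-[:USED]->(attack)-[:AGAINST]->(destination)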

In the first part we load the appropriate CSV file (note the triple slash at the beginning of the file URL on Linux systems) and loop through each line of the CSV file, storing it in the variable csvLine. Breaking down a MERGE command, we have:

[Image: blog_neo4j_1]

The first “MERGE” line basically instructs Neo4J to create nodes of type IP, with an attribute/property called address taken from the “Source” column of the CSV file. If the node already exists, it doesn’t create a new one (hence the use of MERGE instead of CREATE). We do the same for the Destination column in the next line, and finally we create nodes of type “Threat” out of the Threat column of the CSV line. Last but not least, the CREATE line creates the edges (or relationships) between the source, destination, and attack. There are multiple ways to do this, but in my head it came down to:

“A source used an attack against a destination”
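
The difference between MERGE and CREATE in that statement can be mimicked with a toy in-memory graph. This is only an illustration of the semantics described here, not how Neo4J works internally; the names nodes, relationships and merge_node are made up for the sketch, and the rows come from the sample table.

```python
# Rows taken from the sample table; a toy in-memory graph, not Neo4J.
rows = [
    ("192.168.7.6", "192.168.1.2", "Microsoft RPC Endpoint Mapper(30845)"),
    ("192.168.9.5", "192.168.9.4", "SSH2 Login Attempt(31914)"),
    ("192.16.2.200", "92.168.123.3", "Microsoft RPC Endpoint Mapper(30845)"),
]

nodes = {}          # (label, value) -> node id; plays the role of MERGE
relationships = []  # (start id, type, end id); plays the role of CREATE


def merge_node(label, value):
    """Return the existing node id for (label, value), creating it only once."""
    return nodes.setdefault((label, value), len(nodes))


for source, destination, threat in rows:
    s = merge_node("IP", source)
    d = merge_node("IP", destination)
    a = merge_node("Threat", threat)
    # CREATE always adds, so repeated log lines produce repeated edges.
    relationships.append((s, "USED", a))
    relationships.append((a, "AGAINST", d))

print(len(nodes), "nodes,", len(relationships), "relationships")
```

The two RPC Endpoint Mapper rows end up sharing a single Threat node, which is what makes the threat nodes useful hubs when browsing the graph.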

One notices that the Cypher instruction very closely resembles the English sentence I wrote. In the Cypher instruction, nodes are enclosed in parentheses, while relationships are enclosed in square brackets. With that out of the way, we can get to visualizing our data. After pointing our browser to http://localhost:7474, we load up all nodes of type “Threat”:

[Image: blog_neo4j_2]

Note how the generating Cypher command is also given to the user (number 2 above). Even at this stage, without running any queries, we can get some really useful information by double-clicking on the threat nodes. To give a concrete example, it’s worthwhile analyzing the following picture:

[Image: two attackers and the RDP-related threats they used]

You see two attackers (on the right-hand side) who launched a variety of attacks designed to try to break into RDP. I checked the firewall and it really does have RDP open for some reason (a definite plus to using security visualization… auditing your own traffic). One attacker (80.82.65.60) is probably a poor schmuck who got infected by a worm: Morto is a worm that propagates using RDP. The other attacker, at the top of the diagram, is more organized. First he did reconnaissance and found an open RDP port (that’s the “Ncrack RDP” scan), then tested whether the scan was actually correct by trying to use the actual RDP client (that’s the Microsoft RDP initial attempt), then, finding out he couldn’t guess the password, he launched a full-blown brute force attempt against the server.
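
The same kind of reading (one source repeating a single threat versus one source stepping through several) can be roughly approximated straight from the CSV rows. The rows below are made up for the sketch, using documentation-range IP addresses and placeholder threat names rather than real Palo Alto signature names.

```python
from collections import defaultdict

# Made-up rows: (source, destination, threat), placeholder threat names.
rows = [
    ("203.0.113.10", "198.51.100.5", "RDP scan"),
    ("203.0.113.10", "198.51.100.5", "RDP connection attempt"),
    ("203.0.113.10", "198.51.100.5", "RDP brute force"),
    ("203.0.113.77", "198.51.100.5", "RDP worm"),
    ("203.0.113.77", "198.51.100.5", "RDP worm"),
]

threats_by_attacker = defaultdict(list)
for source, _destination, threat in rows:
    # Keep first-seen order and skip repeats of the same threat.
    if threat not in threats_by_attacker[source]:
        threats_by_attacker[source].append(threat)

for source, threats in threats_by_attacker.items():
    print(source, "->", threats)
```

A source with several distinct threats in sequence is worth a closer look in the graph; this grouping is only a starting point and proves nothing by itself.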

I haven’t even begun to scratch the surface of Neo4J… there’s plenty more to discover, but I’ll leave you with some very useful queries I constructed while on the learning journey:

// Find sources and targets of SSH2 login attempts:
MATCH (a)-[r]-(b) WHERE a.attack =~ "SSH2.*" OR b.attack =~ "SSH2.*" RETURN a,b LIMIT 25
// Node traversal (who attacked this node ID?)
// The node ID can be determined by clicking once on the node;
// it will appear in parentheses in the black box that appears on the right
START n=node(233)
MATCH (x)-[r]->(n)
RETURN n,x
// Who used attack ID 78? Return only distinct attackers
START n=node(78)
MATCH (a)-[r]->(n)
RETURN DISTINCT(a),n
LIMIT 500
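
One detail of the first query is easy to miss: Cypher’s =~ operator has to match the whole property value, which is why the pattern ends in “.*”. The sketch below shows the same behaviour with Python’s re module. The third threat name is a hypothetical one, added only to show the difference from a “contains” search.

```python
import re

threats = [
    "SSH2 Login Attempt(31914)",
    "DNS ANY Request(34842)",
    "OpenSSH2 example",  # hypothetical name, added to show anchoring
]

pattern = re.compile(r"SSH2.*")

# fullmatch mirrors Cypher's =~, which must match the entire string.
whole = [t for t in threats if pattern.fullmatch(t)]
# search would also accept names that merely contain "SSH2".
anywhere = [t for t in threats if pattern.search(t)]

print(whole)
print(anywhere)
```

To match “SSH2” anywhere in the attack name from Cypher, the pattern would need a leading “.*” as well.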

Email Error: 450 Client host rejected

A couple of our clients sometimes have issues when sending email, with a returned non-delivery report stating the following:

Peer server rejected email:

450 Client host rejected: ‘cannot find your hostname’

It turns out this is a very strict check (usually performed by Postfix) that is controlled via the directive reject_unknown_client_hostname in the Postfix configuration. The documentation for the directive can be found here:

http://www.postfix.org/postconf.5.html#reject_unknown_client_hostname

As per the link above, the 450 error is returned when:

1) the client IP address to name mapping fails

-or-

2) the name to address mapping fails

-or-

3) the name to address mapping does not match the client IP address. 
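
The three conditions can be sketched as a small check. This is an approximation for illustration only, not Postfix’s actual implementation: the function names are made up, only IPv4 addresses compared as plain strings are handled, and the demonstration uses stub lookups built from the compunet.com.mt example records so that it runs without touching DNS.

```python
import socket


def system_reverse(ip):
    """PTR lookup via the system resolver; returns None on failure."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def system_forward(name):
    """A-record lookup via the system resolver; returns [] on failure."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def check_client(ip, reverse=system_reverse, forward=system_forward):
    """Return which of the three conditions fails, or 'ok'."""
    name = reverse(ip)
    if name is None:
        return "1: address to name mapping fails"
    addresses = forward(name)
    if not addresses:
        return "2: name to address mapping fails"
    if ip not in addresses:
        return "3: name does not map back to the client address"
    return "ok"


# Offline demonstration with stub resolvers built from the example records.
ptr = {"78.133.115.83": "compunet.com.mt"}
a = {"compunet.com.mt": ["78.133.115.83"]}
print(check_client("78.133.115.83", ptr.get, lambda n: a.get(n, [])))
print(check_client("192.0.2.1", ptr.get, lambda n: a.get(n, [])))
```

Calling check_client with just an IP address uses the system resolver instead of the stubs, which gives a quick first look at what a receiving server is likely to see.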

 

The solutions to each of the issues above are all related to the DNS infrastructure:

1) Ensure you have the correct PTR (reverse record) that returns a valid hostname for your outgoing email server’s IP address. Example:

host 78.133.115.83
83.115.133.78.in-addr.arpa domain name pointer compunet.com.mt.

2) Ensure that the hostname returned in the SMTP greeting and that returned in step (1) both resolve back to the correct IP address. Example:

host compunet.com.mt
compunet.com.mt has address 78.133.115.83

3) Ensure that your public IP for the email server matches that returned in (2)
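
If the in-addr.arpa name in the output of step (1) looks odd, it is simply the IP address with its octets reversed. As a small illustration, Python’s standard ipaddress module can derive the name under which the PTR record has to be published:

```python
import ipaddress

ip = ipaddress.ip_address("78.133.115.83")
# reverse_pointer gives the name under which the PTR record is published.
print(ip.reverse_pointer)
```

This is the record name to ask your ISP or hosting provider to set, since the reverse zone is usually controlled by whoever owns the IP block.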
