Introduction In a previous article, we explored how PCA can be used to plot credit card transactions into a 2D space, and we proceeded to visually analyse the results. In this article, we take this process one step further and use hierarchical clustering to automate parts of our analysis, making it even easier for our … Continue reading Analyzing credit card transactions using machine learning techniques – 3
Principal Component Analysis - Introduction and Data Preparation Principal Component Analysis [PCA] is a widely used unsupervised algorithm that reduces dimensionality. A good visual explanation can be found here: http://setosa.io/ev/principal-component-analysis/ As mentioned in our previous article, Correspondence Analysis works exclusively on categorical data. In contrast, PCA accepts only numerical data. This means our data … Continue reading Analyzing credit card transactions using machine learning techniques – 2
Introduction In this 3-part series we'll explore how three machine learning algorithms can help a hypothetical financial analyst explore a real data set of credit card transactions to quickly and easily infer relationships, spot anomalies and extract useful data. Data Set The data set we'll use in this hypothetical scenario is a real data set released … Continue reading Analyzing credit card transactions using machine learning techniques
The link between your smartphone keyboard and current machine learning research in cybersecurity may not be apparent at first glance, but the technology behind both is extremely similar: both leverage deep learning architectures called Recurrent Neural Networks [RNNs], specifically a type of RNN called Long Short-Term Memory [LSTM]. One of the main advantages of … Continue reading What do Smartphone Predictive Text and Cybersecurity have in common?
I've recently published the thesis I wrote in fulfillment of my Master's in Computer Security, entitled BioRFID: A Patient Identification System using Biometrics and RFID. Anyone interested can download and read the whole thesis here: https://www.researchgate.net/publication/317646400_BioRFID_A_Patient_Identification_System_using_Biometrics_and_RFID In this article I'll give an extremely compressed version of the thesis and how the work therein can be … Continue reading Cyber Security: Sparse coding and anomaly detection
A big part of what we do at CyberSift is anomaly detection. The recent WannaCry attack highlighted the growing threat of ransomware in the security landscape. The WannaCry authors may have made amateur mistakes, and there may be more stealthy and profitable attacks than WannaCry, but the negative impact it has had on Windows users … Continue reading Anomaly detection vs Ransomware
Hyperparameter optimization in neural networks is generally done heuristically, by varying each individual parameter such as learning rate, batch size and number of steps. Sklearn automates this with GridSearchCV. Usually Sklearn's examples and documentation are spot on, and copy/pasting an example works with minimal changes. However, this wasn't quite the case with … Continue reading Nugget Post: Skflow GridSearch
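To make the grid search idea concrete, here is a minimal sketch in plain Python of what an exhaustive search over learning rate, batch size and number of steps looks like. It is not Sklearn's GridSearchCV and does no cross-validation. The `grid_search` helper, the `toy_score` function and the grid values are all hypothetical stand-ins; in practice the scoring function would train a network and return a validation score.

```python
from itertools import product


def grid_search(objective, param_grid):
    """Score every combination in param_grid and return the best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    # product() yields one tuple per combination of candidate values
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, batch_size, steps):
    # Hypothetical stand-in for "train a model and return its validation score".
    # It peaks at learning_rate=0.01, batch_size=32, steps=500 by construction.
    return (
        -abs(learning_rate - 0.01) * 100
        - abs(batch_size - 32) / 32
        - abs(steps - 500) / 500
    )


# Assumed candidate values, chosen only for illustration
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
    "steps": [100, 500, 1000],
}

best_params, best_score = grid_search(toy_score, param_grid)
print(best_params, best_score)
```

The number of evaluations is the product of the list lengths (27 here), which is why grid search gets expensive quickly as parameters are added.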