How do you "influence" an ML model? For example, imagine a scenario where you'd like to detect anomalies in a given data set. You reach for your favourite algorithm, in my case Isolation Forest. [Figure: Our example output from Isolation Forest] It does fine for most cases, except that one data point which invariably gets … Continue reading Machine Learning: Oversampling vs Sample Weighting
Decision tree forests rightly get a lot of attention due to their robustness, support for high-dimensional data and ease of interpretation. The best-known uses of decision tree forests are: Classification - given a set of samples with certain features, classify the samples into the discrete classes on which the model has been trained. Regression … Continue reading 3 uses for random decision trees / forests you (maybe) didn’t know about
One common requirement for users of Elasticsearch is to have automatic alerts sent out whenever a query is matched or some other condition is satisfied. In fact, Yelp have come up with a Python-based solution for this in the form of Elastalert, which, at the time of writing, is extremely popular with over 5.5K stars … Continue reading Is it Elastalert? No – it’s NiFi!!
Introduction: In a previous article, we explored how PCA can be used to plot credit card transactions into a 2D space, and we proceeded to visually analyse the results. In this article, we take this process one step further and use hierarchical clustering to automate parts of our analysis, making it even easier for our … Continue reading Analyzing credit card transactions using machine learning techniques – 3
Principal Component Analysis - Introduction and Data Preparation: Principal Component Analysis [PCA] is a widely used unsupervised algorithm that reduces dimensionality. A good visual explanation can be found here: http://setosa.io/ev/principal-component-analysis/ As mentioned in our previous article, Correspondence Analysis works exclusively on categorical data. In contrast, PCA accepts only numerical data. This means our data … Continue reading Analyzing credit card transactions using machine learning techniques – 2
Introduction: In this 3-part series we'll explore how three machine learning algorithms can help a hypothetical financial analyst explore a real data set of credit card transactions to quickly and easily infer relationships, spot anomalies and extract useful data. Data Set: The data set we'll use in this hypothetical scenario is a real data set released … Continue reading Analyzing credit card transactions using machine learning techniques
The link between your smartphone keyboard and current machine learning research in cybersecurity may not be apparent at first glance, but the technology behind both is extremely similar: both leverage deep learning architectures called Recurrent Neural Networks [RNNs], specifically a type of RNN called Long Short-Term Memory [LSTM]. One of the main advantages of … Continue reading What do Smartphone Predictive Text and Cybersecurity have in common?