The Power of Non-Linear Analysis

Much of what an analyst does is linear analysis. These analyses are guaranteed to produce human-readable stories, even if the stories aren't always insightful.

The world in which we live is not linear. In his book 17 Equations That Changed the World, Ian Stewart selects only three equations that are linear; the rest are non-linear.

This shows the limitations of linear analysis in explaining the world around us. A lot of what we experience in life is non-linear, from the flight of a ball in the air (parabolic) to the growth of your savings (exponential).
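To make the savings example concrete, here is a minimal sketch comparing linear growth (simple interest) with exponential growth (compound interest). The function names, the 5% rate and the starting balance are illustrative assumptions, not figures from any real account.

```python
def simple_interest(principal, rate, years):
    """Linear growth: interest is earned on the original principal only."""
    return principal * (1 + rate * years)


def compound_interest(principal, rate, years):
    """Exponential growth: interest is earned on the accumulated balance."""
    return principal * (1 + rate) ** years


# Hypothetical inputs: 1,000 saved at 5% per year.
for years in (1, 10, 30):
    linear = simple_interest(1000, 0.05, years)
    exponential = compound_interest(1000, 0.05, years)
    print(f"{years:>2} years: linear {linear:8.2f}   exponential {exponential:8.2f}")
```

The two curves agree after one year and then pull apart, which is the kind of pattern a straight-line fit will understate.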

What's true of the physical world is also true of the human brain. One example is the way in which our brains use non-linear relationships to evaluate choices. One of the foundational ideas in the field of Behavioral Economics is loss aversion, which is part of Kahneman and Tversky's Prospect Theory.

Loss aversion describes the non-linear relationship between the value we associate with gaining an item and the value we associate with losing that same item. Losing something hurts more than finding the same thing pleases: we would rather not lose something than find that same thing.
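One common way to write this relationship down is the Prospect Theory value function. The sketch below is illustrative only: the function name is hypothetical, and the default parameters (curvature 0.88, loss-aversion coefficient 2.25) are the median estimates reported by Tversky and Kahneman in 1992, brought in here as an assumption rather than taken from this post.

```python
def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Subjective value of a gain (x >= 0) or a loss (x < 0).

    alpha and beta bend the curve (diminishing sensitivity);
    lam > 1 makes losses weigh more than equal-sized gains.
    Defaults are assumed from Tversky and Kahneman (1992).
    """
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)


gain = prospect_value(100)    # subjective value of finding 100
loss = prospect_value(-100)   # subjective value of losing 100
print(f"gain: {gain:.2f}, loss: {loss:.2f}, ratio: {abs(loss) / gain:.2f}")
```

With these assumed parameters a loss carries 2.25 times the weight of an equal gain, and the value of a gain grows more slowly than its size, so the curve is non-linear on both sides of zero.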

Whether we are conscious of this or not, our brains use it every day when we evaluate choices. Protecting what we own is a greater driver of behavior than gaining new things, which is one reason why the insurance market is so large.

A good analyst will overlay this non-linear understanding of the world when interpreting findings. However, it would be useful if analytics software could allow for human-readable non-linear analytics. (Non-linearity is what makes Support Vector Machines so powerful, yet so indecipherable.)

Parallelization Begins

Having built a bullet-proof k-fold analytics engine, we have begun the process of migrating it to a parallel computing framework. As the datasets that Knowledge Leaps processes have grown in both size and number, switching to a parallel framework will deliver scalable improvements in speed and performance. Until now we have limited the number of cross-validations (the k value) to a maximum of 10. With a parallel framework we will be able to increase it further, with a minimal increase in compute time and much improved accuracy calculations.
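The reason k can grow cheaply is that each fold is scored independently of the others, so the folds can run side by side. The following is a minimal sketch of that idea using Python's standard library. It is not the Knowledge Leaps engine: the function names are hypothetical, and the "model" is a placeholder that simply predicts the mean of the training rows.

```python
from concurrent.futures import ProcessPoolExecutor
from statistics import mean


def make_folds(n_rows, k):
    """Split row indices 0..n_rows-1 into k interleaved folds."""
    return [list(range(i, n_rows, k)) for i in range(k)]


def score_fold(args):
    """Fit on rows outside the fold; return mean squared error on the fold."""
    ys, test_idx = args
    held_out = set(test_idx)
    train = [y for i, y in enumerate(ys) if i not in held_out]
    prediction = mean(train)  # placeholder model: predict the training mean
    return mean((ys[i] - prediction) ** 2 for i in test_idx)


def cross_validate(ys, k, executor_cls=ProcessPoolExecutor):
    """Score all k folds in parallel and return the average error."""
    folds = make_folds(len(ys), k)
    with executor_cls() as pool:
        # Folds are independent, so each one can go to its own worker.
        scores = list(pool.map(score_fold, [(ys, fold) for fold in folds]))
    return mean(scores)
```

When using the default process pool, call `cross_validate` from under an `if __name__ == "__main__":` guard so that worker processes can start cleanly on every platform. How much time is actually saved depends on the number of available workers and on the cost of shipping the data to each one.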

Adding parallelization to the batch data engineering functionality will also increase the data throughput of the application. Our aim is to deliver a 10X to 20X improvement in data throughput on larger datasets.