Why Focus On Retail?

The main difference between a retailer built for the web and a physical retailer that predates the internet is that, for the former, a uniquely identified customer is a requirement of every sale. This isn't true of a traditional retailer: in most you can walk in, pay, and remain anonymous. In terms of growing an audience, this puts the web-retailer at an advantage* over the traditional retailer.

Knowledge Leaps uses data engineering and analytics to give traditional retailers a customer view: a view of their business that goes beyond the basket view. This allows retailers to manage and grow their audience in the same way the web-retailers do.

That's how Knowledge Leaps can help.

Why do we want to focus on retail? We believe that retail is important to society. It puts people on the sidewalks and pavements of our cities, making them vibrant places to live and attracting other businesses and services.

*There are a few disadvantages for web businesses: the cost of creating brand awareness and driving traffic (a traditional retailer gets some traffic and awareness-building for free by virtue of its location), the cost of shipping products to customers, and the cost of warehousing inventory.

A Moment To Rave About Server-less Computing

Knowledge Leaps now uses AWS Lambda, a server-less compute technology, to parallelize some of its more time-costly functions.

In layman's terms, servers are great, but they have finite capacity for calculations, much as your own computer can get stuck when you have too many applications open at once or a spreadsheet is just too large.

Server-less computing gives you the benefit of computing power without the capacity issues that a single server brings to the party. On AWS you can use up to 1024 server-less compute functions to speed up calculations. There are some limitations, which I won't go into, but needless to say this technology has cut Knowledge Leaps' compute times by a factor of 50. Thank you Jeff!
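
As a rough illustration of this fan-out approach, here is a minimal sketch using Python, boto3, and a thread pool to invoke many Lambda functions concurrently. The function name "kl-partition-worker" and the payload fields are hypothetical, not the actual Knowledge Leaps implementation.

```python
# Minimal fan-out sketch: invoke one Lambda per data partition, in parallel.
# Assumes a deployed worker function named "kl-partition-worker" (hypothetical).
import json
from concurrent.futures import ThreadPoolExecutor

import boto3

lambda_client = boto3.client("lambda")

def process_partition(partition_key):
    """Invoke the worker Lambda synchronously for one partition of the data."""
    response = lambda_client.invoke(
        FunctionName="kl-partition-worker",
        InvocationType="RequestResponse",
        Payload=json.dumps({"partition": partition_key}),
    )
    return json.loads(response["Payload"].read())

partitions = [f"part-{i:04d}" for i in range(256)]

# Fan the work out across many concurrent invocations instead of one server.
with ThreadPoolExecutor(max_workers=64) as pool:
    results = list(pool.map(process_partition, partitions))
```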

Building An Asset, Being Strategic, Learning Important Lessons

Since shifting from a pure-play services company to building a product-led company, I am now seeing what it means to be strategic.

In building a product, you are investing in an asset. Investing in an asset forces you to make strategic decisions, since the product's features define the course and goals of the company. When resources are limited, decision-making needs to be better, since the direction these decisions impose on your company is costly to undo.

Bootstrapping the development of Knowledge Leaps for the past three years has been eye-opening and a great learning opportunity. The top lessons learnt so far are:

  1. Don't invest money in features that don't make it easier to use the product, today.
  2. Use the product, experience the pain points, then write the scope for the next build.
  3. Get the basics done right before moving on to build more advanced features.
  4. Work with the right team.

Fundamentally, I have learnt that if I am allocating finite resources in ways that have a compounding effect on my company, then I am making the right strategic decisions.

Patent Attorneys and Non-Euclidean Geometry

Now I know why patent attorneys are trained lawyers. A patent isn't so much about invention. It's about owning a territory and arguing that the territory should be as large as possible.

Since invention-space is non-Euclidean, there is always more space to occupy with your invention than is first apparent, although this largely depends on your attorney. Finding this own-able invention-space has been an interesting journey over the past few years.

Through working with my attorney, I also learnt that the USPTO has deadlines and targets, which makes it amenable to negotiation. It is in the USPTO's interest for patent applications to be either abandoned or approved; the middle ground of arguing back and forth is like purgatory for you and for the examiner handling your application. Since the USPTO can't force you to abandon an application, it has to negotiate.

On this note, we've been negotiating with the USPTO and are looking to have some good news soon!

The Power of Non-Linear Analysis

A lot of what an analyst does is linear analysis. These analyses are guaranteed to produce human-readable stories, even if they aren't insightful.

The world in which we live is not linear. In the book 17 Equations That Changed the World, the author Ian Stewart selects only three equations that are linear; the rest are non-linear.

This shows the limitations of linear analysis in explaining the world around us. A lot of what we experience in life is non-linear, from the flight of a ball in the air (parabolic) to the growth of your savings (exponential).

What's true of the physical world is true of the human brain too. One example is the way our brains use non-linear relationships to evaluate choices. One of the foundational tenets of Behavioral Economics is Kahneman and Tversky's Prospect Theory, with its notion of loss aversion.

Loss aversion describes the non-linear relationship between the value we attach to gaining an item and the value we attach to losing the same item. We would rather not lose something than find that same thing.

Whether we are conscious of it or not, our brains apply this every day when we evaluate choices: protecting what we own is a greater driver of behavior than gaining new things, which is one reason why the insurance market is so large.
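
As a rough illustration of this asymmetry, here is a minimal sketch of the Prospect Theory value function in Python. The parameter values (alpha = beta = 0.88, lambda = 2.25) are Kahneman and Tversky's published estimates, used here purely for illustration.

```python
# Sketch of the Prospect Theory value function: gains and losses are valued
# non-linearly, and losses loom larger than equivalent gains.
# Parameters are Kahneman and Tversky's published estimates, for illustration only.

def prospect_value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Subjective value of a gain (x > 0) or loss (x < 0) relative to a reference point."""
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)

# Finding $100 feels good, but losing $100 feels roughly twice as bad.
print(prospect_value(100))   # ~57.5
print(prospect_value(-100))  # ~-129.5
```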

A good analyst will overlay this non-linear understanding of the world when interpreting findings; however, it would be useful if analytics software could produce human-readable non-linear analytics (non-linearity is what makes Support Vector Machines so powerful, yet so indecipherable).

Parallelization Begins

Having built a bullet-proof k-fold analytics engine, we have begun migrating it to a parallel computing framework. As the datasets Knowledge Leaps processes have grown in volume and number, switching to a parallel framework will deliver scalable improvements in speed and performance. Where we previously limited the number of cross-validation folds (the k value) to a maximum of 10, we will now be able to increase it further with only a minimal increase in compute time and much-improved accuracy estimates.
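
To sketch the idea (this is illustrative, not the Knowledge Leaps engine itself): the folds of a k-fold cross-validation are independent of one another, so they can be trained and scored in parallel, for example with a process pool.

```python
# Minimal sketch: evaluate the folds of a k-fold cross-validation in parallel.
# Illustrative only; the model and data here are placeholders.
from concurrent.futures import ProcessPoolExecutor

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

def score_fold(fold):
    """Train and score one fold; folds are independent, so this parallelizes cleanly."""
    train_idx, test_idx = fold
    model = LogisticRegression(max_iter=1000)
    model.fit(X[train_idx], y[train_idx])
    return accuracy_score(y[test_idx], model.predict(X[test_idx]))

folds = list(KFold(n_splits=10, shuffle=True, random_state=0).split(X))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        scores = list(pool.map(score_fold, folds))
    print(f"mean accuracy: {np.mean(scores):.3f}")
```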

Adding parallelization to the batch data engineering functionality will also increase the data throughput of the application. Our aim is to deliver a 10x to 20x improvement in data throughput on larger datasets.

Building the Future of Machine Learning and Analytics. Right Here, Right Now.

TechCrunch recently published an article that describes what I am building with the Knowledge Leaps platform (check out the website here).

Knowledge Leaps is a soup-to-nuts data management and analytics platform. With a focus on data engineering, it is aimed at helping people prepare data in readiness for predictive modeling.

The first step to incorporating AI into an analytics process is to build an application that automates the grunt work. The effort is in cleaning data, mapping it, and converting it to the right structure for further manipulation. It's time-consuming but can be systematized. The Knowledge Leaps application does this, right now. It seamlessly converts any data structure into user-level data using a simple interface, perfect for those who aren't data scientists.
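
To make the idea concrete, here is a minimal sketch (the platform itself is UI-driven; this is not its implementation) of the kind of transformation involved: reshaping transaction-level records into one row per user, ready for modeling. The column names are hypothetical.

```python
# Illustrative only: pivot transaction-level records into user-level features.
# The column names ("user_id", "category", "spend") are hypothetical.
import pandas as pd

transactions = pd.DataFrame({
    "user_id":  ["u1", "u1", "u2", "u2", "u3"],
    "category": ["grocery", "apparel", "grocery", "grocery", "apparel"],
    "spend":    [12.5, 40.0, 8.0, 22.0, 65.0],
})

# One row per user, one column per category, total spend in each cell.
user_level = (
    transactions
    .pivot_table(index="user_id", columns="category", values="spend",
                 aggfunc="sum", fill_value=0.0)
    .reset_index()
)
print(user_level)
```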

Any data can then be used in classification models, using an unbiased algorithm combined with k-fold cross-validation for rigorous, objective testing. This is just the tip of the iceberg of its current and future functionality.
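
As a hypothetical sketch of what rigorous, objective testing means here: each of k folds is held out in turn, so accuracy is always measured on data the model has not seen. The decision tree and synthetic data below are placeholders, not the platform's actual algorithm.

```python
# Sketch of k-fold cross-validated classification; the model choice is a placeholder.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=1)

# Each of the 10 folds is held out once, so every score is out-of-sample.
scores = cross_val_score(DecisionTreeClassifier(max_depth=4), X, y, cv=10)
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```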

Onward, to the future of analytics.

Driver-less Cars: Will They Really Help Uber and Lyft?

I have been thinking about why Uber is so keen to develop driver-less technology.

In its current business model, Uber recruits drivers who (most of the time) have their own cars. These drivers show up with their cars on the app when there are riders to pick up. As rider volume ebbs and flows throughout the day, so does driver volume. It's a near-perfect scalable resource.

Once Uber adopts driver-less car technology, it will no longer need drivers. No more HR and employment headaches. However, it will still need cars. To survive, Uber will become a fleet taxi operation, albeit a fleet of driver-less taxis. If Uber goes all-out driver-less, a key question needs to be answered:

Just how many driver-less taxis will Uber need to have in its fleet?

Does Uber design around peak usage, and as a result recreate the scarcity problem it originally solved (a ride when you want it)? Or does it design around total usage, in which case it will have costly excess capacity sitting idle?

At the moment Uber owns very few cars. If it buys 100,000 cars worldwide, the cost to the business over five years could be $300 million a year (assuming it pays $15,000 per car fitted with driver-less technology), plus running expenses and insurance. These additional costs could be as much as $5,000 per car a year, or $500mn for a fleet this size, giving rise to annual costs of $800mn to operate this fleet. However, a fleet of 100,000 cars wouldn't give Uber sufficient density across the ~500 cities it operates in worldwide to maintain its market position.

The only viable scenario is for Uber to replace its entire driver-owned fleet with driver-less cars. In this scenario it would need to purchase 1 million driver-less vehicles (one for each of the drivers who reportedly work for Uber). Using the same assumptions as before, this would cost Uber $3bn a year, plus annual running costs and insurance of $5bn. All in all, $8bn a year, which is quite a drag on the business that ride fees will need to cover.

Using publicly available data, Uber's gross bookings in 2016 were $20bn from 700mn rides, each ride averaging roughly $28.57 (a figure that might be slightly inflated by UberEats revenue). On this basis, the revenue from ~280mn rides each year would cover the ownership and running costs of the 1mn driver-less fleet; this represents 40% of gross bookings.
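
A minimal worked version of the back-of-envelope arithmetic above, using the assumptions stated in the text ($15,000 per car amortized over five years, $5,000 per car per year in running costs and insurance, $20bn of gross bookings across 700mn rides):

```python
# Back-of-envelope fleet economics, using the assumptions stated in the text.
CAR_PRICE = 15_000          # $ per driver-less car
AMORTIZATION_YEARS = 5
RUNNING_COST = 5_000        # $ per car per year (running costs + insurance)
FLEET_SIZE = 1_000_000      # one car per driver

GROSS_BOOKINGS = 20e9       # $ of gross bookings in 2016
RIDES_PER_YEAR = 700e6

annual_fleet_cost = FLEET_SIZE * (CAR_PRICE / AMORTIZATION_YEARS + RUNNING_COST)
revenue_per_ride = GROSS_BOOKINGS / RIDES_PER_YEAR
rides_to_cover_fleet = annual_fleet_cost / revenue_per_ride

print(f"annual fleet cost:    ${annual_fleet_cost / 1e9:.1f}bn")       # ~$8.0bn
print(f"revenue per ride:     ${revenue_per_ride:.2f}")                # ~$28.57
print(f"rides to cover fleet: {rides_to_cover_fleet / 1e6:.0f}mn "
      f"({rides_to_cover_fleet / RIDES_PER_YEAR:.0%} of all rides)")   # ~280mn, ~40%
```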

At this rate Uber could still be very profitable, but its business model would shift from a gig-economy business to one with high capital costs baked in, one that is less insulated from changes in technology and culture and more exposed to threats from direct competitors.

In many ways, the strength of Uber and similar companies is their ability to recruit and manage a scalable resource to meet demand and grow their businesses long term. However, as many gig-economy businesses are realizing, it is hard to build a company on freelancers. So while the adoption of a driver-less fleet will change the economics of ride-hailing companies, it might be the best next step for Uber et al. in order to maintain growth.

When Do We Start Working For Computers?

I have done some quick back-of-envelope calculations on the progress of AI, trying to estimate how much progress has been made vs. how many job-related functions and activities there are left to automate.

On AngelList and Crunchbase there are a total of 4,830 AI start-ups listed (assuming the two lists contain zero duplicates). To figure out how many unique AI tools and capabilities there are, let's assume the following:

  1. All these companies have a working product,
  2. Their products are unique and have no competitors,
  3. They are all aimed at automating a specific job function, and
  4. These start-ups represent only 30% of the total universe of AI-focused companies.

This gives us a pool of 16,100 unique, operational AI capabilities. These capabilities will be in deep domains (where current AI technology is most successful) such as booking a meeting between two people via email.

If we compare this to the number of domain-specific activities in the world of work, we can see how far AI has come and how far it has to go before we are all working for the computers. Using US government data, there are 820 different occupations, and stock markets list 212 different industrial categories. If we make the following set of assumptions:

  1. 50% of all occupations exist in each industrial category,
  2. Each occupation has 50 discrete activities.

This gives us a total of roughly 4.35 million different occupational activities that could be automated using AI. In other words, at its most optimistic, current AI tools and processes could automate about 0.37% of our current job functions. We have come a long way, but there is still a long way to go before we are out of work. As William Gibson said, "the future is already here, it's just not evenly distributed."
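
A minimal worked version of the back-of-envelope arithmetic above, under the same assumptions:

```python
# Back-of-envelope estimate of how much of the world of work today's AI could
# automate, using the assumptions listed in the text.
LISTED_AI_STARTUPS = 4830          # AngelList + Crunchbase, assumed duplicate-free
COVERAGE_OF_UNIVERSE = 0.30        # listed start-ups as a share of all AI companies

OCCUPATIONS = 820                  # US government occupation count
INDUSTRY_CATEGORIES = 212          # stock-market industrial categories
OCCUPATION_SHARE_PER_INDUSTRY = 0.50
ACTIVITIES_PER_OCCUPATION = 50

ai_capabilities = LISTED_AI_STARTUPS / COVERAGE_OF_UNIVERSE
occupational_activities = (OCCUPATIONS * OCCUPATION_SHARE_PER_INDUSTRY
                           * INDUSTRY_CATEGORIES * ACTIVITIES_PER_OCCUPATION)

print(f"AI capabilities:         {ai_capabilities:,.0f}")          # 16,100
print(f"occupational activities: {occupational_activities:,.0f}")  # 4,346,000
print(f"share automatable:       {ai_capabilities / occupational_activities:.2%}")  # ~0.37%
```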