Building An Asset, Being Strategic, Learning Important Lessons

Since shifting from a pure-play service company to building a product-led company, I am now seeing what it means to be strategic.

In building a product, you are investing in an asset. Investing in an asset forces you to make strategic decisions, since the product's features define the course and goals of the company. When resources are limited, decision-making has to be better: the direction these decisions impose on your company is costly to undo.

Bootstrapping the development of Knowledge Leaps for the past three years has been eye-opening and a great learning opportunity. The top lessons learnt so far are:

  1. Don't invest money in features that don't make the product easier to use, today.
  2. Use the product, experience the pain points, then write the scope for the next build.
  3. Get the basics done right before moving on to build more advanced features.
  4. Work with the right team.

Fundamentally, I have learnt that if I am allocating finite resources in a way that has a compounding effect on my company, then I am making the right strategic decisions.

Patent Attorneys and Non-Euclidean Geometry

Now I know why patent attorneys are trained lawyers. A patent isn't so much about invention. It's about owning a territory and arguing that the territory should be as large as possible.

Since invention-space is non-Euclidean, there is always more space to occupy with your invention than is first apparent, although this largely depends on your attorney. Finding this own-able invention-space has been an interesting journey these past few years.

Through working with my attorney, I also learnt that the USPTO has deadlines and targets, which makes it amenable to negotiation. It's in the USPTO's interests for patent applications to be abandoned or approved; the middle ground of arguing back and forth is like purgatory for you and for the agent handling your application. Since the USPTO can't force you to abandon an application, they have to negotiate.

On this note, we've been negotiating with the USPTO and are looking to have some good news soon!

The Power of Non-Linear Analysis

A lot of what an analyst does is linear analysis. These analyses are guaranteed to produce human-readable stories, even if they aren't insightful.

The world in which we live is not linear. In the book 17 Equations That Changed The World, the author, Ian Stewart, selects only three equations that are linear; the rest are non-linear.

This shows the limitations of linear analysis in explaining the world around us. A lot of what we experience in life is non-linear, from the flight of a ball in the air (parabolic) to the growth of your savings (exponential).

What's true of the physical world is also true of the human brain. One example is the way our brains use non-linear relationships to evaluate choices. One of the foundational tenets of Behavioral Economics is Kahneman and Tversky's Prospect Theory and, within it, loss aversion.

Loss aversion describes the non-linear relationship between the value we attach to gaining an item and the value we attach to losing the same item. We would rather not lose something than find that same thing.
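
This asymmetry is usually expressed as a value function. Here is a minimal Python sketch using the median parameter estimates from Tversky and Kahneman's 1992 paper (alpha = 0.88, lambda = 2.25); the function itself is standard, though the code is only an illustration:

```python
def prospect_value(x, alpha=0.88, lam=2.25):
    """Kahneman-Tversky value function: gains show diminishing
    sensitivity (x ** alpha), while losses are weighted more
    heavily by the loss-aversion coefficient lam."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** alpha

# Losing $100 stings more than gaining $100 pleases:
gain = prospect_value(100)        # ~57.5 units of value
loss = prospect_value(-100)       # ~-129.4 units of value
```

The magnitude of the loss is more than twice the magnitude of the equivalent gain, which is exactly the asymmetry described above.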

Whether we are conscious of it or not, our brains apply this asymmetry every day when we evaluate choices: protecting what we own is a greater driver of behavior than gaining new things. It is one reason why the insurance market is so large.

A good analyst will overlay this non-linear understanding of the world when interpreting findings; however, it would be useful if analytics software could offer human-readable non-linear analytics. (Non-linearity is what makes Support Vector Machines so powerful, yet so indecipherable.)

Parallelization Begins

Having built a bullet-proof k-fold analytics engine, we have begun migrating it to a parallel computing framework. As the datasets that Knowledge Leaps processes have grown in size and number, switching to a parallel framework will deliver scalable improvements in speed and performance. While we had limited the number of cross-validation folds (the k value) to a maximum of 10, we will be able to increase it further with a minimal increase in compute time and much-improved accuracy estimates.

Adding parallelization to the batch data-engineering functionality will also increase the data throughput of the application. Our aim is to deliver a 10X-20X improvement in data throughput on larger datasets.
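
The reason k-fold evaluation parallelizes so cleanly is that each fold is independent of the others. A minimal Python sketch, with a stand-in model (predict the training mean, score by mean absolute error) rather than the actual engine:

```python
from concurrent.futures import ThreadPoolExecutor

def k_fold_indices(n, k):
    """Split range(n) into k contiguous, near-equal test folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def evaluate_fold(data, test_idx):
    """Train on everything outside the fold, score on the fold.
    The 'model' is a stand-in: predict the training mean."""
    test = set(test_idx)
    train = [x for i, x in enumerate(data) if i not in test]
    prediction = sum(train) / len(train)
    return sum(abs(data[i] - prediction) for i in test_idx) / len(test_idx)

def parallel_k_fold(data, k=10):
    """Run all k fold evaluations concurrently and average the scores."""
    folds = k_fold_indices(len(data), k)
    with ThreadPoolExecutor(max_workers=k) as pool:
        scores = list(pool.map(lambda f: evaluate_fold(data, f), folds))
    return sum(scores) / k
```

A process pool, or separate worker nodes, would be the choice for CPU-bound models; the thread pool above just keeps the sketch self-contained. Because the folds run side by side, raising k adds accuracy without a proportional increase in wall-clock time.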

Building the Future of Machine Learning and Analytics. Right Here, Right Now.

TechCrunch recently published an article that describes what I am building with the Knowledge Leaps platform (check out the website here).

Knowledge Leaps is a soup-to-nuts data management and analytics platform. With a focus on data engineering, it is aimed at helping people prepare data in readiness for predictive modeling.

The first step in incorporating AI into an analytics process is to build an application that automates the grunt work. The effort is in cleaning data, mapping it, and converting it to the right structure for further manipulation. It's time-consuming but can be systematized. The Knowledge Leaps application does this, right now. It seamlessly converts any data structure into user-level data through a simple interface, perfect for those who aren't data scientists.
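
To illustrate what "user-level data" means here (the records and field names below are hypothetical examples, not the platform's actual interface), the core move is pivoting event-level rows into one feature vector per user:

```python
from collections import defaultdict

# Hypothetical transaction-level records: one row per purchase event.
transactions = [
    {"user_id": "u1", "category": "grocery", "amount": 12.50},
    {"user_id": "u1", "category": "fuel",    "amount": 40.00},
    {"user_id": "u2", "category": "grocery", "amount": 7.25},
]

def to_user_level(rows):
    """Pivot event-level rows into one feature dict per user:
    total spend, event count, and spend per category."""
    users = defaultdict(lambda: defaultdict(float))
    for row in rows:
        u = users[row["user_id"]]
        u["total_spend"] += row["amount"]
        u["n_events"] += 1
        u["spend_" + row["category"]] += row["amount"]
    return {uid: dict(feats) for uid, feats in users.items()}
```

Once every user is a single row of features, the data is in the right shape for classification models.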

Any data can then be used in classification models built with an unbiased algorithm and k-fold cross validation for rigorous, objective testing. This is just the tip of the iceberg of its current, and future, functionality.

Onward, to the future of analytics.

Driver-less Cars: Will They Really Help Uber and Lyft?

I have been thinking about why Uber is so keen to develop driver-less technology.

In their current business model, they recruit drivers who have cars (most of the time). These drivers show up with their cars on the app when there are riders to pick up. As rider volume ebbs and flows throughout the day, so does driver volume. It's a near-perfectly scalable resource.

Once Uber adopts driver-less car technology, it will no longer need drivers. No more HR and employment headaches. However, it will still need cars. To survive, Uber will become a fleet taxi operation, albeit a fleet of driver-less taxis. If Uber goes all-out driver-less, a key question needs to be answered:

Just how many driver-less taxis will Uber need to have in its fleet?

Does Uber design around peak usage, and as a result recreate the scarcity problem it originally solved: a ride when you want it? Or does it design around total usage, in which case it will have costly excess capacity sitting idle?

At the moment Uber owns very few cars. If it buys 100,000 cars worldwide, the cost to the business over five years could be $300 million a year (assuming it pays $15,000 per car fitted with driver-less technology), plus running expenses and insurance. These additional costs could be as much as $5,000 per car a year, or $500mn for a fleet this size, giving rise to annual costs of $800mn to operate the fleet. However, a fleet of 100,000 cars wouldn't give Uber sufficient density across the ~500 cities it operates in worldwide to maintain its market position.

The only scenario that is viable is if Uber were to replace all of its driver-owned fleet with driver-less cars. In this scenario it would need to purchase 1 million driver-less vehicles (one for each of the drivers who reportedly work for Uber). Using the same assumptions as before, this would cost Uber $3bn a year plus annual running costs and insurance of $5bn.  All in all, $8bn a year. This is quite a drag on the business that ride fees will need to cover.

Using publicly available data, Uber's gross bookings in 2016 were $20bn from 700mn rides, an average of $28.57 per ride (this might be slightly inflated by UberEats revenue). On this basis, the revenue from ~280mn rides each year, or 40% of gross bookings, would cover the ownership and running costs of the 1mn-car driver-less fleet.
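
The arithmetic condenses into a few lines (a back-of-envelope sketch using the same round assumptions as above, not a forecast):

```python
# Assumptions from the text, all rough round numbers.
CAR_PRICE = 15_000           # $ per car fitted with driver-less technology
DEPRECIATION_YEARS = 5
RUNNING_COST = 5_000         # $ per car per year (running costs, insurance)
FLEET_SIZE = 1_000_000       # one car per current driver

GROSS_BOOKINGS = 20e9        # $ gross bookings, 2016
RIDES = 700e6                # rides per year, 2016

# $3bn/year in car purchases + $5bn/year running costs = $8bn/year.
annual_fleet_cost = FLEET_SIZE * (CAR_PRICE / DEPRECIATION_YEARS + RUNNING_COST)

revenue_per_ride = GROSS_BOOKINGS / RIDES            # ~$28.57
break_even_rides = annual_fleet_cost / revenue_per_ride  # ~280mn rides
share_of_bookings = annual_fleet_cost / GROSS_BOOKINGS   # 40%
```

The key output is that roughly 280mn rides a year, 40% of gross bookings, would go just to owning and running the fleet.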

At this rate Uber could still be very profitable, but its business model will shift from a gig-economy business to one with high capital costs baked in, less insulated from changes in technology and culture, and more exposed to threats from direct competitors.

In many ways, the strength of Uber and similar companies is their ability to recruit and manage a scalable resource to meet demand and grow their businesses long term. Although, as many gig-economy businesses are realizing, it is hard to build a company on freelancers. So while the adoption of a driver-less fleet will change the economics of ride-hailing companies, it might be the best next step for Uber et al. in order to maintain growth.

When Do We Start Working For Computers?

I have done some quick back-of-envelope calculations on the progress of AI, trying to estimate how much progress has been made vs. how many job-related functions and activities there are left to automate.

On AngelList and Crunchbase there are a total of 4,830 AI start-ups listed (assuming the two lists contain zero duplicates). To figure out how many unique AI tools and capabilities there are, let's assume the following:

  1. All these companies have a working product,
  2. Their products are unique and have no competitors,
  3. They are all aimed at automating a specific job function, and
  4. These start-ups represent only 30% of the AI-focused company universe.

This gives us a pool of 16,100 unique, operational AI capabilities. These capabilities will be in deep domains (where current AI technology is most successful) such as booking a meeting between two people via email.

If we compare this to the number of domain specific activities in the world of work, we can see how far AI has come and how far it has to go before we are all working for the computers. Using US government data, there are 820 different occupations, and stock markets list 212 different industrial categories. If we make the following set of assumptions:

  1. 50% of all occupations exist in each industrial category,
  2. Each occupation has 50 discrete activities.

This gives us a total of 4.34 million different occupational activities that could be automated using AI. In other words, at its most optimistic, current AI tools and processes could automate 0.37% of current job functions. We have come a long way, but there is a long way to go before we are out of work. As William Gibson said, "The future is already here, it's just not evenly distributed."
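
Putting the assumptions together (a sketch of the back-of-envelope arithmetic above, with every input taken from the stated assumptions):

```python
# Supply side: how many distinct AI capabilities exist?
listed_startups = 4830
coverage = 0.30                          # listed firms = ~30% of the universe
ai_capabilities = listed_startups / coverage        # ~16,100

# Demand side: how many occupational activities could be automated?
occupations = 820                        # US government occupation count
industries = 212                         # stock-market industrial categories
occupation_share = 0.50                  # share of occupations per industry
activities_per_occupation = 50
total_activities = (occupations * industries
                    * occupation_share * activities_per_occupation)  # 4.34mn

automatable_share = ai_capabilities / total_activities   # ~0.37%
```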

Automation: A Bright Future

From reading many articles and posts about the threat AI poses to the job market, I am coming to the view that any automation, whether or not it is a result of AI, is good for long-term economic prospects. Like most economists, I have painted a simplistic view of the economic cycle; nonetheless, I have faith that automation is a force for good.

Automation will help decouple the relationship between falling unemployment and rising inflation, a relationship that can quickly turn an economic boom into a recession.

The accepted view is that rising demand not only increases companies' profits, it also raises inflation as prices rise in response to demand. Rising demand for a company's products and services leads to more hiring to increase output. As economies approach full employment, the cost of labor faces two inflationary pressures: the first from increased demand for labor, the second from increased prices leading to higher wage demands. This creates a familiar cycle: boom -> rising inflation -> correction in the economy -> increased unemployment and falling inflation/prices -> boom -> and so on.

Inserting automation into this cycle will allow companies to increase productivity without increasing labor costs, which erode profits and break the growth cycle. Increasing company profits will lead to increased share prices for public companies. Since many people's retirement savings are invested in the stock market in one form or another, as company profits grow, so will the value of people's retirement savings, making it easier for people to decide to retire.

In short, the right amount of automation could a) reduce an economy's overall demand for labor, and b) provide sufficient long-term stock market gains to support a growing retired section of the population. The latter point matters because automation could reduce the overall demand for labor: if too large a pool of workers chases too few jobs, wages fall, leading to deflation and a stagnant economy. The ideal outcome is that people remove themselves from the labor market, because they can afford to retire sooner, leaving the right balance between jobs and workers. That balance of labor supply and demand allows for moderate inflation, GDP growth, and a stock market that can support a growing number of liberated workers.

From an employment point of view, automation may also create the need for jobs that do not currently exist. Prior to 2007, for example, a marketing department did not need a Social Media Manager; similarly, there were no Gas Station Attendants before the invention of the car. In other words, automation will reduce the need for labor in current roles, as companies look to increase productivity without baking in more labor costs, but it will also create new roles as the labor force is liberated from repetitive tasks.

One area where this is happening is data analysis and data engineering. My web app, Knowledge Leaps, is designed to automate the grunt and grind of data engineering. I am building it because I want people working in these fields to be liberated from the chore of data management, so that they can focus on interpreting and applying the findings.

Probabilities In Plain Sight Part One: Parking Tickets

Probability and chance are baked into much of daily life. Most of the time they are understandable and relate to purely random events; for example, the odds of being struck by lightning in the USA in any given year are roughly one in a million.

Increasingly, I have begun to think about probabilities that are related to human behavior and are less obvious. For example, the cost of a parking fine reflects a number of probabilities: the probability of committing a parking offence and the probability of being caught. Together these determine the rate of capture, and the cost of enforcement must be covered by the fines issued.

For a fixed cost of patrolling the streets, it makes sense that the higher the fine, the fewer offenders there are, or the harder they are to catch. A lower fine would indicate that lots of tickets are issued: a combination of more offenders and offenders who are easier to ticket. The consequences could be counter-intuitive: higher fines should encourage people to park illegally (since a ticket is less likely to be issued), whereas lower fines should discourage illegal parking, since they suggest the rate of ticketing for offenders is much higher.
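
The counter-intuitive part can be made concrete with a toy expected-cost calculation (all numbers hypothetical): a steep fine paired with sparse enforcement can cost the rational illegal parker less, in expectation, than a modest fine paired with heavy enforcement.

```python
def expected_cost(fine, p_ticketed):
    """Expected penalty for one illegal parking event:
    the fine, weighted by the probability a ticket is issued."""
    return fine * p_ticketed

# Hypothetical regimes: a $150 fine with 5% enforcement vs.
# a $40 fine with 40% enforcement.
high_fine_regime = expected_cost(150, 0.05)   # $7.50 expected penalty
low_fine_regime = expected_cost(40, 0.40)     # $16.00 expected penalty
```

If the fine level really does signal enforcement intensity, the steep-fine city is the cheaper place to gamble on an illegal spot.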