ALEXA: State-of-the-art AI

If you want to see how good Alexa is at answering people's questions, sign on to Alexa Answers and look at the questions Alexa cannot answer. The site has gamified helping Alexa answer these questions. I spent a week doing this and figured out a pretty good workflow for staying in the top 10 of the leaderboard.

The winning strategy is to use Google. You copy the question into Google and paste the answer Google returns back into the Alexa Answers website, where it will be played back to the person who asked it. The clever thing is that, since it is impossible to legally web-scrape Google.com at a commercially viable rate, Amazon has found a way of harnessing the power of Google without a) having to pay, b) violating Google.com's TOS, and c) getting caught stealing Google's IP.

After doing this for a week, the interesting thing to note is why Alexa could not answer these questions. Most of them are interpretation errors: Alexa misheard the question (e.g. "connor virus" or "coronda virus" instead of "coronavirus"). The remainder of the errors occur because the question assumes context Alexa doesn't have (e.g. "Is fgtv dead?" - he's a YouTube star), and without the subject of the question being a known entity in Alexa's knowledge graph, the results are ambiguous. Rather than be wrong, Alexa declines to answer.

Obviously this is where the amazing pattern-matching abilities of the human brain come in. We can look at the subject of the question and the search results and choose the most probable correct answer. Amazon can then augment Alexa's knowledge graph with these results. This would probably violate Google's IP if Amazon had intentionally set out to do it.

Having a human perform the hard task in a learning loop is something we have also employed in building our platform. Knowledge Leaps can take behavioral data and tease out price-sensitivity signals from purchase data, as well as semantic signals from survey data.
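As a minimal sketch of this human-in-the-loop pattern (illustrative only; the class and function names are my own, not Knowledge Leaps or Amazon internals): confident predictions flow straight through, ambiguous cases go to a person, and the human's answers are folded back into the training data.

```python
# Illustrative human-in-the-loop loop: the machine handles what it is confident
# about, routes ambiguous cases to a person, and learns from the human's answers.
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

@dataclass
class HumanInTheLoop:
    predict: Callable[[str], Tuple[str, float]]      # returns (answer, confidence)
    ask_human: Callable[[str], str]                  # e.g. an Alexa Answers-style queue
    retrain: Callable[[List[Tuple[str, str]]], None] # fold labels back into the model
    threshold: float = 0.8
    labeled: List[Tuple[str, str]] = field(default_factory=list)

    def answer(self, question: str) -> str:
        answer, confidence = self.predict(question)
        if confidence >= self.threshold:
            return answer                            # machine is confident enough
        answer = self.ask_human(question)            # human resolves the ambiguous case
        self.labeled.append((question, answer))      # keep the human-provided label
        self.retrain(self.labeled)                   # augment the model / knowledge graph
        return answer
```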

AI Developer, A Job For Life.

Last year we wrote about the No Free Lunch Theorem (NFLT) and how it relates to AI (among other things). A recent Wired article suggests this is coming true. Deep Learning, the technology that helped AI make significant leaps in performance, has limitations. These limitations, as reported in the article, cannot necessarily be overcome with more compute power.

As the NFLT states (paraphrased): an algorithm that is good at doing X pays for it by not also being good at doing not-X. Deep Learning models that have success in one area are not guaranteed to have success in other areas; in fact, the opposite tends to be true. This is the NFLT in action, and in many ways specialized instances of AI-based systems were an inevitable consequence of it.

This has implications for the broader adoption of AI. For example, there can be no out-of-the-box AI "system". Implementing an AI solution based on the current state of the art is much like building a railway system: it needs to adapt to the local terrain. A firm can't take a system from another firm or AI-solutions provider and hope it will be a turn-key operation. I guess it's in the name, "Deep Learning". The "Deep" refers to a deep domain, i.e. a specific use case, and not necessarily deep thinking.

This is great news if you are an AI developer or have experience in building AI-systems. You are the house builder of the modern age and your talents will always be in demand - unless someone automates AI-system implementation.

UPDATE: A16Z wrote this piece, which supports my thesis.

Market Research 3.0

In recent years, there has been lots of talk about incorporating Machine Learning and AI into market research. Back in 2015, I met someone at a firm who claimed to be able to scale up market research survey results from a sample of 1,000 to samples as large as 100,000 using ML and AI.

Unfortunately that firm, Philometrics, was founded by Aleksandr Kogan - the person who wrote the app for Cambridge Analytica that scraped Facebook data using quizzes. Since then, the MR world has moved pretty slowly. I have a few theories but I will save those for later posts.

Back on topic, Knowledge Leaps got a head start on this six years ago when we filed our patent for technology that automatically analyzes survey data to draw out the story. We don't eliminate human input; we just make sure computers and humans are each put to their best use.

We have incorporated that technology into a web-based platform: www.knowledgeleaps.com. We still think we are a little early to market, but there may now be enough early adopters around which to build a business.

As well as reinventing market research, we will reinvent the market research business model. Rather than charge a service fee for analysis, we only charge a subscription for using the platform.

Obviously you still have to pay for interviews to gather the data, but you get the idea. Our new tech-enabled service will dramatically reduce the time-to-insight and the cost-of-insight in market research. If you want to be a part of this revolution, then please get in touch: Doug@knowledgeleaps.com.

There is no free lunch in AI

In conversations with a friend from university, I learned about the No Free Lunch Theorem and how it affects the state of the art in machine learning and artificial intelligence development.

Put simply, the No Free Lunch Theorem (NFL) proves that if an algorithm is good at solving a specific class of problems, it pays for this success by being less successful at solving other classes of problems.
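For reference, Wolpert and Macready's original formulation for optimization states that, summed over all possible objective functions \(f\), any two algorithms \(a_1\) and \(a_2\) have identical expected performance:

\[
\sum_{f} P\left(d_m^y \mid f, m, a_1\right) = \sum_{f} P\left(d_m^y \mid f, m, a_2\right),
\]

where \(d_m^y\) is the sequence of cost values seen after \(m\) evaluations. Averaged over every possible problem, no algorithm beats any other; gains on one class of problems are paid for elsewhere.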

In this regard, algorithms, AI loops, and machine learning solutions are like people: training to achieve mastery in one discipline doesn't guarantee that the same person is a master of a related discipline without further training. Unlike people, however, algorithm training might be a zero-sum game, with further training likely to reduce a machine learning solution's competency in an adjacent discipline. For example, while Google's AlphaZero can be trained to beat world champions at chess and at Go, this was achieved using separate instances of the technology: a new rule set was created to win at chess rather than adapting the Go rule set. Knowing how to win at Go doesn't guarantee being able to win at chess without retraining.

What does this mean for the development of AI? In my opinion, while there are firms with early-mover advantage in the field, their viable AI solutions sit in very deep domains that tend to be closed systems, e.g. board games, video games, and making calendar appointments. As the technology develops, each new domain will require new effort, which is likely to lead to a high number of AI solutions and providers. So rather than an AI future dominated by corporate superpowers, there will be many providers, each with domain-distinct AI offerings.

Patented Technology

The patent that has just been awarded to Knowledge Leaps is for our continuous learning technology. Whether it is survey data, purchase data, or website traffic/usage data, the technology we have developed will automatically search these complex data spaces. These data spaces cover the price-demand space for packaged goods, the attitudinal space of market research surveys, and other data where there could be complex interactions. In each case, as more data is gathered - more people shopping, more people completing a survey, more people using an app or website - the application updates its predictions and builds a better understanding of the space.

In the price-demand use case for packaged goods, the updated predictions then alter the recommendations that are made about price changes. This feedback loop allows the application to update its beliefs about how shoppers are reacting to prices and make improved recommendations based on this knowledge.
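A minimal sketch of what such a feedback loop can look like (this is illustrative only and uses an off-the-shelf online learner; it is not the patented method itself):

```python
# Illustrative continuous-learning loop for price-demand (not the patented method).
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor()  # online learner: supports incremental updates via partial_fit

def update_with_new_sales(prices, units_sold):
    """Fold a new batch of observed (price, units sold) pairs into the model."""
    X = np.log(np.asarray(prices, dtype=float)).reshape(-1, 1)  # log-log demand curve
    y = np.log(np.asarray(units_sold, dtype=float))
    model.partial_fit(X, y)  # incremental update - no full retrain needed

def recommend_price(candidate_prices):
    """Recommend the candidate price with the highest predicted revenue.

    Assumes update_with_new_sales has been called at least once.
    """
    candidates = np.asarray(candidate_prices, dtype=float)
    predicted_units = np.exp(model.predict(np.log(candidates).reshape(-1, 1)))
    revenue = candidates * predicted_units
    return candidates[int(np.argmax(revenue))]
```

Each new batch of purchase data shifts the model's beliefs, which in turn shifts the price recommendation.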

In the survey data use case, the technology will create an alert when the data set becomes self-predicting. At this point, capturing further data adds expense without adding to the understanding of the data set.
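A rough way to operationalize that alert (again a sketch under my own assumptions about what "self-predicting" means, not the mechanism described in the patent) is to keep scoring the growing data set out of sample and flag it once extra responses stop improving the score:

```python
# Sketch: decide whether more survey responses are still worth collecting.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def more_data_needed(X, y, previous_score, min_gain=0.005):
    """Return (needed, score); needed is False once the marginal gain is negligible."""
    score = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
    return score - previous_score > min_gain, score
```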

The majority of statistical tools enable analysts to identify relationships in data. In the hands of a human, this is a brute-force approach, prone to human biases and time constraints. The Knowledge Leaps technology allows for a more systematic and parallelized approach - avoiding human bias and reducing human effort.
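For illustration, a systematic, parallelized search might score every pair of candidate predictors and rank them, rather than relying on an analyst to choose which relationships to test (a sketch only; the actual Knowledge Leaps implementation is not shown here):

```python
# Sketch: score every candidate feature pair in parallel and rank them.
from itertools import combinations
from concurrent.futures import ProcessPoolExecutor

from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def _score_pair(args):
    X, y, pair = args  # X is a NumPy array of features, y the target
    model = DecisionTreeClassifier(max_depth=3)
    return pair, cross_val_score(model, X[:, list(pair)], y, cv=5).mean()

def rank_feature_pairs(X, y):
    """Evaluate all feature pairs concurrently and return them best-first."""
    tasks = [(X, y, pair) for pair in combinations(range(X.shape[1]), 2)]
    with ProcessPoolExecutor() as pool:
        scored = list(pool.map(_score_pair, tasks))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```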

When Do We Start Working For Computers?

I have done some quick back-of-envelope calculations on the progress of AI, trying to estimate how much progress has been made vs. how many job-related functions and activities there are left to automate.

On AngelList and Crunchbase there are a total of 4,830 AI start-ups listed (assuming the two lists contain zero duplicates). To figure out how many unique AI tools and capabilities there are, let's assume the following:

  1. All these companies have a working product,
  2. Their products are unique and have no competitors,
  3. They are all aimed at automating a specific job function, and
  4. These start-ups represent only 30% of the universe of AI-focused companies.

This gives us a pool of 16,100 unique, operational AI capabilities. These capabilities will be in deep domains (where current AI technology is most successful) such as booking a meeting between two people via email.

If we compare this to the number of domain-specific activities in the world of work, we can see how far AI has come and how far it has to go before we are all working for the computers. Using US government data, there are 820 different occupations, and stock markets list 212 different industrial categories. If we make the following set of assumptions:

  1. 50% of all occupations exist in each industrial category,
  2. Each occupation has 50 discrete activities.

This gives us a total of 4.34 million different occupational activities that could be automated using AI. In other words, at its most optimistic, current AI tools and processes could automate 0.37% of our current job functions. We have come a long way, but there is still a long way to go before we are out of work. As William Gibson said, "the future is already here - it's just not evenly distributed."
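For anyone who wants to check the arithmetic, here it is written out with the assumptions above as the inputs:

```python
# Back-of-envelope arithmetic using the assumptions above.
listed_startups = 4830            # AngelList + Crunchbase, assumed duplicate-free
coverage = 0.30                   # assume listings cover 30% of all AI companies
ai_capabilities = listed_startups / coverage          # 16,100 unique capabilities

occupations = 820                 # US government occupation count
industries = 212                  # listed industrial categories
share_per_industry = 0.50         # assume half of all occupations exist in each industry
activities_per_occupation = 50

activities = occupations * industries * share_per_industry * activities_per_occupation
# 4,346,000 occupational activities - the ~4.34 million quoted above

print(f"{ai_capabilities / activities:.2%}")          # ~0.37%
```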

Automation: A Bright Future

From reading many articles and posts about the threat of AI to the job market, I am coming to the view that any automation, whether or not it is a result of AI, is good for long-term economic prospects. Like most economists, I have painted a simplistic view of the economic cycle; nonetheless, I have faith that automation is a force for good.

Automation will help decouple the relationship between falling unemployment and rising inflation, a relationship that can quickly turn an economic boom into a recession.

The accepted view is that rising demand not only increases companies' profits, it also raises inflation as prices rise in response to demand. Rising demand for a company's products and services leads to more hiring to increase output. As economies approach full employment, the cost of labor faces two inflationary pressures: the first in response to increased demand for labor, and the second because rising prices lead to higher wage demands. This leads to a familiar cycle: boom -> increasing inflation -> correction in the economy -> increased unemployment and reduced inflation/prices -> boom -> etc.

Inserting automation into this cycle will allow companies to increase productivity without increasing labor costs, which would otherwise erode profits and break the growth cycle. Increasing company profits will lead to increased share prices for public companies. Since many people's retirement savings are invested in the stock market in one form or another, as companies' profits grow, so will the value of people's retirement savings. This will make it easier for people to decide to retire.

In short, the right amount of automation could a) reduce an economy's overall demand for labor, and b) provide sufficient long-term stock market gains to support a growing retired section of the population. This latter point is interesting, since automation could reduce the overall demand for labor: if the pool of workers chasing fewer jobs is too large, wages would fall, leading to deflation and a stagnant economy. The ideal outcome is that people remove themselves from the labor market, because they can afford to retire sooner, leaving the right balance between jobs and workers. The right balance of labor supply and demand will allow for moderate inflation, GDP growth, and a stock market that can support a growing number of liberated workers.

From an employment point of view, automation may also create the need for new jobs that do not currently exist. For example, prior to 2007 a marketing department did not need a Social Media Manager; similarly, there were no gas station attendants prior to the invention of the car. In other words, while automation will reduce the need for labor in current roles, as companies look to increase productivity without baking in more labor costs, it will also create new roles as the labor force is liberated from repetitive tasks.

One area where this is already happening is data analysis and data engineering. My web app, Knowledge Leaps, is designed to automate the grunt and grind of data engineering. I am building it because I want people working in these industries to be liberated from the chore of data management, so that they can focus on the interpretation and application of the findings.

In AI, We Trust.

I think this article sums up the challenges facing the data science community and, by extension, all data analysts. While much of what we are doing isn't in the realms of AI, a lot of the algorithms being used are equally opaque and hard for the human brain to comprehend. There is an allure to the power of these techniques, but without easy comprehension I fear we are moving into an era of data distrust.