ALEXA: State-of-the-art AI

If you want to see how good Alexa is at answering people's questions, you should sign on to Alexa Answers and see the questions Alexa cannot answer. This site has gamified helping Alexa answer these questions. I spent a week doing this and figured out a pretty good workflow to stay in the top 10 of the leaderboard.

The winning strategy is to use Google. You copy the question into Google and paste the answer Google gives back into the Alexa Answers website for it to be played back to the person who asked it. The clever thing is that, since it is impossible to legally web-scrape at a commercially viable rate, Amazon has found a way of harnessing the power of Google without a) having to pay, b) violating Google's TOS, and c) getting caught stealing Google's IP.

After doing this for a week, the interesting thing to note is why Alexa could not answer these questions. Most of them are interpretation errors: Alexa misheard the question (e.g. connor virus, coronda virus, instead of coronavirus). The remainder of the errors are because the question assumes Alexa's knowledge of the context (e.g. Is fgtv dead? - he's a YouTube star) and, without the subject of the question being a known entity in Alexa's knowledge graph, the results are ambiguous. Rather than be wrong, Alexa declines to answer.

Obviously this is where the amazing pattern matching abilities of the human brain come in. We can look at the subject of the question and the results and choose the most probable correct answer. Amazon can then augment Alexa's knowledge graph using these results. This is probably in violation of Google's IP if Amazon intentionally set out to do this.

Having a human being perform the hard task in a learning loop is something that we have also employed in building our platform. Knowledge Leaps can take behavioral data and tease out price sensitivity signals, using purchase data, as well as semantic signals in survey data.

Science Fiction and the No-Free-Lunch Theory

In a lot of science fiction films one, or more, of the following are true:

  1. Technology exists that allows you to travel through the universe at the "speed of light."
  2. Technology that allows autonomous vehicles to navigate complicated 2-D and 3-D worlds exists.
  3. Technology exists that allows robots to communicate with humans in real-time detecting nuances in language.
  4. Handheld weapons have been developed that fire bursts of lethal high energy that require little to no charging.

Yet, despite these amazing technological advances the kill ratio is very low. While it is fiction, I find it puzzling that this innovation inconsistency persists in many films and stories.

This is the no-free-lunch theory in action. Machines that are developed to be good at a specific task are not good at doing other tasks. This will have ramifications in many areas, especially those that require solving multiple challenges. Autonomous vehicles, for example, need to be good at 4 things:

  1. Navigating from point A to B
  2. Complying with road rules and regulations.
  3. Negotiating position and priority with other vehicles on the road.
  4. Not killing, or harming, humans and animals.

Of this list, 1) and 2) are low level. 3) is challenging to solve as it requires some programmed personality. Imagine two cars using the same autonomous software meet at a junction at the very same time: one of them needs to give way to the other. This requires some degree of assertiveness to be built in. I am not sure this is trivial to solve.

Finally, 4) is probably really hard to solve since it requires 99.99999% success in incidents that occur every million miles. There may never be enough training data.


This is the valuing of your own labor rate that takes place after your third or fourth trip back from Ikea. You know you have saved some money buying a set of shelves that you need to assemble but part of the decision was made by assessing how much time it would take to put the item together vs how much you saved.

Ikea is in the business of devaluing our self-perceived labor rate so that they can charge the most for a flatpack item such that the discount achieved justifies the hours needed to assemble the item. (This is the same model used by meal-kit businesses, or at least it should be.)

For items that do not require assembly, the trade-off people have to make is their willingness to buy products that do not comply with the standards for that type of product bought everywhere else. For example, a desk lamp from Ikea requires no assembly so there is no time-cost to save. To justify the lower price there must be an investment from the customer. In this case it is the willingness to accept a non-standard shade fitting. The same is true of other non-assembly products sold at Ikea.

For those folks interested in alternative data, there could be a macro signal regarding wage earnings and wage growth buried in this data. Comparing the price of an Ikea product with a similar (assembled) non-Ikea product over time could be a useful economic indicator.
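
The trade-off described above can be written as simple arithmetic: the saving on a flatpack item divided by the hours needed to assemble it gives the labor rate the buyer is implicitly accepting. The sketch below is only an illustration of that calculation, not a tested indicator. The function name `implied_labor_rate` and every price and assembly time in it are hypothetical, made up to show the shape of the comparison over time.

```python
def implied_labor_rate(assembled_price, flatpack_price, assembly_hours):
    """Hourly rate at which the flatpack saving pays for the buyer's time."""
    if assembly_hours <= 0:
        raise ValueError("assembly_hours must be positive")
    return (assembled_price - flatpack_price) / assembly_hours


# Hypothetical yearly observations (not real prices):
# (year, assembled non-Ikea price, Ikea flatpack price, assembly hours)
observations = [
    (2018, 200.0, 120.0, 2.0),
    (2019, 210.0, 122.0, 2.0),
    (2020, 225.0, 125.0, 2.0),
]

# Track the implied rate over time; a rising series would suggest that
# shoppers are valuing their own time more highly.
rates = {year: implied_labor_rate(a, f, h) for year, a, f, h in observations}

for year, rate in rates.items():
    print(f"{year}: implied labor rate ${rate:.2f}/hour")
```

With real price series in place of the made-up numbers, the year-on-year change in this rate is the quantity that might carry a wage signal.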

Competing For Space vs. Competing For Resources

On a recent visit to Southwest Utah I saw lots of pygmy forests containing pinyon pines and small oak trees. These forests are sparse and the trees are no more than 8-10 feet tall. The National Park literature says that these trees have adapted to low water conditions. Contrast this with the Redwood forests of coastal California where resources (water & sunlight) are abundant. In this environment the trees are more densely packed and grow much taller.

Replace trees with firms and resources with customers, and this paragraph could describe a business landscape. Being binary for a moment, a new firm gets to choose between entering a market where resources (customers) are slim and entering a market where there are lots of customers. Choosing a market with few customers makes it easier to differentiate your firm but the odds of survival are worse. Choosing a market with more customers makes it harder to differentiate your firm and therefore the survival odds are also tough.

Unless of course, your firm is first. In both instances you get to choose the best position and consume all available resources.

Giant Sequoia

Building A Product. Lessons Learned.

Some thoughts on what I have learnt by working in a new company that is building software. A lot of what you "should" do is the wrong thing to do. Here are some reflections on building a firm in San Francisco.

Prospects First

Speaking to prospect firms will get you further, faster than speaking to venture capital firms. Firms that have pain points will pay for solutions and they won't care so much how many other firms have the same pain point. Venture capital firms are interested in size of market, size of outcome, probability of success, experience of the team. Answering a VC's questions won't necessarily help you build a product and a business. If you can't afford to build the software that will answer the pain point you are trying to solve, then work out what you can build and how you can bridge the gap using other means.

Perform The Process By Hand, Before Writing Code

The best business software is first cut by hand, like the first machine screw. If your software replaces a human business process and you can't afford to build the software, ask yourself 'how much can my firm afford to build?'

Most processes have the same elements: Task Specification, Task Execution, Present Results. The most complex part of this is Task Execution, as this will require a lot of code and a lot of investment. As your company speaks to firms, work out if it is possible to use humans to perform the complex Task Execution element. If you think it is, then you should build a software architecture and framework that allows humans to do the hard work at first. This will help you refine the use-case and build more effective and efficient code. This also wouldn't be the first time this has been done, see here and here for more background.
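
One way to picture an architecture that lets humans do the hard work at first is to make the Task Execution step a swappable part. What follows is a minimal sketch under that assumption, not a description of how any particular product is built. Every name in it (`Task`, `make_human_executor`, `automated_executor`, `run`) is hypothetical, and the "human" is simulated with a lookup of answers prepared by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Task:
    """Task Specification: what is being asked, and on what input."""
    description: str
    payload: str


# An executor is anything that turns a Task into a result string.
Executor = Callable[[Task], str]


def make_human_executor(answers: Dict[str, str]) -> Executor:
    """Stand-in for a person doing the work by hand.

    In practice this might be a work queue that a person monitors; here
    `answers` is simply a lookup of results prepared manually.
    """
    def execute(task: Task) -> str:
        return answers[task.payload]
    return execute


def automated_executor(task: Task) -> str:
    """Hypothetical code path, written once the use-case is understood."""
    return task.payload.upper()


def run(task: Task, executor: Executor) -> str:
    result = executor(task)                   # Task Execution
    return f"{task.description}: {result}"    # Present Results


task = Task(description="Summarise", payload="q3 sales")
human = make_human_executor({"q3 sales": "Q3 SALES"})

by_hand = run(task, human)
by_code = run(task, automated_executor)
print(by_hand)
print(by_code)
```

Because `run` only depends on the executor's signature, the human step can later be replaced by code without changing how tasks are specified or how results are presented.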

A useful piece of military wisdom is worth keeping in mind: no plan survives first contact with the enemy. While customers are certainly not the enemy, the sentiment still holds. It's not until you put your plan into action and have firms use your product that you realize its true strengths and weaknesses. Here begins the process of iterating product development.

"Speak to people, we might learn something"

This is what my business development lead says a lot. He also asks questions that get customers and prospects talking. In these moments you will learn about the firm, the buyer, the competition, and lots of other information that will make your product and service better.

"We are just starting out"

This is another useful mantra. In lots of ways we do not know where our journey will take us. It is partly inspired by company vision but also by customer feedback. In Eric Beinhocker's book, The Origin of Wealth, he likens innovation to the process of searching a technology-solution space, an innovation map, looking for high points (that correlate with company profits and growth). The important part of this search process is customer feedback. What your company does determines your starting point on the innovation map; how your firm reacts to customer and market feedback determines which direction you will go in, and ultimately will be a critical factor in its success.

Spam, Bots, and Turing Tests

Since I started my blog I have had 350+ spam comments. They tend to come in waves of similar types of comments. One week they might all be in Russian, the next week they all refer to Houdini and seatbelts. Broadly speaking they fall into two categories: they are either flattering and want me to click on a malicious link, or they make no sense whatsoever.

This latter class of comments is interesting because of its seeming pointlessness - there is no link for me to click on contained in the comment. This got me thinking that when I set the comments to "spam", "trash" or "publish" the author of those comments would get a message back saying: "Your comment has been published/deleted". This could be useful feedback if these comments were generated by a computer and someone was trying to write a bot that could perform natural language processing and maybe even pass the Turing Test. To train the bot you would need lots of examples of text that can be easily parsed from a web page, and where better than a blog to get that sort of information? Each time I set the status of a comment to spam I am helping train a bot and have become an unwitting servant of malicious hackers.


A RoboCoworker for Analytics

You can have a pretty good guess at someone's age based purely on the number of web domains they have purchased and keep up to date. I have 46, and I bought another one the other day. I had in mind an automated coworker that could offer a sense of companionship to freelancers and solo start-up founders during their working day. It's semi-serious and I put these thoughts to one side as I got back to some real work.

Today, I had a call with a prospect for Knowledge Leaps. I gave them a demo and described the use-cases for their industry and role. It dawned on me that I was describing an automated coworker, a RoboCoworker if you will.

This wouldn't be someone you can share a joke with or discuss work issues with, but would be another member of your analytics team that does the work while you are in meetings, fielding calls from stakeholders, or selling in the findings from the latest analysis: what I call real work that requires real people.

Data Moats and Brand Growth

19th C Print Brontosaurus Excelsus by Joseph Smit

Once brands and companies embrace data, the biggest challenge they face for long term growth, and survival, is ensuring the data they control has a broad scope - i.e. it allows them to look to the edges of their vertical, and beyond. When Warren Buffett chooses firms to invest in, he looks at what he describes as their moat: either an economic, IP-related or technological moat. A firm's moat helps insulate it from attack, and allows it to weather economic downturns. In the data age, building broad scope data sources is an extra moat for firms.

Most companies in established verticals are either competing with Amazon or worried they will end up competing with Amazon. As with Google, Amazon gets a wide and detailed view of customer behaviors and trends from its many businesses. Just three of those business units, the online store, Web Services and Amazon Video, provide a rich understanding of consumers and their preferences that Amazon can use to identify opportunities to launch new businesses.

Amazon is clearly ahead of the game, and if they don’t make a misstep the lack of real competition this early on will no doubt allow the power law of growth to take hold in many verticals. As Marc Andreessen wrote in 2011, “Software is eating the world”, and Amazon is doing just this. No doubt Google and Facebook are on a similar path and are equally hungry.

For the rest of the commercial world, sage advice would be to build broad scope data sources that you control, rather than just enhancing analytics capabilities. Data sources that provide insight into incremental audiences are crucial to audience and sales growth.

Firms should invest in creating new behavioral datasets that they can control (analyze, shape, create experiments with). Ultimately future success will be determined by whether firms can create demand for their products by changing people’s behaviors and creating incremental audiences. Looking beyond their verticals and thinking about growing their categories is key to this. This can only be achieved successfully by committing to a data strategy that encompasses developing broad scope and deep data sources as well as advanced analytics.

Data Thoughts and Data Trends

I think a lot about data and about different types of data and who is using it. I produced this table that classifies the relationship with data for firms in different verticals. It also shows the relationship between the use of survey data and the use of behavioral data to understand the performance of their business. Those firms that haven't traditionally required identifying a customer as an individual have had a greater reliance on survey data to understand their customers than those firms where you require an account to buy from them.

All firms use $ sales and profit to measure their performance, but along the way firms need proxy measures - these are either going to come from behavioral data (clicks, foot traffic, etc.) or from survey data (customer satisfaction gathered in surveys).

As more businesses have become customer-data-centric, if not fully customer-centric, there has been an increasing reliance on customer behavioral data rather than survey data to measure performance.

Over the past decade this has been driven by many firms moving from the top-left quadrant to the bottom-right quadrant; notable examples are Apple and many video game developers and publishers.

I wonder if there is a performance inflection point? For example, is there a threshold value for the proportion of your customer base that you have a relationship with, above which business performance weakens, especially in markets where there isn't a monopoly or near-monopoly? Since customers are always churning, customer acquisition is key to driving sales growth. However, having a known pool of customers (current and lapsed) could breed a sense of complacency in firms. Contrast that with firms who do not have a direct relationship with their customers. This induces a certain degree of paranoia that puts customer retention and acquisition at the top of their agenda.