On Maxima: Search, Life and Software.

Until recently, I have wrestled with why people I knew growing up in a small village in the UK stayed in the village when there was a whole world of opportunities awaiting discovery. I have come to realize that life is a search process: a search for purpose, contentment and security. As with most search algorithms, some are better than others. Some people's search algorithms stop when they discover a local maximum - such as village life in the UK. Other algorithms drive people to travel much further.

Software development follows similar principles to a search algorithm. While we might think that we are heading towards a peak when we start out building an application, we soon discover that the landscape we are searching is evolving. If we rush too quickly to a peak we might find that we settle on a local rather than a global maximum. Facebook is a good example of the impact of search speed. The reason that Facebook prevailed is that the many social networking sites that came before it provided the company with a long list of technical and business mistakes to avoid. A major lesson was controlled growth - in other words, a slow search algorithm that avoids the strong temptation, especially where a social network is concerned, to grow very rapidly.

This is an example of a good search process and why it has to be a slow one for long-term success. A slow search allows a process to find a stable solution. Simulated annealing is a good example of this: the random perturbations applied to the search diminish over time as the solution gets closer to the optimum, while the occasional acceptance of a worse solution ensures the search doesn't get stuck on a local maximum.
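The idea can be sketched in a few lines of Python. This is a minimal illustration of simulated annealing, not production code; the cooling schedule and step sizes are arbitrary choices for the sketch:

```python
import math
import random

def simulated_annealing(f, x0, n_steps=10_000, t0=1.0):
    """Minimize f starting from x0 with a slowly cooling temperature."""
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for step in range(1, n_steps + 1):
        t = t0 / step  # temperature decays, so perturbations shrink over time
        candidate = x + random.gauss(0, 1) * max(t, 0.01)
        f_cand = f(candidate)
        # Always accept improvements; occasionally accept a worse move,
        # so the search can escape a local optimum.
        if f_cand < fx or random.random() < math.exp((fx - f_cand) / max(t, 1e-9)):
            x, fx = candidate, f_cand
        if fx < best_fx:
            best_x, best_fx = x, fx
    return best_x, best_fx

random.seed(0)
# A bumpy function: one global minimum plus several local minima.
f = lambda x: x * x + 3 * math.sin(5 * x)
x_best, f_best = simulated_annealing(f, x0=4.0)
```

A greedy search starting at `x0=4.0` would slide into the nearest dip and stop; the slow, noisy search keeps exploring while the temperature is high and only settles as it cools.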

We have also been running our own, slow search algorithm as we build Knowledge Leaps. We have been at this for a long time. App development began five years ago, but we started its design at least eight years ago. While we wanted to go faster, we have been resource constrained. The advantage of this is that we have built a resilient and fault-tolerant application. The slow development process has also helped foster our design philosophy: when we began, we wanted to be super-technical and build a new scripting language for data engineering. Over time our beliefs have mellowed as we have seen the benefits of a No Code / Low Code solution. Incorporating this philosophy into Knowledge Leaps has made the application much more user friendly and stable.

Fear-Free Databases: No Code No SQL – Use Case

We rolled out our No Code Database feature today. Just plug in a data feed and add data to a customizable database with zero lines of code, and zero knowledge of the inner workings of databases. All this in under a minute.

Setting up a database in the cloud is confusing and complex for most people. Our technology puts the power of cloud-based databases at everyone's fingertips. No need for the IT team's intervention. No need to learn remote login protocols. No need to learn any code.

We have also added in some useful aggregation and summarization tools that let you extract data from databases straight into reports and charts. Again, no code required.

Code-free Data Science

There will always be a plentiful supply of data scientists on hand to perform hand-cut, custom data science. For most businesses' requirements, though, the typical data scientist is over-skilled. Only other data scientists can understand their work and, importantly, only other data scientists can check it.

What businesses require for most tasks are people with the data-engineering skills of data scientists, and not necessarily their statistical skills or their understanding of the scientific method.

Data engineering on a big scale is fraught with challenges. While Excel and Google Sheets can handle relatively large (~1mn-row) datasets, there is no comparable software solution that allows easy visualization and manipulation of larger datasets. NoSQL and SQL databases are required for super-scale data engineering, but they demand super-user skills. As the 'data-is-the-new-oil' mantra makes its way into businesses, people will be exposed to a growing number of datasets that are beyond the realm of the software available to them and, potentially, their skill sets.

At Knowledge Leaps we are building a platform for this future audience and these future use-cases. At its core are two important features: Visual Data Engineering pipelines and Code-Free Data Science.

The applications of these features are endless: building a customer data lake, building a custom data pipeline for report generation, or creating simple-to-evaluate predictive models.

Building A Product. Lessons Learned.

Some thoughts on what I have learnt working in a new company that is building software. A lot of what you "should" do is the wrong thing to do. Here are some reflections on building a firm in San Francisco.

Prospects First

Speaking to prospect firms will get you further, faster than speaking to venture capital firms. Firms that have pain points will pay for solutions, and they won't care so much how many other firms share the same pain point. Venture capital firms are interested in the size of the market, the size of the outcome, the probability of success, and the experience of the team. Answering a VC's questions won't necessarily help you build a product and a business. If you can't afford to build the software that will address the pain point you are trying to solve, then work out what you can build and how you can bridge the gap by other means.

Perform The Process By Hand, Before Writing Code

The best business software is first cut by hand, like the first machine screw. If your software replaces a human business process and you can't afford to build all of it, ask yourself 'how much can my firm afford to build?'

Most processes have the same elements: Task Specification, Task Execution, and Presenting Results. The most complex part is Task Execution, as it requires the most code and the most investment. As your company speaks to firms, work out whether it is possible to use humans to perform the complex Task Execution element. If it is, build a software architecture and framework that lets humans do the hard work at first. This will help you refine the use-case and then build more effective and efficient code. This wouldn't be the first time this has been done, see here and here for more background.
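The architecture described above can be sketched as a pipeline whose execution step is pluggable, so a human can stand in for code that isn't built yet. All the names here (`Task`, `human_execute`, `run_pipeline`) are hypothetical, invented for this illustration:

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class Task:
    spec: dict                          # Task Specification
    result: Optional[dict] = None       # filled in by execution

def human_execute(task: Task) -> Task:
    # Placeholder: in practice this would route the spec to an operator
    # (e.g. via a ticket queue) and wait for their answer.
    task.result = {"status": "done-by-human", "spec": task.spec}
    return task

def run_pipeline(spec: dict, execute: Callable[[Task], Task]) -> dict:
    task = Task(spec=spec)              # 1. Task Specification
    task = execute(task)                # 2. Task Execution (human or code)
    return task.result                  # 3. Present Results

report = run_pipeline({"query": "weekly sales summary"}, execute=human_execute)
```

Once the use-case is refined, `human_execute` can be swapped for an automated function with the same signature, without touching the specification or presentation layers.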

A useful piece of military wisdom is worth keeping in mind: no plan survives first contact with the enemy. While customers are certainly not the enemy, the sentiment still holds. It's not until you put your plan into action and have firms use your product that you realize its true strengths and weaknesses. Here begins the process of iterating product development.

"Speak to people, we might learn something"

This is what my business development lead says a lot. He also asks questions that get customers and prospects talking. In these moments you will learn about the firm, the buyer, the competition, and lots of other information that will make your product and service better.

"We are just starting out"

This is another useful mantra. In lots of ways we do not know where our journey will take us; it is part inspired by company vision, part by customer feedback. In Eric Beinhocker's book The Origin of Wealth, he likens innovation to searching a technology-solution space - an innovation map - looking for high points (which correlate with company profits and growth). The important part of this search process is customer feedback. What your company does determines your starting point on the innovation map; how your firm reacts to customer and market feedback determines which direction you go in, and ultimately will be a critical factor in your success.

Arrowheads Vs. Cave Paintings

Cave of Hands (13000 - 9000 BCE), Argentina.

Why Human Data Is More Powerful than Tools or Platforms.

At KL we realize the value of data is far greater than that of either analytic tools or platforms. As a team, we spend a lot of our time discussing data and analytics, especially analytics tools. We used to devote more time to this latter topic, both selecting existing tools and developing new ones, and less time talking about platforms and data. Over time we have come to understand that all three - Data, Platform, Analytics - are vital ingredients to what we do. This is visualized in our logo: we are about the triangulation of all three.

On this journey, I have come to realize that some things take a long time to learn. In my case, when you study engineering, you realize that the desire to make tools (in the broadest sense) is in your DNA - not just your own, but everyone's.

Building tools is what humans do, whether it's a flint arrowhead, the first machine screw or a self-driving car. It's what we have been doing for millennia and what we will continue to do.

I think we are blind to tools because they are so abundant and seemingly easy to produce - as a species we make so many of them. In that sense they are not very interesting, and those that are interesting are soon copied and made ubiquitous.

What is true of axes, arrowheads and pottery is also true of analytics businesses. The reason it is hard to build a tool-based business is that the competition is intense. As a species, this won't stop us trying.

In stark contrast to analytics tools stands the importance of data and platforms. If a flint arrowhead is a tool, then the cave painting is data. When I look at images of cave paintings, such as the Cave of Hands shown above, I am in awe. A cave painting represents a data point of human history; the cave wall is the platform that allows us to view it.

This is very relevant to building a data-driven business: firms that have access to data and provide a platform to engage with it will always find more traction than those that build tools to work on top of platforms and data.

Human data points are hard to substitute and, as a result, are more interesting and have a greater commercial value than tools.

A RoboCoworker for Analytics

You can have a pretty good guess at someone's age based purely on the number of web domains they have purchased and keep up to date. I have 46, and the other day I bought another one: RoboCoworker.com. I had in mind an automated coworker that could offer a sense of companionship to freelancers and solo start-up founders during their working day. It's semi-serious, and I put these thoughts to one side as I got back to some real work.

Today, I had a call with a prospect for Knowledge Leaps. I gave them a demo and described the use-cases for their industry and role. It dawned on me that I was describing an automated coworker, a RoboCoworker if you will.

This wouldn't be someone you can share a joke or discuss work issues with, but it would be another member of your analytics team that does the work while you are in meetings, fielding calls from stakeholders, or selling in the findings from the latest analysis. What I call real work that requires real people.

Scientist Or Analyst?

I've been analyzing data for 30 years.

I've studied science and engineering.

I read about science.

I taught myself to code and created a data engineering and analytics web application (KnowledgeLeaps.com).

I have thought a lot about my work as I have developed my domain expertise.

One of the things I have come to realize is that if you are designing and running experiments to prove or disprove a hypothesis, then you are performing science. You may even call yourself a scientist. Testing a hypothesis requires evidence, usually in the form of objective data. If you are a scientist, you use data, no matter what the discipline. You can't be a scientist without data. The term data in Data Science is redundant, like calling yourself a Religious Priest or an Oral Dentist.

In contrast, if you use data to look for a story or a correlation (causal or otherwise) without testing a hypothesis, you aren't a scientist - you are an analyst. In this case a qualifying noun is useful (data, systems, etc.).

I suspect most people who call themselves data scientists are actually analysts who have taught themselves to write code. This isn't science. Science is the method by which we create persistent knowledge; code is just a tool we use to process data to test a hypothesis.

Silos: Bad For Business, People and Data

While keeping people in silos is a good thing for managing and directing them, it tends to be bad for business in the long run. Especially for businesses that rely on innovation for growth.

In the book The Medici Effect, the author describes how the wealthy 15th-century house of Medici created the conditions that led to the Renaissance - a period when there was an explosion of ideas across the arts and sciences. This was only possible because the family's wealth supported artists from different disciplines who shared ideas - a lesson for companies that want to innovate.

What's true of people is also true of data. Not all data is created equal, and as a result it tends to be put in silos determined by source (transactions, surveys, CRM, etc.). Different data has different degrees of meaningfulness: transaction data tends to be narrow but very deep (telling you a lot about a very narrow field), whereas survey data tends to be broad but less deep. Combining data with different strengths can uncover new insights. Linking transaction data with survey data, for example, can identify broader behavior drivers that increase sales and customer engagement.

In our mind, silos are bad for data too. They prevent data owners from making new discoveries that arise from merging a customer's data.

Knowledge Leaps de-silos your data, creating a single customer view and allowing companies to look at the drivers, interactions and relationships across different types of data, whether it's transactions, surveys or CRM data.
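Under the hood, a single customer view boils down to joining siloed datasets on a shared customer key. A minimal sketch using pandas, with invented column names and toy values purely for illustration:

```python
import pandas as pd

# Deep but narrow: individual purchase records.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "spend": [12.5, 7.0, 30.0, 4.5],
})

# Broad but shallow: one survey response per customer.
survey = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "values_sustainability": [True, False, True],
})

# De-silo: aggregate the deep data to one row per customer,
# then merge it with the broad data into a single customer view.
per_customer = transactions.groupby("customer_id", as_index=False)["spend"].sum()
single_view = per_customer.merge(survey, on="customer_id", how="left")
```

With the two sources combined, questions neither silo can answer alone (do sustainability-minded customers spend more?) become a one-line aggregation on `single_view`.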


Why Focus On Retail?

The main difference between a retailer built for the web and a physical retailer predating the internet is that, for the former, the concept of a unique customer is a necessary requirement to complete any sale. This isn't true of a traditional retailer: in most stores you can walk in, pay, and remain anonymous. In terms of growing an audience, this puts the web retailer at an advantage* over the traditional retailer.

What Knowledge Leaps can do is use data engineering and analytics to provide a customer view - a view of the business that goes beyond the basket view. This allows retailers to manage and grow their audience in the same way web retailers do.

That's how Knowledge Leaps can help.

Why do we want to focus on retail? We believe that retail is important to society. It puts people on the sidewalks and pavements of our cities, making them vibrant places to live as well as attracting other businesses and services.

*There are a few disadvantages for web businesses: the cost of creating brand awareness and driving traffic (compare this to the traditional retailer, which gets some free traffic by virtue of location as well as some free awareness building), plus the cost of shipping products to customers and the cost of warehousing inventory.

Stories Vs. Predictions

Having worked within one industry (market research) for some time, I am intrigued by how other occupations use data, particularly data scientists. After a conversation with a new friend - a data scientist - last week, I had a revelation.

Data scientists have created new words to talk about data analysis. The ones that stand out are features and feature sets. Quantitative market researchers talk about questions and surveys but never features. Essentially, they are the same thing: features are traits, attributes and behaviors of people that can be used to describe or predict specific outcomes. The big difference is that data scientists don't care so much whether the features are human-readable (i.e. whether they can be read and understood like a book), as long as they help make a prediction. For example, Random Forests make good predictors but aren't easily understandable. The same is true of Support Vector Machines: excellent predictors, but in higher dimensions they are hard to explain.
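The trade-off between predictive power and readability is easy to see side by side. A hedged sketch using scikit-learn on synthetic data (the dataset and parameters are arbitrary choices for this illustration, not a benchmark):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic "survey-like" data: 20 features, 5 of them informative.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A Random Forest: typically a strong predictor, but its hundred
# trees don't read like a story.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# A logistic regression: often weaker, but each coefficient is
# human-readable ("this feature raises the odds of the outcome").
linear = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

forest_acc = forest.score(X_te, y_te)
linear_acc = linear.score(X_te, y_te)
```

The forest's prediction is the vote of many deep trees; the linear model's is a weighted sum you can narrate. Which one you reach for depends on whether the prediction or the story matters more.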

In contrast, market researchers are fixated on predictive features being human-readable. As data science has shown, a market researcher's predictions - their stories - will tend to be weaker than those of a data scientist. This in part explains the continued trend of story-telling in market research circles. Stories are popular and contain some ambiguity, and this ambiguity allows people to take from them what they wish. This is an expedient quality in the short term but damaging to the industry in the long term.

I think market researchers need to change; my aim with Knowledge Leaps is to try to bridge the gap between highly predictive features and human-readable stories.