‘Yesterday's big data is today's small data’


Enda Ridge is the Head of Data Science and Algorithms at one of the largest supermarkets in the UK and the author of Guerrilla Analytics – a practical approach to working with data.

He is an accomplished data scientist who sets up and grows data science teams to help businesses leverage data to stay competitive.

Ridge’s experience spans 15 years of consulting, software pre-sales and academic research. He has consulted to clients in the public and private sectors including financial services, insurance, audit, IT security and retail.


He is an expert in the agile delivery of practical data science, where data and requirements change often and results must be explainable to high-profile business and regulatory stakeholders. In this interview with CX Network, he talks about why it’s an exciting time to work in data, the successes of artificial intelligence, and why the buzzword 'big data' is overrated.


As an introduction, can you share a brief overview of your background in data science and data analytics?

I've worked in analytics and data science for almost 15 years. I got into data during my Computer Science PhD, where I was analysing artificial intelligence algorithms to understand the problems they work on, where they break down and what affects their performance.

This got me into experimental design methods, statistics and of course analysing the experimental data I collected. I then worked in software development, using social networks to find fraud, and in several global consultancies doing forensic analytics to investigate fraud, bankruptcies, mis-selling, etc.

 

There is no such thing as 'big data'.


Time and time again I have witnessed teams struggling with the same fast-paced, pressured work environments where the inherent complexity of analytics and data science led to confusion and management difficulty. So I wrote a book, Guerrilla Analytics, on how to operate these types of teams efficiently and effectively with light-weight principles.

Most recently I've moved into helping enterprises and start-ups build and operate data science teams that deliver practical and explainable benefits at pace, as this seems to be a huge blocker for organisations getting started on their journey.

First things first, what do you think is the most overused buzzword in the data science & data analytics industry that we should leave behind in 2017 as we move into the new year?

For me, it has to be 'big data'. There is no such thing. There has always been a technological barrier to the size and complexity of data that can be analysed. And that barrier is always moving as technology advances.

Yesterday's big data is today's small data, so businesses need to understand that what matters is robust analytics and scientific approaches to analysing business data and drawing conclusions. Statistics and mathematics will take care of how confident we are with the data we are able to process today.
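As a rough illustration of how statistics quantifies confidence in whatever data can be processed, here is a minimal sketch of a percentile bootstrap confidence interval for an average. The basket values are simulated and the function name `bootstrap_mean_ci` is hypothetical; this is one common technique, not a method described in the interview.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Simulated (made-up) basket values for a small and a larger customer sample.
rng = random.Random(42)
small_sample = [rng.gauss(30, 10) for _ in range(50)]
larger_sample = [rng.gauss(30, 10) for _ in range(2000)]

small_ci = bootstrap_mean_ci(small_sample)
large_ci = bootstrap_mean_ci(larger_sample)

print("50 customers:  ", small_ci)
print("2000 customers:", large_ci)
```

The interval from the larger sample is narrower, but the small sample still yields a usable estimate with an honest statement of its uncertainty.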

 

"There is more to understanding our customers than averages."


Of course there are genuine applications where data volumes are incredibly large but for the vast majority of businesses starting on a data science journey, they should first focus on finding value with the data they have.

And what is the most underrated tool, tech or metric that not enough people know about or are using but they should?

My observation across many years leading analytics and then data science teams is that there is still not enough rigour in terms of how data is interpreted. The fundamentals of statistics and particularly experiment design are essential when we wish to draw conclusions from data that can be applied with confidence to business problems.

With data science and analytics teams, you sometimes find an over-reliance on the 'program libraries' and visualisations, but not enough interrogation of the factors that might affect a model, the noise that cannot be removed from data, the inherent bias in data, assumptions of linearity, etc.

These challenges have been recognised by statisticians for over a century and it's important that this knowledge is not lost amid the hype surrounding all things 'data' and big data technology.

"There are two important things when it comes to action. The first is being prescriptive. The second is being able to automate action in data products."


There is more to understanding our customers than averages and ranges, and trials that are not properly designed can lead a business to draw the wrong conclusions. So in this era of technology hype, artificial intelligence hype and so on, we must not lose sight of the value in thinking hard about the problem we are solving and finding the right data and approach to solve that problem. This is where experiment design is invaluable.
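To make the point about designed trials concrete, here is a minimal sketch of a randomised two-group trial analysed with a permutation test. Everything in it is an assumption for illustration: the customer IDs, the simulated weekly spend, the made-up uplift of 10 and the function name `permutation_p_value`. It shows one standard approach, not the one used by Ridge's teams.

```python
import random
import statistics


def permutation_p_value(control, treatment, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = statistics.fmean(treatment) - statistics.fmean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    extreme = 0
    for _ in range(n_permutations):
        # Reshuffle the labels: under "no effect", any split is equally likely.
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[n_control:])
                - statistics.fmean(pooled[:n_control]))
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (n_permutations + 1)


rng = random.Random(1)
customers = list(range(400))   # hypothetical customer IDs
rng.shuffle(customers)         # random assignment guards against selection bias
control_ids, treatment_ids = customers[:200], customers[200:]

# Simulated weekly spend; the treatment group gets a made-up uplift of 10.
control = [rng.gauss(30, 10) for _ in control_ids]
treatment = [rng.gauss(40, 10) for _ in treatment_ids]

observed, p_value = permutation_p_value(control, treatment)
print(f"observed uplift: {observed:.2f}, p-value: {p_value:.4f}")
```

Random assignment is the design step that matters here: without it, a difference in averages could simply reflect which customers happened to be chosen for each group.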

As you already mentioned, one data topic that is often touched upon within the industry is big data. How can organisations ensure that the masses of information gathered are used to drive real value to the business? In other words: how do you turn insight into action?

There are two important things when it comes to action. The first is being prescriptive. The second is being able to automate action in data products. On the first, it always disappoints me that so much of data analytics and data science stops at a 'descriptive' insight.

An analytics team might report "this is the current sales trend" or a data science team might report that "this model predicts your customers' next shops", but what is really needed is to go further with your business stakeholder and say "now this is the action you should take".

If this is the trend, then what should we be doing about it? If this model were to make its predictions about customers, what would that be worth?


The second important aspect of turning insight into action is being able to develop data products. If an analytics insight or a data science model and algorithm is going to support decision making again and again, it needs to be made available in software.

That means having access to infrastructure and small development teams that can mobilise quickly, can work with analysts and data scientists, are customer-focused and can make user-friendly software that supports business decision making.


You've previously spoken at CX Network Live about setting up a data science capability, and how to overcome the key challenges when doing so. What have been your biggest learnings when implementing this? And if you could go back in time, is there anything you would do differently?

Time travel! That would be interesting! I think my biggest learnings have been around getting the right mix of people and around picking the right battles. I've learned that data science and analytics have a large number of dependencies. They need a particular technology stack, they need access to data and, in some cases, they need software developed to automate their decision making and reporting.

These days I would always make sure that engineers, tech infrastructure owners and product owners work alongside data scientists and analysts. Having them in separate teams creates friction. You need to be able to assemble mixed teams that build capabilities instead of seeing work as belonging to either 'engineering' or 'analytics'.

 

"The biggest challenge is probably trust. What is the right balance between trusting the machine completely and having the human in the decision loop?"


Picking the right battles means being able to choose projects that will be successful, that will earn you cheerleaders in the business and so will help raise awareness of the capability you are trying to build. These projects are a tricky mix of importance to the business, technical complexity, business engagement, data availability and of course monetary benefit. You would be surprised at how many teams become mired in complex projects or even whimsical analyses that will never change the business.

Looking at the future, many people are talking about the trends such as artificial intelligence, machine learning and smart devices. But, realistically, how will they really impact the data industry in 2018 and, more specifically, your strategy?

It really is an exciting time to be working in data. The great thing about the success of artificial intelligence tools, like search engines and digital assistants, is that AI is now a regular talking point in the boardroom. Most forward-thinking business leaders recognise the need to get on the bandwagon but they struggle to get started.

In 2018, I hope to see businesses continue to take small steps into the world of machine learning and artificial intelligence. They should think about how devices can capture valuable data about their customers, their processes and their decision making.

They should then think about how much of the mundane work of their staff could be automated or augmented with machine learning. This should be seen as an opportunity to free staff up to focus on thinking about what the results mean rather than churning out analyses they don't have time to think through.

 

"The great thing about the success of artificial intelligence tools [...] is that AI is now a regular talking point in the boardroom."


The biggest challenge is probably trust. What is the right balance between trusting the machine completely and having the human in the decision loop? There will be early successes but also failures, and it is important these failures do not slow down progress.

Finally, what piece of advice can you give to customer insight, data science & data analytics leaders who are nearer the start of their customer data journey to ensure they launch on the road to success?

Watch my webinar! I talk about the main challenges you will face in the first year and how to overcome them. If you have a team already, have a look at my book Guerrilla Analytics, which has almost 100 practice tips to help teams do analytics and data science in an agile, explainable way. And, finally, get started with something small – a few people, small amounts of data and minimal technology dependencies.

This interview is from The Big Book of Customer Insight, Data & Analytics 2017.

