What data can do, and what you will need to do yourself

January 2nd by Joe Cardillo

People constantly oversimplify the case for using data.

You wouldn’t know this by looking at headlines; the Big Data movement is championed by a variety of thought leaders as a cure-all…

The graph for people who actually use Big Data vs. those who just talk about it probably looks something like this.

The problem isn’t really Big Data itself; rather, it’s the general mindset that people have about what data means and what you can do with it (well, that and they want to look smart).

Simply put, the data doesn’t lie but the people using it do, or at the very least our belief systems influence what we think it means—something that’s dangerously at odds with the common mantra “the data will tell you what’s important.”

Working with data is about more than observing; it’s about understanding.

Sadly, as technologist Gunther Sonnenfeld notes, many people don’t know what an insight really is.

An observation, he says, looks something like this:

Whereas an insight looks more like this:

If you’re fairly intuitive it’ll be easy to see that the distinction between observation and insight is crucial, but in order for the latter to be valuable and useful, there’s one more thing that needs to happen: it needs to be directly relevant to you and/or your business.

Large scale studies on leadership and motivation may give you insight into what employees need and want, but they don’t automatically give you insight into what YOUR employees need and want…or why your users / customers are with you or headed elsewhere.

This suggests a couple of things:

Medium and small data are contextually relevant and probably more critical to the vast majority of businesses than massive data caches and analysis.
Knowing what questions you are trying to answer before you start is a must, not a nice to have.

I was reminded of this the other day driving back after an oil change.

The pickup truck next to me at a traffic light was leaking a fairly steady stream of liquid from just behind the front axle, right below the engine.

Since I’m at least a moderately good citizen I started to roll down my window to let the driver know, when I noticed that the front left tire was wet, which led me to quickly discover that the back tires were wet, and that the back bumper was dripping too, all signs that the truck had just been through a car wash.

Consider that in the span of just a few seconds I went from interpreting one piece of data that appeared alarming, to understanding several, and then by considering them in context I was able to determine that not only was the initial information not dangerous, it was actually a positive sign, that of a clean car.

Not only is this type of scenario possible in a business environment, it happens all the time.

After 6 years of working in project management, product operations, and growth roles I constantly think about measuring and understanding nearly everything because I know that data at its best merely explains how elements interact within an ecosystem you’ve created, and gives you insight into how healthy that ecosystem is.

If you’re responsible for understanding and acting on data—particularly if you’re a CEO or founder—it’s absolutely critical that you understand this distinction, and it’s one of the reasons that our core capability and favorite thing to work on at ArCompany is helping the C-suite get the right kinds of information, pull out the insights, and apply them in a way that’s directly relevant.

With that in mind, here are a few ways you can have a healthy (and growth oriented) relationship with data:

Be curious not just about the data your business generates, but also how it’s created.

One of the realities in nearly any modern business is that we are quickly overwhelmed with data—things like average time to purchase, how often people come back, when / if they refer others to you, if they came directly from a competitor’s site; these are all data points that you can have with even a handful of customers.

But none of these things are automatically valuable. If your average time on page goes up it could mean that a good chunk of people are interested in something you have on the page, that a bunch of them simply left the window/tab open, or even that they hate you and are studying your website, platform, or product for weaknesses.
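To make the time-on-page point concrete, here is a small Python sketch. The numbers are made up for illustration, and the `summarize` helper and the 30-minute `idle_cutoff` are assumptions, not something from a real analytics tool. It shows how a couple of tabs left open can drag the average far away from what a typical visit looks like.

```python
from statistics import mean, median

# Hypothetical time-on-page samples in seconds (invented for illustration).
# Most visits are short; two look like tabs left open in the background.
visits = [35, 42, 50, 38, 61, 47, 3600, 5400]


def summarize(samples, idle_cutoff=1800):
    """Return the mean, the median, and the share of sessions above idle_cutoff."""
    idle = [s for s in samples if s > idle_cutoff]
    return {
        "mean": mean(samples),
        "median": median(samples),
        "idle_share": len(idle) / len(samples),
    }


summary = summarize(visits)
print(summary)
```

The mean lands in the tens of minutes while the median stays under a minute, so the same data supports two very different stories depending on which number you report and whether you know how the sessions were recorded.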

Think back to the example of the truck dripping water. Among the possible questions: how fast is the liquid dripping, what kind of liquid is it, where is it dripping from, is it increasing or decreasing, if it’s from a specific place does that place have a bunch of liquid available to lose or just a small amount? Figuring out where and how your data is created will keep you from making the mistake of thinking that the data will tell you why it’s important.

Don’t let anyone tell you what to measure.

Having a lot of data also means making choices about what to look for, and what to label as sustainable growth.

It’s easy to get stuck in one of two places:

The top of the funnel, where vanity metrics rule, e.g., “We got 5,477 shares and 877 referrals to the trial subscription page but nothing happened after that.”
The bottom line, where revenue is the only thing that matters.

This is where focusing on relevant medium/small data, and defining your questions / behaviors beforehand can make a huge difference.

If you have insight into what matters to your users and customers within the context of the ecosystem you’ve created (content, platform, product, etc.), then you can set a clear growth metric that actually matters and that takes care of your revenue goals.

Which leads to the last thing…

Start with smaller, more concrete, and more iterative metrics.

At a recent edition of the Analyze Boulder Meetup I had the opportunity to see Owen Zhang of DataRobot talk about what he’s learned competing in data science competitions (he’s one of the most successful at it).

Most of the slides in his presentation looked like the below:

Honestly, I really only understood about 15–20% of Owen’s presentation, but a couple of slides and some things he said really stuck out.

In particular, he made a fairly subtle point about what we do when we get a set of data. In data science competitions the goal is to build a predictive model with a limited set of training data, that should hypothetically also work with a larger data set (basically, you are trying to find out the features/constraints of the data).

He pointed out that when he gets a set of training data, he doesn’t use all of it but instead halves it and uses the output from his first model as input for his second model. He also deliberately avoids overfitting, instead focusing on quickly building out the first model that works until it doesn’t work anymore.
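For the curious, here is a rough Python sketch of one way to read that halving idea. It is not Owen's actual code or method. The data is synthetic, the two models are plain least-squares fits, and every name (`fit_line`, `fit_plane`, `model1`, `model2`, `rmse`) is hypothetical. The first model is fitted on one half of the training data, and the second model is fitted on the other half with the first model's prediction as one of its inputs.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic data (invented): y depends on two features plus noise.
rows = []
for _ in range(300):
    x1, x2 = random.uniform(0, 10), random.uniform(0, 10)
    rows.append((x1, x2, 2.0 * x1 + 3.0 * x2 + random.gauss(0, 0.5)))

train, holdout = rows[:200], rows[200:]
# Halve the training data: each model is fitted on its own half.
half_a, half_b = train[:100], train[100:]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_plane(us, vs, ys):
    """Least-squares fit of y = wu * u + wv * v + intercept."""
    mu, mv, my = mean(us), mean(vs), mean(ys)
    cu = [u - mu for u in us]
    cv = [v - mv for v in vs]
    cy = [y - my for y in ys]
    suu = sum(a * a for a in cu)
    svv = sum(b * b for b in cv)
    suv = sum(a * b for a, b in zip(cu, cv))
    suy = sum(a * c for a, c in zip(cu, cy))
    svy = sum(b * c for b, c in zip(cv, cy))
    det = suu * svv - suv * suv
    wu = (suy * svv - svy * suv) / det
    wv = (svy * suu - suy * suv) / det
    return wu, wv, my - wu * mu - wv * mv


# Model 1: deliberately simple, sees only x1, fitted on the first half.
slope1, intercept1 = fit_line([r[0] for r in half_a], [r[2] for r in half_a])


def model1(x1):
    return slope1 * x1 + intercept1


# Model 2: fitted on the second half, with model 1's output as an input.
wu, wv, b2 = fit_plane(
    [model1(r[0]) for r in half_b],
    [r[1] for r in half_b],
    [r[2] for r in half_b],
)


def model2(x1, x2):
    return wu * model1(x1) + wv * x2 + b2


def rmse(predict, data):
    """Root mean squared error of predict(row) against the row's target."""
    return mean((predict(r) - r[2]) ** 2 for r in data) ** 0.5


# Compare both models on rows that neither model was fitted on.
rmse1 = rmse(lambda r: model1(r[0]), holdout)
rmse2 = rmse(lambda r: model2(r[0], r[1]), holdout)
print(f"model 1 RMSE: {rmse1:.2f}, model 2 RMSE: {rmse2:.2f}")
```

Keeping the halves separate means the second model never learns from predictions that the first model made on rows it had already seen, which is one common way to keep overfitting in check.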

This reminded me of something I once heard from a founder that I used to work with…

I’d rather learn quickly and build on that than fail fast.

— Adam Breckler

One reality of building an ecosystem within a company (more on that later) is that you get a ton of contrary information. For all the hype about being data driven, the truth is that ANY start-up or company quickly runs into more data than it can possibly use, and as Adam notes it’s far more important to be data informed than data driven.

Being data informed of course doesn’t mean that you never fail, it means that you experience relevant failure, and focus on what works rather than what doesn’t. That’s what learning fast is all about.

Your company’s reason for existing (its business case, its DNA, whatever you want to call it) is at heart not about making a specific amount of money or achieving unlimited growth. It’s about doing something for people, to align with and perhaps change their lives in a meaningful way.

And that’s what data can help you do.

photo credit: Unhindered by Talent via photopin cc

Joe Cardillo

Joe is a product/ops guy working with the ArCompany team on content, growth, and analytics. He digs media, design, startups, data, rocanroll, anything science-y, and thinking about how to become a better human.

Big data, data insights

susansilver says:

January 5, 2015 at 11:19 am

Fantastic. I hate this assumption about Big Data about how more data means better or more relevant answers. It introduces more ambiguities too. It becomes harder and harder to isolate elements. So you are right, data informed and not data driven is a much better way to think about it.

Data reflects something in the world that is happening, but it is always the human looking at the data who will interpret what that means.

JoeCardillo says:

January 5, 2015 at 12:41 pm

susansilver Definitely agree re: ambiguities. I’ve done a lot of work with startups, and the point at which you stop having obvious cause & effect and start having to think about your data ecosystem is earlier than most people think, probably in the first 100 customers and/or users.
