8/19/19

Help Your Users Trust You

by Robert de Graaf

Trust is the core of most business relationships. At the very least, without it, most relationships fail.

In relation to data science projects, trust is crucial to both the data scientist and their client. The data scientist needs to win their client’s trust to be allowed to work on the client’s problem, and to convince the client to implement their proposed solution. From the client’s perspective, trust is all they have, as they don’t have the knowledge or skills to evaluate the data scientist’s work themselves.

More sensitive still is the question of data – many data science models require clients to give data scientists access to sensitive data, whether the sensitivity is due to privacy, commercial concerns, or some other reason.

A classic example of what can happen when data scientists fail to establish trust was recently published on Medium in an article titled "Death of A Startup". Gokul Rajaram tells the story of a data startup that failed because the business couldn’t persuade its potential clients to share their data – a real and difficult problem that I have also faced, albeit with less severe consequences.

Trust is difficult to establish and can be destroyed easily. In any context, an easy way to destroy trust is to promise more than can be delivered. However, the margin of error is slimmer for a data scientist trying to convince a non-data scientist who is being asked to put faith in something they don’t fully understand.

In their book The Trusted Advisor, David Maister, Charles Green, and Robert Galford establish ‘the trust equation’ – the ingredients needed for a consultant (or really anyone offering their services as an expert) to maintain their clients’ trust. Their trust equation for advisers includes credibility, reliability, intimacy, and self-orientation (opposite of selflessness). Although intimacy may not be applicable to a model, credibility and reliability certainly are.

In general, people are easier to trust than models, and a model has fewer dimensions along which it can earn trust. As a result, it is more important for a model – and the people presenting and promoting that model – to get everything right to maintain trust from users. Small failings in one of the dimensions can be fatal as there is little available as a counterweight.

There are two ways to over-promise that apply specifically in the data science context, although the usual ways that you can over-promise in any project (for example, promising delivery sooner than is really possible) also apply.

One way is to promise that your model works across more contexts than is realistic. In fairness, there is tremendous pressure to do this – business users want models that ‘scale’, a concept that is sometimes taken to mean that the model needs to work on every problem that the client can think of. If the data scientist can’t ensure the client has expectations that are in line with what is achievable, disappointment is inevitable.

In the wider science world there is an analogue of this kind of over-generalisation in the form of the "In Mice" Twitter thread, which lampoons coverage of biological research that was done in mice but reported as if it had already been proven to generalise to humans. For the original researchers, the fact of the research being done in mice probably isn’t itself an issue – the point is to make a discovery in relation to mice that has a tiny chance of generalising to humans. However, in the early stages, it’s not certain the findings will generalise to humans, so anyone who claims that they will can’t be trusted.

For data scientists, however, the idea should be that most of the time the research is done in its intended context and has a strong chance of succeeding in that context. Therefore, although researchers at the cutting edge sometimes need to start a long way from the ultimate target of the research, data scientists should be able to be more practical from the outset. You should be able to present a proof of concept that doesn’t require a leap of faith that it will work in the customer’s desired context.
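One practical way to keep claims about scope honest is to report performance separately for each context the client cares about, rather than as a single headline number. The following is a minimal sketch only: the function name `accuracy_by_context`, the context labels `region_a` and `region_b`, and the toy records are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def accuracy_by_context(records):
    """Return accuracy per context from (context, predicted, actual) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for context, predicted, actual in records:
        total[context] += 1
        if predicted == actual:
            correct[context] += 1
    return {context: correct[context] / total[context] for context in total}


# Hypothetical toy data: the model was built for region_a and then
# applied, unchanged, to region_b.
records = [
    ("region_a", 1, 1), ("region_a", 0, 0), ("region_a", 1, 1), ("region_a", 0, 0),
    ("region_b", 1, 0), ("region_b", 0, 0), ("region_b", 1, 0), ("region_b", 0, 1),
]

per_context = accuracy_by_context(records)
overall = sum(predicted == actual for _, predicted, actual in records) / len(records)

print(per_context)
print(overall)
```

In this toy data the overall figure sits between a perfect score in one context and a poor score in the other, which is exactly the kind of gap a single headline metric hides.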

The second way to over-promise is to present model metrics that are too good to be true. In fact, if the model metrics seem too good to be true, it is very likely because you’ve made a mistake somewhere.

The idea that an impressive metric for a data analysis exercise often means that someone has done something wrong is called Twyman’s Law: "Any data that looks impressive is usually wrong." One common reason Twyman’s Law holds is that data leaks – where the answer is mistakenly included as a model input – are common and genuinely difficult to eliminate. Therefore the most appropriate response when you see that you have beaten everyone else’s accuracy metric on a particular problem should be anxiety that a mistake has found its way into your work. If you can’t find the mistake, check with subject matter experts, anyone you can find who has worked on the problem themselves, or just anyone who can cast a critical eye over what you’ve done.
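A crude first check for a leak is to look for any single input that tracks the target almost perfectly. The sketch below is one possible screen under stated assumptions, not a complete leak detector: the function names, the 0.95 threshold, and the feature names `vessel_age` and `claim_paid` are all hypothetical, and a real leak can easily be subtler than a one-column correlation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def suspicious_features(features, target, threshold=0.95):
    """Return names of features whose absolute correlation with the target meets the threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical toy data: claim_paid is only recorded after the outcome
# is known, so it is the answer in disguise.
target = [0, 1, 0, 1, 1, 0, 1, 0]
features = {
    "vessel_age": [12, 30, 25, 8, 22, 15, 9, 28],
    "claim_paid": [0, 1, 0, 1, 1, 0, 1, 0],
}

flagged = suspicious_features(features, target)
print(flagged)
```

A flagged feature is not proof of a leak, only a prompt to ask when and how that field is populated, which is a question the subject matter experts are usually best placed to answer.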

Giving results from your analysis that are too good to be true will quickly undermine the credibility of your model. People may not challenge the results in front of you – they may even leave your presentation temporarily believing them. However, over time, the real level of performance will become clear as the real-life error rate fails to live up to what was promised.

These different kinds of modelling defects are really different symptoms of the same problem – basically intellectual overreach or hubris. The temptation is to get a better sales pitch by promising better results than are proven right now, usually with the sincere belief that the data scientist’s skills correctly applied will ensure the results come to fruition.

The reality is that, like a student who cheats on in-class tests and assignments and so ensures the teacher can’t discover their weaknesses, if you attempt to convince people by exaggerating the capabilities of your model, it will come back to bite you.

Some people want to trust you and your model – they don’t need to be convinced by exaggerated claims. Other people will treat your model more skeptically – they are actively looking for claims that don’t stack up, and if they find one, it will ruin any chance you had with them. Keep the claims you make about your models grounded, and you can avoid disappointing your customers while maintaining their trust.

About the Author

Robert de Graaf is currently a data scientist at RightShip, and was central to the development of the algorithm currently used in the Qi platform to predict maritime accidents, among other models. He initially began his career as an engineer, at different times working in quality assurance, project engineering, and design, but soon became interested in applying statistics to business problems and completed his education with a master's degree in statistics. He is passionate about producing data solutions that solve the right problem for the end user.

This article was contributed by Robert de Graaf, author of Managing Your Data Science Projects.