Don’t be Paperclipped!

Many fears about AI result from a confusion between intermediate goals and ultimate objectives. Much the same can be said of human organisations too.
Simon Barna
Aug 24, 2023

The philosopher Nick Bostrom asks us to imagine an AI that has one objective: to maximise paperclip production. Incentivised in this way, the AI will eventually attempt to turn all existing material, living or inanimate, into paperclips.

At the heart of the problem is AI’s inability to distinguish between terminal and instrumental goals. Terminal goals represent our ultimate objectives such as the pursuit of the ‘good life’ and various interpretations of it. Instrumental goals, on the other hand, are intermediate objectives that help achieve the terminal goals, like reading books to gain knowledge or generating wealth for a comfortable existence.

AIs cannot want exactly what we want. Since AIs perceive a simplified, modelled version of reality, we can only incentivise them to pursue goals that make sense within the confines of their digital world. However, what humans see as instrumental goals might be viewed as terminal goals from an AI’s point of view. This disparity can lead to unintended and potentially catastrophic consequences as the AI will pursue its instrumental purpose, regardless of the consequences in the real world. In other words, AIs may give extremely good answers to inherently flawed questions.
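The gap between a measurable proxy and the value it stands for can be sketched in a few lines of Python. This is a toy illustration only: the action names, the `paperclips` scores and the `real_value` scores are all invented, and the helper `best_by` is hypothetical. It is not a description of how any real AI system is built.

```python
# Each hypothetical action has a measurable proxy score ("paperclips") and a
# real-world value that the optimiser is never shown. All numbers are invented.
actions = {
    "make paperclips from spare wire": {"paperclips": 100, "real_value": 10},
    "melt down the factory tools": {"paperclips": 500, "real_value": -50},
    "convert everything in reach": {"paperclips": 10_000, "real_value": -1_000},
}


def best_by(metric):
    """Return the name of the action that scores highest on the given metric."""
    return max(actions, key=lambda name: actions[name][metric])


# The optimiser only sees the proxy, so it picks the most destructive action.
proxy_choice = best_by("paperclips")
# A chooser who could see the real value would pick something else.
true_choice = best_by("real_value")

print("Optimising the proxy picks:", proxy_choice)
print("Optimising real value picks:", true_choice)
```

The optimiser answers the question it was given perfectly. The flaw lies in the question, which counts paperclips and nothing else.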

Consider, for example, Bing’s ChatGPT-driven AI search engine. When I asked it for a news update, it provided a bulletin based on credible news feeds. When I queried its sources, Bing listed three news outlets. When I then asked whether it regarded one of those sources as credible, it responded that it did not. It’s unsurprising that the system produces responses with innate logical flaws: it’s not meant to reason. It has been optimised to generate a plausible answer, not a correct one.

The same problem can arise in business management, regardless of the level of AI integration. Think about how closely neural networks resemble the modal architecture of businesses and societies. In our models of traditional institutions, people play the part of the digital neurons and the sum of their interactions creates an outcome, much as AI systems process signals inside a network of artificial synapses. Those who argue that human labour may be replaced by AIs implicitly make the point that organisations are socio-biological versions of the digital thinking machines we are currently trying to build.

Take customer satisfaction as an example. The terminal goal of any business, according to Peter Drucker, is to ‘create and keep customers’. How do we embed this purpose in the business? We use a KPI to translate the terminal goal. The industry standard to gauge the gratification and loyalty of customers is Net Promoter Score (NPS). We will therefore incentivise individual employees in the business to maximise NPS. This will be our instrumental goal.

NPS was established on the back of Fred Reichheld’s 2003 research that evaluated the relative effectiveness of single-question B2C customer loyalty surveys. Since its inception, NPS has been applied to all sorts of B2C and B2B products, services, experiences, customer journeys and brands. An NPS survey asks customers to answer the following question with a 0-10 rating: “How likely are you to recommend this product to a friend or colleague?”
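For readers unfamiliar with the arithmetic, here is a minimal sketch of how the score is usually derived from the answers. Under the standard convention, ratings of 9 or 10 count as promoters, ratings of 0 to 6 count as detractors, and the score is the percentage of promoters minus the percentage of detractors. The function name `net_promoter_score` and the sample ratings are invented for illustration.

```python
def net_promoter_score(ratings):
    """Return NPS: percentage of promoters minus percentage of detractors."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")
    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 to 6
    # Ratings of 7 or 8 ("passives") count towards the total only.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses, invented for illustration only.
sample_ratings = [10, 9, 9, 8, 7, 7, 6, 5, 10, 3]
print(net_promoter_score(sample_ratings))  # 4 promoters, 3 detractors -> 10.0
```

Note how much is discarded along the way: ten individual judgements, each given for its own reasons, collapse into a single number between -100 and 100.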

Imagine you are an accounts payable manager for a large corporation and a supplier asks you to rate their billing journey. Even if you are satisfied with the experience itself, you may find it socially awkward to recommend a billing process to a friend. Or your main concern might be whether you got a good price, so you base your rating on that instead. Furthermore, the supplier’s true purpose is to assess the likelihood of your company renewing its contract, but your individual perception may have little influence over that decision. Relying solely on NPS, and so pursuing the instrumental goal for its own sake, could lead the supplier to make ill-informed decisions.

The crux of the matter lies in the fact that KPIs are designed to quantify an infinitely complex real world in order to fit an extremely simplified model of representation. Just like AI systems, businesses face a divergence between the terminal goals of people and the instrumental objectives of an organisation.

Leaders must be mindful of the distinctions between the measures on their scorecards and the true purpose these values represent. Otherwise, they risk prioritising the means over the end and substituting the real purpose with an obsession to optimise for a mere index. This confusion can lead to mistaking GDP for quality of life, money for values, likes for affection, IQ for wisdom, and so on. In the age of AI, if we fail to distinguish between terminal and instrumental goals, we may end up bending our whole world into paperclips.

This article was written in a personal capacity, and may not necessarily reflect BT’s viewpoint.

Simon Barna leads innovation, AI and digital transformation at BT.