AI Systems with Economic Incentives

Applying economic principles and incentive structures to the design of AI systems represents a promising approach for addressing the critical challenge of AI alignment.

In this article, I offer some thoughts on how these principles might be applied to align AI systems with their intended goals and outcomes.

The AI summary, so you don’t have to read the whole thing:

  • AI alignment is critical to ensure that AI systems make decisions that benefit both consumers and businesses.

  • Applying economic incentives to AI systems can help guide their behaviour towards desired outcomes, much like how incentives shape human actions.

  • DAOs (Decentralised Autonomous Organisations) in the blockchain world offer insights into aligning diverse entities, including AI, through incentives and shared goals.

  • Implementing AI systems with incentive structures can lead to challenges, including unintended consequences and ethical concerns, such as "reward hacking."

  • Focusing on human-AI decision-making behaviours, including concepts like bounded rationality and resource-rational analysis, can help design AI systems that optimise decisions within real-world constraints.

One of the biggest headaches in the fast-moving world of AI is making sure these systems actually do what we want them to.

While the big macro issues such as ethics, human values, and AI morals take up all the media oxygen, the day-to-day decisions AIs make for consumers and businesses are equally important right now, if not more so.

Imagine AIs making or influencing decisions, like deciding whether you’re eligible for a home loan, in a way that is not aligned with business and consumer interests. Spoiler alert: we don’t need to imagine it. It is already happening today.

This is because AI is in the business of decision-making. And decision-making is messy and often doesn’t make much sense. Ever made a decision and then wondered, ‘Why on earth did I do that?’ You’re not alone. Or made a decision that, despite all the good intentions behind it and the copious PowerPoints of justification, still didn’t achieve the desired outcome?

Why? Because humans are inherently constrained in our ability to make decisions and predict the future.

Before we look at the decision-making ability of AIs, we should first ask: how do we align humans towards specific outcomes?

We incentivise.

We want small businesses to thrive? We create tax cuts. We want businesses to adopt more carbon-neutral approaches? We incentivise green behaviour and disincentivise carbon-intensive behaviour. Granted, we haven’t got the settings quite right at the moment, but it’s really the only lever we have at scale.

As AI systems become more sophisticated and autonomous, the need to ensure they operate in ways that are beneficial becomes increasingly crucial. Applying economic incentives offers a promising approach to tackling AI alignment.

And if we believe that the genie is now out of the bottle, and that we will quickly arrive at a near future where AIs of diverse behaviours and capabilities become producers of economic value, then incentives might not be a bad way to think about this problem.

Economic Principles in AI Systems

Economic constraints are a key element of how humans make decisions, and we can consider applying the same principles to AI systems. Humans make decisions based on constraints such as time and money, and we could design AI systems to operate within similar frameworks. By viewing computational resources, time, quality, and access as a form of "currency" for AI, we can create a system of budgets and trade-offs that guide its actions.

Imagine an AI system that "earns" computational resources by completing tasks aligned with human values and "spends" these resources to perform actions or acquire information. This creates an internal economy where the AI must constantly evaluate the cost-benefit ratio of its actions, potentially leading to more efficient and aligned behaviour.
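To make this concrete, here is a minimal sketch of such an internal economy. Everything in it (the class names, costs, and reward values) is an illustrative assumption rather than a real system:

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    compute_cost: float    # the "currency" this action spends
    expected_value: float  # estimated benefit towards the assigned goal

class BudgetedAgent:
    """Toy agent that earns and spends a compute budget."""

    def __init__(self, budget: float):
        self.budget = budget

    def earn(self, reward: float) -> None:
        # Completing tasks aligned with the intended goal tops up the budget.
        self.budget += reward

    def choose(self, actions: list[Action]) -> Action | None:
        # Rank affordable actions by value per unit of compute spent.
        affordable = [a for a in actions if a.compute_cost <= self.budget]
        if not affordable:
            return None  # must earn more before acting again
        best = max(affordable, key=lambda a: a.expected_value / a.compute_cost)
        self.budget -= best.compute_cost
        return best

agent = BudgetedAgent(budget=10.0)
options = [
    Action("fetch_more_data", compute_cost=6.0, expected_value=3.0),
    Action("answer_from_cache", compute_cost=1.0, expected_value=2.0),
]
print(agent.choose(options).name)  # answer_from_cache: better value per cost
```

The point of the toy is the shape of the loop: every action has a price, and the agent’s behaviour emerges from the trade-offs the budget forces on it.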

Mechanism design—a fancy term from economics and game theory—gives us different ways to think about how incentives shape behaviour. By adapting these principles to AI, we might create more flexible and responsive AI systems that can dynamically adjust their priorities based on changing circumstances.

And how is this different to the learning and goal-seeking of AI today? It’s the fact that the world is not static. Not only do goals and imperatives change, but the cost-benefits of actions change as well. Particularly as AIs begin to interact and exchange value in the broader ecosystem.

Inspiration from the world of DAOs

There is one area, though, that already faces the challenge of aligning a distributed set of independent entities around a common goal through incentives, albeit in a very different context: blockchain and Decentralised Autonomous Organisations (DAOs). Yeah, I know, Web 3 has a bit of a scammy reputation. But there’s actually some solid work happening behind the scenes that’s worth paying attention to.

Why do I think DAOs are a relevant thought experiment? DAOs in the Web 3 world are organisations designed to be decentralised, running on a system of incentives, measurements, and economic outcomes. They work by allowing lots of people to use their tokens (essentially their resources) to steer the ship, and anyone who contributes to the shared goal gets rewarded. What makes DAOs particularly interesting is how they recognise the diversity of players involved, each with their own goals, constraints, and resources, and attempt to align them towards a common objective. This is precisely the problem we’re about to have with AI.

Imagine DAOs as a mix of humans and AIs working together in a company, with rewards and costs guiding everyone towards shared goals. For example, to execute the specific items on its roadmap, a DAO might create incentives to reward members. This could be a reward for participating in the governance forums, for creating digital content for marketing, or for conducting analysis for external investments. The system rewards participants for contribution, as in the sketch below.
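Here is a deliberately simplified sketch of that reward mechanism. The contribution types and token rates are made-up assumptions, not any real DAO’s rules:

```python
# Token rewards per contribution type (illustrative numbers only).
REWARD_RATES = {
    "governance_post": 5,
    "marketing_content": 20,
    "investment_analysis": 50,
}

def distribute_rewards(contributions: list[tuple[str, str]]) -> dict[str, int]:
    """Tally token rewards per member from (member, contribution_type) records."""
    balances: dict[str, int] = {}
    for member, kind in contributions:
        balances[member] = balances.get(member, 0) + REWARD_RATES.get(kind, 0)
    return balances

ledger = [
    ("alice", "governance_post"),
    ("agent_7", "investment_analysis"),  # an AI member earns the same way
    ("alice", "marketing_content"),
]
print(distribute_rewards(ledger))  # {'alice': 25, 'agent_7': 50}
```

Note that nothing in the mechanism cares whether a contributor is human or AI; the incentive structure is the same for both, which is exactly what makes the DAO analogy interesting.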

Financial reward is possibly one of the most powerful motivators for moving individual humans towards specific outcomes. What would be the equivalent incentive for AIs to achieve their intended outcomes?

Achieving this successfully could allow for dynamic goal-setting, where the priorities of the AI system can be adjusted by modifying the reward structure. For instance, a content creation AI could be incentivised to optimise for click-through rates one day and for content depth the next, simply by adjusting the rewards for different outcomes. However, since human incentives will be intrinsically linked to AI incentives in this model, we must ensure AI systems do not replicate or exacerbate the existing economic inequalities found in today’s environment.
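The dynamic goal-setting piece is straightforward to sketch: the reward weights are the goal, and changing them re-steers the system. The metric names and weights below are illustrative assumptions:

```python
def reward(metrics: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of outcome metrics; the weights ARE the goal."""
    return sum(weights.get(k, 0.0) * v for k, v in metrics.items())

article = {"click_through_rate": 0.12, "depth_score": 0.80}

# Monday: optimise for clicks. Tuesday: optimise for depth.
monday = {"click_through_rate": 10.0, "depth_score": 1.0}
tuesday = {"click_through_rate": 1.0, "depth_score": 10.0}

print(reward(article, monday))   # 2.0  -> clicks dominate the incentive
print(reward(article, tuesday))  # 8.12 -> depth dominates the incentive
```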

It’s beginning to happen

Companies like Paymanai.com are already exploring this concept of value exchange. They are developing systems where AI agents can transact with humans using real-world money. This allows AI agents not only to explicitly include humans in their outputs, but also to leverage the services and capabilities of the wider economic environment to achieve their outcomes, at an increased cost. Closing the loop requires the AI to understand the value of its own outputs, so it can judge whether the cost of those external services is justified.
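That cost-benefit check at the heart of the loop might look something like this. To be clear, this is a hypothetical sketch, not Payman’s actual API; every name in it is an assumption for illustration:

```python
def worth_buying(service_price: float,
                 value_without: float,
                 value_with: float) -> bool:
    """Buy only if the marginal value the service adds exceeds its price."""
    marginal_value = value_with - value_without
    return marginal_value > service_price

# e.g. paying a human reviewer $15 to lift an output's estimated value
# from $40 to $70 clears the bar; paying $40 for the same lift does not.
print(worth_buying(15.0, 40.0, 70.0))  # True
print(worth_buying(40.0, 40.0, 70.0))  # False
```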

However, implementing any large, complex system comes with significant challenges and potentially emergent behaviours that we cannot predict or control. There's a risk of creating misaligned incentives, where the AI finds unexpected ways to maximise its "earnings" that don't align with the intended goals. Creating AI systems motivated by self-interest, even if that self-interest is designed to align with human values, also raises ethical questions. This is the familiar problem of "reward hacking", where AI systems find loopholes to collect rewards, but playing out on a bigger and more complicated level.

Double-clicking on human-AI decision-making behaviours

I believe we need to focus more on understanding decision-making behaviours as AI begins to more seriously augment humans in business processes. And as business management works through how to organise its scarce human and AI resources, re-exploring some management and human behaviour concepts such as bounded rationality and resource-rational analysis may be relevant.

Bounded rationality, a concept introduced by Herbert Simon, recognises that decision-makers (human or AI) have limited information, cognitive capacity, and time. Resource-rational analysis, as explored by researchers like Thomas L. Griffiths, models cognitive processes as making optimal use of limited computational resources. The common thread is the use of scarce resources in making better decisions.

By incorporating these ideas, we can design AI systems that make decisions not on the basis of perfect information and unlimited computation, but by optimising for the resources available. That is the same paradigm humans operate in.
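A toy illustration of the resource-rational idea: rather than computing until an answer is perfect, the system picks the amount of "thinking" whose expected benefit best justifies its cost. The accuracy curve and cost-per-step below are made-up numbers, assumed purely for illustration:

```python
def net_value(steps: int, value_of_correct: float, cost_per_step: float) -> float:
    # Assumed diminishing returns: more deliberation helps, but less each time.
    accuracy = 1.0 - 0.5 ** steps
    return accuracy * value_of_correct - steps * cost_per_step

def rational_steps(value_of_correct: float, cost_per_step: float,
                   max_steps: int = 20) -> int:
    """Pick the deliberation budget with the best value-for-compute trade-off."""
    return max(range(1, max_steps + 1),
               key=lambda s: net_value(s, value_of_correct, cost_per_step))

# High-stakes decisions justify more compute; low-stakes ones stop early.
print(rational_steps(value_of_correct=100.0, cost_per_step=2.0))  # 5 steps
print(rational_steps(value_of_correct=10.0, cost_per_step=2.0))   # 2 steps
```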

We’ve opened Pandora’s box, and there’s no closing it now. We need to figure out how to productively co-exist with AI, or they (or, by proxy, the mega tech companies) might just control us.