A Beginner’s Guide to Deep Reinforcement Learning

The game of Go has simple rules but complex play. The board can be arranged in roughly 2 × 10^170 legal positions, far too many for any player to analyze exhaustively. Experienced players instead learn what is likely to work by trial and error over years of play, and reinforcement learning imitates that process.

So what do you get when you give artificial intelligence (AI) the records of many thousands of games between expert human Go players? You get AlphaGo, the AI that beat the top-ranked human Go player. That is deep learning in action.

But what if you instead teach AI only the rules of Go and let it play millions of games against itself? Deep reinforcement learning enables AI to teach itself by creating its own data (the millions of games) and analyzing the moves to arrive at the best one. Like a human learner, AI adjusts its responses according to failure or success to improve the outcome. It just does so at a scale and speed well beyond human capability.

Deep reinforcement learning needs to work inside a structure. That structure describes the environment, whether that is the rules of Go or, in the case of your campaign, the market, and it has to be in place before you set a goal. An AI with deep reinforcement learning can then help you with your strategy and actions. It draws on the lessons learnt not only from previous campaigns but also from the scenarios it has played out in its own internal iterations, which gives you an understanding of what is likely to happen.

It will also continue to learn as you roll out your campaigns, working out not just what works and what doesn’t but also how factors such as profitability are affected. That lets you optimize future campaigns, for example by lowering your cost per lead or by targeting users who are likely to spend more.

Consider a campaign in which you want to maximize the number of app installations. You have a budget, so the aim is to obtain the highest number of installations for the amount you have (the goal). You need to figure out where to allocate your budget and what to set as the bid price (the actions). Using deep reinforcement learning, AI will suggest a strategy based on its understanding of budgets and prices to find the best platforms and timing, and it will suggest actions so you can take advantage of the best opportunities. And if the environment changes, it will learn what works and what doesn’t much more quickly than a human marketer.
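To make the goal-and-actions loop concrete, here is a minimal sketch in Python. It is deliberately much simpler than deep reinforcement learning: it uses a basic trial-and-error method (an epsilon-greedy bandit) with a small table of estimates instead of a neural network, and it keeps the bid price fixed. The function name allocate_budget, the parameters, and the install rates are all hypothetical, and the "market" is only a random simulation, not a real ad platform.

```python
import random


def allocate_budget(install_rates, budget, bid_price, epsilon=0.1, seed=0):
    """Spend a budget one bid at a time, learning which platform works best.

    install_rates is a made-up list: the chance that one bid on each
    platform produces an install. A real campaign would not know these
    values; the learner only sees the outcome of each bid.
    """
    if bid_price <= 0:
        raise ValueError("bid_price must be positive")
    rng = random.Random(seed)
    n = len(install_rates)
    spend_count = [0] * n   # bids placed on each platform
    installs = [0] * n      # installs observed on each platform
    estimates = [0.0] * n   # learned installs-per-bid estimate per platform

    for _ in range(int(budget // bid_price)):
        # Action: occasionally explore a random platform,
        # otherwise exploit the platform with the best estimate so far.
        if rng.random() < epsilon:
            choice = rng.randrange(n)
        else:
            choice = max(range(n), key=lambda i: estimates[i])

        # Environment: simulate whether this bid led to an install.
        reward = 1 if rng.random() < install_rates[choice] else 0

        # Learning: move the estimate toward the observed outcome
        # (a running average of success and failure).
        spend_count[choice] += 1
        installs[choice] += reward
        estimates[choice] += (reward - estimates[choice]) / spend_count[choice]

    return spend_count, installs, estimates
```

Calling allocate_budget([0.02, 0.05, 0.10], budget=5000, bid_price=1.0) returns how many bids went to each of three imaginary platforms, how many installs each produced, and the learned estimates. The same pattern of act, observe, and adjust is what a deep reinforcement learning system follows, only with far richer inputs and a neural network in place of the table.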

Deep reinforcement learning is ideal for complex environments where there are several alternative paths, such as marketing where you are dealing with human behavior. Its most valuable advantage is that it learns from mistakes in order to optimize quickly.