On Thursdays at noon, YHouse holds a lunch meeting at the Institute for Advanced Study in Princeton. The format is a 15-minute informal talk by a speaker followed by a longer open-ended discussion among the participants, triggered by, but not necessarily confined to, the topic of the talk. To share these meetings more widely, I am posting a synopsis of each week's discussion.

Synopsis of Christoph Salge’s 11/16/17 YHouse Luncheon Talk

Presenter: Christoph Salge (New York University / University of Hertfordshire)

Title: A Short Introduction to Empowerment – an Information Theoretic, Intrinsic Motivation

Present: Olaf Witkowski, Ed Turner, Piet Hut, Michael Solomon, Stephen Lin, Renzo Comoletti, Christoph Salge.

Abstract: “Empowerment is a formalization of how much an agent is in control of its own perceivable future. This is captured by the channel capacity from an agent’s actuators to an agent’s sensor at a later point in time. In this short presentation, I will briefly introduce the formalism and idea behind empowerment. I will outline how empowerment relates to the concept of intrinsic motivation and show some recent applications that demonstrate the range of behaviors that can be created in different scenarios. In particular, I will talk about recent work looking at coupled empowerment maximisation in a human-AI system – and how this can be used to define some core companion duties.”

Christoph introduced himself as now working in NYU's Game Computation Group. He stated that his goal for this talk was to introduce empowerment and its formalism.
He began with motivation: What should an agent do? AI usually has a clear goal and asks how to achieve it. But in nature, we don't always see an immediate reward. So what drives non-goal-directed behavior?
He offered two quotes. Heinz von Foerster said, "I shall act always so as to increase the total number of choices." Rainer Oestereich said, "Agents should act so that their actions lead to perceivably different outcomes" (in German, Effizienzdivergenz).


Empowerment is defined as an agent's potential, perceivable influence on the world. Formally, this is the maximum potential causal information flow (channel capacity) from an agent's actuators to its sensors at a later point in time. He showed us the equation for channel capacity:

C = \max_{p(a)} I(A_t ; S_{t+1}) = \max_{p(a)} \sum_{a,s} p(a)\, p(s \mid a) \log_2 \frac{p(s \mid a)}{\sum_{a'} p(a')\, p(s \mid a')}

(Here p is a probability distribution, S is the sensation at a given time, and A is the action at a given time.)
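To make the formula concrete, here is a minimal sketch (my own illustration, not code from the talk) that computes channel capacity with the classic Blahut-Arimoto algorithm; the example channel and parameter names are assumptions:

```python
import numpy as np

def channel_capacity(p_s_given_a, iters=200):
    """Blahut-Arimoto: C = max over p(a) of I(A; S'), in bits.
    p_s_given_a[a, s] is the channel p(s' | a)."""
    n_actions = p_s_given_a.shape[0]
    p_a = np.full(n_actions, 1.0 / n_actions)  # start from uniform actions

    def info_per_action(p_a):
        p_s = p_a @ p_s_given_a                # marginal p(s')
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(p_s_given_a > 0.0,
                                 np.log2(p_s_given_a / p_s), 0.0)
        return (p_s_given_a * log_ratio).sum(axis=1)  # D(a) = KL(p(s'|a) || p(s'))

    for _ in range(iters):
        p_a = p_a * np.exp2(info_per_action(p_a))  # reweight toward informative actions
        p_a /= p_a.sum()
    return float(p_a @ info_per_action(p_a))       # capacity = sum_a p(a) D(a)

# A noiseless channel with two perfectly distinguishable actions
# carries exactly log2(2) = 1 bit:
print(channel_capacity(np.array([[1.0, 0.0], [0.0, 1.0]])))  # ~1.0
```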
Olaf asked: Why define the channel as a probability?
Christoph answered that this is one way to describe the channel. Another would be as a joint distribution, but since we are changing the input distribution it is better to describe the channel as a conditional probability.
He showed us a diagram depicting the concept:
An action at time t produces a result at time t+1, which is sensed at t+1 and leads to the next action at t+1; that action produces a result at t+2, which is sensed at t+2, and so on through t+3, t+4, and beyond.
Essentially, this says that an agent's sequential choices of action lead to a variety of outcomes.
The diagram can be expanded to involve more than a single agent.
The result of this model is an equation that provides a formal mathematical, and therefore programmable, description of empowerment:

\mathfrak{E}(s_t) = \max_{p(a_t^n)} I(A_t^n ; S_{t+n} \mid s_t)

(n-step empowerment: the channel capacity from a sequence of n actions to the sensor state n steps later, given the current state s_t.)
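For intuition: in a deterministic world the channel from action sequences to final states is noiseless, so n-step empowerment reduces to log2 of the number of distinct states reachable in n steps. A toy gridworld sketch (again my own illustration; the action set and wall handling are assumptions):

```python
from itertools import product
from math import log2

ACTIONS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0), "stay": (0, 0)}

def step(state, action, walls=frozenset()):
    """Deterministic dynamics: bumping into a wall leaves the agent in place."""
    x, y = state
    dx, dy = ACTIONS[action]
    nxt = (x + dx, y + dy)
    return state if nxt in walls else nxt

def empowerment(state, n, walls=frozenset()):
    """n-step empowerment in bits for a deterministic, fully observed world."""
    finals = set()
    for seq in product(ACTIONS, repeat=n):   # every n-step action sequence
        s = state
        for a in seq:
            s = step(s, a, walls)
        finals.add(s)
    return log2(len(finals))

# Open space offers more perceivably different outcomes than a walled-in corner:
print(empowerment((0, 0), 3))                                      # ~4.64 bits
print(empowerment((0, 0), 3, walls=frozenset({(0, 1), (1, 0)})))   # fewer bits
```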


This formal structure has a number of applications. One is collision avoidance with unpredictable agents: you could easily avoid colliding with someone walking toward you down the street in a straight line, but if that person were staggering from side to side you would give them additional space as you passed. Other applications include empowerment-based control of a bipedal robot and empowerment-enhanced navigation. Walking is empowering.
He went on to describe intrinsic motivation, deriving the idea from psychology and from robotics. In psychology, intrinsic motivation means doing an activity for its inherent satisfactions rather than for some separable consequence. In robotics, an activity is intrinsically motivated if its interest depends primarily on the collation or comparison of information from different stimuli, independently of their semantics, i.e. of their symbolic meaning (Oudeyer & Kaplan 2008). A motivation is inherent in being an agent: if you changed motivations, you would be a different agent.
He illustrated these ideas with a computer game. In a three-dimensional block world, you provide rules for what an agent (in this case a blue block) can do to move or to rearrange neighboring blocks. The agent acts to maximize its choices (including choices it would not actually take). In one simulation of the game, the agent creates a staircase. A flying agent forms a different structure. Different embodiment + empowerment = different structures.
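The behavior principle behind these simulations can be sketched in a few lines, reusing the gridworld helpers above (a simplified illustration, not the block-world code from the talk): at each step the agent takes the action whose successor state has the highest empowerment, so it "maximizes choices" without any external reward.

```python
def greedy_empowered_action(state, n=3, walls=frozenset()):
    """Pick the action whose successor keeps the most future options open."""
    return max(ACTIONS, key=lambda a: empowerment(step(state, a, walls), n, walls))

# An agent hemmed in to the north tends to move toward open space:
print(greedy_empowered_action((0, 0), walls=frozenset({(0, 1)})))
```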
Expanding the game, he considered the perspective of a player plus a companion agent. In this game the player's motivation is self-preservation; the companion's motivation is protection of others. Both agents must coordinate their actions, and in the simulation the companion sacrificed itself to protect the player. These examples illustrate ways in which the formalization of empowerment can be applied to games, but the same theory might be applied to human behavior in defined situations.
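A hedged sketch of the coupling idea, again reusing the gridworld helpers above (the weights and the way the companion's body blocks the player are my assumptions, not the exact method of the cited paper): the companion scores each of its actions by a weighted sum of the player's empowerment and its own, so keeping the player's options open can outweigh preserving its own.

```python
def companion_action(comp_state, player_state, n=2,
                     w_player=1.0, w_self=0.3, walls=frozenset()):
    """Coupled empowerment maximisation, simplified to two weighted terms."""
    def score(a):
        nxt = step(comp_state, a, walls)
        occupied = walls | {nxt}     # the companion's body blocks that cell
        return (w_player * empowerment(player_state, n, occupied)
                + w_self * empowerment(nxt, n, walls))
    return max(ACTIONS, key=score)
```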

Q: Piet: The way you derive behavior from simple rules is very good. It reminds me of the Game of Life.
A: This applies to many models in Artificial Life – the idea that complex behavior arises from the interaction of simple rules. 
Q: Piet: Considering all the hours spent playing games, can you mine the data to identify underlying principles? He is reminded of the 17th century: between Galileo and Newton there was great progress, but if you lived in that time you would not have seen it. That is like the present state of informatics and computer science.
A: Yes. In deep learning, we really don't know what is going on. What properties do neural networks need to have? This is being investigated at Google DeepMind and elsewhere.
Q: Piet: You used ingredients from video games in your illustrations, like how many life points the agent has. The designers of video games are driven by what players prefer. This is like evolutionary pressure. So perhaps we can look at what has driven video game successes.
A: A lot of games are constantly adapted based on this kind of evolutionary pressure from the consumer. Take, for example, the game Candy Crush, where gems fall on the screen. More than thirty designers of that game study what makes players quit or continue, and they constantly refine it to make it as attractive to play as possible. The same process goes on in real life in advertising.
Q: Piet: Can we look at the three or four most popular games to see what has evolved in the past twenty-five years?
A:  People want to do this but have not been successful in identifying the characteristics of a successful game so far.  He is presently reviewing this with human subjects as well as with adaptive AI subjects.
Q: Piet: In the evolution of life you can see landmarks, major transitions in evolution. Do these exist in gaming evolution?
A: Yes, there are similar major transitions in the development of games, but many were based on technological advances rather than on user preferences. For example, in the last decade we have seen many more multi-player games as internet capacity grew to make that kind of game possible.
Q: Olaf: Could you distinguish what is technologically driven from what is intrinsic to the human mind and has not changed?
A:  I imagine that would be possible, but difficult. We do see a move towards and then away from very gruesome simulations, like shooting someone on the street in very realistic videos.  We also see the growth of episodic rewards. “Congratulations! You have done it! Now try this…” to keep players in the game.
Q: Michael:  In the example you showed, how do you program altruism versus self-preservation in the player and companion?
A: The companion's goal is still mainly to preserve the player. There is not really a trade-off between altruism and self-preservation in the game-theoretic sense. The companion is also interested in self-preservation, but mainly as an instrumental goal, so that it can help the player later.
Q: Michael:  In my very limited experience with computer games I see a great deal of violence and aggression. Why is that necessary?
A: Yes, games are often gruesome, but other games are not. Minecraft is about building something. In another example, Civilization, you can take an environment from the Stone Age to the modern age by building cities and constructing other things. There are games without combat, but combat is easy to model.
Q: Ed: People who play games may try to satisfy needs they can’t get in other activities.  Some lack hand-eye coordination. Some have other restrictions.
A:  Also, games break boundaries.  You can do things in games that transgress normal behavior.
Q: Ed: Some games are sexual or aim at other motivations. It is likely that there are so many varied human motivations in so many varied individuals that it would be impossible to mine the gaming data to identify them all. He is reminded of a study from the 1970s showing that cats have separate motivations satisfied by tracking prey, catching prey, eating prey, and so on. There may not be a single goal-oriented behavior but multiple modular satisfactions.
A: How do you learn to play a game you have never played before, like chess?  Humans pick out the salient rules much faster than current AIs can.
Q: There are games that accommodate what the player brings to them.
A: Currently developers make money on this, so they keep the information to themselves.  Companies are reluctant to share their data.
THE GAMING INDUSTRY MAKES MORE MONEY THAN MUSIC, TELEVISION, AND MOVIES! This has been true for about the past ten years.
Q: Stephen: We now have competitions in teams, and professional game players who earn more than many professional athletes.
A: There are competitions in Korea where a hundred thousand spectators sit in a stadium and watch game players compete for a championship. Most of the professionals are young men between the ages of 16 and 19, as some of the necessary skills appear to wane after that.
Q: Olaf: Engagement is interactive, unlike some movies which are passive.
Michael: I have considered a spectrum in which reading was at one extreme, requiring a lot of activity on the part of the reader to imagine what the writer is trying to create, while TV is at the other extreme, where the viewer just sits on the couch passively and the medium passes by. In movies, the viewer can be involved passively or very actively in trying to predict the outcome or in reacting emotionally to the story, chase scenes, or characters.  But computer games, particularly as you get more realistic simulations and even Augmented Reality, can be much more interactive. 
A: Visual novels are considered a branch of games. 
Q: Renzo: There are implications in micromanaging the player's pleasure. The machine comes to know the player's preferences better than the player does. Maximizing choices is a different motivation.
A: Maximizing choices remains a factor. If you cannot progress in a game, what happens? You get stuck at that level and soon lose interest. Some games are very demanding, others less so. Micromanaging also occurs outside of gaming, in advertising and in politics.
Q: Some games are cultural expressions. Many are Californian, from Silicon Valley. Minecraft is Swedish.
A: In the 1990s, Germans produced games that were played by Germans. Much of gaming is fashion: games go in and out of fashion, and new games are made to fit into cultures.

We ended our discussion here.

Respectfully,
Michael Solomon, MD

Possible Sources:

Intrinsically Motivated General Companion NPCs via Coupled Empowerment Maximisation
Guckelsberger, Christian; Salge, Christoph; Colton, Simon, 2016

Empowerment As Replacement for the Three Laws of Robotics
Salge, Christoph; Polani, Daniel, Frontiers in Robotics and AI, 2017

Predicting Player Experience without the Player: An Exploratory Study
Guckelsberger, Christian; Salge, Christoph; Gow, Jeremy; Cairns, Paul, Proceedings of the Annual Symposium on Computer-Human Interaction in Play, 2017
