At a glance: AndroidEnv
AndroidEnv is a new layer that enables Reinforcement Learning agents to be trained on the Android OS. It was developed by DeepMind, Alphabet's well-known AI research branch, and it sits between Android and ADB on one side and the agent being trained on the other. If the term is new to you: an "agent" is the machine-learning program that you write and run. AndroidEnv is available as an open-source repository on GitHub.
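To give a feel for how that layer is used: the repository exposes a Python entry point that boots an emulated device and wraps it as an environment. Here is a minimal loading sketch, assuming a locally installed Android SDK and an existing virtual device; all paths and names are placeholders, and the exact parameters may differ between versions:

```python
import android_env

# Boot an emulated Android device and wrap it as an RL environment.
# All paths, the AVD name, and the task file are placeholders for a local setup.
env = android_env.load(
    avd_name='my_avd',                                # existing Android Virtual Device
    android_avd_home='~/.android/avd',
    android_sdk_root='~/Android/Sdk',
    emulator_path='~/Android/Sdk/emulator/emulator',
    adb_path='~/Android/Sdk/platform-tools/adb',
    task_path='path/to/task.textproto',               # the task definition (see below)
)
```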
A realistic access to Android
AndroidEnv aims to provide a training environment that is as realistic as possible, so the constraints it imposes match those a human interacting with Android faces as well (both points are illustrated in the sketch after this list):
- an agent primarily observes the pixels that Android renders; to act on them, it uses an interface that simulates a touchscreen
- the action space (the set of options for interacting with the environment) consists of a few simple actions, such as touch, lift, or repeat
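AndroidEnv implements DeepMind's dm_env interface, so both sides of this contract can be inspected as specs. A small sketch; the field names are taken from the paper and may vary between versions:

```python
# Inspect what the agent sees and what it can do.
# `env` is the AndroidEnv instance loaded above.
print(env.observation_spec())  # e.g. 'pixels' (the rendered screen), plus
                               # auxiliary fields such as 'timedelta'
print(env.action_spec())       # e.g. 'action_type' (touch/lift/repeat) and
                               # 'touch_position' in normalized coordinates
```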
The goal of AndroidEnv is to push the boundaries of Reinforcement Learning. Similar to OpenAI's "Universe" platform, which let agents interact with scenes through a mouse-and-keyboard interface, DeepMind's latest creation allows developers to train their models in one of the most challenging environments imaginable.
What I mean by that is that earlier models accessed their targets through special interfaces that simplified interaction considerably. AndroidEnv, by contrast, provides little information beyond the pixels of the screen (more on that later), which is what makes it so realistic.
A closer look at interaction
As mentioned, AndroidEnv provides a set of raw actions that simulate the basic movements a human would use on a touchscreen. These raw actions can be composed into gestures, such as swiping or scrolling, that give the agent effective control (see the sketch below).
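To make that concrete, here is a sketch of how a swipe can be composed from raw actions: a series of touches along a line, followed by a lift. The action dictionary layout and the integer codes for the action types are assumptions based on the paper, not guaranteed API:

```python
import numpy as np

TOUCH, LIFT = 0, 1  # assumed integer codes for the raw action types

def swipe_actions(start, end, num_steps=10):
    """Builds a swipe gesture as a sequence of raw touchscreen actions.

    `start` and `end` are (x, y) positions in normalized [0, 1] coordinates.
    """
    xs = np.linspace(start[0], end[0], num_steps)
    ys = np.linspace(start[1], end[1], num_steps)
    actions = [{'action_type': np.array(TOUCH, dtype=np.int32),
                'touch_position': np.array([x, y], dtype=np.float32)}
               for x, y in zip(xs, ys)]
    # End the gesture by lifting the virtual finger off the screen.
    actions.append({'action_type': np.array(LIFT, dtype=np.int32),
                    'touch_position': np.array(end, dtype=np.float32)})
    return actions

# Usage: feed the actions to the environment one step at a time, e.g.
#   for action in swipe_actions((0.5, 0.8), (0.5, 0.2)):
#       timestep = env.step(action)
```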
Interaction with AndroidEnv also highlights one of the great challenges of this platform: depending on the open app (or the task in general), the same action can differ drastically in meaning. A swipe in one app might mean something completely different in another.
Also important: agents on AndroidEnv have to deal with the fact that Android runs in real time. The OS does not wait for the agent to choose its next action; it simply keeps going. This is a particularly hard challenge for agents to overcome. AndroidEnv provides a small compatibility tool to avoid false positive inputs when the agent takes too long to react (for example, a tap that is still held while the agent is computing would otherwise be interpreted by Android as a long press).
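The interaction loop itself looks like any other dm_env loop; the twist is that the emulator keeps running between two `step` calls. A minimal sketch, where `agent` is a hypothetical object with a `select_action` method:

```python
# The environment does not pause while select_action() runs: whatever the
# agent decided last is what the screen "sees" in the meantime.
timestep = env.reset()
while not timestep.last():
    action = agent.select_action(timestep)  # slow computation = held touch
    timestep = env.step(action)
env.close()
```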
Defining tasks that have to be accomplished
A task defines a specific problem that an RL agent has to solve. To allow proper training of your agent, a task captures the relevant pieces of information (sketched in code after this list):
- "episode termination conditions": when is the task done, or when has it definitively failed
- "rewards": numerical feedback for accomplished goals
- other applications the agent may interact with as part of the task
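In the actual repository, tasks are defined as protocol-buffer text files; the following Python dataclass is only a hypothetical sketch of the kind of information such a definition bundles together (all field names are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class TaskSketch:
    """Illustrative stand-in for a task definition; not the real format."""
    app_package: str              # the app the task runs in
    max_episode_seconds: float    # termination: hard time limit
    episode_end_regex: str        # termination: pattern signalled in the logs
    reward_regex: str             # reward: numeric value parsed from the logs
    extra_apks: list = field(default_factory=list)  # other apps to install

example_task = TaskSketch(
    app_package='com.example.somegame',   # hypothetical package name
    max_episode_seconds=300.0,
    episode_end_regex=r'episode_end',
    reward_regex=r'reward: ([-0-9.]+)',
)
```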
To enable meaningful and relevant feedback, AndroidEnv has access to ADB, the Android Debug Bridge. Through it, a task can observe the device's log stream and trigger predefined signals, such as rewards or episode ends, whenever matching entries appear.
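AndroidEnv handles this internally, but the underlying idea is easy to reproduce: stream the device logs through ADB and match known patterns. A standalone sketch, where the log format is hypothetical:

```python
import re
import subprocess

# Watch the device's log stream and turn matching lines into reward signals.
# 'reward: <number>' is a made-up log format that a task could look for.
REWARD_PATTERN = re.compile(r'reward: ([-0-9.]+)')

proc = subprocess.Popen(['adb', 'logcat', '-v', 'brief'],
                        stdout=subprocess.PIPE, text=True)

for line in proc.stdout:
    match = REWARD_PATTERN.search(line)
    if match:
        print('reward signal:', float(match.group(1)))
```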
A sea of possibilities
The effect AndroidEnv could have on the development of AI, and of Reinforcement Learning in particular, can't be overstated. It not only lets agents learn to operate Android itself ("open Maps and search for a nearby sushi restaurant"), but also, depending on the app, to exercise very different problem-solving techniques. For example, an agent can be trained to play games (every game on Android is available!) and develop long-term winning strategies.
AndroidEnv is a gate that has been opened for anyone to write ML models that can be trained on practically any task imaginable. It's mind-blowing!
Setting sail
As I'm not a Machine Learning engineer, my knowledge is only surface-level. This article is meant as an introduction to AndroidEnv; for more information, please check out the addendum with all relevant links, including the GitHub repository.