

Examples of permutation-invariant reinforcement learning agents
In this work, we investigate the properties of RL agents that treat their observations as an arbitrarily ordered, variable-length list of sensory inputs. Here, we partition the visual input from CarRacing (left) and Atari Pong (right) into a 2D grid of small patches and randomly permute their ordering. By processing each input stream independently and consolidating the processed information using attention, these agents can still perform their tasks even when the ordering of the observations is shuffled several times during an episode.
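
To make the consolidation step concrete, below is a minimal numpy sketch of attention pooling over an unordered set of patches. It is only an illustration, not the full model: the shapes, the fixed learned queries, and the `attention_pool` function are assumptions chosen to show why the pooled output does not depend on patch order.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical sizes, chosen only for this sketch.
PATCH_DIM = 36  # e.g. a flattened 6x6 grayscale patch
KEY_DIM, VAL_DIM = 8, 8
N_QUERY = 4     # a fixed number of queries gives a fixed-size output

rng = np.random.default_rng(0)
W_k = 0.1 * rng.normal(size=(PATCH_DIM, KEY_DIM))  # shared across all patches
W_v = 0.1 * rng.normal(size=(PATCH_DIM, VAL_DIM))  # shared across all patches
Q = rng.normal(size=(N_QUERY, KEY_DIM))            # queries independent of input order

def attention_pool(patches):
    """Consolidate a variable-length, arbitrarily ordered set of patches
    into a fixed-size message."""
    K = patches @ W_k                                 # (n_patches, KEY_DIM)
    V = patches @ W_v                                 # (n_patches, VAL_DIM)
    A = softmax(Q @ K.T / np.sqrt(KEY_DIM), axis=-1)  # (N_QUERY, n_patches)
    return A @ V                                      # (N_QUERY, VAL_DIM)

# Shuffling the patch order leaves the pooled message unchanged.
patches = rng.normal(size=(16, PATCH_DIM))  # e.g. a 4x4 grid of patches
shuffled = patches[rng.permutation(16)]
assert np.allclose(attention_pool(patches), attention_pool(shuffled))
```

The invariance follows because attention sums over the input set: permuting the patches rearranges the attention weights and the values they select in the same way, so their weighted sum is unchanged.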

The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning

This page contains an interactive demo and videos of experimental results that accompany our NeurIPS 2021 submission. As the paper is under review, please do not share this link with others.


Permutation-Invariant Cart-Pole Swing-Up Demo

A permutation-invariant network performing CartpoleSwingupHarder. Shuffle the order of the 5 observations at any time, and watch how the agent adapts to the new ordering.

Interactive Demo
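
The shuffle action in the demo can be emulated with a small environment wrapper that permutes the observation vector on demand. The sketch below is illustrative only: `ShuffleObservations`, the classic `reset()`/`step()` interface, and the `obs_size` argument are assumptions of this example, not the demo's actual implementation.

```python
import numpy as np

class ShuffleObservations:
    """Illustrative wrapper that applies a random permutation to a
    fixed-length observation vector. Calling shuffle() mid-episode draws
    a new ordering, mirroring the demo's shuffle button."""

    def __init__(self, env, obs_size, seed=None):
        self.env = env
        self.rng = np.random.default_rng(seed)
        self.perm = np.arange(obs_size)  # start from the identity ordering

    def shuffle(self):
        # Re-sample the ordering; all subsequent observations use it.
        self.rng.shuffle(self.perm)

    def reset(self):
        return self.env.reset()[self.perm]

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return obs[self.perm], reward, done, info
```

A stateless permutation-invariant policy would output the same action immediately after shuffle(); if the sensory neurons carry recurrent state, each one needs a few steps to re-align its memory with its new input stream, which is the brief adaptation visible in the demo.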



Supplementary Video of PyBullet Ant Results

PyBullet Ant with a permutation-invariant policy.
The ordering of the 28 observations is reshuffled every 100 frames.

Supplementary Videos of Atari Pong Results

Atari Pong base task (left). Modified shuffled-screen task (right).
No occlusion. Observations reshuffled every 500 frames.

70% occluded, shuffled-screen Atari Pong.
Observations reshuffled every 500 frames.

Supplementary Videos of Car Racing Results

CarRacing base task (left). Modified shuffled-screen task (right).

KOF background.

Mt. Fuji background.

DS background.

Ukiyo-e background.