Boss Level#
Action Space |
|
Observation Space |
|
Reward Range |
|
Creation |
|
Description#
The command can be any sentence drawn from the Baby Language grammar. This level combines all competencies and is a superset of all other levels.
Mission Space#
Action mission space:
“go to the {color} {type} {location}”
or
“pick up a/the {color} {type} {location}”
or
“open the {color} door {location}”
or
“put the {color} {type} {location} next to the {color} {type} {location}”
{color} is the color of the object. Can be “red”, “green”, “blue”, “purple”, “yellow” or “grey”.
{type} is the type of the object. Can be “ball”, “box” or “key”.
{location} can be empty (“ ”), “in front of you”, “behind you”, “on your left” or “on your right”.
And mission space:
Two action missions concatenated with “and”
Example:
go to the green key and put the box next to the yellow ball
Sequence mission space:
Two missions, each of which can be an action mission or an and mission, concatenated with “, then” or “after you”.
Example:
open a red door and go to the ball on your left after you put the grey ball next to a door
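The three mission spaces above compose hierarchically, which a short sketch can make concrete. The code below is only an illustration built from the templates listed in this section; it is not the generator used by the library, the helper names (`describe`, `action_mission`, `and_mission`, `sequence_mission`) are hypothetical, and it simplifies the grammar by always including a color and always using the article “the”.

```python
import random

COLORS = ["red", "green", "blue", "purple", "yellow", "grey"]
TYPES = ["ball", "box", "key"]
LOCATIONS = ["", "in front of you", "behind you", "on your left", "on your right"]


def describe(rng, obj_type=None):
    """Build '{color} {type} {location}', dropping the location when it is empty."""
    kind = obj_type or rng.choice(TYPES)
    parts = [rng.choice(COLORS), kind, rng.choice(LOCATIONS)]
    return " ".join(p for p in parts if p)


def action_mission(rng):
    """Pick one of the four action templates and fill it in."""
    templates = [
        lambda: f"go to the {describe(rng)}",
        lambda: f"pick up the {describe(rng)}",
        lambda: f"open the {describe(rng, 'door')}",
        lambda: f"put the {describe(rng)} next to the {describe(rng)}",
    ]
    return rng.choice(templates)()


def and_mission(rng):
    """Two action missions joined with 'and'."""
    return f"{action_mission(rng)} and {action_mission(rng)}"


def sequence_mission(rng):
    """Two missions (action or and), joined with ', then' or 'after you'."""
    def part():
        return rng.choice([action_mission, and_mission])(rng)

    first, second = part(), part()
    connector = rng.choice([", then ", " after you "])
    return f"{first}{connector}{second}"


if __name__ == "__main__":
    print(sequence_mission(random.Random(0)))
```

Passing a seeded `random.Random` keeps the sampled mission reproducible, which is convenient when comparing against missions produced by the real environment.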
Action Space#
Num | Name | Action |
---|---|---|
0 | left | Turn left |
1 | right | Turn right |
2 | forward | Move forward |
3 | pickup | Pick up an object |
4 | drop | Drop the carried object (needed for “put ... next to” missions) |
5 | toggle | Toggle an object (needed to open doors) |
6 | done | Unused |
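For readability in agent code, the numeric actions in the table can be wrapped in an enumeration. The following is a minimal sketch that mirrors the Num and Name columns above; the class is defined locally for illustration and is not claimed to be the library's own definition.

```python
from enum import IntEnum


class Actions(IntEnum):
    """Action indices, copied from the Num/Name columns of the table."""
    left = 0
    right = 1
    forward = 2
    pickup = 3
    drop = 4
    toggle = 5
    done = 6


# IntEnum members compare equal to plain integers, so they can be passed
# wherever an integer action index is expected.
turn_then_move = [Actions.left, Actions.forward, Actions.forward]
print([int(a) for a in turn_then_move])
```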
Observation Encoding#
Each tile is encoded as a 3-dimensional tuple: (OBJECT_IDX, COLOR_IDX, STATE)
The OBJECT_TO_IDX and COLOR_TO_IDX mappings can be found in minigrid/core/constants.py.
STATE refers to the door state, with 0=open, 1=closed and 2=locked.
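A tile tuple can be turned back into a readable description by inverting the two mappings. The sketch below assumes the mappings are passed in as index-to-name dictionaries; the function name `describe_tile` is hypothetical, and the toy dictionaries in the usage lines are placeholders, not the real index values from minigrid/core/constants.py.

```python
DOOR_STATES = {0: "open", 1: "closed", 2: "locked"}


def describe_tile(tile, idx_to_object, idx_to_color):
    """Decode one (OBJECT_IDX, COLOR_IDX, STATE) tuple into a dictionary."""
    object_idx, color_idx, state = (int(v) for v in tile)
    name = idx_to_object[object_idx]
    description = {"object": name, "color": idx_to_color[color_idx]}
    # STATE is only meaningful for doors, so it is reported for doors alone.
    if name == "door":
        description["door_state"] = DOOR_STATES[state]
    return description


# Toy mappings for illustration only; in practice, invert OBJECT_TO_IDX and
# COLOR_TO_IDX, e.g. {v: k for k, v in OBJECT_TO_IDX.items()}.
toy_objects = {0: "door", 1: "key"}
toy_colors = {0: "red", 1: "green"}
print(describe_tile((0, 1, 2), toy_objects, toy_colors))
```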
Rewards#
A reward of ‘1 - 0.9 * (step_count / max_steps)’ is given for success, and ‘0’ for failure.
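The formula can be written as a small function to see how the reward decays with episode length. This is a sketch of the arithmetic stated above, not the environment's source code; the function name `episode_reward` is hypothetical.

```python
def episode_reward(success, step_count, max_steps):
    """Return 1 - 0.9 * (step_count / max_steps) on success, 0 on failure."""
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    if not success:
        return 0.0
    return 1 - 0.9 * (step_count / max_steps)


# Succeeding sooner yields a higher reward; the value stays above zero
# even when success comes on the very last allowed step.
for steps in (0, 50, 100):
    print(steps, round(episode_reward(True, steps, 100), 3))
```

Under this formula a successful episode earns between roughly 0.1 (success at step max_steps) and 1 (success at step 0), so success is always distinguishable from failure.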
Termination#
The episode ends if any one of the following conditions is met:
The agent achieves the task.
Timeout (see max_steps).
Registered Configurations#
BabyAI-BossLevel-v0
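A registered configuration is typically instantiated through Gymnasium. The sketch below assumes the `gymnasium` and `minigrid` packages are installed and that the observation is a dictionary with a `"mission"` entry; the helper name `run_random_episode` is hypothetical, and the random policy is only a placeholder for a real agent.

```python
def run_random_episode(env_id="BabyAI-BossLevel-v0", seed=0):
    """Play one episode with random actions and report how it ended."""
    # Imported lazily so this module still loads where the packages are absent.
    import gymnasium as gym
    import minigrid  # noqa: F401  (importing registers the BabyAI environments)

    env = gym.make(env_id)
    obs, info = env.reset(seed=seed)
    mission = obs["mission"]  # assumed: the instruction string for this episode

    total_reward, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = env.action_space.sample()  # placeholder random policy
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1

    env.close()
    return mission, steps, total_reward, truncated
```

Calling `run_random_episode()` returns the mission text, the number of steps taken, the accumulated reward, and whether the episode stopped because of the timeout rather than by reaching a terminal state.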