Action Bonus#

class minigrid.wrappers.ActionBonus(env)[source]#

Wrapper that adds an exploration bonus: an extra reward that encourages visiting less-visited (state, action) pairs.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import ActionBonus
>>> env = gym.make("MiniGrid-Empty-5x5-v0")
>>> _, _ = env.reset(seed=0)
>>> _, reward, _, _, _ = env.step(1)
>>> print(reward)
0
>>> _, reward, _, _, _ = env.step(1)
>>> print(reward)
0
>>> env_bonus = ActionBonus(env)
>>> _, _ = env_bonus.reset(seed=0)
>>> _, reward, _, _, _ = env_bonus.step(1)
>>> print(reward)
1.0
>>> _, reward, _, _, _ = env_bonus.step(1)
>>> print(reward)
1.0

Dict Observation Space#

class minigrid.wrappers.DictObservationSpaceWrapper(env, max_words_in_mission=50, word_dict=None)[source]#

Transforms the observation space (that has a textual component) to a fully numerical observation space, where the textual instructions are replaced by arrays representing the indices of each word in a fixed vocabulary.

This wrapper is not applicable to BabyAI environments, given that these have their own language component.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import DictObservationSpaceWrapper
>>> env = gym.make("MiniGrid-LavaCrossingS11N5-v0")
>>> obs, _ = env.reset()
>>> obs['mission']
'avoid the lava and get to the green goal square'
>>> env_obs = DictObservationSpaceWrapper(env)
>>> obs, _ = env_obs.reset()
>>> obs['mission'][:10]
[19, 31, 17, 36, 20, 38, 31, 2, 15, 35]

Direction Obs#

class minigrid.wrappers.DirectionObsWrapper(env, type='slope')[source]#

Provides the slope or angular direction to the goal along with the observations, modeled as (y2 - y1) / (x2 - x1). type = {slope, angle}

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import DirectionObsWrapper
>>> env = gym.make("MiniGrid-LavaCrossingS11N5-v0")
>>> env_obs = DirectionObsWrapper(env, type="slope")
>>> obs, _ = env_obs.reset()
>>> obs['goal_direction']
1.0

FlatObs#

class minigrid.wrappers.FlatObsWrapper(env, maxStrLen=96)[source]#

Encode mission strings using a one-hot scheme, and combine these with observed images into one flat array.

This wrapper is not applicable to BabyAI environments, given that these have their own language component.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import FlatObsWrapper
>>> env = gym.make("MiniGrid-LavaCrossingS11N5-v0")
>>> env_obs = FlatObsWrapper(env)
>>> obs, _ = env_obs.reset()
>>> obs.shape
(2835,)

Fully Obs#

class minigrid.wrappers.FullyObsWrapper(env)[source]#

Fully observable gridworld using a compact grid encoding instead of the agent view.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import FullyObsWrapper
>>> env = gym.make("MiniGrid-LavaCrossingS11N5-v0")
>>> obs, _ = env.reset()
>>> obs['image'].shape
(7, 7, 3)
>>> env_obs = FullyObsWrapper(env)
>>> obs, _ = env_obs.reset()
>>> obs['image'].shape
(11, 11, 3)

Image Observation#

class minigrid.wrappers.ImgObsWrapper(env)[source]#

Use the image as the only observation output, discarding the language/mission string.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import ImgObsWrapper
>>> env = gym.make("MiniGrid-Empty-5x5-v0")
>>> obs, _ = env.reset()
>>> obs.keys()
dict_keys(['image', 'direction', 'mission'])
>>> env = ImgObsWrapper(env)
>>> obs, _ = env.reset()
>>> obs.shape
(7, 7, 3)

No Death#

class minigrid.wrappers.NoDeath(env, no_death_types: tuple[str, ...], death_cost: float = -1.0)[source]#

Wrapper to prevent death in specific cells (e.g., lava cells). Instead of dying, the agent will receive a negative reward.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import NoDeath
>>>
>>> env = gym.make("MiniGrid-LavaCrossingS9N1-v0")
>>> _, _ = env.reset(seed=2)
>>> _, _, _, _, _ = env.step(1)
>>> _, reward, term, *_ = env.step(2)
>>> reward, term
(0, True)
>>>
>>> env = NoDeath(env, no_death_types=("lava",), death_cost=-1.0)
>>> _, _ = env.reset(seed=2)
>>> _, _, _, _, _ = env.step(1)
>>> _, reward, term, *_ = env.step(2)
>>> reward, term
(-1.0, False)
>>>
>>>
>>> env = gym.make("MiniGrid-Dynamic-Obstacles-5x5-v0")
>>> _, _ = env.reset(seed=2)
>>> _, reward, term, *_ = env.step(2)
>>> reward, term
(-1, True)
>>>
>>> env = NoDeath(env, no_death_types=("ball",), death_cost=-1.0)
>>> _, _ = env.reset(seed=2)
>>> _, reward, term, *_ = env.step(2)
>>> reward, term
(-2.0, False)

Observation#

class minigrid.wrappers.ObservationWrapper(env: Env[ObsType, ActType])[source]#

Superclass of wrappers that can modify observations using observation() for reset() and step().

If you would like to apply a function only to the observation before passing it to the learning code, you can simply inherit from ObservationWrapper and override the observation() method to implement that transformation. If the transformation is already reflected by the env's observation space, nothing more is needed; otherwise, you need to specify the wrapper's new observation space by setting self.observation_space in the __init__() method of your wrapper.

Among others, Gymnasium provides the observation wrapper TimeAwareObservation, which adds information about the index of the timestep to the observation.

One Hot Partial Obs#

class minigrid.wrappers.OneHotPartialObsWrapper(env, tile_size=8)[source]#

Wrapper to get a one-hot encoding of a partially observable agent view as observation.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import OneHotPartialObsWrapper
>>> env = gym.make("MiniGrid-Empty-5x5-v0")
>>> obs, _ = env.reset()
>>> obs["image"][0, :, :]
array([[2, 5, 0],
       [2, 5, 0],
       [2, 5, 0],
       [2, 5, 0],
       [2, 5, 0],
       [2, 5, 0],
       [2, 5, 0]], dtype=uint8)
>>> env = OneHotPartialObsWrapper(env)
>>> obs, _ = env.reset()
>>> obs["image"][0, :, :]
array([[0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0]],
      dtype=uint8)

Reseed#

class minigrid.wrappers.ReseedWrapper(env, seeds=(0,), seed_idx=0)[source]#

Wrapper that always regenerates the environment with the same set of seeds. This can be used to force an environment to keep the same configuration on every reset.

Example

>>> import minigrid
>>> import gymnasium as gym
>>> from minigrid.wrappers import ReseedWrapper
>>> env = gym.make("MiniGrid-Empty-5x5-v0")
>>> _ = env.reset(seed=123)
>>> [env.np_random.integers(10) for i in range(10)]
[0, 6, 5, 0, 9, 2, 2, 1, 3, 1]
>>> env = ReseedWrapper(env, seeds=[0, 1], seed_idx=0)
>>> _, _ = env.reset()
>>> [env.np_random.integers(10) for i in range(10)]
[8, 6, 5, 2, 3, 0, 0, 0, 1, 8]
>>> _, _ = env.reset()
>>> [env.np_random.integers(10) for i in range(10)]
[4, 5, 7, 9, 0, 1, 8, 9, 2, 3]
>>> _, _ = env.reset()
>>> [env.np_random.integers(10) for i in range(10)]
[8, 6, 5, 2, 3, 0, 0, 0, 1, 8]
>>> _, _ = env.reset()
>>> [env.np_random.integers(10) for i in range(10)]
[4, 5, 7, 9, 0, 1, 8, 9, 2, 3]

RGB Img Obs#

class minigrid.wrappers.RGBImgObsWrapper(env, tile_size=8)[source]#

Wrapper to use a fully observable RGB image as observation. This can be used to have the agent solve the gridworld in pixel space.

Example

>>> import gymnasium as gym
>>> import matplotlib.pyplot as plt
>>> from minigrid.wrappers import RGBImgObsWrapper
>>> env = gym.make("MiniGrid-Empty-5x5-v0")
>>> obs, _ = env.reset()
>>> plt.imshow(obs['image'])  
![NoWrapper](../figures/lavacrossing_NoWrapper.png)
>>> env = RGBImgObsWrapper(env)
>>> obs, _ = env.reset()
>>> plt.imshow(obs['image'])  
![RGBImgObsWrapper](../figures/lavacrossing_RGBImgObsWrapper.png)

RGB Partial Img Obs#

class minigrid.wrappers.RGBImgPartialObsWrapper(env, tile_size=8)[source]#

Wrapper to use a partially observable RGB image as observation. This can be used to have the agent solve the gridworld in pixel space.

Example

>>> import gymnasium as gym
>>> import matplotlib.pyplot as plt
>>> from minigrid.wrappers import RGBImgObsWrapper, RGBImgPartialObsWrapper
>>> env = gym.make("MiniGrid-LavaCrossingS11N5-v0")
>>> obs, _ = env.reset()
>>> plt.imshow(obs["image"])  
![NoWrapper](../figures/lavacrossing_NoWrapper.png)
>>> env_obs = RGBImgObsWrapper(env)
>>> obs, _ = env_obs.reset()
>>> plt.imshow(obs["image"])  
![RGBImgObsWrapper](../figures/lavacrossing_RGBImgObsWrapper.png)
>>> env_obs = RGBImgPartialObsWrapper(env)
>>> obs, _ = env_obs.reset()
>>> plt.imshow(obs["image"])  
![RGBImgPartialObsWrapper](../figures/lavacrossing_RGBImgPartialObsWrapper.png)

Position Bonus#

class minigrid.wrappers.PositionBonus(env)[source]#

Adds an exploration bonus based on which positions are visited on the grid.

Note

This wrapper was previously called StateBonus.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import PositionBonus
>>> env = gym.make("MiniGrid-Empty-5x5-v0")
>>> _, _ = env.reset(seed=0)
>>> _, reward, _, _, _ = env.step(1)
>>> print(reward)
0
>>> _, reward, _, _, _ = env.step(1)
>>> print(reward)
0
>>> env_bonus = PositionBonus(env)
>>> obs, _ = env_bonus.reset(seed=0)
>>> obs, reward, terminated, truncated, info = env_bonus.step(1)
>>> print(reward)
1.0
>>> obs, reward, terminated, truncated, info = env_bonus.step(1)
>>> print(reward)
0.7071067811865475

Stochastic Action#

class minigrid.wrappers.StochasticActionWrapper(env=None, prob=0.9, random_action=None)[source]#

Add stochasticity to the agent's actions.

With probability prob, the agent's chosen action is executed as-is. Otherwise, if random_action is provided it is executed instead; else a random action is sampled from the action space.
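
The mechanic can be sketched independently of the wrapper; the function below is an illustration of the behavior, not the wrapper's source:

```python
import random


def stochastic_action(intended, n_actions, prob=0.9, random_action=None, rng=None):
    """With probability `prob`, execute the intended action; otherwise
    substitute `random_action` if given, else a uniformly sampled action.
    (Illustrative sketch only.)"""
    rng = rng or random.Random()
    if rng.random() < prob:
        return intended
    if random_action is not None:
        return random_action
    return rng.randrange(n_actions)


rng = random.Random(0)
actions = [stochastic_action(1, 7, prob=0.9, rng=rng) for _ in range(1000)]
# The executed action matches the intended one roughly 90% of the time
```
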

Symbolic Obs#

class minigrid.wrappers.SymbolicObsWrapper(env)[source]#

Fully observable grid with a symbolic state representation. The symbol is a triple of (X, Y, IDX), where X and Y are the coordinates on the grid, and IDX is the id of the object.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import SymbolicObsWrapper
>>> env = gym.make("MiniGrid-LavaCrossingS11N5-v0")
>>> obs, _ = env.reset()
>>> obs['image'].shape
(7, 7, 3)
>>> env_obs = SymbolicObsWrapper(env)
>>> obs, _ = env_obs.reset()
>>> obs['image'].shape
(11, 11, 3)

View Size#

class minigrid.wrappers.ViewSizeWrapper(env, agent_view_size=7)[source]#

Wrapper to customize the agent's field-of-view size. This cannot be used with fully observable wrappers.

Example

>>> import gymnasium as gym
>>> from minigrid.wrappers import ViewSizeWrapper
>>> env = gym.make("MiniGrid-LavaCrossingS11N5-v0")
>>> obs, _ = env.reset()
>>> obs['image'].shape
(7, 7, 3)
>>> env_obs = ViewSizeWrapper(env, agent_view_size=5)
>>> obs, _ = env_obs.reset()
>>> obs['image'].shape
(5, 5, 3)