Reinforcement Learning using Gazebo


I’m new to Gazebo Fortress and my company is working on a Reinforcement Learning algorithm. We were using Unity to simulate our agent but we’ve reached the limitations of the game engine.

I was able to set up our robot and a world.

In order for RL to work, we need rewards and penalties. From what I understand, the TriggeredPublisher plugin could be used to send a message on a topic when a collision with a reward is detected.

The issue I’m now faced with is how to remove rewards/penalties from the world when they are touched by the robot.

Would anyone happen to have an idea?


Couldn’t you also subscribe to the robot’s position and do the reward masking in your processing code?
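
A minimal sketch of that masking idea, assuming the robot position arrives as an (x, y) pair from whatever subscription you use. The class name `RewardMask`, the pickup radius and the reward coordinates are hypothetical and only there for illustration; nothing here talks to Gazebo, the reward objects simply stay in the world and are ignored once collected.

```python
import math


class RewardMask:
    """Tracks which reward objects are still active, purely in client code."""

    def __init__(self, rewards, radius):
        # rewards: name -> (x, y, value); radius: assumed pickup distance in metres
        self.rewards = dict(rewards)
        self.radius = radius
        self.collected = set()

    def step(self, robot_xy):
        """Return the reward earned at this position and mask whatever was touched."""
        total = 0.0
        for name, (x, y, value) in self.rewards.items():
            if name in self.collected:
                continue  # already picked up, so it no longer counts
            if math.dist(robot_xy, (x, y)) <= self.radius:
                self.collected.add(name)
                total += value
        return total


# Hypothetical layout: one reward and one penalty.
mask = RewardMask(
    {"reward_yellow": (0.0, -2.0, 1.0), "penalty_red": (3.0, 0.0, -1.0)},
    radius=0.3,
)
```

Calling `mask.step(...)` once per simulation step with the latest robot position gives the reward for that step; resetting an episode is just a matter of clearing `mask.collected`.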

Otherwise, you could try using the breadcrumbs system like in subt/cave_circuit_practice_01.sdf at master · osrf/subt · GitHub and . The breadcrumb plugin has a config option telling how many deployments are allowed, and publishes a ~/remaining topic with the remaining number of deployments.

Sounds very interesting. Could you maybe write up a bit more about this?


I found a way to make it work by compiling Gazebo from source to take advantage of a new TriggeredPublisher feature that allows it to call a service (gz-sim/pull/1611). By combining the contact sensor, the TouchPlugin and the TriggeredPublisher, I’m able to detect the collision with a reward model and then send a service request to delete the reward model from the world.

I’d post a video of the result, but I can’t upload it since I’m a new user.

[Edit: I can now add the video]


If it can be useful to anyone, a typical sphere reward would be defined like this:

<model name="reward_yellow">
  <pose>0 -2 0.05 0 0 0</pose>
  <link name="link">
    <visual name="v2">
      <!-- geometry omitted -->
      <material>
        <ambient>1 1 0 1</ambient>
        <diffuse>1 1 0 1</diffuse>
        <specular>1 1 0 1</specular>
        <emissive>1 1 0 1</emissive>
      </material>
    </visual>
    <collision name="c2">
      <!-- geometry omitted -->
    </collision>
    <sensor name='sensor_contact' type='contact'>
      <!-- contact settings omitted -->
    </sensor>
  </link>
  <plugin filename="" name="ignition::gazebo::systems::TouchPlugin">
    <!-- target, namespace, time and enabled settings omitted -->
  </plugin>
  <plugin filename="" name="ignition::gazebo::systems::TriggeredPublisher">
    <input type="ignition.msgs.Boolean" topic="/reward_yellow/touched">
      <match>data: true</match>
    </input>
    <service name="/world/demo/remove" reqType="ignition.msgs.Entity" repType="ignition.msgs.Boolean" timeout="1000" reqMsg='name: "reward_yellow", type: 2'></service>
  </plugin>
</model>
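
For debugging, the same remove request can be sent by hand with the `ign service` command-line tool. The Python sketch below only builds that command from the values in the snippet above (world `demo`, model `reward_yellow`); the helper name `remove_model_command` is hypothetical, and actually running the command assumes a Fortress simulation is up with that world loaded.

```python
import shlex


def remove_model_command(world, model, timeout_ms=1000):
    """Build the `ign service` call that asks the world to remove a model."""
    # type 2 is MODEL in the ignition.msgs.Entity type enum
    req = f'name: "{model}", type: 2'
    return [
        "ign", "service",
        "-s", f"/world/{world}/remove",
        "--reqtype", "ignition.msgs.Entity",
        "--reptype", "ignition.msgs.Boolean",
        "--timeout", str(timeout_ms),
        "--req", req,
    ]


cmd = remove_model_command("demo", "reward_yellow")
print(shlex.join(cmd))  # shell-quoted form, ready to paste into a terminal

# To send it from Python instead (needs a running simulation):
# import subprocess
# subprocess.run(cmd, check=True)
```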

Hello Alex_SSoM,

Maybe you have seen it already, but there is the Gym-Ignition project, which provides a programmatic Python interface for ~Ignition~ Gazebo with a focus on Reinforcement Learning. I have used it with Gazebo Fortress, and it works fine for my purposes. However, it might not work with Gazebo Garden straight away due to the name change and other breaking changes.

Collisions can be checked programmatically (by listing per-object/link contacts). If you want to give a reward upon reaching a target position in 2D/3D, you can also just compute the distance to the target and apply a threshold if desired.
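
The distance check could look roughly like this. It is a plain-Python sketch that does not use the Gym-Ignition API; the function name `reach_reward`, the 0.1 m threshold and the bonus value are assumptions for illustration, and the positions are whatever your interface reports for the robot and the target.

```python
import math


def reach_reward(position, target, threshold=0.1, bonus=1.0):
    """Sparse reward: `bonus` once the robot is within `threshold` of the target.

    Works for 2D or 3D points as long as both have the same length.
    Returns (reward, reached) so the caller can also end the episode.
    """
    distance = math.dist(position, target)
    reached = distance <= threshold
    return (bonus if reached else 0.0), reached
```

For example, `reach_reward((0, 0, 0), (0, 0, 0.05))` is within the default threshold and yields the bonus, while a robot one metre away gets nothing.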

Using the programmatic interface of the Gazebo server directly reduces some transport overhead and also increases determinism by avoiding the stochastic nature of socket-based communication. A reliable interface might not be provided for all features, in which case you can still fall back on Gazebo Transport and/or ROS (2).


Hi Andre,

I wasn’t aware of Gym-Ignition (I’ll take a look at it). So far, the solution I found only uses the ros-ign-bridge and it’s working fine.

We’re actually developing code that will need to run on a robot so having to deal with stochastic communications is a requirement anyway.
