I’m new to Gazebo Fortress and my company is working on a Reinforcement Learning algorithm. We were using Unity to simulate our agent but we’ve reached the limitations of the game engine.
I was able to set up our robot and a world.
For RL to work, we need rewards and penalties. From what I understand, the TriggeredPublisher plugin could be used to send a message on a topic when a collision with a reward is detected.
The issue I’m now faced with is how to remove a reward/penalty from the world once it’s touched by the robot.
Would anyone happen to have an idea?
Can’t you also subscribe to the robot’s position and do the reward masking in your processing code?
Otherwise, you could try using the breadcrumbs system, as in subt/cave_circuit_practice_01.sdf (osrf/subt on GitHub, master branch) and https://app.gazebosim.org/OpenRobotics/fuel/models/Medium%20Rock%20Fall . The breadcrumb plugin has a config option that sets how many deployments are allowed, and it publishes the remaining number of deployments on a ~/remaining topic.
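To illustrate the first suggestion (masking rewards in your own processing code rather than deleting models from the world), here is a minimal Python sketch. It assumes you already receive the robot's 2D position from somewhere, for example a pose topic. The class name `RewardMask`, the `touch_radius` parameter and the reward layout are all hypothetical and not part of any Gazebo API.

```python
import math


class RewardMask:
    """Tracks which rewards are still active, without touching the simulator."""

    def __init__(self, rewards, touch_radius):
        # rewards: dict mapping a reward name to (x, y, value). Hypothetical layout.
        self.rewards = dict(rewards)
        self.touch_radius = touch_radius
        self.collected = set()

    def update(self, robot_xy):
        """Return the reward earned at this position and mask anything collected."""
        total = 0.0
        for name, (x, y, value) in self.rewards.items():
            if name in self.collected:
                continue  # already collected, so it is masked out
            if math.hypot(robot_xy[0] - x, robot_xy[1] - y) <= self.touch_radius:
                self.collected.add(name)
                total += value
        return total


# Usage: call update() every time a new robot position arrives.
mask = RewardMask({"reward_yellow": (0.0, -2.0, 1.0)}, touch_radius=0.3)
print(mask.update((0.0, -1.9)))
```

With this approach the reward model stays visible in the world, but it can only pay out once per episode. Clearing `collected` would reset it for the next episode.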
Sounds very interesting. Could you maybe write up a bit more about this?
I found a way to make it work by compiling Gazebo from source to take advantage of a new TriggeredPublisher feature that allows it to call a service (gz-sim/pull/1611). By combining the contact sensor, the TouchPlugin and the TriggeredPublisher, I’m able to detect the collision with a reward model and then send a service request to delete the reward model from the world.
I’d post a video of the result, but I can’t upload it since I’m a new user.
[Edit: I can now add the video]
If it can be useful to anyone, a typical sphere reward would be defined like this:
<pose>0 -2 0.05 0 0 0</pose>
<ambient>1 1 0 1</ambient>
<diffuse>1 1 0 1</diffuse>
<specular>1 1 0 1</specular>
<emissive>1 1 0 1</emissive>
<sensor name='sensor_contact' type='contact'>
<plugin filename="libignition-gazebo-touchplugin-system.so" name="ignition::gazebo::systems::TouchPlugin">
<plugin filename="libignition-gazebo-triggered-publisher-system.so" name="ignition::gazebo::systems::TriggeredPublisher">
<input type="ignition.msgs.Boolean" topic="/reward_yellow/touched">
<service name="/world/demo/remove" reqType="ignition.msgs.Entity" repType="ignition.msgs.Boolean" timeout="1000" reqMsg='name: "reward_yellow", type: 2'></service>
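The same removal request can presumably also be sent from outside the simulation, for instance from a training script. Below is a minimal Python sketch that builds the equivalent `ign service` command line. It assumes the Fortress `ign` CLI is on the PATH and reuses the world name `demo` and the request fields from the snippet above. The helper names `build_remove_command` and `remove_model` are hypothetical.

```python
import subprocess


def build_remove_command(world, model, timeout_ms=1000):
    """Build the `ign service` argument list that asks a world to remove a model."""
    return [
        "ign", "service",
        "-s", f"/world/{world}/remove",
        "--reqtype", "ignition.msgs.Entity",
        "--reptype", "ignition.msgs.Boolean",
        "--timeout", str(timeout_ms),
        # type: 2 is the model entity type, as in the reqMsg above
        "--req", f'name: "{model}", type: 2',
    ]


def remove_model(world, model):
    """Run the command. This needs a running Gazebo server to succeed."""
    return subprocess.run(
        build_remove_command(world, model), capture_output=True, text=True
    )


# Usage: print the command without running it.
print(" ".join(build_remove_command("demo", "reward_yellow")))
```

Keeping the command construction separate from the call makes it easy to inspect or log the request before anything is sent to the simulator.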
Maybe you have seen it already, but there is the Gym-Ignition project, which provides a programmatic Python interface for ~Ignition~ Gazebo with a focus on Reinforcement Learning. I have used it with Gazebo Fortress, and it works fine for my purposes. However, it might not work with Gazebo Garden straight away because of the rename and other breaking changes.
Collisions can be checked programmatically (by listing per-object/link contacts). If you want to give a reward upon reaching a target position in 2D/3D, you can also just compute the distance to the target and apply a threshold if desired.
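As a rough illustration of the distance-based option, here is a small Python sketch. It is independent of Gym-Ignition and of any Gazebo API. The function name `distance_reward` and the reward values are assumptions made for the example, and positions are plain coordinate tuples.

```python
import math


def distance_reward(robot_pos, target_pos, threshold=None):
    """Return (reward, reached) from the Euclidean distance to the target.

    Without a threshold the reward is dense: the negative distance.
    With a threshold the reward is sparse: 1.0 once within range, else 0.0.
    """
    dist = math.dist(robot_pos, target_pos)  # works for 2D and 3D points
    if threshold is None:
        return -dist, False
    reached = dist <= threshold
    return (1.0 if reached else 0.0), reached


# Usage: dense reward in 3D, then a thresholded reward in 2D.
print(distance_reward((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
print(distance_reward((0.0, 0.0), (0.0, 0.1), threshold=0.2))
```

The `reached` flag could double as the episode termination signal when the target is a goal position.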
Using the programmatic interface of the Gazebo server directly reduces some transport overhead and also increases determinism, because it avoids the stochastic nature of socket-based communication. A reliable interface might not be provided for every feature, in which case you can still fall back on Gazebo Transport and/or ROS (2).
I wasn’t aware of Gym-Ignition (I’ll take a look at it). So far, the solution I found only uses the ros-ign-bridge and it’s working fine.
We’re actually developing code that will need to run on a robot so having to deal with stochastic communications is a requirement anyway.