Run Gazebo simulation on HPC cluster

I’m trying to run a Gazebo simulation with ROS 2 on an HPC cluster, but I can’t get it to run properly. In more detail, I’m using a Singularity/Apptainer container generated from a .def file, and everything runs correctly on a personal laptop.

When I execute the command ign gazebo -s world_file.sdf I don’t get any error but at the same time I’m not able to see any Gazebo topic or get any feedback from the terminal.

Do you have any suggestions that can help me?

Sorry for the generic question, but I haven’t figured out exactly where the problem is coming from.

We ran that without big issues on our HPC cluster via singularity.

A few things to consider:

  1. Gazebo uses multicast for node discovery. HPC clusters usually don’t like it so don’t expect multiple Gazebo parts to communicate between HPC nodes. The amount of discovery traffic can grow quite large if you run multiple Gazebos in parallel. We were forced to limit the discovery to localhost only by our admin.
  2. The same holds for ROS 2 comms.
  3. When running on localhost, everything should work (i.e. gz topic on the HPC node where Gazebo is running).
  4. You should set GZ_PARTITION and ROS_DOMAIN_ID manually for each HPC task, because the defaults (hostname/username for GZ_PARTITION, 0 for ROS_DOMAIN_ID) are the same for every task on one HPC node.
  5. For Gazebo comms to work, all parts have to have the same value of GZ_TRANSPORT_TOPIC_STATISTICS (or the IGN counterpart if you’re still on Dome). E.g. the SubT simulator sets this variable to its nondefault value.
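
Points 1 to 4 could be applied in a job script roughly like this. It is only a sketch under a few assumptions, not the exact setup described above: it assumes a Slurm scheduler (SLURM_JOB_ID), it assumes GZ_IP/IGN_IP and ROS_LOCALHOST_ONLY are how discovery gets pinned to localhost (newer ROS 2 releases use a different variable for this), and the sim_ prefix is just an illustrative name.

```shell
#!/usr/bin/env bash
# Sketch: per-task isolation settings for Gazebo and ROS 2 on a shared HPC node.

# Assumption: Slurm exports SLURM_JOB_ID; fall back to the shell PID elsewhere.
task_id="${SLURM_JOB_ID:-$$}"

# One Gazebo partition per task. IGN_PARTITION is the name used by
# older Ignition releases; setting both is harmless.
export GZ_PARTITION="sim_${task_id}"
export IGN_PARTITION="${GZ_PARTITION}"

# ROS_DOMAIN_ID has to be a small integer; 0-101 is safe on all platforms.
# The modulo can collide between tasks, so check this against your job IDs.
export ROS_DOMAIN_ID=$(( task_id % 102 ))

# Assumption: these keep Gazebo and ROS 2 traffic on the loopback interface.
export GZ_IP=127.0.0.1
export IGN_IP=127.0.0.1
export ROS_LOCALHOST_ONLY=1

echo "partition=${GZ_PARTITION} domain=${ROS_DOMAIN_ID}"
```

If this is sourced in the job script before the server is launched, any ign topic -l call made from the same shell inherits the same partition and should list the server's topics.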

To debug the comms issue, add -v4 to your command when launching Gazebo. The verbose output may show some errors.

You don’t say whether you’re testing Gazebo comms or ROS comms. So, which one?

Thanks for the reply @peci1. Can you explain which value I should set for IGN_TRANSPORT_TOPIC_STATISTICS? As far as I understand from the documentation, this is just a parameter used to quantify transport performance between a publisher and a subscriber. In my case I need every simulation to be completely isolated, so I don’t want any communication between different simulations.

It can have either value 0 or 1, but the value needs to be consistent among all Gazebo nodes that need to communicate (e.g. Gz server, Gz client/GUI, Gz-ROS bridge nodes etc.).

Disabling comms between different “instances” is done using GZ_PARTITION.
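
One way to keep these two settings consistent is to start every process of one simulation through a small wrapper. The sketch below is illustrative only: run_in_sim is a hypothetical helper name, and the value 0 for the statistics flag is an arbitrary choice (either value works as long as all processes agree).

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command with the environment shared by one
# simulation instance.
# Usage: run_in_sim <partition> <command> [args...]
run_in_sim() {
    local partition="$1"
    shift
    # Same partition and same statistics flag for every process of the
    # instance; the IGN_ names cover older Ignition releases.
    env GZ_PARTITION="$partition" \
        IGN_PARTITION="$partition" \
        GZ_TRANSPORT_TOPIC_STATISTICS=0 \
        IGN_TRANSPORT_TOPIC_STATISTICS=0 \
        "$@"
}
```

For example, run_in_sim sim_a ign gazebo -s world_file.sdf starts a server, and run_in_sim sim_a ign topic -l should then list its topics, while anything started with a different partition name stays separate.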
