Gazebo Infrastructure updates from June 2023

June was a month of behind-the-scenes improvements for the OSRF Infrastructure project committee, but it was a full one nonetheless.

We’re also welcoming a new project committer: @marcogg is joining us primarily to resume the work on ROS packaging infrastructure for Rust-based packages. Welcome Marco!

Without further ado, here are the June updates.


Re-run outdated CI jobs on build.osrfoundation.org

The infrastructure team helps keep an eye on the overall health of Gazebo CI builds. Until now, jobs have been triggered only by changes in the target project repository. When a repository is very quiet, environmental changes can cause breakages that aren’t discovered or triaged until a new change is proposed. After determining that there is not sufficient need for full periodic builds of all packages, @Crola1702 set about automating the manual heuristic that he has been using to trigger new builds for jobs that have become very stale. The result is https://github.com/gazebo-tooling/release-tools/pull/962, which will trigger a few outdated jobs every hour until all builds are caught up to the threshold. Triggering only a few jobs at a time avoids overwhelming the build queues.
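As a rough illustration of the idea (this is not the code from the pull request), the selection step can be sketched in Python. The function name, the 30-day threshold, and the batch size of three are all hypothetical; the real values live in release-tools.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical values for illustration only.
STALE_AFTER = timedelta(days=30)
MAX_TRIGGERS_PER_RUN = 3


def pick_stale_jobs(last_builds, now, stale_after=STALE_AFTER,
                    limit=MAX_TRIGGERS_PER_RUN):
    """Return up to `limit` job names whose last build is older than
    `stale_after`, oldest first.

    `last_builds` maps a job name to the datetime of its last build.
    """
    stale = [(ts, name) for name, ts in last_builds.items()
             if now - ts > stale_after]
    stale.sort()  # oldest timestamps first
    return [name for _, name in stale[:limit]]


if __name__ == "__main__":
    now = datetime(2023, 7, 1, tzinfo=timezone.utc)
    builds = {
        "job-a": now - timedelta(days=90),
        "job-b": now - timedelta(days=10),
        "job-c": now - timedelta(days=45),
    }
    # An hourly scheduler would call this and trigger the returned jobs.
    print(pick_stale_jobs(builds, now, limit=2))
```

Capping each run at a small batch means the backlog drains gradually instead of flooding the queue in one go.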


Unofficial Gazebo Garden packages available for ROS 2 Humble

Gazebo Garden is not currently available in any binary ROS 2 distribution, although the ros_gz bridge supports from-source compilation using Gazebo Garden for different ROS 2 distributions. Building from source is the recommended route for advanced users who really want to use Garden on ROS 2 Humble or ROS 2 Rolling.

Gazebo Classic version 11 and ROS 1 were in this situation previously. The Gazebo project generated unofficial packages named gazebo11_ros_pkgs, available in the OSRF repositories on packages.osrfoundation.org, which could be used to install Gazebo11 instead of Gazebo9 on Melodic and others.

@jrivero has put together unofficial packages for users of Gazebo Garden and ROS 2 Humble, which should make installation easier for those interested in this more advanced use case.

It’s worth pointing out a third time that these packages are unofficial, and that the ROS project only supports using binary packages from the ROS repositories and the upstream dependencies used on the ROS build farms. The Gazebo project is maintaining these packages on a best-effort basis and support for them is not guaranteed.


Reducing hosting costs for OSRF build farm

@jrivero and I spent some time last month reviewing hosting resources for the build farms (both Gazebo and ROS) and were able to clean up a number of dangling resources that were no longer needed, resulting in significant cost savings on OSRF’s infrastructure budget. We’re using what we learned from this ad-hoc cleanup to revise our provisioning tools so that they clean up automatically as we go, reducing this overhead in the future.

While we were performing this manual cleanup, we got a little over-zealous and de-provisioned some images that were still being used in production. This prevented new agents in some auto-scaling groups from back-filling after spot capacity rebalances, leaving some pools with fewer agents. Luckily, other members of the Infrastructure team spotted this and were able to quickly build new images using our internal infrastructure.
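One way to guard against this kind of mistake is to check cleanup candidates against everything that still references them before deleting anything. The sketch below is only an illustration of that check, not our provisioning tooling: the function name and the shape of the `launch_configs` records are hypothetical, and a real version would need to gather those records from the cloud provider.

```python
def images_safe_to_delete(candidate_images, launch_configs):
    """Return the candidate image IDs that no launch configuration references.

    `launch_configs` is a hypothetical list of dicts, each with an
    "image_id" key naming the image that configuration boots from.
    """
    in_use = {cfg["image_id"] for cfg in launch_configs}
    return sorted(set(candidate_images) - in_use)


if __name__ == "__main__":
    candidates = ["ami-old1", "ami-old2", "ami-live"]
    configs = [{"name": "pool-a", "image_id": "ami-live"}]
    # "ami-live" is still referenced, so it is excluded from the result.
    print(images_safe_to_delete(candidates, configs))
```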

Another large saving for us is coming from the in-progress migration of the EBS volumes, which provide the logical disk storage for our hosts in AWS, from gp2 to the next-generation gp3 storage.
Not only does gp3 have a lower cost per GB-month, but the IOPS allocation is no longer tied to volume size. Our gp2 volumes were generally over-provisioned on storage in order to maximize our IOPS allocation and avoid disk I/O bottlenecks during builds. What was the maximum default IOPS allocation on gp2 is now the baseline on gp3, regardless of volume size. We were able to reduce the storage provisioned for some classes of build farm agents from 1000 GB down to 250 GB or 350 GB without a penalty on IOPS. Together with the savings from the reduced cost per GB-month, this left us easily able to budget for extra provisioned throughput to make up for the lower baseline on gp3 (125 MiB/s, versus 250 MiB/s on gp2) and still net a significant cost reduction.
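To make the arithmetic concrete, here is a small sketch comparing a 1000 GB gp2 volume with a 250 GB gp3 volume provisioned at 250 MiB/s. The prices are assumptions based on typical us-east-1 list prices and are not our actual bill; they vary by region and over time. The gp2 IOPS formula (3 IOPS per GB, with a floor of 100 and a ceiling of 16,000) and the gp3 baselines (3,000 IOPS, 125 MiB/s) follow the published EBS volume characteristics.

```python
# Assumed list prices in USD per month (illustrative; check current pricing).
GP2_PRICE_PER_GB = 0.10
GP3_PRICE_PER_GB = 0.08
GP3_PRICE_PER_EXTRA_MIBPS = 0.04  # per MiB/s provisioned above the baseline

GP3_BASELINE_IOPS = 3000
GP3_BASELINE_MIBPS = 125


def gp2_baseline_iops(size_gb):
    """gp2 baseline IOPS scale with size: 3 per GB, clamped to [100, 16000]."""
    return max(100, min(16000, 3 * size_gb))


def gp2_monthly_cost(size_gb):
    """gp2 is billed on provisioned storage only."""
    return size_gb * GP2_PRICE_PER_GB


def gp3_monthly_cost(size_gb, throughput_mibps=GP3_BASELINE_MIBPS):
    """gp3 storage cost plus any throughput provisioned above the baseline."""
    extra = max(0, throughput_mibps - GP3_BASELINE_MIBPS)
    return size_gb * GP3_PRICE_PER_GB + extra * GP3_PRICE_PER_EXTRA_MIBPS


if __name__ == "__main__":
    print("gp2 1000 GB baseline IOPS:", gp2_baseline_iops(1000))
    print("gp3 baseline IOPS (any size):", GP3_BASELINE_IOPS)
    print("gp2 1000 GB monthly cost:", round(gp2_monthly_cost(1000), 2))
    print("gp3 250 GB @ 250 MiB/s monthly cost:",
          round(gp3_monthly_cost(250, 250), 2))
```

Under these assumed prices, the smaller gp3 volume keeps the same 3,000 baseline IOPS as the 1000 GB gp2 volume and costs roughly a quarter as much, even after paying for the extra throughput.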


That’s all for now. Happy Friday!
