Learn about ECMWF, one of the 5 nominees for the Superuser Awards in 2025.

Who do you think should win the 2025 Superuser Awards? The annual Superuser Awards recognize organizations that have used open infrastructure to improve their business while contributing back to the community.

This year, the Superuser Awards winner will be announced at the OpenInfra Summit Europe, October 17-19! Join us at the annual OpenInfra Summit for an opportunity to collaborate directly with the international community of people building and running open source infrastructure using Linux, StarlingX, OpenStack, Kubernetes, Kata Containers and 30+ other technologies. Get your Summit tickets now!

ECMWF is one of the 5 nominees for the Superuser Awards 2025. Check out why its team is getting nominated:

ECMWF is the European Centre for Medium-Range Weather Forecasts.
We are both a research institute and a 24/7 operational service, producing global numerical weather predictions and other data for our Member and Co-operating States and the broader community.

The people behind the cloud infrastructure are:

  • The Cloud Infrastructure Team: Charalampos Kominos, Cristina Duma, Enrico Favero, Manuel Viceiro, Ricardo Correa
  • The User Support Team: Xavier Abellan, Roberto Cuccu, Samuel Langlois
  • The Platform Engineering Team: Marcos Hemirda Mera, Douglas Lock, Manuel Martins, Rakesh Prithiviraj, Giuseppe Misurelli

How has open infrastructure transformed the organization’s business?

Open infrastructure is playing a growing role in how ECMWF delivers weather and climate data to the world. Through OpenStack and other open source technologies, ECMWF:

  1. Provides a community cloud as a founding partner of the European Weather Cloud, enabling collaboration across the national meteorological services of Member and Co-operating States.
  2. Delivers over 3 PB of data per month worldwide, including datasets from the Copernicus Climate Change Service and Copernicus Atmosphere Monitoring Service.
  3. Strengthens collaboration with partners around the world, reducing barriers to data access and enabling shared innovation.

How has the organization participated in or contributed to an open source project?

Besides contributing to a number of public projects, ECMWF releases most of the software, tools and resources it develops as open source, as its more than 140 public repositories show. Many of these result from fruitful co-operation with its Member States; one example is Anemoi, an open source framework to build, train, and run data-driven machine learning weather forecast models. As far as the OpenInfra community is concerned, we participate through code contributions, bug reports, documentation feedback, upstream discussions, and presentations at the OpenInfra Summit. ECMWF’s teams have been actively contributing since 2019, and the organization became an OpenInfra Foundation Associate Member in 2021.

What open source technologies does the organization use in its open infrastructure environment?

Our cloud environment integrates a wide range of open source projects, including:

  1. Core Infrastructure: OpenStack, Kubernetes, Ceph
  2. Automation & Operations: OpenTofu, Ansible, Cluster API, Azimuth, Magnum
  3. Monitoring & Observability: Prometheus, Grafana, Alertmanager, OpenSearch

This open infrastructure stack supports both production workloads and scientific research at scale.

What is the scale of your open infrastructure environment?

Our Common Cloud Infrastructure (CCI) consists of:

  • Two production clouds (CCI-1 and CCI-2) hosted in separate datacenter halls for redundancy, plus a Test and Validation (TAV) environment.
  • 68 compute nodes with 19,456 cores and 117 TiB of memory.
  • 32 Nvidia Ampere A100 80 GB GPUs for AI/ML workloads.
  • 11 PB Ceph-backed storage.
  • Support for roughly 2,000 VMs and 100 Kubernetes clusters in production.

The infrastructure is continuously expanding to meet growing demands for data storage, computational power, and new services. This growth ensures ECMWF can keep pace with the increasing needs of the Copernicus Climate and Atmosphere Services (C3S, CAMS), as well as the broader scientific community relying on timely and reliable access to weather and climate data.

What kind of operational challenges have you overcome during your experience with open infrastructure?

Running a mission-critical OpenStack cloud for weather and climate data is challenging — downtime impacts thousands of scientists worldwide. Some key challenges we’ve overcome include:

  1. Massive data migration: Transferring petabytes of data from our UK site (Reading) to our new datacenter in Bologna — over the public internet — with minimal disruption to users.
  2. Rolling upgrades: Performing complex OpenStack upgrades while maintaining service continuity is always challenging. We rely heavily on our in-house expertise, as well as external partners for this.
  3. Maintaining operational excellence: Running a mission-critical OpenStack deployment for a global scientific community requires balancing stability, performance, and innovation.

How is this team innovating with open infrastructure?

Innovation is at the heart of ECMWF’s use of open infrastructure. Key examples include:

  1. European Weather Cloud founder: Providing this community cloud for use by Member and Co-operating States.
  2. AI for weather forecasting: Providing cloud-based GPU resources that complement ECMWF’s HPC, enabling the development of next-generation AI/ML models (such as AIFS) alongside traditional numerical weather prediction.
  3. Global data distribution: Scaling our infrastructure to handle exponential growth in demand for open Copernicus data (CAMS, C3S, ERA5), ensuring global accessibility to petabyte-scale datasets.

 

Allison Price