The following is a list of events at the 2025 ESA Living Planet Symposium that involve cloud-native geospatial technologies.

Sunday 22 June

5 events

Sunday 22 June 15:30 - 16:50 (Hall L3)

Tutorial: D.03.17 TUTORIAL - Cloud-Native Earth Observation Processing with SNAP and Copernicus Data Space Ecosystem CDSE

#cloud-native

This tutorial will provide participants with practical skills for deploying ESA’s SNAP in cloud environments, leveraging containerization, Python integration, and the Copernicus Data Space Ecosystem (CDSE). The 90-minute session combines conceptual foundations, live demonstrations, and guided exercises to enable operational EO data analysis directly within cloud infrastructure.

1. Introduction to SNAP and CDSE (15 minutes)
• SNAP Overview: Highlight new features, including enhanced Python support via snappy and SNAPISTA, containerized deployment options, and hyperspectral data support.
• CDSE Architecture: Explore the CDSE’s data catalog, processing tools, and Jupyter environment, emphasizing its role in reducing data transfer costs through in-situ analysis.

2. Containerized SNAP Deployment (15 minutes)
• Container Fundamentals: Contrast Docker containers with SNAP’s snap packaging, addressing isolation challenges (e.g., subprocess confinement) and scalability.
• Cloud Deployment: Walk through launching pre-configured SNAP containers on CDSE, including resource allocation and persistent storage setup.

3. Python-Driven Processing with SNAPISTA and Snappy (25 minutes)
• Snappy and SNAPISTA: Understand the low-level Java-Python bridge (snappy) and SNAPISTA’s high-level API for graph generation, including performance trade-offs.
• Operational Workflows: Build a Python script using SNAPISTA to batch-process Sentinel data on CDSE, incorporating cloud-optimized I/O and error handling.
• Integration with CDSE APIs: Retrieve CDSE catalog metadata, subset spatial/temporal ranges, and pipe results directly into SNAP operators without local downloads.
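
As an illustration of the batch-processing workflow outlined in this part, the following minimal sketch assumes snapista's Graph/Operator interface, a placeholder CDSE data path and an example AOI; the actual operators and parameters used in the exercise may differ.

```python
# Minimal sketch: batch-apply a simple SNAP graph (Read -> Subset -> Write)
# to a list of Sentinel-2 products with snapista. Paths, the AOI polygon and
# output names are placeholders; adapt operator parameters to your use case.
from pathlib import Path

from snapista import Graph, Operator

AOI_WKT = "POLYGON((11.2 46.4, 11.6 46.4, 11.6 46.7, 11.2 46.7, 11.2 46.4))"
products = sorted(Path("/data/cdse/sentinel-2").glob("S2*_MSIL2A_*.SAFE"))

for product in products:
    graph = Graph()
    graph.add_node(
        operator=Operator("Read", file=str(product)),
        node_id="read",
    )
    graph.add_node(
        operator=Operator("Subset", geoRegion=AOI_WKT, copyMetadata="true"),
        node_id="subset",
        source="read",
    )
    graph.add_node(
        operator=Operator(
            "Write",
            file=f"/data/output/{product.stem}_subset",
            formatName="GeoTIFF",
        ),
        node_id="write",
        source="subset",
    )
    try:
        graph.run()  # executes the graph through SNAP's GPT engine
    except Exception as exc:  # keep the batch going if one product fails
        print(f"{product.name}: {exc}")
```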

4. Jupyter-Based Analytics and Collaboration (20 minutes)
• Jupyter Lab on CDSE: Navigate the pre-installed environment, accessing SNAP kernels, GPU resources, and shared datasets.
• Reproducible Workflows: Convert SNAP Graph Processing Tool (GPT) XML workflows into Jupyter notebooks, leveraging snapista for modular code generation.
• Collaboration Features: Demonstrate version control, real-time co-editing, and result sharing via CDSE’s portal.

5. Best Practices and Q&A (15 minutes)
• Q&A: Address participant challenges in adapting legacy SNAP workflows to cloud environments.

Learning Outcomes: Participants will gain proficiency in deploying SNAP on CDSE, designing Python-driven EO pipelines, and executing scalable analyses without data migration. The tutorial bridges ESA’s desktop-oriented SNAP tradition with modern cloud paradigms, empowering users to operationalize workflows in alignment with CDSE’s roadmap.

Speaker


  • Pontus Lurcock - Brockmann

Sunday 22 June 17:00 - 18:20 (Hall L3)

Tutorial: D.04.12 TUTORIAL - Cloud optimized way to explore, access, analyze and visualize Copernicus data sets

#stac

This tutorial will present how to leverage various APIs provided by the Copernicus Data Space Ecosystem (CDSE) to process Copernicus data in a cloud computing environment using JupyterLab notebooks. First, it will show how to efficiently filter data collections using the SpatioTemporal Asset Catalog (STAC) catalogue API and how to use the STAC API extensions to enable advanced functionality such as filtering, sorting and pagination. Secondly, it will demonstrate how to access parts of Earth Observation (EO) products using the STAC assets endpoint and byte-range requests issued to the CDSE S3 interface. In this respect, it will discuss in detail how to do this with the Geospatial Data Abstraction Library (GDAL) and how to properly configure GDAL settings to maximize the performance of data access via the GDAL vsis3 virtual file system. Further, it will show how to leverage the STAC API to build a data cube for spatio-temporal analysis. Finally, it will show how to analyse the data cube using an open-source foundation model coupled with freely accessible embeddings generated from Sentinel EO data, and how to visualize and publish results using the Web Map Service (WMS). The ultimate goal of this tutorial is to empower users with the novel EO analytical tools provided by the CDSE platform.
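
A minimal sketch of the access pattern described above, assuming the CDSE STAC endpoint and S3 gateway URLs shown here and a placeholder asset key; the GDAL options are commonly used settings for tuning /vsis3/ byte-range access, not a prescription from the tutorial itself.

```python
# Minimal sketch: filter a CDSE collection through its STAC API and read a
# small window of one asset via byte-range requests over S3 (GDAL /vsis3/).
# Endpoint URLs, collection id, asset key and credentials are assumptions.
import rasterio
from pystac_client import Client
from rasterio.windows import Window

catalog = Client.open("https://stac.dataspace.copernicus.eu/v1")
search = catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=[14.2, 50.0, 14.6, 50.2],
    datetime="2024-06-01/2024-06-30",
    sortby=["+properties.eo:cloud_cover"],
    max_items=1,
)
item = next(search.items())
asset_href = item.assets["B04_10m"].href  # asset key depends on the collection

gdal_env = dict(
    AWS_S3_ENDPOINT="eodata.dataspace.copernicus.eu",
    AWS_VIRTUAL_HOSTING="FALSE",
    AWS_ACCESS_KEY_ID="<cdse-s3-key>",
    AWS_SECRET_ACCESS_KEY="<cdse-s3-secret>",
    GDAL_DISABLE_READDIR_ON_OPEN="EMPTY_DIR",   # avoid listing the bucket
    GDAL_HTTP_MERGE_CONSECUTIVE_RANGES="YES",   # fewer, larger range requests
)
with rasterio.Env(**gdal_env):
    with rasterio.open(asset_href) as src:
        chip = src.read(1, window=Window(0, 0, 512, 512))
print(chip.shape, chip.dtype)
```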

Speaker:


  • Jan Musial, CloudFerro


Sunday 22 June 17:00 - 18:20 (Room 0.11/0.12)

Tutorial: D.03.15 TUTORIAL - FAIR and Open Science with EarthCODE Integrated Platforms

#pangeo

This hands-on tutorial introduces participants to FAIR (Findable, Accessible, Interoperable, Reusable) and Open Science principles through EarthCODE integrated platforms, using real-world Earth Observation datasets and workflows. We will begin with the fundamentals of FAIR, explore the EarthCODE catalog, and apply a checklist-based FAIRness assessment to datasets hosted on EarthCODE. Participants will evaluate current implementations, identify gaps, and discuss possible improvements. Building on this foundation, we will demonstrate how integrated platforms such as DeepESDL, OpenEO, and Euro Data Cube (Polar TEP, Pangeo & CoCalc) can be used to create reproducible EO workflows. Participants will create and publish open science experiments and products using these tools, applying FAIR principles throughout the process. The tutorial concludes with publishing results to the EarthCODE catalog, showcasing how EarthCODE facilitates FAIR-aligned, cloud-based EO research. By the end of the session, attendees will have practical experience in assessing and improving FAIRness, developing open workflows, and using EarthCODE platforms to enable reproducible, FAIR and Open Science. Please register your interest for this tutorial by filling in this form: https://forms.office.com/e/yKPJpKV0KX before the session.

Speakers:


  • Samardzhiev Deyan - Lampata
  • Anne Fouilloux - Simula Labs
  • Dobrowolska Ewelina Agnieszka - Serco
  • Stephan Meissl - EOX IT Services GmbH
  • Gunnar Brandt - Brockmann Consult
  • Bram Janssen - Vito

Sunday 22 June 15:30 - 16:50 (Room 1.31/1.32)

Tutorial: D.02.18 TUTORIAL - Mastering EOTDL: A Tutorial on crafting Training Datasets and developing Machine Learning Models

#stac

In this tutorial session, participants will dive deep into the world of machine learning in Earth observation. Designed for both beginners and seasoned practitioners, this tutorial will guide you through the comprehensive workflow of using the Earth Observation Training Data Lab (EOTDL) to build, manage, and deploy high-quality training datasets and machine learning models.

Throughout the session, you will begin with an introduction to the fundamentals of EOTDL, exploring its datasets, models, and the different accessibility layers. We will then move into a detailed walkthrough of EOTDL’s capabilities, where you’ll learn how to efficiently ingest raw satellite data and transform it into structured, usable datasets. Emphasis will be placed on practical techniques for data curation, including the utilization of STAC metadata standards, ensuring your datasets are both discoverable and interoperable.
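
The STAC-based curation mentioned above can be illustrated with a small, hypothetical example, using pystac rather than the EOTDL API itself: a training-data collection whose image chips and labels are described as STAC items and assets. All identifiers, paths and the bounding box are placeholders.

```python
# Minimal sketch (not the EOTDL API): structure a small training dataset as a
# STAC collection with pystac so chips and labels stay discoverable and
# interoperable. Ids, paths and the bounding box are illustrative only.
from datetime import datetime, timezone

import pystac

bbox = [11.2, 46.4, 11.6, 46.7]
geometry = {
    "type": "Polygon",
    "coordinates": [[
        [bbox[0], bbox[1]], [bbox[2], bbox[1]],
        [bbox[2], bbox[3]], [bbox[0], bbox[3]], [bbox[0], bbox[1]],
    ]],
}

collection = pystac.Collection(
    id="crop-type-chips",
    description="Sentinel-2 chips and crop-type labels for model training",
    extent=pystac.Extent(
        spatial=pystac.SpatialExtent([bbox]),
        temporal=pystac.TemporalExtent(
            [[datetime(2024, 1, 1, tzinfo=timezone.utc), None]]
        ),
    ),
    license="CC-BY-4.0",
)

item = pystac.Item(
    id="chip-0001",
    geometry=geometry,
    bbox=bbox,
    datetime=datetime(2024, 6, 15, tzinfo=timezone.utc),
    properties={"constellation": "sentinel-2"},
)
item.add_asset(
    "image",
    pystac.Asset(href="chips/chip-0001.tif", media_type=pystac.MediaType.COG),
)
item.add_asset(
    "labels",
    pystac.Asset(href="labels/chip-0001.geojson", media_type=pystac.MediaType.GEOJSON),
)
collection.add_item(item)

# Write a self-contained catalog that can be versioned and shared.
collection.normalize_hrefs("./crop-type-chips")
collection.save(catalog_type=pystac.CatalogType.SELF_CONTAINED)
```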

Next, the session will focus on model development, showcasing the process of training and validating machine learning models using curated datasets, including feature engineering. Real-world examples and case studies will be presented to illustrate how EOTDL can be leveraged to solve complex problems in fields such as environmental monitoring, urban planning, and disaster management.

By the end of the tutorial, you will have gained valuable insights into the complete data pipeline—from dataset creation to model deployment—and the skills necessary to apply these techniques in your own projects. Join us to unlock the potential of Earth observation data and drive innovation in your machine learning endeavors.

Speakers:


  • Juan B. Pedro Costa - CTO@Earthpulse, Technical Lead of EOTDL

Sunday 22 June 14:00 - 15:20 (Room 1.85/1.86)

Tutorial: D.02.22 TUTORIAL - Geospatial Machine Learning Libraries and the Road to TorchGeo 1.0

#stac

The growth of machine learning frameworks like PyTorch, TensorFlow, and scikit-learn has also sparked the development of a number of geospatial domain libraries. In this talk, we break down popular geospatial machine learning libraries, including:



• TorchGeo (PyTorch)
• eo-learn (scikit-learn)
• Raster Vision (PyTorch, TensorFlow*)
• DeepForest (PyTorch, TensorFlow*)
• samgeo (PyTorch)
• TerraTorch (PyTorch)
• sits (R Torch)
• srai (PyTorch)
• scikit-eo (scikit-learn, TensorFlow)
• GEO-Bench (PyTorch)
• GeoAI (PyTorch)
• OTBTF (TensorFlow)
• GeoDeep (ONNX)


For each library, we compare the features they offer as well as various GitHub and download metrics that emphasize the relative popularity and growth of each library. In particular, we promote metrics including the number of contributors, forks, and test coverage as useful for gauging the long-term health of each software community. Among these libraries, TorchGeo stands out with more built-in data loaders and pre-trained model weights than all other libraries combined. TorchGeo also boasts the highest number of contributors, forks, stars, and test coverage. We highlight particularly desirable features of these libraries, including a command-line or graphical user interface, the ability to automatically reproject and resample geospatial data, support for the SpatioTemporal Asset Catalog (STAC), and time series support. The results of this literature review are regularly updated with input from the developers of each software library and can be found here: https://torchgeo.readthedocs.io/en/stable/user/alternatives.html
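
A short sketch of the TorchGeo usage pattern referred to above (automatic reprojection/resampling and geospatial sampling), assuming a recent TorchGeo release and a placeholder data directory:

```python
# Minimal sketch: sample training patches from Sentinel-2 scenes with TorchGeo.
# The directory path is a placeholder; TorchGeo reads the rasters lazily and
# reprojects/resamples them on the fly.
from torch.utils.data import DataLoader
from torchgeo.datasets import Sentinel2, stack_samples
from torchgeo.samplers import RandomGeoSampler

images = Sentinel2(paths="/data/sentinel2", bands=["B04", "B03", "B02"])
sampler = RandomGeoSampler(images, size=256, length=1000)  # 256x256 patches
loader = DataLoader(images, sampler=sampler, collate_fn=stack_samples, batch_size=8)

for batch in loader:
    x = batch["image"]  # (batch, channels, height, width) tensor
    break
```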



Among the above highly desirable features, the one TorchGeo would most benefit from adding is better time series support. Geotemporal data (time series data that is coupled with geospatial information) is a growing trend in Earth Observation, and is crucial for a number of important applications, including weather and climate forecasting, air quality monitoring, crop yield prediction, and natural disaster response. However, TorchGeo has only partial support for geotemporal data, and lacks the data loaders or models to make effective use of geotemporal metadata. In this talk, we highlight steps TorchGeo is taking to revolutionize how geospatial machine learning libraries handle spatiotemporal information. In addition to the preprocessing transforms, time series models, and change detection trainers required for this effort, there is also the need to replace TorchGeo's R-tree spatiotemporal backend. We present a literature review of several promising geospatial metadata indexing solutions and data cubes, including:



• R-tree
• Shapely
• GeoPandas
• STAC
• NumPy
• PyTorch
• pandas
• Xarray
• Datacube


For each spatiotemporal backend, we compare the array, list, set, and database features available. We also compare performance benchmarks on scaling experiments for common operations. TorchGeo requires support for geospatial and geotemporal indexing, slicing, and iteration. The library with the best spatiotemporal support will be chosen to replace R-tree in the coming TorchGeo 1.0 release, marking a large change in the TorchGeo API as well as a promise of future stability and backwards compatibility for one of the most popular geospatial machine learning libraries. TorchGeo development is led by the Technical University of Munich, with incubation by the AI for Good Research Lab at Microsoft, and contributions from 100 contributors from around the world. TorchGeo is also a member of the OSGeo foundation, and is widely used throughout academia, industry, and government laboratories. Check out TorchGeo here: https://www.osgeo.org/projects/torchgeo/
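
To make the backend comparison concrete, the following minimal sketch shows the kind of bounding-box-plus-time query such a backend must answer, once with the rtree package (TorchGeo's current backend) and once with a GeoPandas spatial index; the geometries and timestamps are toy values.

```python
# Minimal sketch: the spatiotemporal lookup a geospatial ML backend must support,
# shown with rtree and with GeoPandas. Boxes and dates are toy values.
import geopandas as gpd
from rtree import index
from shapely.geometry import box

# R-tree: interleaved (minx, miny, maxx, maxy) bounding boxes with integer ids.
idx = index.Index()
idx.insert(0, (10.0, 45.0, 11.0, 46.0))
idx.insert(1, (10.5, 45.5, 11.5, 46.5))
hits = list(idx.intersection((10.8, 45.8, 10.9, 45.9)))  # both boxes intersect

# GeoPandas: the same query via a GeoDataFrame's spatial index, with a plain
# column filter standing in for the temporal dimension.
gdf = gpd.GeoDataFrame(
    {"t": ["2024-06-01", "2024-07-01"]},
    geometry=[box(10.0, 45.0, 11.0, 46.0), box(10.5, 45.5, 11.5, 46.5)],
    crs="EPSG:4326",
)
candidates = gdf.iloc[gdf.sindex.query(box(10.8, 45.8, 10.9, 45.9))]
in_june = candidates[candidates["t"].between("2024-06-01", "2024-06-30")]
print(hits, len(in_june))
```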

Speakers:


  • Adam J. Stewart - TUM
  • Nils Lehmann - TUM
  • Burak Ekim - UniBw

Monday 23 June

34 events

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Global distribution of livestock densities (2000–2022) at 1 km resolution based on spatiotemporal machine learning and irregular census data

#stac

Authors: Dr. Leandro Parente, Dr. Carmelo Bonannella, Dr. Steffen Erhmann, Tomislav Hengl, Radost Stanimirova, Dr. Katya Perez Guzman, Steffen Fritz, Dr. Carlos Gonzalez Fischer, Lindsey Sloat
Affiliations: OpenGeoHub Foundation, German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, International Institute for Applied Systems Analysis (IIASA), Institute of Biology, Leipzig University, Land & Carbon Lab, World Resources Institute, Department of Global Development, College of Agriculture and Life Sciences, Cornell University, Cornell Atkinson Center for Sustainability, Cornell University
This study presents a novel framework for spatiotemporal mapping of livestock densities at a 1 km resolution, developed as part of the Land & Carbon Lab’s Global Pasture Watch (GPW) initiative. GPW is a collaborative effort to enhance global grassland monitoring and agricultural dynamics through the production of medium- to high-resolution datasets that inform sustainable agricultural systems and environmental policies. As part of these efforts, this research estimates annual densities of buffalo, cattle, goats, horses, and sheep for the period 2000 to 2022, addressing critical gaps in the temporal and spatial precision of livestock distribution data. A cornerstone of this work is the integration of GPW’s grassland extent maps, which provide annual classifications of cultivated and natural/semi-natural grasslands at a 30 m spatial resolution. These high-resolution grassland products are then used to distribute spatially the irregular and incomplete livestock census data within administrative polygons, weighted according to forage availability. This approach ensures that the livestock density estimates account for both ecological and management contexts. The methodological framework combines a point-sampling approach with ensemble machine learning models, specifically Random Forest and Gradient Boosting Trees. Census data were transformed into spatially distributed point samples, with weights assigned based on grassland proportions. These were further combined with dynamic environmental and socioeconomic covariates, such as climate indicators, and accessibility metrics, to model livestock densities. The reliability of the predictions was assessed through both internal validation, using cross-validation with spatial blocking, and external validation, employing a hold-out validation approach with 20% of the data. Uncertainty quantification was performed to provide users with confidence intervals, and predictions were standardized to match national statistics for consistency. The resulting dataset provides a globally consistent, medium-resolution time series of livestock densities, spanning more than two decades. This fills a critical gap, as previous efforts have largely been limited to static snapshots with coarse spatial resolutions and limited temporal depth. These new outputs enable improved analyses of livestock dynamics over time and across diverse regions, supporting applications in greenhouse gas emissions accounting, land-use optimization, and sustainable agricultural planning. Data will be made freely available as Google Earth Engine assets and in a STAC catalog. As open-access products (mapping products and reference harmonized census data) they align with GPW’s mission to provide tools that address the growing need for sustainable food production under changing climatic and socioeconomic conditions.
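
The spatially blocked validation described above can be sketched as follows; the file name, feature columns and the 1-degree blocking are illustrative assumptions, not the authors' actual setup.

```python
# Minimal sketch of the validation idea: fit a random-forest density model on
# point samples and evaluate it with spatially blocked cross-validation, so
# nearby samples never end up split across train and test folds.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

samples = pd.read_parquet("livestock_point_samples.parquet")  # hypothetical file
features = ["ndvi_mean", "precip", "temperature", "travel_time_to_city"]

# Assign each sample to a coarse 1-degree spatial block used as the CV group key.
blocks = (
    np.floor(samples["lon"]).astype(int).astype(str)
    + "_"
    + np.floor(samples["lat"]).astype(int).astype(str)
)

model = RandomForestRegressor(n_estimators=300, n_jobs=-1, random_state=0)
scores = cross_val_score(
    model,
    samples[features],
    samples["cattle_density"],
    groups=blocks,
    cv=GroupKFold(n_splits=5),
    scoring="neg_root_mean_squared_error",
)
print(-scores.mean())
```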

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Evolution of the CEOS-ARD Optical Product Family Specifications

#stac

Authors: Jonathon Ross, Christopher Barnes, Matthew Steventon, Rosenqvist Rosenqvist, Peter Strobel, Andreia Siqueira, Takeo Tadono
Affiliations: Geoscience Australia, KBR contractor to the USGS, Symbios Communications, solo Earth Observation (soloEO), Japan Aerospace Exploration Agency, European Commission
The CEOS Land Surface Imaging Virtual Constellation (LSI-VC) has over 20 members representing 12 government agencies and has served as the forum for developing the CEOS Analysis Ready Data (ARD) initiative since 2016. In 2017, LSI-VC defined CEOS-ARD Product Family Specification (PFS) optical metadata requirements for Surface Reflectance and Surface Temperature that reduced the barrier for successful utilization of space-based data to improve understanding of natural and human-induced changes on the Earth’s system. This resulted in CEOS-ARD compliant datasets becoming some of the most popular types of satellite-derived optical products generated by CEOS agencies (e.g., USGS Landsat Collection 2, Copernicus Sentinel-2 Collection 1, the German Aerospace Center) and commercial data providers (e.g., Catalyst/PCI, Sinergise). Since 2022, LSI-VC has led the definition of two new optical PFSs (i.e., Aquatic Reflectance and Nighttime Lights Surface Radiance) and four Synthetic Aperture Radar (SAR) PFSs (i.e., Normalised Radar Backscatter, Polarimetric Radar, Ocean Radar Backscatter, and Geocoded Single-Look Complex), signifying recognition of the importance of providing satellite Earth observation data in a format that allows for immediate analysis. As of December 2024, eleven data providers have successfully achieved CEOS-ARD compliance, with a further 12 organizations either in peer review or under development for future endorsement. However, this has engendered a need for transparency, version control, and (most importantly) a method to facilitate consistency across the different PFSs and alignment with SpatioTemporal Asset Catalogs (STAC). Thus, all future PFS development will be migrated into a CEOS-ARD GitHub repository. This will facilitate broader input from the user community, which is critical for the optical specification to meet real-world user needs and ensures broader data provider adoption. CEOS agencies have concurred that now is the time, with the increased traceability and version control offered by GitHub, to seek to parameterise the CEOS-ARD specifications and introduce an inherent consistency across all optical and SAR PFS requirements while benefiting from active user feedback. In this presentation, we will share the status of the optical PFS transition to GitHub, as well as a set of implementation practices/guidelines and a governance framework that will broaden the portfolio of CEOS-ARD compliant products so they can become easily discoverable, accessible, and publicly used.

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: HIGHWAY – Bridging Earth Observation and Digital Twin Technologies

#cloud-native #stac #zarr #cog

Authors: Henry de Waziers, Giovanni Corato, Simone Mantovani, Mohamed Boukhebouze, Christophe Lerebourg
Affiliations: Adwaiseo, MEEO, EarthLab Lu, ACRI-ST
The DTE high performance earth observation (EO) digital twin components ready data service (hereafter referred to as HIGHWAY) is the ESA DTE component that enables high-performance and efficient access to EO data and processing capabilities for Digital Twins under the Destination Earth (DestinE) initiative. The scope of the service is to offer, within the DestinE platform, seamless access to:
- ESA (such as Earth Explorer, Earth Watch and Heritage) and Third-party Mission data in native and Digital Twin Analysis Ready (DT-ARCO) format.
- Processing capabilities in a cloud environment; High-Performance Computing (HPC) is under development.

Role and Core Mission
HIGHWAY acts as a vital link between EO data, processing resources and DestinE, addressing the growing need for efficient data exchange and processing to power Digital Twins. It is integrated within the DestinE platform, ensuring interoperability and delivering advanced services that accelerate the creation and utilization of Digital Twin models for monitoring, analysis, and decision-making.

Infrastructure Services
HIGHWAY relies on a secure, resilient and scalable multi-cloud infrastructure:
1. OVH Cloud: As the primary site, OVH ensures proximity to DESP (DestinE Core Service Platform), reducing latency and enhancing data exchange performance. This location also supports environmentally friendly operations by reducing egress and energy consumption, aligning with ESA’s green initiatives.
2. Terra Adwäis: Serving as the Disaster Recovery (DR) site and DT-ARCO production site, Terra Adwäis ensures the resilience of HIGHWAY operations.
The HIGHWAY infrastructure and platform leverage cloud-native technologies to ensure flexibility and elasticity, ensuring that the service remains robust and efficient regardless of technological shifts or operational demands. In the near future, to cope with the most demanding Digital Twin AI needs, it will be connected to HPC.

Data Services
HIGHWAY delivers efficient, high-performance data access tailored to the needs of Digital Twins, enabling advanced Earth observation capabilities. Its key data services include:
• DT-ARCO Data Production: HIGHWAY produces data in DT-ARCO format, packaged following the EOPF data model and using advanced formats such as Zarr and COG (for specific datasets). These formats align with Copernicus roadmaps and the latest Digital Twin standards, ensuring compatibility and forward-looking integration. Supported missions include SMOS, CryoSat, Proba-V, Aeolus, SWARM, and EarthCARE.
• Quality Assurance: HIGHWAY integrates rigorous systematic and manual quality assessments to ensure data accuracy and reliability. Processes include original-to-final pixel comparisons and non-regression loops to prevent any quality degradation during the ARCO transformation.
• Advanced Data Access and APIs: Seamless data integration is provided through advanced APIs, including WMS, WCS, OpenSearch, and STAC. Data can be accessed both in native and DT-ARCO format. Advanced data access offers data-cube features to explore and access DT-ARCO.

Processing Services
In addition to its robust data services, HIGHWAY provides scalable processing capabilities through the cloud. The integration with high-performance computing (HPC) is currently under agreement. The access point to the processing service is the Max-ICS platform. Max-ICS enables state-of-the-art AI modeling tools and supports the creation of AI pipelines for Earth observation applications. It is coupled with an HPC broker that allows dispatching and managing processing requests to an HPC, enhancing processing power and scalability for complex Earth observation tasks. Currently, this interface allows HIGHWAY to leverage Luxembourg's HPC capabilities through its initial integration with the MeluXina HPC.

Enabling Digital Twin Innovation
HIGHWAY’s services are specifically designed to support Digital Twin applications, which require vast amounts of accurate, high-resolution Earth observation data. By delivering data aligned with the latest standards, ensuring rigorous quality assurance, and providing high-performance processing capabilities, HIGHWAY empowers scientists and decision-makers to model and simulate Earth’s systems with unprecedented precision.

Conclusion
HIGHWAY represents a pivotal step forward in the integration of Earth observation and Digital Twin technologies. It provides a seamless ecosystem of infrastructure, data, and processing services to DestinE. Its scalable, flexible and elastic approach ensures the long-term success of ESA’s Digital Twin initiatives, fostering innovation in Earth Observation and advancing our understanding of the planet. This service sets a new standard for Earth observation data management and processing, positioning ESA as a leader in providing essential tools for climate monitoring, environmental management, and sustainable development.

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: EDEN: seamless access to the Destination Earth data portfolio

#cloud-native #stac

Authors: Moris Pozzati, Alessia Cattozzo, Damiano Barboni, Federico Cappelletti, Simone Mantovani
Affiliations: MEEO
Destination Earth (DestinE) is a major initiative of the European Commission which aims to develop high-precision digital models of Earth (referred to as ‘Digital Twins’) to monitor the effects of natural and human activity on our planet, predict extreme events, and adapt policies to climate-related challenges. The European Space Agency leads the implementation of the DestinE Platform, the cloud-based ecosystem enabling users to exploit a wide range of applications and services, including direct access to the data provided by the Digital Twin Engine and DestinE Data Lake. Among the core services of the Platform, EDEN offers seamless access to the DestinE Data Portfolio. The infrastructure, based on FAIR principles, enables users to search and retrieve geospatial data from federated sources, local data caches, and Digital Twin simulations. The service also delivers Analysis-Ready Cloud-Optimised (ARCO) products for advanced research. Key technological components include:
- Harmonised Data Access is the core API allowing for data discovery and access across heterogeneous data sources.
- Open Geospatial Consortium services (i.e. OpenSearch, STAC, WMS) enhance cloud-native resource indexing and exploitation within the Platform.
- Data cache management enables access to DestinE Platform’s local cache with pre-fetched data.
- Finder is the user-friendly webGIS providing data discovery, visualisation, and access capabilities.
By bridging machine- and human-readable interfaces, EDEN strives to create a seamless environment for researchers, policymakers, and scientists to integrate satellite data, in-situ observations, and model outputs, advancing DestinE's mission of comprehensive Earth system understanding.

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Data cubes – approaches to exploit parts of DestinE Digital Twin outputs

#cloud-native #zarr

Authors: Patryk Grzybowski, Michael Schick, Miruna Stoicescu, Aubin Lambare, Christoph Reimer
Affiliations: CloudFerro S.A., EUMETSAT, CS Sopra Steria, EODC
European Commission’s flagship initiative Destination Earth (DestinE), driven by the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT), the European Space Agency (ESA) and the European Centre for Medium-Range Weather Forecasts (ECMWF), provides unique data outputs generated by Digital Twins: the Weather Extremes Digital Twin (ExtremeDT) and the Climate Change Adaptation Digital Twin (ClimateDT). These systems deliver meteorological and climate scenarios with very high detail, thanks to their globally uniform, high spatial resolutions (~5km grid spacing). However, this high level of detail comes with significant storage requirements, reaching several petabytes. This work aims to demonstrate the possibilities and best practices related to dealing with DTs’ outputs using Destination Earth Data Lake (DEDL) near data process services (EDGE). The Destination Earth Data Lake (DEDL) offers unique opportunities and approaches that enable high-demand processing directly near the data. What sets DEDL apart is its ability to provide services in close proximity to its data holdings, leveraging a distributed infrastructure receiving DT Outputs from interconnected High-Performance Computing (HPC). This capability is made possible through data bridges — edge clouds that facilitate operations with large volumes of data initially produced by the computing power of HPC systems combined with a data cube accessible object store provided by ECMWF for its digital twin data. That joint effort of EUMETSAT and ECMWF depicted by collocating OpenStack cloud infrastructure with EuroHPC, allows for innovative and exceptional data handling. While the infrastructure and computing power are provided through cloud-edge technology, additional services, applications, and practices are essential for effective data handling and the creation of valuable products. To address these needs, DEDL supports and provides capabilities to setup data cubes and workflows for the generation of cloud-native data formats like “.zarr”. Specifically, DEDL offers for parts of the DT Outputs pre-built data cubes, as well as a virtual environment designed for creating Open Data Cubes. However, with more than 150 collections available in DEDL data portfolio, users’ needs may go beyond ready-to-use solutions. To meet these expectations, it is crucial to demonstrate various approaches to handling such extensive and diverse datasets. The aim of this work is to evaluate and demonstrate three distinct approaches to processing and generating “.zarr” format data cubes. These methods utilize different tools and frameworks to handle DT’s data. The cases include: 1) Utility of Climate Data Operators (CDO) with python libraries: xarray and dask; 2) Using python packages: dask, xarray and ECMWF developed earthkit; 3) Using xdggs with xarray for Discrete Global Grid Systems (DGGS) based ClimateDT data. The first case will introduce the process of converting ExtremeDT data from an octahedral reduced Gaussian grid to a regular Gaussian grid and writing it as a NetCDF file using CDO. It will also demonstrate how to generate a “.zarr” data cube output using xarray and dask to handle multidimensional data and enable parallelization. The second case will provide an example of how to work with ClimateDT data, which is delivered on a Hierarchical Equal Area isoLatitude Pixelation grid (HEALPix). 
It will demonstrate the conversion of data from HEALPix to a regular Gaussian grid using the earthkit tool provided by ECMWF and the preparation of data for further analysis. The third case will demonstrate the use of xdggs, an extension of xarray that provides tools for handling geospatial data using Discrete Global Grid Systems (DGGS). This case will focus on working with native ClimateDT data in the HEALPix grid format. For each case, both advantages and disadvantages were identified. The first case preserves the native spatial resolution and provides data on a regular grid, enabling straightforward analysis and visualization. However, the native grid is lost, which could impact the accuracy of the delivered simulated data. What is more, a regular grid introduces distortions due to the convergence of meridians at the poles, which implies that the distance between grid points decreases significantly near the poles. The second case also provides data on a regular grid, facilitating easier analysis. However, not only is the native grid lost and distortions near the poles introduced, but the spatial resolution is also reduced to 10 km x 10 km. The third case retains both the native resolution (5 km x 5 km) and the native grid, HEALPix. This ensures that the data remains undistorted. However, HEALPix, as a relatively new method of data provision, is not yet widely supported by existing software and communities. Fortunately, tools like healpy and xdggs are making it significantly easier to work with such data. To sum up, handling Digital Twin outputs efficiently is crucial, especially when working with data in a DGGS format. Proximity to data infrastructure, such as data bridges and pre-prepared environments provided by DEDL, enables users to manage large volumes of data effectively. This work demonstrates a proposed approach to handling Digital Twin outputs using three different data cube generation methods, along with their respective pros and cons. For the first two cases, integration with existing tools and simplicity were highlighted as advantages, while the loss of the native grid was identified as a disadvantage. In contrast, the third approach preserved the native grid and resolution but posed challenges due to limited support for HEALPix in existing tools. There are also approaches that work on the native data without the .zarr transformation, accessing the data via ECMWF's Polytope and using the Python packages earthkit and healpy. Whichever method users of Destination Earth choose, the foundational principles behind DEDL and DTs, coupled with novel cloud solutions, will enable data processing and analysis on a scale that surpasses current capabilities.
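
The first approach (CDO plus xarray/dask) can be sketched as follows; the file names, variable chunking and nc4 output format are assumptions for illustration, and the cdo binary must be available on the PATH.

```python
# Minimal sketch of case 1: regrid a GRIB field with CDO to a regular Gaussian
# grid, then chunk it with xarray/dask and write a Zarr store.
import subprocess

import xarray as xr

# CDO: octahedral reduced Gaussian grid -> regular Gaussian grid, NetCDF output.
subprocess.run(
    ["cdo", "-f", "nc4", "setgridtype,regular",
     "extremedt_t2m.grib", "extremedt_t2m.nc"],
    check=True,
)

# xarray + dask: lazy, chunked read and parallel write to Zarr.
ds = xr.open_dataset("extremedt_t2m.nc", chunks={"time": 24, "lat": 512, "lon": 512})
ds.to_zarr("extremedt_t2m.zarr", mode="w")
```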

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: D.03.02 - POSTER -Free Open Source Software for the Geospatial Domain: current status & evolution

#cloud-native

Free and Open Source Software has a key role in the geospatial and EO communities, fostered by organizations such as OSGeo, the Cloud Native Computing Foundation and Apache, and by space agencies such as ESA and NASA. This session showcases the status of OSS tools and applications in the EO domain, and their foreseen evolution, with a focus on innovation and support to open science challenges.

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Processing geospatial data at scale in geoscience: taking advantage of open-source tools.

#parquet

Authors: Jean Baptiste Barré, Romain Millan, Sylvain Dupire, Bernd Scheuchl, Eric Rignot
Affiliations: BARRE, Institute of Environmental Geosciences, National Research Institute for Agriculture, Food and Environment, University of California
The field of Earth Observation (EO) is experiencing an era of unprecedented data proliferation, driven by the increasing availability and diversity of multimodal sensor data. This surge in geospatial data volume directly responds to the growing need to study territories globally with greater precision. Simultaneously, the scientific community and funding agencies advocate for the widespread adoption of Open Source technologies, aligning with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. These technologies are transforming the construction of processing chains and ensuring data accessibility for scientific research, education, and outreach purposes. Additionally, practitioners across public and private sectors are increasingly embracing open-source ecosystems to foster stronger connections with research communities, thereby driving collaborative innovation. As a result, open-source tools have become a cornerstone in advancing scientific computing and addressing the challenges of big geospatial data. In this presentation, we explore the transformative impact of open scientific principles applied to geospatial data through four diverse projects in geosciences: Polartopo: Integrates multi-sensor satellite data to explore changes in Earth's cryosphere. Leveraging modern geospatial formats like GeoParquet, coupled with DuckDB and high-level tools such as Xarray/Dask, Polartopo demonstrates how heterogeneous altimetric datasets can be processed uniformly, enhancing efficiency and scalability. Ice Velocities Workflow: Monitors large-scale ice mass flow changes in Antarctica using a blend of open-source and proprietary tools for interferometric satellite radar data processing. It enhances computational flexibility and performance by integrating SQL databases and wrapping Fortran in Python, showing the ongoing reliance on proprietary tools for complex processes like interferometry. Fireaccess: In collaboration with firefighting professionals, this project uses the open-source Python geospatial stack for large-scale processing of dense LiDAR data in fire protection contexts. It exemplifies the design of fully open-source operational tools tailored for direct application in critical services, bridging scientific research and practical implementation. 3D Worldwide Glaciers Map: Highlights the role of open-source visualization tools in translating complex multidimensional scientific datasets into user-friendly platforms. It underscores the potential of open-source technologies in enhancing communication, outreach, and education by simplifying the representation of intricate geospatial data. Spanning 1 to 15 years, these projects showcase the adaptability, longevity, and transformative potential of open-source solutions in EO. They also illustrate the continuing interplay between open-source and proprietary tools, underscoring the need for ongoing development of open-source alternatives. Ultimately, this presentation emphasizes the pivotal role of open-source tools in fostering collaboration, driving innovation, and advancing knowledge sharing within the EO community.
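
The GeoParquet-plus-DuckDB pattern mentioned for Polartopo can be sketched as follows; the file and column names are hypothetical, and a real GeoParquet geometry column could equally be queried with the DuckDB spatial extension's ST_* functions.

```python
# Minimal sketch: filter a large altimetry point file with DuckDB SQL (predicate
# push-down on Parquet) and hand the result to GeoPandas. Names are placeholders.
import duckdb
import geopandas as gpd

con = duckdb.connect()
df = con.execute(
    """
    SELECT lon, lat, time, elevation
    FROM read_parquet('altimetry_points.parquet')
    WHERE lon BETWEEN -70 AND -60
      AND lat BETWEEN -75 AND -70
    """
).df()

gdf = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["lon"], df["lat"]),
    crs="EPSG:4326",
)
print(len(gdf))
```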

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: The Earth Observation DataHub - Using Open Source Software to Make EO and Climate Data More Accessible and Usable, Supporting the Creation of New Applications and Open Science

#stac

Authors: Richard Conway, James Hinton, Alex Hayward, Philip Kershaw, Alasdair Kyle
Affiliations: Telespazio UK Ltd, Centre for Environmental Data Analysis
Telespazio UK is delivering the Earth Observation DataHub (EODH) Platform, which aims to aid the federation of national data sets, information sources and processing facilities in order to enable the growth of the Space / Earth Observation economy. The Platform is open source by design and uses, where possible, common open interface / API standards for software services, enabling improved interoperability and federation between platforms. The EODH Platform also re-uses some open-source component solutions, such as building blocks from the Earth Observation Exploitation Platform Common Architecture (EOEPCA) Reference Implementation (an ESA funded initiative), most of which are widely used in the EO community. The EODH Platform is the core part of the Earth Observation DataHub UK Pathfinder project which is delivering improved access to Earth Observation (EO) and Climate data to support effective decision-making. The project is supported by UKRI NERC, the Department for Science Innovation and Technology (DSIT) and the UK Space Agency. Telespazio UK would like to present the EODH Platform, which is currently in an early adopter operational phase, to demo current functionality and outline planned future functionality, to potential users attending the Living Planet Symposium to generate user uptake and gather user feedback. The EODH Platform components are built using a wide range of open-source software as well as open-source standards, which include:
- Identity and Access Management (aligned to OAuth2 standard): Keycloak, Nginx, OIDC
- Resource Catalog: Stac-fastapi, stac-fastapi-elasticsearch
- Data visualisation: TiTiler
- Workflow Execution: EOEPCA ADES, JupyterHub/JupyterLab, Ploomber, Calrissian
- Event generation: Argo Events
- Messaging system: Apache Pulsar
- Web Presence: Wagtail CMS
- Supporting: ArgoCD, Kubernetes, Testkube etc.
- OGC Standards: OGC Records API, OGC Best Practice for Application Packages, OGC WMTS
The presentation will focus on how open-source software and standards have been utilised in the EODH Platform and will consider their fitness-for-purpose in terms of delivering operational functionality for EO and Climate users. Additionally, the presentation will outline the future evolution of the EODH Platform, as well as plans to incorporate new open-source software and to push bespoke EODH Platform enhancements upstream. The creation of a financially and operationally sustainable EODH Platform will break down data silos and allow stakeholders from government, industry and academia to work together in a centralised manner, offering a better model for research and commercial service delivery, which will support a range of sectors including green finance, energy, infrastructure, and climate change monitoring. The EODH Platform’s open-source design will also allow federation opportunities with other aligned initiatives such as EarthCODE, Open Science Persistent Demonstrator (OSPD) and Application Propagation Environments (APEx).

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Overview of geospatial tools stack through Earth Observation API (eoAPI)

#stac

Authors: Emmanuel Mathot, Jonas Sølvsteen
Affiliations: Development Seed
The Earth Observation API (eoAPI) represents a modern community-standard approach to managing and analyzing complex Earth observation data, addressing critical challenges in geospatial technology and environmental monitoring. Free and open source, maintained by Development Seed, eoAPI emerges as a comprehensive solution to the challenges of satellite imagery and planetary data hosting and access at scale. At its core, eoAPI consists of modular open-source software components designed to simplify the intricate process of earth observation data management. Recognizing the fundamental challenge of underutilized imagery—where traditional methods are time-consuming and require specialized expertise—the API provides a unified, customizable solution that seamlessly integrates with existing cloud environments. This approach democratizes access to sophisticated earth observation technologies, breaking down barriers for researchers, environmental scientists, agricultural specialists, and technology companies. The individual components and their combined use have already gained significant traction among global technology leaders, with implementations by NASA IMPACT, AWS Sustainability, Microsoft Planetary Computer and Planet. eoAPI for Kubernetes constitutes an essential block in the EO Exploitation Platform Common Architecture (EOEPCA) initiative led by the European Space Agency. These use cases demonstrate eoAPI's versatility across multiple domains, from climate research and environmental monitoring to agricultural land management and geospatial data services. Technically, eoAPI combines several state-of-the-art open-source projects from many community contributors to create a full Earth Observation API. Each service can be used and deployed independently, but eoAPI creates the interconnections between each service.
● Database: The STAC database is at the heart of eoAPI and is the only mandatory service. We use the PgSTAC Postgres schema and functions, which provides functionality for STAC Filters, CQL2 search, and utilities to help manage the indexing and partitioning of STAC Collections and Items.
● Metadata: The Metadata service deployed in eoAPI is built on the stac-fastapi.pgstac application. By default, the STAC metadata service will have a set of endpoints to search and list STAC collections and items. All reformatted and re-engineered data will be registered via this service, making extensive use of the relevant STAC extensions for characterising and searching the Sentinels data.
● Raster: The Raster service deployed in eoAPI is built on top of titiler-pgstac. It enables raster visualisation for a single STAC Item and large-scale (multi collections/items) mosaics based on STAC search queries.
● Vector: The OGC Features and (Mapbox Vector) Tiles API service deployed in eoAPI is built on top of TiPg. It enables vector Features/Features Collection exploration and visualisation for tables stored in the Postgres database (in the public schema). It is not strictly necessary in the scope of the reformatted and re-engineered data but could be used for specific User Adoption activities.
● Browsing UI: The browsing UI deployed in eoAPI is built on the Radiant Earth STAC Browser, and provides a configurable, user-friendly interface to search across and within collections and quickly visualise single item assets.
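
A minimal sketch of how a client might query the STAC metadata service of an eoAPI deployment; the base URL and collection id are placeholders, while the /search payload follows the STAC API specification that the stac-fastapi/pgSTAC stack implements. The returned items could then be rendered through the raster service (titiler-pgstac) or browsed in the UI.

```python
# Minimal sketch: STAC item search against a hypothetical eoAPI deployment.
import requests

EOAPI_STAC = "https://eoapi.example.com/stac"  # placeholder base URL

payload = {
    "collections": ["sentinel-2-l2a"],
    "bbox": [8.0, 44.0, 9.0, 45.0],
    "datetime": "2024-07-01T00:00:00Z/2024-07-31T23:59:59Z",
    "limit": 5,
    "sortby": [{"field": "properties.datetime", "direction": "desc"}],
}
resp = requests.post(f"{EOAPI_STAC}/search", json=payload, timeout=30)
resp.raise_for_status()
for feature in resp.json()["features"]:
    print(feature["id"], feature["properties"].get("eo:cloud_cover"))
```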

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: On-demand data cubes – knowledge-based, semantic querying of multimodal Earth observation data for mesoscale analyses anywhere on Earth

#stac

Authors: Felix Kröber, Martin Sudmanns, Dirk Tiede
Affiliations: Department of Geoinformatics, University of Salzburg, Forschungszentrum Jülich, Institute of Bio- and Geosciences, IBG-2: Plant Sciences
Introduction
In recent years, Earth observation (EO) data access and processing have undergone a transformative shift, driven by the advent of novel big EO data paradigms [1,2]. With the increasing volume and variety of EO data, the significance of cloud-enabled processing frameworks that allow users to focus on the actual analysis of data with an abstraction of technical complexities is growing. However, many cloud-based platforms are proprietary or closed-source [3,4], imposing costs and service uncertainties, as illustrated by the unexpected shutdown of Microsoft's Planetary Computer Hub in June 2024. At the same time, open-source, free alternatives like the Open Data Cube [5] may require significant setup effort, with dataset indexing being one reason. This effort is only justified for larger data cubes with long-term infrastructure goals, whereas for shorter-term projects the practicality is restricted. Moreover, while current systems offer technical solutions in terms of data access and scalability of analyses, many approaches are still lacking in image-understanding capabilities. A modern processing framework needs to provide adequate means to address the semantic complexity of EO data [6]. Analysts still grapple with raw data structures, rather than having frameworks at hand to focus on data meaning. In brief, there is a pressing need for open-source EO data processing frameworks that are both user-friendly and capable of representing the semantics of EO data. To this end, we introduce a novel Python package (gsemantique) for building ad hoc data cubes for semantic EO analyses. We demonstrate its utility for querying multi-modal data by focussing on the use case of forest disturbance modelling.

Design choices & Technical implementation
The technical foundations for the gsemantique package are threefold: First, data in cloud-optimised formats is fetched on-demand to regularised three-dimensional data cubes. The SpatioTemporal Asset Catalog (STAC) [7], fostering standardisation in the structuring of geospatial metadata, is leveraged to facilitate data access. A pre-defined suite of STAC endpoints including several common EO datasets such as Landsat, Sentinel-1 and Sentinel-2 along with additional datasets such as a global DEM is part of the package. The way data access is modelled makes it easy to extend the set of pre-defined datasets with custom ones. Second, the creation of comprehensible, knowledge-based, transparent models is supported by providing a semantic querying language to address and model the data. Here, we build on the foundation laid by the semantique package [8], which introduced a structured approach to semantic querying of EO data. This can be used to supplement conventional, non-semantic approaches. To facilitate effective exchanges with end users and domain experts regarding the design of analyses, graphical visualisation options for models are integrated. Specifically, the model coded in a Python structure can be represented using graphical blocks as defined by Google’s Blockly library [9]. Third, the scalable execution of the models is enabled by an internal tiling of the queried spatio-temporal extent into smaller chunks. The complexity of the chunking mechanism with the decision on the dimension (i.e. chunk-by-space or chunk-by-time), the execution of the recipe and the merging of the individual chunks into a single result is abstracted from the user. Focussing on efficiency, the chunked execution of the model supports multiprocessing.
As data dependencies are not fixed or can be replaced and extended, the presented Python package offers a very flexible and portable way of performing data analyses. Big EO data archives can be analysed both on local, consumer-grade devices, and on cloud-based, high-performance processing platforms without being tied to a specific platform.

Application case: Forest disturbance analyses
To prove the value of the proposed package, we focus on the use case of analysing forest disturbances via remote sensing data. The focus here is deliberately not on the optimisation of the model, i.e. to create the best performing forest disturbance model. Instead, we intentionally choose to address the example with a simple but still effective analysis model aiming at highlighting the conceptual advantages of our approach. Specifically, three beneficial properties of the processing framework are showcased. First, the entity of interest (forest) is a 4D real-world phenomenon that needs to be translated to features in the 2D image domain. This is indeed not unique to the forest entity but applies to all entities in the 4D world (including their relationships). However, in the image domain, entities such as water bodies are spectrally distinct relative to other objects. This makes the selection of useful image features a straightforward task, even without an explicit model that translates properties of the entity to features of the object. Forests, on the other hand, represent a type of vegetation, which is more challenging to distinguish in the image domain. Similar image features may be observed for other vegetated surfaces such as meadows or bogs. Here, the advantage of knowledge-based, semantic modelling with the possibility of an explicit definition of multiple relevant entity properties (and their translation into object features) becomes clear. We create such a model for the entity forest in an undisturbed state by defining the properties of temporal stability (translated to low radar coherence), vitality (translated to a positive NDVI) and altitude below the tree line (translated to an elevation below a thresholded DEM level). We compare this entity definition with a pre-defined one that was derived in a data-driven way. Both entity definitions can be generated without further effort leveraging the data connections pre-implemented in the package. The comparison of both definitions allows an estimation of the uncertainty when modelling the entity forest based on different data sets and approaches. Second, the phenomenon of disturbance is an ambiguous concept. There is no unique, crisp definition of forest disturbances, such that a remote sensing expert needs to make his/her specific assumptions in modelling the phenomenon explicit and transparent in order to discuss it further with other domain experts. Also, there is no simple data-driven way to solve the task of disturbance modelling since there is a lack of available label data. Hence, this example is well suited to be approached by a semantic, knowledge-based modelling approach that allows the resulting human-readable model to be visualised and communicated to others. Third, forest disturbances are inherently process-based, i.e. they are characterised by a temporal change in the forest's status. A datacube-based approach is therefore well positioned to approach this task, as it allows querying every single observation through time instead of relying on pre-processed, aggregated EO products.
Using a multiannual use case design incorporating Sentinel-1, Sentinel-2 and DEM data for a spatial extent of more than 1000 km2, we demonstrate the usability of our package for meso-scale analyses querying all available data references through time.

[1] H. Guo, Z. Liu, H. Jiang, C. Wang, J. Liu, and D. Liang, ‘Big Earth Data: a new challenge and opportunity for Digital Earth’s development’, International Journal of Digital Earth, vol. 10, no. 1, pp. 1–12, Jan. 2017, doi: 10.1080/17538947.2016.1264490.
[2] M. Sudmanns et al., ‘Big Earth data: disruptive changes in Earth observation data management and analysis?’, International Journal of Digital Earth, vol. 13, no. 7, pp. 832–850, Jul. 2020, doi: 10.1080/17538947.2019.1585976.
[3] N. Gorelick, M. Hancher, M. Dixon, S. Ilyushchenko, D. Thau, and R. Moore, ‘Google Earth Engine: Planetary-scale geospatial analysis for everyone’, Remote Sensing of Environment, vol. 202, pp. 18–27, Dec. 2017, doi: 10.1016/j.rse.2017.06.031.
[4] Microsoft Open Source, R. Emanuele, D. Morris, T. Augspurger, and M. McFarland, microsoft/PlanetaryComputer: October 2022. (Oct. 2022). Zenodo. doi: 10.5281/zenodo.7261897.
[5] B. Killough, ‘Overview of the Open Data Cube Initiative’, in IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, Valencia: IEEE, Jul. 2018, pp. 8629–8632. doi: 10.1109/IGARSS.2018.8517694.
[6] H. Augustin, M. Sudmanns, D. Tiede, S. Lang, and A. Baraldi, ‘Semantic Earth Observation Data Cubes’, Data, vol. 4, no. 3, p. 102, Jul. 2019, doi: 10.3390/data4030102.
[7] ‘STAC: SpatioTemporal Asset Catalogs’. Accessed: Nov. 24, 2024. [Online]. Available: https://stacspec.org/en/
[8] L. Van Der Meer, M. Sudmanns, H. Augustin, A. Baraldi, and D. Tiede, ‘Semantic Querying in Earth Observation Data Cubes’, Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., vol. XLVIII-4/W1-2022, pp. 503–510, Aug. 2022, doi: 10.5194/isprs-archives-XLVIII-4-W1-2022-503-2022.
[9] Google, Google blockly - The web-based visual programming editor. (Nov. 25, 2024). TypeScript. Google. Accessed: Nov. 25, 2024. [Online]. Available: https://github.com/google/blockly
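
The on-demand data cube pattern this package builds on (a STAC search feeding a lazily evaluated xarray cube) can be sketched as follows, here with pystac-client and stackstac against the public Earth Search catalogue rather than the gsemantique API itself; the AOI, dates and asset names are illustrative.

```python
# Minimal sketch (not the gsemantique API): build an ad hoc Sentinel-2 cube from
# a STAC search and reduce it over time. Endpoint, AOI and assets are assumptions.
import stackstac
from pystac_client import Client

catalog = Client.open("https://earth-search.aws.element84.com/v1")
items = catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=[13.0, 47.7, 13.2, 47.85],
    datetime="2023-05-01/2023-09-30",
).item_collection()

cube = stackstac.stack(
    items,
    assets=["red", "nir"],
    epsg=32633,
    resolution=20,
    bounds_latlon=[13.0, 47.7, 13.2, 47.85],
)  # dims: (time, band, y, x), lazily backed by dask

red = cube.sel(band="red")
nir = cube.sel(band="nir")
ndvi_max = ((nir - red) / (nir + red)).max("time").compute()
print(ndvi_max.shape)
```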

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: High Performance Desert Analytics: Characterizing Earth Surface Dynamics in Arid Regions Through ‘terrabyte’ and Multi-Sensor Earth Observation Archives

#stac

Authors: Baturalp Arisoy, Florian Betz, Univ.-Prof. Dr. Georg Stauch, Dr. Doris Klein, Univ.-Prof. Dr. Stefan Dech, Prof. Dr. Tobias Ullmann
Affiliations: Earth Observation Research Cluster, University of Würzburg, Chair of Geomorphology, University of Würzburg, German Remote Sensing Data Center, DLR
Cloud-based Earth Observation (EO) analysis has become increasingly accessible due to the rapid growth in EO datasets. By leveraging cloud-stored satellite imagery and high-speed processing capabilities, Earth Observation Data Cubes (EODCs) are emerging as a new paradigm, eliminating the need for high-end personal systems and bulky image download methods. A critical consideration in EODC development is the selection of cloud computing platforms, given the diverse range of available services. While many well-known platforms are popular within the EO community, they are often commercially driven, subjecting users to platform-specific packages and policies. This dependency poses risks, including restricted free access for academics and service discontinuations due to policy or privacy concerns. DLR’s terrabyte is a scientific, non-commercial alternative: a High Performance Data Analytics (HPDA) platform accessible to partner scientists that offers openly available STAC catalogs covering diverse missions, SLURM-based open-source HPC cluster management and, most importantly, a customizable development infrastructure and direct access to the German Remote Sensing Data Center (DFD). This enables users to bring their own code and environments, supporting Python, R, and open-source tools such as Open Data Cube, Xarray, and DASK for parallel computing. terrabyte is therefore well suited to handling large-scale geospatial vector and raster datasets with the support of a supercomputing facility, ensuring independent scientific work without any commercial affiliation. Additionally, the generated results can be transferred and stored in an allocated container. This contribution highlights the use of terrabyte for analyzing land surface dynamics at high spatial and temporal resolution for selected study sites in Mongolia and Kyrgyzstan. In this exemplary project, analysis-ready data cubes will be used as the main data infrastructure for the analysis of surface dynamics. The main research question of the dryland project aims to improve the understanding of the frequency-magnitude relationship between surface dynamics and climate change. Therefore, different time series patterns will be derived from different EO data: trend components to reveal erosion and sedimentation, seasonal patterns for vegetation development, and cyclic patterns for fluctuations in river discharge over multi-year cycles. Storing extensive temporal and spectral data will be beneficial in distinguishing the different time series patterns mentioned above. Moreover, storing such an amount of diverse and long-term data, combined with the computational power of the HPC and recent advancements in machine learning such as foundation models, can significantly enhance our capability to extract accurate information on the Earth's surface over vast areas and at high temporal frequency. As such, terrabyte plays a central role in two key stages: first, designing EODCs for dryland river systems in Mongolia and Kyrgyzstan; and secondly, applying Machine Learning (ML), Deep Neural Network (DNN) and time series analysis algorithms to the analysis-ready data stored in the EODCs to assess surface dynamics. Our code repository automates the retrieval of satellite imagery based on mission specifications, spectral/radar bands, date ranges, and areas of interest (AOIs).
It processes the data by stacking raw bands, calculating spectral indices, and applying mission-specific scale factors to produce a single multidimensional file with band and time dimensions. Furthermore, the automated workflow addresses several challenges that are not sufficiently covered in existing cloud computing systems, for example co-registration of Sentinel-2 scenes, advanced cloud masking options (particularly important in river systems), and geodesy- and GIS-related tasks. In addition to analysis-ready optical and Synthetic Aperture Radar (SAR) imagery, the system integrates various digital elevation models (DEMs) available through STAC APIs, processed using open geomorphometry tools like Whitebox Geospatial Analysis Tools. These outputs are subsequently stacked as individual layers within the data cube. UAV field missions will complement this work by contributing LiDAR point cloud results and very high-resolution multispectral bands to the data cube. The data cubes also function as continuous services, automatically updating existing stacks with newly available scenes through periodic execution of the repository. A caching mechanism ensures that identical AOI requests are not redundantly processed, significantly improving workflow efficiency. Finally, the repository's dependencies are fully defined and easy for other users to install without an overwhelming IT workload, reducing technical barriers for new users and offering a scalable, reproducible framework for remote sensing professionals.
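To make the workflow concrete, the following is a minimal sketch (not the project's actual repository code) of building a small Dask-backed data cube from a STAC catalogue and adding a spectral index layer. The STAC endpoint, collection id and band names are placeholders, and the AOI coordinates are purely illustrative.

```python
# Sketch: search a STAC catalogue, lazily assemble a (time, y, x) cube, add NDVI.
import pystac_client
import odc.stac

catalog = pystac_client.Client.open("https://example.org/stac")  # placeholder endpoint
search = catalog.search(
    collections=["sentinel-2-l2a"],           # placeholder collection id
    bbox=[103.5, 46.0, 104.5, 47.0],          # illustrative AOI
    datetime="2023-04-01/2023-10-31",
)
items = search.item_collection()

# Lazily load the requested bands as a chunked, Dask-backed cube.
cube = odc.stac.load(
    items,
    bands=["red", "nir"],                     # placeholder band names
    chunks={"time": 1, "x": 1024, "y": 1024},
)

# Example spectral index layer and a monthly aggregate as input for trend analysis.
cube["ndvi"] = (cube.nir - cube.red) / (cube.nir + cube.red)
monthly_ndvi = cube.ndvi.resample(time="MS").median()
```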
Add to Google Calendar

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Efficient Satellite Data Management: The Role of the STAC Standard and EOmetadatatool in Open-Source Metadata Harmonization for the Geospatial Domain

#stac

Authors: Michał Bojko, Christoph Reck, Jacek Chojnacki, Jędrzej Bojanowski, Wojciech Bylica, Marcin Niemyjski, Jonas Eberle, Kamil Monicz, Jan Musiał, Tomasz Furtak
Affiliations: CloudFerro S.A., German Aerospace Center (DLR)
The modern era of satellite exploration generates vast amounts of data, playing a pivotal role in science, business, and public administration. The diversity of sources and the growing number of satellite missions have made efficient data management increasingly dependent on advanced solutions. The Copernicus program and its Sentinel missions are prime examples of rapidly expanding sources of satellite data, supporting numerous vital initiatives. One of the main challenges is standardizing metadata from various sources to enable effective comparison and analysis. The STAC (SpatioTemporal Asset Catalog) standard addresses this challenge by providing consistent and harmonized metadata catalogs. Built on JSON, STAC offers an open framework for describing satellite and geospatial data. Its structure, comprising collections, items, and assets, ensures clarity and flexibility. Users can search products based on various metadata, such as sensor parameters, spectral range, or spatial resolution, improving resource identification and comparison. By eliminating disparate formats, STAC simplifies navigation through satellite data, benefiting researchers and enterprises alike. To address the challenge of populating STAC-compliant catalogs with standardized data, the German Aerospace Center (DLR) initially developed EOmetadatatool. This tool was later adapted for the needs of the Copernicus Data Space Ecosystem (CDSE), tailored to meet end-user requirements, and released under an open license. EOmetadatatool is designed to integrate seamlessly with modern data infrastructures, allowing it to read data directly from S3 buckets using user credentials and process metadata asynchronously. This capability makes it ideal for handling large-scale datasets generated by satellite missions. The tool extracts and maps metadata from various formats and structures into the STAC format, ensuring compatibility and consistency. It also supports formatting metadata into custom templates, offering flexibility to cater to specific operational requirements. EOmetadatatool further provides validation modes for STAC compliance and supports direct loading into Postgres/PostGIS stacks via DSN configurations. Supporting over 60 Sentinel mission product types, it simplifies harmonization and enables the creation or updating of catalogs. Built on stactools, a Python library and CLI for working with STAC, it inherits a robust framework from PySTAC for handling geospatial metadata. The presentation will showcase EOmetadatatool's role in metadata harmonization and its integration into STAC-compliant catalogs, emphasizing STAC’s flexible structure for efficient discovery and management. Together, these solutions advance satellite data management, creating a more accessible ecosystem for the ever-growing volume of satellite resources.
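For readers unfamiliar with the item/asset structure, the following is an illustrative sketch of the kind of harmonized STAC item such a tool produces, built with PySTAC; the item id, asset path and property values are invented for the example and do not come from EOmetadatatool.

```python
# Sketch: assemble and validate a single STAC item with one data asset.
from datetime import datetime, timezone
import pystac

item = pystac.Item(
    id="S2B_MSIL2A_20240615T101559_EXAMPLE",          # invented id
    geometry={
        "type": "Polygon",
        "coordinates": [[[10.3, 45.0], [11.4, 45.0], [11.4, 46.0],
                         [10.3, 46.0], [10.3, 45.0]]],
    },
    bbox=[10.3, 45.0, 11.4, 46.0],
    datetime=datetime(2024, 6, 15, 10, 15, 59, tzinfo=timezone.utc),
    properties={"platform": "sentinel-2b", "eo:cloud_cover": 12.4},
)
item.add_asset(
    "B04",
    pystac.Asset(
        href="s3://example-bucket/S2B_MSIL2A_EXAMPLE/B04_10m.jp2",  # placeholder path
        media_type=pystac.MediaType.JPEG2000,
        roles=["data"],
    ),
)
item.validate()                     # schema check before loading into a pgSTAC backend
print(item.to_dict()["properties"])
```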
Add to Google Calendar

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Availability and use of Copernicus data in the commercial ArcGIS Platform

#stac

Authors: Guenter Doerffel
Affiliations: Esri Europe
Services based on Copernicus data in the ArcGIS Living Atlas have been available since 2018 and receive millions of requests annually. They have become an important asset for users and organizations in many industries, as well as in education and research. This presentation will focus on the forms in which the data, and access to it, is offered today, and on what we know about its usage. Capabilities through user interfaces (mobile, web, desktop) and developer tools (Python, JavaScript, REST) will be used as samples. Generic viewer and educational apps, as well as apps for derived results (like global land use time series), will be referenced. The main purpose of the presentation is to share industry requirements and experiences obtained from scaling multi-modal enterprise implementations that are not tightly linked with the EO domain as such. On-demand deep learning and analytics against the services will be discussed. Experiences with integrations into platforms like Creodias and CODE-DE will be summarized. Latest additions, like accessing the same resources through the Copernicus Data Spaces infrastructure (via STAC and the s3-compatible cloud stores) or offerings like the Discomap EEA portal, will complete this overview. A note for the Science committee and any reader of this: easy-to-use integration into a standardized and scaling system that offers access to Copernicus data from "any desk" is massively requested and very much appreciated by our customers. This presentation is more of an experience summary from many years of engagement with these commercial enterprise users, ranging from heavy EO users (like the oil and gas industry) to complete "newbies" who only start engaging because they have access to pre-processed data and methodical descriptions of how to use it. Insights offered through this presentation might be valuable for attending EO companies, and also offer us the opportunity to learn more about the needs of the community at Living Planet 2025.
Add to Google Calendar

Monday 23 June 17:45 - 19:00 (X5 - Poster Area)

Poster: IRIDE Marketplace, a cloud-native data platform to manage the ecosystem of IRIDE Satellite Data and Services in a scalable cloud environment

#cloud-native #stac #parquet #cog

Authors: Mr. Antonio Giancaspro, Mr. Marco Corsi, Mr. Alessio Cruciani, Claudio Scarsella, Mr Fabrice Brito, Mr Davide D'Asaro, Mr Antonio Vollono, Mr Fabio Lo Zito
Affiliations: e-GEOS S.p.a., Terradue S.r.l., Lutech S.p.A, Exprivia S.p.A., Serco Italia S.p.A.
In the framework of the Italian PNRR IRIDE Programme implemented by ESA under the mandate of the Italian Government, the IRIDE Marketplace is the downstream component acting as a) the unique access point to the whole IRIDE offering (IRIDE multi-constellation EO data and geospatial products produced by the IRIDE Service Segment), b) the hosting platform for the IRIDE Service Segment, providing infrastructure and platform services for more than 60 geospatial applications (Service Value Chains, in IRIDE language), and c) an integrated development environment and marketplace for the onboarding of third-party providers, in line with Space Economy trends. Within the overall IRIDE System architecture, the IRIDE Marketplace has been conceived and developed as a cloud-native platform leveraging industry standards both from the geospatial domain (for example, data formats) and from the ICT domain (for example, infrastructure provisioning, DevSecOps pipelines, Security Operations). The IRIDE Marketplace system is designed to allow users to access data, services and applications/toolboxes online with simple models like DaaS (Data as a Service) and SaaS (Software as a Service). The business model for the end-user is based on configurable subscriptions and a credit system that allows flexibility in the management of product offerings, including data, services, applications and toolboxes. The IRIDE Marketplace will additionally support both institutional end-users (e.g. free quota access on the basis of service policy) and private end-users (e.g. fee-based access based on consumption of credits). As anticipated, the IRIDE Marketplace has been designed as a cloud-native geospatial data and service platform focusing on scalability, portability and interoperability. This system serves as a unified access point for IRIDE multi-constellation EO data, geospatial services, and third-party applications, adopting open standards to ensure compatibility and flexibility. The platform leverages OGC API Processes for seamless interoperability, enhancing modular integration through decoupled API layers. Metadata and data discovery are optimized using the SpatioTemporal Asset Catalog (STAC) specification, enabling efficient cataloging and querying of geospatial data. Through the adoption of STAC and cloud-native formats such as Cloud Optimized GeoTIFF (COG) and GeoParquet, the platform supports interoperability and extensibility, minimizing vendor lock-in. Furthermore, standard API services facilitate external M2M interactions, identity integrations, and the connection of common services. The STAC-based Data Catalog employs a stac-fastapi-pgstac backend for efficient metadata organization and query handling, significantly improving the discoverability and usability of multi-source geospatial data. Furthermore, COG and GeoParquet enable efficient data access, storage, and processing, with COG supporting scalable raster visualization and GeoParquet optimizing vector data queries in cloud environments.
To support large-scale geospatial data handling, the Marketplace integrates Data Lake capabilities for lifecycle management, from ingestion to transformation, enabling systematic and user-driven workflows for Analysis Ready Data (ARD) production. The platform also facilitates third-party integration through flexible frameworks for hosted applications, algorithm deployment, and public APIs. In particular, the IRIDE Marketplace further supports its third-party users with a console for managing their own product catalog, with the possibility to select from Earth Observation (EO) product configuration templates covering: 1. DaaS products, structured as STAC collections and accessible directly to end-users; 2. SaaS products, including applications such as geospatial analytics tools and user interfaces. In this way, third-party users can onboard new products, configuring them within the Marketplace to extend offerings to end-users dynamically. At the same time, they can exploit the IRIDE Marketplace's cloud-native technologies and become providers of scalable and interoperable solutions. The IRIDE Marketplace, developed under the Italian PNRR IRIDE Programme implemented by ESA, serves as a unified platform for IRIDE offerings, enabling the onboarding and diffusion of EO services, applications and toolboxes with a consistent and simple approach. As shown, its design prioritizes scalability, portability, and interoperability, supporting diverse use cases and fostering innovation in the geospatial ecosystem through advanced technologies.
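As a rough illustration of the cloud-native access patterns mentioned above (windowed reads from a COG, column-filtered reads from GeoParquet), the sketch below uses rasterio and geopandas. The URLs, band index, AOI bounds and column names are placeholders, not actual IRIDE Marketplace resources, and remote Parquet reads typically require an fsspec-compatible filesystem to be installed.

```python
# Sketch: read only an AOI window from a COG and selected columns from GeoParquet.
import rasterio
from rasterio.windows import from_bounds
import geopandas as gpd

cog_url = "https://example.org/iride/scene.tif"            # placeholder COG
with rasterio.open(cog_url) as src:
    # AOI bounds expressed in the raster's CRS (illustrative values).
    window = from_bounds(12.3, 41.8, 12.6, 42.0, transform=src.transform)
    aoi_array = src.read(1, window=window)                  # fetches only the needed ranges

vector_url = "https://example.org/iride/parcels.parquet"    # placeholder GeoParquet
parcels = gpd.read_parquet(vector_url, columns=["geometry", "crop_type"])
print(aoi_array.shape, len(parcels))
```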
Add to Google Calendar

Monday 23 June 13:15 - 13:35 (EO Arena)

Demo: D.02.23 DEMO - Machine Learning API for Earth Observation Data Cubes

#stac

The Copernicus Data Space Ecosystem (CDSE) and the federated openEO platform have adopted the openEO specification, which was developed to standardize access to cloud-based processing of satellite imagery big data. This specification simplifies data retrieval via STAC and analysis via standardized processes across various applications.
Building on this foundation, we propose a Machine Learning (ML) API for Satellite Image Time Series Analysis, extending the openEO API to integrate ML workflows. This extension allows users to leverage openEO client libraries in R, Python, Julia, and JavaScript while utilizing the R SITS package, which provides specialized ML tools for satellite image time series analysis.
Our ML API supports both traditional ML algorithms (e.g., Random Forest, SVM, XGBoost) and advanced deep learning models (e.g., Temporal Convolutional Networks (TempCNN), Lightweight Temporal Attention Encoder (LightTAE)). A core focus is reproducibility, ensuring transparent tracking of data provenance, model parameters, and workflows. By integrating ML into the openEO specification, we provide scalable, flexible, and interoperable ML tools for Earth Observation (EO) data analysis.
We encapsulated SITS within the openEO ecosystem using a new R package called openeocraft. This empowers scientific communities to efficiently analyze EO data cubes using advanced ML concepts in a simplified manner across multiple programming languages. This work aims to demonstrate the democratization of access to ML workflows for satellite image time series analysis.
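The sketch below shows the standard client-side openEO pattern that such an ML extension builds on, using the openeo Python client. The back-end URL and collection id are placeholders, and the ML process name shown in the comment is hypothetical, standing in for the processes exposed by the extension rather than reproducing its actual API.

```python
# Sketch: standard openEO workflow onto which ML extension processes would plug in.
import openeo

connection = openeo.connect("https://openeo.example.org")   # placeholder back-end
connection.authenticate_oidc()

cube = connection.load_collection(
    "SENTINEL2_L2A",                                        # placeholder collection id
    spatial_extent={"west": 11.0, "south": 46.0, "east": 11.5, "north": 46.5},
    temporal_extent=["2023-01-01", "2023-12-31"],
    bands=["B04", "B08"],
)

# A hypothetical training process exposed by the ML extension could be invoked
# generically by its process id once the back-end provides it, e.g.:
# model = connection.datacube_from_process("ml_fit_class_random_forest",
#                                          data=cube, training_set=samples)

result = cube.save_result(format="GTiff")
job = result.create_job(title="sits-openeo-example")
# job.start_and_wait(); job.get_results().download_files("out/")
```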

Speakers:


  • Brian Pondi - Institute for Geoinformatics, University of Münster
  • Rolf Simoes - OpenGeoHub Foundation
Add to Google Calendar

Monday 23 June 17:00 - 17:20 (EO Arena)

Demo: D.03.25 DEMO - The WorldCereal Reference Data Module: An open harmonized repository of global crop data

#parquet

The ESA-funded WorldCereal system produces seasonal global cropland and crop type maps at 10 m resolution. To support the system, the Reference Data Module (RDM) was created as a repository of harmonized in-situ reference data. The RDM can be accessed through an API and through a web user interface (UI). Through the RDM UI and API, users can browse over 130 public data set collections. Furthermore, users can upload their own data sets (GeoParquet and shapefiles), which can then be used as reference data to run the WorldCereal system. The upload process through the RDM UI is straightforward, with AI-assisted legend mapping to match the WorldCereal standards. If users decide to make a data set public, they will receive support through the interface and from the WorldCereal staff to, for example, select the most appropriate license, check data quality, etc. The WorldCereal RDM will additionally run quality checks to ensure the quality of the data shared. High-quality contributed data sets improve the accuracy of the maps produced in the regions they cover and the overall quality of the WorldCereal system.
The demonstration will include a short introduction to the RDM API and UI, including the AI-assisted legend mapping, as well as the process of making a data set public and the quality checks that are required. Participants can try the system on the spot via a web browser.
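As a small, hedged illustration of what a GeoParquet reference data set might look like before upload, the sketch below builds one with geopandas; the column names and label values are illustrative only and do not represent the WorldCereal legend or RDM schema.

```python
# Sketch: write a minimal in-situ reference data set as GeoParquet.
import geopandas as gpd
from shapely.geometry import Point

samples = gpd.GeoDataFrame(
    {
        "sample_id": ["p001", "p002"],
        "observation_date": ["2023-07-01", "2023-07-03"],
        "crop_label": ["winter_wheat", "maize"],   # would be mapped to the WorldCereal legend at upload
        "geometry": [Point(5.12, 52.09), Point(5.30, 52.15)],
    },
    crs="EPSG:4326",
)
samples.to_parquet("reference_samples.parquet")    # GeoParquet file ready for upload via the RDM UI/API
```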

Speaker:


  • Juan Carlos - IIASA
Add to Google Calendar

Monday 23 June 15:07 - 15:27 (EO Arena)

Demo: D.04.23 DEMO - Leveraging Sentinel Zarr Data

#stac #zarr

The EOPF Sample Service generates and publishes new Sentinel products in Zarr format, enabling scalable and efficient Earth observation analysis. This demonstration introduces tools developed in the activity that allow users to fully exploit these new data products: an xarray EOPF backend and an xcube EOPF data store.
The xarray EOPF backend provides seamless access to individual Sentinel Zarr data products, with additional features to enhance usability, such as aligning all Sentinel-2 bands to a common grid. The xcube EOPF data store builds on this by using the STAC API to locate relevant observations and leveraging the xarray backend to open and process the data. It mosaics and stacks Sentinel tiles along the time axis, creating an analysis-ready data cube for advanced geospatial analysis.
Beyond simple data access, xcube offers powerful processing capabilities, including sub-setting, resampling, and reprojection. It also includes an integrated server and a visualisation tool, xcube Viewer, which efficiently renders multi-resolution data pyramids for fast, interactive exploration. The viewer supports basic data analytics, such as polygon-based statistics, band math, and time series visualisation.
This demonstration will show how to access and process Sentinel Zarr data using these tools. We will introduce the xarray backend, explore the EOPF xcube data store, and showcase how xcube enables the creation and visualisation of analysis-ready data cubes. Participants will learn how to perform efficient geospatial analysis with Sentinel Zarr products in a Python environment.
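For orientation, the sketch below shows a plain xarray/Zarr read of a Sentinel product; the store URL and band name are placeholders, and in practice the dedicated xarray EOPF backend and xcube EOPF data store presented in this demo would replace the generic open_zarr call and add the grid alignment and time stacking described above.

```python
# Sketch: lazy, chunked access to a Sentinel product stored as Zarr.
import xarray as xr

store_url = "https://example.org/eopf/S2A_MSIL2A_example.zarr"  # placeholder URL
ds = xr.open_zarr(store_url, consolidated=True)

# Work on a spatial subset only; chunks are fetched on compute.
b04 = ds["b04"].isel(x=slice(0, 1024), y=slice(0, 1024))        # band name is illustrative
print(float(b04.mean().compute()))
```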

Point of Contact:
Konstantin Ntokas (available on site 23-26 of June)
konstantin.ntokas@brockmann-consult.de
Brockmann Consult GmbH

Speakers:


  • Konstantin Ntokas - Brockmann Consult
Add to Google Calendar

Monday 23 June 16:15 - 17:45 (Hall G2)

Presentation: Cloud Native Copernicus Platform for Latin America and Caribbean (LAC) region

#cloud-native #stac

Authors: Pedro Goncalves, Fabrizio Pacini, Florencio Utreras, Uwe Marquard
Affiliations: Terradue Srl, Universidad de Chile, T-Systems
The CopernicusLAC Platform is a regional Earth Observation (EO) infrastructure designed to address the unique challenges of the Latin America and Caribbean (LAC) region. Its primary objectives include facilitating easy access to Copernicus data, providing robust data processing capabilities, enabling seamless data exchange through federation, and supporting local institutions in achieving independent operational capacity. Developed under a European Space Agency (ESA) contract, the platform is a collaborative effort between Terradue, T-Systems, and the Universidad de Chile. By combining advanced cloud-native technologies with open-source solutions, the platform supports disaster risk reduction, environmental monitoring, and climate resilience efforts. The platform is designed around a cloud-native architecture that emphasizes scalability, flexibility, and interoperability. Kubernetes serves as the backbone for containerized application orchestration, enabling dynamic resource allocation and automated scaling to manage growing data demands, projected to exceed 22 petabytes by 2027. Metadata management adheres to the SpatioTemporal Asset Catalog (STAC) standard, providing efficient and user-friendly data discovery capabilities. Processing workflows leverage Common Workflow Language (CWL) and Argo Workflows to deliver systematic and on-demand solutions for transforming satellite data into actionable information. The platform’s design is aligned with the Earth Observation Exploitation Platform Common Architecture (EOEPCA) and compliant with Open Geospatial Consortium (OGC) standards and best practices. Middleware developed within the platform enables scalable data hosting, processing, and exchange, with all components released as open-source to promote reusability and customization by other initiatives. The platform’s data processing tools are packaged following OGC Best Practices for EO Application Packages, ensuring portability and adaptability for diverse use cases. Beyond its technical innovations, the CopernicusLAC Platform places a strong emphasis on empowering local institutions through capacity building and knowledge transfer. By providing open-source middleware and applications, the platform enables local operators to avoid reliance on expensive commercial software licenses, reducing recurring costs while enhancing autonomy. Open-source solutions offer the flexibility to customize and adapt tools to meet specific regional needs, fostering local ownership and building long-term capacity for independent operation. This approach ensures that institutions in the LAC region can develop a sustainable and resilient ecosystem for EO applications, aligned with their priorities. This presentation will detail the architectural blueprint and system objectives of the CopernicusLAC Platform, highlighting its cloud-native design and integration of open-source technologies within a replicable framework. We will present how the platform addresses objectives such as data accessibility, robust processing, and interoperability, alongside operational strategies such as pre-operational demonstrations and training programs. By sharing lessons learned and challenges encountered, this presentation aims to contribute to the global conversation on advancing Earth Observation platforms through effective capacity building and technology transfer.
The CopernicusLAC Platform demonstrates how regional needs can be addressed through innovative design, international collaboration, and adherence to open standards, while empowering local stakeholders with sustainable, adaptable tools that prioritize their independence and growth.
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall K1)

Presentation: DestinE Data Lake – AI-Driven Insights on Edge Services

#stac

Authors: Oriol Hinojo Comellas, Miruna Stoicescu, Dr. Sina Montazeri, Michael Schick, Danaële
Affiliations: EUMETSAT
This paper describes how the Destination Earth Data Lake (DEDL) is enhanced to allow users to exploit cutting-edge artificial intelligence (AI) and machine learning (ML) capabilities for scientific and policy applications. These advancements focus on creating AI-ready data, developing AI application demonstrators, and providing Machine Learning Operations (MLOps) tooling coupled with robust infrastructure. By unlocking AI/ML capabilities close to the data, DestinE empowers users with advanced tools for processing and analyzing large-scale datasets efficiently and innovatively. The European Commission’s Destination Earth (DestinE) initiative is creating precise digital replicas of the Earth, known as Digital Twins, to monitor and simulate natural and human activities and their interactions. Services and tools available within DestinE enable end-users and policymakers to develop and execute “what-if” scenarios to evaluate the impacts of environmental challenges, such as extreme weather events and climate change, as well as the effectiveness of proposed solutions. Focusing on the Data Lake component, DestinE provides users with unparalleled access to large-scale and diverse datasets, along with a dynamic suite of big data processing services that operate close to these massive data repositories. In practice, the DestinE Data Lake provides a portfolio of services consisting of Harmonised Data Access (HDA), which enables users to access the diverse datasets defined in the DestinE Data Portfolio through a unified STAC API, regardless of data location and underlying access protocol, and Edge services: Islet (compute, storage and network resources that users can instantiate and manage), Stack (ready-to-use applications such as JupyterHub and DASK) and Hooks (ready-to-use or user-defined functions). These services are powered by a geographically distributed cloud infrastructure consisting of a Central Site (Warsaw) and Data Bridges, co-located with EuroHPC sites where the DestinE Digital Twins are running (Kajaani, Bologna and Barcelona) or with large data providers (EUMETSAT). To harness the full potential of artificial intelligence (AI) next to the data, the DestinE Data Lake is evolving its service offering built on three pillars: AI-ready data, AI application demonstrators, and MLOps infrastructure. 1. AI-Ready Data: DestinE’s Data Lake is being equipped with an open-source framework which will facilitate the transformation of the diverse datasets in the DestinE Data Portfolio into AI-ready formats. The framework will provide preprocessing capabilities such as data collocation, reprojection, regridding, resampling, data cleaning, and metadata handling. By tailoring and combining these capabilities, the framework will facilitate the seamless integration of diverse datasets and the usage of outputs within user-defined AI workflows powered by DEDL Edge Services. 2. AI Application Demonstrators: A set of scientific applications in the field of AI/ML, making use of DEDL data and services in combination with EUMETSAT satellite data, aiming to gain more insights into specific parameters (e.g. clouds and cloud types, fire risk, etc.) or to explore potential improvements of existing data products. End users are involved during the development of the applications, to ensure that their expectations are being met. Mature applications will be incorporated in the operational DestinE service offering. The applications, outputs and deliverables from the work undertaken are open source to stimulate uptake and re-use. 3.
MLOps Infrastructure: AI/ML preparedness. The DestinE Data Lake, with GPU-accelerated infrastructure, offers flexible edge computing capabilities. This hybrid setup empowers users to train, evaluate, refine, and deploy AI/ML models of small to medium size. Tools like OpenStack IaaS and pre-deployment resources on Destination Earth bridges lower the barrier to entry for model experimentation and operationalization. In summary, these advancements establish a robust foundation for integrating AI/ML capabilities into the DestinE ecosystem. By streamlining workflows, saving time, and providing scalable tools, DestinE enables users to make data-driven decisions more effectively. Moving forward, the initiative will focus on fostering broader collaboration among stakeholders and expanding the portfolio of AI-ready tools and services, ensuring that DestinE continues to drive innovation and sustainable development.
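To illustrate what discovery through a unified STAC API such as HDA can look like from the user side, the sketch below uses pystac-client; the endpoint URL and collection id are placeholders, and token-based authentication is deliberately omitted.

```python
# Sketch: discover items across collections through a unified STAC endpoint.
from pystac_client import Client

hda = Client.open("https://hda.example.destination-earth.eu/stac")  # placeholder endpoint
search = hda.search(
    collections=["EO.EUM.DAT.SENTINEL-3.SL_2_LST___"],  # illustrative collection id
    bbox=[5.0, 45.0, 15.0, 55.0],
    datetime="2024-07-01/2024-07-31",
    max_items=10,
)
for item in search.items():
    print(item.id, item.datetime, list(item.assets))
```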
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall K1)

Presentation: The Earth Data Hub: redefining access to massive climate and Earth observation datasets using Zarr and Xarray

#stac #zarr

Authors: Alessandro Amici, Nicola Masotti, Luca Fabbri, Francesco Nazzaro, Benedetta Cerruti, Cristiano Carpi
Affiliations: B-Open
The growing need for efficient and rapid access to climate and Earth observation datasets poses several challenges for data providers. As the volume and complexity of data grow, existing infrastructures struggle to handle the demands for high-performance, massive data access. Traditional systems represent a barrier between data and users, who often find themselves struggling between download queues and a growing number of non-standard retrieval APIs. Traditional systems also fail to optimize for specific needs, such as regional and time-series data access. For this reason, many organizations choose to maintain private copies of datasets, resulting in unnecessary effort and storage costs. These inefficiencies hinder data-driven research and delay actionable insights. There is an urgent need for a solution that addresses these issues while providing a seamless, efficient, and user-friendly experience. This solution must support diverse workflows, reduce data transfer overheads, and enable scalable analysis of large, multi-dimensional datasets. This session will introduce Earth Data Hub, a cutting-edge data distribution service that leverages cloud-optimized, analysis-ready technologies to provide researchers, policymakers, and technologists with fast and easy access to more than 3 PB of Earth-related data, including ERA5 reanalysis, the Copernicus DEM and the Destination Earth Climate digital twin. By storing data in a heavily compressed Zarr format and organizing it into optimally sized chunks, Earth Data Hub ensures that users retrieve only the data they need, minimizing unnecessary data transfer. Our design favours workflows involving regional and time series analysis, making it extraordinarily fast to work with geographically limited data or time series at point locations. This efficiency is a key enabler for data-driven and AI research, as it reduces computational overheads and accelerates the time to insight. Earth Data Hub also eliminates many other traditional bottlenecks associated with accessing and analyzing climate datasets, such as download queues and cumbersome retrieval APIs. Instead, Earth Data Hub users can leverage tools like Xarray and Dask to perform distributed computations directly on cloud-hosted data, streamlining workflows and unlocking new possibilities for data exploration and modeling. The platform’s catalogue is organized as a SpatioTemporal Asset Catalog (STAC), which ensures discoverability and standardization. Users can search, filter, and retrieve metadata for a wide array of datasets. By adhering to open standards, Earth Data Hub empowers diverse user communities to integrate these datasets into custom workflows. A crucial innovation underpinning Earth Data Hub is its use of a serverless architecture. Serverless data distribution is an approach in which the distributor does not have to think about or manage backend servers, clusters, caches, queues, requests, adaptors or other infrastructure. This is all done automatically by connecting modern tools such as Xarray and Dask to an object storage instance where the data is kept in Zarr format. This serverless paradigm simplifies maintenance, reduces operational costs, and provides high availability, all while accommodating large-scale, on-demand data access. Functions are executed in response to user requests, ensuring efficient utilization of resources.
The absence of traditional servers reduces latency and ensures that users experience consistent performance regardless of the size or complexity of their queries. This session will showcase the practical usage of Earth Data Hub, starting from the catalogue exploration up to the actual data usage in a Dask and Xarray powered environment (Jupyter notebook). The session will also showcase Earth Data Hub integration with other Destination Earth services such as Insula Code. Industry experts will present insights into the practical benefits of adopting Earth Data Hub services, fostering collaboration among stakeholders and the Destination Earth ecosystem. In summary, Earth Data Hub exemplifies the fusion of advanced data management techniques, cloud-optimized formats, and serverless technologies to create a robust platform for accessing and analyzing climate and Earth observation data. Its innovative design supports scalable, efficient, and user-friendly workflows, making it an indispensable resource for anyone working with complex Earth system datasets. The inclusion of cutting-edge datasets such as the Climate Adaptation Digital Twin within Earth Data Hub underscores its commitment to supporting global climate adaptation and mitigation efforts. By providing easy access to high-resolution, up-to-date, and scientifically rigorous data, the platform plays a critical role in empowering stakeholders to make informed decisions in response to the challenges of climate change.
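The access pattern the service is optimized for can be summarized in a few lines of Xarray: open a cloud-hosted Zarr store lazily and pull only the chunks covering a point time series. The store URL, variable name and coordinate names below are placeholders, not actual Earth Data Hub paths.

```python
# Sketch: lazy Zarr access and point time-series extraction with Xarray/Dask.
import xarray as xr

ds = xr.open_zarr(
    "https://earthdatahub.example/era5/reanalysis.zarr",  # placeholder store URL
    consolidated=True,
)

# Only the chunks covering this location and period are transferred.
series = (
    ds["t2m"]                                             # placeholder variable name
    .sel(latitude=48.2, longitude=16.37, method="nearest")
    .sel(time=slice("2020-01-01", "2020-12-31"))
)
print(float(series.mean().compute()))
```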
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall K1)

Presentation: Leveraging Insula for Advanced Earth Observation Data Processing: Use Cases in Atmospheric Correction and Evapotranspiration Estimation

#zarr

Authors: Cesare Rossi, Beatrice Gottardi, Davide Foschi, Davide Giorgiutti, Stefano Marra, Alessandro Marin, Gaetano Pace
Affiliations: CGI
Insula, the hub between data and decisions, is an advanced platform for Earth Observation (EO) analytics that leverages powerful big data capabilities to provide users with a deep understanding of Earth-related phenomena. Insula integrates cutting-edge and production-ready technologies, such as Kubernetes and Argo Workflows, offering a seamless user experience. Its features enable users to perform complex analyses, such as trend analysis, anomaly detection, predictive modeling, and much more, by harnessing harmonized and integrated datasets. With customizable visualization analytics, autoscaling processing campaigns, and real-time monitoring, Insula empowers different actors throughout the value chain to extract actionable insights efficiently, in line with their background and objectives. Insula's flexible environment supports the integration of new data sources and services, ensuring that researchers can adapt the platform to their specific needs. The availability of Python libraries such as GDAL, Rasterio, and xarray, and the incorporation of AI and machine learning (ML) capabilities, with popular libraries like TensorFlow, PyTorch, and scikit-learn, further enhance the platform’s versatility, enabling expert users to perform coding activities directly within Insula. Two use cases demonstrate the powerful applications of Insula in EO analysis: Sentinel-3 OLCI (Ocean and Land Color Instrument) L1 to L2 C2RCC conversion and standard evapotranspiration estimation using ERA5. In the first use case, Insula was employed to process Sentinel-3 OLCI data and perform atmospheric correction using the C2RCC algorithm and multi-sensor pixel classification with the IdePix tool. Insula’s integrated environment allowed for seamless execution of this complex process, enabling the extraction of key physical variables such as chlorophyll concentration and water quality parameters. By leveraging the platform’s autoscaling capabilities, the processing of large datasets can be optimized for efficiency, ensuring timely results. The final outputs were used for classifying oceanic and coastal regions, providing valuable insights for marine research and management. In the second use case, Insula was used to calculate standard evapotranspiration (ET0) using the Penman-Monteith method, a widely accepted approach for estimating water loss from soil and vegetation. The platform facilitated the integration of the newly optimized ERA5 hourly reanalysis dataset (in Zarr format) to compute ET0 maps at regional scales. Insula’s processing environment allowed for the smooth execution of the complex calculations while incorporating real-time data updates. A standout feature of Insula is “Insula Perception”, which excels in its visualization and analytics capabilities, enabling effective collaboration between teams and organizations. These features were instrumental in both use cases, enabling a clearer understanding and communication of results. These use cases highlight Insula’s ability to manage diverse datasets which are critical for climate studies, water resource management, and agriculture, to mention a few. Moreover, the ability to collaborate and share findings fosters a data-driven approach to Earth observation, facilitating decision-making at various scales and enhancing interdisciplinary research across domains. In summary, Insula, the hub between data and decisions, is a powerful tool for researchers, policymakers, and industries reliant on EO for informed decision-making.
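For the second use case, the daily FAO-56 Penman-Monteith equation underlying ET0 can be written compactly on ERA5-like xarray inputs. The sketch below is a standalone, simplified illustration of that formula, not Insula's implementation; the variable names, unit assumptions and the constant psychrometric value are simplifications made for the example.

```python
# Sketch: daily FAO-56 Penman-Monteith reference evapotranspiration (ET0).
import numpy as np
import xarray as xr

def et0_fao56(t2m_c, rn_mj, g_mj, u2, rh):
    """ET0 [mm/day] from mean air temperature [degC], net radiation and soil heat
    flux [MJ m-2 day-1], 2 m wind speed [m/s] and relative humidity [%]."""
    es = 0.6108 * np.exp(17.27 * t2m_c / (t2m_c + 237.3))   # saturation vapour pressure [kPa]
    ea = es * rh / 100.0                                     # actual vapour pressure [kPa]
    delta = 4098.0 * es / (t2m_c + 237.3) ** 2               # slope of vapour pressure curve [kPa/degC]
    gamma = 0.066                                            # psychrometric constant [kPa/degC], assumed constant
    num = 0.408 * delta * (rn_mj - g_mj) + gamma * (900.0 / (t2m_c + 273.0)) * u2 * (es - ea)
    return num / (delta + gamma * (1.0 + 0.34 * u2))

# Hypothetical usage on a daily-aggregated ERA5-derived dataset:
# ds = xr.open_zarr("era5_daily.zarr")                       # placeholder dataset
# et0 = et0_fao56(ds.t2m - 273.15, ds.rn, ds.g, ds.u2, ds.rh)
```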
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall K1)

Presentation: Global Fish Tracking System (GFTS): Harnessing Technological Innovations for Conservation and Sustainable Resource Management

#pangeo #zarr

Authors: Daniel Wiesmann, Anne Fouilloux, Tina Odaka, Benjamin Ragan-Kelley, Mathieu Woillez, Quentin Mazouni, Emmanuelle Autret
Affiliations: Development Seed, Simula Research Laboratory, LOPS (Laboratory for Ocean Physics and Satellite remote sensing), DECOD (Ecosystem Dynamics and Sustainability)
Advancing our understanding of marine ecosystems and fostering sustainable resource use requires innovative technological applications. The Global Fish Tracking System (GFTS) represents a pioneering initiative that harnesses the power of advanced technologies to model fish movements and migration patterns, facilitating informed conservation strategies and sustainable management practices. GFTS operates within the European Union's Destination Earth (DestinE) initiative, integrating high-resolution digital replicas of Earth systems and diverse datasets from DestinE, Copernicus Marine Services, and biologging data stored in the European Tracking Networks (ETN) database. The GFTS leverages the Pangeo ecosystem and a suite of technologies, such as Pangeo-fish, Jupyter, HEALPix, xDGGS, Xarray, and Zarr, for cloud-based data processing and analysis. In line with the session's emphasis on next-generation Earth observation and high-resolution system modeling, the GFTS employs the Climate Change Adaptation Digital Twin to evaluate future environmental conditions for essential fish habitats. By integrating ocean physics with fish ecology, the GFTS provides new perspectives on essential fish habitats such as migration swimways and spawning grounds, facilitating effective conservation strategies. The GFTS addresses the session's focus on big data management and integration, utilizing advanced data management techniques for handling vast amounts of biologging data. The system's use of high-performance computing and cloud platforms aligns with the session's focus on leveraging these technologies for large-scale data handling and computational demands. Through the development of decision support tools, GFTS transforms complex datasets into actionable insights, highlighting the importance of interactive and intuitive visualization tools in making data accessible for scientists and policymakers. These technologies embody the session's emphasis on visualization and user interaction, demonstrating the potential of digital twins in enhancing accessibility and reproducibility of scientific findings. The GFTS initiative demonstrates the crucial role of technological innovations in addressing environmental data challenges, offering a practical example of applying a Digital Twin of the Earth system for sustainable resource management and conservation. This project underscores the potential of technological advancements in revolutionizing our understanding and management of marine ecosystems, presenting a compelling case study for discussion in this session.
Add to Google Calendar

Monday 23 June 16:15 - 17:45 (Hall K1)

Presentation: Environmental Digital Twins Based on the interTwin DTE Blue-Print Architecture

#stac

Authors: Juraj Zvolensky, Michele Claus, Iacopo Ferrario, Andrea Manzi, Matteo Bunino, Maria Girone, Miguel Caballer, Sean Hoyal, Alexander Jacob
Affiliations: Eurac Research, EGI, CERN, UPV, EODC
The Horizon Europe interTwin project is developing a highly generic yet powerful Digital Twin Engine (DTE) to support interdisciplinary Digital Twins (DT). Comprising thirty-one high-profile scientific partner institutions, the project brings together infrastructure providers, technology providers, and DT use cases from Climate Research and Environmental Monitoring, High Energy and Astro Particle Physics, and Radio Astronomy. This group of experts enables the co-design of the DTE Blueprint Architecture and the prototype platform, benefiting end users like scientists and policymakers as well as DT developers. It achieves this by significantly simplifying the process of creating and managing complex Digital Twin workflows. There are six use cases co-designing and validating the interTwin DTE in Climate Research and Environmental Monitoring, ranging from early warning for floods and droughts to climate impact assessment. The DTs exploit one of the most promising applications of digital twins of the Earth, the simulation of user-defined what-if scenarios, by allowing the selection of a high number of different input datasets, models, model parameters, regions and periods of time. This talk will highlight in more technical detail the implementation of the drought early warning DT and the components utilized from the Digital Twin Engine. Motivated by the goal of contributing to climate change adaptation measures and recognizing the importance of seasonal forecasts as a crucial tool for early warning systems and disaster preparedness, we are developing a hydrological seasonal forecasting digital twin for the Alpine region to tackle the critical challenge of drought risk management. The analysis of historical observations shows that the pattern and intensity of precipitation and temperature trends are changing over the European Alpine region (Brunner et al. 2023), with important consequences for the management of water resources in the Alpine and downstream basins. The modelling workflow of the proposed forecasting system is based on the integration of physically based models, artificial intelligence, climate forcings and satellite-based estimates. We believe that the complexity of such a workflow can effectively showcase the benefits of developing a digital twin. The project emphasizes reproducibility and portability, adhering to the principles of FAIR (Findable, Accessible, Interoperable, Reusable) and open science to ensure transparency, usability, and widespread applicability of the results. All software components are built as new open-source software (https://github.com/orgs/interTwin-eu/ ) or contribute to existing open-source projects. To achieve these goals, we adopt cutting-edge technologies widely recognized within the Earth Observation (EO) and environmental modeling communities. The openEO API, a standardized interface for processing large geospatial datasets, enables seamless integration of remote sensing data, while the SpatioTemporal Asset Catalog (STAC) API facilitates efficient data discovery and management. Together, these technologies form the backbone of our data pipeline, enabling scalable and efficient workflows. A distinguishing feature of our approach is the use of containerized workflows, implemented using the Common Workflow Language (CWL) (Amstutz et al. 2018). CWL provides a standardized, flexible framework for defining and executing computational workflows, ensuring consistency and repeatability across different computing environments.
However, the integration of CWL with APIs like openEO and STAC in the Earth Observation domain presents unique challenges. Real-world examples of such integrations are sparse, requiring us to pioneer innovative solutions that bridge these technologies. This involves addressing complexities in workflow orchestration, data handling, and inter-API communication to build a robust and interoperable system. The itwinai core module of interTwin allows seamless integration of data-driven modelling with our workflows. It enables the development and deployment of complex deep learning models in scalable HPC and cloud environments. The DT is deployable using standard TOSCA templates and utilizes High-Performance Computing (HPC) instances to accommodate the computational demands of large-scale simulations and data processing. This deployment ensures scalability, enabling the system to handle extensive datasets and support a diverse range of applications. By leveraging distributed computing resources, we aim to create a responsive and adaptive framework capable of addressing dynamic environmental challenges. The interTwin project represents a significant step forward in the application of Digital Twins to environmental monitoring and prediction. By integrating state-of-the-art technologies with open science principles, we aim to deliver a powerful tool for drought prediction that is not only accurate and reliable but also accessible to researchers and policymakers. Our work paves the way for broader adoption of Digital Twin technologies in the Earth Observation community, offering a replicable and scalable model for tackling global environmental issues. In doing so, we hope to contribute to the development of resilient and sustainable systems capable of mitigating the impacts of climate change and environmental degradation. Amstutz, P. (Ed.), Crusoe, M. R. (Ed.), Tijanić, N. (Ed.), Chapman, B., Chilton, J., Heuer, M., Kartashov, A., Leehr, D., Ménager, H., Nedeljkovich, M., Scales, M., Soiland-Reyes, S., & Stojanovic, L. (2016). Common Workflow Language, v1.0. figshare. https://doi.org/10.6084/m9.figshare.3115156.v2 Brunner, M. I., Götte, J., Schlemper, C., & van Loon, A. F. (2023). Hydrological Drought Generation Processes and Severity Are Changing in the Alps. Geophysical Research Letters, 50(2). https://doi.org/10.1029/2022GL101776
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall N1/N2)

Session: D.03.02 Free Open Source Software for the Geospatial Domain: current status & evolution - PART 1

#cloud-native

Free and Open Source Software has a key role in the geospatial and EO communities, fostered by organizations such as OSGeo, the Cloud Native Computing Foundation and Apache, and by space agencies such as ESA and NASA. This session showcases the status of OSS tools and applications in the EO domain, and their foreseen evolution, with a focus on innovation and support for open science challenges.
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall N1/N2)

Presentation: Pangeo Europe: A Community-Driven Approach to Advancing Open Source Earth Observation Tools Across Disciplines

#pangeo #zarr

Authors: Deyan Samardzhiev, Anne Fouilloux, Tina Odaka, Benjamin Ragan-Kelley
Affiliations: Lampata, Simula Research Laboratory, IFREMER
The Pangeo Europe community embodies the collaborative movement of free and open-source software (FOSS), transforming Earth Observation (EO) by fostering innovation and inclusivity. As an open platform, Pangeo Europe not only empowers researchers, industry professionals, and software developers within geosciences but also extends its reach to other disciplines, creating synergies that enhance the robustness and adaptability of its tools and workflows. This presentation will explore how Pangeo Europe is advancing the state of open source EO tools through community engagement and cross-disciplinary collaboration, and the integration of flagship European initiatives such as EarthCODE:

- Engaging the Community for Feedback and Roadmaps: Pangeo Europe actively involves users and developers in shaping its roadmap for future developments, including innovations like xDGGS, advanced Zarr implementations, and benchmarking tools. This participatory approach ensures the tools meet real-world needs and adapt to emerging scientific challenges.
- Facilitating End-User Adoption: Through training programs, onboarding resources, and addressing bottlenecks in technology adoption, Pangeo Europe makes it easier for researchers and practitioners to adopt and use advanced open-source EO tools effectively.
- Providing a Forum for Open Source Developers: The community creates opportunities for developers to collaborate, discuss best practices, and seek funding for advancing open-source software. By harnessing Europe’s collective expertise, Pangeo Europe strengthens the ecosystem for open geospatial innovation.
- Promoting FAIR Principles and Performance Benchmarks: The integration of FAIR (Findable, Accessible, Interoperable, Reusable) principles, alongside efforts to benchmark and optimise tool performance, ensures that Pangeo Europe delivers transparent, scalable, and efficient solutions.
- Showcasing EarthCODE for Collaborative Innovation: EarthCODE, the Earth Science Collaborative Open Development Environment, exemplifies the principles of Open Science and FAIR data management. Developed in response to FutureEO Independent Science Review 2022 recommendations and feedback from the 2023 Science Strategy Workshop, EarthCODE provides a unified, open-access environment for ESA-funded Earth System Science activities. It integrates key scalable cloud computing platforms and ecosystems for Earth Observation (EO) analysis such as Pangeo. It provides secure, long-term storage for research data and enables scientists to seamlessly share their research data and workflow outputs while adhering to FAIR and open science principles. Additionally, EarthCODE offers robust community support and comprehensive training, led by Pangeo. EarthCODE empowers scientists to share and reuse research outputs more effectively, facilitating collaboration and innovation across disciplines.
- Driving Cross-Disciplinary Synergies: By promoting the Pangeo software stack beyond geosciences, such as in bioimaging and cosmology, the community identifies shared challenges and solutions across domains. For instance, collaborations around data formats and metadata, such as OME-Zarr versus GEO-Zarr, foster innovation and help build generic, scalable, and reusable workflows for diverse scientific applications.
- Collaborating on Global Challenges: The cross-disciplinary outreach of Pangeo Europe not only broadens the applicability of its tools but also encourages a culture of co-creation where best practices from different domains converge to tackle challenges like climate change, biodiversity monitoring, and cosmological data analysis. For instance, the Global Fish Tracking System (GFTS), an ESA-funded use case of the Destination Earth initiative, uses the Pangeo ecosystem to model fish movement and develop a decision support tool in support of conservation policies.

By showcasing real-world examples and the impact of its community-driven model, this session will highlight how Pangeo Europe fosters open innovation across disciplines, making EO tools more robust, generic, and adaptable while advancing the frontiers of open science.
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall N1/N2)

Presentation: Scalable Workflows for Remote Sensing Data Analysis in Julia

#zarr

Authors: Lazaro Alonso, Fabian Gans, Felix Cremer
Affiliations: Max-Planck Institute for Biogeochemistry
Analysis-ready remote sensing data cubes provide a basis for a variety of scientific applications. In essence, they allow users to access remote sensing data as very large multi-dimensional arrays stored either on traditional hard drives or in object stores in cloud environments [1-2]. However, to actually work with the analysis-ready data, convenient software for accessing and analyzing these large datasets in an efficient and distributed way is required. Many data processing tasks are I/O-limited, so to maximize efficiency, data processing tools need to be aware of the internal chunking structure of the datasets. The goal is to apply user-defined functions over arbitrary dimension slices of large datasets or to perform groupby-combine operations along dimensions. A very popular tool for these tasks in the Python programming language is xarray, using Dask for delayed and distributed evaluation by mapping storage chunks to nodes of a processing graph. However, for very large problems or some mapreduce-like operations, task graphs can become very large or hard to resolve, so users trying to scale their algorithms to the multi-terabyte scale regularly run into issues of unresolvable task graphs when applying their user-defined algorithms. Specialized computing backends for certain tasks, like flox for mapreduce, have been developed to mitigate these problems. The Julia programming language is designed for scientific computing and is known for its good performance and scalability in scientific applications. Here we present a family of Julia libraries that are designed to work together and cover the full stack of data cube analysis, such as data access across different gridded data formats, data models for labeled dimensional arrays, and tooling for distributed large-scale processing of data cubes. DiskArrays.jl provides a common interface for dealing with high-latency multidimensional array data sources and provides the array interface for I/O packages like NetCDF.jl, Zarr.jl, and ArchGDAL.jl. DimensionalData.jl wraps DiskArrays into arrays with labeled dimensions and very fast dimension-lookup based indexing. DiskArrayEngine.jl is built on top of DiskArrays.jl and provides efficient application of user-defined functions in a moving-window fashion over huge disk-based or remote arrays. It is aware of the data's underlying chunking structure when scheduling distributed computations, but does not enforce a strict mapping between chunk storage units and DAG nodes, and therefore avoids materializing huge graphs in memory and the scaling issues that follow from this. It forms the basis of YAXArrays.jl, which provides a user-friendly interface to the functionality provided by DiskArrayEngine.jl, enabling users to run their custom analysis functions over combinations of slices and windows in different dimensions as well as flexible reductions. Distributed computing is supported by Distributed.jl for embarrassingly parallel operations or Dagger.jl for mapreduce-based operations. These operations scale to very large multi-terabyte arrays and don't break down even when provided arrays with millions of small chunks. We present benchmarks showing that YAXArrays.jl can compete with state-of-the-art Python libraries like flox for mapreduce operations.
Although some parts of the software stack presented here are still under development, they have been successfully used in a range of scientific applications, such as extreme event detection [3], characterization of vegetation-climate dynamics [4], and Sentinel-1 based forest change detection at the European continental scale [5], with very promising results. These libraries might become an important building block enabling the Julia remote sensing community to bring their high-resolution remote sensing applications to the continental and global scale. [1] https://esd.copernicus.org/articles/11/201/2020/ [2] https://arxiv.org/abs/2404.13105 [3] https://bg.copernicus.org/articles/18/39/2021/ [4] https://bg.copernicus.org/articles/17/945/2020/bg-17-945-2020.html
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Hall N1/N2)

Presentation: Enabling Large-Scale Earth Observation Data Analytics with Open Source Software

#stac

Authors: Gilberto Camara, Dr Rolf Simoes, Felipe Souza, Felipe Carlos
Affiliations: National Institute For Space Research (inpe), Brazil, Open Geo Hub Foundation, Menino Software Crafter
The current standard approach for open source software for big Earth observation (EO) data analytics uses cloud computing services that allow data access and processing [1]. Given such large data availability, the EO community could benefit significantly from open-source solutions that access data from cloud providers and provide comprehensive data analytics support, including machine learning and deep learning methods. To fill this gap, the authors have developed the R package sits, an end-to-end environment for big EO data analytics based on a user-focused API. Using a time-first, space-later approach, it supports land classification with deep learning methods. It offers several capabilities not currently available together in other EO open-source analytics software. The sits package focuses on time series analytics. The aim is to use as much data as possible from the big EO data collections. Many EO analysis packages only support the classification of single-date or seasonal composites. In such cases, most of the temporal land use and land cover variation produced by the repeated coverage of remote sensing satellites is not used. Time series are a powerful tool for monitoring change, providing insights and information that single snapshots cannot achieve. Spatiotemporal data analysis is an innovative way to address global challenges like climate change, biodiversity preservation, and sustainable agriculture [2-4]. The sits package uses a "time-first, space-later" approach [5] that takes image time series as the first step to analyse remote sensing data. Time series classification produces a matrix of probability values for each class. In the "space-later" part of the method, a Bayesian smoothing algorithm improves these classification results by considering each pixel's spatial neighbourhood. Thus, the result combines spatial and temporal information [6]. An essential capability of the package is its support for multiple EO cloud services. Using the STAC protocol [7], users can access services such as the Copernicus Data Space Ecosystem (CDSE), Amazon Web Services (AWS), Microsoft Planetary Computer (MPC), Digital Earth Africa, Digital Earth Australia, NASA's Harmonized Landsat-Sentinel collection, and the Brazil Data Cube (BDC). However, each provider has a particular implementation of STAC. Dealing with such a lack of complete standardisation required substantial work by the authors. Machine learning and deep learning algorithms for spatiotemporal data require that analysis-ready data (ARD) from EO cloud services be converted to regular data cubes. Appel and Pebesma [8] define a data cube as an n-dimensional matrix of cells combining a 2D geographical location, a 1D set of temporal intervals, and a k-dimensional set of attributes. For each position in space, the data cube should provide a multidimensional time series. The data cube should give a valid 2D image for each time interval. Their definition is the basis for software design in packages such as OpenEO [9] and OpenDataCube [10]. In sits, we have extended the data cube definition by Appel and Pebesma [8] to include a further spatial dimension related to the spatial organisation used by ARD image collections. For example, Sentinel-2 images are organised in the MGRS tiling system, which follows the UTM grid. Thus, to process data spanning multiple UTM grid zones, EO data cubes need an extra dimension, given by the ARD tile.
This extension enables sits to process large-scale data, unlike systems that adopt a more restricted data cube definition. For classification, sits supports a range of machine learning and deep learning algorithms, including support vector machines, random forests, temporal convolutional neural networks, and temporal attention encoders. It also includes object-based time series classification methods and spatial-temporal segmentation, allowing for detailed analysis of land cover changes over time and space. One relevant feature of sits is its support for active learning [11], combining SOM (self-organised maps) and uncertainty estimation. SOM is a technique where high-dimensional data is mapped into a two-dimensional map [12]. The neighbours of each neuron of a SOM map provide information on intraclass and interclass variability, which helps detect noisy samples [13]. SOM maps are helpful in improving sample quality. Selecting good training samples for machine learning classification of satellite images is critical to achieving accurate results. Currently, SOM is one of the few data analysis methods that enables the quality of each sample to be assessed independently of the other samples for the same class. Active learning is an iterative strategy for optimising training samples. At each round, users analyse the data using the SOM maps to remove outliers; then, they classify the area and compute uncertainty maps to define critical areas for new sample collection. Users repeat this procedure until they obtain a final set of low-noise and high-validity training samples. To support Bayesian smoothing and uncertainty estimates, the output of machine learning classifiers in sits is a set of probability matrices. Each pixel's time series is assigned a set of probabilities estimated by the classifier, one for each class. Using probability maps as an intermediate step between the classification algorithm and the categorical maps has proven highly relevant for improving the results of land classification. In conclusion, sits provides an integrated workflow for satellite data handling, including pre-processing, sampling, feature extraction, modelling, classification, post-classification analysis, uncertainty estimation and accuracy assessment. Designed with a clear and direct set of functions, it is accessible to users with basic programming knowledge. Its easy-to-learn API simplifies the complex tasks associated with large-scale EO data analysis. The open-source software is available in the standard R repository CRAN, and the source code is available at https://github.com/e-sensing/sits. The online book at https://e-sensing.github.io/sitsbook/ enables step-by-step learning with many examples. References: [1] Vitor C. F. Gomes, Gilberto R. Queiroz, and Karine R. Ferreira. “An Overview of Platforms for Big Earth Observation Data Management and Analysis”. In: Remote Sensing 12.8 (2020), p. 1253. [2] Curtis E. Woodcock et al. “Transitioning from Change Detection to Monitoring with Remote Sensing: A Paradigm Shift”. In: Remote Sensing of Environment 238 (2020), p. 111558. ISSN: 00344257. [3] Valerie J. Pasquarella et al. “From Imagery to Ecology: Leveraging Time Series of All Available LANDSAT Observations to Map and Monitor Ecosystem State and Dynamics”. In: Remote Sensing in Ecology and Conservation 2.3 (2016), pp. 152–170. ISSN: 2056-3485. [4] Michelle Picoli et al. “Big Earth Observation Time Series Analysis for Monitoring Brazilian Agriculture”.
In: ISPRS Journal of Photogrammetry and Remote Sensing 145 (2018), pp. 328–339. [5] Gilberto Camara et al. “Big Earth Observation Data Analytics: Matching Requirements to System Architectures”. In: 5th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data. Burlingame, CA, USA: ACM, 2016, pp. 1–6. [6] Rolf Simoes et al. “Satellite Image Time Series Analysis for Big Earth Observation Data”. In: Remote Sensing 13.13 (2021), p. 2428. [7] M. Hanson. “The Open-source Software Ecosystem for Leveraging Public Datasets in Spatio-Temporal Asset Catalogs (STAC)”. In: AGU Fall Meeting Abstracts 23 (2019). [8] Marius Appel and Edzer Pebesma. “On-Demand Processing of Data Cubes from Satellite Image Collections with the Gdalcubes Library”. In: Data 4.3 (2019). [9] Matthias Schramm et al. “The openEO API: Harmonising the Use of Earth Observation Cloud Services Using Virtual Data Cube Functionalities”. In: Remote Sensing 13.6 (2021), p. 1125. [10] Adam Lewis et al. “The Australian Geoscience Data Cube — Foundations and Lessons Learned”. In: Remote Sensing of Environment 202 (2017), pp. 276–292. [11] M. M. Crawford, D. Tuia, and H. L. Yang. “Active Learning: Any Value for Classification of Remotely Sensed Data?” In: Proceedings of the IEEE 101.3 (2013), pp. 593–608. ISSN: 1558-2256. [12] T. Kohonen. “The Self-Organizing Map”. In: Proceedings of the IEEE 78.9 (1990), pp. 1464–1480. ISSN: 1558-2256. [13] Lorena A. Santos et al. “Quality Control and Class Noise Reduction of Satellite Image Time Series”. In: ISPRS Journal of Photogrammetry and Remote Sensing 177 (2021), pp. 75–88.
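To make the STAC-based access concrete: sits itself is an R package, but the discovery step it performs against providers such as the Microsoft Planetary Computer can be sketched in Python with pystac-client. The collection, bounding box and date range below are illustrative assumptions, not values taken from the abstract.

    from pystac_client import Client

    # Open a public STAC API (Microsoft Planetary Computer, one of the providers listed above)
    catalog = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")

    # Search an example collection over a small, arbitrary area and one year of data
    search = catalog.search(
        collections=["sentinel-2-l2a"],
        bbox=[-60.0, -3.5, -59.5, -3.0],
        datetime="2022-01-01/2022-12-31",
    )
    items = list(search.items())
    print(len(items), "scenes found; in sits these would feed a regular data cube")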
Add to Google Calendar

Monday 23 June 16:15 - 17:45 (Hall N1/N2)

Session: D.03.02 Free Open Source Software for the Geospatial Domain: current status & evolution - PART 2

#cloud-native

Free and Open Source Software has a key role in the geospatial and EO communities, fostered by organizations such as OSGeo, the Cloud Native Computing Foundation, and Apache, and by space agencies such as ESA and NASA. This session showcases the status of OSS tools and applications in the EO domain, and their foreseen evolution, with a focus on innovation and support for open science challenges.
Add to Google Calendar

Monday 23 June 16:15 - 17:45 (Hall N1/N2)

Presentation: KNeo: yet another cloud-native platform for scalable and automated EO data processing

#cloud-native

Authors: Dr. Marian Neagul, Dr. Markus Neteler, Vasile Craciunescu, Mrs. Carmen Tawalika, Mrs. Anika Weinmann, Mr. Dan Avram
Affiliations: Terrasigna, mundialis
The Kubernetes Native Earth Observation (KNeo) platform humbly aims to improve EO data processing through the integration of cutting-edge cloud-native technologies. Building on the Terrasigna Y22 and mundialis actinia cloud processing platforms, KNeo employs serverless computing, event-driven architectures and open-source solutions to deliver scalable, efficient and cost-effective EO data services. The platform leverages Knative for serverless operations, integrates OpenID Connect for authentication and adopts ZOO Project as a core implementation of OGC API Processes. The platform supports a diverse range of use cases, including deforestation detection, forest classification, single-tree detection, urban green roof identification and irrigation suitability mapping. These use cases employ a variety of EO data (e.g. Sentinel-1, Sentinel-2 and aerial imagery) and advanced analytical methods, such as time-series analysis, vegetation indices, image segmentation and raster algebra. All these innovations are aligned with ESA’s Earth Observation Exploitation Platform Common Architecture.
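For orientation, an OGC API Processes server such as the one KNeo builds on is typically driven with plain HTTP requests; the endpoint URL, process identifier and inputs below are hypothetical placeholders rather than KNeo's actual deployment.

    import requests

    base = "https://kneo.example.org/ogc-api"  # hypothetical endpoint

    # List the processes advertised by the server (standard OGC API Processes route)
    processes = requests.get(f"{base}/processes").json()
    print([p["id"] for p in processes.get("processes", [])])

    # Submit an asynchronous execution request for a (hypothetical) process
    body = {"inputs": {"collection": "sentinel-2-l2a", "bbox": [24.0, 45.5, 25.0, 46.0]}}
    response = requests.post(
        f"{base}/processes/forest-classification/execution",
        json=body,
        headers={"Prefer": "respond-async"},
    )
    print(response.status_code, response.headers.get("Location"))  # job status URL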
Add to Google Calendar

Monday 23 June 16:15 - 17:45 (Hall N1/N2)

Presentation: Geospatial Machine Learning Libraries and the Road to TorchGeo 1.0

#stac

Authors: Adam Stewart, Nils Lehmann, Xiao Xiang Zhu
Affiliations: Technical University of Munich
The growth of machine learning frameworks like PyTorch, TensorFlow, and scikit-learn has also sparked the development of a number of geospatial domain libraries. In this talk, we break down popular geospatial machine learning libraries, including: 1. TorchGeo (PyTorch) 2. eo-learn (scikit-learn) 3. Raster Vision (PyTorch, TensorFlow*) 4. PaddleRS (PaddlePaddle) 5. segment-geospatial (PyTorch) 6. DeepForest (PyTorch) 7. TerraTorch (PyTorch) 8. SITS (R Torch) 9. scikit-eo (scikit-learn, TensorFlow) For each library, we compare the features they have as well as various GitHub and download metrics that emphasize the relative popularity and growth of each library. In particular, we promote metrics including the number of contributors, forks, and test coverage as useful for gauging the long-term health of each software community. Among these libraries, TorchGeo stands out with more builtin data loaders and pre-trained model weights than all other libraries combined. TorchGeo also boasts the highest number of contributors and test coverage, while Raster Vision has the most forks and segment-geospatial has the most stars on GitHub. We highlight particularly desirable features of these libraries, including a command-line or graphical user interface, the ability to automatically reproject and resample geospatial data, support for the spatio-temporal asset catalog (STAC), and time series support. The results of this literature review are regularly updated with input from the developers of each software library and can be found here: https://torchgeo.readthedocs.io/en/stable/user/alternatives.html Among the above highly desirable features, the one TorchGeo would most benefit from adding is better time series support. Geotemporal data (time series data that is coupled with geospatial information) is a growing trend in Earth Observation, and is crucial for a number of important applications, including weather and climate forecasting, air quality monitoring, crop yield prediction, and natural disaster response. However, TorchGeo has only partial support for geotemporal data, and lacks the data loaders or models to make effective use of geotemporal metadata. In this talk, we highlight steps TorchGeo is taking to revolutionize how geospatial machine learning libraries handle spatiotemporal information. In addition to the preprocessing transforms, time series models, and change detection trainers required for this effort, there is also the need to replace TorchGeo's R-tree spatiotemporal backend. We present a literature review of several promising geospatial metadata indexing solutions and data cubes, including: 1. R-tree 2. GDAL GTI 3. STAC 4. OpenDataCube 5. Rasdaman 6. SciDB 7. Geopandas 8. Rioxarray 9. Geocube For each spatiotemporal backend, we compare the array, list, set, and database features available. We also compare operating system support and ease of installation for different solutions, as well as preliminary performance benchmarks on scaling experiments for common operations. TorchGeo requires support for geospatial and geotemporal indexing, slicing, and iteration. The library with the best spatiotemporal support will be chosen to replace R-tree in the coming TorchGeo 1.0 release, marking a large change in the TorchGeo API as well as a promise of future stability and backwards compatibility for one of the most popular geospatial machine learning libraries. 
TorchGeo development is led by the Technical University of Munich, with initial incubation by the AI for Good Research Lab at Microsoft and contributions from more than 75 contributors around the world. TorchGeo is also a member of the OSGeo Foundation, and is widely used throughout academia, industry, and government laboratories. Check out TorchGeo here: https://www.osgeo.org/projects/torchgeo/
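For readers new to TorchGeo, the sketch below shows the kind of spatiotemporally indexed data loading the library provides; the local directory is hypothetical and constructor argument names may differ slightly between TorchGeo releases.

    from torch.utils.data import DataLoader
    from torchgeo.datasets import Sentinel2, stack_samples
    from torchgeo.samplers import RandomGeoSampler

    # Index a local directory of Sentinel-2 scenes (hypothetical path); TorchGeo builds
    # a spatiotemporal index (currently R-tree based) over the files it finds.
    dataset = Sentinel2(paths="data/sentinel2")

    # Randomly sample 256x256 pixel patches from the indexed extent
    sampler = RandomGeoSampler(dataset, size=256, length=128)
    loader = DataLoader(dataset, sampler=sampler, collate_fn=stack_samples)

    for batch in loader:
        image = batch["image"]  # tensor with shape (batch, channels, height, width)
        break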
Add to Google Calendar

Monday 23 June 16:15 - 17:45 (Hall N1/N2)

Presentation: xcube geoDB: Bridging the Gap in Vector Data Management for Earth Observation

#stac

Authors: Thomas Storm, Alicja Balfanz, Gunnar Brandt, Norman Fomferra
Affiliations: Brockmann Consult
Although Earth Observation data are typically disseminated as gridded products, vector data with geographical context often play a crucial role in their analysis, processing, and dissemination. For instance, in-situ data used for instrument calibration and validation are usually provided as vector data and training and validation datasets for machine learning applications often come as feature data. Additionally, crop type reports for agricultural fields are stored in vector formats. Such data are commonly integrated with satellite imagery for further downstream analysis and processing. However, the diversity of formats, coordinate reference systems, and data models present significant challenges, requiring substantial expertise and effort from users to process vector data effectively. To address these challenges, we introduce here xcube geoDB, a geographical database solution designed to simplify common tasks associated with vector data. xcube geoDB offers an open-source, user-friendly API that facilitates the storage, reading, and updating of vector data. Users can store, manipulate, share, and delete their data through an easy-to-use Python interface, process it using the flexible openEO API, or access it via a dedicated STAC interface. Currently, xcube geoDB is in active use by individual researchers and several projects: • The Green Transition Information Factory Austria leverages xcube geoDB to efficiently store and retrieve air quality data, traffic statistics, and other variables. • ESA’s Earth Observation Training Data Lab uses xcube geoDB to manage metadata for large machine learning test datasets. • The DEFLOX project relies on xcube geoDB to maintain and expand a streamlined archive of in-situ reflectance data collected by globally distributed devices. Potential users can explore xcube geoDB through its Python client and pre-configured Jupyter notebooks are available for testing on platforms like EuroDataCube and DeepESDL. Thanks to its versatility, generic design, and robust user management features, xcube geoDB addresses a wide range of use cases involving geographical vector data. In this presentation, we will showcase several real-world applications that highlight its reliability and performance. xcube geoDB is under active development. Recent enhancements include the introduction of a STAC interface for streamlined and standardized vector data access, improved user management features allowing for shared namespaces and collaborative data collections, and support for the openEO API, which introduces the innovative vector data cube concept. Future developments on the roadmap include a dedicated web-based graphical user interface to further enhance usability and accessibility.
Add to Google Calendar

Monday 23 June 14:00 - 15:30 (Room 0.11/0.12)

Presentation: Operational monitoring of the water quality of French lakes and rivers from space

#zarr

Authors: Guillaume Morin, Anne-Sophie Dusart, Robin Buratti, Guillaume Fassot, Tristan Harmel, Flore Bray, Nathalie Reynaud, Thierry Tormos, Gilles Larnicol
Affiliations: Magellium, 1 rue Ariane, INRAE, Aix Marseille Univ, RECOVER, Team FRESHCO, Pôle ECLA, OFB, DRAS, Service ECOAQUA, Pôle ECLA
United Nations Sustainable Development Goal No. 6 set the goal to “ensure availability and sustainable management of water and sanitation for all” by 2020. Many national or international acts impose regular monitoring of water quality by the authorities: the US Clean Water Act & Safe Drinking Water Act, the European Water Framework Directive (WFD). However, a lack of resources makes such surveillance hard to maintain continuously and over the long term. Since the late 1970s, and especially over the past 20 years, remote sensing (RS) has made steady progress and is now considered a major and innovative asset for Earth Observation. It compensates in particular for the temporal and spatial sparsity of in situ sampling, although it relies on strict validation against long-term in situ data records. In 2024, the French Ministry for Ecological Transition and the French Space Agency funded a three-year national project in which water quality, along with water resources and irrigation indicators, must be delivered on a weekly basis, i.e. in near-real-time (NRT), to most of the public institutions in charge of water management. The aim of this project is to develop a perennial service in which water quality (WQ) products are delivered for all water bodies and rivers larger than 50 m, five to seven days after acquisition, to every public decision maker. Since June 2024, during the first year, our system has been operating, covering 99 tiles in mainland France and Corsica and delivering products for the 579 water bodies monitored under the WFD (DCE). To do so, we developed a fully operational processing chain, organised structured archives on the cloud, and created a visualisation web interface that is user-friendly enough to be easily used by all end-users. Concerning the processing of satellite images per se, our system is implemented on WEkEO’s cloud machines with immediate access to Copernicus satellite images, allowing it to operate as soon as images are available on WEkEO’s S3 buckets. For optically derived parameters, the processing routine first performs atmospheric correction (AC) and sunglint removal from Level-1C to Level-2A with GRS. Cloud and water masks are computed in parallel with Sen2cloudless and an Otsu filter, respectively, for optimal water pixel identification. Then Level-2A remote sensing reflectances are converted to transparency, chlorophyll-a concentration, turbidity and suspended particulate matter using widely adopted algorithms. On average, the computation takes around 2 h 31 min for 318 tiles each week. Skin surface temperatures of the water bodies are also derived from TIRS data with a split-window algorithm. Data is then stored in WEkEO’s bucket in the Zarr format, which is open, community-supported and cloud-compliant, as recommended by ESA. Water quality rasters can be visualized through a web interface with their spatial average and standard deviation. A further feature allows users to consult the time series of each parameter from 2017 to date and download them in a simple text format. Validation is performed through matchups between RS data and historical national databases (previously normalized) gathered to evaluate the accuracy and uncertainties of the products. It will also include the deployment of several autonomous stations that will provide high-frequency water surface temperatures, and hyperspectral data from the two Hypernet stations installed in France.
Our service is a co-designed solution developed with actors such as institutional public managers, authorities in charge of the environment, scientists, and stakeholders interested in evaluating the ecological status of inland water bodies. We consider this programme a lever for bringing RS data to all. As is widely known, not all quality elements required by the WFD can be provided by satellite observations, which will never replace in situ measurements and have to be seen as a complement. Unfortunately, the lack of technical expertise, of understanding of satellite-based Earth observation methods and of capacity to process the RS products renders the use of such data laborious for most users. It also requires decision makers to become familiar with, and build confidence in, these products, which still contain hidden biases and need to be validated with ground data, as mentioned earlier. In comparison, conventional assessment methods, which can traditionally be intercalibrated with round-robin laboratory exercises for example, still appear more reliable. Finally, WFD status assessment and classification systems vary between European Union member countries, so their conventions and databases are unlikely to be compatible between states. As a result, the opportunity we have to propose a nationwide service providing satellite-based data together with their validation against regulatory in situ monitoring, which is already well structured in France, is a unique chance to popularize the use of such data among a wide range of decision makers. We expect this opportunity to overcome the obstacles and inertia cited above, which prevent the use of RS data by most. It will also be the first extensive production exercise over French territories and will undoubtedly provide useful data for aquatic environmental studies and related fields.
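Since the products are stored as Zarr objects on an S3 bucket, downstream users would typically open them lazily with xarray; the bucket path and variable names in this sketch are invented for illustration and do not correspond to the project's actual layout.

    import xarray as xr

    # Open a (hypothetical) water-quality Zarr store lazily; only the chunks that are
    # actually accessed get transferred from the object store.
    ds = xr.open_zarr(
        "s3://wq-products/lakes_france.zarr",
        storage_options={"anon": False},  # credentials resolved by fsspec/s3fs
    )

    # Example queries: a chlorophyll-a time series at one location and a weekly turbidity mean
    chla_series = ds["chl_a"].sel(x=650_000, y=6_860_000, method="nearest")
    weekly_turbidity = ds["turbidity"].resample(time="1W").mean()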
Add to Google Calendar

Monday 23 June 09:00 - 10:20 (Room 0.96/0.97)

Hands-On: D.03.10 HANDS-ON TRAINING - EarthCODE 101 Hands-On Workshop

#pangeo

This hands-on workshop is designed to introduce participants to EarthCODE's capabilities, guiding them from searching, finding, and accessing EO datasets and workflows to publishing reproducible experiments that can be shared with the wider scientific community. This workshop will equip you with the tools and knowledge to leverage EarthCODE for your own projects and contribute to the future of open science. During this 90-minute workshop, participants will, in a hands-on fashion, learn about:
- Introduction to EarthCODE and the future of FAIR and Open Science in Earth Observation
- Understanding the Findability, Accessibility, Interoperability, and Reusability of data and workflows on EarthCODE
- Creating reproducible experiments using EarthCODE’s platforms, with a hands-on example using Euro Data Cube and Pangeo
- Publishing data and experiments to EarthCODE
At the end of the workshop, we will take time for discussion and feedback on how to make EarthCODE better for the community. Pre-requirements for attendees: participants need to bring their laptop and have an active GitHub account, but do not need to install anything, as the resources will be accessed online using Pangeo notebooks provided by EarthCODE and EDC. Please register your interest by filling in this form before the session: https://forms.office.com/e/jAB9YLjgY0

Speakers:


  • Samardzhiev Deyan - Lampata
  • Anne Fouilloux - Simula Labs
  • Dobrowolska Ewelina Agnieszka - Serco
  • Stephan Meissl - EOX IT Services GmbH
Add to Google Calendar

Monday 23 June 09:00 - 10:20 (Room 1.34)

Hands-On: D.04.08 HANDS-ON TRAINING - EO Data Processing with openEO: transitioning from local to cloud

#cloud-native #stac

This hands-on training aims to provide participants with practical experience in processing Earth Observation (EO) data using openEO. By the end of the session, participants will be able to:
- Understand the core concepts of EO data cubes and cloud-native processing
- Transition from local data processing to cloud-based environments efficiently, always using the openEO API
- Use openEO Platform (openeo.cloud) to process EO data via multiple cloud providers
- Gain familiarity with Python data access and processing using the openEO API

Training Content & Agenda

Introduction & Overview
- Introduction to the openEO API: functionalities and benefits
- Data cubes concepts and documentation review
- Overview of the "Cubes & Clouds" online course by Eurac Research

Transitioning to Cloud Processing
- Challenges and advantages of moving from local processing to cloud environments
- Overview of cloud providers (VITO Terrascope, EODC, SentinelHub) and their integration with openEO Platform
- Key concepts of FAIR (Findable, Accessible, Interoperable, Reusable) principles implemented by openEO
- STAC: how the SpatioTemporal Asset Catalog allows interoperability

Hands-On Training with openEO
- Setting Up the Environment
-- Accessing openEO Platform JupyterLab instance
-- Cloning GitHub repositories for training materials
- Basic openEO Workflow
-- Discovering and accessing EO datasets
-- Executing simple queries using the openEO Python Client (see the sketch after this agenda)
-- Processing workflows using local and cloud-based computation
- Multi-Cloud Processing
-- Sample workflow using multiple cloud providers
- Executing an End-to-End EO Workflow
-- Data discovery and preprocessing
-- Applying processing functions (e.g., time-series analysis, indices computation)
-- Exporting and sharing results according to open science principles

Q&A and Wrap-Up
- Discussion on best practices and troubleshooting common issues
- Resources for further learning (EO College, openEO documentation)
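To make the basic workflow concrete, here is a minimal sketch using the openEO Python client against the openEO Platform aggregator; the collection ID, extent and band names are illustrative choices, not prescribed by the training.

    import openeo

    # Connect to the openEO Platform aggregator and authenticate via OIDC
    connection = openeo.connect("openeo.cloud").authenticate_oidc()

    # Load a small Sentinel-2 cube (illustrative extent, time range and bands)
    cube = connection.load_collection(
        "SENTINEL2_L2A",
        spatial_extent={"west": 11.3, "south": 46.4, "east": 11.5, "north": 46.6},
        temporal_extent=["2024-06-01", "2024-08-31"],
        bands=["B04", "B08"],
    )

    # Compute NDVI, reduce over time and download the result as GeoTIFF
    ndvi = cube.ndvi(nir="B08", red="B04")
    ndvi.reduce_dimension(dimension="t", reducer="mean").download("ndvi_mean.tif")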

Speakers:


  • Claus Michele - Eurac Research, Bolzano, Italy
  • Zvolenský Juraj - Eurac Research, Bolzano, Italy
  • Jacob Alexander - Eurac Research, Bolzano, Italy
  • Pratichhya Sharma - VITO, Mol, Belgium
Add to Google Calendar

Tuesday 24 June

14 events

Tuesday 24 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Upscaling the water use efficiency analyses - GDA Agriculture pilot case Indonesia

#cloud-native

Authors: Alen Berta, Viktor Porvaznik, Juan Suarez Beltran, Stefano Marra, Alessandro Marin
Affiliations: CGI Deutschland, CGI Italy, GMV
Agriculture, as the largest consumer of water worldwide, faces a critical challenge in improving irrigation efficiency to ensure food security and sustainable farming practices. Currently, more than 50% of ground and potable water is wasted due to inefficient irrigation systems, an issue exacerbated by the growing impacts of climate change, including more frequent and severe droughts. This inefficiency threatens food production and livelihoods for millions of people, necessitating robust solutions to optimize water usage and enhance irrigation management. The GDA Agriculture project aims to tackle this issue by deploying the ESA Sen-ET algorithm, enriched with global EO-based biomass products, and fully automated and integrated into the CGI Insula platform. This cloud-native platform integrates EO data, Geographic Information Systems (GIS), and advanced analytics to provide a cutting-edge solution for analyzing water use efficiency and daily evapotranspiration. Leveraging Sentinel-2 and Sentinel-3 data, along with other EO datasets, the project identifies problematic areas, evaluates irrigation system performance, and provides actionable insights to optimize water use. As such, this project supports the Asian Development Bank in a related project on enhancing dryland farming systems in Indonesia, but it can be used globally as it does not rely on local data. Local data (crop areas/crop types) can be uploaded into the Insula platform for post-processing depending on the user's need for granularity. The CGI Insula platform delivers significant benefits to end-users, including farmers, policymakers, and funding organizations. Firstly, it provides near-real-time monitoring and analysis of water usage efficiency, enabling farmers to make timely adjustments to their irrigation practices and mitigate the risk of water scarcity. The platform also supports the identification of areas that require additional irrigation or where existing systems are underperforming, allowing for targeted interventions and resource allocation. This targeted approach maximizes the effective use of water resources, improving agricultural productivity and fostering sustainability. By integrating EO data, GIS, and advanced analytics, the project provides a robust solution for optimizing water usage and improving agricultural productivity. The benefits for end-users are manifold, including near-real-time monitoring and targeted interventions. This operational implementation not only enhances food security and water sustainability but also supports the overall resilience and prosperity of agricultural communities.
Add to Google Calendar

Tuesday 24 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Austrian ground motion service - just a copy of EGMS?

#stac

Authors: Karlheinz Gutjahr
Affiliations: Joanneum Research
Since 2014, the European Copernicus programme has launched a wide range of Earth Observation (EO) satellites, named Sentinels, designed to monitor and forecast the state of the environment on land, sea and in the atmosphere. The ever-increasing amount of acquired data makes Copernicus the largest EO data provider and the third biggest data provider in the world. Experts have already shown the data’s potential in several new or improved applications and products. Still, challenges exist to reach end users with these applications/products, i.e. challenges associated with distributing, managing, and using them in users’ respective operational contexts. In order to mainstream the use of Copernicus data and information services for public administration, the nationally funded project INTERFACE was set up in autumn 2022. Since then, the project consortium has been focussing on user-centric interfaces and data standards with special attention to integrating different data sets and setting up a prototype system that allows the systematic generation of higher-level information products. One information layer within INTERFACE is the so-called Austrian ground motion service, which is an interface to the SuLaMoSA prototype workflow and to the data provided by the European Ground Motion Service (EGMS). In this paper I will focus on the second aspect and explain the enhancements with respect to a pure copy of the EGMS data, discuss some findings for Austria and give some recommendations to further improve the usability of the EGMS data. The process of enhancing the EGMS data for inclusion in the INTERFACE STAC catalogue involves both spatial and temporal preprocessing. This includes the merging of the EGMS tiles and spatial slicing of the data to Austria, the temporal alignment and refinement of the EGMS updates to a continuous time series with additional attributes per temporal overlap, as well as the computation of supplementary statistical parameters to enrich the time series dataset. As of October 29, 2024, three updated versions of the EGMS products are available. Version 1 (v1) covers the period from February 2016 to December 2021, version 2 (v2) spans from January 2018 to December 2022, and version 3 (v3) includes the period from January 2019 to December 2023. The EGMS update strategy employs a five-year moving window approach to maximize point density. Analysis of the EGMS ortho product demonstrates that the number of valid points increases from 1.099 million in version 1 (v1) to 1.266 million in version 2 (v2) and 1.230 million in version 3 (v3). This indicates that reducing the observation period from six to five years increases point density to approximately 115% and 112% of the v1 value, respectively. Conversely, the temporal combination of versions 1 and 2 reduces the number of valid points to 1.036 million, while the combination of all three versions decreases the point count further to 0.998 million. This highlights a reduction of 6% and 9%, respectively, compared to v1, due to the loss of coherent scattering over time. However, this behaviour is not the same for all 18 tiles, which were used to cover the national territory of Austria. There is a clear trend in the west-east direction. The maximum decrease in point density is found in tile L3_E44N26, covering the area of Tirol. The minimum decrease in point density is found in tile L3_E47N28, roughly covering the area south-west of Vienna.
This effect might be explained by the topography and land cover, which change from high alpine and sparsely populated terrain to moderate rolling topography with a highly urbanised environment. The extended temporal overlap of four years facilitates a robust merging of the time series under the valid assumption that the mapped points predominantly exhibit the same deformation regime across all time series. Consequently, only a relative shift of the subsequent time series with respect to the preceding one needs to be determined, resulting in a high degree of redundancy. The standard deviation of the residuals between the shifted time series i+1 and time series i was 1.6 mm ± 1.87 mm for the merge of version 1 and version 2, and 1.3 mm ± 1.63 mm for the merge with version 3. Furthermore, the number of outliers per overlap amounted to 8.5 ± 7.1 for the merge of version 1 and version 2, and 10.0 ± 7.5 for the merge with version 3. Finally, to distinguish the predominant deformation regime—seasonal, accelerating, linear, or none—I propose calculating the root mean square error (RMSE) for each of these deformation models. The deformation regime with the minimum RMSE can be identified as the best fit. Subsequently, the reliability of this selection can be assessed based on the significance level of the model parameters. This straightforward decision tree would enable potential users to focus on the deformation pattern of interest and exclude the majority of points that do not conform to this pattern. In summary, geographic trends reveal varying point density reductions, influenced by terrain and land cover. A four-year temporal overlap allowed robust time series merging with low residuals and outlier counts. To identify deformation regimes, calculating the RMSE for seasonal, accelerating, linear, or no deformation models is proposed, enabling user-focused selection of relevant patterns.
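The proposed model-selection step can be sketched numerically: fit a constant, linear, accelerating (quadratic) and seasonal model to a displacement time series by least squares and keep the one with the lowest RMSE. The time series below is synthetic and the exact model parametrisations are an assumption of what the abstract describes.

    import numpy as np

    def rmse(y, y_hat):
        return np.sqrt(np.mean((y - y_hat) ** 2))

    # Synthetic displacement series (mm), one sample every 6 days over five years
    t = np.arange(0, 5 * 365, 6) / 365.25
    rng = np.random.default_rng(42)
    y = -2.0 * t + 3.0 * np.sin(2 * np.pi * t) + rng.normal(0.0, 1.0, t.size)

    # Candidate deformation models expressed as least-squares design matrices
    designs = {
        "none": np.column_stack([np.ones_like(t)]),
        "linear": np.column_stack([np.ones_like(t), t]),
        "accelerating": np.column_stack([np.ones_like(t), t, t ** 2]),
        "seasonal": np.column_stack(
            [np.ones_like(t), t, np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)]
        ),
    }

    scores = {}
    for name, X in designs.items():
        coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
        scores[name] = rmse(y, X @ coeffs)

    best = min(scores, key=scores.get)
    print("best-fitting regime:", best, {k: round(v, 2) for k, v in scores.items()})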
Add to Google Calendar

Tuesday 24 June 17:45 - 19:00 (X5 - Poster Area)

Poster: The IRIDE Cyber Italy project: an enabling PaaS for Digital Twin Applications

#cloud-native

Authors: Stefano Scancella, Fabio Lo Zito, Stefano Marra, Davide Foschi, Fabio Govoni, Simone Mantovani
Affiliations: Serco Italia S.p.A., CGI Italia S.r.l., MEEO S.r.l.
The IRIDE Cyber Italy project represents a significant national step forward in developing and implementing Digital Twins (DT) of the Earth, by leveraging Earth Observation data and cloud technologies and services to build a scalable and interoperable reference Framework enabling the use of DTs in diverse thematic domains. As part of the Italian EO space program funded by the European Union's National Recovery and Resilience Plan (PNRR) and managed by the European Space Agency (ESA) in collaboration with the Italian Space Agency (ASI), the project demonstrates Italy’s commitment to advancing EO applications and fostering digital innovation. The Cyber Italy Framework aims to provide an enabling Platform as a Service (PaaS) solution to exploit the Digital Twins capabilities with practical applications in fields such as risk management, environmental monitoring, and urban planning. A Digital Twin, as a digital replica of Earth, integrates data-driven models to simulate natural and human processes, thereby allowing advanced analyses, predictive capabilities, and insights into the interactions between Earth's systems and human activities. SERCO leads the consortium, which is composed of e-GEOS, CGI, and MEEO. Phase 1 of the project, completed in 2024 after 12 months, focused on prototyping a hydro-meteorological Digital Twin, showcasing the power of a DT framework and its application to flood simulation and management. Phase 2, ongoing and lasting an additional 12 months, evolves the framework prototype into a pre-operational system, by: • enhancing the Framework’s scalability, elasticity and interoperability; • setting up a DevOps environment over a cloud-based infrastructure; • demonstrating the usability of the Framework by integrating an additional DT (Air quality Digital Twin), developed by a third party. The final Phase 3, lasting 10 months and ending in 2026, will focus on the full operationalization of the Framework as a platform for the integration of any additional DTs, to expand thematic coverage. The project adopts a cloud-native, container-based architecture, leveraging the continuous integration, delivery and deployment (CI/CD) approach to ensure efficient updates and system adaptability. The infrastructure, based on OVHcloud technologies, is designed to support both horizontal and vertical scalability and elasticity, allowing it to handle increasing data volumes and concurrent user sessions seamlessly through Kubernetes-based orchestration. The Digital Twin framework is powered by Insula, the CGI Earth Observation (EO) Platform-as-a-Service, which has been successfully implemented in various ESA projects, including DestinE DESP. Insula provides a comprehensive suite of APIs designed to support hosted Digital Twins (DTs) with functionalities such as data discovery, data access, processing orchestration, and data publishing. Beyond these foundational capabilities, Insula also enables the seamless integration of custom processors, allowing users to extend the platform's analytical capabilities to meet specific project requirements. Complementing its robust APIs, Insula offers an advanced user interface tailored for complex big data analytics. This UI leverages a scalable and cloud-native backend, empowering users to perform intricate analyses efficiently and at scale, thus making Insula a key technology for operationalizing Digital Twin frameworks.
Interoperability is a key concept of the Cyber Italy Framework, facilitated by the integration in the Framework of the ADAM platform developed by MEEO, which adopts both Harmonised Data Access (HDA) and Virtual Data Cube (VDC) approaches, ensuring consistent and fully customizable handling of input data, supporting the integration of distributed data sources and diverse DTs while enhancing long-term flexibility. ADAM is widely adopted as a key technology within relevant European Commission initiatives (WEkEO, DestinE Service Core Platform, …) and ESA projects (ASCEND, HIGHWAY, GDA APP, …) to generate and deliver Analysis Ready Cloud Optimised (ARCO) products to support multi-domain and temporal analyses. One of the key features of the CyberItaly Framework is the ability to define and implement "what-if" scenarios, which provide stakeholders with critical tools to simulate conditions, predict outcomes, and make data-driven decisions. These scenarios are instrumental in addressing challenges like hydro-meteorological events, offering precise predictions for flood risks or air quality forecasts, such as emissions or traffic pollution estimation, enabling more effective planning and response strategies. The IRIDE Cyber Italy project aims to create a robust and versatile digital ecosystem that integrates cutting-edge EO technologies and seeks to demonstrate the potential of Digital Twins in supporting a sustainable Earth system and environmental management. By leveraging cloud-native architectures, and emphasizing standardization and scalability, the IRIDE Cyber Italy project is creating a versatile platform for DTs. This project represents a crucial step forward by creating a comprehensive framework capable of supporting a wide range of Digital Twins. Future applications could extend the use of Digital Twins to a wide range of sectors, such as urban planning, agriculture, and natural resource management, contributing to the global vision of using EO technologies to advance Earth system understanding and management.
Add to Google Calendar

Tuesday 24 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Fields of The World and fiboa: Towards interoperable worldwide agricultural field boundaries through standardization and machine-learning

#fiboa #parquet

Authors: Matthias Mohr, Michelle Roby, Ivor Bosloper, Hannah Kerner, Prof. Dr. Nathan Jacobs, Caleb Robinson
Affiliations: Taylor Geospatial Engine, Arizona State University, Washington University in St. Louis, Microsoft
In this talk, we present two closely related initiatives that aim to facilitate datasets for worldwide standardized agricultural field boundaries: the fiboa data specification and the Fields of The World (FTW) benchmark dataset and models. Both initiatives work in the open and all data and tools are released under open licenses. Fiboa and FTW emerged from the Taylor Geospatial Engine’s Innovation Bridge Program’s Field Boundary Initiative [1]. This initiative seeks to enable practical applications of artificial intelligence and computer vision for Earth observation imagery, aiming to improve our understanding of global food security. By fostering collaboration among academia, industry, NGOs, and governmental organizations, fiboa and FTW strive to create shared global field boundary datasets that contribute to a more sustainable and equitable agricultural sector. Field Boundaries for Agriculture (fiboa) [2] is an initiative aimed at standardizing and enhancing the interoperability of agricultural field boundary data on a global scale. By providing a unified data schema, fiboa facilitates the seamless exchange and integration of field boundary information across various platforms and stakeholders. At its core, fiboa offers an openly developed specification for representing field boundary data using GeoJSON and GeoParquet formats. This specification has the flexibility to incorporate optional 'extensions' that specify additional attributes. This design allows for the inclusion of diverse and detailed information pertinent to specific use cases. In addition, fiboa encompasses a comprehensive ecosystem that includes tools for data conversion and validation, tutorials, and a community-driven approach to developing extensions. This allows a community around a specific subject to standardize datasets. By using datasets with the same extensions, the tools can validate attribute names, coding lists, and other conventions. The fiboa initiative goes beyond providing specifications and tooling by developing over 40 converters for both open and commercial datasets [3]. These converters enable interoperability between diverse data sources by transforming them into the fiboa format. This significant effort ensures that users can integrate and utilize data more efficiently across different systems and platforms. All open datasets processed through this initiative are made freely accessible via Source Cooperative [4], an open data distribution platform. Fields of The World (FTW) [5] is a comprehensive benchmark dataset designed to advance machine learning models for segmenting agricultural field boundaries. Spanning 24 countries across Europe, Africa, Asia, and South America, FTW offers 70,462 samples, each comprising instance and semantic segmentation masks paired with multi-date, multi-spectral Sentinel-2 satellite images. Its extensive coverage and diversity make it a valuable resource for developing and evaluating machine learning algorithms in agricultural monitoring and assessment. FTW also provides a pretrained machine learning model for performing field boundary segmentation. This model is trained on the diverse FTW dataset, enabling it to generalize effectively across different geographic regions, crop types, and environmental conditions. Additionally, ftw-tools - a set of open-source tools accompanying the benchmark - simplifies working with the FTW dataset by providing functions for download, model training, inference, and other experimental or explorative tasks. 
Fiboa (Field Boundaries for Agriculture) and Fields of The World (FTW) complement each other in advancing agricultural technology. fiboa provides a standardized schema for field boundary data. FTW, with its benchmark dataset and pretrained machine learning model, generates field boundary data from satellite imagery to fill global data gaps. FTW’s source polygons used to create the benchmark dataset and output ML-generated field boundaries are fiboa-compliant. Together, both projects form a powerful ecosystem: fiboa ensures data consistency and usability, while FTW supplies the tools and insights to produce and refine this data. This synergy supports precision farming, land use analysis, land management, and food security efforts, driving innovation and sustainability in agriculture worldwide. The vision is to develop a continuously evolving global field boundary dataset by combining the open field boundaries converted into the fiboa format with the output datasets generated by FTW. References: [1] https://tgengine.org [2] https://fiboa.org [3] https://fiboa.org/map [4] https://source.coop [5] https://fieldsofthe.world
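Because fiboa datasets are regular GeoParquet files with standardized column names, they can be inspected with ordinary geospatial tooling; the file name and the attribute used below are illustrative of the fiboa core schema rather than guaranteed to be present in every converted dataset.

    import geopandas as gpd

    # Read a fiboa-style GeoParquet file of field boundaries (hypothetical file name)
    fields = gpd.read_parquet("field_boundaries_fiboa.parquet")

    # Typical operations: filter by field size and report the overall spatial extent
    large = fields[fields["area"] > 10]  # 'area' assumed to be in hectares here
    print(len(large), "fields larger than 10 ha out of", len(fields))
    print("extent (minx, miny, maxx, maxy):", fields.total_bounds)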
Add to Google Calendar

Tuesday 24 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Water Health Indicator System (WHIS): A Global Water Quality Monitoring Web App through Advanced Earth Observation Technologies

#stac #cog

Authors: Daniel Wiesmann, Jonas Sølvsteen, Olaf Veerman, Emmanuel Mathot, Daniel Da Silva, Ricardo Mestre, Pr Vanda Brotas, PhD Ana Brito, Giulia Sent, João Pádua, Gabriel Silva
Affiliations: Development Seed, MARE Centre, Labelec
The Water Health Indicator System (WHIS) serves as a robust platform for monitoring water quality, showcasing the capabilities of earth observation technologies and environmental data analysis that are accessible to everyone. Developed through a collaboration between Development Seed, MARE (Marine and Environmental Sciences Centre), and LABELEC, WHIS addresses common challenges in existing water monitoring by offering a scalable solution designed for assessing aquatic ecosystem health. At the heart of WHIS is a powerful integration of geospatial cloud technologies, built on the eoAPI (Earth Observation API). This allows users to leverage tools such as the Spatio-Temporal Asset Catalog (STAC) and Cloud-Optimized GeoTIFF (COG) for dynamic data access. A platform like this enables seamless integration of remote sensing datasets, particularly from the Sentinel-2 mission, ensuring precision and adaptability in water quality assessment. The application utilizes specialized atmospheric processing algorithms, such as Acolite, to analyze water quality, tackling issues related to atmospheric interference and spectral interpretation. By focusing on key indicators like chlorophyll content and turbidity, WHIS allows for localized calibration and insights into ecosystem health, demonstrating that these advancements in monitoring are achievable with the right tools. WHIS is tailored for inland and coastal water bodies. Its cloud-optimized infrastructure provides an interactive interface where users can select specific water bodies, explore geographical data, conduct statistical analyses, and inspect pixel-level information—all of which can be replicated by other users with eoAPI. Furthermore, the innovative product-services business model links technological capabilities with environmental monitoring needs, showing how any organization can leverage these advancements. As global challenges related to water availability and quality persist, the Water Health Indicator System stands as a testament to what can be achieved with eoAPI technology. If we can harness its potential, so can you, making it an essential tool for environmental monitoring and ecosystem management.
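To give a flavour of the COG-based access pattern that eoAPI builds on, the snippet below reads a small spatial window from a published Cloud-Optimized GeoTIFF; the URL, band and coordinates are hypothetical, and a real WHIS deployment would expose such assets through its STAC catalogue and tiling services.

    import rasterio
    from rasterio.windows import from_bounds

    # Hypothetical URL of a published chlorophyll COG (assumed to be in geographic coordinates)
    url = "https://example.org/whis/cogs/chl_2024-07-15.tif"

    with rasterio.open(url) as src:
        # Read only a small bounding box; thanks to the COG layout, only the internal
        # tiles overlapping the window are fetched via HTTP range requests.
        window = from_bounds(-9.30, 38.65, -9.10, 38.75, transform=src.transform)
        chl = src.read(1, window=window, masked=True)
        print(chl.shape, float(chl.mean()))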
Add to Google Calendar

Tuesday 24 June 17:45 - 19:00 (X5 - Poster Area)

Poster: StacLine: new QGIS Plugin for diving into STAC Catalogs

#stac

Authors: Fanny Vignolles, Florian Gaychet, Vincent Gaudissart, Mélanie Prugniaux
Affiliations: CS Group
Geographic Information Systems (GIS) have become fundamental for analyzing and visualizing geospatial and temporal data across diverse domains, including environmental monitoring, disaster response, hydrology, urban planning, and agriculture. The availability of Earth Observation (EO) data has significantly increased in recent years, thanks to open-access data initiatives and advancements in satellite missions such as Sentinel, Landsat, and SWOT. However, while the datasets have become more accessible, the tools required to process and integrate them efficiently remain a challenge. The introduction of SpatioTemporal Asset Catalogs (STAC) as a data standard has revolutionized how datasets are organized and distributed. STAC provides a unified framework for describing, managing, and sharing spatiotemporal data through catalogs linked to geospatial servers. When combined with Open Geospatial Consortium (OGC) standards like Web Map Service (WMS), STAC enables seamless geospatial data management and interoperability. This project focuses on bridging the gap between STAC-based data catalogs and GIS workflows by developing a QGIS plugin that integrates STAC with the open-source GIS environment. The plugin simplifies data search, filtering, and visualization while adhering to both STAC and OGC standards, providing professionals and researchers with an efficient tool for managing EO data. Despite the increasing adoption of STAC-based data catalogs, their integration with GIS platforms remains a significant challenge. Existing plugins in QGIS for handling STAC data are limited, offering only basic functionalities and lacking the advanced capabilities required for sophisticated workflows. Current solutions often restrict users to viewing dataset footprints without allowing interactive visualization or the ability to style data layers dynamically. Additionally, these tools frequently require manual downloads and subsequent imports into QGIS, making the process inefficient and prone to user errors. Beyond these technical limitations, ensuring that a tool remains accessible and intuitive for a diverse audience is equally critical. Furthermore, achieving seamless interoperability between STAC and OGC protocols, particularly in the context of integrating WMS for real-time visualization, adds another layer of complexity. To address these challenges, we have developed a QGIS plugin that brings significant innovations to enhance filtering capabilities, simplify data import, and ensure interoperability. Designed with an intuitive interface, it strikes a careful balance between user-friendly simplicity for non-experts and the advanced functionality required by researchers and field practitioners. By incorporating ontological approaches, the plugin enables more precise and efficient dataset discovery. The integration of WMS protocols facilitates automatic data import, allowing users to preview datasets and dynamically apply visualization styles directly within QGIS. These styles, derived from metadata and cartographic servers adhering to OGC standards, provide tailored renderings suited to specific analytical needs. The plugin’s strict adherence to STAC standards aims to promote a compatibility with any STAC-compliant catalogue, enhancing its ability to integrate seamlessly into diverse geospatial platforms and workflows. The user interface has been designed to accommodate both novice and expert users, offering advanced configuration options for customized workflows without sacrificing simplicity. 
This combination of advanced functionality and ease of use positions the plugin as an essential tool for professionals relying on Earth Observation data, reducing the barriers to integrating STAC data into GIS projects. The current version has been implemented for the HYSOPE II project (CNES), the dissemination platform dedicated to SWOT products and, more generally, to all kinds of hydrological datasets, and is intended to be extended to other initiatives. As the STAC ecosystem evolves, the plugin is designed to adapt and grow, incorporating new features and responding to user needs. One planned enhancement is the addition of a dynamic timeline feature, allowing users to explore temporal patterns in datasets interactively. This timeline will enable quick identification of dense data availability periods and improve usability for time-series analysis by rendering layers adaptively based on the selected temporal range. We also envision the development of an adaptive form system that dynamically configures itself based on search parameters, which may be specific to each dataset. This automatic configuration will leverage the filtering extension and the queryables of the STAC API. This plugin, named QGIS StacLine, represents a significant advancement in democratizing access to STAC-based geospatial data. By addressing the limitations of existing tools and focusing on usability, interoperability, and scalability, it bridges the gap between complex EO data catalogs and practical GIS applications. Looking ahead, the development of the plugin involves a key decision: whether to focus on niche, closed-use cases for tailored solutions or to expand its scope for broader application across diverse projects. While an open approach offers versatility, it risks diluting the specificity and focus of the tool. Regardless of its future direction, the plugin stands as a vital resource for the geospatial community, enabling seamless access and utilization of the growing wealth of spatiotemporal data.
Add to Google Calendar

Tuesday 24 June 17:45 - 19:00 (X5 - Poster Area)

Poster: A Federated Learning Environment for Earth Observation Students: A Success Story from Austria

#stac #pangeo

Authors: Martin Schobben, Luka Jovic, Nikolas Pikall, Joseph Wagner, Clay Taylor Harrison, Davide Festa, Felix David Reuß, Sebastian Hahn, Gottfried Mandlburger, Christoph Reimer, Christian Briese, Matthias Schramm, Wolfgang Wagner
Affiliations: Department of Geodesy and Geoinformation, Technische Universität Wien, Earth Observation Data Centre for Water Resources Monitoring GmbH
Establishing an effective learning environment for Earth Observation (EO) students is a challenging task due to the rapidly growing volume of remotely sensed, climate, and other Earth observation data, along with the evolving demands from the tech industry. Today’s EO students are increasingly becoming a blend of traditional Earth system scientists and "big data scientists", with expertise spanning computer architectures, programming paradigms, statistics, and machine learning for predictive modeling. As a result, it is essential to equip educators with the proper tools for instruction, including training materials, access to data, and the necessary computing infrastructure to support scalable and reproducible research. In Austria, research and teaching institutes have recently started collaborating to integrate their data, computing resources, and domain-specific expertise into a federated system and service through the Cloud4Geo project, which is funded by the Austrian Federal Ministry of Education, Science, and Research. In this presentation, we will share our journey towards establishing a federated learning environment and the insights gained in creating teaching materials that demonstrate how to leverage its capabilities. A key aspect of this learning environment is the use of intuitive and scalable software that strikes a balance between meeting current requirements and maintaining long-term stability, ensuring reproducibility. To achieve this, we follow the Python programming philosophy as outlined by the Pangeo community. In addition, we need to ensure that the environment is accessible and inclusive for all students, and can meet the demand of an introductory BSc level course on Python programming as well as an MSc research project focused on machine learning with high-resolution SAR data. We accomplished this by combining the TU Wien JupyterHub with a Dask cluster at the Earth Observation Data Centre for Water Resources Monitoring (EODC), deployed close to the data. A shared metadata schema, based on the SpatioTemporal Asset Catalog (STAC) specifications, enables easy discovery of all federated datasets, creating a single entry point for data spread across the consortium members. This virtually “unlimited” access to data is crucial for dynamic and up-to-date teaching materials, as it helps spark the curiosity of students by opening-up a world full of data. Furthermore, the teaching materials we develop showcase the capabilities of the federated system, drawing on the combined resources of the consortium. These materials feature domain-relevant examples, such as the recent floods in central Europe, and incorporate scalable programming techniques that are important for modern EO students. These tutorials are compiled into a Jupyter Book, the “EO Datascience Cookbook”, published by the Project Pythia Foundation, which allows students to execute notebooks in our federated learning environment with a single click. Beyond serving as teaching material, the Jupyter Book also acts as a promotional tool to increase interest in EO datasets and their applications. We are already seeing the benefits of our federated learning environment: 1) it enhances engagement through seamless, data-driven storytelling, 2) it removes barriers related to computing resources, 3) it boosts performance by breaking complex tasks into manageable units, and 4) it fosters the development of an analytical mindset, preparing students for their future careers. 
We hope that this roadmap can serve as a model for other universities, helping to preserve academic sovereignty and reduce reliance on tech giants, such as Google Earth Engine. Federated learning environments are essential in training the next generation of data-driven explorers of the Earth system.
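A typical pattern in such a JupyterHub-plus-Dask setup is to discover data through the shared STAC catalogue and load it lazily into a chunked cube; the scheduler address, STAC endpoint, collection name and threshold below are placeholders, and odc-stac is only one of several loaders (stackstac is another) compatible with the Pangeo stack described here.

    import odc.stac
    from dask.distributed import Client
    from pystac_client import Client as StacClient

    # Connect to the Dask cluster provided by the federation (placeholder address)
    dask_client = Client("tcp://dask-scheduler.example.org:8786")

    # Discover Sentinel-1 items through the shared STAC catalogue (placeholder endpoint)
    catalog = StacClient.open("https://stac.example.org/api/v1")
    items = catalog.search(
        collections=["sentinel-1-grd"],
        bbox=[14.0, 47.9, 16.5, 48.8],
        datetime="2024-09-10/2024-09-20",
    ).item_collection()

    # Lazily assemble a chunked data cube; computation is distributed over the cluster
    cube = odc.stac.load(items, bands=["vv"], chunks={"x": 2048, "y": 2048})
    water_fraction = (cube.vv < 0.02).mean("time").compute()  # crude flood/water proxy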
Add to Google Calendar

Tuesday 24 June 09:45 - 10:05 (EO Arena)

Demo: C.01.25 DEMO - DGGS: Scalable Geospatial Data Processing for Earth Observation

#zarr

Objective:
This demonstration will introduce the DGGS (Discrete Global Grid System) framework, highlighting its ability to process and analyze large Earth Observation (EO) datasets efficiently. The demo will focus on DGGS’ scalability, data accessibility, and potential to improve EO workflows by leveraging hierarchical grid structures and efficient data formats like Zarr.

Demonstration Overview:
• Introduction to DGGS: Brief overview of the DGGS framework and its hierarchical grid system designed to handle large-scale geospatial data efficiently.
• Application to Earth Observation Data: Demonstrating DGGS' ability to transform and process EO datasets, with an emphasis on its potential for improved data storage and access.
• Visualization and Analytics: Showcasing basic visualization and analytic capabilities within the DGGS framework, demonstrating its ease of use for EO data exploration.
• Future Potential: Explaining and discussing how DGGS could enhance future EO workflows, particularly for climate monitoring and large-scale environmental data analysis.
Format:
The presenter will guide the audience through the demonstration, highlighting DGGS' features and potential for real-world applications.
A short Q&A session will allow for audience interaction.
Duration:
20-minute slot.
This demonstration will showcase DGGS as a promising tool for scalable and efficient Earth Observation data processing, offering a glimpse into its potential applications and future benefits.
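The demo abstract does not prescribe an API; purely as an illustration of working with a hierarchical-grid dataset stored as Zarr, a sketch might look like the following, where the store URL, variable name and "cell" coordinate are hypothetical.

```python
# Illustration only: reading a (hypothetical) DGGS-indexed dataset stored as Zarr with
# xarray and aggregating per grid cell. Store path, variable and coordinate names are
# assumptions, not part of the actual DGGS demo.
import xarray as xr

ds = xr.open_zarr("https://example.org/dggs/lst.zarr", consolidated=True)  # placeholder store

# Each sample carries the id of the DGGS cell it falls into; aggregate per cell.
per_cell_mean = ds["lst"].groupby("cell").mean()

# Coarser analyses can reuse the grid hierarchy by mapping cell ids to a parent level;
# the exact mechanics depend on the DGGS implementation being demonstrated.
print(per_cell_mean)
```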
Add to Google Calendar

Tuesday 24 June 14:15 - 14:35 (EO Arena)

Demo: D.03.32 DEMO - NASA-ESA-JAXA EO Dashboard

#stac

This demonstration will showcase the features of the NASA-ESA-JAXA EO Dashboard. It will cover the following elements:
- Dashboard exploration - discovering datasets, using the data exploration tools
- Browsing interactive stories and discovering scientific insights
- Discovering Notebooks in the stories and how to execute them
- Creating new stories using the story-editor tool
- Browsing the EO Dashboard STAC catalogue
- Exploring the documentation


The demo will be performed by the joint ESA, NASA and JAXA development team.
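For readers who want to explore the catalogue outside the dashboard interface, a STAC catalogue can also be browsed programmatically; the endpoint below is a placeholder, not the actual EO Dashboard catalogue URL.

```python
# Sketch: listing collections of a STAC catalogue programmatically.
# The EO Dashboard catalogue URL below is a placeholder.
import pystac_client

cat = pystac_client.Client.open("https://example.org/eodashboard/stac")  # placeholder
for collection in cat.get_collections():
    print(collection.id, "-", collection.title)
```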
Add to Google Calendar

Tuesday 24 June 13:52 - 14:12 (EO Arena)

Demo: D.04.17 DEMO - Interactively visualise your project results in Copernicus Browser in no time

#cog

In this demo, we will demonstrate how to interactively visualize and explore your project results using Copernicus Browser. Copernicus Browser is a frontend application within the Copernicus Data Space Ecosystem, designed to explore, visualize, analyze, and download Earth Observation data.

We will guide you through the necessary steps to prepare your data for ingestion, introduce various services within the Ecosystem, including the one that supports data ingestion (the Bring Your Own COG API), and show you how to configure your data for interactive visualization. This includes setting up a configuration file, writing an Evalscript, and creating a legend.

Finally, we will demonstrate how to visualize and analyze results within Copernicus Browser.
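Preparing data for the Bring Your Own COG API means providing Cloud Optimized GeoTIFFs. One common way to produce them (not necessarily the tooling used in the demo) is the rio-cogeo package; file names below are placeholders.

```python
# One possible way to convert a plain GeoTIFF into a Cloud Optimized GeoTIFF before
# ingestion; requires the rio-cogeo package. File names are placeholders.
from rio_cogeo.cogeo import cog_translate, cog_validate
from rio_cogeo.profiles import cog_profiles

cog_translate(
    "project_result.tif",         # input raster (placeholder)
    "project_result_cog.tif",     # output COG
    cog_profiles.get("deflate"),  # internal tiling + DEFLATE compression
    overview_level=5,             # build overviews for fast multi-scale viewing
)
print(cog_validate("project_result_cog.tif"))  # (is_valid, errors, warnings)
```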

Speakers:


  • Daniel Thiex - Sinergise
Add to Google Calendar

Tuesday 24 June 17:37 - 17:57 (EO Arena)

Demo: D.04.26 DEMO - Accessing Copernicus Contributing Missions, Copernicus Services and other complementary data using CDSE APIs: OData, STAC, S3, OGC, openEO

#stac

The Copernicus Data Space Ecosystem offers a wide portfolio of data sets complementary to the “core” Sentinel products. Their characteristics may differ from the Sentinel data sets, and some of them may not be available in all of the CDSE APIs. The aim of this demonstration session is to facilitate usage of the complementary datasets in the CDSE platform by explaining the main differences between them and Sentinel data, based on selected data access scenarios. Code snippets in the CDSE JupyterLab will be provided so that CDSE users can reuse them in their own applications.
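The session will provide its own notebooks; purely as an indication of the style of access, a STAC search against CDSE might look like the sketch below, where the endpoint and collection id follow public CDSE documentation and are assumptions here.

```python
# Indicative only: searching a complementary collection through the CDSE STAC API.
# Endpoint and collection id are assumptions and may differ from the session material.
import pystac_client

cdse = pystac_client.Client.open("https://stac.dataspace.copernicus.eu/v1")  # assumed endpoint
search = cdse.search(
    collections=["COP-DEM"],        # complementary dataset; collection id assumed
    bbox=[11.0, 46.0, 12.0, 47.0],
    max_items=5,
)
for item in search.items():
    print(item.id, list(item.assets))

# Asset hrefs typically point at the CDSE object store and can then be read directly
# via GDAL's /vsis3/ virtual file system or the S3 API once credentials are configured.
```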

Speaker:


  • Jan Musiał - CloudFerro
Add to Google Calendar

Tuesday 24 June 16:07 - 16:27 (EO Arena)

Demo: D.04.28 DEMO - Exploring Copernicus Sentinel Data in the New EOPF-Zarr Format

#cloud-native #stac #zarr

Overview:
This demonstration will showcase the Earth Observation Processing Framework (EOPF) Sample Service and the newly adopted cloud-native EOPF-Zarr format for Copernicus Sentinel data. As ESA transitions from the SAFE format to the more scalable and interoperable Zarr format, this session will highlight how users can efficiently access, analyze, and process Sentinel data using modern cloud-based tools.

Objective:
Attendees will gain insight into:
- The key features of the Zarr format and its advantages for cloud-based workflows.
- How the transition to EOPF-Zarr enhances scalability and interoperability.
- Accessing and exploring Sentinel data via the STAC API and S3 API.
- Using Jupyter Notebooks for interactive data exploration and analysis.
- Running scalable Earth observation workflows on cloud platforms.

Interactive Discussion & Feedback:
Following the demonstration, there will be a dedicated time for discussion and feedback. Attendees can share their experiences, ask questions, and provide valuable input on the usability and future development of the EOPF-Zarr format. This is a great opportunity to learn about next steps in the transition process, future developments, and how to integrate EOPF-Zarr into your own workflows.

Join us to explore how EOPF-Zarr is changing access to Copernicus Sentinel data and enabling scalable Earth observation workflows, and contribute your thoughts on shaping the next phase of this transformative technology!
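As a rough sketch of the cloud-native access described above (not the session's own material), opening an EOPF-Zarr product with xarray could look like the following; the product URL, group path and dimension names are assumptions.

```python
# Sketch of cloud-native access to an EOPF-Zarr Sentinel product. The product URL is a
# placeholder; EOPF products are hierarchical, so a group path ("measurements") and
# dimension names (x, y) are assumed here for illustration.
import xarray as xr

url = "https://objectstore.example.eu/eopf/S2B_MSIL2A_example.zarr"  # placeholder
ds = xr.open_zarr(url, group="measurements", consolidated=True)

# Only the requested chunks are fetched over HTTP, so a small spatial subset stays
# lightweight even though the full product is large.
subset = ds.isel(x=slice(0, 1024), y=slice(0, 1024))
print(subset)
```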
Add to Google Calendar

Tuesday 24 June 08:30 - 10:00 (Hall M1/M2)

Presentation: An Interactive Scientific Visualization Toolkit for Earth Observation Datasets

#zarr

Authors: Lazaro Alonso, Jeran Poehls, Nuno Carvalhais, Markus Reichstein
Affiliations: Max-Planck Institute for Biogeochemistry
Visualizing and analyzing data is critical for identifying patterns, understanding impacts, and making informed predictions. Without intuitive tools, extracting meaningful insights becomes challenging, diminishing the value of collected information. FireSight[1], an open-source prototype developed within the Seasfire[2] project, addresses these challenges by offering a data-driven visual approach to fire analysis and prediction. Other tools, such as LexCube[3], which focuses on visualizing 2D textures in a 3D space, or the CarbonPlan initiative[4], which specializes in 2D maps from Zarr stores, provide additional methods for interacting with spatial data. While these tools excel in specific areas, FireSight's comprehensive visualization enhances multidimensional analysis, enabling users to derive deeper insights from complex datasets. The toolkit leverages advanced web technologies to deliver interactive and visually compelling 3D volumetric renders. Its design allows users to easily customize the interface by integrating modern user interface (UI) components. The platform provides intuitive browser experiences powered by React, OpenGL Shading Language, and ThreeJS. Through a web-based interface, users can interactively select variables from different data stores, dynamically explore data in 2D and 3D where applicable, and calculate relationships between variables. A key objective is to enhance the visualization of observational data and modeling outputs, supporting the interpretation and communication of results. The visualization toolkit offers several key features: (1) users can dynamically explore data, selecting any variable and viewing it in both 2D and 3D when a time dimension is available. (2) Relationships between variables can be calculated, enhancing analytical capabilities for deeper data insights. (3) The tool supports the visualization of various Earth observation datasets, which can serve as inputs for modeling frameworks, ensuring flexibility in data exploration. (4) Finally, the FireSight code base is released on GitHub as open source, with detailed instructions for installation and operation. Currently, plotting is restricted to the entire spatial extents of datasets, requiring a local dataset for streaming information. However, the chunking method of the Zarr data format offers potential for cloud-based EO platforms to enable pixel-level exploration. This capability would facilitate the visualization of complex modeling outputs without excessive data transfer. Aligned with Open Science principles, FireSight development incorporates community-driven libraries such as React, ThreeJS, and Zarr.js, while actively contributing to repositories like react-tweakpane[5]. The platform emphasizes modularity to ensure adaptability for future EO applications and interdisciplinary outreach. This presentation will explore the platform's design philosophy, technical implementations, and future expansion plans, including the integration of pyramid data schemes for high-resolution datasets. These advancements pave the way for next-generation scientific data exploration. By fostering open innovation, FireSight aims to bridge the gap between Earth Observation researchers, educators, and non-specialist communities, amplifying the impact of scientific endeavors and encouraging cross-disciplinary collaborations.
[1] https://github.com/EarthyScience/FireSight [2] https://seasfire.hua.gr/ [3] https://www.lexcube.org/ [4] https://carbonplan.org/blog/maps-library-release [5] https://github.com/MelonCode/react-tweakpane/pull/3
Add to Google Calendar

Tuesday 24 June 08:30 - 10:00 (Room 0.94/0.95)

Presentation: Supporting Urban Heat Adaptation with Earth Observation

#stac #zarr

Authors: Daro Krummrich, Adrian Fessel, Malte Rethwisch
Affiliations: OHB Digital Connect
Climate change has ubiquitous effects on the environment and on human life. While the increased frequency of extreme weather events or droughts has immediate and drastic consequences, the direct effect of rising ambient temperatures on humans is more subtle and affects demographics unequally. The direct influence of rising temperatures is most significant in cities and in environments that are heavily shaped by humans, partly because of a lack of awareness of climate change and partly because planning and redesign processes did not consider those changes. Typical phenomena seen in urban environments are urban heat islands, which manifest as microclimates affecting the surface and atmosphere above the urban space. They are indicated by average temperatures and thermal behavior that significantly exceed those of the surrounding rural areas and can be attributed in part to the ubiquitous presence of artificial surface types suppressing natural soil function and the regulatory functions of water bodies or vegetation, as well as to an altered radiation budget. Further, the atmospheric modifications brought about by urban heat islands affect air quality and may even influence local weather patterns, such as rainfall. Mitigation of urban heat islands can in principle be achieved by altering urban planning to integrate more green spaces and water surfaces and to avoid certain man-made surface types. However, despite the intensity with which heat islands affect human life, redesign of existing urban environments is rarely a practical solution. Nevertheless, the need to act has been realized by administrators, leading to novel regulations foreseeing, for instance, the implementation of heat action plans which contain immediate measures during heat waves and guidelines for more sustainable future planning. In this presentation, we highlight the status and results from two complementary initiatives devised to support urban heat adaptation: First, we present the “Urban Heat Trend Monitor”, a GTIF capability striving to ease integration of satellite Earth observation data into adaptation strategies. Recognizing that spaceborne Earth observation cannot deliver thermal infrared data at spatial resolutions appropriate for urban spaces, we introduce the thermal infrared sensor RAVEN as the second focus. RAVEN is a custom SWAPc-sensitive multiband sensor for airborne Land Surface Temperature retrieval in urban environments, which can help fill the gaps where spaceborne sensors struggle. In line with digitalization efforts across virtually all sectors, the efficiency and efficacy of adaptation measures can be supported via the provision of accessible and actionable information from spaceborne Earth observation, but also in conjunction with information from local sources including, for instance, demographic data or airborne acquisitions. This is one objective of the ESA GTIF (Green Transition Information Factories) initiative, which drives the cloud integration, standardization, and commercialization of a diverse set of capabilities targeted at green transition venues, also including the domain of sustainable cities. Focusing on efforts in the scope of the ongoing “GTIF Kickstarters: Baltic” project, we present the development status of the “Urban Heat Trend Monitor”, a capability which exploits data from ESA’s Copernicus Sentinel 2 and 3 satellites to provide users with easy-to-interpret maps of urban climate information that can be integrated into administrative processes and facilitate sustainable urban planning.
Super-resolution imaging is used to enhance the resolution of the satellite imagery, allowing the analysis of heat islands and temperature fluctuations at the level of individual neighborhoods. Complementing streamlined access to raster data, the focus of the heat trend monitor is to enable users to extract, analyze and compare time series data for purposes such as the comparison of regions, the identification of problematic trends, or the analysis of landcover changes. As an alternative to trend extraction in user-defined regions of interest or administrative boundaries, we propose a spatial partitioning method based on a superpixel approach to identify meaningful regions based on thermally homogeneous behavior. We approach time series analysis and trend identification using Generalized Additive Models, a data-driven approach balancing predictive power and explainability. GTIF capabilities are developed in close cooperation with stakeholders to meet their needs (in the case of the Urban Heat Trend Monitor, stakeholders from the Baltic region) and build on a technology stack aimed at interoperability and reusability. To this end, we adhere to standards including openEO, STAC and cloud-optimized storage formats like Zarr. Our second focus, RAVEN (“Remote Airborne Temperature and Emissivity Sensor”), was devised as an efficient solution to enable Land Surface Temperature retrieval at a scale appropriate for urban environments (resolution of 0.5-4 m at typical operating altitudes). RAVEN employs a multi-band sensing and retrieval scheme typically reserved for spaceborne sensors and airborne demonstrator instruments, yet implemented with relatively low-cost COTS hardware, enabling future use with unmanned airborne platforms. We report on the conceptualization and implementation of the sensor, including geometric and radiometric calibration efforts, as well as on results from a 2024 airborne campaign conducted in Valencia in the scope of the Horizon 2020 project CityCLIM, and elaborate on their relevance for urban adaptation.
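The abstract names Generalized Additive Models for trend extraction; a minimal sketch of that kind of fit, using the pygam package on synthetic data (not the project's implementation), is given below.

```python
# Minimal GAM sketch for a surface-temperature time series using pygam; illustration
# only, with synthetic data, not the Urban Heat Trend Monitor implementation.
import numpy as np
from pygam import LinearGAM, s

# Synthetic land-surface-temperature series: seasonal cycle plus a slow warming trend.
days = np.arange(0, 3 * 365).reshape(-1, 1)
lst = (20
       + 8 * np.sin(2 * np.pi * days.ravel() / 365)
       + 0.002 * days.ravel()
       + np.random.normal(0, 1, days.shape[0]))

gam = LinearGAM(s(0, n_splines=25)).fit(days, lst)  # one smooth term over time
fitted = gam.predict(days)                          # seasonal cycle + trend estimate
gam.summary()
```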
Add to Google Calendar

Wednesday 25 June

14 events

Wednesday 25 June 17:45 - 19:00 (X5 - Poster Area)

Poster: A Decade of High-Resolution Antarctic Ice Speed Variability from the Sentinel-1 Mission

#pangeo #zarr

Authors: Ross Slater, Anna E. Hogg, Pierre Dutrieux, Benjamin J. Davison, Richard Rigby, Benjamin Wallis
Affiliations: University Of Leeds, British Antarctic Survey, University of Sheffield
Highly uncertain ocean warming is driving dynamic imbalance and increased mass loss from the Antarctic Ice Sheet (AIS), with important global sea level rise implications. Ice velocity, measured primarily through satellite-observations, is a key indicator of this change and recent advances in Earth observation capabilities now allow measurements with unprecedented spatial and temporal resolution across key regions of the AIS. The Sentinel-1 synthetic aperture radar (SAR) satellites, part of the European Commission’s Copernicus program, have acquired repeat imagery over the ice sheet’s coastal margin at a combination of 6 and 12-day repeats since 2014. Using an established offset-tracking processing chain, we generate Antarctic-wide mosaics of ice speed on a 100m grid for each 6 and 12-day separated Sentinel-1 image pair between October 2014 and February 2024. We perform analysis with tools from the Pangeo software ecosystem, using the continent-wide mosaics to generate multi-terabyte ice velocity data cubes. The Xarray and Dask Python packages are used for distributed computation of these cubes, which are stored in the Zarr chunked-array format for optimised access. We analyse the spatial distribution of ice speed trends through the study period, as well as decadal, multi-year, and sub-annual variability in time series extracted from 445 flow units across the AIS. Of these time series, we identify 239 flow units that have accelerated, and 206 that decelerated, whilst the full distribution of trends consistently skews toward acceleration. Acceleration trends are found most prominently in West Antarctica and the West Antarctic Peninsula, but ice speed variability is spectrally broad, complex, and spatially heterogeneous across the continent. Strong multi-year variability in ice speed is observed predominantly in West Antarctica. At sub-annual time scales, we identify seasonal speed variations on the Antarctic Peninsula and at scattered locations around the rest of the continent. This new dataset reveals the highly dynamic nature of the AIS, paving the way for improving our understanding of its interactions with other components of the Earth system.
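To make the Pangeo-style processing described above concrete, a sketch of the pattern (not the authors' actual code; store and variable names are placeholders) is shown below.

```python
# Illustrative Pangeo-style analysis: open a chunked Zarr ice-velocity cube lazily
# and fit a per-pixel linear trend in speed over time. Store and variable names are
# placeholders, not the study's actual data.
import xarray as xr

cube = xr.open_zarr("ice_speed_cube.zarr", consolidated=True)   # placeholder store
speed = cube["speed"].chunk({"time": -1})   # polyfit needs the time axis in one chunk

fit = speed.polyfit(dim="time", deg=1)          # Dask-backed, evaluated lazily
trend = fit.polyfit_coefficients.sel(degree=1)  # slope: speed change per time unit
trend.to_dataset(name="speed_trend").to_zarr("speed_trend.zarr", mode="w")
```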
Add to Google Calendar

Wednesday 25 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Optimizing EnMAP Satellite Operations: Acquisition Strategies and Data Access

#stac

Authors: Emiliano Carmona Flores, Dr. Sabine Baumann, Sabine Chabrillat, Sabine Engelbrecht, Martin Habermeyer, Sebastian Hartung, Dr. Laura La Porta, Dr. Nicole Pinnel, Dr. Miguel Pato, Mathias Schneider, Daniel Schulze, Peter Schwind, Dr. Katrin Wirth
Affiliations: German Aerospace Center (DLR), German Research Center for Geosciences (GFZ), Leibniz University Hannover, German Aerospace Center (DLR), German Aerospace Center (DLR)
Since its launch on April 1, 2022, the EnMAP mission has been delivering high-quality hyperspectral data to a growing global user base. The mission follows an open-access policy, providing freely available data for scientific use. Additionally, users are encouraged to submit observation proposals, granting them the opportunity to task EnMAP over specific areas of interest. This approach requires the mission operations to balance the needs of the users with the technical capabilities of the satellite. This contribution summarizes the status of the tasking strategy and data access in the EnMAP mission, as well as the challenges encountered and the optimizations introduced during the initial years of operation.
Acquisition Strategy
While many present satellite missions offer global mapping capabilities, this is not yet available for the current generation of imaging spectroscopy missions. Today’s hyperspectral missions, like EnMAP, operate with a tasking mission concept, where acquisitions are prioritized and scheduled over specific locations. EnMAP follows a strategy that prioritizes observations entered by the users. To maximize the observation capacity, the remaining time is used for the so-called Background Mission, which targets over 600 high-interest sites worldwide and, at the same time, aims to map large land surface areas. Experience from the first few years of operation shows that user requests are very unevenly distributed globally, which makes it a real challenge to fulfill all user requests. In addition, users order short acquisitions, i.e. single EnMAP products (30 x 30 km), underutilizing the potential observation capabilities of the satellite. Moreover, there is a growing number of requests for time series data, which is difficult to achieve over the most requested geographic areas due to the competing orders from different users. To address these difficulties, the so-called Foreground Mission was introduced. In this approach, a set of pre-selected areas over Europe are periodically observed and up-to-date information is shared on the EnMAP website (www.enmap.org) about the status of the observations and future plans. The targets for the Foreground Mission were defined in collaboration with users representing different application areas. Following its positive reception by the user community, this initiative will be extended to other geographic areas in the future. In this contribution, we present the EnMAP acquisition approach and its optimization. We discuss the statistics of different observation modes and provide best practice recommendations for effectively tasking the EnMAP satellite to acquire data.
Data access
The open-access data policy of EnMAP allows users to access and download more than 110,000 products from the mission archive simply by registering as an EnMAP user. Archived products can be ordered and processed on demand, with users selecting their preferred parameters, including product levels or the type of atmospheric correction.
The available data product levels are:
- Level 1B: radiometrically corrected data in sensor geometry
- Level 1C: radiometrically and geometrically corrected, orthorectified data
- Level 2A: atmospherically and geometrically corrected, orthorectified surface reflectance data with two atmospheric correction modes (land-mode and water-mode)
The on-demand processing approach ensures that the EnMAP products adapt better to user needs and are generated using the up-to-date version of the processing software, which is regularly updated with enhancements. On the other hand, high-volume product requests can lead to long waiting times. To address the needs of users with no specific processing requirements or interested in large data volumes, EnMAP provides the complete set of Level 2A products processed with a standard set of parameters, available at the EOC-Geoservice and EOLab platforms. This dataset, verified as CEOS Analysis Ready Data (CEOS-ARD) for land applications, is optimized for large-scale use and easily accessible through the Geoservice STAC API, facilitating data discovery and access. In this contribution we will present the different deliverable EnMAP products, the available options to obtain them, and the main differences that users should expect in each case.
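As an indication of how the STAC-based access described above can look in practice, a short sketch is given below; the Geoservice STAC endpoint and the collection id are assumptions and may differ from the identifiers actually used.

```python
# Indicative search for EnMAP Level 2A ARD items via the EOC Geoservice STAC API.
# Endpoint and collection id are assumptions, not confirmed identifiers.
import pystac_client

geoservice = pystac_client.Client.open("https://geoservice.dlr.de/eoc/ogc/stac/v1")  # assumed
search = geoservice.search(
    collections=["ENMAP_HSI_L2A"],        # assumed collection id
    bbox=[10.0, 47.0, 12.0, 49.0],
    datetime="2023-01-01/2023-12-31",
    max_items=10,
)
for item in search.items():
    print(item.id, item.datetime, list(item.assets))
```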
Add to Google Calendar

Wednesday 25 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Advancing Hyperspectral Data Analysis with the EnMAP-Box

#cloud-native

Authors: Benjamin Jakimow, Andreas Janz, Akpona Okujeni, Leon-Friedrich Thomas, Patrick Hostert, Sebastian van der Linden
Affiliations: Humboldt-Universität zu Berlin, GFZ German Research Centre for Geosciences, University of Helsinki, Universität Greifswald
Imaging spectroscopy data from missions such as EnMAP, EMIT, PRISMA, and the upcoming CHIME and SBG initiatives offer transformative potential for environmental monitoring, agriculture, and mineral exploration. As imaging spectroscopy (IS) data from satellites becomes increasingly accessible, there is a growing need for advanced tools that enable users to handle, visualize and analyze images with hundreds of collinear bands, while seamlessly integrating them with data from other sources, such as multispectral raster data and measurements from field spectroscopy. Existing GIS and remote sensing software often fall short due to high costs, restricted accessibility, or insufficient flexibility for processing data from different sensors. The EnMAP-Box (Jakimow et al. 2023) is an open-source Python plugin for the QGIS geoinformation system, designed to address these challenges. Developed as part of the EnMAP mission activities (Chabrillat et al. 2024), the EnMAP-Box offers comprehensive functionality focused on, but not limited to, IS data and spectral libraries. With over 150 algorithms integrated into the QGIS Processing Framework, users can generate classification maps, estimate continuous biophysical parameters, and conduct advanced analyses. These processing algorithms are highly adaptable, capable of running in environments ranging from laptops to cloud-based processing clusters, and are easily embedded into extensive workflows. The EnMAP-Box serves as a platform for domain-specific applications, such as the EnMAP Preprocessing Tools (EnPT, Scheffler 2023) for radiometric corrections, the EnMAP Geological Mapper (EnGeoMap) and EnMAP Soil Mapper (EnSoMap) for mineral and soil classification, or the Hybrid Retrieval Workflow for Quantifying Non-Photosynthetic Vegetation (Steinhauser 2024). Its versatility has made the EnMAP-Box a valuable resource across a multitude of Earth Observation applications. Our presentation focuses on the latest innovations in EnMAP-Box version 3.16, which represent a significant step forward in functionality, e.g.:
• Preprocessing of Hyperspectral Time Series: A new pipeline transforms imaging spectroscopy data into Analysis-Ready Data (ARD), providing a consistent and accessible format for time-series analysis. This facilitates multitemporal investigations, such as phenology tracking and multi-temporal classification, while ensuring spectral and spatial consistency across sensor constellations.
• Eased visualization and editing of spectral libraries, and integration of spectral libraries into raster processing workflows.
• Execution of computationally intensive EnMAP-Box algorithms on high-performance clusters, addressing the growing demand for scalability in hyperspectral data analysis.
• Deep-learning-based semantic segmentation with SpecDeepMap.
The EnMAP-Box has proven to be a go-to environment for scientists, professional use cases and educational settings (Foerster et al. 2024), serving researchers, students, public authorities, land managers, and private companies. Its state-of-the-art algorithms, embedded in the most important open-source GIS environment, position the EnMAP-Box as a cutting-edge tool for Earth Observation, and specifically for IS applications. Attendees will gain insights into how these tools enable scalable, reproducible workflows for both researchers and operational users.
Concluding the presentation, we will outline the roadmap for the EnMAP-Box, focusing on planned developments until end of 2026, including enhanced cloud-native capabilities and additional tools for emerging hyperspectral missions. This presentation aims to empower the remote sensing community to tackle complex environmental challenges with powerful and easy-to-use solutions. References: Chabrillat, S., Foerster, S., Segl, K., Beamish, A., Brell, M., Asadzadeh, S., Milewski, R., Ward, K.J., Brosinsky, A., Koch, K., Scheffler, D., Guillaso, S., Kokhanovsky, A., Roessner, S., Guanter, L., Kaufmann, H., Pinnel, N., Carmona, E., Storch, T., Hank, T., Berger, K., Wocher, M., Hostert, P., van der Linden, S., Okujeni, A., Janz, A., Jakimow, B., Bracher, A., Soppa, M.A., Alvarado, L.M.A., Buddenbaum, H., Heim, B., Heiden, U., Moreno, J., Ong, C., Bohn, N., Green, R.O., Bachmann, M., Kokaly, R., Schodlok, M., Painter, T.H., Gascon, F., Buongiorno, F., Mottus, M., Brando, V.E., Feilhauer, H., Betz, M., Baur, S., Feckl, R., Schickling, A., Krieger, V., Bock, M., La Porta, L., Fischer, S., 2024. The EnMAP spaceborne imaging spectroscopy mission: Initial scientific results two years after launch. Remote Sensing of Environment 315, 114379. https://doi.org/10.1016/j.rse.2024.114379 Foerster, S., Brosinsky, A., Koch, K., Eckardt, R., 2024. Hyperedu online learning program for hyperspectral remote sensing: Concept, implementation and lessons learned. International Journal of Applied Earth Observation and Geoinformation 131, 103983. https://doi.org/10.1016/j.jag.2024.103983 Jakimow, B., Janz, A., Thiel, F., Okujeni, A., Hostert, P., van der Linden, S., 2023. EnMAP-Box: Imaging spectroscopy in QGIS. SoftwareX 23, 101507. https://doi.org/https://doi.org/10.1016/j.softx.2023.101507 Scheffler, D., Brell, M., Bohn, N., Alvarado, L., Soppa, M.A., Segl, K., Bracher, A., Chabrillat, S., 2023. EnPT – an Alternative Pre-Processing Chain for Hyperspectral EnMAP Data, in: IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium. IEEE, pp. 7416–7418. https://doi.org/10.1109/igarss52108.2023.10281805 Steinhauser, S., Wocher, M., Halabuk, A., Košánová, S., Hank, T., 2024. Introducing the Potential of the New Enmap-Box Hybrid Retrieval Workflow for Quantifying Non-Photosynthetic Vegetation, in: IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium. IEEE, pp. 4073–4076. https://doi.org/10.1109/igarss53475.2024.10642095
Add to Google Calendar

Wednesday 25 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Cubes & Clouds – A Massive Open Online Course for Cloud Native Open Data Sciences in Earth Observation

#cloud-native #stac #pangeo

Authors: Michele Claus, Anne Fouilloux, Tina Odaka, Juraj Zvolensky, Stephan Meißl, Tyna Dolezalova, Robert Eckardt, Jonas Eberle, Alexander Jacob, Anca Anghelea
Affiliations: Eurac Research, Simula Research Laboratory AS, IFREMER Laboratoire d'Oceanographie Physique et Spatiale (LOPS), Ignite education GmbH, EOX IT Services GmbH, Eberle Web- and Software-Development, European Space Agency
Earth Observation (EO) scientists are facing unprecedented volumes of data, which continue to grow with the increasing number of satellite missions and advancements in spatial and temporal resolution. Traditional approaches, such as downloading satellite data for local processing, are no longer viable. As a result, EO science is rapidly transitioning to cloud-based technologies and open science practices. However, despite the swift evolution and widespread adoption of these new methods, the availability of training resources remains limited, posing a challenge for educating the next generation of EO scientists. The free Massive Open Online Course Cubes & Clouds - Cloud Native Open Data Sciences for Earth Observation (https://eo-college.org/courses/cubes-and-clouds/) introduces data cubes, cloud platforms, and open science in Earth Observation. Aimed at Earth Science students, researchers, and Data Scientists, it requires basic EO knowledge and basic Python programming skills. The course covers the entire EO workflow, from data discovery and processing to sharing results in a FAIR (Findable, Accessible, Interoperable, Reusable) manner. Through videos, lectures, hands-on exercises, and quizzes, participants gain both theoretical knowledge and practical experience in cloud-native EO processing. Students who successfully complete the course should be able to confidently use cloud platforms for EO research and share their work following open science principles. The hands-on exercises use Copernicus data accessed through the SpatioTemporal Asset Catalog (STAC) to demonstrate two approaches for defining Earth Observation (EO) workflows: the openEO API and the Pangeo software stack. Participants engage in similar exercises using both methods, allowing them to compare their benefits and limitations. This approach provides a deeper understanding of the importance of APIs and standards in modern EO practices. In the final exercise, participants collaborate on a community snow cover map, mapping small areas of the Alps and submitting results to a STAC catalogue and web viewer. This project demonstrates their ability to apply EO cloud computing and open science practices while adhering to FAIR standards. Upon successful completion of the course, each participant will receive a certificate that can be included in their CV or shared easily. The talk will guide participants through the topics covered in Cubes & Clouds and show how they are presented in the EO College e-learning platform; the links to the exercises carried out on CDSE will be explored, and the open science aspect will be shown in the community mapping project.
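The rough flavour of the two approaches compared in the exercises is sketched below; endpoints and collection ids follow public CDSE documentation but are assumptions here, and the actual course material differs in detail.

```python
# Rough flavour of the two approaches compared in the Cubes & Clouds exercises.
# Endpoint and collection ids are assumptions taken from public CDSE documentation.

# 1) openEO API: describe the computation server-side, then download the result.
import openeo

conn = openeo.connect("https://openeo.dataspace.copernicus.eu").authenticate_oidc()
cube = conn.load_collection(
    "SENTINEL2_L2A",
    spatial_extent={"west": 11.0, "south": 46.4, "east": 11.3, "north": 46.6},
    temporal_extent=["2024-01-01", "2024-03-31"],
    bands=["B03", "B11"],
)
# ... compute a snow index, reduce over time, download the result ...

# 2) Pangeo stack: search STAC yourself and build the cube client-side with Dask.
import pystac_client
import odc.stac

items = pystac_client.Client.open("https://stac.dataspace.copernicus.eu/v1").search(
    collections=["sentinel-2-l2a"],              # collection id assumed
    bbox=[11.0, 46.4, 11.3, 46.6],
    datetime="2024-01-01/2024-03-31",
).item_collection()
xr_cube = odc.stac.load(items, bands=["B03", "B11"], chunks={})
```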
Add to Google Calendar

Wednesday 25 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Montandon: The Global Crisis Data Bank

#stac

Authors: Emmanuel Mathot, Sanjay Bhangar, Sajjad Anwar
Affiliations: Development Seed
The International Federation of Red Cross and Red Crescent Societies (IFRC) started the Global Crisis Data Bank (GCDB, also known as Montandon) initiative in 2021 with support from UNDRR, UN OCHA, and the WMO. The GCDB aims to create a centralized database that harmonizes hazard, impact, and corresponding response data for every event from various sources, including GDACS, EM-DAT, and DesInventar. This addresses crucial gaps in the humanitarian data ecosystem, specifically helping IFRC’s own National Societies to understand the past and better prepare for the future. IFRC’s vision for Montandon is to become the largest archive of structured data about current and historical disasters worldwide. This enables analysis to reveal patterns at various spatial and temporal resolutions. Montandon is the foundation for forecast models and systems and a dynamic database constantly reflecting the humanitarian community's approaches. Over the last year, the IFRC team has built several components of Montandon, including a data schema, data processing and transformation scripts, as well as a proof-of-concept API. Development Seed has started a new phase with the IFRC GO development team to enhance Montandon components, making the project operational for use within IFRC and the broader humanitarian community. We introduce a new technical approach to operationalize Montandon, also known as the 'Monty' database, utilizing established technologies to harmonize the data model. A key part of this process is model normalization, which involves integrating data into the SpatioTemporal Asset Catalog (STAC) model to ensure efficient metadata management for disaster events. A dedicated extension (https://github.com/IFRCGo/monty-stac-extension) outlines specifications and best practices for cataloging all attributes related to hazards, impacts, and responses. We perform an in-depth analysis of a range of data sources, including GDACS and DesInventar, to effectively extract, transform, and load data into a harmonized model. One of the primary challenges we encounter is the organization of information across various data models and definitions of disasters. A critical aspect of our work involves the implementation of event pairing, which is necessary to connect the different episodes associated with a disaster. This method enables the utilization of STAC API functions to effectively search, filter, and aggregate data on hazards and their impacts across one or multiple events. These functions are designed for broad application by platforms such as IFRC GO and the Disaster Response Emergency Fund (IFRC-DREF). They aim to provide essential data and actionable insights, enhancing decision-making for stakeholders involved in disaster management. Finally, this initiative also seeks to significantly improve the gathering and analysis of responses to various critical events. To achieve this, it will focus on integrating with established frameworks such as the Disasters Charter and the Copernicus Emergency Management Service. These collaborations are intended to enhance the systematic cataloging of satellite imagery and value-added products. Among the key outputs will be detailed flood maps that provide insightful visual representations of flood-prone areas, accurate forecasts outlining the projected paths of cyclones, and near-real-time mapping of wildfires to aid in timely response and resource allocation at global and regional levels.
Keywords: Global Crisis Data Bank (GCDB), Montandon, International Federation of Red Cross and Red Crescent Societies (IFRC), Data Access, Interoperability, Humanitarian Data, Disaster Management, Hazard Data, Impact Data, Response Data, Data Schema, Data Processing, STAC (SpatioTemporal Asset Catalog), API (Application Programming Interface), Model Normalization, Disaster Response Emergency Fund (IFRC-DREF), Satellite Imagery, Copernicus
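Purely as an illustration of how a STAC API over such a harmonized catalogue could be queried, a sketch is given below; the endpoint, collection id and property names are hypothetical, and the actual fields are defined by the Monty STAC extension linked above.

```python
# Hypothetical query against a Monty-style STAC API: retrieve hazard items linked to
# events in a region and period. Endpoint, collection id and properties are placeholders.
import pystac_client

monty = pystac_client.Client.open("https://example.org/monty/stac")  # placeholder endpoint
search = monty.search(
    collections=["gdacs-events"],          # placeholder collection id
    bbox=[88.0, 20.0, 93.0, 27.0],         # Bay of Bengal region
    datetime="2024-05-01/2024-06-30",
)
for item in search.items():
    # Hazard/impact attributes defined by the Monty STAC extension would be inspected
    # here; the exact field names depend on the extension specification.
    print(item.id, item.properties.get("title"))
```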
Add to Google Calendar

Wednesday 25 June 17:45 - 19:00 (X5 - Poster Area)

Poster: High resolution evapotranspiration for climate adaptation strategies

#stac

Authors: Stefano Natali, Leon Stärker, Stefanie Pfattner, Daniel Santillan, Maximilien
Affiliations: SISTEMA GmbH
The rapid pace of climate change is affecting a number of sectors, systems, individuals and institutions worldwide, which have to adapt to its impact. Especially in Austria, climate change is making itself more and more noticeable, and its existence, its pace and its impact are demonstrated by numerous measurements and observations. According to recent climate data (https://www.iea.org/articles/austria-climate-resilience-policy-indicator), the increase of annual mean temperature in the country has been more than twice the amount of global warming, with a larger impact on urban, agricultural, mountainous and forest areas. These climate change signals can be observed through the rapid melting of glaciers and thawing of the permafrost in alpine regions, or through the increasing number of hot days and tropical nights and the increase in precipitation. The overall motivation of the proposed project is to develop services that can support climate adaptation strategies through integration and monitoring of evapotranspiration (ET) using the existing satellite missions and in-situ data. The tool, which is under development for the FFG Project GET-ET for GTIF, will use ECOSTRESS and Sentinel-2 data to provide high-resolution ET data for urban applications, agriculture, and forest management. Even though methodologies to estimate evapotranspiration with Sentinel-2 have already been proposed, the current approach uses a combination of high-resolution VIS-NIR satellite data and ancillary data such as land cover and climate reanalysis data to train a deep learning model and estimate the ET values provided by ECOSTRESS. New foundation models trained on geospatial imagery improve prediction capabilities with satellite imagery. The pretrained IBM model “Prithvi” is one example of an architecture suited to this task; its large-scale pretraining over all the bands of Sentinel-2 facilitates the fine-tuning. Prithvi shows meaningful capabilities for segmentation and regression that fit with the objective of estimating evapotranspiration. Such a model can also be adapted to the ancillary data that are needed; this flexibility motivated the deep learning choice. After the training, the model can produce ET maps using only Sentinel-2 and ancillary data as input. High-resolution evapotranspiration maps are valuable tools in urban planning and the strategic design of green infrastructures, enabling climate-resilient planning for cities. They facilitate precise identification of areas that experience significant heat stress, known as urban heat islands (UHI), due to the lack of green infrastructures (GI). By highlighting urban heat islands, targeted green interventions can be introduced to mitigate these effects while enhancing natural cooling mechanisms, such as cooling corridors. Additionally, these maps are useful for monitoring larger green spaces, such as green roofs and parks, to assess their vitality over time. This ensures the long-term effectiveness of green infrastructure, maintaining their cooling benefits and enhancing the quality of urban living spaces. In the framework of the project, the usefulness of the produced ET maps is assessed through real use cases in the Vienna city area. The generated data and resulting information products are maintained as STAC collections (time-series data cubes) and made accessible, spatially disaggregated to block/district/commune level, via both a RESTful API and an interactive WebGUI/GIS.
This functionality will be provided through the EODASH ecosystem, which allows visualization of a multitude of heterogeneous data sources, from serverless formats to OGC Web Services (OWS), and provides interactive visualization, process triggering and custom display of results.
Add to Google Calendar

Wednesday 25 June 16:30 - 16:50 (EO Arena)

Demo: A.08.17 DEMO - CNES cloud platform and services to optimize SWOT ocean data use

#pangeo

The Surface Water and Ocean Topography (SWOT) mission, a joint venture between the United States (NASA) and France (CNES), with contributions from the Canadian Space Agency (CSA) and the United Kingdom Space Agency (UKSA), has been measuring the world's surface waters for more than two years, providing the first high-resolution mapping of our planet's water resources. SWOT's innovative KaRIn (short for Ka-band Radar Interferometer) instrument provides remarkable insights into the study of fine structures (down to about 10 km) of the ocean circulation, coastal processes, and freshwater stock variations in lakes and rivers (greater than 100 m).

As part of the SWOT ocean data dissemination, this demonstration will showcase the cloud-based tools and services offered by CNES. In particular, we will present the CNES cloud-like platform for hosting SWOT projects (high computing power with CPU and GPU capacities, very fast and optimized remote access to SWOT data products, etc.) together with SWOT-specific Pangeo-based libraries, powerful tools, dedicated tutorials illustrating simple use cases (intercomparison with other satellite data or in-situ measurements, cyclone monitoring, coastal applications, etc.), and technical support (helpdesk) for smooth sailing on the platform.

Speakers:


  • Cyril Germineaud - CNES
Add to Google Calendar

Wednesday 25 June 16:52 - 17:12 (EO Arena)

Demo: D.03.34 DEMO - EDC & Pangeo Integration on EarthCODE

#stac #pangeo

This demonstration will provide a concise yet comprehensive overview of how the Pangeo ecosystem (on EDC) integrates seamlessly into EarthCODE. During this 20-minute talk, participants will learn about EarthCODE's core capabilities that support FAIR (Findable, Accessible, Interoperable, and Reusable) and open-science principles for Earth Observation (EO) data.

We will showcase:
- The integration of Pangeo's scalable, reproducible scientific workflows within EarthCODE, enabling users to efficiently discover, access, and process large EO datasets.
- Key functionalities such as dataset access via EarthCODE Science Catalog using STAC and OGC standards.
- Practical examples demonstrating data analysis with Pangeo tools, including data loading with Xarray, visualization using HvPlot, and scalable computation leveraging Dask.
- Real-world use cases featuring Copernicus Sentinel satellite data

The demonstration will highlight how researchers can easily adapt existing workflows to their needs and ensure reproducibility by publishing results directly through EarthCODE's integrated platforms.
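The sketch below indicates the style of Pangeo analysis listed above (Xarray loading, hvPlot visualization, Dask computation); the dataset URL and variable name are placeholders, not EarthCODE resources.

```python
# Minimal sketch of a Pangeo-style analysis: lazy loading with Xarray/Dask and
# plotting with hvPlot. Dataset URL and variable name are placeholders.
import xarray as xr
import hvplot.xarray  # noqa: F401  (registers the .hvplot accessor)

ds = xr.open_zarr("https://example.org/earthcode/sample.zarr", chunks={})  # placeholder
monthly = ds["sst"].resample(time="1MS").mean()   # Dask-backed, lazy until plotted

plot = monthly.isel(time=-1).hvplot.quadmesh(x="lon", y="lat", cmap="viridis")
```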

Speakers:


  • Deyan Samardzhiev - Lampata
  • Ewelina Agnieszka Dobrowolska - Serco
  • Anne Fouilloux - Simula Labs
Add to Google Calendar

Wednesday 25 June 15:00 - 15:20 (EO Arena)

Demo: D.04.27 DEMO - The Sentinels EOPF toolkit: Notebooks and Plug-ins for using Copernicus Sentinel Data in Zarr format

#zarr

As part of the Copernicus Earth Observation Processor Framework (EOPF), ESA is in the process of providing access to “live” sample data from the Copernicus Sentinel-1, -2 and -3 missions in the new Zarr data format. This set of reprocessed data allows users to try out accessing and processing data in the new format and to experience its benefits with their own workflows.

To help Sentinel data users experience and adopt the new data format, a set of resources called the Sentinels EOPF Toolkit is being developed. Development Seed, SparkGeo and thriveGEO, together with a group of champion users (early-adopters), are creating a set of Jupyter Notebooks, plug-ins and libraries that showcase the use of Sentinel data in Zarr for applications across multiple domains for different user communities, including users of Python, Julia, R and QGIS.

This demonstration will give a first glimpse of the initial set of notebooks and plug-ins of the Sentinels EOPF toolkit that have been developed to facilitate the adoption of the Zarr data format by Copernicus Sentinel data users. Additionally, we will give an overview of toolkit developments and community activities that are planned throughout the project period.

Speakers:


  • Julia Wagemann - thriveGEO
  • Gisela Romero Candanedo - thriveGEO
  • Emmanuel Mathot - Development Seed

Add to Google Calendar

Wednesday 25 June 08:30 - 10:00 (Room 0.11/0.12)

Presentation: Advancing Monitoring of Complex Coasts: Harnessing Sentinel-2 and Landsat Data for Complementary Open-Source Approaches at Continental Scale

#stac

Authors: Stephen Sagar, Robbi Bishop-Taylor, Claire Phillips, Vanessa Newey, Rachel Nanson
Affiliations: Geoscience Australia
Since 2015, Digital Earth Australia (DEA) has been producing continental-scale Earth observation (EO) products for the historical characterisation and ongoing monitoring of the Australian coastal region. Our products and workflows focus on leveraging petabytes of analysis-ready EO data (ARD), innovative methods for dealing with noise and environmental variability, and a commitment to open-source code, methods, and data access. These products allow coastal managers and scientists to evaluate the socio-economic and environmental impacts from issues such as coastal erosion in this dynamic interface between land and sea, integrating these historical and ongoing data insights into future planning. Australia has one of the most varied coastal and tidal environments in the world, ranging from complex macro-tidal mudflats in the north to micro-tidal rocky shores and beaches in the south of the country. This variability presents an issue common to the application of EO products to coastal monitoring in many locations worldwide: a single methodology and product type may not be sufficient for comprehensive characterisation and monitoring, and a complementary approach must often be considered. This talk will introduce a new method for mapping intertidal topography at unprecedented spatial and temporal resolution. Our approach combines analysis-ready Landsat and Sentinel-2 satellite imagery with state-of-the-art global tide modelling to analyse patterns of tidal inundation across Australia’s entire intertidal zone. This approach is applied at the pixel level, allowing us to extract fine-scale morphological details that could not be resolved by previous waterline-based intertidal mapping methods. This pixel-based method greatly reduces the volume of satellite imagery required to generate accurate intertidal elevation models, enabling us to produce multi-temporal snapshots of Australia’s dynamic intertidal zone from 2016 to the present. Importantly, this method represents the first Open Data Cube (ODC) product to fully integrate ESA Sentinel-2 data with USGS Landsat data into a single derived product for coastal applications. We show the clear benefits of incorporating 10m resolution Sentinel-2 data into the product workflow, enabled by the common ARD workflow used in DEA and the consistency of cross-sensor surface reflectance data this provides. We also demonstrate the power of increasing the temporal density of satellite observations for coastal regions and change analysis, setting the scene for future missions such as Landsat Next and the next generation of Sentinels. We show that when paired with satellite-derived shoreline approaches, such as the DEA Coastlines product, our new DEA Intertidal product allows this critical transition zone between land and sea to be fully integrated into multi-temporal coastal change analysis at a continental scale. These products work in a highly complementary way, with each filling the gap for coastal regions and environments where the other may struggle to accurately capture the nature and magnitude of coastal change. For example, in macro-tidal regions with extensive muddy tidal flats, where DEA Coastlines may produce shorelines with high uncertainty, DEA Intertidal can better model and represent the dynamic nature of the shifting mudflats. Validation of both products is completed using a full suite of LiDAR, ground survey, drone, and photogrammetry data. 
Along with quantified uncertainty metrics, this provides confidence for coastal managers, scientists, and modellers looking to incorporate this data into decision-making processes and scientific workflows. Underpinning this novel approach, we will introduce work we have undertaken to optimise the use of multiple global tide models, based on the findings that no single tidal model performs best across these complex environments. In keeping with our open-source ethos, this work is published as a suite of tools in the ‘eo-tides’ Python package, and we show how this package can be used freely to improve coastal EO analysis. In an international context, one of the most significant developments in this product stream is the addition of tools designed to enable the application of these workflows to locations outside of Australia. Our approach is based on open-source data and code, allowing it to be applied to any freely available source of satellite data (e.g., cloud-hosted Microsoft Planetary Computer data) loaded using STAC metadata and the Open Data Cube. This provides new opportunities for deeper engagement with initiatives like the CEOS Coastal Observations Applications Services and Tools (COAST) VC, and stronger collaboration with our international partners like ESA and the USGS.
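As an indication of how the open-source workflow can be applied outside Australia with freely available, cloud-hosted data, a sketch is given below; the collection id and band names follow public Microsoft Planetary Computer documentation and are assumptions here, not the DEA production configuration.

```python
# Sketch: search cloud-hosted imagery on Microsoft Planetary Computer via STAC and load
# it with odc-stac. Collection and band ids are assumptions, not the DEA configuration.
import planetary_computer
import pystac_client
import odc.stac

pc = pystac_client.Client.open(
    "https://planetarycomputer.microsoft.com/api/stac/v1",
    modifier=planetary_computer.sign_inplace,   # signs asset URLs for direct reads
)
items = pc.search(
    collections=["sentinel-2-l2a"],
    bbox=[115.0, -21.0, 116.0, -20.0],           # Pilbara coast, Western Australia
    datetime="2023-01-01/2023-12-31",
    query={"eo:cloud_cover": {"lt": 20}},
).item_collection()

# B03 (green) and B11 (SWIR) support typical water/intertidal indices.
cube = odc.stac.load(items, bands=["B03", "B11"], resolution=10, chunks={})
print(cube)
```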
Add to Google Calendar

Wednesday 25 June 11:30 - 13:00 (Room 0.94/0.95)

Presentation: TACO: Transparent Access to Cloud-Optimized Spatio-Temporal Datasets

#stac #parquet

Authors: Cesar Aybar, Luis Gómez-Chova, Julio Contreras, Oscar Pellicer, Chen Ma, Gustau Camps-Valls, David Montero, Miguel D. Mahecha, Martin Sudmanns, Dirk Tiede
Affiliations: Image Processing Laboratory (IPL), Institute for Earth System Science and Remote Sensing, Leipzig University, Department of Geoinformatics-Z_GIS, University of Salzburg
Over the past decade, Earth system sciences (ESS) have increasingly relied on machine learning (ML) to address the challenges posed by large, diverse, and complex datasets. The performance of ML models in ESS is directly influenced by the volume and quality of data used for training. Paradoxically, creating ready-to-use, large, high-quality datasets remains one of the most undervalued and overlooked challenges in ML development. Identifying suitable training datasets for specific tasks is often challenging, and few existing ESS datasets fully adhere to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. To address this challenge, we present TACO, a new specification designed to simplify and streamline the creation of FAIR-compliant datasets. In TACO, a dataset is represented as a data frame that lists file paths for each data point alongside its metadata. TACO stores all data point components as binary large objects (BLOBs) accessible through GDAL Virtual File Systems (VFS). It leverages the /vsicurl/ method for accessing online resources and /vsisubfile/ for reading specific file segments. These features enable efficient partial and parallel data reads (i.e. cloud-optimized) at the data point level, making TACO ideal for integration with ML frameworks like Torch and TensorFlow. Users can perform tasks such as exploratory data analysis or create online dataloaders without needing full downloads. The metadata framework builds on the STAC GeoParquet standard [1], enhanced with naming conventions from the Open Geospatial Consortium (OGC) training data markup language [2]. Additionally, TACO recommends optional fields derived from the Croissant Responsible AI (RAI) framework [3] to capture the economic, social, and environmental context of the regions surrounding each data point. Designed for broad compatibility, TACO relies solely on GDAL, a core dependency for major raster data libraries in Python (e.g., rasterio), R (e.g., terra and stars), and Julia (e.g., Rasters.jl). This ensures seamless integration with the geospatial ecosystem with minimal installation effort across various programming environments. To demonstrate the effectiveness of the TACO specification, we applied the specification to methane plume detection datasets. Each published methane plume dataset was transformed to adhere to TACO principles. Given that each TACO dataset is organized as a data frame at a high level, combining them involved a straightforward concatenation operation. This process created MethaneSet, the largest and most geographically diverse multisensor collection of methane emissions to date, covering 37,316 methane leaks across 9,931 distinct emission sites. MethaneSet is 25 times larger than the methane dataset presented by the United Nations Environment Programme at COP28. Looking ahead, we plan to extend TACO’s capabilities to other areas of ESS, further expanding its scope and impact. References [1] https://github.com/stac-utils/stac-geoparquet [2] https://docs.ogc.org/is/23-008r3/23-008r3.html [3] https://docs.mlcommons.org/croissant/docs/croissant-rai-spec.html
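The access pattern TACO relies on can be sketched as follows; the URL, offset and size are made up for illustration, and in practice they would come from the TACO data frame for each data point.

```python
# Sketch of the TACO access pattern: read one sample's bytes out of a larger remote
# BLOB by chaining GDAL virtual file systems. URL, offset and size are placeholders.
import rasterio

url = "https://example.org/taco/methaneset.blob"   # placeholder remote container
offset, size = 1_048_576, 262_144                  # would come from the TACO data frame

# /vsisubfile/ reads only the [offset, offset+size) window of the /vsicurl/ stream,
# so a single HTTP range request is enough to materialise one data point.
path = f"/vsisubfile/{offset}_{size},/vsicurl/{url}"
with rasterio.open(path) as src:
    patch = src.read()
print(patch.shape)
```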
Add to Google Calendar

Wednesday 25 June 08:30 - 10:00 (Room 1.15/1.16)

Presentation: Surface Water Inventory and Monitoring (SWIM): Hands-on Examples for Improved Flood Mapping and Water Resource Monitoring

#stac

Authors: Sandro Groth, Marc Wieland, Dr. Sandro Martinis
Affiliations: German Aerospace Center (DLR)
Over recent years, DLR has established automated workflows to derive the extent of open surface water bodies from various Earth Observation (EO) datasets. These workflows have consistently demonstrated their value in supporting flood mapping and water monitoring activities. In order to maximize the potential of these methods, we are developing an open-access surface water product at 10-20 m spatial resolution based on Sentinel-1/2 data. The primary objective is to provide easy access to global high-resolution surface water information and offer interfaces for seamless integration into automated rapid mapping and monitoring workflows to support the disaster management and water monitoring community. The proposed SWIM product contains two publicly available data collections: 1) Water Extent (SWIM-WE) provides a dynamically updated set of binary masks identifying open surface water bodies extracted from both Sentinel-1 and Sentinel-2 scenes. This data collection enables users to rapidly identify the water extent on a specific date or to analyze surface water dynamics over time covering large areas. 2) Reference Water (SWIM-RW) contains fused information on permanent and seasonal water bodies based on the SWIM-WE collection over an observation period of two years. This collection was designed to support disaster response workflows by providing pre-computed information on the "normal" hydrologic conditions to accelerate the identification of flooded areas. By considering seasonal water dynamics, potential overestimation of inundation extent can be limited. Users are also able to access additional assets containing the relative water frequency as well as quality layers. The described product will be available via OGC web services on the DLR GeoService. All published data assets can be accessed from SpatioTemporal Asset Catalogs (STAC). We chose this technology as it allows for quick data search by filtering user-specific areas and time frames. Matching water masks can also be projected and mosaicked efficiently using open-source tools such as odc-stac. In combination with the publication of the SWIM product, a set of Jupyter Notebooks that demonstrate common use cases will be made available. For the visualization of SWIM data in GIS software or web mapping applications, Web Map Services (WMS) will be hosted. In this conference contribution, we aim to introduce the SWIM product to a broader audience and provide hands-on examples of how the data can be used for effective flood disaster response and long-term water resource monitoring. To showcase a typical rapid mapping workflow using SWIM, the visualization of observed water extent at a specific date and location using the SWIM-WE WMS service is demonstrated using the example of the 2024 flood event in Southern Germany. Additionally, the automated identification of flooded areas using the SWIM-RW reference water mask is shown. The importance of considering seasonality in reference water layers is demonstrated by comparing inundation masks from real flood events in Germany and India. To highlight potential use cases of the SWIM product apart from flood monitoring, time-series analysis of SWIM-WE items of water reservoirs in Germany is conducted to showcase the analysis of hydrologic drought conditions.
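An indicative sketch of the STAC-plus-odc-stac access described above is given below; the STAC endpoint, collection id and band name are placeholders until the product is published.

```python
# Indicative access pattern for the SWIM collections: search water-extent masks for an
# area and date range via STAC and mosaic them with odc-stac. Endpoint, collection id
# and band name are placeholders.
import pystac_client
import odc.stac

cat = pystac_client.Client.open("https://geoservice.dlr.de/eoc/ogc/stac/v1")  # assumed endpoint
items = cat.search(
    collections=["SWIM-WE"],                # placeholder collection id
    bbox=[10.0, 48.2, 11.5, 48.9],          # Southern Germany, June 2024 flood
    datetime="2024-06-01/2024-06-05",
).item_collection()

water = odc.stac.load(items, bands=["water_mask"], chunks={})   # band name assumed
flood_extent = water["water_mask"].max(dim="time")              # any observation flagged as water
```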
Add to Google Calendar

Wednesday 25 June 08:30 - 10:00 (Room 1.31/1.32)

Presentation: Enhancing Disaster Response Through Cloud-Based Multi-Mission EO Data Processing

#stac #cog

Authors: Fabrizio Pacini, Mauro Arcorace, Marco Chini, Zachary Foltz, Roberto Biasutti
Affiliations: Terradue Srl, Environmental Research and Innovation, LIST, ACRI-ST, ESRIN, European Space Agency
The effective use of Earth Observation (EO) data during disasters is critical for timely response and recovery efforts. Since 2000, the International Charter Space and Major Disasters has provided satellite imagery at no cost to the final user, empowering global EO experts to deliver information products essential for disaster relief operations. Supporting this mission in recent years, the ESA Charter Mapper platform has been developed as an innovative cloud-based processing environment, showcasing the transformative potential of cutting-edge technologies in EO data exploitation. Concerning the analysis of remote sensing data, the Charter Mapper provides a dedicated web application where Project Managers (PM) and Value Adders (VA) can find pre-processed multi-sensor EO data including extracted metadata, perform visual analysis, and submit on-demand processing to extract geo-information from satellite imagery. After searching for a desired calibrated dataset via the GUI, PMs/VAs can visualize either an overview or a single-band asset in the map at full resolution and apply on-the-fly changes to the rendering of imagery by stretching the histogram. Furthermore, users can combine single-band Assets of Calibrated Datasets to create custom intra-sensor RGB band composites on the fly or can employ expressions to derive a binary mask from a single band asset. Users can also visually compare pre- and post-event images directly in the map using a slider bar to depict the evolution of catastrophic events. Beyond visualization, the ESA Charter Mapper offers a robust suite of EO processing services designed to meet diverse analytical needs during disaster response and recovery. Accessible through an intuitive interface, the platform’s portfolio includes 26 processing services that support both systematic and on-demand workflows. These services empower users to perform advanced data operations such as pan sharpening, band combination, image co-location and co-registration, change detection, cloud masking, and hotspot and burned area mapping. For SAR data, the platform enables InSAR processing for surface displacement monitoring and coherence analysis. Furthermore, users can also perform unsupervised image classification, raster filtering, vectorization, and map composition. Outputs include geo-information products such as spectral indices, flood masks, and burned area maps in various formats, from TOA reflectance to false-color RGBA visualizations. Recent developments have introduced advanced functionalities to facilitate the generation of Value Added Products (VAP), such as the GIS functions panel where users can work with vector files, and the Map Composition functionality to generate a professional cartographic product as a PDF or PNG file. The Charter Mapper integrates technologies such as Kubernetes, SpatioTemporal Asset Catalog (STAC), and cloud-optimized GeoTIFF (COG), enabling streamlined access, visualization, and processing of data from a constellation of 41 EO missions managed by 24 international space agencies and data distributors. By employing Common Band Names (CBN) for harmonized spectral mapping and automating ingestion workflows, the platform ensures rapid, systematic pre-processing of diverse datasets, including optical and SAR imagery. This automation enables consistent, high-quality, and analysis-ready datasets to be available within short timeframes, a critical advantage for timely decision-making in disaster response.
Additionally, the Charter now hosts a growing archive of multi-mission data, providing a valuable resource for analysis and reanalysis across different activations and scenarios. In September 2021, the Charter Mapper was officially released into operations. So far it has facilitated PM and VA users in accessing and using a large amount of EO data acquired over 195 Charter activations. The architecture of the Charter Mapper has been designed with scalability and adaptability in mind, benefiting from the Open Geospatial Consortium (OGC) Application Package best practice. This approach, developed in collaboration with the EO Exploitation Platform Common Architecture (EOEPCA) under ESA, enables EO applications to be portable and reproducible across different infrastructures and cloud environments. By leveraging the Common Workflow Language (CWL) and containerized workflows, EO algorithms can be seamlessly deployed, whether for local testing, distributed Kubernetes clusters, or OGC API Processes. The Charter Mapper adopts this framework to support rapid deployment of EO algorithms, ensuring that its processing services are both robust and scalable for disaster response applications. The Charter Mapper architecture is designed to address complex challenges in harmonizing, visualizing, and processing large EO datasets, and it can provide a replicable framework for multi-mission EO data management. Its blueprint is a scalable and adaptable solution for other EO initiatives and is well-suited for applications beyond disaster response, such as environmental monitoring, urban planning, and climate change analysis. This oral presentation will showcase the platform’s innovative technical framework and its role in supporting disaster relief efforts. We will present recent case studies from Charter activations, demonstrating how the automated workflows and scalable architecture facilitate faster and more reliable disaster response. Furthermore, we will illustrate how the Charter Mapper's design and operational principles can serve as a replicable model for other EO platforms seeking to address the growing demand for efficient data processing and analysis across multiple sectors.
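To make the role of cloud-optimized GeoTIFF concrete, the short Python sketch below reads a decimated overview and a small full-resolution window from a remote COG over HTTP, the kind of access pattern that enables responsive map rendering without downloading entire products. The URL and coordinates are placeholders, not actual Charter Mapper assets.

# Sketch only: placeholder COG URL; assumes the asset uses geographic coordinates.
import rasterio
from rasterio.windows import from_bounds

url = "https://example.com/calibrated-dataset/band04.tif"

with rasterio.open(url) as src:
    # Read a decimated overview for quick rendering of the whole scene.
    overview = src.read(1, out_shape=(src.height // 16, src.width // 16))
    # Read only the bytes covering a small area of interest at full resolution.
    window = from_bounds(12.30, 41.80, 12.60, 42.00, transform=src.transform)
    patch = src.read(1, window=window)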
Add to Google Calendar

Wednesday 25 June 17:00 - 19:00 (Schweizerhaus, Prater 116)

Social: Cloud-native Geospatial Community Social.
Register to attend here

#cloud-native

Socialize with other members of the Cloud-Native Geospatial community.
Add to Google Calendar

Thursday 26 June

46 events

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: The Centre for Environmental Data Analysis (CEDA) and JASMIN: EO and Atmospheric data next to a fast parallel processing cluster.

#stac

Authors: Steve Donegan, Ed Williamson, Alison Waterfall, Fede Moscato
Affiliations: Stfc Ceda
The Centre for Environmental Data Analysis (CEDA) provides access to over 23 PB of EO and Atmospheric data from UK-funded research. CEDA’s data holdings range from surface weather station and airborne survey flight data to the daily retrieval of satellite data as well as climate model data. CEDA has over 35,000 users, who benefit from the ability to process and analyse this data using JASMIN – a world-class fast parallel processing cluster hosted by the Science and Technology Facilities Council (STFC) with 55,000 cores and 24 PB of dedicated user storage in Group Workspaces (GWS). There are over 400 GWSs, and they provide a useful facility for CEDA and JASMIN users to share and develop the data further, contributing to many national and international projects and datasets that will be placed on the CEDA archive. CEDA and JASMIN are based at the STFC Rutherford Appleton Laboratory (RAL) in Oxfordshire, UK, within the RAL Space division. CEDA is part of the UK Natural Environment Research Council (NERC) Environmental Data Service (EDS), providing FAIR (Findable, Accessible, Interoperable & Reusable) access to data and services. CEDA also provides the data archive component for the UK National Centre for Earth Observation (NCEO) and the UK National Centre for Atmospheric Science (NCAS). CEDA’s EO archive component alone includes data from the Sentinel, Landsat, Terra/Aqua and ENVISAT missions in addition to data from the NERC ARF and DEFRA Sentinel ARD, as well as many other missions and research data outputs. It also hosts datasets from, and works closely with, international projects such as the ESA Climate Change Initiative (ESA CCI) and the Coupled Model Intercomparison Project (CMIP), and is a designated data centre/primary archive for the IPCC Data Distribution Centre. CEDA also provides a Data Hub Relay (DHR) for ESA as part of the international Copernicus data dissemination effort, with almost 20 TB flowing to and through the CEDA DHR daily. CEDA is one of the leading partners on the UK Earth Observation Data Hub (UK EODH), a high-profile, world-leading, UK-specific software infrastructure tying together the UK academic and commercial EO communities and easing data access for both. CEDA maintains many data streams across the EO and Atmospheric disciplines, with incoming data flows of 7-8 TB per day typical for the Sentinel mirror archive and MODIS data streams alone. CEDA works closely with NCEO, NCAS and UK stakeholders to identify data streams of use to the community and actively engages to provide timely and reliable access to the data. Data is not only actively sourced and retrieved from providers such as EUMETSAT, Copernicus (via the CEDA DHR) and NASA/USGS, but is also automatically pushed to CEDA from sources such as the UK Meteorological Office (UKMO) and ground station retrievals via a data arrivals service. CEDA provides many methods for users to find and access EO data, not least fast access via the JASMIN environment and its fast parallel processing cluster. The Satellite Data Finder is a web tool that allows users to quickly find most CEDA EO datasets. Users can access this via a conventional GUI or by an OpenSearch interface. CEDA is continually involved in efforts to improve data search and access, not least current development work on a STAC catalogue to support the UK EODH.
With developments such as this and the new Big Data paradigm, CEDA is giving much thought to how best to structure and support data formats that ease this transition as well as how to allow the data to be accessed and processed with these technologies. CEDA has recently celebrated 30 years of continuous data centre operations supporting vital access to the UK EO and Atmospheric Science communities. We maintain a keen eye on emerging future technologies that will impact our operations and interaction with users. Not least of these is the support and research into NetZero technologies for data centres. CEDA is working closely with its STFC parent organisation to ensure that CEDA remains fit for the future and will meet its community and societal obligations.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Enhancing Earth Observation Accessibility with AI-Driven Natural Language Interfaces

#stac

Authors: Sergey Sukhanov, Enrique Fernandez, Dr. Ivan Tankoyeu, Hector Lopez, Revan Rangotis, Federico Fioretti
Affiliations: Ai Superior Gmbh, Serco
Earth observation (EO) data is a cornerstone for scientific research, environmental monitoring, and decision-making, yet its accessibility remains a significant challenge for non-expert users. Sentinel satellite missions, operated under the Copernicus program, generate vast amounts of geospatial data. However, effectively querying this data requires domain expertise and familiarity with complex query languages like OData and STAC. This barrier limits the utilization of EO data by a broader audience, including policymakers, educators, and researchers in non-technical fields. To address this challenge, we present a generative AI-driven natural language interface designed to bridge the gap between user-friendly interaction and the technical intricacies of EO data discovery. Developed as part of the Collaborative Data Hub Software evolution, the proposed system leverages advancements in natural language processing (NLP) and geospatial technologies to transform user queries into structured formats compatible with Data Hub Software (DHS) ecosystems, such as the GAEL Store Service (GSS) and Copernicus Space Interface (COPSI). Users can input queries in plain language, such as “Find cloud-free Sentinel-2 images of the Amazon rainforest from July 2024,” and receive actionable results seamlessly. The system's architecture is modular, comprising a tool selector, an extractor powered by large language models (LLMs), a geospatial lookup service, and a validation and conversion pipeline to ensure queries adhere to OData and STAC standards. The design also incorporates fuzzy geolocation capabilities and dialogue-based feedback mechanisms to refine ambiguous inputs iteratively. Deployed in a containerized environment on OVHCloud, the system supports scalable, real-time query processing. The solution addresses key challenges, including handling ambiguous or incomplete user inputs, maintaining compatibility with DHS standards, and scaling to support diverse user demands. Initial results demonstrate significant improvements in query accuracy and usability, enabling a wider audience to access and utilize EO data effectively. In our presentation, we explore the need for such a system, detailing the technical challenges, system architecture, and implementation strategies. We conclude with insights from early deployment phases and a roadmap for future enhancements.
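For illustration, the snippet below shows the kind of structured STAC query such an extractor might emit for the example sentence above, executed with pystac-client. The endpoint, collection ID, bounding box, and cloud-cover threshold are assumptions, not the system's actual output.

# Sketch only: assumed STAC endpoint and collection ID; illustrative query values.
from pystac_client import Client

catalog = Client.open("https://stac.dataspace.copernicus.eu/v1")  # assumed CDSE STAC endpoint
search = catalog.search(
    collections=["sentinel-2-l2a"],                 # hypothetical collection ID
    bbox=[-70.0, -10.0, -50.0, 0.0],                # rough Amazon bounding box from a geospatial lookup
    datetime="2024-07-01/2024-07-31",
    query={"eo:cloud_cover": {"lt": 10}},           # "cloud-free" relaxed to below 10 percent
)
print(search.matched(), "matching items")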
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Exploring Federated Processing of Earth Observation Data Through Cloud-Native

#cloud-native

Authors: Mr. Guilhem Mateo, Cesare Rossi, Ms. Kavitha Kollimalla, Mr. Roberto Di Rienzo, Mr. Hugues Sassier, Mr. Jean Luc Gauthier
Affiliations: CGI Italia, Thales Alenia Space, CGI France
This work presents our experience implementing cloud-based processing of Earth Observation data across federated environments in EUMETSAT, addressing the growing need for interoperable and secure distributed computing solutions in the EO domain. Building on established technologies, we explore an architecture combining Kubernetes and Knative for workload management, demonstrating how modern cloud-native approaches can be effectively applied to satellite data processing challenges. At the core of our implementation, Argo Workflows serve a dual purpose: orchestrating cross-platform execution of long-lived processing tasks and defining standardized satellite image analysis pipelines. This approach ensures reproducibility of scientific workflows while maintaining flexibility across different computing environments. Our system implements a subset of the OGC API Processes specification, facilitating standardized access to processing capabilities, alongside WebDAV and S3-compatible interfaces for efficient data access and storage. Our implementation also supports short-lived tasks based purely on Knative capabilities, demonstrating how serverless computing patterns can be effectively applied to EO processing workflows. Security considerations are addressed through a comprehensive approach to credential management, utilizing Kubernetes secrets, Vault, and a dedicated key management service. This ensures secure access to distributed storage systems while maintaining the scalability required for large-scale EO processing. The architecture implements API gateways for centralized access control and security policy enforcement, complemented by a quota management system that ensures fair resource allocation across users and processing tasks. The federation of Data Processing Instances (DPIs) enables workload distribution across different cloud environments, addressing challenges of data locality and processing efficiency within EUMETSAT's operational context. Through practical implementation and testing, we have identified several key challenges in federated EO processing, including credential propagation across security boundaries, enabling workflow portability between different cloud providers, and maintaining consistent performance across heterogeneous computing environments. Our solutions to these challenges contribute to the broader discussion of best practices in cloud-based EO ecosystems. The implementation has required careful consideration of storage access patterns, network latency between federated instances, and the balance between processing efficiency and resource consumption. In summary, with this work we aim to contribute to ongoing discussions about best practices in cloud-based EO ecosystems, particularly focusing on the intersection of security, interoperability, and scientific workflow management. Our findings suggest that while challenges remain in achieving truly seamless federation across cloud environments, current cloud-native technologies provide a solid foundation for building robust, secure, and scalable EO processing systems. The work demonstrates the potential for standardized, secure, and efficient processing of Earth Observation data across federated cloud environments.
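As a concrete illustration of the standardized processing interface mentioned above, the sketch below submits an asynchronous job to an OGC API Processes endpoint. The base URL, process ID, and inputs are hypothetical placeholders for a Data Processing Instance rather than a description of the actual EUMETSAT deployment.

# Sketch only: hypothetical endpoint, process ID, and inputs.
import requests

base = "https://dpi.example.org/ogc-api"
body = {"inputs": {"product": "s3://bucket/msg-scene.nc", "band": "IR_108"}}

resp = requests.post(
    f"{base}/processes/radiance-composite/execution",   # hypothetical process ID
    json=body,
    headers={"Prefer": "respond-async"},                 # request asynchronous execution
    timeout=30,
)
resp.raise_for_status()
status_url = resp.headers["Location"]                    # job status resource to poll
print(requests.get(status_url, timeout=30).json()["status"])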
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Empowering Your Community with Earth Observation Insights: An All-in-One Online Workspace Platform Solution

#stac

Authors: Stephan Meißl, Daniel Santillan, Josef Prenner, DI (MSc) Stefan Achtsnit, Karolina Lehotska
Affiliations: EOX IT Services GmbH
Earth Observation (EO) projects offer immense potential for deriving valuable insights, but face challenges in data access, computing needs, and publishing results. This presentation introduces EOxHub Workspaces (https://eox.at/software-products/#managed-cloud-workspace), a comprehensive online workspace platform designed to empower users throughout their EO journey, as used for example in Euro Data Cube or EarthCODE. EOxHub Workspaces provide a complete suite of services for geospatial professionals and communities. The platform offers an efficient, fully integrated environment where users can immediately begin working and producing insights. The platform's cloud integration ensures scalability based on user needs. Additionally, EOxHub Workspaces are supported by experts with years of EO-related project experience, offering custom solutions and services. Key features of EOxHub Workspaces include:
* Scientific development environment: Facilitates data processing, process and resource management, and AI development.
* DevOps and agile life-cycle compatibility: Supports development, testing, and production phases following the GitOps model.
* Community sharing: Enables users to visualize, analyze, and share research insights derived from satellite data and products, as presented for example in the dashboards at https://race.esa.int, https://eodashboard.org, or https://gtif.esa.int.
* Data management and accessibility: Offers multiple storage options, intuitive GUIs, STAC querying, and standardized data rendering.
* Experiment management: Ensures reproducible workflow management with input data and configuration customization.
* Scalable processing: Provides scalable processing power for demanding EO workflows.
* Resource management: Supports multiple resource plans and fine-grained resource tracking and billing.
* Authentication & authorization management: Offers role-based access, SSO, and user groups.
* Branding and customization: Allows for customized landing pages and private user areas.
EOxHub's technology stack includes MLflow, Jupyter, Grafana, Dask, Argo, STAC, and OGC, ensuring compatibility and support for a wide range of EO workflows. EOxHub has been successfully implemented in various use cases, including Euro Data Cube (https://eurodatacube.com), Polar TEP community (https://polartep.hub.eox.at), EarthCODE (Earth Science Collaborative Open Development Environment; https://earthcode.esa.int), Cubes & Clouds MOOC (Massive Open Online Course; https://eo-college.org/courses/cubes-and-clouds/), individual workshops, and GTIF (Green Transition Information Factory; e.g. https://gtif-austria.info) projects. Join our presentation to learn how EOxHub Workspaces can empower your EO projects and unlock the full potential of satellite data. Let's discuss your use case and explore how EOxHub Workspaces can support your specific needs and apply for pre-commercialization sponsoring via ESA’s Network of Resources (https://nor-discover.org).
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: The Earth Observation Training Data Lab (EOTDL) - Addressing Training Data related needs in the Earth Observation community.

#stac

Authors: Juan B. Pedro
Affiliations: Earthpulse
The Earth Observation Training Data Lab (EOTDL), launched by the European Space Agency (ESA), addresses critical challenges in the development of AI for Earth Observation (EO) applications. A major barrier for leveraging AI in EO is the lack of accessible, high-quality training datasets. These datasets are costly and complex to create, requiring extensive manual labeling, expert input, and often in-situ validation, which limits innovation and hinders the growth of EO-based solutions. The EOTDL aims to tackle these challenges by providing an open, collaborative platform that offers a suite of tools for generating, curating, and utilizing AI-ready training datasets and pre-trained ML models. This platform includes a cloud-based repository with over 100 datasets spanning multiple EO applications, from computer vision tasks like classification and object detection to advanced parameter estimation and 3D analysis. In addition to the repository of training datasets, the platform also includes a repository of pre-trained machine learning models, which accelerates the development process for users by providing a starting point for various EO tasks. EOTDL facilitates seamless data access, supports multi-GPU training directly in the cloud through its cloud workspace, and provides users with tools to create and train models effectively. The platform's cloud workspace is equipped with GPU machines, enabling users to create datasets and train models with high computational efficiency. EOTDL also enables interoperability with third-party platforms, and supports the building of third-party applications on top of its infrastructure, enhancing the versatility of the platform. The EOTDL is built upon open-source foundations, with all code hosted on GitHub along with contributing guides and tutorials to foster community involvement. It promotes community engagement through collaborative tools, user contributions, and incentives. Users can contribute by enhancing existing datasets or adding new ones, with rewards to encourage active participation. The platform includes a labeling tool with active learning capabilities, which simplifies and improves the process of labeling datasets. This community-driven approach ensures a growing, diverse repository that evolves to meet the needs of researchers, industry practitioners, and engineers, helping to unlock the full potential of AI for Earth Observation. Feature engineering is another key aspect facilitated by EOTDL, particularly through integration with openEO. This collaboration allows users to leverage openEO's standardized interfaces and powerful processing capabilities for feature extraction from Earth Observation data. By using openEO, users can perform complex geospatial analyses and transformations efficiently, thereby enhancing the quality and relevance of features used for training machine learning models. This integration not only supports reproducibility but also improves the accessibility of sophisticated feature engineering workflows for a wide range of EO applications. The EOTDL also places strong emphasis on dataset and model metadata management, utilizing the SpatioTemporal Asset Catalog (STAC) standard along with custom STAC extensions. This approach ensures that metadata is standardized, searchable, and compatible across various EO datasets and models. The use of custom STAC extensions allows EOTDL to accommodate specific requirements for EO datasets, such as quality metrics, labeling details, and data provenance. 
This metadata framework significantly enhances dataset discoverability, quality assurance, and ease of use for the community. EOTDL's flexible access mechanisms via APIs, web interfaces, and Python libraries make it accessible to a wide range of users, thus creating a powerful ecosystem for advancing EO capabilities. Future plans include the introduction of gamification features to further incentivize contributions and the exploration of commercialization opportunities through premium features, thereby expanding the scope and sustainability of the platform.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Scaling Earth Observation Workflows with openEO: Managing Large-Scale Processing Efficiently

#cloud-native #stac

Authors: Vincent Verelst, Ir. Victor Verhaert, Dr. Ir. Stefaan Lippens, Ir. Jeroen Dries, Dr. Hans Vanrompay
Affiliations: VITO Remote Sensing
The openEO API offers a cloud-native interface for seamless access to and processing of Earth Observation (EO) data. By providing a standardized and user-friendly platform, openEO simplifies the integration of diverse EO datasets and facilitates efficient workflows in the cloud. Its integration with the Copernicus Data Space Ecosystem (CDSE) ensures users can process and analyze EO data at scale. This makes openEO an indispensable tool for researchers and practitioners working with satellite data in fields like agriculture, land monitoring, and environmental management. However, the nature of EO data and analysis presents significant challenges for computational workflows. Tasks such as processing vast spatial extents or executing complex algorithms often involve datasets too large or operations too resource-intensive to handle within the constraints of a single batch job. These limitations can slow progress, increase costs, and create inefficiencies in EO projects. Addressing these challenges requires robust solutions that can manage large-scale workflows, optimize resource usage, and ensure cost-effectiveness. This demonstration highlights how openEO can help you overcome these challenges. By offering advanced job management capabilities, automated processing features, and enhanced metadata integration, openEO enables users to scale up their EO workflows while maintaining efficiency and cost control. These features are particularly beneficial for demanding computational tasks, such as continental land cover mapping or time-series analysis. Key Features of the openEO Job Management System:
1. Automated Spatial and Temporal Splitting: Large-scale EO tasks often exceed the resource limits of a single batch job. To address this, openEO automatically divides these massive jobs into smaller, more manageable sub-jobs. This splitting ensures that each sub-job fits within resource constraints while maintaining the integrity of the overall workflow.
2. Comprehensive Job-Tracking System: Managing large workflows often requires monitoring multiple parallel processes. openEO's MultiBackendJobManager includes a robust tracking feature that monitors the status of all jobs in real time. This job manager provides detailed insights into processing progress, memory usage, and monetary costs (credits).
3. Direct Storage in Cloud-Native Formats: A critical aspect of openEO's scalability lies in its ability to store outputs directly in cloud-native formats (GeoTIFF) on project-specific storage solutions (S3). By bypassing intermediary steps and writing processed data directly to these locations, openEO significantly reduces overhead in data handling.
4. Customizable Memory Usage Settings: Cost optimization is a key consideration in cloud-based EO workflows. openEO addresses this by allowing users to customize memory allocation settings for each task. By fine-tuning memory usage, users can allocate just enough resources to ensure high performance without incurring unnecessary expenses.
5. Automatic STAC Metadata Generation: Metadata plays a crucial role in making processed EO data discoverable, reusable, and interoperable. openEO automatically generates STAC-compliant metadata for all workflow outputs. This ensures that processed data can be easily cataloged and shared, simplifying collaboration across teams and projects.
Conclusion: The integration of advanced job management features and cloud-native tools positions openEO as a critical platform for scalable EO analytics.
Its ability to generate STAC-compliant metadata ensures processed data remains reusable, accessible, and aligned with FAIR data principles, fostering collaboration across the EO community. Successful real-world applications include projects like WorldCereal [1], LCFM (Land Cover and Forest Monitoring) [2], and WEED [3], demonstrating openEO’s capacity to handle even the most computationally intensive workflows efficiently. As the demand for global and regional Earth monitoring grows, openEO provides the tools necessary to meet these challenges, making it an indispensable platform for researchers, policymakers, and practitioners alike. [1] https://esa-worldcereal.org/en [2] https://land.copernicus.eu/en [3] https://esa-worldecosystems.org/en
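A minimal sketch of the job management pattern described above is shown below, using the openEO Python client and its MultiBackendJobManager to run one NDVI batch job per spatial tile against the CDSE backend. The collection choice, tile grid, and some parameter names are illustrative assumptions and may need adjusting to the client version in use.

# Sketch only: illustrative tiles and workflow; some parameter names may differ
# between openeo client versions.
import openeo
import pandas as pd
from openeo.extra.job_management import MultiBackendJobManager

connection = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()

def start_job(row, connection, **kwargs):
    # Build a small NDVI workflow for one spatial sub-job (one tile of a split AOI).
    cube = connection.load_collection(
        "SENTINEL2_L2A",
        spatial_extent={"west": row["west"], "south": row["south"],
                        "east": row["east"], "north": row["north"]},
        temporal_extent=["2024-01-01", "2024-12-31"],
        bands=["B04", "B08"],
    )
    return cube.ndvi().create_job(out_format="GTiff")

# One row per sub-job; a real run would generate these by splitting a large AOI.
tiles = pd.DataFrame([
    {"west": 4.0, "south": 50.0, "east": 4.5, "north": 50.5},
    {"west": 4.5, "south": 50.0, "east": 5.0, "north": 50.5},
])

manager = MultiBackendJobManager()
manager.add_backend("cdse", connection=connection, parallel_jobs=2)
manager.run_jobs(df=tiles, start_job=start_job, output_file="jobs.csv")  # job status tracked in a CSV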
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Leveraging Insula for Advanced Eutrophication Monitoring in Albania and Tanzania

#cloud-native

Authors: Alessandro Marin, Roberto Di Rienzo, Paola Di Lauro, Giulio Ceriola
Affiliations: CGI, Planetek
The Eutrophication Monitoring (Eu-Mon) project contributes to Sustainable Development Goal (SDG) 14.1.1, which seeks to reduce marine pollution, including nutrient loads. Collaborating with stakeholders such as the University of Tanzania and the Resource Environmental Center (REC) Albania, the project targets ecologically significant test and pilot sites: Shengjin Bay and Durres Bay in Albania, as well as the Zanzibar Channel and Mafia-Rufiji Channel in Tanzania. These areas, impacted by nutrient loads from rivers, estuaries, and coastal activities, require advanced monitoring techniques to assess water quality and environmental health effectively. To meet these challenges, the project utilizes Insula, CGI’s Earth Observation (EO) Platform-as-a-Service (PaaS). Insula provides a robust framework for integrating EO data and simplifying the generation of key eutrophication indicators, such as chlorophyll, turbidity and water transparency. The platform offers seamless access to Sentinel-3 OLCI Level 2 WFR data via its deployment within the Copernicus Data Space Ecosystem (CDSE), enabling rapid and efficient dataset retrieval. Insula’s flexibility extends to the integration of custom processors tailored to project-specific needs, allowing for the precise extraction of environmental indicators from EO data. The platform’s ability to perform large-scale processing campaigns has been pivotal for the Eu-Mon project. For example, it processed data spanning six years (2017–2022) over large geographical areas in Albania and Tanzania, completing more than 4,800 processing jobs. Utilizing managed Kubernetes solutions within CDSE, Insula dynamically scaled resources to handle intensive computational demands efficiently. This scalability ensures that even vast datasets are processed with reliability and speed, supporting long-term trend analysis critical for understanding eutrophication dynamics. Insula’s cloud-native architecture enhances its capacity to analyze extensive time series data, uncovering patterns and trends essential for informed decision-making. Its intuitive user interface empowers stakeholders to monitor and manage processing campaigns transparently, offering detailed insights into progress and outputs. By providing user-friendly tools for analyzing environmental conditions, Insula bridges the gap between advanced EO analytics and actionable policy-making. Through its deployment in the Eu-Mon project, Insula has demonstrated its transformative potential to support sustainable coastal ecosystem management in Albania and Tanzania. By enabling the generation of high-quality indicators that inform targeted policies and interventions, the platform contributes significantly to addressing global environmental challenges. Insula’s innovative approach highlights the critical role of EO technologies in achieving SDG 14.1.1 and advancing global efforts to mitigate the impacts of eutrophication on vulnerable marine ecosystems.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: From GeoTIFF to Zarr: Virtualizing a Petabyte-Scale SAR Datacube for Simple, Scalable, and Efficient Workflows

#zarr #pangeo #cog

Authors: Clay Harrison, Wolfgang Wagner, Christoph Reimer, Florian Roth, Bernhard Raml, Sebastian Hahn, Matthias Schramm
Affiliations: TU Wien, Research Unit Remote Sensing, Department of Geodesy and Geoinformation, EODC Earth Observation Data Centre for Water Resources Monitoring GmbH
The technological landscape of big geodata has rapidly evolved, with the "Pangeo stack" emerging as a leading paradigm. This ecosystem, built around data formats like Zarr and Cloud-Optimized GeoTIFF (COG), and processing tools like Xarray and Dask, enables more portable, scalable, reproducible, and FAIR workflows. However, adapting large preexisting data pools to these standards can be logistically challenging, especially when they are integral to numerous existing pipelines and services. Exemplifying this challenge is the petabyte-scale Sentinel-1 Backscatter Datacube, hosted at Vienna's Earth Observation Data Center (EODC) and crucial for global-scale SAR analysis, such as the Copernicus-run Global Flood Monitoring (GFM) and Copernicus Land Monitoring Service (CLMS). Current access and processing rely on custom Python packages, which require ongoing maintenance as both the Datacube and Python itself evolve. A conversion of the Datacube to Zarr format would reduce this maintenance burden by allowing direct use of Xarray for all data access, selection, and processing, and Dask for parallelization and scaling. It would also facilitate uptake of future advancements in the Pangeo ecosystem. However, the costs of duplicating the massive dataset or quickly rewriting existing pipelines to suit a new data format are prohibitive for the time being. We have successfully implemented a basic solution to this problem, using the Python package "fsspec" to create a Reference File System (RFS) that indexes existing GeoTIFF files according to the Zarr structure. The RFS exists as a single JSON file which tells Xarray how to access the Datacube as if it were in Zarr format, simplifying access without physical data conversion. Such a virtual Zarr archive allows a gradual transition of existing downstream pipelines to the new format while leaving upstream pipelines untouched. However, significant challenges remain in optimizing this approach for a petabyte-scale cube of raster files in swath format. Our ongoing work focuses on addressing these challenges, particularly in representing the cube's time dimension. We explore various strategies for time representation, considering trade-offs between precision, sparsity, and computational efficiency with respect to several workflows representative of access patterns. Additionally, we investigate the implications of preexisting blocked compression strategies of legacy TIFF files on the overall performance of our approach. This contribution will discuss our successes in implementing the basic fsspec solution, the challenges encountered in scaling to petabyte-level SAR data with irregular timestamps, and our progress in addressing these issues. We aim to contribute insights into adapting legacy data structures to modern, cloud-optimized analysis paradigms, potentially informing similar efforts with other large-scale geospatial datasets.
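The following minimal sketch shows the general fsspec reference-filesystem pattern the abstract describes: a single JSON reference file is opened as a virtual Zarr store and read with Xarray. The file name, storage protocol, and variable name are placeholders rather than the actual Datacube layout.

# Sketch only: placeholder reference file, storage options, and variable name.
import fsspec
import xarray as xr

fs = fsspec.filesystem(
    "reference",
    fo="s1_datacube_refs.json",        # single JSON file indexing the existing GeoTIFF chunks
    remote_protocol="s3",              # where the original GeoTIFF bytes live
    remote_options={"anon": True},
)
ds = xr.open_zarr(fs.get_mapper(""), consolidated=False)

# Downstream code can now use plain Xarray/Dask selection instead of custom readers.
subset = ds["sig0"].sel(time=slice("2020-06-01", "2020-06-30"))  # "sig0" is an assumed variable name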
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: From Complex EO Data to Actionable Insights: CRISP and Insula’s Role in Sustainable Agriculture

#cloud-native

Authors: Alessandro Marin, Roberto Di Rienzo, Loris Copa, Marcelo Kaihara, Giaime Origgi
Affiliations: CGI, sarmap
The Consistent Rice Information for Sustainable Policy (CRISP) project, funded by the European Space Agency (ESA), aligns with Sustainable Development Goal (SDG) 2.4.1, aiming to enhance sustainable food production systems and resilient agricultural practices by 2030. Through advanced Earth Observation (EO) solutions, CRISP addresses critical agricultural indicators such as seasonal rice planted area, crop growing conditions, yield forecasting, and production at harvest. By integrating contributions from stakeholders like AfricaRice, GEOGLAM, GIZ, Syngenta Foundation, IFAD, WFP, and SRP, CRISP represents a collaborative and impactful initiative. To achieve its ambitious objectives, CRISP leverages Insula, CGI’s EO Platform-as-a-Service (PaaS), as a cornerstone for scalable EO data processing, integration, and analysis. Insula’s capabilities significantly simplify the challenges associated with handling large-scale EO datasets, ensuring cost-effective and efficient processing of Sentinel-1 and Sentinel-2 data. This platform enables CRISP to deploy advanced workflows and algorithms, previously tailored to specific user needs during the project's test phase, across five diverse test sites in South-East Asia, India, and Africa. A key innovation of Insula lies in its ability to provide an intuitive user interface (UI) tailored to decision-makers. This simplifies the complexity of EO data handling, making advanced geospatial analytics accessible to non-experts. Insula’s UI facilitates seamless access to high-quality agricultural intelligence while maintaining a focus on usability, enabling users to interact with, visualize, and analyze data products efficiently. This functionality is vital for CRISP’s mission to empower early adopters with actionable insights for sustainable agriculture. Additionally, Insula excels in its cloud-native architecture, designed to handle the high computational demands of large-scale EO processing. Its scalability ensures that the CRISP project can process massive datasets with consistent performance, accommodating the global scope of its objectives. By integrating EO best practices and leveraging multi-mission data sources, Insula helps CRISP deliver robust, reproducible, and scientifically validated results. Insula’s role extends beyond technical capabilities, fostering a collaborative ecosystem where Early Adopters actively engage with the platform. This hands-on involvement minimizes the risk of unmet expectations and facilitates the endorsement of EO-based services. The platform’s ability to harmonize diverse user requirements into streamlined workflows ensures that CRISP remains a user-centric initiative, delivering operational solutions aligned with the demands of sustainable agriculture. Through Insula, CRISP demonstrates how cutting-edge PaaS technology can transform complex EO data processing into a practical tool for achieving global agricultural sustainability. The project showcases the power of combining advanced analytics with user-oriented design to address large-scale challenges, ensuring a pathway to resilient and productive agricultural practices worldwide.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Cloud-native Near-Real-Time Image Land-Cover Segmentation Data Pipeline

#cloud-native #stac #zarr

Authors: Tobias Hölzer, Jonas Küpper, Todd Nicholson, Luigi Marini, Lucas von Chamier, Ingmar Nitze, Anna Liljedahl, Guido Grosse
Affiliations: Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Permafrost Research Section, University of Potsdam, Institute of Geosciences, Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Computing and Data Centre, National Center for Supercomputing Applications, University of Illinois, Woodwell Climate Research Center
Neural network-based image segmentation is increasingly used in remote sensing and earth observation, often combining multiple earth observation data sources such as digital elevation models and optical satellite imagery. Recent free and publicly available satellite imagery, such as Sentinel-2, has a resolution of up to 10 m per pixel and repeat cycles of a few days. Such resolutions result in tera- and even petabyte-scale datasets. In the context of machine learning and feature segmentation, this amount of data poses specific challenges: running a full segmentation pipeline, including adding auxiliary data, data preprocessing, segmentation, and post-processing, places high demands on processing infrastructure such as storage, network bandwidth, CPU and GPU resources. In previous work, we created an automated segmentation pipeline to map retrogressive thaw slumps (RTS), a widespread mass wasting feature in permafrost regions, similar to landslides, over multiple years using Sentinel-2 and PlanetScope imagery. This pipeline was sufficient for regional analysis and served its purpose for creating the DARTS dataset. However, this pipeline lacks the optimization necessary for scaling the workflow to the circumarctic scale. Thus, our main goal is to scale our processing throughput to go from regional to pan-arctic scale and potentially to high temporal frequency. To address this challenge, we aimed to combine multiple state-of-the-art technologies that rely on proven computational concepts, such as Ray, which eases distributed computing. Hence, we focused on clean, explainable code and the use of existing solutions instead of re-inventing the wheel. For efficient data caching of procedurally downloaded auxiliary data with the STAC protocol, we used Zarr datacubes instead of plain raster data formats such as GeoTIFF. Here we used emerging libraries like odc-geo, which provides an easy-to-use and fast API for defining grids and reprojections of tiles with its GeoBox model. This approach led to the development of a cloud-native pipeline that can efficiently process a single 10,000 x 10,000 pixel tile in mere seconds. Using GPU resources, even for preprocessing such as calculating derived data from digital elevation models, further helped to speed up our processing pipeline. By applying this concept, we successfully built a scalable pipeline for segmenting thaw slumps across the circumarctic permafrost region, which can be run on multiple platforms, from a local machine to cloud computing and HPC systems.
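To illustrate the GeoBox-based tiling mentioned above, the snippet below defines a single processing tile with odc-geo; the CRS, extent, and resolution are example values rather than the project's actual grid.

# Sketch only: example CRS, extent, and resolution for one processing tile.
from odc.geo.geobox import GeoBox

# A 10,000 x 10,000 pixel tile at 10 m resolution in an arbitrary UTM zone.
tile = GeoBox.from_bbox(
    (500000, 7800000, 600000, 7900000),   # (xmin, ymin, xmax, ymax) in metres
    crs="EPSG:32633",
    resolution=10,
)
print(tile.shape, tile.affine)            # grid shape and affine transform used as a reprojection target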
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: ROCS: Extending Romania’s National Infrastructure within the European Collaborative Ground Segment

#cloud-native #stac #zarr #cog

Authors: Marian Neagul, Conf. Dr. Gabriel iuhasz, Vasile Craciunescu, Prof. Dr. Florin Pop, Dr. Alina Radutu, Dr. George Suciu, Prof. Dr. Daniel
Affiliations: West University Of Timisoara
The ROCS project (2024-2027) aims to enhance Romania’s integration into the European Collaborative Ground Segment (COLGS). Expanding Romania's national infrastructure for the storage, processing, and dissemination of Earth Observation (EO) data, ROCS addresses critical challenges associated with the exponentially growing volume of satellite data from programs such as Copernicus. This initiative strengthens Romania’s role in the European space ecosystem and facilitates innovation and data-driven solutions to environmental, economic, and societal challenges. Being one of the world’s most comprehensive EO programs, Copernicus generates daily data through its Sentinel satellites, complemented by in-situ observations and contributions from global missions such as Landsat. These datasets provide opportunities for monitoring Earth’s natural processes. Their massive volume creates data storage, access, and analysis challenges. Traditional data management systems are increasingly unable to meet the demands of real-time applications, multi-modal integration, and cross-domain collaboration. Initiatives such as the Copernicus Data Space Ecosystem (CDSE), Destination Earth (DestinE), and projects similar to Digital Twins address these barriers through modern approaches that promote cloud-native architectures, open standards, and distributed processing. Adhering to these initiatives, ROCS leverages state-of-the-art solutions to enable efficient and scalable EO data exploitation. By building a federated infrastructure based on the European collaborative efforts, this project seeks to bridge the gap between data availability and actionable insights for Romania’s specific needs.
Project Objectives and Scope
The main objective of ROCS is to establish a national EO data infrastructure that integrates seamlessly into the European Collaborative Ground Segment and supports Romania’s specific needs across multiple domains. The project’s scope focuses on several key objectives. First, the analysis of operational requirements by performing a broad assessment of operational needs in accordance with European and industry best practices; the underlying insight regarding requirements comes from the engagement of stakeholders across public institutions, academia and industry. Second, infrastructure and software design following a federated, distributed architecture built on open-source tools, incorporating cloud-native storage formats such as, but not limited to, STAC, COG, and Zarr, together with optimized environments to support scalable data ingestion, indexing and processing. Third, the design and implementation of modular tools for both raster and vector data indexing, leveraging Open Geospatial Consortium (OGC) standards. Necessarily, the inclusion of an established, robust security framework with centralised authentication and authorisation mechanisms such as OIDC is also a core consideration. Finally, the rapid deployment of our platform (even in pre-operational phases) so that it can be validated in real-world case studies. In order to aid in platform flexibility and operational efficiency, we will utilize containerised workspaces, which ensure secure data processing.
Technical and Scientific Contributions
The architecture of the ROCS platform comprises three distinct levels: Platform, Federation, and Application, ensuring a consistent ecosystem for EO data management and utilization. The platform level will be responsible for data ingestion, storage and primary processing.
It integrates existing EO datasets, such as Sentinel (1, 2, 3, 5P), using cloud-native storage solutions such as MinIO. At a federation level, we aim to establish a network of distributed data centres with harmonised capabilities, enabling shared processing across institutions. We leverage Kubernetes for container orchestration and scheduling, ensuring high computational efficiency. At the application level, we aim to provide tools for advanced analytics, visualization and development. End-users will be provided access to intuitive dashboards facilitating real-time data access.
Advanced Processing and Accessibility
The project emphasizes the “bring the user to the data” paradigm by adopting cloud-native storage and processing frameworks. Data replication and indexing will occur periodically, ensuring the most up-to-date information is accessible. By leveraging Kubernetes and OGC APIs, the platform will enable dynamic task execution, facilitating access to complex EO analyses for users without advanced technical expertise.
Case Studies: Real-world Applications
ROCS’s functionality and potential impact will be demonstrated through five specific case studies, showcasing its utility and societal relevance:
Climate Change Adaptation: Integration with the RO-ADAPT platform will strengthen Romania’s capacity for climate resilience by providing EO-driven insights for vulnerable ecosystems. The utilization of Sentinel-2 and Sentinel-5P data will inform localized adaptation strategies aligned with international pledges/guidelines.
Agricultural Policy Compliance: In collaboration with the Ministry of Agriculture, ROCS will facilitate automated crop classification processes, supporting compliance with new EU agricultural regulations. Sentinel-1 radar imagery and machine learning models will, potentially, help in the identification of issues associated with subsidy allocations.
Forestry Monitoring and Deforestation Control: The platform aims to support initiatives like the EUDR regulation and national forest monitoring programs, providing tools to detect deforestation activities and monitor biodiversity using Sentinel-1 and Sentinel-2 data.
Education and Skills Development: By integrating Earth Observation data into university programs, ROCS will encourage the next generation of geospatial professionals. Interactive tools like JupyterHub and Eclipse Che will enable hands-on learning, addressing gaps in EO data education and analytical skills.
High-Resolution Cloudless Mosaic: A seamless, cloud-free mosaic (or basemap) of the region will be generated using Sentinel-2 data. This foundational dataset will support various applications, including urban planning, disaster risk management, and environmental conservation.
Broader Impact and Relevance
The successful implementation of ROCS will position Romania as an important player within the European Collaborative Ground Segment. Beyond technical advancements, the project has significant societal implications:
Enhanced Policy Implementation: Supports national authorities in environmental monitoring, disaster response, and urban resilience planning. It aligns with European Green Deal objectives, supporting carbon neutrality and sustainable resource management.
Economic Growth: Catalyzes the development of geospatial services, opening new avenues for commercial applications in agriculture, forestry, and urban management.
Scientific Collaboration: Provides seamless access to EO data for researchers, fostering innovation and multidisciplinary studies.
Education and Awareness: Encourages data-driven learning and public engagement with EO technologies, ensuring long-term sustainability of knowledge transfer.
Alignment with the Living Planet Symposium Themes
The ROCS project follows the themes of the Living Planet Symposium by addressing the synergy between cutting-edge technology and real-world applications. By adopting cloud-native solutions and ensuring interoperability with European systems, ROCS demonstrates a commitment to advancing Earth science, bridging data with decision-making, and contributing to sustainability goals.
Conclusion
ROCS represents a transformative effort to advance Romania's EO data capabilities while enhancing its contributions to the European Collaborative Ground Segment. Through its innovative architecture, real-world applications, and societal relevance, the project aligns seamlessly with the goals of the Living Planet Symposium. It highlights the potential of Earth observation to address critical challenges in climate resilience, agricultural sustainability, and environmental protection, fostering a future of informed decision-making and scientific excellence.
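As a small illustration of the raster indexing and cloud-native cataloguing described above, the sketch below registers a single COG as a STAC Item with pystac. The identifiers, geometry, and storage path are placeholders, not ROCS data.

# Sketch only: placeholder ID, geometry, and asset location.
from datetime import datetime, timezone
import pystac

item = pystac.Item(
    id="S2_T35TLK_20240715_example",
    geometry={"type": "Polygon", "coordinates": [[[24.0, 45.0], [25.0, 45.0],
                                                  [25.0, 46.0], [24.0, 46.0], [24.0, 45.0]]]},
    bbox=[24.0, 45.0, 25.0, 46.0],
    datetime=datetime(2024, 7, 15, tzinfo=timezone.utc),
    properties={},
)
item.add_asset(
    "visual",
    pystac.Asset(
        href="s3://rocs-eo-data/S2_T35TLK_20240715_visual.tif",   # hypothetical object store path
        media_type=pystac.MediaType.COG,
        roles=["visual"],
    ),
)
print(item.id, item.assets["visual"].href)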
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: ORBIS: Earth Observation data service for NewSpace missions

#stac

Authors: Roman Bohovic, Jan Chytry
Affiliations: World from Space
In recent years, there has been a rapid expansion of satellite Earth Observation (EO) missions in the private sector through increasing affordability and accelerating adoption processes. The commercial EO market is expected to grow by 8% annually towards 2032. The NewSpace approach has enabled cost and time savings through high failure tolerance, development agility and utilizing off-the-shelf components, and led to serializing satellite production, including the EO instruments. The above trends are less noticeable in the midstream part of the value chain handling the immediate imaging results, i.e., calibration/validation (cal/val) activities, mission-to-use case analysis, scalable data ingestion and low-level processing. At the time of writing this abstract, only five to six commercial companies were known to exclusively focus on these services while having a repeatable product. Nevertheless, neglecting these areas effectively decreases the mission's value. In response, World from Space (WFS) has been developing Orbis, a cloud-based service for complete optical satellite data management based on close provider-customer cooperation throughout the mission lifetime. The service is primarily focused on NewSpace missions and intends to offer an easily approachable, cost-effective and highly scalable solution. The first phase of the development spun off from the work on the Czech Ambitious Missions framework, being able to collaboratively build knowledge and collect user needs. It further continued under the custom ESA project. The capabilities of the integrated Orbis prototype include both software and non-software (process-based/professional) services:
- Mission analysis - Early mission onboarding enables WFS to perform requirements and observation use cases analysis, consult the mission design and simulate data or performance.
- In-flight calibration/validation - It is possible to aid with the planning and assessment of analyses and perform experiments to adjust for imaging biases and errors, using appropriate reference data.
- Novel data ingestion and mission setup - Profiling the mission in the system and drawing from the existing settings and heritage enables continuous API-based ingestion of data in various processing stages.
- Image (pre)processing - This includes in-house rectification of sensing errors, radiometric, geometric and atmospheric corrections and higher-level processing operations to enhance usability and interoperability, possibly providing customers with ready-made EO intelligence.
- Quality assessment - It is thoroughly performed in performance validations and on the processed imagery and reported in data products.
- Product distribution, storage and archiving - Human- or machine-friendly interfaces and several standardized formats.
- Data management services - Overview of the services and interaction with missions, instruments, data and third-party customers in a lightweight browser interface.
Several key properties were realized while implementing the Orbis prototype to tackle the challenge of a commercially viable EO data system. The system’s modular architecture is based on flexible and loosely coupled components, minimizing dependencies and promoting standardized data exchange. The system entities are hierarchically structured to include satellite constellations, missions, instruments, and acquisitions, their processing pipelines and catalogued results.
The repeatability is secured by careful layerization of processing algorithms, creating a substrate for rapid prototyping, adjustments and re-contribution back to the system. To address the variable nature of EO data inputs, a rigid internal data model is defined, which is extensible by mission adapters. The distribution is based on the STAC standard. For a NewSpace system, it is also important to balance quality and cost, i.e., high-end mission capabilities with best-effort processing for missions with lower-end equipment or constrained budgets. The platform can scale with mission data throughput and satellite capabilities, offering both premium and cost-effective EO product solutions. Lastly, the cloud-based model of Orbis means that its deployment, operations and upgrades can be fully managed by WFS so that the mission is accompanied continuously throughout its lifetime. Orbis presents a technical response to the rapidly advancing EO domain in the NewSpace business, which requires (1) agnostic EO processing and data access expertise while (2) being able to aid the technical conception of the customer, (3) quickly utilizing modular entities, predefined interfaces and continuous deployment and (4) having flexible business models at various scales. In future development, it is planned to extend services to advanced payload-related analyses, complete ground and in-orbit calibration/validation, deployment agnosticity, highly secure operation modes and on-board data processing. All these efforts make Orbis a relevant solution for contemporary and future space-based observation challenges.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: FAO Essential Remote Sensing Data Product Portal for Agricultural Application Services

#stac #cog

Authors: Pengyu Hao, Zhongxin Chen, Karl Morteo, Ken Barron, Pedro MorenoIzquierdo, Valeria Scrilatti, Gianluca Franceschini, Battista Davide, Daniele Conversa, Carlo Cancellieri, Yohannis Bedane, Aya Elzahy, Ahmed AhmedHassan, Mohamed Megahed, Furkan Macit, Muhammad Asif, Noah Matovu
Affiliations: Food and Agriculture Organization of the United Nations
The increasing availability of satellite observations, combined with advances in cloud-based computing and improved land surface parameter retrieval models, has led to the development of a wide range of land surface products. However, these advancements have also introduced several challenges. First, because these products are generated by different teams, they often follow varying standards. For example, differences in definitions, algorithms, projections, tile systems, and scale factors can result in inconsistencies across datasets. Second, although accuracy assessments are typically conducted before data publication, significant disagreements often exist among products addressing the same topic. The lack of third-party evaluations further limits the usability of these satellite-derived products. Third, several platforms provide data visualization and processing functionalities, but a significant amount of high-quality data remains accessible only through data repositories. The absence of efficient search tools further restricts the practical application of these products. In this work, we propose a new data portal to address these limitations. First, we identified essential topics in the agricultural domain and selected global-level land surface data products derived from satellite observations. The products were primarily chosen based on whether they are consistent with FAO's definitions; for data quality control, both accuracy assessments reported by the data providers and third-party evaluation reports by our data evaluation team are considered. Second, we are developing a data portal using SpatioTemporal Asset Catalogs (STAC) built on FAO's geospatial data storage and service infrastructure (GIS Manager 2); all selected data are processed to uniform tiling systems and scale factors, and the file format is standardized as Cloud-Optimized GeoTIFF (COG). This portal provides efficient data search and download functionalities, enabling users to access data within the geospatial extent and temporal range of their interest. Furthermore, case studies demonstrating data analysis applications are included to promote the practical use of typical land surface products. The portal promotes high-quality, satellite-derived land surface products and provides efficient toolkits for data users. It is currently available at https://data.review.fao.org/remote-sensing-portal, has been presented to FAO for feedback, and hosts data products on topics such as cropland mapping, leaf area index, net primary production, and land surface phenology. We are actively working on adding more datasets and improving the portal's search and download functionalities, with the goal of formally launching it in 2025.
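As a sketch of how such a STAC-based portal can be queried programmatically, the snippet below uses the pystac-client library to search a catalogue by bounding box and date range and to list the COG assets of the matching items. The STAC endpoint and collection name shown here are illustrative assumptions, not the portal's published identifiers.

# Minimal sketch: querying a STAC API for land surface products (hypothetical endpoint and collection).
from pystac_client import Client

catalog = Client.open("https://data.review.fao.org/stac")  # hypothetical STAC API root

search = catalog.search(
    collections=["cropland-mapping"],          # hypothetical collection id
    bbox=[32.0, -2.0, 35.0, 1.0],              # lon/lat bounding box of interest
    datetime="2023-01-01/2023-12-31",          # temporal range of interest
    max_items=10,
)

for item in search.items():
    print(item.id, item.datetime)
    for name, asset in item.assets.items():
        # COG assets can be streamed directly by GDAL-based readers without a full download.
        if asset.media_type and "geotiff" in asset.media_type.lower():
            print("  ", name, asset.href)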
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Scalable and Automated Cloud-Based Pipelines for Earth Observation: Enhancing the Hellenic Ground Segment Infrastructure and Collaborative Support Activities

#cloud-native #stac #zarr #cog

Authors: Thanassis Drivas, Fotis Balampanis, Iason Tsardanidis, Ioannis Mitsos, Charalampos Kontoes
Affiliations: National Observatory Of Athens
The rapid expansion of Earth Observation (EO) data necessitates the development of robust, scalable solutions for storage, pre-processing, and advanced analytics. This presentation introduces a fully automated, end-to-end data pipeline leveraging cloud-native technologies to address these challenges, carried out in the context of supporting the DHR Network Evolution. Processing elements take into account the overall ESA Ground Segment Architecture and the traceability of the various processing steps afforded therein. Developed as part of the Copernicus Data Space Ecosystem, the pipeline integrates advanced orchestration frameworks, S3-compatible object storage, and cutting-edge Machine Learning (ML) algorithms to enable efficient processing of satellite data. Scalability is achieved through containerization and dynamic resource allocation, making the system adaptable for diverse analytical scales, ranging from localized to global assessments. The pipeline automates the generation of Analysis Ready Data (ARD) utilizing modern data formats such as Cloud-Optimized GeoTIFFs (COGs) and Zarr. Building on this foundation, sophisticated algorithms and state-of-the-art AI models are employed to develop advanced applications, including cloud-gap interpolation, grassland mowing detection, and crop classification. These applications unlock deeper insights from EO data, transforming it into actionable intelligence. Following the pre- and post-processing steps, SpatioTemporal Asset Catalogues (STAC) are utilized to ensure EO data and derived products are accessible, interoperable, and usable by the broader scientific and operational community accessing the Ground Segment facilities. Ingesting Level-2 and Level-3 products into a STAC catalogue not only supports the reproducibility of research but also fosters collaboration and accelerates innovation, transforming insights into validated services. Overall, this study highlights how relay data hubs leveraging cloud infrastructure and AI scale up EO applications to address global challenges and support informed decision-making across diverse sectors and stakeholders, such as environmental monitoring, energy, disaster response, and sustainable agriculture.
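As an illustrative sketch of producing ARD in the cloud-optimised formats mentioned above (not the project's actual pipeline code; the file paths, variable name and CRS are assumptions), rioxarray and xarray can export a processed scene both as a Cloud-Optimized GeoTIFF and as a Zarr store:

# Illustrative ARD export step (assumed inputs and paths).
import xarray as xr
import rioxarray  # registers the .rio accessor on xarray objects

# Assume 'scene.nc' holds a processed, georeferenced reflectance band with x/y dimensions.
da = xr.open_dataset("scene.nc")["reflectance"]
da = da.rio.write_crs("EPSG:32634")  # ensure the CRS is recorded on the array

# Cloud-Optimized GeoTIFF: internally tiled with overviews, suitable for HTTP range requests.
da.rio.to_raster("scene_ard.tif", driver="COG")

# Zarr: chunked array store, convenient for time-series stacking and parallel access.
da.to_dataset(name="reflectance").chunk({"x": 1024, "y": 1024}).to_zarr("scene_ard.zarr", mode="w")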
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Reuse of Copernicus Reference System for Earth Explorer missions

#cloud-native #stac

Authors: Espen Bjorntvedt, Alessandra Rech
Affiliations: ESA ESRIN, CS GROUP
This presentation describes the evolution of the Copernicus Reference System (COPRS), initially conceived as a mission-specific solution within the Copernicus program for the Sentinel-1, -2, and -3 missions, into a highly adaptable Generic Processing Orchestration System (GPOS) for ESA Earth Explorers and other scientific EO missions, ready to be adopted as the reference processing system within the new ESA EO Framework for Earth Observation Science Missions. Traditionally, within a mission-specific Payload Data Ground Segment (PDGS), the Processing Orchestration System is one of the core sub-systems of the downstream chain and was tailored to the requirements of specific missions, often resulting in tightly coupled designs. COPRS' original design already deviated from this schema, adopting a generic, modular architecture that separated the underlying framework from mission-specific functions and data processors. By leveraging loosely coupled microservices orchestrated by a workflow manager, COPRS enabled seamless integration of additional capabilities without disrupting existing components. From its inception, COPRS was designed as cloud-native, integrating the scalability and efficiency of cloud environments. This foundation allows it to support diverse use cases, from the high-volume Sentinel missions to smaller-scale nanosatellite and demonstration missions. Its ability to dynamically scale processing nodes based on throughput needs provides both flexibility and cost-efficiency, addressing mission peaks and minimizing data production costs. Additionally, the system natively tackles cloud-specific constraints, such as shared data access, through innovative solutions tailored to meet the Sentinel missions' stringent performance and data volume demands. This robust starting point made COPRS a natural candidate for ESA's vision of a Generic Processing Orchestration System for the new ESA EO Framework for Earth Observation Science Missions. While the Earth Explorer missions present new challenges and requirements, the generic design of COPRS proved to be highly adaptable, allowing seamless integration of processors from the CryoSat-2, EarthCARE, and Swarm missions through specific configurations. The first version of GPOS, validated on the above-mentioned missions, is now available and ready to be operationalized. Built on top of Kubernetes, the system supports deployment on private or public clouds, ensuring platform independence. The integration of modern standards, such as standardized workflow languages and STAC catalogs, simplifies processor integration and data accessibility. Finally, as Free Open-Source Software, the system is ready to power future Earth observation missions while benefiting from community contributions and collaborative enhancements.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: ProsEO - A Cloud Native Processing Framework for EO Data Processing

#cloud-native

Authors: Peter Friedl, Anett Gidofalvy, Maximilian Schwinger, Frederic Raison, Nicolas
Affiliations: German Aerospace Center (DLR e.V.)
The increasing complexity of upcoming Earth Observation (EO) research missions, particularly those within ESA's Earth Explorers program, demands innovative and sustainable solutions for ground operations. These missions, characterized by higher data volumes, intricate processing algorithms, and the need for synergetic processing with data from Copernicus and international partners, impose significant challenges on IT infrastructure. In addition to the technical demands, there is a growing imperative to address sustainability by minimizing environmental impacts while meeting the user community's expectations for collaborative and efficient data exploitation. We introduce ProsEO (Processing System for Earth Observation), a cloud-native processing system designed to respond to these challenges. Built on a microservices architecture, ProsEO provides a scalable and flexible solution for EO data processing across diverse cloud environments. Its advanced capabilities include intelligent dependency analysis between EO products and dynamic optimization of production workflows based on input data availability. By integrating resources from multiple cloud providers, ProsEO ensures efficient use of IT infrastructure while reducing duplication of resources, thereby contributing to sustainable ground operations. ProsEO exemplifies a shift towards environmentally conscious EO ground systems through its ability to streamline data workflows and maximize computational efficiency. Its modular design facilitates seamless integration of new missions and data sources, ensuring the long-term sustainability of ground operational frameworks. ProsEO is capable of meeting the requirements of online mission data processing as well as major reprocessing campaigns. We will detail ProsEO's technical architecture, highlighting its use of containerized microservices, orchestration technologies, and its ability to handle large-scale data dependencies. Through real-world use cases, we will demonstrate how ProsEO optimizes data processing and exploitation workflows and reduces costs while addressing the increasing complexity of EO missions. We aim to foster discussions on sustainable solutions for EO ground systems, showcasing ProsEO and giving insights into the role of innovative technologies in shaping the future of EO research mission ground frameworks.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Earth Observation data for Environmental Monitoring and Maritime Situational Awareness in the Black Sea

#cog

Authors: Marius Budileanu, Ionuț Șerban, Vasile Craciunescu, Sorin Constantin, Michela Corvino
Affiliations: Terrasigna, ESA
In recent years, the Black Sea has become one of the most important navigation areas in the world. Given the general context of the Black Sea area, navigation safety and the risk of polluting accidents have led to the need for better monitoring of maritime traffic. A new, innovative platform for data processing, integration, and visualization for situational awareness in the Black Sea will be showcased. The main objective of the platform is to semi-automatically detect ships in the area of interest and provide a brief characterization of these vessels (e.g. length, bearing). The platform benefits from the SAR data provided by the Copernicus Sentinel-1 mission, which allows the extraction of information on maritime traffic in all weather conditions. Optical images (such as Sentinel-2 data), together with other SAR-derived products, are also taken into account to minimize the gaps between Sentinel-1 revisits. Automatic Identification System (AIS) data is used for correlation with targets obtained from Earth Observation (EO) to derive different types of information. These can refer to vessel speed over ground (SOG), course over ground (COG) or its maritime mobile service identity (MMSI). The correlation module is also used to detect anomalies in ships' navigation, such as out-of-path trajectories or an AIS broadcaster that has been turned off. All the above-mentioned modules operate in a cloud platform - EO4BSP - that integrates state-of-the-art technologies with open access based on OGC-compliant standards and a user-friendly web interface.
Add to Google Calendar

Thursday 26 June 13:30 - 13:50 (EO Arena)

Demo: C.06.15 DEMO - InSAR Time Series Benchmark Dataset Creation by a new Open-Source Package (AlignSAR)

#zarr

The demonstration would be as follows:

(1) Introduce the AlignSAR project:
The AlignSAR package is a new tool for creating SAR signatures. It is open-source software that can provide datacubes with InSAR time series signatures. The primary objectives of AlignSAR are: (1) to provide a full and FAIR-guided InSAR time series datacube; and (2) to containerise the entire workflow so that it is easily accessible to the SAR community. The utility of such datasets for ML applications is evaluated using the example of deformation change detection, recognizing spatial and temporal changes in InSAR signals.

(2) Discuss the implementation of the solution:
The AlignSAR package is demonstrated on one use case, Campi Flegrei, a volcanic area in Italy. The main workflow is separated into three stages: (a) downloading and processing interferograms using LiCSBAS (LiC Small Baseline Subset); (b) spatial and temporal SAR signature extraction and datacube production; and (c) detecting deformation changes in the generated datacubes using LiCSAlert. The AlignSAR package uses the LiCSBAS and LiCSAlert tools to generate interferograms and identify anomalies in time series signatures. Moreover, additional extensions are discussed that utilize the capabilities of these tools to achieve the project's goals.

(3) Audience questions (Q&A)

We conclude that the AlignSAR package presented here is an extension of the previous version, which was focused on basic SAR signature extraction. Together, these components provide a comprehensive and consistent procedure for creating SAR datasets in standard formats such as Zarr. These datasets can be used for various ML applications created by end users, such as change detection tasks or land use classification. All developed tools and sample datasets are available in the AlignSAR GitHub repository (https://github.com/alignsar/alignsar).
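As a minimal sketch of how such a Zarr-format InSAR datacube might be consumed downstream (the store path, variable and dimension names are illustrative assumptions, not the actual AlignSAR layout), xarray can open the cube lazily and slice it in space and time:

import xarray as xr

# Open a (hypothetical) AlignSAR-style datacube stored as Zarr; chunks are read lazily.
ds = xr.open_zarr("alignsar_campi_flegrei.zarr")

# Assumed variable and dimension names, for illustration only.
displacement = ds["cum_displacement"]          # e.g. cumulative LOS displacement time series
subset = displacement.sel(time=slice("2020-01-01", "2021-01-01"))
mean_velocity = subset.mean(dim="time")        # simple temporal aggregate per pixel
print(mean_velocity)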

Speakers:


  • Milan Lazecky - University of Leeds
  • Zachary Kiernan - Starion Italia S.p.A
Add to Google Calendar

Thursday 26 June 10:30 - 10:50 (EO Arena)

Demo: D.03.20 DEMO - Cubes & Clouds 2.0 – A Massive Open Online Course for Cloud Native Open Data Sciences in Earth Observation

#cloud-native #stac #pangeo

The Cubes & Clouds 2.0 online course offers vital training in cloud-native open data sciences for Earth Observation (EO). In this 20-minute demonstration, participants will gain insights into the course structure and content, which includes data cubes, cloud platforms, and open science principles. The session will highlight hands-on exercises utilizing Copernicus data, accessed through the SpatioTemporal Asset Catalog (STAC), and showcase the openEO API and Pangeo software stack for defining EO workflows.

Attendees will also learn about the final collaborative project, where participants contribute to a community snow cover map, applying EO cloud computing and open science practices. This demonstration is ideal for Earth Science students, researchers, and Data Scientists looking to enhance their skills in modern EO methods and cloud platforms. Join us to explore how Cubes & Clouds equips learners with the tools to confidently conduct EO research and share their work in a FAIR manner.
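To give a flavour of the kind of openEO workflow the course teaches (a minimal sketch, not course material; the backend URL and collection/band identifiers are common Copernicus Data Space Ecosystem examples and are assumptions here), the openEO Python client can define and run a small cloud-side computation:

import openeo

# Connect to an openEO backend; CDSE hosts one of the backends used in this ecosystem.
connection = openeo.connect("https://openeo.dataspace.copernicus.eu").authenticate_oidc()

# Build a small data cube: Sentinel-2 bands over an area and time range of interest.
cube = connection.load_collection(
    "SENTINEL2_L2A",
    spatial_extent={"west": 11.0, "south": 46.4, "east": 11.2, "north": 46.6},
    temporal_extent=["2023-01-01", "2023-03-31"],
    bands=["B04", "B08"],
)

# Reduce over time and download the result; the heavy lifting happens server-side.
composite = cube.reduce_dimension(dimension="t", reducer="mean")
composite.download("mean_composite.nc")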

Speakers:


  • Dolezalova Tyna - EOX IT Services GmbH
  • Claus Michele - Eurac Research
  • Zvolenský Juraj - Eurac Research
Add to Google Calendar

Thursday 26 June 11:15 - 11:35 (EO Arena)

Demo: D.03.27 DEMO - openEO by TiTiler: Demonstrating Fast Open Science Processing for Dynamic Earth Observation Visualization

#stac

This demonstration aims to highlight our streamlined implementation of openEO by TiTiler, known as titiler-openEO (https://github.com/sentinel-hub/titiler-openeo), which has been developed through a collaborative effort between Sinergise and Development Seed.

In contrast to conventional openEO implementations that often involve extensive datacube processing and asynchronous workflows, titiler-openEO is designed to emphasize synchronous processing and dynamic visualization of raster data. We believe this approach will enhance the user experience and efficiency in handling raster datasets.

The session will highlight the key innovations of our approach:
- Synchronous Processing: Real-time execution of process graphs for immediate visualization
- ImageData-Focused Model: Simplified data model optimized for raster visualization
- Fast, Lightweight Architecture: Built on TiTiler and FastAPI without additional middleware
- Streamlined Deployment: Easily deployable for quick prototyping and visualization
- Early Data Reduction: Intelligent data reduction techniques to minimize processing overhead

We will demonstrate practical applications directly integrated in the Copernicus Data Space Ecosystem using the new catalog of Sentinels data, showing how titiler-openEO can transform complex Earth Observation workflows into lightweight, interactive visualizations. Attendees will see how this implementation complements existing openEO backends for common visualization needs.

This demonstration is particularly relevant for users wanting to quickly prototype and validate algorithms without the overhead of a complex processing backend setup. We'll show how titiler-openEO can be integrated with existing EO platforms and STAC catalogs to provide immediate visual feedback for data analysis.
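To illustrate the synchronous model described above (a sketch only: the deployment URL is hypothetical, the process graph is deliberately trivial, and the supported collections and processes depend on the titiler-openEO deployment), a client can POST an openEO process graph to the standard synchronous /result endpoint and receive the rendered image directly in the response:

import requests

endpoint = "https://titiler-openeo.example.com/result"  # hypothetical deployment URL

# A deliberately minimal openEO process graph: load a collection and save the result as PNG.
payload = {
    "process": {
        "process_graph": {
            "load1": {
                "process_id": "load_collection",
                "arguments": {
                    "id": "SENTINEL2_L2A",  # assumed collection id
                    "spatial_extent": {"west": 11.0, "south": 46.4, "east": 11.2, "north": 46.6},
                    "temporal_extent": ["2024-06-01", "2024-06-30"],
                    "bands": ["B04", "B03", "B02"],
                },
            },
            "save1": {
                "process_id": "save_result",
                "arguments": {"data": {"from_node": "load1"}, "format": "PNG"},
                "result": True,
            },
        }
    }
}

resp = requests.post(endpoint, json=payload, timeout=120)
resp.raise_for_status()
with open("preview.png", "wb") as f:
    f.write(resp.content)  # synchronous execution returns the rendered image bytes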

Speakers:


  • Emmanuel Mathot - DevelopmentSeed
  • Vincent Sarago - DevelopmentSeed
Add to Google Calendar

Thursday 26 June 16:30 - 16:50 (EO Arena)

Demo: D.04.25 DEMO - Codeless EO data analysis with openEO, leveraging the cloud resources of openEO platform straight from your web browser

#stac

This demo aims to give a general introduction to the core concepts of openEO and connect them with a live demo using the openEO Web Editor, highlighting the generation of workflows based on the openEO user-defined process (UDP) concept without any coding skills. The demo will operate on openEO Platform and illustrate the ease with which anyone can create workflows for analyzing EO data without having to take care of data management or writing scalable, parallelized and optimized code. The demo will be hosted by Alexander Jacob from Eurac Research and Matthias Mohr from Matthias Mohr - Softwareentwicklung.

Demo Content & Agenda

1.) Introduction & Overview
a.) Introduction to the openEO API: functionalities and benefits
b.) Data cubes concepts and documentation review
2.) Transitioning to Cloud Processing
a.) Challenges and advantages of moving from local processing to cloud environments
b.) Overview of cloud providers (VITO Terrascope, EODC, SentinelHub) and their integration with openEO Platform & CDSE
c.) Key concepts of FAIR (Findable, Accessible, Interoperable, Reusable) principles implemented by openEO
d.) STAC: how the SpatioTemporal Asset Catalog allows interoperability

Live Demo with openEO
1.) Accessing and using the openEO Web Editor
2.) Discovering and accessing EO datasets and processes
3.) Generating workflows using the openEO Web Editor
4.) Processing workflows
5.) Managing and checking the status of submitted jobs
6.) Visualizing results

Speakers:


  • Alexander Jacob - EURAC
  • Matthias Mohr
Add to Google Calendar

Thursday 26 June 13:07 - 13:27 (EO Arena)

Demo: D.04.32 DEMO - KForge: enable close-to-real-time EO for all - from a demonstrator to a scalable European capability

#cloud-native

European institutions are calling for a "big bang" in space strategy. The focus is on boosting the competitiveness of the industry, fostering dual-use innovation and leveraging commercial capabilities, and KForge is a concrete step forward. It is the backbone of the ESA Close-to-Real-Time Ship Detection Platform demonstrator. KForge is a secure, cloud-native PaaS solution that aims to radically simplify EO data processing. It allows mission operators to land data directly from ground station networks into a pre-configured cloud environment, ready for near-real-time analytics across all application domains, from environmental and scientific to security and defence.

KForge contributes to the effort to lower the technical and economic barriers to EO, enables broader access to data and accelerates use-case development. From climate monitoring to disaster response and situational awareness, access to cost-optimised, timely data is critical. Designed with sovereignty and cost-efficiency in mind, the platform is built to scale beyond its demonstrator role. Future deployments will support institutional missions meeting European sovereign cloud environment requirements, offering a robust and modular processing infrastructure fit for New Space and legacy missions alike.

KForge is a practical enabler of Europe’s strategic autonomy, demonstrating how commercial innovation can empower institutional goals while democratising the benefits of EO.

Speakers:


  • Romain Poly - KSAT
Add to Google Calendar

Thursday 26 June 14:37 - 14:57 (EO Arena)

Demo: E.03.04 DEMO - GMV Prodigi: Cloud-Native EO Data Processing as a Service – Global Launch on AWS Marketplace

#cloud-native

We propose a demonstration session at the Living Planet Symposium 2025 to show the worldwide launch of GMV Prodigi®, an innovative Ground Segment as a Service (GSaaS) solution available on the AWS Marketplace. Developed under the ESA InCubed program, GMV Prodigi is a fully cloud-based framework running on AWS Cloud, providing scalable, efficient, and cost-effective Earth Observation (EO) data processing.
This solution is the result of a strategic alliance between AWS and GMV, combining GMV’s expertise in EO ground segment solutions with AWS’s cloud infrastructure and advanced computing capabilities. GMV Prodigi enables users to process EO data directly on AWS Cloud without requiring data movement, ensuring security, flexibility, and high performance for satellite operators, EO service providers, and the scientific community.
The session will feature a live demonstration, highlighting:
1. Seamless EO data processing directly on AWS Cloud – executing real-time workflows.
2. Scalability & automation – adapting to different missions, constellations, and user needs.
3. Cost and resource optimization – accelerating time-to-market with AWS-powered efficiency.
As the official global launch event, the Living Planet Symposium provides a unique opportunity for the EO community to explore this state-of-the-art cloud-native solution, designed to revolutionize EO data exploitation through the power of AWS cloud computing.


Speakers:


  • Jorge Pacios Martinez – GMV Prodigi Product Owner
  • Vital Teresa – Ground Segment Business Manager
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Hall E2)

Session: D.01.08 4th DestinE User eXchange - Addressing Data and Service Needs

#zarr

Addressing Data and Service Needs

Maximising the impact of DestinE requires that its data products and services align with what users across science, policy, and industry need.

This session will explore the challenges and opportunities of accessing and using data available through DestinE, combining technical insights with real-world user developments. Participants will gain an overview of the different ways to access Digital Twin data and learn about data-oriented services. Attendees will also hear from users who have developed applications or contributed data to the DestinE system. The session will conclude with an open discussion on data formats and upcoming developments.

Introduction to the session by presenting DestinE Data offering


  • Danaële Puechmaille - EUMETSAT

How to access DestinE data? • HDA • Polytope • Platform Services


  • Michael Schick - EUMETSAT
  • Tiago Quintino - ECMWF
  • Inés Sanz Morere - ESA

Serve DestinE users with near data computing capabilities (EDGE services)


  • Miruna Stoicescu - EUMETSAT

AI4Clouds application demonstrator using DestinE


  • Fernando Iglesias - Predictia Intelligent Data Solutions SL

Visualizing data in DestinE


  • Barbara Borgia - ESA

A collaborative toolbox to build and share your digital twin components – Delta Twin


  • Claire Billant - Gael Systems

Moderated discussion:


  • Data format challenges (NetCDF, Zarr, etc.)
  • New developments
  • Data quality
  • Training data and ML models
  • Contribute to Data Portfolio
Add to Google Calendar

Thursday 26 June 16:15 - 17:45 (Hall K1)

Presentation: Introducing the OGC API – Discrete Global Grid Systems Standard for Enhanced Geospatial Data Interoperability

#zarr

Authors: Matthew Brian John Purss, Jérôme Jacovella-st-louis, Alexander Kmoch, Wai Tik Chan, Peter Strobl
Affiliations: Pangea Innovations Pty Ltd, Ecere Corporation, Landscape Geoinformatics Lab, Institute of Ecology and Earth Sciences, University of Tartu, European Commission Joint Research Centre
The advent of the OGC API – Discrete Global Grid Systems (DGGS) Part 1: Core Standard marks a significant evolution in geospatial data handling, promising to streamline the integration and retrieval of spatial data through an innovative, standardized API framework. This candidate standard is designed to facilitate the efficient retrieval of geospatial data, organized according to a Discrete Global Grid Reference System (DGGRS), tailored for specific areas, times, and resolutions. It emerges as a robust solution aimed at overcoming the complexities traditionally associated with projected coordinate reference systems. A DGGS represents the Earth through hierarchical sequences of tessellations, offering global coverage with progressively finer spatial or spatiotemporal refinement levels. This well-defined hierarchical structuring allows each data sample to be precisely allocated within a DGGRS zone that reflects the location, size, and precision of the observed phenomenon. This simplifies the aggregation and analysis of spatial data, enhancing capabilities for detailed statistical analysis and other computational operations. Rooted in the principles outlined in OGC Abstract Specification Topic 21, the OGC API – DGGS candidate Standard introduces a comprehensive framework for accessing data organized via DGGRS. This API is not merely a repository access point but a dynamic interface that supports complex querying and indexing functionalities integral to modern geospatial data systems. The standard specifies mechanisms for querying lists of DGGRS zones, thus allowing users to seamlessly locate data across vast datasets or identify data that corresponds to specific queries. This is achieved through the integration of HTTP query parameters combined with advanced filtering capabilities offered by the OGC Common Query Language (CQL2). Moreover, the candidate standard advocates for multiple data encoding strategies, accommodating a variety of data types and formats. It supports the retrieval of DGGS data using the widely adopted JSON encoding formats and additional requirements classes to enable raster or vector data indexed to DGGRS zones. Additionally, it provides compact binary representations for both zone data and zone lists in UBJSON and Zarr, enhancing data transmission efficiency and processing speed. Traditional indexed geospatial data formats are also supported for interoperability. The OGC API – DGGS candidate standard also includes an informative annex providing a JSON schema that describes a DGGRS, coupled with practical examples of DGGRS definitions. This annex serves as a valuable resource for developers and system architects aiming to implement the standard, offering guidance and examples that demonstrate the versatility and applicability of the DGGS approach. By defining a uniform standard for DGGS APIs, this initiative paves the way for a new era of geospatial data exchange and indexing. It addresses the growing challenges of managing massive geospatial datasets in today's digital age, promising enhanced interoperability, precision, and efficiency in geospatial data services. As the candidate Standard moves along the OGC standardization process and becomes more widely implemented in geospatial software tools, OGC API – DGGS is poised to become a cornerstone in the geospatial science and industry, fostering a more interconnected and accessible digital Earth.
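As an illustration of the access pattern the candidate standard describes (a sketch only: the server URL, collection and DGGRS identifiers are placeholders, and the paths, parameters and response layout follow the draft specification at the time of writing and may evolve), a client might list the zones of a collection intersecting a bounding box at a given refinement level and then retrieve the data for one zone:

import requests

base = "https://example.org/ogcapi"  # placeholder OGC API server

# List DGGRS zones of a collection intersecting a bounding box at a chosen refinement level
# (endpoint layout per the draft OGC API - DGGS specification; subject to change).
zones = requests.get(
    f"{base}/collections/landcover/dggs/ISEA3H/zones",
    params={"bbox": "5.0,45.0,6.0,46.0", "zone-level": 7, "f": "json"},
    timeout=60,
).json()

# Retrieve the data encoded for the first returned zone (e.g. as DGGS-JSON).
zone_id = zones["zones"][0]
data = requests.get(
    f"{base}/collections/landcover/dggs/ISEA3H/zones/{zone_id}/data",
    params={"f": "json"},
    timeout=60,
).json()
print(zone_id, list(data))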
Add to Google Calendar

Thursday 26 June 16:15 - 17:45 (Hall K1)

Presentation: Highly Scalable Discrete Global Grid Systems Based on Quaternary Triangular Mesh and Parallel Computing

#zarr

Authors: Davide Consoli, Daniel Loos, Luís Moreira de Sousa, Tomislav Hengl
Affiliations: OpenGeoHub Foundation, Max Planck Institute for Biogeochemistry, Instituto Superior Técnico
Discrete global grid systems (DGGS) can be used to efficiently store and access rasterized Earth observation data, including satellite images and derived products. One of the main advantages lies in the fact that, compared to data stored in standard projections like the Universal Transverse Mercator (UTM) or WGS84, DGGS minimize area distortions and avoid data replication in regions near the poles. For large-scale datasets, including the Sentinel-2 collection, this translates into a potential saving of petabytes of data storage [1]. In addition, using uniform cell sizes for tessellation facilitates analysis, derivation of spatial statistics, and application of spatial filters [2]. Finally, when implemented using hierarchical structures, they can perform operations such as point querying in logarithmic complexity, improving scalability at higher spatial resolutions. Despite their potential, DGGS are still not widely adopted in the geoscience remote sensing community. One of the main bottlenecks is that most of the data currently produced by the community are stored as geo-referenced images, typically in formats like GeoTIFF. Furthermore, most software used by scientists and developers does not yet support DGGS data. To enable a transition to DGGS within the community, it is essential to have libraries that include fast I/O methods, allowing reciprocal conversion between rasters in standard projections and DGGS data structures. We propose a strategy based on DGGS that can effectively perform I/O operations from and to standard raster formats. Using a triangular tessellation in a hierarchical structure with aperture 4, derived from the geodesic subdivision of an icosahedron, each node of the 20 quad-trees is univocally associated with an integer sequential index. Each sequential index can be translated into a hierarchical index represented as a vector whose size corresponds to the node level and whose entries are integers from 0 to 3, with the exception of the second level, which identifies one of the 20 quad-trees, and the first level, which holds only the root index 0. The area non-uniformity of this tessellation, measured as the standard deviation of cell areas normalized by their mean, saturates around 0.086 for a high number of subdivisions. Similar approaches, like the Quaternary Triangular Mesh (QTM), have already been proposed in the literature [3] and implemented for large-scale applications [4]. The main novel contributions of our work lie in its performance and scalability. Targeting a highly parallel implementation, each sequential index is associated with an independent process that can easily communicate with its parent and children processes. By simply converting its sequential index to the hierarchical one, adding or removing one element from the vector (depending on the target), and converting the result back to a sequential index, each process can communicate with the related processes using, for instance, the Message Passing Interface (MPI). This results in a process topology composed of interconnected nested spheres that can be used to process geospatial data at different spatial resolutions in parallel. In addition, querying operations can be performed with exponential parallel efficiency and logarithmic complexity, as for standard quad-trees. This last characteristic allows associating a substantial number of pixel locations from input raster files with the DGG leaf cells in which they fall in feasible computational times.
After this operation, the associated pixel indices can be aggregated to a higher level of the DGGS, depending on the chunking size of the original images. The nodes and processes associated with the selected level are in charge of reading the required chunks of the input files, associating the pixels of interest with each leaf, and aggregating them when multiple pixels fall in a single leaf. These pixel chunks can be used for processing and then converted back to raster files. Best writing performance is achieved with file formats that allow parallel writing of data chunks, such as Zarr. Finally, relying on a meshing approach, the framework can be used to include elevation information directly in the meshed structure, enabling the use of the DGGS for applications such as hydrology modeling, electromagnetic scattering and Earth digital twins. [1] Bauer-Marschallinger, B., & Falkner, K. (2023). Wasting petabytes: A survey of the Sentinel-2 UTM tiling grid and its spatial overhead. ISPRS Journal of Photogrammetry and Remote Sensing, 202, 682-690. [2] Kmoch, A., Vasilyev, I., Virro, H., & Uuemaa, E. (2022). Area and shape distortions in open-source discrete global grid systems. Big Earth Data, 6(3), 256-275. [3] Dutton, G. (1989, April). Planetary modelling via hierarchical tessellation. In Proc. Auto-Carto (Vol. 9, pp. 462-471). [4] Raposo, P. (2022, November). Implementing the QTM discrete global grid system (DGGS). https://doi.org/10.5281/zenodo.7415011
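The sequential-to-hierarchical index conversion described in the abstract can be sketched as follows. This is an illustrative assumption of one possible level-by-level numbering scheme for 20 aperture-4 quad-trees, not the authors' actual implementation; it only demonstrates how dropping or appending one base-4 digit moves between a cell and its parent or children.

# Sketch of a sequential <-> hierarchical index mapping for a DGGS built from
# 20 aperture-4 quad-trees rooted on an icosahedron (illustrative assumption).

def level_start(level: int) -> int:
    """Sequential index of the first node at a given level (level 0 = root)."""
    if level == 0:
        return 0
    # 1 root plus 20 * (4^0 + ... + 4^(level-2)) nodes on the levels above.
    return 1 + 20 * (4 ** (level - 1) - 1) // 3

def sequential_to_hierarchical(seq: int) -> list[int]:
    """Return [0, tree, d2, ..., dL] with tree in 0..19 and digits in 0..3."""
    if seq == 0:
        return [0]
    level = 1
    while level_start(level + 1) <= seq:   # find the level this index belongs to
        level += 1
    offset = seq - level_start(level)       # position within that level
    tree, rest = divmod(offset, 4 ** (level - 1))
    digits = []
    for _ in range(level - 1):              # extract base-4 digits below the quad-tree root
        digits.append(rest % 4)
        rest //= 4
    digits.reverse()
    return [0, tree] + digits

def hierarchical_to_sequential(h: list[int]) -> int:
    if h == [0]:
        return 0
    level = len(h) - 1
    offset = h[1]                           # quad-tree id (0..19)
    for d in h[2:]:
        offset = offset * 4 + d
    return level_start(level) + offset

def parent(seq: int) -> int:
    """Parent cell: drop the last element of the hierarchical index."""
    return hierarchical_to_sequential(sequential_to_hierarchical(seq)[:-1])

# Example: sequential index 25 lives at level 2 in quad-tree 1; its parent is index 2.
assert sequential_to_hierarchical(25) == [0, 1, 0]
assert parent(25) == 2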
Add to Google Calendar

Thursday 26 June 16:15 - 17:45 (Hall K1)

Presentation: XDGGS: Integrating Xarray with Discrete Global Grid Systems for Scalable EO Data Analysis

#cloud-native #pangeo

Authors: Justus Magine, Benoit Bovy, Jean-Marc Delouis, Anne Fouilloux, Lionel Zawadzki, Alejandro Coca-Castro, Ryan Abernathey, Peter Strobl, Daniel Loos, Wai Tik Chan, Alexander Kmoch, Tina Odaka
Affiliations: LOPS - Laboratoire d'Oceanographie Physique et Spatiale UMR 6523 CNRS-IFREMER-IRD-Univ.Brest-IUEM, Georode, Simula, CNES, The Alan Turing Institute, Earthmover PBC, European Commission Joint Research Centre, Max Planck Institute for Biogeochemistry, University of Tartu, Institute of Ecology and Earth Sciences, Landscape Geoinformatics Lab
DGGS offer a systematic method for dividing the Earth's surface into equally sized, uniquely identifiable cells, enabling efficient data analysis at global scales. The XDGGS library integrates DGGS with the xarray framework, allowing users to work seamlessly with data mapped onto DGGS cells. Through XDGGS, users can select, visualise, and analyse data within a DGGS framework, utilising the hierarchy and numeric IDs of the cells for operations like up-/downsampling, neighbourhood search, and data co-location. The library also supports the computation of geographic coordinates for DGGS cell centres and boundaries, facilitating integration with traditional Geographic Information Systems (GIS). By providing a scalable and systematic approach to geospatial data analysis, XDGGS enhances the ability to work with large, multi-dimensional datasets in diverse scientific domains. It offers robust solutions for tasks such as data fusion, interpolation, and visualisation at global scales. This presentation highlights the potential of XDGGS for Earth Observation (EO) applications by:
- Simplifying access to DGGS workflows: embedding DGGS functionality within xarray objects lowers the barrier for adopting DGGS frameworks, fostering broader adoption across disciplines.
- Enabling scalable analysis: with xarray's support for Dask, XDGGS facilitates scalable processing of massive EO datasets on DGGS, making it ideal for cloud-native environments and large-scale scientific workflows.
- Supporting cross-disciplinary applications: through the Pangeo ecosystem, XDGGS promotes interoperability across scientific domains, offering use cases in global environmental monitoring, EO datasets, and data fusion with bio-geospatial datasets.
- Streamlining integration and visualization: combining xarray's user-friendly API with DGGS, XDGGS enables the rapid development of reproducible workflows, advanced visualizations, and real-time data interaction.
The presentation will include a demonstration of XDGGS applied to real-world EO datasets, showcasing its efficiency in handling complex global-scale analyses. This integration of xarray with DGGS provides a powerful tool for the EO community, empowering researchers and developers to tackle today's pressing environmental and societal challenges with innovative, scalable, and reproducible solutions.
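A minimal sketch of the xdggs usage pattern described above follows. The example file and variable names are assumptions, and the accessor API is still evolving, so method names may differ between library versions; this is not authoritative xdggs documentation.

import xarray as xr
import xdggs

# A dataset whose cell dimension carries DGGS cell ids, with the grid parameters
# recorded as attributes on the cell_ids coordinate (hypothetical example file).
ds = xr.open_dataset("dggs_indexed_data.nc")

# Attach a DGGS-aware index so the cell dimension can be used for spatial operations.
ds_idx = ds.pipe(xdggs.decode)

# Derive geographic coordinates of cell centres for plotting or GIS integration.
centers = ds_idx.dggs.cell_centers()

# Select the cells nearest to given latitude/longitude points.
sample = ds_idx.dggs.sel_latlon([46.5, 47.0], [11.3, 11.5])
print(centers, sample)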
Add to Google Calendar

Thursday 26 June 14:00 - 15:30 (Hall K1)

Presentation: Evolution of the CEOS-ARD Optical Product Family Specifications

#stac

Authors: Christopher Barnes, Dr. Ferran Gascon, Matthew Steventon, Ake Rosenqvist, Peter Strobl, Andreia Siqueira, Jonathon Ross, Takeo Tadono
Affiliations: KBR contractor to the U.S. Geological Survey (USGS), European Space Agency (ESA), Symbios Communications, solo Earth Observation (soloEO), Japan Aerospace Exploration Agency (JAXA), European Commission, Geoscience Australia
The CEOS Land Surface Imaging Virtual Constellation (LSI-VC) has over 20 members representing 12 government agencies and has served as the forum for developing the CEOS Analysis Ready Data (CEOS-ARD) initiative since 2016. In 2017, LSI-VC defined CEOS-ARD Product Family Specification (PFS) optical metadata requirements for Surface Reflectance and Surface Temperature that reduced the barrier to successful utilization of space-based data to improve understanding of natural and human-induced changes on the Earth system. This resulted in CEOS-ARD compliant datasets becoming some of the most popular types of satellite-derived optical products generated by CEOS agencies (e.g., USGS Landsat Collection 2, Copernicus Sentinel-2 Collection 1, the German Aerospace Center) and commercial data providers (e.g., Catalyst/PCI, Sinergise). Since 2022, LSI-VC has led the definition of two new optical PFSs (i.e., Aquatic Reflectance and Nighttime Lights Surface Radiance) and four Synthetic Aperture Radar (SAR) PFSs (i.e., Normalised Radar Backscatter, Polarimetric Radar, Ocean Radar Backscatter, and Geocoded Single-Look Complex), signifying recognition of the importance of providing satellite Earth observation data in a format that allows for immediate analysis. As of December 2024, eleven data providers have successfully achieved CEOS-ARD compliance, with a further 12 organizations either in peer review or under development for future endorsement. However, this has engendered a need for transparency, version control, and (most importantly) a method to facilitate consistency across the different PFSs and alignment with SpatioTemporal Asset Catalogs (STAC). Thus, all future PFS development will be migrated into a CEOS-ARD GitHub repository. This will facilitate broader input from the user community, which is critical for the optical specification to meet real-world user needs and ensure broader data provider adoption. CEOS agencies have concurred that now is the time, with the increased traceability and version control offered by GitHub, to parameterise the CEOS-ARD specifications and introduce an inherent consistency across all optical and SAR PFS requirements while benefiting from active user feedback. In this presentation, we will share the status of the optical PFS transition to GitHub, as well as a set of implementation practices/guidelines and a governance framework that will broaden the portfolio of CEOS-ARD compliant products so they can become easily discoverable, accessible, and publicly used.
Add to Google Calendar

Thursday 26 June 14:00 - 15:30 (Hall K1)

Presentation: Development of Analysis Ready Data Products for European Space Agency Synthetic Aperture Radar Missions

#stac #zarr #cog

Authors: Clement Albinet, Davide Castelletti, Fabiano Costantini, Mario Costantini, Francesco De Zan, Jonas Eberle, Paco Lopez Dekker, Juan M Lopez-Sanchez, Federico Minati, Muriel Pinheiro, Sabrina Pinori, David Small, Francesco Trillo, John Truckenbrodt, Antonio Valentino, Anna Wendleder, Marco Wolsza
Affiliations: ESA, Telespazio VEGA, B-Open Solutions s.r.l, Delta phi remote sensing GmbH, German Aerospace Center (DLR), Delft University of Technology, Universidad de Alicante, Serco, University of Zürich, STARION, Friedrich Schiller University Jena
The current family of Synthetic Aperture Radar (SAR) products from Sentinel-1 and TerraSAR-X contains primarily Level-1 Single Look Complex (SLC) and Ground Range Detected (GRD) data types [1][2][3], which inherited their definitions from the European SAR satellite missions ERS-1/2 and ENVISAT [4]. These products have proven to be reliable, high-quality data sources over the years. In particular, users largely benefit from the open and free data policy of the Copernicus programme (European Space Agency (ESA), European Commission). This has led to Sentinel-1 products being routinely used in several operational applications and to a substantial growth of the user base of SAR data in general. However, the rapid increase of data volume is presenting a challenge to many users who aim to exploit this wealth of information but lack the processing resources needed to convert these Level-1 products into interoperable geoinformation. Cloud solutions offer opportunities for accelerated data exploitation but require new strategies of data management and provision. As a consequence, the term Analysis Ready Data (ARD) was coined, and several activities have indicated the potential for extending the Earth Observation product family with such ARD products. With the aim of standardizing different categories of ARD, the Committee on Earth Observation Satellites (CEOS) has set up the CEOS Analysis Ready Data (CEOS-ARD) initiative. Within this context, Analysis Ready Data were defined as: »satellite data that have been processed to a minimum set of requirements and organized into a form that allows immediate analysis with a minimum of additional user effort and interoperability both through time and with other datasets.« A variety of SAR product specifications are currently being defined to provide guidelines on how best to process and organize data to serve as many use cases as possible with the respective products [5]. In this context, ESA and DLR decided to collaborate in order to define a family of SAR ARD products for Sentinel-1, TerraSAR-X, ROSE-L, ERS-1/2 and ENVISAT, potentially to be extended to other SAR missions. These products should be calibrated the same way (Radiometric Terrain Correction (RTC) [6]), denoised, projected and geolocated in order to allow immediate analysis by the users. The same gridding/tiling system (Military Grid Reference System (MGRS)) and the same Digital Elevation Model (Copernicus DEM) shall be used in order to allow interoperability with Earth Observation data from different missions. The use of Cloud Optimised GeoTIFF (COG) raster files or the Zarr format, VRT files and STAC metadata will enable efficient exploitation of these datasets in cloud-computing environments by allowing optimizations for cloud storage, enabling concurrent processing and selective data access. Finally, by using permissive open-source code and libraries to generate these new products, the processors will represent a considerable step toward Open Science. The current status of ARD product development for different ESA missions (Sentinel-1, ROSE-L, ERS-1/2, ENVISAT) and DLR missions (TerraSAR-X) and of the processing experiences will be presented, together with the plans for future missions like Sentinel-1 NG and BIOMASS. References: [1] ESA, “Sentinel-1 Product Specification”, version 3.9, 2021. https://sentinel.esa.int/documents/247904/1877131/Sentinel-1-Product-Specification-18052021.pdf/c2f9d58d-217f-e21d-548d-97a2cbd71e2b?t=1621347421421.
[2] Airbus, “TerraSAR-X Image Product Guide”, issue 2.3, March 2015. https://www.intelligence-airbusds.com/files/pmedia/public/r459_9_20171004_tsxx-airbusds-ma-0009_tsx-productguide_i2.01.pdf. [3] https://earth.esa.int/eogateway/instruments/sar-ers/products-information. [4] ESA, “ENVISAT-1 Products Specifications Volume 8: ASAR Products Specifications”, issue 4, Ref: PO-RS-MDA-GS-2009, 20 January 2012. https://earth.esa.int/eogateway/documents/20142/37627/Envisat-products-specifications-VOLUME-8-ASAR-PRODUCTS-SPECIFICATION.pdf/1fd5a0be-1634-06cc-9a1e-249874a6e3aa. [5] CEOS, “Analysis Ready Data for Land: Normalized Radar Backscatter”, version 5.5, 2021. https://ceos.org/ard/files/PFS/NRB/v5.5/CARD4L-PFS_NRB_v5.5.pdf. [6] Small, D. (2011). “Flattening Gamma: Radiometric Terrain Correction for SAR Imagery”. IEEE Transactions on Geoscience and Remote Sensing, 49, 3081-3093. https://doi.org/10.1109/TGRS.2011.2120616
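As a small illustration of the selective, cloud-optimised access pattern these formats enable (the product URL is a placeholder, not an actual ARD asset), a COG can be read over HTTP with rasterio, fetching only the byte ranges needed for a window of interest:

import rasterio
from rasterio.windows import Window

# Placeholder URL of an RTC backscatter COG asset referenced from a STAC item.
url = "https://example.com/S1_RTC_T33UVP_VV.tif"

with rasterio.open(url) as src:           # GDAL streams byte ranges via /vsicurl/ under the hood
    print(src.crs, src.res, src.count)
    # Read only a 512 x 512 pixel window instead of downloading the whole product.
    window = Window(col_off=1024, row_off=1024, width=512, height=512)
    gamma0 = src.read(1, window=window)
    print(gamma0.shape, float(gamma0.mean()))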
Add to Google Calendar

Thursday 26 June 16:15 - 17:45 (Hall L3)

Presentation: SharingHub: A Geospatial Ecosystem for Collaborative Machine Learning and Assets Management

#stac

Authors: Clément Guichard, Olivier Koko, Vincent Gaudissart, Brice Mora
Affiliations: CS Group
SharingHub is a comprehensive machine learning (ML) development ecosystem designed to empower collaboration, enhance productivity, and ensure secure management of artificial intelligence models and datasets. Inspired by platforms like Hugging Face, SharingHub offers similar collaborative features but caters specifically to the unique needs of geospatial data and AI scientists. Unlike in the traditional ML domain, our data are located in space and time, characteristics that are also reflected in the models themselves. For these reasons, traditional ML ecosystems lack some capabilities needed for the spatial domain. One of SharingHub's key strengths is its web portal, which facilitates the discovery, browsing, and download of ML models and datasets. Acting as a central repository for these resources, it simplifies sharing and collaboration among data scientists, researchers, and organizations, fostering innovation. The service is engineered with interoperability in mind, supporting industry standards such as Open Geospatial Consortium (OGC) standards and the SpatioTemporal Asset Catalog (STAC). Built on top of GitLab, SharingHub leverages GitLab's powerful version control, access management, and collaborative features. However, while GitLab excels in traditional software development, it lacks ergonomics tailored to ML workflows and geospatial domain needs. SharingHub bridges this gap by extending GitLab with a dedicated web portal designed specifically for AI researchers. The SharingHub ecosystem integrates with popular, well-adopted tools from the ML community, such as MLflow, Hugging Face datasets, and Data Version Control (DVC). This ensures smooth integration with various ecosystems, enabling users to work with familiar tools and frameworks while benefiting from SharingHub's enhanced capabilities. Through its integration with MLflow, SharingHub offers experiment tracking and model distribution for GitLab projects. Additionally, its DVC integration adds scalable, versioned data storage, which is essential for managing the large datasets commonly used in ML and geospatial projects. Together, MLflow and DVC streamline the end-to-end workflow of model and data management, allowing teams to focus on delivering insights rather than managing infrastructure. SharingHub also integrates with JupyterHub, enabling interactive exploration and experimentation with models and datasets. This functionality closes the gap between prototyping and production by allowing data scientists to test, validate, and refine their work in an interactive environment, enhancing both productivity and model quality. Furthermore, one of our objectives is to accelerate project initiation through the use of preconfigured, standardized templates for common ML project setups. These templates significantly reduce the time required to launch new projects, enhance reproducibility, and ensure adherence to industry best practices, which is particularly valuable for teams seeking consistency and efficiency across multiple projects. Finally, the integration with GitLab provides fine-grained Single Sign-On (SSO) access control, enabling centralized security and allowing teams to securely manage their large-scale datasets and sensitive models. As a member of the Earth Observation Exploitation Platform Common Architecture (EOEPCA) consortium, SharingHub serves as a core component of one of the European Space Agency (ESA) Building Blocks, the MLOps Building Block.
Its geospatial capabilities, such as support for OGC standards and STAC, set it apart from other ML hubs like Hugging Face. SharingHub is uniquely positioned as a geospatial-focused ML initiative. In essence, SharingHub is more than just a platform for managing models and datasets. It is a comprehensive solution that extends your GitLab instance with specialized tools designed for the geospatial and ML communities. By combining Git-based version control with specialized ML tools, SharingHub creates a unique ecosystem that supports the entire ML lifecycle, including collaboration, versioning, peer review, and model management, making it an essential solution for modern geospatial-oriented MLOps and promoting a culture of collaboration, efficiency, and continuous improvement for the ML ecosystem. The project, being part of the EOEPCA consortium, is open-source, meaning that you can deploy your own SharingHub targeting your own instance of GitLab and try it out. Links: - SharingHub main repository: https://github.com/csgroup-oss/sharinghub - EOEPCA MLOps Building Block: https://eoepca.readthedocs.io/projects/mlops/
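As a sketch of the MLflow-based experiment tracking described above (the tracking URI, experiment name, parameters and metric values are placeholders; consult the SharingHub documentation for the actual endpoint conventions), a training script could log its run to a SharingHub-backed MLflow server like this:

import mlflow

# Point the client at the (hypothetical) MLflow tracking server exposed by SharingHub.
mlflow.set_tracking_uri("https://sharinghub.example.com/mlflow")
mlflow.set_experiment("flood-mapping-unet")  # placeholder experiment name

with mlflow.start_run(run_name="baseline"):
    # Log hyperparameters and evaluation metrics of a training run.
    mlflow.log_params({"learning_rate": 1e-3, "epochs": 20, "backbone": "resnet18"})
    mlflow.log_metric("val_iou", 0.81)
    # Store the trained weights as a run artefact for later retrieval and deployment.
    mlflow.log_artifact("model_weights.pt")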
Add to Google Calendar

Thursday 26 June 16:15 - 17:45 (Hall L3)

Presentation: AIOPEN – Platform and Framework for developing and exploiting AI/ML Models

#cloud-native

Authors: Leslie Gale, Bernard Valentin
Affiliations: Space Applications Services
AI/ML is a transversal skill set that is being applied to EO application development. While AI developers and the EO science and service developer communities are now engaged in developing models offering new possibilities with predictive capabilities, the disruptive nature of AI/ML has also impacted platforms designed to facilitate conventional analytic algorithm development and exploitation. Although current AI/ML development environments have come a long way, they still face several shortcomings that hinder their effectiveness and accessibility. AI/ML ecosystems are fragmented, with multiple frameworks (e.g., TensorFlow, PyTorch, Scikit-learn) and tools often lacking seamless interoperability, and support for creating integrated applications is lacking. Furthermore, reproducibility of experiments is hampered by inconsistencies in environment setups, dependency versions and data pipelines, and the burden of managing large datasets makes collaboration, scientific rigour, publishing and exploitation of models cumbersome. AIOPEN provides state-of-the-art, end-to-end AI model development lifecycle support, tackling and solving interoperability as far as is possible using existing technologies. AIOPEN provides a solution that is a significant step towards harnessing the power of AI/ML technologies for the advancement of Earth Observation data analyses. To do so, AIOPEN utilizes cutting-edge technology and a cloud-native approach to address challenges in big data management, access, processing, and visualization. The paper will present the work performed in an ESA-funded project to extend the Automated Service Builder (ASB), a cloud-hosting-infrastructure- and application-agnostic framework for building EO processing platforms developed by Space Applications Services. We discuss problems encountered and solutions created, showing how an existing EO data processing platform, EOPEN, developed using ASB and hosted in the ONDA Data and Information Access Service (DIAS), uses extensions developed for ASB to fully support AI/ML developers. Besides the ASB framework (https://asb.spaceapplications.com), other frameworks, services and tools are integrated, including from ESA's EOEPCA (https://eoepca.org) and AI4DTE (https://eo4society.esa.int/projects/ai4dte-software-stack/) projects, OGC services, the tracking server MLflow (https://mlflow.org) and the inference server MLServer (https://www.seldon.io/solutions/seldon-mlserver). AIOPEN is a robust platform providing collaboration services, allowing seamless model and data sharing, and efficient search across local and remote catalogues. It facilitates the hosting and sharing of models and training data, training of AI models, integration into new applications via standard interfaces, and effective management and tracking of AI assets, offering scientists and industry professionals public services capable of bringing together processing and data access capabilities. AIOPEN enables end-users (scientists and industry professionals) to leverage the vast amounts of EO data available and unlock valuable insights. Through community engagement activities, AIOPEN fosters collaboration and gathers valuable feedback. To demonstrate and evaluate AIOPEN's capabilities, two use cases have been implemented: 1) Forest Cover Monitoring, making use of standard, well-established deep learning architectures such as U-Nets for semantic image segmentation, since forest segmentation (at a single point in time) is a binary semantic segmentation task.
2) Urban Change Detection, using a Transformer architecture, EO data and deep neural networks to detect urban-related changes on the Earth's surface and construct a digital twin of the Earth's (urban) changes. The paper will conclude with a discussion of the evaluation performed by independent users and a presentation of ideas for future work.
Add to Google Calendar

Thursday 26 June 16:15 - 17:45 (Hall L3)

Presentation: Operationalizing MLOps in the Geohazards Exploitation Platform (GEP)

#cloud-native #stac

Authors: Simone Vaccari, Herve Caumont, Parham Membari, Fabrizio Pacini
Affiliations: Terradue Srl
The Geohazards Exploitation Platform (GEP) is a cloud-based Earth Observation (EO) data processing platform developed and operated by Terradue to support geohazard monitoring, terrain motion analysis, and critical infrastructure assessment. It serves a diverse user base of over 3,200 researchers, public authorities, and industry professionals, providing access to EO data archives, advanced processing services, and analytical tools. These services range from systematic data processing workflows, such as generating interferometric deformation maps, to event-triggered processing for rapid response scenarios like earthquake damage assessments. They support a variety of data-driven applications, from data screening and area monitoring to the integration of multi-temporal data for long-term risk assessment. Expanding the portfolio of services that leverage artificial intelligence (AI) and machine learning (ML) is a key objective for GEP to meet the growing demands of its users. However, the complexity of training, deploying, and maintaining ML models at scale posed significant challenges. These include managing large and diverse EO datasets, ensuring reproducibility, and maintaining model performance over time in dynamic operational environments. Addressing these obstacles was essential for unlocking the full potential of AI in geospatial applications and for opening GEP to an enlarged set of data processing services, users and stakeholders. As part of an ESA-funded initiative targeted at expanding the use of AI and ML, the GEP has recently embedded Machine Learning Operations (MLOps) capabilities to address these challenges. This encompasses use cases for developing scalable workflows and operating the resulting ML models in geospatial applications. With its EO data repositories and cloud-based processing environment, GEP now supports the full lifecycle of ML operations including data discovery, preparation, model development, deployment, monitoring, and re-training. By embedding MLOps principles into the platform, GEP provides a comprehensive solution for automating and scaling AI-driven geospatial analyses. These capabilities have been designed to ensure reproducibility, improve operational efficiency, and support dynamic adaptation to real-world conditions. This presentation will focus on the practical implementation and use of these MLOps enhancements within GEP. The cloud-native architecture of GEP ensures compatibility with modern DevOps frameworks, providing scalable and interoperable solutions for geohazard assessment and disaster response. We will show how the platform provides advancements like automated pipelines for data preparation and training, real-time monitoring tools for identifying performance issues such as data drift, and SpatioTemporal Asset Catalogs (STAC) compliant cataloging of datasets and models to streamline access and management. We will present technical insights from integrating MLOps into GEP, highlighting challenges and solutions developed to meet the specific needs of EO applications. Operational examples will illustrate how these capabilities are used to address user needs effectively and, by automating and standardizing ML workflows, how GEP empowers scientists and service developers to deploy reliable AI-driven models while reducing the complexity of cloud-based system operations.
This session will provide attendees with a comprehensive understanding of how MLOps enhances cloud-based EO ecosystems, demonstrating its potential to enable innovative and sustainable geospatial solutions.
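The STAC-compliant cataloguing of models mentioned above can be pictured with a small, hypothetical pystac sketch. The catalogue layout, property names, bucket paths and identifiers below are purely illustrative and do not reflect GEP's actual schema.

```python
from datetime import datetime, timezone

import pystac

# Hypothetical footprint of the area the model was trained on.
bbox = [12.0, 41.5, 14.0, 43.0]
geometry = {
    "type": "Polygon",
    "coordinates": [[
        [12.0, 41.5], [14.0, 41.5], [14.0, 43.0], [12.0, 43.0], [12.0, 41.5],
    ]],
}

# Describe one trained model version as a STAC Item.
item = pystac.Item(
    id="deformation-classifier-v1",
    geometry=geometry,
    bbox=bbox,
    datetime=datetime(2025, 1, 15, tzinfo=timezone.utc),
    properties={
        # Illustrative, free-form properties; a real catalogue would follow
        # an agreed extension for describing ML models.
        "ml:framework": "pytorch",
        "ml:training_dataset": "sentinel-1-interferograms-2019-2024",
    },
)

# Attach the serialized weights as an asset (placeholder path).
item.add_asset(
    "model",
    pystac.Asset(
        href="s3://my-bucket/models/deformation-classifier-v1.pt",
        media_type="application/octet-stream",
        roles=["ml-model"],
    ),
)

# Group model items in a small self-contained catalogue.
catalog = pystac.Catalog(id="ml-models", description="Trained model registry")
catalog.add_item(item)
catalog.normalize_and_save("./ml-models", catalog_type=pystac.CatalogType.SELF_CONTAINED)
```

Cataloguing models with the same STAC conventions used for datasets means the same search and access tooling can discover both.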
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Hall L3)

Presentation: Mapping Crops at Scale: Insights From Continental and Global Crop Mapping Initiatives

#cloud-native

Authors: Dr. Kristof Van Tricht, Christina Butsko, Jeroen Degerickx, Gabriel Tseng, Kasper Bonte, Jeroen Dries, Bert De Roo, Hannah Kerner, Dr Laurent Tits
Affiliations: VITO, McGill University, Mila, Ai2, Arizona State University
Crop mapping has been a focus of the remote sensing community for many years. Ideally, for food security monitoring purposes, we would like to know what is being planted globally, preferably at the time of planting. Realistically, however, the community has had to adjust expectations to align with the current capabilities of remote sensing technologies. Agriculture is one of the most dynamic forms of land use, with agro-climatic conditions and local management practices creating unique agricultural activities in nearly every region. This diversity presents significant challenges for consistently mapping agricultural crops at large scales over multiple years. Consequently, creating reliable, large-scale crop maps requires careful planning, from setting appropriate requirements to deploying classification algorithms at scale, in a way that maximizes workflow generalizability. A robust approach to large-scale crop mapping involves key decisions such as which crops to map (or not to map), how to cope with seasonality, the collection, harmonization, and sampling of training data, selection of satellite and auxiliary data inputs and their preprocessing, computing and selecting classification features, choosing the appropriate algorithm, and building an efficient cloud-based inference pipeline. These elements ensure that the classification workflow is well suited to meet the specific requirements of agricultural diversity while still being feasible to operate at continental to global scales. In this presentation, we highlight valuable lessons learned by researchers engaged in making large-scale crop maps for two distinct products: the Copernicus multi-year High-Resolution Layer (HRL) Vegetated Land Cover Characteristics (VLCC) crop type layer, and the ESA WorldCereal global cropland and crop type maps. The workflows behind these products have many things in common but also exhibit notable differences. We will discuss the synergies and divergences between these crop mapping pipelines, focusing on training data sources and algorithms, as well as the particularities of deploying both workflows in the cloud. For example, spatially and temporally distributed reference data in Europe allows for a powerful fully supervised end-to-end classification workflow based on transformers, while large spatial and temporal gaps at the global scale benefit from a self-supervised pretrained foundation model followed by a lightweight CatBoost classifier. Regardless of the approach, ensuring efficient deployment at scale is crucial at all stages of development. In conclusion, we will reflect on the key challenges and lessons learned from developing and deploying these crop mapping systems, emphasizing the importance of adaptability, careful selection of training data and algorithms, the need for cloud-native infrastructures, and the flexibility to refactor parts of the workflow along the way. By sharing our experiences, we hope to provide valuable perspectives for future endeavors in scaling Earth observation algorithms from regional research efforts to global applications, ultimately contributing to enhanced agricultural monitoring and food security initiatives.
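As a rough illustration of the second pattern described above (a pretrained, self-supervised encoder feeding a lightweight classifier), the sketch below trains a CatBoost model on stand-in embeddings. The embedding size, class count and data are invented and do not reproduce the WorldCereal pipeline.

```python
import numpy as np
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split

# Stand-in for per-pixel embeddings produced by a pretrained foundation model
# (a frozen self-supervised encoder); here they are just random numbers.
rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 128))      # 128-dimensional embeddings
y = rng.integers(0, 5, size=10_000)     # 5 hypothetical crop-type classes

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# A lightweight gradient-boosted classifier on top of the frozen embeddings.
clf = CatBoostClassifier(iterations=300, depth=6, learning_rate=0.1, verbose=False)
clf.fit(X_train, y_train, eval_set=(X_val, y_val))

print("validation accuracy:", (clf.predict(X_val).ravel() == y_val).mean())
```

The appeal of this split is that the expensive representation learning happens once, while the classifier on top stays cheap to retrain when new reference data arrive.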
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Hall L3)

Presentation: Scalable and Energy Efficient Compositing of Sentinel-2 Time Series

#zarr

Authors: Pablo d'Angelo, Paul Karlshöfer, Uta
Affiliations: German Aerospace Center
Introduction: As Earth observation data archives continue to grow thanks to long-term missions such as the Sentinels, scalable data processing is a key requirement for increasingly complex analysis workflows. At the same time, the increase in data and computing resources results in an increase in energy consumption and thus in the carbon footprint of data analysis. By extracting the visible bare surface of agricultural fields after harvesting and ploughing, multispectral observations of the soil surface can be obtained from Sentinel-2 time series data at 20 m resolution. A complete bare surface reflectance composite can only be obtained from a multi-year time series, which is acceptable due to the low dynamics of soil properties. In the CUP4SOIL project, several soil parameters such as soil organic carbon, pH and bulk density are estimated using digital soil modelling, and the bare surface reflectance composites provide additional information to the traditionally used DSM covariates. The SCMAP compositing process detects pixels with bare surfaces based on a spectral index and regionally varying thresholds. During compositing, robust statistically based outlier detection is used to remove cloud, snow and haze pixels, and reflectance and statistical data are calculated for both bare and non-bare surfaces. Each pixel stack in the time series is processed independently, resulting in a massively parallel reduction operation with no spatial dependencies. This setting is typical of temporal compositing algorithms, which usually reduce along time and spectral dimensions with little or no spatial influence. Many existing products depend on time series analysis of Sentinel data [1, 2]. Efficient computation both decreases the environmental footprint and the costs of processing, and is thus of prime interest. This requires both an efficient implementation of the algorithms and a compute platform that offers the required compute and data resources. While the embarrassingly parallel nature of this task provides a high scalability potential, high efficiency can only be achieved by tailoring the algorithms to the performance characteristics of the employed hardware platform.

Method: The core SCMAP algorithm is implemented in a C++ application called from Python code responsible for product discovery and data format processing. The use of containers and the modular input interfaces allow the process to be easily adapted to different data archives and to run in cloud or HPC environments. The experiments are performed on the Terrabyte HPC platform of LRZ and DLR [3], which provides ~50 PB of GPFS storage and 271 CPU compute nodes with 40 cores and 1 TB of RAM each. These nodes are completely fanless machines, cooled with a highly efficient hot water cooling system. Using the SCMAP application, we explore several implementations and optimisations on the Terrabyte compute platform. The algorithm allows for multiple levels of parallelization as data dependencies are limited to the temporal and spectral axis. Spatially, neighbouring pixels are independent. Thus, at the SLURM task level, tiles of the Sentinel-2 tiling grid are computed using OpenMP, allowing parallel pixel computations within each task. We are investigating reordering the input data axes to improve cache coherence and align with data access patterns. Concurrent task execution on compute nodes is analysed to assess how memory allocation, task density and data request rates affect I/O complexity and file system load. The Sentinel-2 tiling grid results in spatial tiles of 100x100 km for a given date, and a standard Level 2A Sentinel-2 product stores each of the used 10 bands in separate image files. As each SLURM task processes one Sentinel-2 tile, and thus reads from 1000 to 10000 input files, parallel I/O and increasing the I/O chunk size were essential for high scalability of the process. In addition, we compare the performance and decompression overheads of several common file formats (Cloud-Optimised GeoTIFF, JPEG 2000, Zarr). We further investigate the energy consumption of the compositing tasks and compare the energy efficiency of different processing and data storage setups.

Conclusions: With the current optimisations, a state-of-the-art bare surface reflectance composite for the whole of Europe can be computed from 500 TB of Sentinel-2 L2A input data in less than 12 hours using 25 CPU nodes on the Terrabyte HPC platform. The complete process, including scheduling, input data reading, compositing and output product formatting, operates with a sustained input data rate of ~110 Gbit/s. Re-processing five yearly EU-wide Sentinel-2 bare surface composites in case of algorithmic updates thus reduces to an overnight batch job.

References:
1. https://land.copernicus.eu/en/products
2. https://esa-worldcover.org
3. https://docs.terrabyte.lrz.de
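To make the compositing pattern concrete, here is a minimal, hypothetical Python sketch of the per-pixel temporal reduction the abstract describes. It is not the SCMAP C++ implementation, and the outlier rule (a per-band robust z-score against the temporal median) is only an assumed stand-in for the project's statistical screening.

```python
import numpy as np

def composite_pixel_stack(stack: np.ndarray, z_thresh: float = 2.5) -> np.ndarray:
    """Reduce one pixel's time series (time x bands) to a single spectrum.

    Observations whose deviation from the temporal median exceeds `z_thresh`
    robust standard deviations (per band) are treated as cloud/snow/haze
    outliers and excluded from the mean.
    """
    median = np.nanmedian(stack, axis=0)
    mad = np.nanmedian(np.abs(stack - median), axis=0) + 1e-6
    keep = np.abs(stack - median) <= z_thresh * 1.4826 * mad
    masked = np.where(keep, stack, np.nan)
    return np.nanmean(masked, axis=0)

# Toy cube: 40 acquisitions x 10 bands x 64 x 64 pixels.
cube = np.random.rand(40, 10, 64, 64).astype(np.float32)

# Each pixel stack is independent, so the reduction parallelises trivially
# over the spatial dimensions (OpenMP threads in a compiled implementation,
# a plain loop here).
out = np.empty(cube.shape[1:], dtype=np.float32)
for i in range(cube.shape[2]):
    for j in range(cube.shape[3]):
        out[:, i, j] = composite_pixel_stack(cube[:, :, i, j])
```

Because the only data dependencies run along the time and band axes, storing the input with the temporal axis contiguous in memory (the axis-reordering mentioned above) keeps each pixel's stack cache-friendly.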
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Hall L3)

Presentation: A Comprehensive Monitoring Toolkit for Energy Consumption Measurement in Cloud-Based Earth Observation Big Data Processing

#pangeo

Authors: Adhitya Bhawiyuga, Serkan Girgin, Rolf de By, Raul Zurita-Milla
Affiliations: Faculty Of Geo-information Science And Earth Observation, University Of Twente
The processing of earth observation big data (EOBD) in distributed environments has increased significantly, driven by advances in satellite technology and the growing number of earth observation missions. This massive influx of data presents unprecedented opportunities for environmental monitoring, climate change studies, and natural resource management, while simultaneously posing significant computational challenges. Cloud computing has emerged as an enabler for handling such EOBD, offering scalable computational resources, flexible storage solutions, and on-demand processing capabilities through platforms such as Google Earth Engine (GEE), AWS SageMaker, OpenEO, and Pangeo Cloud. While these cloud-based EOBD processing platforms offer varying levels of monitoring capabilities to help users understand their workflow execution, they primarily focus on traditional performance metrics. GEE provides basic performance insights focusing on task execution status, AWS SageMaker offers comprehensive resource utilization metrics through Amazon CloudWatch, and Pangeo Cloud implements the Dask profiler for real-time monitoring of cluster performance. However, a significant gap exists: none of these platforms incorporate energy consumption as a standard monitoring metric. This limitation becomes increasingly critical as the scientific community grows more concerned about the environmental impact of large-scale data processing operations. The absence of energy-related metrics from monitoring may hinder users from understanding the environmental impact associated with their EOBD processing workflows. This knowledge is particularly crucial in the earth observation domain, where the balance between computational requirements and environmental impact directly aligns with the field's core mission of environmental protection. Furthermore, recent green computing initiatives have emphasized the importance of sustainable IT infrastructure, yet the lack of standardized energy consumption metrics in EOBD processing platforms hinders researchers' ability to make informed decisions about computational resource usage. To address this gap, we propose a monitoring toolkit for understanding the energy consumption patterns in distributed EOBD processing. We develop an integrated approach that combines multi-level energy measurements: (1) hardware-level power data collected through RAPL for CPU and DRAM, IPMI for system-level metrics, and external power sensors for overall consumption; (2) software-level resource utilization metrics from the operating system including CPU usage, memory allocation, I/O operations, and network traffic; and (3) application-level profiling through integration with Dask's distributed processing framework. Our methodology employs power ratio modeling to correlate these measurements and estimate process-level energy consumption, enabling fine-grained energy profiling of EOBD workflows. The toolkit generates comprehensive monitoring reports that include energy consumption patterns, resource utilization correlations, and efficiency metrics, allowing users to make informed decisions about their processing strategies. By providing visibility into the energy consumption of computational workflows, this work contributes to the development of more sustainable EOBD processing practices. 
The toolkit enables users to better evaluate the true environmental cost of their computational workflows and optimize their processing strategies accordingly, supporting the broader goal of environmental protection through more energy-efficient earth observation data processing.
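As an illustration of the hardware-level measurements in point (1), the snippet below reads the cumulative RAPL energy counters that Linux exposes through the powercap sysfs interface. The paths are the standard kernel interface, but reading them typically requires elevated privileges, counter wrap-around is ignored for brevity, and the toolkit's actual implementation may differ.

```python
import time
from pathlib import Path

RAPL_ROOT = Path("/sys/class/powercap")

def read_rapl_energy_uj() -> dict:
    """Read cumulative energy counters (microjoules) for each RAPL domain."""
    readings = {}
    for domain in sorted(RAPL_ROOT.glob("intel-rapl:*")):
        label = (domain / "name").read_text().strip()   # e.g. "package-0", "dram"
        readings[f"{domain.name} ({label})"] = int((domain / "energy_uj").read_text())
    return readings

# Energy used by a code block = difference of the cumulative counters.
before = read_rapl_energy_uj()
time.sleep(5)                       # placeholder for the EO workload being profiled
after = read_rapl_energy_uj()

for key in before:
    joules = (after[key] - before[key]) / 1e6
    print(f"{key}: {joules:.2f} J over 5 s")
```

Correlating such counter deltas with per-process CPU time is the kind of power-ratio modelling the abstract refers to for attributing energy to individual workflows.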
Add to Google Calendar

Thursday 26 June 14:00 - 15:30 (Hall L3)

Session: D.06.05 Addressing Data Processing Challenges in the EO Digital Framework: Scaling Computational Resources

#cloud-native

With the ever-growing volume of Earth observation (EO) data, ensuring efficient storage, processing, and accessibility has become an ongoing challenge. The anticipated rapid increase in EO data further emphasizes the need for advanced technologies capable of providing scalable computational infrastructure to support this growth.

The current challenge lies in processing this vast amount of EO data efficiently. Computationally intensive tasks, such as those driven by artificial intelligence (AI) and machine learning (ML), alongside image processing applications, place significant demands on existing solutions. These challenges are further compounded by the need for sustainable approaches to manage increasing computational workloads.

This session aims to address these challenges in the context of ESA's current and emerging computational infrastructure. Discussions will focus on the use of diverse computational solutions, including High-Performance Computing (HPC) systems, cloud-based platforms, and hybrid models adopted across the industry. This will encompass ESA's first HPC system, SpaceHPC, and explore how these technologies address these challenges. While these systems offer substantial processing power and flexibility, the continued growth of data inflow necessitates further advancements in supporting computational infrastructure to maintain efficiency and scalability.

A key consideration will be how these developments can align with sustainability goals, focusing on reducing CO₂ emissions and adopting environmentally responsible practices. Guest speakers from industry will share insights into these topics, highlighting both the challenges and opportunities posed by evolving data processing needs.

Moderators:


  • Peter Gabas - ESA

Presentations and speakers:


SpaceHPC - ESA’s Supercomputing Infrastructure


  • Peter Gabas - ESA

Unifying HPC and Cloud Systems: A Cloud-Native Approach for Infrastructure Integration


  • Vasileios Baousis - ECMWF

Industrial Perspective on the High-Performance Computing and Quantum Computing Opportunities for EOF Processing, Operations, and Archiving


  • Mark Chang - Capgemini

terrabyte: A "Cloud-Like" HPC System for Addressing Earth Observation Challenges


  • Friedl Peter - German Aerospace Center
  • CINECA - European HPC Center
Add to Google Calendar

Thursday 26 June 14:00 - 15:30 (Room 0.14)

Session: D.05.03 Towards Modernized Copernicus Data: Enabling Interoperability through EOPF Principles and Advanced Data Access Strategies

#cloud-native #zarr

As demand for high-accuracy Copernicus data products grows, modernizing and re-engineering existing processors is essential. The Sentinel data processors, developed over a decade ago, require upgrades to remain viable for the next 15 years. A key focus of this modernization is enhancing data access through cloud optimization, interoperability, and scalability, ensuring seamless integration with new technologies.

A major development in this transition is the adoption of cloud-native data formats like Zarr, which significantly improve data handling, storage, and access. This shift supports the increasing volume and complexity of data from current and future missions. The Earth Observation Processing Framework (EOPF) plays a crucial role in enabling these advancements, providing a scalable and flexible environment for efficiently processing large datasets.

This insight session will provide updates on the latest status of EOPF project components, as well as the future of the Copernicus data product format, with a strong focus on Zarr and its practical applications. Experts will showcase how these innovations enhance data accessibility and usability, ensuring that Copernicus remains at the forefront of Earth observation. The session will also highlight EOPF’s role in streamlining data workflows, fostering collaboration among stakeholders, and advancing next-generation EO solutions.
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Room 1.31/1.32)

Presentation: EUMETSAT’s Contribution Towards Generating Uncertainty Characterised Fundamental Climate Data Records

#zarr

Authors: Jörg Schulz, Viju John, Timo Hanschmann, Carlos Horn, Oliver Sus, Jaap Onderwaater, Rob Roebeling
Affiliations: EUMETSAT
Climate change is currently one of the main threats our planet is facing. Observations are playing a pivotal role in underpinning the science to understand the climate system and monitor its changes, including extreme events, which have adverse effects on human lives. Information generated from measurements by Earth observation satellites contributes significantly to the development of this understanding and to the continuous monitoring of ongoing climate change and its impacts. However, the meaningful use of data from these satellites requires them to be long-term, spatially and temporally homogeneous, and uncertainty characterised. The process of preparing satellite data for climate studies is tedious and has only recently been recognised as a fundamental first step in preparing records of Essential Climate Variables (ECV) from these data. During the last decade EUMETSAT has generated several Fundamental Climate Data Records (FCDR) consisting of measurements from instruments operating from microwave to visible frequencies. These measurements are not only from satellites operated by EUMETSAT but also from satellites operated by other agencies such as NOAA and CMA. Scientific advances for the data generation have been made through several EU research projects such as ERA-CLIM, FIDUCEO and GAIA-CLIM. The FIDUCEO project was pivotal for developing a framework for characterising uncertainties of Earth Observation data. The principles developed in the project have been adapted and extended by EUMETSAT by including other sensors and by consolidating longer time series. This presentation outlines the basic principles of FCDR generation illustrated through a few examples. The basic steps of FCDR generation comprise quality control and quality indicators for the raw data, and recalibration of the raw data to produce physical quantities such as radiances or reflectances. Throughout these steps uncertainty characterisation and harmonisation of a suite of instruments are performed. Finally, outputs are generated in user-friendly formats, e.g., NetCDF4 and/or Zarr, adhering to community best practices for metadata. The presentation illustrates these principles with two examples, one on the creation of a harmonised time series of microwave humidity sounder data and the other on the creation of FCDRs from geostationary satellite infrared and visible range measurements. The resulting FCDRs are used to create data records of ECVs, for example by the EUMETSAT Satellite Application Facilities (SAFs), and enable uncertainty propagation into the derived ECV data records. EUMETSAT data records support international research activities in the World Climate Research Programme (WCRP) and national and international climate services such as the Copernicus Climate Change Service, in particular global reanalysis. Illustrations of the use of FCDRs to improve the quality of CDRs will be presented as well.
Add to Google Calendar

Thursday 26 June 11:30 - 13:00 (Room 1.34)

Presentation: Data Sharing Infrastructures to Bring EO-Powered Intelligence to a Wider Audience

#stac

Authors: Liz Scott
Affiliations: Satellite Applications Catapult
The UK’s National Cyber Physical Infrastructure Programme brings together government, industry and its network of Catapults to share best practice on data sharing and minimisation of data silos. Cross-sector digital twin demonstration projects such as the Climate Resilience Demonstrator (CreDo) from Connected Places Catapult are already planning their production phases, but the inclusion of data streams from Earth Observation (EO) sources remains outside of such projects. Despite recent advancements in the interoperability and reusability of EO data thanks to cataloguing technologies such as STAC, access to EO data continues to be a blocker to its wider adoption. For the data science and analysis needed for addressing many global environmental challenges, it must be possible for a wider audience of scientists, analysts and policymakers to get easier access to a wider range of data sources that already exist, and this requires multiple stakeholders to collaborate on the data engineering to make it happen. The Earth Observation Datahub is a UK-built data infrastructure project at the forefront of digital transformation for the use of UK academia, government and industry. Research from the project’s End User and Stakeholder Forum has shown that even for those working in the EO sector there are barriers to wider usage due to disparate data sources and processing capability located a long way from the data. Furthermore, for data scientists and engineers on the periphery of the EO sector, the learning curve to EO data exploitation has been too great for many to make a start. The EO Datahub has been built to address these problems and create a federated ecosystem of data sources (both commercial and open source), processing pipelines and end-user-ready applications to enable the beginnings of a wider ecosystem of EO usage. By utilising STAC metadata, open source software components and containerised processing, the system presents the potential for data sources to be more easily discovered, and derived data products to be created. Both can then be exploited using coding tools such as a custom Python toolkit as well as no-code/low-code user interfaces. The containerisation of data processing allows for scaling of data processing jobs. With all code open source, the components have been designed to be portable should there be a requirement in the future to scale further, including across multiple public cloud offerings. With such a federated system, incomers outside of the EO sector have a foot in the door to creating systems that exploit satellite data sources without the substantial overhead of data management and an entire end-to-end processing chain. Data product development for commercial applications can potentially become a step easier, and near-real-time EO inputs to digital twins that currently have no data from space become a possibility. This presentation will explain the architecture used and the data flow from incoming data streams through the hub platform to the applications, with a discussion of the end-user applications being trialled in the pilot phase and the potential for future advancements that can feed cross-sector digital twins.
Add to Google Calendar

Thursday 26 June 11:30 - 13:00 (Room 1.34)

Presentation: The CCI Open Data Portal: Evolution and future plans after 10 years of operations

#kerchunk #zarr

Authors: Alison Waterfall, Emily Anderson, Rhys Evans, Ellie Fisher, Philip Kershaw, Diane Knappett, Federica Moscato, Matthew Paice, Eduardo Pechorro, David Poulter, William Tucker, Daniel Westwood, Antony Wilson
Affiliations: Centre for Environmental Data Analysis, RALSpace, STFC, European Space Agency
The CCI Open Data Portal has been developed as part of the European Space Agency (ESA) Climate Change Initiative (CCI) programme, to provide a central point of access to the wealth of data produced across the CCI programme. It is an open-access portal for data discovery, which supports faceted search and multiple download routes for all the key CCI datasets and can be accessed at https://climate.esa.int/data. The CCI Open Data Portal has been operating since 2015 and during this time the project has gone through several evolutions in terms of the technologies used and the challenges faced by the portal. In this presentation we will describe the current CCI portal, its future plans and the lessons learnt from 10 years of operations. Since its inception in 2015, the CCI Open Data Portal has provided access to nearly 600 datasets. It consists of a front-end access route for data discovery comprising: a CCI dashboard, which shows at a glance the breadth of CCI products available and which can be drilled down to select the appropriate datasets; and also a faceted search option, which allows users to search for data over a wider range of characteristics. These are supported at the back end by a range of services provided by the Centre for Environmental Data Analysis (CEDA), which includes the data storage and archival, catalogue and search services, and download servers supporting multiple access routes (FTP, HTTP, OPeNDAP, OGC WMS and WCS). Direct access to the discovery metadata is also publicly available and can be used by downstream tools to build other interfaces on top of these components, e.g., the CCI Toolbox uses the search and OPeNDAP access services to include direct access to data. A key challenge in the operation of the CCI Open Data Portal comes from the heterogeneity of the different datasets that are produced across the Climate Change Initiative programme, with different scientific areas and different user communities all having differing needs in terms of the format and types of data produced. To this end, the work of the CCI Open Data Portal also includes maintaining the CCI data standards. These standards aim to provide a common format for the data, but necessarily still leave considerable breadth in the types of data produced. This poses challenges for providing harmonised search and access services, and solutions have been developed to ensure that every dataset can still be fully integrated into our faceted search services. Technologically, the CCI Open Data Portal currently combines search and data cataloguing using OpenSearch with data serving capacity based on Nginx and THREDDS, and utilises containers and Kubernetes to provide a scalable data service. These are currently hosted on the academic JASMIN infrastructure in the UK, but for the future, we are exploring a hybrid model whereby some of the functionality will be moved or duplicated to an external cloud provider for increased resilience, whilst still retaining the flexibility and cost benefits of primarily hosting data on a local infrastructure. Over the 10 years of operations of the CCI Open Data Portal, one key evolution relates to the ways in which people prefer to access data. Whilst the original data products are mostly in NetCDF, which is still a popular access mechanism, there is an increasing need to provide data in cloud-ready formats.
Over the last few years, work has been carried out in conjunction with the CCI Toolbox, to provide cloud-ready versions of many of the datasets through the alternative provision of data formatted in Zarr and Kerchunk to provide more performant access to the data for cloud-based activities. In the current phase of the CCI Open Data Portal, it is also planned to integrate some of the CCI datasets into other data ecosystems, thereby increasing the reach of the CCI data products and making them accessible to a wider audience. These products will also be made accessible for users accessing the data via the Open Data Portal.
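For readers unfamiliar with the Kerchunk approach mentioned above, the snippet below shows the common pattern of opening a Kerchunk reference file as a virtual Zarr store with xarray and fsspec. The reference URL is a placeholder, not an actual CCI endpoint, and the exact options depend on how the references were generated.

```python
import xarray as xr

# Hypothetical Kerchunk reference file describing the chunk layout of a
# collection of NetCDF files served over HTTPS; the path is illustrative.
refs = "https://example.org/cci/sst_kerchunk_refs.json"

ds = xr.open_dataset(
    "reference://",
    engine="zarr",
    backend_kwargs={
        "consolidated": False,
        "storage_options": {
            "fo": refs,                   # the reference set
            "remote_protocol": "https",   # where the original NetCDF bytes live
        },
    },
)

# From here on the dataset behaves like any Zarr store: lazy, chunked access.
print(ds)
```

The attraction of this approach is that the original NetCDF archive stays untouched; only a lightweight reference file is added to make it behave like a cloud-optimised store.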
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Room 1.34)

Presentation: Evolutions in the Copernicus Space Component Ground Segment

#stac #zarr

Authors: Jolyon Martin, Berenice Guedel, Betlem Rosich
Affiliations: European Space Agency
The Copernicus Space Component (CSC) Ground Segment (GS) is based on a service-based architecture and a clear set of operations principles (management and architectural), hereafter referred to as the ESA EO Operations & Data Management Framework (EOF). ESA needs to guarantee the continuity of the ongoing operations with the maximum level of performance for the flying Copernicus Sentinels while facing the technical and financial challenges of adapting to the evolutions of the CSC architecture, including the Copernicus Expansion Missions and Next Generation Sentinels. The EOF encompasses all the activities necessary to successfully deliver the expected level of CSC operations entrusted to ESA (i.e. establishment and maintenance of the new baseline, procurement actions, operations management, reporting, etc.). The EOF implementation is based on a service architecture with well-identified components that exchange data over the Internet respecting defined interfaces. A service presents a simple interface to its consumer that abstracts away the underlying complexity. Combined with deployment on public cloud infrastructure, this offers great adaptability to evolving operational scenarios, in particular with regard to scalability. This presentation aims to introduce the ongoing and planned evolutions of the Ground Segment architecture. Recognising community-driven initiatives in interoperability such as STAC, and tapping into the rich framework for scientific computing offered by Python, Dask and Zarr, the EOF intends to further streamline the interfaces within the Ground Segment and open more opportunities to empower an open ecosystem of service providers leveraging and enhancing the capabilities of the Copernicus programme.
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Room 1.34)

Presentation: Advancing Earth Observation with the ESA Copernicus Earth Observation Processor Framework (EOPF): New Approaches in Data Processing and Analysis Ready Data

#zarr

Authors: Davide Castelletti, Vincent Dumoulin, Roberto De Bonis, Kathrin Hintze, Jolyon Martin, Betlem Rosich
Affiliations: ESA
The ESA Sentinel missions, a fundamental component of the Copernicus Earth Observation program, deliver a comprehensive range of essential data for the monitoring of Earth's environment. The focus of the presentation will be on the ESA Copernicus Earth Observation Processor Framework (EOPF), which aims to innovate the data processing infrastructure supporting the Sentinel missions, including the use of open-source tools and cloud computing platforms. A key highlight will be the adoption of the Zarr data format, which facilitates the storage and access of multidimensional data across all Sentinel missions, improving data interoperability, scalability, and performance. Additionally, the presentation will cover the development of Analysis Ready Data (ARD) products, in particular in the context of the Sentinel-1 mission. ARD products streamline processing by providing ready-to-use datasets for immediate analysis and are crucial for a wide range of applications, from climate change monitoring to disaster response and resource management. Finally, we will explore the evolving processor design for the Copernicus Expansion (COPEX) missions, emphasizing the need for new data processing approaches to handle the increasing volume, complexity, and diversity of satellite data.
Add to Google Calendar

Thursday 26 June 08:30 - 10:00 (Room 1.34)

Presentation: COPERNICUS REFERENCE SYSTEM PYTHON: AN INNOVATIVE WORKFLOW ORCHESTRATION WITH THE ADOPTION OF THE SPATIOTEMPORAL ASSET CATALOG

#cloud-native #stac

Authors: Nicolas Leconte, Pierre Cuq, Vincent Privat
Affiliations: CS Group, Airbus
This presentation showcases an overview of the Reference System Python (RS Python) developed within the Copernicus program for the Sentinel 1, 2, and 3 missions. The system will be able to expand its capabilities to include Sentinel-5P and pave the way for other upcoming Copernicus missions. RS Python orchestrates processing chains in a standard environment, from retrieving input data from the ground station, processing it, and providing the final products through an online catalog. The Copernicus Reference System Software has been developed in AGILE since 2021 under Copernicus, the European Union's Earth observation program implemented by the European Space Agency. Reference System Python is a continuation of the services implemented during the first phase, with adaptations considering the lessons learned during the first two years and the evolving CSC Ground Segment context. It offers a set of services necessary to build Copernicus processing workflows relying on Python frameworks. One main goal of the product is to provide an easy-to-use toolbox for anyone wanting to test and integrate existing or new Python processors. It is fully open-source and available online on a public Git repository (https://github.com/RS-PYTHON), allowing anyone to use it and even contribute. The major component is called rs-server. It exposes REST endpoints in the system and controls user access to all sensitive interfaces that require authentication. These endpoints can be called directly via HTTPS with OAuth2 authentication or by using our client named rs-client, a Python library with examples provided to ease the use of RS Python. It simplifies interactions with the system by embedding methods to call the various services of rs-server and handling more complex tasks under the hood, such as authenticating with an API key. RS Python can stage data (download and ingest in the catalog) using various protocols, like OData (Open Data protocol) or STAC (SpatioTemporal Asset Catalog), and multiple sources including CADIP (CADU Interface delivery Point) stations, AUXIP (Auxiliary Interface delivery Point, also known as ADGS, Auxiliary Data Gathering Service) stations, PRIP (Production Interface delivery Point) and LTA (Long Term Archive) stations. The ground stations still use the OData protocol, so until they are STAC-ready, RS Python performs STAC’ification on the fly to provide a unified experience and a unique protocol inside the system. The catalog is based on stac-fastapi-pgstac and is STAC-compliant. On top of that, we deploy STAC browser instances that provide a friendly Graphical User Interface (GUI) over the web browser. Our catalog, as well as the stations, are now easily searchable using all kinds of metadata. We use Prefect, an innovative Python orchestrator, to trigger the staging, processing, and inventorying of data and metadata. The processing can run locally on a laptop, or on Dask clusters to perform distributed computing with auto-scaled workers to achieve maximum performance when it’s needed. The auto-scaling features are applied at two different levels: nodes (infrastructure) and pods (services). This allows optimization of the number of running machines to handle the processing tasks and the number of tasks running in parallel on the available resources. It’s also designed with a sustainable approach, to reduce the cost, usage, and carbon footprint to the minimum. RS Python provides access to JupyterLab for the end-user. 
The end-user can build or start pre-made Prefect workflows from rs-client libraries. Grafana and OpenTelemetry enhance project monitoring and observability by providing real-time visualization and comprehensive data collection. Grafana provides interactive dashboards for tracking performance, while OpenTelemetry standardizes telemetry data, enabling seamless integration across systems. RS Python will run the refactored processors in Python from the Sentinel missions provided in the context of ESA’s CSC Data Processors Re-engineering project. Another goal is to be able to run any Python processor, making it a reference platform. With the RS Python open-source solution, one can set up a platform to support Copernicus Ground Segment operation-related activities such as processor validation and benchmarking, implementation and fine-tuning of data processing workflows, re-processing and production services, data quality investigations, integration of new processors and missions. In that sense, the Reference System is already used in other contexts, such as the ESA's Earth Explorer missions. Finally, the system is Cloud Native and designed to run with optimal performance in a fully scalable Kubernetes cluster. Yet it’s still possible to install it locally on a laptop ... so anyone can play with it!
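Since the system leans on Prefect for orchestration, the following is a minimal, hypothetical Prefect flow showing the stage-process-catalogue pattern described above. The task bodies, product identifiers and storage paths are placeholders, not calls to the actual rs-client API.

```python
from prefect import flow, task

@task(retries=2, retry_delay_seconds=30)
def stage(product_id: str) -> str:
    # Placeholder for rs-client calls that download a product from a
    # CADIP/AUXIP/LTA station and register it in the STAC catalog.
    print(f"staging {product_id}")
    return f"s3://staging/{product_id}"

@task
def process(staged_href: str) -> str:
    # Placeholder for launching the actual Python processor, e.g. on a
    # Dask cluster with auto-scaled workers.
    print(f"processing {staged_href}")
    return staged_href.replace("staging", "products")

@task
def catalogue(product_href: str) -> None:
    # Placeholder for publishing the output item to the STAC catalog.
    print(f"cataloguing {product_href}")

@flow(name="sentinel-processing")
def sentinel_processing(product_ids: list) -> None:
    # Chain the three stages per product; Prefect records state and retries.
    for pid in product_ids:
        catalogue(process(stage(pid)))

if __name__ == "__main__":
    sentinel_processing(["S1A_IW_RAW_0001", "S1A_IW_RAW_0002"])
```

Declaring the steps as tasks is what gives the orchestrator the retry, monitoring and scaling hooks the abstract describes.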
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: D.05.03 - POSTER - Towards Modernized Copernicus Data: Enabling Interoperability through EOPF Principles and Advanced Data Access Strategies

#cloud-native #zarr

As demand for high-accuracy Copernicus data products grows, modernizing and re-engineering existing processors is essential. The Sentinel data processors, developed over a decade ago, require upgrades to remain viable for the next 15 years. A key focus of this modernization is enhancing data access through cloud optimization, interoperability, and scalability, ensuring seamless integration with new technologies.

A major development in this transition is the adoption of cloud-native data formats like Zarr, which significantly improve data handling, storage, and access. This shift supports the increasing volume and complexity of data from current and future missions. The Earth Observation Processing Framework (EOPF) plays a crucial role in enabling these advancements, providing a scalable and flexible environment for efficiently processing large datasets.

This insight session will provide updates on the latest status of EOPF project components, as well as the future of the Copernicus data product format, with a strong focus on Zarr and its practical applications. Experts will showcase how these innovations enhance data accessibility and usability, ensuring that Copernicus remains at the forefront of Earth observation. The session will also highlight EOPF’s role in streamlining data workflows, fostering collaboration among stakeholders, and advancing next-generation EO solutions.
Add to Google Calendar

Thursday 26 June 17:45 - 19:00 (X5 - Poster Area)

Poster: Advancing Global Land Cover Monitoring: Innovations in High-Resolution Mapping with the Copernicus Data Space Ecosystem

#stac #cog

Authors: Joris Coddé, Victor Verhaert, Adrian di Paolo, Max Kampen, Dorothy Reno, Dr. Yannis Kalfas, Wanda De Keersmaecker, Carolien Toté, Mathilde De Vroey, Luc Bertels, Dr. Tim Ng, Daniele Zanaga, Dr. Hans Vanrompay, Jeroen Dries, Dennis Clarijs, Dr. Ruben Van De Kerchove
Affiliations: VITO, Sinergise
Land cover mapping is crucial for comprehending and managing the Earth's dynamic environment. The Copernicus Global Land Cover and Tropical Forest Mapping and Monitoring service (LCFM), part of the Copernicus Land Monitoring Service (CLMS), addresses the need for high-resolution, dynamic global land cover information. By leveraging the capabilities of the Copernicus Data Space Ecosystem (CDSE), LCFM aims to deliver frequent, sub-annual land surface categories and land surface features. These are consolidated into global annual land cover maps and tropical forest monitoring products at 10 m resolution. This high level of detail will facilitate better decision-making and more effective monitoring of environmental changes. LCFM is powered by several essential components, all available from and operating within CDSE. The starting point is the extensive collection of Sentinel-1 and Sentinel-2 data available on EODATA, providing frequent, high-resolution imagery crucial for accurate land cover mapping. A significant challenge faced by the LCFM service is the processing of the extensive global archive of Sentinel-1 and Sentinel-2 data into multiple land cover products with varying temporal resolutions. To overcome this, the service employs processing workflows that operate close to the data, utilizing the openEO Processing system, as well as multiple cloud providers. The results are written directly to CloudFerro’s S3 storage. Additionally, LCFM offers access to its products through a dedicated viewer set up by Sinergise, with plans to incorporate this functionality into the CDSE browser for enhanced usability. Notably, LCFM is the first service to utilize the CDSE Cloud Infrastructure across both CloudFerro and Open Telekom Cloud, enhancing computational resources and scalability. The associated openEO workflows function on both clouds. Furthermore, the workflows read raw satellite data and output products directly, generating, for example, multiple resolutions at Sentinel-2 tile level in the form of single-band Cloud Optimized GeoTIFF (COG) files. Additionally, the workflows produce gdalinfo statistics and STAC metadata, which facilitate online quality assurance and enable seamless integration and retrieval of products through a STAC API. This allows further processing by openEO, among others. As a result, the project has driven a paradigm shift in openEO's processing approach, from a traditional single-output model, where workflows produce a single data cube with multiple bands, to a multi-output (multi-head) model that generates multiple files in parallel. This transformation greatly improves the efficiency of the overall workflows and keeps computing costs manageable. This presentation will illustrate how LCFM stands as a flagship project to showcase the potential of generating state-of-the-art global maps using European infrastructure. It will highlight the resulting products, how they are served to and usable by users, as well as how the underlying architecture and workflows are leveraged to generate these products. By continuously driving improvements in openEO, effective use of European cloud infrastructure, and other components, LCFM has significantly enhanced cost efficiency and scalability. These advancements position European cloud services as challengers to global cloud providers, marking a significant step forward in sustainable environmental monitoring and data processing capabilities.
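The openEO workflows mentioned above can be sketched with the openEO Python client. The snippet below is a generic, illustrative example (collection name, bounding box, bands and reducer chosen arbitrarily), not the LCFM production workflow.

```python
import openeo

# Connect to the CDSE openEO backend (interactive OIDC login).
connection = openeo.connect("https://openeo.dataspace.copernicus.eu").authenticate_oidc()

# A small Sentinel-2 cube over an illustrative bounding box and period.
cube = connection.load_collection(
    "SENTINEL2_L2A",
    spatial_extent={"west": 4.5, "south": 50.9, "east": 4.9, "north": 51.1},
    temporal_extent=["2024-04-01", "2024-10-01"],
    bands=["B04", "B08"],
)

# Derive NDVI and reduce the time dimension to a median composite.
ndvi = cube.ndvi(red="B04", nir="B08")
composite = ndvi.reduce_dimension(dimension="t", reducer="median")

# Run as a batch job on the backend and download the result as GeoTIFF.
composite.execute_batch("ndvi_median.tif", out_format="GTiff")
```

The computation itself runs next to the data on the backend; only the final composite is downloaded, which is the "processing close to the data" principle the service is built on.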
Add to Google Calendar

Thursday 26 June 14:00 - 15:30 (Room 0.96/0.97)

Presentation: Leveraging Geospatial Data for Environmental Compliance Professionals: a Prototype for EU-Protected Forest Habitats

#stac

Authors: Corentin Bolyn, Kenji Ose, Giovanni Caudullo, Carlos Camino, Rubén Valbuena, Jörgen Wallerman, Pieter S A Beck
Affiliations: European Commission, Joint Research Centre (JRC), Swedish University of Agricultural Sciences (SLU)
Environmental compliance assurance is key to upholding environmental laws that protect the natural resources society depends on. Compliance assurance comprises promotion, monitoring, and enforcement, and each of these components can benefit from geospatial intelligence. Geospatial information can promote compliance by helping convey the importance of environmental laws and where they apply. Situational awareness derived by combining spatial information and legal expertise can allow inspectors to assess where compliance may be at risk and deploy resources for on-the-ground interventions more efficiently. And, when necessary, geospatial intelligence can help demonstrate breaches of environmental law. The volume of geospatial data is growing thanks to greater sharing of in situ data and maps. It is of course also growing due to new remote sensing data that provides increasingly detailed and up-to-date information on the environment, complementing legacy remote sensing data. Whether they come from remote sensing programmes, mapping agencies, or monitoring programmes, geospatial data are often sectorial. For those responsible for assuring compliance, data from other sectors can often be hard to access, let alone integrate into their workflows to generate geospatial intelligence. Here we show how geospatial data can be combined to support environmental compliance assurance, using the European Union’s Habitats Directive as an example. Among other things, the Directive aims to prevent the deterioration of protected habitats listed in its Annex I within designated Natura 2000 sites. Ensuring compliance with the relevant provisions of the Directive requires that potential threats to protected habitats are effectively identified, monitored and assessed. In protected forest habitats, logging can be considered a hazard that increases the risk that they deteriorate because it affects the specific structure and functions necessary to maintain the habitat or associated species. Priority natural forest habitats listed in the Habitats Directive are particularly vulnerable as they are at risk of disappearing. In contrast, non-forest habitats, such as peatlands and pastures, typically have a negligible risk of being damaged by logging, and may even be threatened by tree encroachment. We combined authoritative maps of the distribution of protected forest habitats with Earth Observation-based data on tree cover loss into a prototype tool to monitor forest habitats for logging activity. The prototype processes geospatial datasets to produce information that is then made available through a user-friendly web interface to aid interpretation by compliance professionals. The tool starts by identifying hazards, which are patches of protected habitats where tree cover has been lost. The web interface then helps the user explore and assess the hazards in their area of interest. First, it offers the possibility to refine the definition of hazards by filtering tree cover loss events based on:

• The area of tree cover loss, both in absolute terms and relative to the size of the habitat patches;
• The Annex I habitat type where the loss occurred;
• The time period during which the loss took place.

This filtering allows users to narrow the scope of the analysis to the types of forest loss they consider of greatest concern. For example, a user could focus on recent large-scale clear-cutting within a specific priority habitat type. This ability to define and refine threats based on different criteria provides a more nuanced and targeted approach to assessing compliance risks. Once the user has set these criteria, they can then investigate the detected hazards in two complementary ways:

• The Regional Assessment: this component summarizes hazard information for the entire study area with graphs. The graphs are interactive and allow users to move seamlessly to the Local Assessment component for more detailed investigation of specific hazards;
• The Local Assessment: this component allows users to visualise and analyse the identified hazards in a map viewer together with various Earth Observation layers. Users can explore the spatial distribution of tree cover loss events, examine their characteristics and assess their potential impact on protected habitats.

The assessments would allow compliance professionals to identify and prioritize areas for further investigation; in this case, for example, compliance would be checked with regard to a Natura 2000 site’s conservation objectives, legal provisions, and the actual situation on the ground. The prototype can easily be updated to integrate new remote sensing data through the SpatioTemporal Asset Catalog (STAC) standard. This opens perspectives to incorporate near real-time satellite data into the tool. It also makes it possible to incorporate information derived from airborne LiDAR campaigns, which are particularly valuable for our example as they provide a level of reliability for assessing tree cover change that is hard to obtain through other means. Our prototype shows how geospatial intelligence can be made more accessible to end users such as forest managers or environmental compliance professionals. It consolidates geospatial datasets and existing information into a single web interface, facilitating the collection of evidence for risk assessment. It serves as a powerful analytical tool for experts, while providing user-friendly access to information for those without specialist geoscience skills. This bridge between expert-generated evidence and management needs is becoming increasingly important to assure environmental compliance.
Add to Google Calendar

Friday 27 June

29 events

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Two Decades of Global Grassland Productivity: High-resolution GPP and NPP via Light Use Efficiency Model

#stac

Authors: Mustafa Serkan Isik, Leandro Parente, Davide Consoli, Lindsey Sloat, Vinicius Mesquita, Laerte Guimaraes Ferreira, Radost Stanimirova, Nathália Teles, Tomislav Hengl
Affiliations: Opengeohub Foundation, Land & Carbon Lab, World Resources Institute, Remote Sensing and GIS Laboratory (LAPIG/UFG)
Grassland ecosystems play a crucial role in absorbing carbon dioxide from the atmosphere and helping to reduce the impacts of climate change by sequestering carbon in the soil. They can become either a source or a sink in the carbon cycle, depending on a number of factors like environmental constraints, climate variability, and land management. Given the importance of grasslands to the global carbon budget, accurately measuring and understanding Gross Primary Productivity (GPP) and Net Primary Productivity (NPP) in these ecosystems is essential. However, the spatial resolution and coverage of available productivity maps are often limited, reducing the possibility of capturing the spatial variability of grasslands and other ecosystems. In this work, we present a high-resolution mapping framework for estimating GPP and NPP in grasslands at 30 m spatial resolution globally between 2000 and 2022. The GPP values are derived through a Light Use Efficiency (LUE) model approach, using 30-m Landsat reconstructed images combined with 1-km MOD11A1 temperature data and 1-degree CERES Photosynthetically Active Radiation (PAR). We first implemented the LUE model by taking the biome-specific productivity factor (maximum LUE parameter) as a global constant, producing a productivity map that does not require a specific land cover map as input and enables data users to calibrate GPP values according to specific biomes/regions of interest. Then, we derived GPP maps for the global grassland ecosystems by considering maps produced by the Global Pasture Watch research consortium and calibrating the GPP values based on the maximum LUE factor of 0.86 gC m⁻² d⁻¹ MJ⁻¹. Nearly 500 eddy covariance flux towers were used for validating the GPP estimates, resulting in R² between 0.48 and 0.71 and RMSE below 2.3 gC m⁻² d⁻¹ considering all land cover classes. In order to estimate the annual NPP, we computed the amount of yearly maintenance respiration (MR) of grasslands using the MOD17 Biome Property Look-Up Table. The daily MR estimates are accumulated to a yearly MR and finally subtracted from GPP to calculate annual NPP maps. The final time series of GPP maps (uncalibrated and grassland) are available for bimonthly and annual periods as Cloud-Optimized GeoTIFFs (23 TB in size) and as open data (CC-BY license). Users can access the maps using the SpatioTemporal Asset Catalog (http://stac.openlandmap.org) and Google Earth Engine. The NPP product is still experimental and in the process of being developed. To our knowledge, these are the first global GPP time-series maps with a spatial resolution of 30 m covering a period of 23 years.
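As a rough numerical illustration of the LUE approach described above, the sketch below computes daily GPP as the product of a maximum light-use efficiency, an environmental scalar, fAPAR and PAR, and derives NPP by subtracting maintenance respiration. The input values, the temperature scalar and the MR fraction are invented for illustration and are not the study's actual parameterisation; only the 0.86 gC m⁻² d⁻¹ MJ⁻¹ constant is taken from the abstract.

```python
import numpy as np

# Illustrative per-pixel daily inputs (not the study's actual preprocessing).
fapar = np.array([0.10, 0.35, 0.60, 0.55])   # fraction of absorbed PAR, from reflectance
par   = np.array([6.0, 8.5, 10.2, 9.1])      # PAR in MJ m-2 d-1 (e.g. from CERES)
t_day = np.array([5.0, 14.0, 22.0, 18.0])    # daytime temperature in °C (e.g. MOD11A1)

EPS_MAX = 0.86  # grassland maximum light-use efficiency, gC m-2 d-1 MJ-1

def temperature_scalar(t, t_min=0.0, t_opt=20.0):
    """Simple ramp between t_min and t_opt; one of many possible forms."""
    return np.clip((t - t_min) / (t_opt - t_min), 0.0, 1.0)

# Daily GPP from the light-use-efficiency model.
gpp_daily = EPS_MAX * temperature_scalar(t_day) * fapar * par   # gC m-2 d-1

# Annual NPP = annual GPP minus accumulated maintenance respiration (MR).
gpp_annual = gpp_daily.sum()          # toy "year" of four days
mr_annual = 0.4 * gpp_annual          # placeholder MR, not the MOD17 look-up value
npp_annual = gpp_annual - mr_annual

print(gpp_daily, gpp_annual, npp_annual)
```

Keeping the maximum LUE as a global constant, as the abstract describes, means the same GPP grid can later be rescaled for a specific biome simply by swapping that one factor.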
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Cloud-Native Strategies for Legacy EO Data: Processing Challenges and Innovations

#cloud-native #stac #cog

Authors: Stefan Reimond, Senmao Cao, Christoph Reimer, Richard Kidd, Christian Briese, Clément Albinet, Mirko Albani
Affiliations: EODC Earth Observation Data Centre for Water Resources Monitoring GmbH, European Space Agency (ESA)
Preserving and utilizing Earth Observation (EO) data from heritage missions like ERS-1/2 and Envisat is crucial for advancing scientific research. However, integrating these legacy systems into modern cloud environments presents significant challenges. This contribution explores complexities and solutions associated with processing historic satellite data using legacy software in today's cloud-native ecosystems, illustrated by an example of ERS-1/2 and Envisat SAR data over Austria. Modern cloud technologies offer significant advantages for data processing and accessibility. They provide scalable, flexible, and efficient solutions that can handle large volumes of data with ease. Specifically, at EODC, we operate a Kubernetes cluster on top of OpenStack to manage our cloud infrastructure. Apart from providing state-of-the-art services like Dask and Jupyter for contemporary data analysis, this setup also supports the execution of legacy processing workflows. By making use of containerization tools like Docker to encapsulate these older processors, we minimize the risk of incompatibilities, ensuring they are executable and functional in the cloud. To efficiently manage such complex data processing workflows, we use Argo. However, several challenges arise when adapting legacy software to these modern environments. Compatibility issues with outdated libraries often require modifications or workarounds. Developing new software to manage input/output data and configuration files is essential to ensure smooth operation. Additionally, handling broken raw data and missing auxiliary data necessitates robust data management strategies. These challenges demand extensive testing and adjustments to ensure that legacy processors can function efficiently in a scalable cloud environment. An example application of this approach is the generation of a comprehensive time series of ERS-1/2 and Envisat (A)SAR data over Austria, demonstrating the practical implementation of these methodologies. This project, conducted in cooperation with ESA, highlights the successful integration of legacy processors into a Kubernetes cluster, utilizing Docker for containerization and Argo for workflow automation. Preliminary results from these processing efforts include various Level-1 and Analysis Ready Data (ARD) datasets, most notably Normalized Radar Backscatter (NRB) products. When applicable, these datasets utilize cloud-native formats like Cloud Optimized GeoTIFFs (COGs) and are accessible through EODC's SpatioTemporal Asset Catalog (STAC) interface. This setup enables on-the-fly analysis of decades-long time series using tools such as Jupyter and Dask, significantly enhancing data discoverability, accessibility, and usability.
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: xcube: A Scalable Framework for Unified Access of Earth Observation Data

#cloud-native #stac #pangeo #zarr

Authors: Konstantin Ntokas, Pontus Lurcock, Gunnar Brandt, Norman Fomferra
Affiliations: Brockmann Consult GmbH
The increasing availability of Earth observation (EO) data from diverse sources has created a demand for tools that enable efficient, robust, and reproducible data access and pre-processing. Users of EO data often resort to custom software solutions that are time-consuming to develop, challenging to maintain, and highly dependent on the user’s programming skills and documentation practices. To address these challenges, the open-source Python library xcube has been developed to streamline the process of accessing EO data and presenting them in analysis-ready data cubes that comply with Climate and Forecast (CF) metadata conventions. xcube is a versatile toolkit designed to access, prepare, and disseminate EO data in a cloud-compatible and user-friendly manner. One key component of the software is its data store framework, which provides a unified interface for accessing various cloud-based EO data sources. This framework employs a plug-in architecture, enabling easy integration of new data sources while maintaining a consistent user experience. Each data store supports a standardized set of functionalities, abstracting the complexities of underlying APIs. This ensures that users have a consistent toolset for accessing and managing data from various, distributed providers, providing this data in the form of well-established Python data models such as those offered by xarray or geopandas. To date, several data store plug-ins have been developed for prominent cloud-based APIs, including the Copernicus Climate Data Store, ESA Climate Change Initiative (CCI), Sentinel Hub, and the SpatioTemporal Asset Catalog (STAC) API. These tools are already employed in various ESA science missions, simplifying data access for researchers and service providers. Ongoing developments focus on creating additional data stores, including support for the new EOPF product format for the Sentinels, alongside a multi-source data store framework. The latter will facilitate the integration of multiple federated data sources and incorporate advanced preprocessing capabilities such as sub-setting, reprojection, and resampling. By ingesting diverse datasets into a single analysis-ready data cube and recording the entire workflow, this approach significantly enhances the reproducibility and transparency of the data cube generation process. Prepared data cubes can be stored in multiple formats, with Zarr as the preferred choice. Zarr is a chunked format optimized for cloud storage solutions like Amazon S3. Once generated, these data cubes can be disseminated through xcube Server, which provides standard APIs such as STAC, OGC Web Map Tile Service (WMTS), OGC API - coverages, and many more. A client for these APIs is the built-in tool xcube Viewer – a single-page web application used to visualize and analyse data cubes and vector data published by xcube Server APIs. The xcube framework integrates seamlessly into the broader Pangeo ecosystem, leveraging its compatibility with Python libraries such as xarray, Dask, and Zarr. This ensures efficient data handling, scalable computation, and cloud-optimized storage. Beyond the Pangeo community, xcube’s standardized outputs using the xarray data model make them broadly applicable for researchers working with N-D spatiotemporal, multivariate datasets. In summary, xcube offers an open, scalable, and efficient solution for accessing, preparing, and disseminating EO data in analysis-ready formats. 
By providing standardized interfaces, robust preprocessing capabilities, and cloud-native scalability, xcube empowers researchers to focus on scientific analysis while ensuring reproducibility and interoperability across diverse datasets.
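As a hedged illustration of the data store framework described above (assuming an xcube store plug-in such as xcube-stac is installed; the store identifier, catalogue URL and data identifier below are placeholders, not part of the abstract):

```python
# Minimal sketch of the xcube data store framework. Store id, URL and
# data_id are illustrative; actual parameters depend on the installed plug-in.
from xcube.core.store import new_data_store

# Open a data store backed by a STAC catalogue (placeholder URL).
store = new_data_store("stac", url="https://example.com/stac/v1")

# Discover available datasets and open one lazily as an xarray.Dataset
# that follows CF metadata conventions.
print(list(store.get_data_ids()))
dataset = store.open_data("some-collection/some-data-id")

# Persist an analysis-ready cube in Zarr, the preferred cloud format.
dataset.to_zarr("my_cube.zarr")
```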
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: D.04.06 - POSTER - Advancements in cloud-native formats and APIs for efficient management and processing of Earth Observation data

#zarr #stac #parquet #cloud-native #cog

Earth Observation (EO) data continues to grow in volume and complexity as the next generation of satellite instruments is being developed. Furthermore, novel advanced simulation models such as the Digital Twins (DTs) deployed in the scope of the Destination Earth (DestinE) project generate immense amounts of multidimensional data (a few PB/day in total) thanks to High Performance Computing (HPC) technology. Cataloguing, processing and disseminating such a broad variety of data sets is a huge challenge that has to be tackled in order to unleash the full potential of EO. Storage and analytics of vast volumes of data have been moved from on-premise IT infrastructure to large cloud computing environments such as the Copernicus Data Space Ecosystem (CDSE), DestinE Core Service Platform (DESP), Google Earth Engine or Microsoft Planetary Computer. In this respect, robust multidimensional data access interfaces leveraging the latest cloud-native data formats (e.g. COG, Zarr, GeoParquet, vector tiles) and compression algorithms (e.g. ZSTD) are indispensable to enable advanced cloud-native APIs (e.g. openEO, Sentinel Hub) and data streaming (e.g. EarthStreamer). Moreover, metadata models have to be standardized and unified (e.g. the STAC catalogue specification) among different data archives to allow interoperability and fast federation of various data sources. This session aims at presenting the latest advancements in data formats, data compression algorithms, data cataloguing and novel APIs to foster EO analytics in cloud computing environments.

Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Cloud-based framework for data cubes extraction of extreme events

#cloud-native #stac #zarr

Authors: Marcin Kluczek, Jędrzej Bojanowski S., Jan Musiał, Dr. Mélanie Weynants, Fabian Gans, Khalil Teber, Miguel Mahecha D.
Affiliations: CloudFerro S.A., Max Planck Institute for Biogeochemistry, Leipzig University
The growing need for detailed analysis of extreme environmental events requires advanced data processing and storage solutions. This work presents a cloud-based framework designed to extract multivariate data cubes of extreme events through the fusion of Sentinel-1 radar data and Sentinel-2 optical imagery. This framework supports advanced environmental monitoring within the ARCEME (Adaptation and Resilience to Climate Extremes and Multi-hazard Events) project, which focuses on global multi-hazard event assessments and aims to improve our understanding of cascading extreme events that affect ecosystems and society. The framework utilizes the SpatioTemporal Asset Catalogs (STAC) API to streamline Copernicus Earth Observation (EO) data access and management. This integration of cloud storage, multithreaded processing, and API-driven data access provides a robust solution for efficiently handling EO data in studies of extreme climate events. Key to the framework is the use of cloud-native storage in the Zarr format, which enables chunked, compressed data storage, optimizing both performance and resource utilization. Zarr’s compatibility with Dask allows for multithreaded, parallel data access, significantly accelerating data cube generation and analysis. The CREODIAS cloud infrastructure supports concurrent task execution, ensuring scalability and speed in handling large Earth Observation data, essential for real-time monitoring and large-scale analyses of extreme events. This work presents a comprehensive cloud-based framework for generating multitemporal data cubes of cascading extreme events, with a focus on efficient data filtering, preprocessing, and global-scale event detection. The framework integrates multi-hazard events from the ARCEME event database, which combines climate reanalysis data and reported impacts from cascading droughts and extreme precipitation events across diverse regions. By leveraging the STAC API, the framework streamlines data access and management, while cloud-native storage in the Zarr format ensures efficient chunking and compression. Additionally, multithreaded processing with Dask accelerates data cube generation, enabling scalable global studies of extreme events and their complex spatiotemporal dynamics and interactions.
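The access pattern described here can be sketched as follows (an illustrative example, not the ARCEME codebase; the catalogue endpoint, collection name and band names are assumptions):

```python
# Query a STAC API for Sentinel-2 items over an event bounding box, load
# them lazily with Dask, and persist the result as a chunked Zarr cube.
import odc.stac
from pystac_client import Client

catalog = Client.open("https://stac.dataspace.copernicus.eu/v1")  # assumed endpoint
search = catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=[16.0, 48.0, 16.5, 48.5],
    datetime="2023-06-01/2023-08-31",
)
items = search.item_collection()

# Build a lazy (Dask-backed) spatio-temporal cube and write it to Zarr.
cube = odc.stac.load(items, bands=["B04", "B08"], chunks={"x": 1024, "y": 1024})
cube.to_zarr("extreme_event_cube.zarr")
```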
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Data representations for non-regular EO data: A case study using scatterometer observations from Metop ASCAT

#cloud-native #zarr

Authors: Sebastian Hahn, Clay Harrison, Wolfgang Wagner
Affiliations: TU Wien
Earth Observation (EO) data from instruments like the Advanced Scatterometer (ASCAT) onboard the series of Metop satellites present unique challenges for data representation. Unlike optical or SAR raster data, which can be seamlessly integrated into regular multidimensional data cubes, ASCAT observations are irregular, with each observation carrying its own unique timestamp. This irregularity requires alternative data models for efficient storage, access, and processing. Despite their prevalence, non-regular EO datasets are often overlooked in discussions about data modeling, with most approaches - particularly in cloud environments - favoring standard, well-structured raster formats. In this study, we explore three specialized data representations tailored to manage non-regular data: indexed ragged arrays, contiguous ragged arrays, and the incomplete multidimensional array representation. These models address the challenge of varying feature lengths within collections by employing different strategies for handling irregularities, such as padding with missing values for simplicity or leveraging compact, variable-length representations. We present these models using widely adopted cloud-native data formats (e.g. Zarr), demonstrating their practical applicability with ASCAT swath and time series data. This work highlights the importance of addressing non-standard cases in EO data representation, which are often overshadowed by solutions tailored for regular raster data. The adoption of alternative data models implemented with cloud-native data formats ensures that these datasets can be integrated into existing EO data pipelines.
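A generic illustration of two of these representations (not the authors' implementation; values and variable names are placeholders, following the CF ragged-array conventions):

```python
# Two ways to store time series of unequal length and write them to Zarr.
import numpy as np
import xarray as xr

series = [np.array([0.1, 0.2, 0.3]), np.array([0.5]), np.array([0.7, 0.8])]

# (a) Contiguous ragged array: concatenate all observations and keep the
# number of observations per location in a "row_size" variable.
contiguous = xr.Dataset(
    {
        "sigma0": ("obs", np.concatenate(series)),
        "row_size": ("locations", [len(s) for s in series]),
    }
)

# (b) Incomplete multidimensional array: pad every series to the maximum
# length with NaN, trading storage for a regular (locations, time) grid.
max_len = max(len(s) for s in series)
padded = np.full((len(series), max_len), np.nan)
for i, s in enumerate(series):
    padded[i, : len(s)] = s
incomplete = xr.Dataset({"sigma0": (("locations", "time"), padded)})

# Both representations can be written to a cloud-native Zarr store.
contiguous.to_zarr("ascat_contiguous.zarr", mode="w")
incomplete.to_zarr("ascat_incomplete.zarr", mode="w")
```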
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Video compression for spatio-temporal Earth System Data

#zarr

Authors: Oscar J. Pellicer-Valero, MSc Cesar Aybar, Dr Gustau Camps-Valls
Affiliations: Image Processing Lab (IPL), Universitat de València
The unprecedented growth of Earth observation data over the last few years has opened many new research avenues. Still, it has also posed new challenges in terms of storage and data transmission. In this context, lossless (no information is lost) and lossy (some information is lost) compression techniques become very attractive, with satellite imagery tending to be highly redundant in the spatial, temporal, and spectral dimensions. Current approaches to multichannel image compression include (a) general-purpose lossless algorithms (e.g., Zstandard), which are frequently paired with domain-specific formats like NetCDF and Zarr; (b) image compression standards such as JPEG2000 and JPEG-XL; and (c) neural compression methods like autoencoders. While neural methods show much promise, they need more standardization, require extensive knowledge to apply to new datasets, are computationally expensive, and/or require specific hardware, limiting their practical adoption to general datasets and research scenarios. Most importantly, all methods fail to properly exploit temporal correlations in time-series data. To tackle these issues, we propose a simple yet effective solution: xarrayvideo, a Python library that leverages standard video codecs to compress multichannel spatio-temporal data efficiently. xarrayvideo is built on top of two technologies: ffmpeg and xarray. On the one hand, ffmpeg, a video manipulation library widely available and accessible for all kinds of systems, contains well-optimized implementations of most video codecs. On the other hand, xarray is a Python library for working with labeled multi-dimensional arrays, which makes xarrayvideo compatible with the existing geospatial data ecosystem. Combining both allows for seamless integration with existing workflows, making xarrayvideo easy to use for any dataset with minimal effort by the researcher. In summary, we introduce the following contributions: First, we present a new Python library, xarrayvideo, for saving multi-dimensional xarray datasets as videos using a variety of video codecs through ffmpeg. Second, we showcase its utility through a set of compression benchmarks on three real-world multichannel spatio-temporal datasets: DeepExtremeCubes, DynamicEarthNet and ERA5, as well as a custom dataset, achieving Peak Signal-to-Noise Ratios (PSNRs) of 40.6, 55.9, 46.6, and 43.9 dB at 0.1 bits per pixel per band (bpppb) and 54.3, 65.9, 62.9, and 56.1 dB at 1 bpppb, surpassing JPEG2000 baselines in the majority of scenarios by a large margin. Third, we redistribute through HuggingFace a compressed version of the DeepExtremeCubes dataset (compressed from 3.2 TB to 270 GB at 55.8-56.8 dB PSNR) and the DynamicEarthNet dataset (compressed from 525 GB to 8.5 GB at 60.2 dB PSNR), hence serving as illustrative examples, as well as providing the community with a much more accessible version of these datasets. With xarrayvideo, we hope to solve the issues emerging from increasingly large Earth observation datasets by making high-quality, efficient compression tools accessible to everyone.
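For reference, the quality metric quoted in these benchmarks can be computed with a small generic helper (this is the standard PSNR formula, not code taken from xarrayvideo; the default peak value is an assumption):

```python
# Peak signal-to-noise ratio between an original and a decompressed array.
import numpy as np

def psnr(original: np.ndarray, decoded: np.ndarray, peak: float | None = None) -> float:
    """PSNR in dB; `peak` defaults to the dynamic range of the original data."""
    original = original.astype(np.float64)
    decoded = decoded.astype(np.float64)
    if peak is None:
        peak = float(original.max() - original.min())
    mse = np.mean((original - decoded) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```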
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Optimizing Partial Access to Sentinel-2 Imagery With JPEG2000 TLM Markers

#parquet #cog

Authors: Jérémy Anger, Thomas Coquet, Carlo de Franchis
Affiliations: Kayrros, ENS Paris-Saclay
Efficient access to remote sensing data is critical for the success of applications such as agriculture and human activity monitoring. In the context of Sentinel-2, data products are distributed as JPEG2000 files at Level-1C and Level-2A processing levels. Optimizing data access involves minimizing downloads to regions of interest, optimizing latency, and reducing unnecessary decompression and decoding. Currently, the Copernicus Data Space Ecosystem (CDSE) platform provides new mechanisms, such as HTTP range requests thanks to the S3 protocol, which allow partial file downloads—a significant improvement over the previous SciHub platform. These enhancements are well known when exploiting cloud-optimized formats like Cloud Optimized GeoTIFF (COG) and Parquet files. Sentinel-2 imagery is distributed in JPEG2000 format. Considering for example a 10m band, the encoder compresses the data and organizes the image into 121 independent 1024×1024 internal tiles. Each tile is encoded sequentially, with headers (Start of Tile markers, or SOTs) indicating the length of the associated codestream. While this allows tiles to be located by sequentially fetching and interpreting the headers, retrieving a specific tile currently requires multiple HTTP range requests: up to 120 small requests (<1 KB) to determine the last tile's location and a final larger request (~1 MB) for the tile data. Although efficient in terms of data size and decoding effort, this approach incurs high latency for users and infrastructure overhead for the CDSE provider. A solution to these inefficiencies lies in utilizing TLM (Tile-Part Length Marker) headers, an optional feature in the JPEG2000 standard. TLM markers, stored in the main file header, allow direct computation of any tile's location without sequential parsing. With TLM markers, accessing a specific tile requires only two HTTP range requests: one to fetch the main header (~4 KB) containing TLM markers and another for the desired tile's data. This approach reduces the average number of requests from 61 to just 2, significantly lowering latency and system load. Additionally, this configuration offers performance similar to COGs while avoiding a major file format change. Discussions with ESA to enable TLM markers in future products are ongoing. Adopting TLM markers requires minimal modification to the existing JPEG2000 encoding pipeline, as most mainstream JPEG2000 libraries already support this feature. However, historical products (Collection 0 and Collection 1) lack TLM markers, and re-encoding these files would be prohibitively expensive. An alternative solution involves generating external TLM metadata files for past datasets, enabling rapid access to tile locations. For instance, TLM metadata for all products of a specific MGRS tile would require less than 100 MB of storage and could be distributed efficiently. In conclusion, enabling TLM markers in new Sentinel-2 products would provide substantial benefits to the remote sensing community, improving data accessibility with minimal impact on encoding processes. For already encoded imagery, the generation of external TLM metadata offers a viable pathway to achieving similar efficiency gains. These advancements align with the goal of reducing barriers to high-resolution geospatial data access and optimizing resource usage on both client and provider sides.
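The access pattern described above can be sketched with plain HTTP range requests (the URL and byte offsets below are placeholders; with TLM markers, tile offsets can be computed from the main header alone):

```python
# Two range requests instead of sequential header walking.
import requests

url = "https://example.com/S2_band_B04_10m.jp2"  # placeholder JPEG2000 asset

# Request 1: fetch the main header (~4 KB), which would contain the TLM
# marker segment listing the length of every tile-part.
header = requests.get(url, headers={"Range": "bytes=0-4095"}).content

# ... parse the TLM marker segment from `header` to get tile offsets ...
tile_offset, tile_length = 1_048_576, 900_000  # placeholder values

# Request 2: fetch only the codestream of the tile covering the AOI.
tile = requests.get(
    url, headers={"Range": f"bytes={tile_offset}-{tile_offset + tile_length - 1}"}
).content
```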
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Metadata Requirements for EO Products

#stac

Authors: Katharina Schleidt, Stefania Morrone, Stefan
Affiliations: DataCove E.u., Epsilon Italia, Stiftelsen NILU
As Copernicus matures and ever more satellites are providing a wealth of data, we are also seeing an increase in the diverse data products being generated from the raw satellite data. The Copernicus Services enable access to ever more data products, spread across the six Copernicus Services. As these products are also gridded data, just like the raw source data, the same mechanisms are utilized for data discovery, metadata provision and data access. However, due to the different nature of the data being provided as derived products vs. the original raw source data, this leads to various issues in identifying and accessing relevant datasets. In this paper we will take STAC, the SpatioTemporal Asset Catalog, as an example, as we utilized this technology in the FAIRiCUBE Project. Further, we will focus on the challenge of finding data pertaining to a specific observable property, e.g. Surface Soil Moisture, Average Precipitation or Imperviousness. In the STAC Common Metadata, in addition to basic metadata such as title, provider or license, as one would expect to find as common metadata, we find structures for the description of instruments and the bands they deliver, all tailored towards satellite or drone data. Thus, in order to correctly describe a derived data product, one must look to the STAC extensions. Here, the same pattern becomes apparent, with the datacube extension only providing an informal textual description of the properties being conveyed. Finding data products on specific observable properties of interest remains a painstaking task. This gap in relevant metadata stems from two different communities encountering each other. The satellite community is relatively small, the types of raw data provided fairly constrained and well known within the community. In contrast, the terrestrial environmental science communities have long dealt with the challenge of multitudes of observable properties. This gap in relevant metadata for the description of derived products applies across technologies for gridded data, as most stem from the satellite domain and are only slowly being tailored for use with terrestrial products. In the provision of terrestrial geospatial data products, both conceptual models and vocabularies/ontologies have been utilized to better describe WHAT is actually being conveyed by the data. The ISO/OGC Observations, Measurements and Samples (ISO 19156) standard is comprised of a conceptual model providing guidance on provision of observational metadata. State of the art for indication of what data is being provided has long been references to common vocabularies or ontologies, providing the necessary concepts under stable URIs. In recent years these resources have become enriched with deeper semantics, e.g. the I-ADOPT framework for the FAIR representation of observable variables, enabling powerful search options. In order to fully reap the benefits of the increasing number of derived data products, the metadata systems used to describe them will have to evolve together with the types of derived data being made available. The EO community can gain valuable insights as to how to best describe EO derived products by taking concepts from terrestrial geospatial data on board.
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: GeoHEIF - Organizing geospatial images into data cubes inside a HEIF file format.

#cog

Authors: Joan Maso, Nuria Julia, Dirk Farin, Brad Hards, Martin Desruisseaux, Jérôme St-Louis, Alba Brobia
Affiliations: CREAF (Grumets), Imagemeter, Silvereye, Geomatys, Ecere
The Open Geospatial Consortium (OGC) GeoTIFF standard included the concept of georeference in the popular TIFF format. That was possible due to an extendable structure defined in the original format that is based on an Image File Directory (IFD). The IFD structure allows for multiple images in the same file. While this has been widely used to store multiband imagery in a single file (e.g. Landsat channels) or to include a multiresolution image (as it is done in the Cloud Optimized GeoTIFF; COG), there is no standard way to organize several images to create a datacube. The High Efficiency Image File Format (HEIF) is a digital container format for storing images and image sequences, developed by the Moving Picture Experts Group (MPEG) and standardized in 2015. A single HEIF file can contain multiple images, sequences, or even video and incorporate the latest advances in image compression. The structure of the content in boxes provides an extension mechanism comparable to the TIFF file. There is no standard mechanism to include georeference information in a HEIF file yet and the support of HEIF files in current GIS software is limited due to the format relatively recent introduction. This creates an opportunity for the OGC to define an extension for HEIF that describes how to include the georeference information in the file, taking advantage of the experience from the GeoTIFF but also considering the recent progress in the definition of multidimensional datacubes. The whole idea is to specify a multidimensional HEIF file, based on the aggregation of georeferenced 2D images that can optionally be structured in tiles and supporting multiresolution (a pyramidal structure, called “overviews” in COG). The fundamental datacube structure is already described in the ISO19123 that defines conceptual schema for coverages that separates the concept of domain and range. The domain consists of a collection of direct positions in a coordinate space, which can include spatial, temporal, and non-spatiotemporal (parametric) dimensions. The domain is structured in a number of axes that are also called dimensions. All the intersections of the different dimensions of the coverage can be seen as a hypercube or a hyper-grid. For each intersection of the direct positions of the dimensions we can associate one or more property values (called rangesets in the OGC Coverage Implementation Schema) populating the datacube. In its implementation in the HEIF file, we propose that the datacube can be decomposed in 2D planes (a.k.a. images) that are georeferenced using a Coordinate Reference System CRS and an affine transformation matrix (that in many cases will be ”diagonal” and define only a linear scaling and a translation of the image model into the CRS model). Each plane has a fixed “position” in the other N-2 dimensions (a.k.a. extra dimensions) forming a multidimensional stack of planes. The images will contain the values of a single property in the datacube. The HEIF file has an internal structure of property boxes (that provides a similar extensibility mechanism as the IFD structure in TIFF). The proposal described in this communication is to define property boxes for describing CRSs, extra dimensions, fixed positions of the extra dimensions and property types (coverage range types). In HEIF, each property box has a unique identifier that can be associated with HEIF entities. 
Since an image is an entity, each georeferenced image can be associated to the necessary property boxes to define its “position” in the datacube “stack” and the meaning of the values of its pixels (property types). It is worth noting that the 2D CRS dimensions, the extra dimensions and the property types are defined as URI that points to a semantic definition of the axes. The 2D CRS dimension points to a CRS vocabulary (commonly describing the EPSG codes) and the extra dimensions and the property types point to a concept in a variable vocabulary (such as QUDT) and to unit of measure vocabulary (commonly a UCUM ontology). Once consolidated in HEIF, this approach can be applied also to a new version of the GeoTIFF standard. This talk will present the current status of the OGC GeoHEIF standard as advanced in the OGC Testbed-20, and the OGC GeoTIFF Standard Working Group.
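The georeferencing model described above can be illustrated with a small sketch (generic affine mapping, not part of the GeoHEIF specification text; values are placeholders):

```python
# Map pixel indices of one 2D plane to CRS coordinates with an affine
# transformation. In the common "diagonal" case this reduces to a scale
# plus a translation.
import numpy as np

# Affine matrix [[a, b, tx], [c, d, ty]]; b = c = 0 in the diagonal case.
affine = np.array([[10.0, 0.0, 500000.0],    # 10 m pixel size, easting offset
                   [0.0, -10.0, 5200000.0]])  # negative: row index grows southwards

def pixel_to_crs(row: int, col: int) -> tuple[float, float]:
    """Map a (row, col) pixel index to (x, y) in the plane's CRS."""
    x, y = affine @ np.array([col, row, 1.0])
    return float(x), float(y)

print(pixel_to_crs(0, 0))        # upper-left corner of the plane
print(pixel_to_crs(1024, 512))   # some pixel inside the plane
```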
Add to Google Calendar

Friday 27 June 13:00 - 14:30 (X5 - Poster Area)

Poster: Cloud-Optimized Geospatial Formats Guide

#cloud-native #zarr

Authors: Emmanuel Mathot, Aimee Barciauskas, Alex Mandel, Kyle Barron, Zac Deziel, Vincent Sarago, Chris Holmes, Matthew Hanson, Ryan Abernathey
Affiliations: Development Seed, Radiant Earth, Element84, Earthmover
Geospatial data is experiencing exponential growth in both size and complexity. As a result, traditional data access methods, such as file downloads, have become increasingly impractical for achieving scientific objectives. With the limitations of these older methods becoming more apparent, cloud-optimized geospatial formats present a much-needed solution. Cloud optimization enables efficient, on-the-fly access to geospatial data, offering several advantages: - Reduced Latency: Subsets of the raw data can be fetched and processed much faster than downloading files. - Scalability: Cloud-optimized formats are usually stored on cloud object storage, which is infinitely scalable. When combined with metadata about where different data bits are stored, object storage supports many parallel read requests, making it easier to work with large datasets. - Flexibility: Cloud-optimized formats allow for high levels of customization, enabling users to tailor data access to their specific needs. Additionally, advanced query capabilities allow users to perform complex operations on the data without downloading and processing entire datasets. - Cost-Effectiveness: Reduced data transfer and storage needs can lower costs. Many of these formats offer compression options, which reduce storage costs. Providing subsetting as a service is feasible, but it entails ongoing server maintenance and introduces extra network latency when accessing data. This is because data must first be sent to the server running the subsetting service before reaching the end user. However, with the use of cloud-optimized formats and the right libraries, users can directly access data subsets from their own machines, eliminating the need for an additional server. When designing cloud-optimized data formats, it's essential to acknowledge that users will typically access data over a network. Traditional geospatial formats are often optimized for on-disk access and utilize small internal chunks. However, in a network environment, latency becomes a significant factor, making it crucial to consider the potential number of requests that may be generated during data access. This understanding can help improve the efficiency and performance of cloud-based data retrieval. The authors have contributed to libraries for manipulating and storing geospatial data in the cloud. They authored a guide (https://guide.cloudnativegeo.org/) designed to help understand the best practices and tools available for cloud-optimized geospatial formats. We hope that readers will be able to reuse lessons learned and recommendations to deliver their cloud native data to users in applications and web browsers and contribute to the wider adoption of this format for large scale environmental data understanding. Keywords: Cloud-Native Raster, Geospatial Data, Cloud Optimized GeoTIFF, Zarr, TileDB, Satellite Imagery, Remote Sensing, Data Processing, Cloud Computing
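A minimal example of the cloud-optimized access pattern discussed in the guide (the URL is a placeholder): reading only a small window of a Cloud Optimized GeoTIFF over HTTP lets GDAL issue range requests instead of downloading the whole file.

```python
# Windowed read of a remote COG; only the tiles covering the window are fetched.
import rasterio
from rasterio.windows import Window

with rasterio.open("https://example.com/data/scene_cog.tif") as src:
    subset = src.read(1, window=Window(col_off=2048, row_off=2048, width=512, height=512))
    print(subset.shape, subset.dtype)
```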
Add to Google Calendar

Friday 27 June 10:52 - 11:12 (EO Arena)

Demo: D.01.19 DEMO - EDEN service in the platform: Interacting with DestinE Data Portfolio

#cloud-native

The EDEN service demonstration will provide an in-depth look at the Destination Earth (DestinE) data portfolio, including Digital Twins, Copernicus data and services, as well as many other datasets made available through federated data sources. The session will focus on showcasing both the platform’s human-friendly and machine-to-machine interfaces, designed to support technical and non-technical users in the EO community.
The demonstration will showcase selected case studies on air quality monitoring and forecasting for the analysis of natural phenomena and human activities from satellite and model-based data, illustrating the benefit of Analysis-Ready data for the development of cloud web-based services. Participants will gain a practical understanding of how the platform provides native and cloud-native data.
Demo Session Structure:
- Platform overview (5 min):
- An introduction to the EDEN service and its core functionalities: Finder, Harmonised Data Access API.
- Data Portfolio

- Case Studies (15 min):
- Dust events, whose frequency is increasing due to changing atmospheric conditions, transport fine particles over long distances, with severe consequences on air quality and visibility across Europe.
- Wildfires, boosted by rising temperatures and prolonged droughts, release massive amounts of pollutants, further degrading air quality
- Case study execution through JupyterLab

- Q&A Session: Open discussion to address participant questions.

We encourage all LPS participants to register and create an account on the DestinE Platform (https://platform.destine.eu/) and read more about the EDEN service and its features:
https://platform.destine.eu/services/service/eden/
https://platform.destine.eu/services/documents-and-api/doc/?service_name=eden

Speakers:
  • Simone Mantovani - MEEO
  • Alessia Cattozzo - MEEO
  • Federico Cappelletti - MEEO
Add to Google Calendar

Friday 27 June 13:07 - 13:27 (EO Arena)

Demo: D.04.21 DEMO - Empowering EO Projects with Cloud-Based Working Environments in APEx

#cloud-native #stac

The APEx Project Tools provide EO projects with ready-to-use, cloud-native, configurable working environments. By leveraging a comprehensive suite of pre-configured tools—including a project website, JupyterHub environment, STAC catalogue, visualization tools and more—projects can quickly establish their own collaborative environment without the complexity of managing its infrastructure. By providing robust, scalable, and user-friendly environments, the APEx Project Tools foster greater collaboration and support the accessibility and reuse of project outcomes within the EO community.

This demonstration will showcase how APEx enables seamless access to flexible and scalable working environments that can be tailored to a project’s needs. Participants will be guided through the key project tools and their capabilities, illustrating how they can support activities such as data processing, visualization, and stakeholder engagement. The session will provide insights into the different instantiation options available, from project-specific portals to interactive development environments and geospatial analysis tools. By highlighting the ease of integration between these components, the session will demonstrate how APEx facilitates the rapid deployment of tailored project environments that align with project objectives.

By attending this session, EO project teams will gain a deeper understanding of how APEx streamlines the deployment of cloud-based tools, reducing technical barriers and allowing researchers to focus on scientific innovation. With APEx handling the infrastructure, teams can dedicate more time to developing and sharing impactful EO solutions, ensuring broader adoption and engagement within the community.

Speakers:


  • Bram Janssen - VITO
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall F2)

Presentation: FORDEAD 2.0: Monitoring forest diseases with Sentinel-2 time series using cloud-based solutions

#stac

Authors: Jean-Baptiste Feret, Dr Florian de Boissieu, Remi Cresson, Elodie Fernandez, Kenji Ose
Affiliations: TETIS, INRAE, AgroParisTech, CIRAD, CNRS, Université Montpellier, European Commission, Joint Research Centre, Ispra, Italy
Climate change is leading to an increase in severe and sustained droughts, which in turn contributes to increasing the vulnerability of forest ecosystems to pests, particularly in regions that usually experience high water availability. As a consequence, bark beetle outbreaks occurred at unprecedented levels over the past decade in Western Europe, resulting in high spruce tree mortality. Forest dieback caused by bark beetle infestations poses significant challenges for monitoring and management. To address this issue, there is an urgent need for operational monitoring tools allowing large-scale detection of bark beetle outbreaks to better understand outbreak dynamics, quantify the surfaces and volumes impacted, and help forestry stakeholders in decision making. Ideally, such tools would incorporate early warning systems. The complexity of bark beetle dynamics, coupled with the spatial and temporal variability of forest dieback, necessitates advanced monitoring solutions that leverage remote sensing technologies. Remotely-sensed detection of bark beetle infestation relies on detecting the symptoms expressed by trees in response to attack. Infested trees experience different stages, starting with the early ‘green-attack stage’, mainly characterized from the ground by visual identification of the boring holes in the bark, with no change in the color of the foliage. The following ‘yellow-attack stage’ and ‘red-attack stage’ present changes in foliage color induced by changes in pigment content. The ultimate ‘grey-attack stage’ appears with foliage loss. Early detection is crucial for pest management, but the remotely sensed identification of the green-attack stage is challenging due to the lack of visible symptoms. However, moderate changes in foliage water content occur during this stage, which can be detected in the near-infrared and shortwave infrared domains. Satellite imagery acquired by optical multispectral missions such as Sentinel-2 may then provide relevant information to identify such tenuous changes early. FORDEAD (FORest Dieback And Degradation) is a method designed to identify forest anomalies using satellite image time series. Developed in response to the bark beetle outbreaks that have occurred in France since 2018, FORDEAD aims above all to provide an operational monitoring system. In this context, FORDEAD analyses the seasonality of a spectral index sensitive to vegetation water content: the Continuum Removal in the ShortWave InfraRed (CR-SWIR). Since 2020, FORDEAD has been applied to produce quarterly maps of bark beetle outbreaks covering about 25% of the French mainland territory. The results are promising regarding early detection capabilities. Indeed, the success rate reaches 70% in detecting the ‘green-attack stage’, with a low false positive rate. The method has been implemented in a Python package in order to ease the transfer to the national forestry services and forest management services, and more broadly to all potentially interested users. Beyond the method, storage and computing resources are crucial to access and process satellite images, in particular high-resolution time series such as Sentinel-2. The emerging cloud-based solutions provide scalable computing resources that enable near-real-time analysis, integrating diverse datasets. The combination of appropriate storage solutions, cloud-optimized data formats and processing standards is now reaching maturity for large-scale geospatial processing in a free and open-source environment.
The synergy between remote sensing and cloud-based platforms presents opportunities for forest monitoring, enabling improved detection, prediction, and response strategies. A new version of FORDEAD has been developed and optimized to take advantage of cloud-based solutions, including seamless access to Spatio Temporal Assets Catalogs (STAC). This enhanced version provides a versatile toolkit dedicated to multiple usages from pixel/plot scale analysis for calibration and validation with field data, to large-scale monitoring over national extent. FORDEAD has been integrated into the cloud infrastructure of the THEIA-Land data center. This French service aims at producing and distributing higher level remote sensing data products, providing technical support and access to dedicated methods for both remote sensing experts and user community. FORDEAD is also compatible with the European infrastructures currently being created, such as the Copernicus Data Space Ecosystem (CDSE). This contribution aims at supporting forest management practices and long-term ecological studies to help understand and predict the spatial and temporal dynamics of bark beetle outbreaks, and to mitigate their impact. We strongly argue for more systematic integration of free and open standards in remote sensing data analysis frameworks, including improved geospatial and in situ data interoperability, modular software design for an easier integration of alternative methods in the multiple stages of an algorithm. By promoting these standards, we foster a more collaborative approach, ensure accessibility for a diverse range of stakeholders and thus meet the needs of forest sector and public policies.
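As an illustration of the CR-SWIR index central to this method, the sketch below shows one common formulation of continuum removal in the SWIR from Sentinel-2 surface reflectance; the band centre wavelengths and this exact formulation are assumptions for illustration, not necessarily the FORDEAD implementation.

```python
# Continuum-removed SWIR index: B11 divided by the continuum value obtained
# by linear interpolation between B8A and B12 at B11's wavelength.
import numpy as np

WL_B8A, WL_B11, WL_B12 = 865.0, 1610.0, 2190.0  # nm, approximate band centres

def cr_swir(b8a: np.ndarray, b11: np.ndarray, b12: np.ndarray) -> np.ndarray:
    continuum_at_b11 = b8a + (b12 - b8a) * (WL_B11 - WL_B8A) / (WL_B12 - WL_B8A)
    return b11 / continuum_at_b11

# A drop in vegetation water content raises SWIR reflectance relative to the
# continuum, so anomalies in the CR-SWIR seasonality can flag early stress.
```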
Add to Google Calendar

Friday 27 June 14:30 - 16:00 (Hall F2)

Presentation: DETER-RT: An improved, highly customizable SAR-based deforestation detection system for the Brazilian Amazon

#pangeo

Authors: Juan Doblas, Mariane Reis, Stéphane Mermoz, Claudio A. Almeida, Thierry Koleck, Cassiano Messias, Luciana Soler, Alexandre Bouvet, Sidnei Sant'Anna
Affiliations: Globeo, INPE, CNES, IRD
Introduction The last decade has seen a rapid development of automated near-real-time deforestation detection (NRT-DD) systems over tropical forests. This progress has been driven by the growing availability of orbital images, particularly those from the Landsat and Copernicus initiatives, including both optical and SAR-based datasets. The need for effective monitoring of tropical forest disturbances has grown as these forests play a crucial role in climate regulation and biodiversity conservation. Most current NRT-DD systems operational over tropical regions rely on a fixed set of parameters, which are either part of a trained machine learning model (such as GLAD-L [1]) or a set of physics-based rules calibrated by expert knowledge applied over processed time-series (e.g., RADD [2] or TropiSCO [3]). These parameters, once determined, apply globally, governing the detection algorithm in a uniform manner. While this approach facilitates ease of implementation and has yielded vast amounts of valuable data on the state and evolution of tropical forests [4], several limitations exist when applied in practice by local agents. Specific challenges include: - Lack of Flexibility: Fixed global parameters cannot adequately account for the diverse forest types and deforestation patterns across different tropical regions. - Customization Limitations: Existing systems do not allow users to adjust algorithm parameters to meet specific needs, such as reducing commission errors. This might be the case for deforestation-deterring field teams, which need almost absolute certainty before engaging a field action over a given warning. - Access Issues: Products from most NRT-DD systems are available only as raster images, which can be difficult or impossible to download. This presents a significant challenge for local authorities or communities with limited resources. System Overview Here, we introduce DETER-R-TropiSCO (DETER-RT), a new SAR-based NRT-DD system resulting from collaboration between the scientific teams of the Brazilian National Institute for Space Research (INPE) and the French National Center for Space Studies (CNES), facilitated by GlobEO researchers. DETER-RT is a hybrid system that analyses Sentinel-1 data using features from two existing projects— DETER-R [5] and TropiSCO —and is developed using open-source code, leveraging the PANGEO paradigm [6] and CNES HPC computing capabilities. The collaborative nature of this project harnesses CNES’s computational resources alongside INPE’s deep understanding of regional forest dynamics and GlobEO’s operational expertise. DETER-RT takes an innovative approach to deforestation monitoring by allowing users to fine-tune detection algorithms to suit their specific needs. This system also integrates advanced knowledge-based routines absent in previous models, including: - Sensitivity maps: These maps allow users to vary detection parameters across different regions, enhancing spatial adaptability. - Proximity Sensitivity Modulation: The system modulates detection sensitivity based on the distance to previously recorded deforestation, which helps to adjust alerts more precisely. The function modeling this spatial dependence was derived from a statistical analysis of the reference deforestation data, following the methodology proposed in [5]. - Morphological Post-Processing: Post-treatment routines help refine the shape and characteristics of detected anomalies to improve data quality. 
- Vectorized Deforestation Warnings: An extension of the system enables the export of vectorized deforestation warnings, which can facilitate easier integration into geographic information systems (GIS) and practical use by stakeholders. Tuning and Validation The DETER-RT system underwent an extensive calibration and validation process. For calibration, reference data from INPE’s PRODES and Mapbiomas Alertas [7] projects were utilized. The initial calibration step involved setting up alpha maps for the different ecoregions within the Amazon biome to account for regional variation. Other key parameters, including those affecting post-processing routines, were also adjusted to align with INPE’s specific requirements. For the validation procedure several automated and expert-guided procedures are being used to assess both omission and commission errors across multiple parameters sets and to evaluate the timeliness of the alert system compared to existing NRT-DD systems. The system has demonstrated a strong performance, featuring a relatively low omission rate (~20%) and significantly reduced commission errors—an advancement over the current state-of-the-art in the region. At the time of the writing, the validation tasks are mostly finished. Operationalization For operational purposes, the DETER-RT system is divided into two sub-systems: - Change Ratio Image Computation: This computationally intensive task runs daily using CNES infrastructure, processing Sentinel-1 data to generate change ratio images. The output is then uploaded to INPE’s network. - Anomaly Extraction and Warning Generation: The second subsystem operates within INPE's network, analyzing the ratio images to extract anomalies. These anomalies are vectorized and issued as deforestation warnings, allowing users to have full control over detection parameters and the ability to adjust them as needed. Currently, the system is operational across the entire Amazon Basin, having processed over 200,000 Sentinel-1 images. Beyond scientific innovation, DETER-RT serves as a showcase of successful international collaboration, demonstrating how coordinated efforts between different research institutions can help address specific, urgent environmental challenges. Conclusion DETER-RT stands as a significant advancement in deforestation monitoring by providing a highly adaptable SAR-based detection system capable of accommodating regional specificity. The ability to customize parameters, the use of advanced spatial analysis techniques, and the export of vectorized warnings collectively address many of the practical challenges faced by local communities and environmental authorities. The system's success showcases the power of international partnerships and innovative technology in providing real-world solutions to pressing environmental issues, contributing not only to scientific knowledge but also to more effective governance of natural resources. References [1] M. Hansen et al., "Humid Tropical Forest Disturbance Alerts Using Landsat Data," Environmental Research Letters, 2016. [2] J. Reiche et al., "Forest disturbance alerts for the Congo Basin using Sentinel-1," Environmental Research Letters, 2021. [3] S. Mermoz et al., "Continuous Detection of Forest Loss in Vietnam, Laos, and Cambodia Using Sentinel-1 Data," Remote Sensing, vol. 13, 2021. [4] WRI, Global Forest Review, 2024, update 8. Washington, DC: World Resources Institute. Available online at https://research.wri.org/gfr/global-forest-review. [5] J. 
Doblas et al., "DETER-R: An Operational Near-Real Time Tropical Forest Disturbance Warning System Based on Sentinel-1 Time Series Analysis," Remote Sensing, vol. 14, 2022. [6] R. Abernathey, et al. (2017): Pangeo NSF Earthcube Proposal. [7] MapBiomas, Alert Project - Validation and Refinement System for Deforestation Alerts with High-Resolution Images, accessed in 2024.
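A conceptual sketch of two ideas described in this abstract, the change-ratio image against the previous year's median backscatter and a detection threshold modulated by distance to earlier deforestation, is shown below. This is an illustration only, not the operational DETER-RT algorithm; all thresholds and distances are placeholders.

```python
# Distance-modulated thresholding of a Sentinel-1 change-ratio image.
import numpy as np

def deforestation_warnings(
    current_db: np.ndarray,                 # current backscatter scene (dB)
    previous_year_db: np.ndarray,           # stack (time, y, x) of last year's scenes (dB)
    distance_to_deforestation: np.ndarray,  # metres to nearest past warning
) -> np.ndarray:
    # Change ratio: difference to the per-pixel median of the previous year.
    change = current_db - np.median(previous_year_db, axis=0)

    # Sensitivity modulation: pixels close to known deforestation use a more
    # permissive threshold than pixels deep inside intact forest.
    base_threshold = -3.0     # dB drop required far from past clearings (placeholder)
    relaxed_threshold = -2.0  # dB drop required next to past clearings (placeholder)
    weight = np.clip(distance_to_deforestation / 5000.0, 0.0, 1.0)
    threshold = relaxed_threshold * (1.0 - weight) + base_threshold * weight

    return change < threshold  # boolean warning mask
```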
Add to Google Calendar

Friday 27 June 14:30 - 16:00 (Hall F2)

Presentation: Global Mangrove Watch (GMW) Radar Alerts for Mangrove Monitoring (RAMM) - a cloud-based deep learning system to detect mangrove loss

#stac

Authors: Benjamin Smith, Dr. Pete Bunting, Dr Victor Tang, Lammert Hilarides, PhD Andy Dean, Dr Frank Martin
Affiliations: Hatfield Consultants, Aberystwyth University, Wetlands International, European Space Agency
Global Mangrove Watch (GMW) publishes annual global data on the extent of mangrove forests and, since 2019, a mangrove deforestation alert system using Copernicus Sentinel-2 imagery (Bunting et al. 2023). The optical image alert system is limited by cloud cover, often delaying the detection of mangrove loss and diminishing its impact for mangrove conservation. Sentinel-1 Synthetic Aperture Radar (SAR) penetrates cloud cover and offers the potential to provide consistent monthly alerts. However, in coastal regions analytical methods must address the complex interaction of the SAR signal with mangrove canopy and water. To meet GMW’s goal of monthly alerts, under ESA OpenEO funding, we prototyped the Radar Alerts for Mangrove Monitoring (RAMM) system as an event-driven, scalable system to process, detect, and validate alerts using SAR data. Deployed in the CREODIAS cloud environment on a Kubernetes cluster, all worker processes were able to access Earth Observation (EO) data through the EODATA repository with appropriate S3 credentials. Data discovery and ingestion are enabled by SpatioTemporal Asset Catalog (STAC) queries. Deploying on Kubernetes permits the dynamic scaling of worker processes from zero to a set maximum when scheduled or intermittent jobs are submitted; this reduces the compute footprint and the financial and environmental costs of operation. All outputs are stored in Binary Large OBject (BLOB) data storage. RAMM is implemented over the GMW baseline extent (currently 2020) in a two-stage, per-pixel approach designed to maximize efficiency. The first stage is a simple rule-based approach based on thresholding of the backscatter value and its difference from the previous year’s median backscatter value, which reduces the effects of seasonal and tidal variability and radar speckle. The thresholds are designed to minimize false negatives (i.e. capture all possible mangrove loss). The second stage is a 1-Dimensional Convolutional Neural Network (1D-CNN) that is trained on a dataset of confirmed alerts and false positives (both produced by GMW). The integration of false positives into the training dataset encourages the model to recognise and flag falsely identified alerts provided by the rule-based first stage, as deep learning algorithms are known to extract fine-grained patterns between domain distributions. The RAMM system is triggered by an HTTP event to an Application Programming Interface (API) server, which then initializes a job to populate a Message Queue (MQ) with tasks (messages that are consumed one at a time by worker processes, with message buffering on the workers prevented, until the queue is completed). Leveraging a MQ enables jobs to be consumed concurrently across a set of workers. If any single task errors or is slow to complete, the queue may be consumed by other available workers. As the first-stage workers consume the MQ populated by the coordination job, subsequent messages are pushed to a second MQ, which triggers the second stage to concurrently scale up and begin processing. The first and second stages upload intermediate (for debugging) and final outputs to the dedicated BLOB storage. The first stage takes in a task, retrieves the provided GMW mangrove extent disseminated by the initialisation coordination job, ingests the VH-polarisation SAR data for the monthly assessment, and performs the simple rule-based approach. The identified signal values are written to a binary mask GeoTIFF and pushed to BLOB storage. 
The path of this data is then delivered to the second-stage MQ. As the second-stage MQ is populated, worker processes begin to scale up in response to the message events. The second stage ingests the first-stage binary mask of identified signals and, for each of the geolocated pixels, performs a temporal validation with a 1D-CNN using the previous 7 months of sequential acquisitions; given that Sentinel-1 has a revisit cadence of roughly four to five days, this equates to about 40 acquisitions. Upon completion of validation, the second stage determines whether the first-stage alert is a true or false positive. The locations identified in stages 1 and 2 are encoded into a GeoJSON Feature Collection and stored in the BLOB storage. Demonstration sites in Guinea-Bissau and North Kalimantan, Indonesia, were selected for prototyping and serve as the benchmark to scale analyses to the global mangrove coverage. An assessment of the separability between true- and false-alert signals provided evidence for the deep learning model’s potential efficacy. The model, trained on the described dataset, iterated for 200 epochs using an initial learning rate of 1e-4, a loss defined by binary cross-entropy, and learning-rate scheduling by the Adam optimizer. The final layer employed a sigmoid activation function to output a probability in [0, 1], satisfying the binary labelling desired. The model achieved a test accuracy of 72%. RAMM was implemented on scalable infrastructure with minimal overhead required to keep the basics operating, the deep learning model was trained using only CPUs, and all training leveraged data situated close to the compute so as to minimise network usage; all of this minimises the environmental impact of operating on cloud systems in data centres. Due to the dynamic nature of workload handling, RAMM can process the two demonstration sites in 10 minutes or less, with linear growth expected as it is applied to the wider global coverage of mangroves. RAMM has great potential to contribute to mangrove conservation as a complement to the annual GMW optical mangrove extent data and the optical-based alerts system.
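A minimal sketch of the second-stage validation model, based only on the training details stated in this abstract (about 40 acquisitions in, a sigmoid true/false-alert probability out, binary cross-entropy, Adam at 1e-4); the number of layers, filters and kernel sizes are assumptions.

```python
# Hypothetical 1D-CNN alert-validation model (architecture details assumed).
import tensorflow as tf

def build_model(n_acquisitions: int = 40, n_features: int = 1) -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_acquisitions, n_features)),
        tf.keras.layers.Conv1D(32, kernel_size=5, activation="relu"),
        tf.keras.layers.Conv1D(64, kernel_size=3, activation="relu"),
        tf.keras.layers.GlobalMaxPooling1D(),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(true alert)
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```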
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall K2)

Session: D.04.06 Advancements in cloud-native formats and APIs for efficient management and processing of Earth Observation data

#zarr #stac #parquet #cloud-native #cog

Earth Observation (EO) data continues to grow in volume and complexity as the next generation of satellite instruments is being developed. Furthermore, novel advanced simulation models such as the Digital Twins (DTs) deployed in the scope of the Destination Earth (DestinE) project generate immense amounts of multidimensional data (a few PB/day in total) thanks to High Performance Computing (HPC) technology. Cataloguing, processing and disseminating such a broad variety of data sets is a huge challenge that has to be tackled in order to unleash the full potential of EO. Storage and analytics of vast volumes of data have been moved from on-premise IT infrastructure to large cloud computing environments such as the Copernicus Data Space Ecosystem (CDSE), DestinE Core Service Platform (DESP), Google Earth Engine or Microsoft Planetary Computer. In this respect, robust multidimensional data access interfaces leveraging the latest cloud-native data formats (e.g. COG, Zarr, GeoParquet, vector tiles) and compression algorithms (e.g. ZSTD) are indispensable to enable advanced cloud-native APIs (e.g. openEO, Sentinel Hub) and data streaming (e.g. EarthStreamer). Moreover, metadata models have to be standardized and unified (e.g. the STAC catalogue specification) among different data archives to allow interoperability and fast federation of various data sources. This session aims at presenting the latest advancements in data formats, data compression algorithms, data cataloguing and novel APIs to foster EO analytics in cloud computing environments.

Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall K2)

Presentation: Embracing Diversity in Earth Observation with HIGHWAY

#stac

Authors: Luca Girardo, Mr. Simone Mantovani, Henry de Waziers, Mr. Giovanni Corato
Affiliations: ESA ESRIN, adwäisEO, MEEO S.r.l.
The European Space Agency (ESA) boasts a wide array of Earth Observation (EO) missions, including Earth Explorer, Heritage and Third-Party Missions. Each of these missions provides valuable datasets enabling researchers, decision-makers, and scientists to gain deeper insights into the planet's systems. Among these missions, SMOS (Soil Moisture and Ocean Salinity), CryoSat, Proba-V, Swarm, Aeolus, and EarthCARE collectively collect different types of information, exploiting a wide range of sensors such as radiometers, optical and multispectral imagers, radars and altimeters. This richness of sensors allows the generation of a wide spectrum of data products for the monitoring of different physical variables at different spatial and temporal scales. This diversity is essential for capturing the multifaceted dynamics of Earth’s systems, but it also presents challenges in terms of data accessibility, integration, and usability. Indeed, these datasets vary in format, content, and data types. To address this complexity, HIGHWAY offers an innovative solution that bridges the gap between the heterogeneous nature of EO data and the seamless usability required by end users. HIGHWAY provides unicity, adopting the Earth Observation Processing Framework (EOPF) data model as a unified approach to guarantee an adequate level of data harmonisation, and providing OGC standard services to discover, view and access these diverse datasets while preserving their unique attributes. This capability is driven by several key features: 1. Digital Twin Analysis Ready Cloud Optimized (DT-ARCO) files: HIGHWAY transforms disparate datasets into standardized, cloud-optimized formats designed for analysis readiness. These files are tailored to meet the rigorous demands of modern data analysis workflows, particularly for Digital Twin engines that require high-quality, pre-processed inputs for training and predictions. 2. Unique and seamless endpoint for users: HIGHWAY simplifies access to data by consolidating multiple data sources into a single, intuitive interface. Users can explore and retrieve datasets without needing to navigate the complexity of individual mission archives or disparate data formats. 3. Advanced cataloguing standards: HIGHWAY incorporates state-of-the-art cataloguing protocols, including OpenSearch with Geo and Time Extensions, STAC (SpatioTemporal Asset Catalogue), WMS (Web Map Service), and WCS (Web Coverage Service). These standards enable efficient querying, visualization, and retrieval of spatial-temporal data, enhancing the user experience and supporting diverse application requirements. 4. Native and cloud-optimized data access: HIGHWAY ensures that data is accessible in both its native and cloud-optimized formats to meet the different needs of researchers and digital twins. In particular, ARCO data can unlock the potential of large cloud or HPC processing systems. One of HIGHWAY’s standout features is its ability to retain the specificity of each dataset while integrating them into a unified system. This is particularly critical for the development and deployment of Digital Twin engines, which rely on the precise characteristics of EO data to produce accurate predictions and insights. By maintaining the integrity of the original datasets, HIGHWAY ensures that these advanced analytical models can fully leverage the richness and diversity of ESA’s EO products. 
The success of HIGHWAY is underpinned by a robust, high-performance infrastructure that seamlessly combines on-premises, cloud, and HPC (High-Performance Computing) environments. This infrastructure is designed to accommodate the growing demands of EO data users, supporting advanced workflows such as large-scale data analysis, real-time processing, and machine learning. HIGHWAY is also future-ready, incorporating data caching strategies that prioritize importance and relevancy. This ensures efficient data retrieval, reducing latency and enabling faster decision-making for time-sensitive applications. HIGHWAY’s transformative approach to EO data management and access positions it as a critical enabler for the next generation of Earth science applications. By addressing the challenges of data diversity and accessibility, HIGHWAY unlocks the full potential of ESA’s EO missions, empowering users with the tools and resources needed to tackle complex environmental challenges. As the demand for actionable insights from EO data continues to grow, HIGHWAY is ready to evolve, introducing new capabilities that anticipate and meet the requirements of future users. In summary, HIGHWAY embodies the principles of innovation, integration, and inclusivity, turning the challenges of data diversity into opportunities for advancement. By providing a unique and seamless endpoint, leveraging state-of-the-art interoperable standards, and ensuring analysis-ready data, HIGHWAY not only simplifies access to EO data but also enhances its usability for cutting-edge applications. With its high-performance infrastructure and future-oriented design, HIGHWAY stands as a cornerstone for advancing Earth science and fostering sustainable solutions for a changing planet.
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall K2)

Presentation: Key Innovations, Challenges, and Open-Source Solutions in Building the Copernicus Data Space Ecosystem STAC Catalog

#stac

Authors: Marcin Niemyjski
Affiliations: CloudFerro
Spatio-temporal Asset Catalog (STAC) has gained significant recognition in both public and commercial sectors. This community-developed standard is widely used to access open data and by commercial providers as an interface for accessing paid resources, such as data from private constellations. However, existing implementations of the standard require optimization for handling datasets typical of big data scales. The Copernicus Program is the largest and most successful public space program globally. It provides continuous data across various spectral ranges, with an archive exceeding 84 petabytes and a daily growth of approximately 20TB, both of which are expected to increase further. The openness of its data has contributed to the widespread use of Earth observation and the development of commercial products utilizing open data in Europe and worldwide. The entire archive, along with cloud-based data processing capabilities, is available free of charge through the Copernicus Data Space Ecosystem initiative. This paper presents the process of creating the STAC Copernicus Data Space Ecosystem catalog—the largest and most comprehensive STAC catalog in terms of metadata globally. It details the process from developing a metadata model for Sentinel data, through efficient indexing based on the original metadata files accompanying the products, to result validation and backend system ingestion. A particular highlight is that this entire process is executed using a single tool, eometadatatool, initially developed by DLR, further enhanced, and released as open-source software by the CloudFerro team. Eometadatatool facilitates metadata extraction from the original files accompanying Copernicus program products and others (e.g., Landsat, Copernicus Contributing Missions) based on a CSV file containing the metadata name, the name of the file in which it occurs, and the path to the key within the file. By default, the tool supports product access via S3 resources, configurable through environment variables. The CDSE repository operates as an S3 resource, offering users free access. The development process contributed to the evolution of the standard by introducing version 1.1 and new extensions (storage, eo, proj) that better meet user needs. The paper discusses the most significant modifications and their impact on the catalog’s functionality. Particular attention is devoted to performance optimization due to the substantial data volume and high update frequency. The study analyzes the configuration and performance testing (using Locust) of the frontend layer (stac-fastapi-pgstac) and backend (pgstac). Stac-fastapi-pgstac was implemented on a scalable Kubernetes cluster and subjected to a product hydration process, leveraging Python's native capabilities for this task. The pgstac schema was deployed on a dedicated bare-metal server with a PostgreSQL database, utilizing master-worker replication, enabled through appropriate pgstac configuration. The presented solution empowers the community to utilize the new catalog fully, leverage its functionalities, and access open tools that enable independent construction of STAC catalogs compliant with ESA and community recommendations.
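An illustrative query against a catalogue of this kind, using the filtering and sorting extensions mentioned above (the endpoint URL, collection name and property names follow common Sentinel-2 conventions and are assumptions, not guaranteed to match the deployed CDSE catalogue):

```python
# Filtered, sorted STAC search with pystac-client.
from pystac_client import Client

catalog = Client.open("https://stac.dataspace.copernicus.eu/v1")  # assumed endpoint
search = catalog.search(
    collections=["sentinel-2-l2a"],
    bbox=[14.0, 50.0, 15.0, 51.0],
    datetime="2024-07-01/2024-07-31",
    filter={"op": "<", "args": [{"property": "eo:cloud_cover"}, 20]},  # CQL2 JSON
    sortby=[{"field": "properties.datetime", "direction": "desc"}],
    max_items=10,
)
for item in search.items():
    print(item.id, item.properties.get("eo:cloud_cover"))
```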
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall K2)

Presentation: openEO - STAC Integration for Enhanced Data Access and Sharing

#cloud-native #stac

Authors: Ir. Victor Verhaert, Vincent Verelst, Ir. Jeroen Dries, Dr. Hans
Affiliations: VITO Remote Sensing
The openEO API serves as a standardized interface for accessing and processing Earth Observation (EO) data. It is deeply integrated within the Copernicus Dataspace Ecosystem (CDSE), providing users with easy access to vast collections of satellite data and computational resources. However, many specialized EO workflows require data beyond CDSE, such as those hosted on platforms like Microsoft’s Planetary Computer or private repositories. To address this limitation, openEO has expanded its capabilities by enhancing its integration with the STAC (SpatioTemporal Asset Catalog) ecosystem. This advancement significantly broadens the range of data sources that can be accessed, processed, and shared within openEO workflows.

STAC, an emerging standard for organizing and cataloging geospatial data, is widely adopted for its simplicity and flexibility. It provides a unified framework for indexing and querying EO data from various sources, making it an essential tool for modern geospatial analysis. By extending its support for STAC, openEO aligns itself with the broader trend toward cloud-native geospatial workflows and ensures compatibility with diverse data providers.

One of the key developments in openEO is the introduction of the load_stac functionality. This feature enables users to query and access STAC-compliant datasets across multiple platforms, regardless of whether the data resides in public repositories, such as the Planetary Computer, or private repositories tailored to specific projects. This functionality goes beyond CDSE, allowing users to integrate datasets from different sources into a single, cohesive workflow. By combining public and private data collections, openEO empowers users to address unique research and operational needs while maintaining flexibility, scalability and privacy.

In addition to expanded data access, openEO now supports saving workflow outputs directly into STAC-compliant formats. By adhering to STAC’s metadata standards, the outputs can be cataloged and shared with ease, fostering greater collaboration and reproducibility within the EO community. By integrating STAC for both data input and output, openEO enhances not only the accessibility of EO data but also the sharing and scalability of derived products. These capabilities are critical for enabling FAIR (Findable, Accessible, Interoperable, Reusable) data principles in geospatial workflows. The ability to retrieve data from diverse sources, process it in the cloud, and store outputs in standardized formats ensures that data flows remain seamless and efficient, even as datasets grow in size and complexity.

By integrating deeply with STAC, openEO provides users with a robust, adaptable platform for modern EO analysis. Whether working with massive public datasets or proprietary collections, users can design and execute workflows that meet their specific needs without being constrained by data availability.

This session will delve into the technical details of openEO’s enhanced STAC integration. We will demonstrate the use of the load_stac functionality to query and process datasets from platforms like the Planetary Computer, as well as private repositories. Additionally, we will showcase how processed data can be exported in STAC-compliant formats, highlighting its utility for data sharing and reproducibility. Practical examples will include combining multiple datasets into unified workflows and saving analysis outputs for collaborative projects.
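As a concrete illustration of the load_stac pattern described above, here is a minimal sketch using the openEO Python client; the backend URL, STAC collection URL, band names, and extents are assumptions chosen for the example, not values from the abstract.

```python
# Minimal sketch of loading an external STAC collection into an openEO workflow;
# URLs, band names, and extents below are illustrative placeholders.
import openeo

connection = openeo.connect("openeo.dataspace.copernicus.eu").authenticate_oidc()

cube = connection.load_stac(
    url="https://planetarycomputer.microsoft.com/api/stac/v1/collections/sentinel-2-l2a",  # assumed
    spatial_extent={"west": 5.0, "south": 51.0, "east": 5.2, "north": 51.2},
    temporal_extent=["2024-05-01", "2024-08-31"],
    bands=["B04", "B08"],
)

# Derive NDVI and run it as a batch job; the result can then be exported and catalogued.
ndvi = cube.ndvi(nir="B08", red="B04")
ndvi.execute_batch("ndvi.nc", out_format="netCDF")
```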
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall K2)

Presentation: OpenSTAC: an open spatiotemporal catalog to make Earth Observation research data findable and accessible

#stac

Authors: Dr. Serkan Girgin, Jay Gohil
Affiliations: Faculty of Geo-information Science and Earth Observation (ITC)
In line with Open Science practices and FAIR principles, researchers are publishing their Earth Observation (EO) related research data at public research data repositories, such as Zenodo and Figshare. Files in raster grid data formats such as GeoTIFF, NetCDF, and HDF, as well as supplementary vector data, are common in these research datasets. Although such data files include detailed spatiotemporal information, which is very useful for finding data for specific regions and time periods, this information is currently not effectively utilized by the research data repositories, and researchers are asked to enter spatial and temporal information manually as part of the dataset metadata, which is usually limited to a textual description or simple metadata attributes. Moreover, the repositories also do not provide effective tools and interfaces to search research data by location, e.g. by specifying a geographical extent. Therefore, EO-related research data largely becomes "invisible" to researchers and can only be found if some keywords match the textual location description. This severely limits the findability and accessibility of research data with spatiotemporal characteristics.

On the other hand, there are many initiatives that aim to facilitate access to EO data by using modern tools and technologies. One such initiative is the SpatioTemporal Asset Catalog (STAC), an emerging open standard designed to enhance access to geospatial data, especially on the Cloud. STAC provides a unified framework for organizing and describing geospatial assets, making it easier for users to discover, access, and work with EO data. It enables data providers to create catalogs of geospatial assets, each with detailed metadata, including spatial and temporal information, formats, and links to data files. This standardized structure improves data discoverability and interoperability across various software tools and platforms, streamlining the process of finding and accessing geospatial data.

OpenSTAC leverages the capabilities of the STAC ecosystem and aims to create an open spatiotemporal catalog of public research datasets published at major research data repositories. For this purpose, geospatial data files available in research datasets are analyzed, and the spatiotemporal information embedded in these files is extracted. This information is used to create a global STAC catalog, OpenSTAC, which enables researchers to easily find and access EO research data by using a wide range of open-source tools provided by the STAC ecosystem, including visual data browsers, command line tools, and data access libraries in various languages, e.g. Python, R, and Julia. Hence, it significantly improves the FAIRness of EO research data.

This talk will provide detailed information about the methodology developed to monitor the research data repositories to identify published geospatial datasets, to collect spatiotemporal metadata of datasets by using existing metadata and by analysing and extracting additional information from the geospatial data files, and to update a STAC-based spatiotemporal catalog of the datasets by using the collected information. The methodology's implementation through open-source software will be presented, providing insights into its functionality and practical applications.
Additionally, a live demonstration of the operational OpenSTAC platform will showcase its features, capabilities, and real-world applicability, highlighting its role in facilitating seamless integration and execution of the methodology.
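To make the extraction step more tangible, here is a hedged sketch of the general idea (not the OpenSTAC implementation itself): reading the spatial footprint of a raster file with rasterio and describing it as a STAC Item with pystac. The file path, item id, and timestamp are placeholders.

```python
# Hedged sketch: derive a STAC Item from a GeoTIFF's embedded spatial metadata.
# Path, item id, and datetime are invented placeholders for illustration.
from datetime import datetime, timezone

import pystac
import rasterio
from rasterio.warp import transform_bounds
from shapely.geometry import box, mapping

path = "research_dataset/dem.tif"  # hypothetical file from a published dataset
with rasterio.open(path) as src:
    bbox = transform_bounds(src.crs, "EPSG:4326", *src.bounds)  # reproject footprint to WGS84

item = pystac.Item(
    id="zenodo-1234567-dem",                      # hypothetical identifier
    geometry=mapping(box(*bbox)),
    bbox=list(bbox),
    datetime=datetime(2023, 6, 1, tzinfo=timezone.utc),
    properties={},
)
item.add_asset("data", pystac.Asset(href=path, media_type=pystac.MediaType.GEOTIFF))
print(item.to_dict())
```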
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall K2)

Presentation: The Future of Data Discovery at CEDA: The DataPoint API

#stac #kerchunk #virtualizarr #zarr

Authors: Mr Daniel Westwod, Mr Rhys Evans, Prof. Bryan N. Lawrence, Mr David Hassell
Affiliations: Centre for Environmental Data Analysis, STFC, NCAS, Department of Meteorology, University of Reading
A paradigm shift is underway among many data storage centres as the need for cloud-accessible, analysis-ready data increases. There are significant findability and accessibility issues with current storage formats, and existing data archives are not well suited to aggregation and use in cloud applications. Several technologies are available, ranging from cloud-optimised data formats such as Zarr and Icechunk to more traditional formats such as NetCDF. Whatever tools are used, aggregation methods are needed to expose simpler views of the underlying objects and/or files. At CEDA we have begun to create and ingest these new file formats, as well as develop new search services to enable fast access to our data. We have also created an API called DataPoint, capable of connecting to our systems and abstracting much of the complexity of different file types, to create the best environment for accessing our cloud data products.

Data Storage

Typical data access use cases require parts of a file or dataset to be read independently (either because only a part of the data is required, or because the entire data is required but the client manages the reading of it in a piecemeal fashion), and there are generally two approaches that support reading only the requested parts:
- Reformat: break the data up into separate objects that can be individually requested.
- Reference: provide a mechanism to get a specific range of bytes corresponding to a data chunk from a larger object.
Whichever approach is used, it is often necessary to reformat or repack the data to provide performant access, but that requires duplication of the data. Whether this step is necessary will depend on a combination of which client tools are being used and what the primary mode of access is (in terms of how it might slice into the data chunks). The client tool landscape is changing rapidly, so flexibility is needed in organisations such as CEDA. We need to deal with data stored in Zarr chunks, in NetCDF files with kerchunk indexes, and with aggregations defined using a range of mechanisms including the newly formally defined CF Aggregation format. Over the last two years we've developed a tool called ‘padocc’ to handle large-scale generation of Kerchunk reference files or new cloud data stores with Zarr. We are actively working on this tool to provide performance enhancements and are considering the inclusion of upcoming packages like VirtualiZarr to generate virtual datasets. The generated files have been ingested into the CEDA Archive and are accessible to all users - except that no one knows how to use or even find them. Since these technologies are relatively new to most of our user communities, the mechanisms for accessing these types of data are not well known or well understood, and they need to exist alongside more established formats such as NetCDF/HDF.

Metadata Records

CEDA are also investigating the SpatioTemporal Asset Catalogue (STAC) specification to allow user interfaces and search services to be enhanced and to facilitate interoperability with user tools and our partners. We are working to create a full-stack software implementation including an indexing framework, API server, web and programmatic clients, and vocabulary management. All components are open-source so that they can be adopted and co-developed with other organisations working in the same space.
To create the CEDA STAC catalog we have developed the "stac-generator", a tool that utilises a plugin architecture to allow for more flexibility at the dataset level. A range of input, output, and "extraction methods" can be configured to enable metadata extraction across CEDA's diverse archive data and beyond at other organisations. Elasticsearch (ES) was chosen to host the indexed metadata because it is performant, highly scalable and supports semi-structured data - in this case the faceted search values related to different data collections. We have also developed several extensions to the STAC framework to meet requirements that weren't met by the core and community functionality. These include an endpoint for interrogating the facet values, as queryables, and a free-text search capability across all properties held in the index. The development of our search system has also included pilots for the Earth Observation Data Hub (EODH) and a future version of the Earth System Grid Federation (ESGF) search service, for which we have created an experimental index containing a subset of CMIP6, CORDEX, Sentinel 2 ARD, Sentinel 1, and UKCP data to investigate performance and functionality.

Discovering and Accessing Data

DataPoint is the culmination of these developments to create a single point of access to the data archived at CEDA; the so-called 'CEDA Singularity'. It connects directly to our STAC catalogs and can be used to search across a growing portion of our data holdings to find specific datasets and metadata. What sets DataPoint apart from other APIs is the ability to directly open datasets from cloud formats without the requirement of manual configuration. DataPoint reads all the required settings from the STAC record to open the dataset, making the interface much simpler for the typical user. With DataPoint, a user can search across a vast library of datasets, select a specific dataset matching a set of constraints and then simply open the result as a dataset. DataPoint handles the extraction of the link to the cloud formats stored in our archive to present a dataset that looks the same regardless of the type or format in which the data is stored. At all points the data is lazily loaded to provide fast access to metadata, enabling users to get a summary of data before committing to opening and transferring what could be a large volume of data. Currently our STAC catalogs represent only a small fraction of the total CEDA archive, which spans more than 40 years of data, totalling over 25 petabytes. The next step towards greater data accessibility will be to dramatically expand our STAC representations as well as the formats required for DataPoint. We have well-established pipelines for generating both, which will become immediately available to all DataPoint users when published to our production indexes.
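For readers unfamiliar with the access pattern behind kerchunk-indexed NetCDF, the snippet below is a generic, hedged illustration (it is not the DataPoint API): it opens a kerchunk reference file as a lazily loaded, Zarr-like dataset with xarray. The reference URL is a hypothetical placeholder.

```python
# Generic kerchunk access pattern (not the DataPoint API): open a reference
# file describing byte ranges inside remote NetCDF files as a Zarr-like store.
import xarray as xr

ds = xr.open_dataset(
    "reference://",
    engine="zarr",
    backend_kwargs={
        "consolidated": False,
        "storage_options": {
            "fo": "https://example.org/refs/cmip6-subset.json",  # hypothetical kerchunk reference file
            "remote_protocol": "https",
        },
    },
)

print(ds)  # metadata only; data chunks are fetched on demand when sliced
```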
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Hall K2)

Presentation: Atmosphere Virtual Lab: Access atmospheric satellite data as a datacube

#stac #zarr

Authors: Sander Niemeijer
Affiliations: S[&]T
The Atmosphere Virtual Lab (AVL) aims to provide users with the tools to simplify the analysis and visualisation of atmospheric satellite data. In the past it has provided this by means of a JupyterLab environment that covered the Atmospheric Toolbox components, such as the popular HARP toolset, and dedicated interactive visualisation components for notebooks. One of the main challenges in using data from missions such as Sentinel-5P is having to deal with the data in L2 format. Many types of analysis require an L3 regridding step in order to arrive at an analysis-ready form. To remove this step, the AVL has evolved into a cloud-hosted platform, providing a wide range of atmospheric satellite data in an analysis-ready L3 format, with the data being continuously updated. It leverages popular standards such as Zarr for data storage and STAC for discovery to expose the data cubes. AVL brings a novel approach through its use of pyramiding to provide zoom levels in all dimensions, both spatially and temporally, and facilitates fast access through efficient chunking mechanisms. The AVL cloud service comes with a public web client that allows for easy browsing and visualisation of the data cube content, and a cloud-hosted JupyterLab environment for advanced analysis purposes. We present the current status of the AVL service, its capabilities and design, and plans for the future.
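As an illustration of how such Zarr-based L3 data cubes are typically consumed, here is a generic sketch (not AVL's actual store layout); the store URL, variable name, and coordinate names are assumptions.

```python
# Hedged sketch: open and slice a cloud-hosted Zarr data cube with xarray.
# The store URL, variable, and coordinate names are illustrative placeholders.
import xarray as xr

ds = xr.open_zarr("https://example.org/avl/s5p-no2-l3.zarr", consolidated=True)

subset = (
    ds["tropospheric_NO2_column"]                      # assumed variable name
    .sel(time=slice("2024-07-01", "2024-07-31"))
    .sel(latitude=slice(35, 60), longitude=slice(-10, 30))
)

monthly_mean = subset.mean(dim="time")                 # lazy until .compute() or .plot()
print(monthly_mean)
```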
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Room 0.96/0.97)

Presentation: Long-time series for ERS-1/2 and Envisat SAR data using Analysis Ready and Composite products approach.

#cog

Authors: Dr. David Small, Fabiano Costantini, Clement Albinet, Dr Sabrina Pinori, Lisa Haskell
Affiliations: University of Zurich, Telespazio UK, ESA/Esrin, Serco Spa
Over the last few years, through the IDEAS-QA4EO service and now through the QA4EO-2 project, ESA’s Heritage Space Programme section has supported several activities in relation to continuous improvement and maximising the usability of archive heritage mission data from SAR instruments alongside the current and future SAR missions. To that effect, Telespazio UK developed a CEOS-ARD prototype processor that enabled the production of CEOS-ARD “Normalised Radar Backscatter” (NRB) output that supports the following:
• Immediate analysis (by ensuring that CEOS-ARD requirements related to radiometric terrain correction (backscatter normalisation), projection of the DEM, etc. are implemented),
• Interoperability (by ensuring that the same gridding and DEM are used as in the Sentinel-2 mission, thus expanding interoperability with Sentinel-1 and potentially the future Sentinel-1 NG, ROSE-L and BIOMASS missions),
• Cloud computation capability (by developing the output product in the Cloud Optimised GeoTIFF (COG) format),
• Open science compliance (by developing open-source software for the processor).
In order to ensure the correctness of the RTC computation, the IDEAS-QA4EO service undertook a project to support an open-source RTC processor led by the University of Zurich. The processor has been tested on thousands of Sentinel-1 backscatter products, and supports input from calibrated GRD or SLC product types. The aim of SAR radiometric terrain correction (RTC) is to compensate for geometric slope distortion: knowledge of the topographic variations within a scene is used not only to orthorectify a SAR image into a map coordinate system, but also to correct for the influence of terrain on the image radiometry. Radiometric normalisation of SAR imagery has traditionally used the local incidence angle together with empirically derived coefficients to try to “flatten” the SAR imagery. However, this approach has proven ineffective, as it is an oversimplified model of how terrain slope distorts local radar backscatter. A better method based on SAR image simulations has been developed to model the local area “seen” by the sensor in the plane perpendicular to the look direction, accounting inherently for foreshortening, layover, and shadow distortions. A map of that local area is used to “flatten” the radar image by performing the normalisation from radar cross section (RCS) to normalised radar cross section (NRCS) in the form of terrain-flattened gamma nought. As a further step towards the generation of long-term data series, Telespazio UK has started a project under QA4EO-2, with support from the University of Zurich and Aresys, toward the development of a composite backscatter processor for Envisat/ASAR with possible later extension to ERS-1/2 heritage data. In this processor, multiple relative orbits (or tracks) are integrated into a single backscatter value representing a set time window. To that end, the local area maps used for the normalisation in the previous RTC step are employed again, this time to calculate for each pixel the appropriate local weighting factors of each track’s local backscatter contribution. The time window can be moved regularly forward in time, and the calculations repeated with newer data to generate multiple seamless wide-area backscatter estimates. The benefits of composite products are well known to users of data from optical sensors: cloud-cleared composite reflectance or index products are commonly used as an analysis-ready data (ARD) layer.
Until now, no analogous composite products based on spaceborne radar backscatter signals have been in widespread use. In this work, we present a methodology to produce wide-area ARD composite backscatter images. They build on the existing heritage of geometrically and radiometrically terrain corrected (RTC) level-1 products. By combining backscatter measurements of a single region seen from multiple satellite tracks (incl. ascending and descending), they can provide wide-area coverage with low latency. The analysis-ready composite backscatter maps provide flattened backscatter estimates that are geometrically and radiometrically corrected for slope effects. A mask layer annotating the local quality of the composite resolution is introduced. The multiple tracks available (even from multiple sensors observing at the same wavelength and polarisation) are combined by weighting each observation by its local resolution. The process generates seamless wide-area backscatter maps suitable for applications ranging from wet snow monitoring to land cover classification or short-term change detection. At the conference, a complete overview of this long-term data series evolution will be presented in detail, including first sets of backscatter composites.
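The weighting scheme described above can be made concrete with a small, hedged numpy sketch; the array names, shapes, and the idea of taking per-track weights as given input layers are assumptions for illustration, not the project's actual implementation.

```python
# Illustrative per-pixel composite of terrain-flattened gamma nought from
# several tracks, weighted by given per-track layers (e.g. derived from each
# track's local resolution). Names and shapes are invented for the example.
import numpy as np

def composite_backscatter(gamma0_stack: np.ndarray, weight_stack: np.ndarray) -> np.ndarray:
    """Weighted per-pixel composite over the track dimension (axis 0).

    gamma0_stack: (n_tracks, rows, cols), NaN where a track has no observation
    weight_stack: (n_tracks, rows, cols), local weighting factors per track
    """
    weights = np.where(np.isnan(gamma0_stack), np.nan, weight_stack)  # ignore unobserved pixels
    num = np.nansum(gamma0_stack * weights, axis=0)
    den = np.nansum(weights, axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        composite = num / den
    return np.where(den > 0, composite, np.nan)

# Tiny example with two synthetic 2x2 tracks
g = np.array([[[0.10, 0.12], [np.nan, 0.08]],
              [[0.09, np.nan], [0.11, 0.10]]])
w = np.array([[[1.0, 0.5], [np.nan, 2.0]],
              [[0.5, np.nan], [1.0, 1.0]]])
print(composite_backscatter(g, w))
```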
Add to Google Calendar

Friday 27 June 11:30 - 13:00 (Room 0.96/0.97)

Presentation: The TIMELINE Project: Unlocking Four Decades of AVHRR Data for Long-Term Environmental Monitoring in Europe

#stac

Authors: Stefanie Holzwarth, Dr. Sarah Asam, Dr. Martin Bachmann, Dr. Martin Böttcher, Dr. Andreas Dietz, Dr. Christina Eisfelder, Dr. Andreas Hirner, Matthias Hofmann, Dr. Grit Kirches, Detmar Krause, Dr. Julian Meyer-Arnek, Katrin Molch, Dr. Simon Plank, Dr. Thomas Popp, Philipp Reiners, Dr. Sebastian Rößler, Thomas Ruppert, Alexander Scherbachenko, Meinhard Wolfmüller
Affiliations: German Aerospace Center (DLR), German Remote Sensing Data Center (DFD), Brockmann Consult GmbH
The TIMELINE project (TIMe Series Processing of Medium Resolution Earth Observation Data assessing Long-Term Dynamics In our Natural Environment), led by the German Remote Sensing Data Center (DFD) at the German Aerospace Center (DLR), is an initiative aimed at harmonizing and leveraging four decades of AVHRR (Advanced Very High Resolution Radiometer) data. Since the 1980s, daily AVHRR observations at ~1.1 km resolution over Europe and North Africa have been systematically processed, calibrated, and harmonized to create a unique dataset for long-term environmental and climate-related studies covering more than 40 years. TIMELINE addresses key challenges in ensuring data consistency across different AVHRR sensors by correcting for satellite orbit variations and calibration drift. The resulting dataset is aggregated into Level 3 daily, 10-day, and monthly composites, minimizing noise and gaps to support robust and reliable time series analyses. The TIMELINE product suite includes key geophysical parameters such as Normalized Difference Vegetation Index (NDVI), snow cover, fire hotspots, burnt area maps, Land Surface Temperature (LST), Sea Surface Temperature (SST), and cloud properties. Rigorous validation against independent Earth observation and in-situ datasets ensures product accuracy, enabling reliable trend analysis over decades. For instance, TIMELINE supports investigations of climate-related phenomena such as shifting vegetation green-up dates and Urban Heat Island effects. To ensure accessibility and interoperability, TIMELINE products are freely available under an open data policy. By adopting SpatioTemporal Asset Catalog (STAC) metadata standards, the dataset is seamlessly integrable with modern Earth observation platforms, fostering broader use in research and applications. The project also emphasizes continuous improvement through iterative reprocessing and incorporation of the latest advancements in methodology, calibration, and data standards, as well as through the integration of recent AVHRR data sets. Updated product versions are regularly released, ensuring that users have access to the most accurate and reliable information for their analyses. This conference contribution will provide a comprehensive overview of the TIMELINE project’s progress, innovations, and contributions to the Earth observation community.
Add to Google Calendar

Friday 27 June 08:30 - 10:00 (Room 1.31/1.32)

Presentation: Cloud-Native Raster Data: Revolutionizing Geospatial Analysis

#cloud-native #stac #zarr #cog

Authors: Vincent Sarago, Zach Deziel, Emmanuel Mathot
Affiliations: Development Seed
The geospatial data landscape is undergoing a radical transformation, driven by the exponential growth of remote sensing, satellite imagery, and environmental monitoring technologies. Traditional approaches to raster data management are rapidly becoming obsolete, unable to keep pace with the increasing volume, complexity, and computational demands of modern geospatial analysis. This talk offers a comprehensive exploration of cloud-native raster data formats, illuminating the technological revolution that is reshaping how we store, access, and process geospatial information. Raster data—the fundamental building block of geographic imaging—has long been constrained by significant technical limitations. Historically, researchers and data scientists faced formidable challenges: downloading entire massive datasets, managing prohibitive storage costs, and navigating performance bottlenecks that could stall critical research and analysis. The transition to cloud-native formats represents a paradigm shift, offering unprecedented efficiency, scalability, and accessibility. This presentation will provide an in-depth examination of the cloud-native raster ecosystem, focusing on groundbreaking technologies that are redefining geospatial data processing. Attendees will gain insights into:
● The evolution from traditional file-based formats to cloud-optimized solutions
● Detailed analysis of cutting-edge formats like Cloud Optimized GeoTIFFs (COGs) and Zarr
● Technological innovations that enable partial reads, streaming access, and efficient multi-dimensional data handling
● Practical challenges in raster data management and how cloud-native approaches provide elegant solutions
● Emerging standards and tools, including STAC (SpatioTemporal Asset Catalog) and OGC guidelines
The talk will dive deep into the technical mechanisms that make cloud-native formats so powerful. Cloud Optimized GeoTIFFs (COGs), for instance, allow for partial data retrieval, dramatically reducing download times and computational overhead. Zarr and TileDB introduce revolutionary approaches to multi-dimensional array storage, enabling parallel processing and efficient handling of massive datasets across spatial and temporal dimensions. Practical demonstrations will showcase real-world applications using TiTiler, including:
● Dynamic tiling techniques
● Efficient COG creation and validation
● Streaming large-scale geospatial datasets
● Integration with machine learning and advanced analysis workflows
Beyond technical capabilities, the presentation will explore the broader implications for various domains: climate research, environmental monitoring, urban planning, agriculture, and beyond. As datasets continue to grow in size and complexity, cloud-native raster formats offer a glimpse into the future of geospatial analysis—a future characterized by unprecedented accessibility, performance, and insight. This talk is designed for data scientists, geospatial professionals, researchers, and technologists seeking to understand and leverage the latest innovations in raster data processing. Attendees will leave with a comprehensive understanding of cloud-native technologies, practical insights into implementation, and a vision of the transformative potential of modern geospatial data management.
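To ground the partial-read claim, the sketch below shows windowed access to a COG over HTTP with rasterio, where GDAL issues byte-range requests only for the tiles that intersect the window; the URL and bounds are placeholders.

```python
# Hedged sketch: read only a window of a remote Cloud Optimized GeoTIFF.
# The COG URL and bounding box below are illustrative placeholders.
import rasterio
from rasterio.windows import from_bounds

url = "https://example.org/data/scene_cog.tif"

with rasterio.open(url) as src:                       # only headers/IFDs fetched up front
    window = from_bounds(500_000, 5_600_000, 510_000, 5_610_000, transform=src.transform)
    band1 = src.read(1, window=window)                # only overlapping tiles are downloaded
    overview = src.read(1, out_shape=(256, 256))      # decimated read using internal overviews

print(band1.shape, overview.shape)
```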
Add to Google Calendar

Friday 27 June 08:30 - 10:00 (Room 1.31/1.32)

Presentation: Distributed access to Marine Data with Integrity through a value chain framework

#cloud-native #stac #zarr #cog

Authors: Piotr Zaborowski, Dr. Raul Palma, Bente Lilja Bye, Arne-Jorgen Berre, Dr. Marco Amaro Oliveira, Babis Ipektsidis, Sigmund Kluckner
Affiliations: Open Geospatial Consortium Europe, PSNC, BLB, SINTEF, INESC, Netcompany, IEEE
Effective management and utilization of marine data are critical for advancing our understanding of oceanic systems and addressing global challenges such as climate change, biodiversity loss, and sustainable resource management, as well as local problems like efficient navigation, microplastic monitoring, fisheries management, and power plant management. How can alignment to standards and vocabularies be effectively implemented across the entire value chain of marine data without placing unnecessary burden on data providers? What modern approaches can be utilized to formalize definitions in marine data management? How does the integration of linked data impact the interoperability of different marine data sources and services?

The authors present a comprehensive approach to data management, embracing EO, marine observations, and citizen science data along the processing pipelines, ensuring coherent access to source and analytical data. Built within the Iliad Ocean Twin project for the marine environment, it focuses on harmonizing, preserving, and enforcing integrity through an operational framework with abstract conceptual models and practical tools and implementations.

Technological advancements on various fronts show growing diversity in how data is acquired, stored, and processed, benefitting from the distributed cloud, maturing analytical workflows, and operationalized EO research-based services. While common data sources like Sentinel and Copernicus Marine Services from the Copernicus missions are well known and provided in standard formats, their details differ depending on the provider. Practically, the application of tools is strongly dependent on the convention used, for example regarding the distribution of bands across asset files for EO, the level of support of the CF convention in NetCDF-like formats, and, in the best case, the selected templates. In practice, questions about compliance usually require analyzing and reformatting data into suitable structures. The variety of protocols and formats observed in the project is even higher for in-situ observations. Standards like ISO/OGC Observations and Measurements share a model with the W3C/OGC Semantic Sensor Network (SSN) ontology and the SensorThings API as one of the standards suites, but in many cases raw data is minimized, meta information is obscured at the transmission level, and data is stored in specialized time-series databases. In such instances, meta information must be provided separately and made available in the harmonization layer. This layer enables data produced or made available via different data sources to be represented according to standard data models and vocabularies, with well-defined semantics, and exposed via standard APIs so that they can be further examined and processed by one or more software tools in a unified manner, leveraging the total value of all the available data. In ILIAD, this common model is provided by the Ocean Information Model (OIM), which harmonizes and aligns relevant cross-domain standards (particularly from OGC and W3C) with domain models, bridging various views on ocean data and providing a formal representation that enables unambiguous translations between them. ILIAD also provides the mechanisms to transform/lift data into this common model and to integrate it with other related datasets, providing the harmonized data layer that the different data analytics tools can exploit.
Finally, ILIAD exposes this harmonized data via standard OGC APIs (e.g., SensorThings API, Features API) to boost the potential for interoperability with existing and future components. Such harmonization is not only necessary to enable reliable and trustworthy processing but is also critical for AI, including explainable AI, which will need to understand relationships between information from various sources. Here, a proper distinction between similar and identical observation types is necessary in a machine-readable format. While this is changing with advanced language models, problems of ambiguous naming like those analyzed in [1] have not yet been solved, and the ambiguity has not been clarified by standards. In practice, on the data consumer side, multiple scenarios were encountered, from online data streaming, to analytical clients in Python and R using batch processing but benefiting from data trimming and spatial queries, to interactive visualization tools. Building on the groundwork of the widely adopted standards and conventions of the international marine, nautical, geomatics, and earth sciences, ICT advancements bring both opportunities and challenges related to data harmonization. Applications require high-resolution data access for numerical modeling and analytics as well as multi-resolution support for downscaling and visualization, which are competing requirements. Likewise, minimizing storage and processing costs imposes trade-offs. Effectively, various data access methods must be offered. In the Iliad project, a number of legacy standards-based services were provided (*DAP, WMS, WCS, WFS, OpenSearch) together with more modern ones (SensorThings API, Features API, STAC, Environmental Data Retrieval API, Coverages API), alongside many local data streams. Due to the complex environment, data harmonization was implemented during data check-in and access, depending on the legacy state. OGC location building blocks enable consistency and a proper understanding of interrelations, while their application by non-expert users remains a significant burden and needs to be scaled up. These definitions include formal schemas, data structures, and vocabulary mappings to common canonical models, examples, tools, and documentation. This way, data can be interpreted consistently regardless of the convention used for variables in EO or data types (variables in NetCDF, parameters in EDR, observable properties in SOSA/STA) and of their representations in various APIs, including direct access to data chunks. As a side effect of development running in parallel with the major OGC updates from OWS to APIs and the standardization of cloud-native formats, the experiments have both benefited from and contributed to these advancements, proving cross-standard compliance, including with non-geospatial standards like data space suites. Implementations have proved the value of both cloud-native formats, with their block-based access, and modern metadata schemes aligned with general-purpose ICT. As a starting point, they offer promising capabilities for direct access and as an underlying layer for advanced APIs. On the other hand, the project encountered issues with their efficiency for multi-resolution pyramids at global coverage and with still limited support on the application side. Web APIs have proven more effective for platforms that do not require data harmonization at check-in and need to support basic recalculations on the fly, like reprojections and resolution reductions.
The proposed framework integrates advanced data processing and management using applied semantic technologies to facilitate seamless access to diverse marine datasets. Key features include:
- Integration with the Ocean Information Model: a multilayer data description ontology marrying spatiotemporal concepts, data dimensionality, domain vocabularies like the CF convention, Ocean Essential Variables, marine observable vocabularies, and model outcomes.
- Operational Framework: using standards and best practices for Earth Observation exploitation, like EOEPCA, and adding post-processing templates to establish protocols and guidelines that enforce semantic consistency and data quality throughout the data lifecycle.
- Interoperability: enhancing data interoperability through the adoption of open data standards (COG, GeoZarr) and OGC APIs, enabling efficient data sharing and integration for a variety of applications, from science to visualization.
- User-Centric Access: Iliad pilots worked closely with end users, which required addressing the needs of researchers, policymakers, and the general public in accessing and utilizing marine data effectively.
The presentation will highlight case studies demonstrating the application of this framework in various marine pilots using a variety of EO, climate, and marine data. By ensuring coherent data access with preserved semantics, the initiative aims to enhance the reliability and usability of marine data, ultimately supporting informed decision-making and sustainable ocean governance.
[1] Strobl, P.A., Woolliams, E.R. & Molch, K.: Lost in Translation: The Need for Common Vocabularies and an Interoperable Thesaurus in Earth Observation Sciences. Surv. Geophys. (2024). https://doi.org/10.1007/s10712-024-09854-8
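As a small illustration of the standards-based access pattern mentioned above, here is a generic sketch using OGC API - Features query parameters; the endpoint URL, collection id, and property name are hypothetical placeholders.

```python
# Hedged sketch: query an OGC API - Features endpoint for observations in a
# space/time window. Endpoint, collection id, and property name are invented.
import requests

base = "https://example.org/ogcapi"                   # hypothetical pilot endpoint
resp = requests.get(
    f"{base}/collections/sea-surface-temperature/items",
    params={
        "bbox": "5.0,58.0,11.0,62.0",                 # lon/lat extent of interest
        "datetime": "2024-07-01T00:00:00Z/2024-07-31T23:59:59Z",
        "limit": 100,
    },
    headers={"Accept": "application/geo+json"},
    timeout=30,
)
resp.raise_for_status()

for feature in resp.json()["features"]:
    print(feature["id"], feature["properties"].get("observedProperty"))
```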
Add to Google Calendar

Friday 27 June 08:30 - 10:00 (Room 1.31/1.32)

Presentation: The EO DataHub: federating public and commercial EO data sources to deliver an innovative analysis platform for the UK EO sector

#stac #kerchunk #zarr #cog

Authors: Philip Kershaw, Piotr Zaborowski, Alastair Graham, Daniel Tipping, Dave Poulter, Rhys Evans, Fede Moscato, Prof John Remedios, Jen Bulpett, Alasdair Kyle, Alex Hayward, Alex Manning
Affiliations: Centre for Environmental Data Analysis, RAL Space, STFC, National Centre for Earth Observation, Oxidian, Telespazio UK, Open Geospatial Consortium
The Earth Observation (EO) Data Hub is a new national platform that has been developed to serve global EO data and analytics capabilities for UK research, government and business use. It seeks to address challenges facing the EO community identified in the findings of user engagement studies, which consistently point to the need for better integration between different data sources and platforms and for access to assured datasets in a format readily usable for analysis and application. The goals of the Hub can be summarised in three core objectives: 1) deliver unified access to data from multiple sources, integrating the data outputs from UK-specific expertise in EO and climate science together with data more broadly from other public and commercial sources; 2) provide an integrated environment of tools and services as building blocks to enable developers and EO technical experts to process and transform data, creating new value-added products; 3) provide dedicated quality assurance services to better inform users about the fitness-for-purpose of data for a given application. Nearing the completion of its initial 2-year pathfinder phase of development, the Hub has been funded through a wider UK EO investment package from the UK government. The project is led by NERC’s National Centre for Earth Observation - Leicester University and the Centre for Environmental Data Analysis at RAL Space, STFC Rutherford Appleton Laboratory. An initial consortium was formed amongst public sector organisations including the Met Office, Satellite Applications Catapult, National Physical Laboratory and UK Space Agency. These were joined by industry partners brought in via three major procurements: first, to implement the Hub Platform software (Telespazio); second, to provide commercial data sources (Airbus, Planet and Earth-i); and finally, to implement exemplar applications (SparkGeo, Spyrosoft and Oxidian) to test and validate the Hub Platform’s capabilities as a tool to accelerate EO data application development. The Hub platform draws directly from ESA’s EO Exploitation Platform Common Architecture (https://eoepca.org) and OGC standards to build an interoperable ecosystem of open-source tools and services. The major components are:
1) Resource catalogue – a searchable inventory of content provided by or via the Hub (primarily EO datasets)
2) Workflow runner (WR) – based on the OGC ADES (Application Deployment and Execution Service)
3) Workflow and Analysis System (WAS) – Jupyter Notebook service
4) Data Access Services (DAS) – interfaces for access to data for clients to the Hub and integrations with data providers
5) Quality Assurance service (QA) – support for running quality assessments of datasets and storing them in a searchable inventory
For the purposes of this submission, we focus on the Resource Catalogue – the ability to assemble discovery metadata from multiple data providers and provide a unified search interface – and the Data Access Services. The DAS provide the interface between the Hub and data providers. Initial data was selected for the platform based on UK strengths and the anticipated needs of the target user community:
- Climate observations: The UK Earth Observation Climate Information Service (EOCIS) addresses 12 categories of global and regional essential climate variables. It includes new climate data at high resolution for the UK specifically.
- Climate projections: global (CMIP6) and regional (CORDEX and the UK high-resolution Met Office UKCP, UK Climate Projections)
- Commercial satellite data: Planet (PlanetScope and SkySat); Airbus (optical and SAR archive)
- Sentinel data access: leveraging SentinelHub and the CEDA Archive, including ARD for Sentinel-1 and Sentinel-2 over the UK and territories
This data integration presented two high-level challenges – the relative disconnect between climate and EO domains, and the fundamentally different access process between open public datasets and commercial data products. Climate data is almost entirely represented by gridded CF-netCDF, typically equivalent to Level 3 data products and above in the EO world; satellite data is based on scenes and uses alternative data formats such as COG (Cloud-Optimised GeoTIFF). STAC (SpatioTemporal Asset Catalog) was selected as the standard to support data discovery based on its increasing adoption in the EO sector, its active development community and its extensible model, making it flexible for the inclusion of new data types. Though STAC has its origins in EO, significant prior work carried out by CEDA for the Earth System Grid Federation (a distributed infrastructure for sharing climate projections data) has established a profile for use of STAC with climate model outputs stored using CF-netCDF and Zarr data formats, and Kerchunk, a technology to aggregate netCDF files and present them as a Zarr-like interface. Commercial providers Planet and Airbus each provide their own interfaces for data discovery, and consequently different strategies were adopted for integration with the Hub. With Planet, it was relatively trivial to develop a STAC façade to their data discovery API. However, the sheer volume of Planet’s data catalogue meant that harvesting all its content into the Hub’s central catalogue would be prohibitive. As a compromise, the top-level catalogues are harvested into the Hub’s catalogue, but subsidiary metadata (STAC items and assets) are discovered by invoking the dedicated STAC proxy service. For Airbus, metadata harvesting was implemented using an established Python client toolkit and the content translated into STAC format. Commercial EO data access follows a flow from discovery to ordering and finally delivery to the user’s chosen location. In addition to the discovery aspects, the project team has been working together to develop a unified interface for data ordering so that a user can select the desired products from Planet and Airbus and arrange for them to be staged into their group workspace on the Hub Platform for subsequent use. This staging can be built as workflow packages. Data adapters enable plug-and-play configuration of future data sources and standards-based harmonised data access. Besides a ‘horizontal’ integration across different data providers into the Hub, a ‘vertical’ integration could also be considered, i.e. the flow from data producer, through provision via the Hub Platform, to access by a consuming client application. Fortunately, with new climate observations datasets being developed as part of the EOCIS project (https://eocis.org), it has been possible to have direct dialogue with the data producers and influence how the data is being produced to best meet end-application developers’ needs. The EOCIS project team has agreed to generate STAC metadata files alongside netCDF data products to better facilitate indexing of content into search catalogues.
Further still, it has been possible to consider the formulation of STAC content tailored to the specific needs of consumer applications such as TiTiler, an open-source map tiling service implementation. The STAC metadata can be integrated at the point of data production to include data ranges and default colour table settings needed by TiTiler. Besides the agreement of metadata content, work has also included the selection of appropriate data chunking strategies for serialization of the data to optimize it for analysis by applications. Recent work to integrate an Xarray interface into TiTiler has meant that it is possible to use it as a unified solution for visualisation of climate datasets (netCDF and its Kerchunk and Zarr derivatives) and EO datasets (COG format). This has been used to advantage in SparkGeo’s Climate Asset Risk Analysis Tool (CLARAT) and Spyrosoft’s EventPro Landcover analysis web application. The third application supplier, Oxidian, has been tasked with developing integrations and training to facilitate use of the EO DataHub with other applications and platforms. This includes a Python client for the Hub, pyeodh, together with example Jupyter Notebooks as well as integrations with QGIS (https://qgis.org). Running in parallel with the data discovery and access capabilities, a QA function has been under development. NPL, working with Telespazio, has developed a system of QA workflows whereby quality assessments of data products can be carried out using the Hub Platform’s Workflow Runner. These assessments are serialized into a dedicated QA repository which is linked to the data discovery search index, providing users with quality information appended to the data of interest. The current work focuses on QA for optical sensor satellite data but could be expanded to support characterization for other sensor types or even uncertainty information for climate projections. In summary, the EO DataHub has sought to build a unique offering which applies UK EO science expertise but also builds on the existing capabilities of commercial data providers. The development has demonstrated the value of a federated approach to integrating data sources across both public and commercial providers and across the EO and climate domains, as well as the benefits and demands of public cloud infrastructure deployments, consolidated in a comprehensive CI/CD and accounting framework. Working in an integrated team across applications, Hub middleware and data suppliers has borne fruit in addressing some of the challenges of narrowing the gap between data access and effective utilisation in applications. As next steps, the team is now building on the work of the pilot applications to establish broader engagement with early adopters and bring the service up to a full operational footing.
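To illustrate how render settings carried alongside STAC metadata might translate into a dynamic tiling request, the sketch below builds a TiTiler-style preview URL; the service host, dataset URL, and parameter values are assumptions for demonstration, not the Hub's actual configuration.

```python
# Hedged sketch: build a TiTiler-style request that rescales a COG and applies a
# colormap, as one might drive from rendering hints recorded with a STAC asset.
# Host, dataset URL, and parameter values are illustrative placeholders.
from urllib.parse import urlencode

titiler = "https://titiler.example.org"                  # hypothetical TiTiler deployment
dataset = "https://example.org/eocis/uk_lst_2024.tif"    # hypothetical COG asset href

# Rendering hints of the kind that could be stored alongside the STAC asset
render_hints = {"rescale": "270,320", "colormap_name": "viridis"}

params = urlencode({"url": dataset, **render_hints})
preview_url = f"{titiler}/cog/preview.png?{params}"
print(preview_url)
```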
Add to Google Calendar

Friday 27 June 14:30 - 16:00 (Room 1.34)

Presentation: Monitoring grassland and pastures at global scale: A multi-source approach based on data fusion

#cloud-native #stac

Authors: Leandro Leal Parente, Lindsey Sloat, Vinicius Mesquita, Laerte Ferreira, Radost Stanimirova, Tomislav Hengl, Davide Consoli, Nathália Teles, Maria Hunter, Ichsani Wheeler, Carmelo Bonannella, Steffen Fritz, Steffen Ehrmann, Ana Paula Mattos, Bernard Oliveira, Carsten Meyer, Martijn Witjes, Ivelina Georgieva, Mustafa Serkan Isik, Fred Stolle
Affiliations: OpenGeoHub Foundation, World Resources Institute, Remote Sensing and GIS Laboratory (LAPIG/UFG), International Institute for Applied Systems Analysis (IIASA), German Centre for Integrative Biodiversity Research (iDiv)
Covering about 40% of the Earth’s surface, grasslands and pastures are critical for carbon sequestration, food production, biodiversity maintenance, and cultural heritage for people all over the world. Aiming to provide monitoring solutions for these key ecosystems, the Land & Carbon Lab’s Global Pasture Watch (GPW) initiative is developing four globally consistent time-series datasets: (i) 30-m grassland class and extent, (ii) 30-m short vegetation height, (iii) 1-km livestock densities, and (iv) 30-m bi-monthly gross primary productivity. Conceptualized as building blocks, these products were designed and implemented in a flexible way enabling, for example, local calibration based on in-situ reference points or existing area estimates, and fusion with other land cover products. This study presents the first results of integrating GPW products into a harmonized global map that delineates active grazing areas, pastures with different management intensities, and natural graze- and browse-lands. The methodology applies globally and continentally derived thresholds for grassland classes, livestock densities, dominant short vegetation heights and productivity trends to assign the final class in the integrated product. To account for differences in quality and accuracy among the input datasets, a per-pixel uncertainty layer is provided together with the integrated map, enabling spatial and temporal visualization and analysis of the integration errors. Despite inherent challenges and limitations, the implemented approach is entirely open (based on open-source software and open datasets), enabling different user communities to adapt it to regional/local contexts and specific use cases. To further enhance usability and improve accuracy, GPW is actively promoting online tools (Geo-Wiki) for collecting, organizing and incorporating user feedback in future collections of the products, through additional reference data, local knowledge and new machine learning methods. All input and output data, including reference samples, are publicly available in cloud-native formats in Zenodo, SpatioTemporal Asset Catalog (STAC) systems and Google Earth Engine. The source code is publicly accessible at https://github.com/wri/global-pasture-watch.
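As a hedged illustration of the kind of per-pixel, threshold-based integration described above, the sketch below assigns classes from three input layers with numpy; the class codes, thresholds, and array names are invented for the example and are not the actual GPW rules.

```python
# Hedged sketch of threshold-based data fusion; rules and values are invented.
import numpy as np

def integrate_pixels(grassland_prob, livestock_density, veg_height_m):
    """Assign a simple integrated class per pixel from three co-registered layers."""
    out = np.zeros(grassland_prob.shape, dtype=np.uint8)                 # 0 = other land cover
    grass = grassland_prob > 0.5                                         # assumed grassland threshold
    out[grass & (veg_height_m >= 1.0)] = 3                               # 3 = browse-land / shrubby vegetation
    out[grass & (veg_height_m < 1.0) & (livestock_density <= 20)] = 2    # 2 = natural grazing land
    out[grass & (veg_height_m < 1.0) & (livestock_density > 20)] = 1     # 1 = actively grazed pasture
    return out

print(integrate_pixels(np.array([[0.8, 0.2]]),
                       np.array([[30.0, 5.0]]),
                       np.array([[0.4, 2.0]])))   # -> [[1 0]]
```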
Add to Google Calendar

Of course, some events may have been missed while compiling this list. If you know of other LPS events involving CNG, please create a new issue describing the event and/or add the event yourself (i.e. edit this page, add a new event 'item', and then submit a PR).