Data Management

The Data Division of the Computing and Data Centre of the Alfred Wegener Institute is responsible for publishing data products from environmental research and from the research infrastructures of AWI, MARUM and beyond. It provides services and support for data workflows, software tools and data science. These services are provided by the following groups.

Questions? Need support? Contact us at data@awi.de

 

Data Logistics Support

The Data Logistics Support (DLS) group provides services to transfer raw data from measuring devices (SENSOR, DShip, Data-INGEST, sensor network, satellite data transmission, NRT, ...) to AWI's central storage and archive systems. The devices can be mobile, installed on board research platforms, or used in the laboratory. It does not matter whether the devices transmit the raw data via satellite, for example from the Antarctic, or whether the data are transferred after the expedition on portable media such as hard disks or USB sticks.

The group members provide comprehensive help and advice in connecting your devices. Our service also includes the metadata description of the measuring device and its use (SENSOR and DShip).

The aim of the group is to relieve scientists of the technical workflows for transmitting their data, leaving them more time to focus on their scientific questions. In practice, this means you describe the device and the raw data transfer just once; after that, each deployment is documented automatically. That's all. Subsequently, all raw data are available in AWI's workspace and archived in PANGAEA with a rich set of metadata.

 

 

Data Science Support

The Data Science Support Group (DSS) focuses on several key areas: data science, data analysis, pattern recognition, artificial intelligence, machine learning, and bioinformatics (multiomics, single-cell analysis, metabarcoding). Leveraging our extensive expertise in AI and machine learning algorithms, we efficiently process large and computationally intensive datasets from the field of bioinformatics.

A core aspect of our work involves comprehensive data management, with a particular emphasis on sample management and the handling of molecular data. We also utilize geographic information systems (GIS) for precise data visualization.

Another crucial component of our mission is to train scientists in data science, bioinformatics, and GIS. Our training programs aim to deepen participants' understanding and skills in handling scientific datasets and equip them with modern data analysis tools and techniques.

Platforms: an interactive visualization platform (maps.awi.de) and analysis platforms in collaborative development environments (jupyterhub.awi.de and cloud.awi.de).

 

Software Engineering

The Software Engineering (SE) group develops infrastructure components and systems for scientific data management and core compute services. Following Scrum-based principles during the development process, we are agile in coping with new requirements from science. As an integrated DevOps team, we also maintain and run the applications we develop.

The software development portfolio ranges from metadata management for describing platforms, devices and sensors, through automatic data acquisition and transformation, to long-term archiving and publication with PANGAEA. This also includes solutions for web portals and data visualization. In short: Observation to Archive and Analysis - O2A.

Most applications are developed as web applications and provided as services for the science community. The technology stack ranges from Python scripting for data-intensive processes, through robust Java middleware and backend development, to web technologies. Of course, data modelling, databases and big data approaches are part of our daily work.

We are involved in projects with a strong technology focus, exploring and using the newest technologies for data storage, handling, processing and analysis in close collaboration with system experts inside and outside of AWI.

 

 

PANGAEA

The PANGAEA group provides essential services for scientific project data management, long-term data archiving and preservation, data publication, and dissemination of quality-approved metadata according to the FAIR data principles. Every published dataset is fully citable and has a persistent, unique Digital Object Identifier (DOI).

PANGAEA - Data Publisher for Earth & Environmental Science is a joint facility of the Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research (AWI) and the Centre for Marine Environmental Sciences (MARUM) at the University of Bremen. It is developed collaboratively with the DLS, DSS and SE groups as well as its thousands of international users to provide comprehensive data archiving and publication following the FAIR principles.

Founded in 1992, PANGAEA has demonstrated its long-term perspective through certification by the ICSU World Data System and the CoreTrustSeal, and is accredited by the WMO as a Data Collection and Processing Center (DCPC). The system is operated in compliance with the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities.

The successful cooperation between PANGAEA and the publishing industry enables the cross-referencing of scientific publications and archived datasets. PANGAEA is the recommended data repository of numerous international scientific journals.

Screenshot of the PANGAEA landing page illustrating the dynamics and diversity of datasets published in this long-term archive.


Head of Data Division
Prof. Dr. Frank Oliver Glöckner

Deputy Head of Data Division
Dr. Angela Schäfer

 

Team of Data Division

Data Logistics Support
Sebastian Immoor

Data Science Support
Dr. Sonja Hänzelmann

Software Engineering
Dr. Roland Koppe

PANGAEA
Dr. Janine Felden

 

 

O2A: The Observation to Archive and Analysis Framework

O2A is a generic and sustainable framework that enables the seamless flow of device (sensor) observations to archives and analysis. The framework builds upon international OGC standards for metadata and data interoperability and is meant to assist scientists in developing enhanced data products and to facilitate data re-use. AWI's data flow framework consists of seven modular and extensible components as depicted below.

Learn more about our Data Flow Framework.
Also have a look at the technical details and documentation in our Wiki.

Marine Data

The Marine Data Portal is the single entry point to near real-time data, platforms, expeditions and data visualisation of the German marine research vessels, infrastructures and beyond. Explore the large collection of sensors and our interactive maps. Create your own dashboard to explore the real-time data flow and analyse your data in the workspace.