How to make your data FAIR
REMBI
The Recommended Metadata for Biological Images
What is REMBI?
REMBI is a community-developed framework that was published in 2021 by experts from the light, electron, and X-ray microscopy communities. It provides practical recommendations for describing biological imaging data with standardized metadata.
REMBI is structured in a clear and accessible way, making it a helpful starting point for researchers who are new to metadata annotation. Its flat structure is easy to navigate and gives practical guidance on what to record when documenting microscopy experiments.
Check out the REMBI publication!
Why use REMBI?
Scientific data should be Findable, Accessible, Interoperable, and Reusable (FAIR). REMBI supports these principles by defining which metadata should accompany imaging datasets.
Metadata provides the essential context that allows researchers to understand, interpret, and reuse data correctly. Without clear documentation, even high-quality imaging data can lose much of its scientific value.
REMBI delivers structured information that benefits different user groups, including:
- Biologists and life scientists, who need experimental and sample context
- Imaging scientists, who require detailed technical and acquisition parameters
- Computer vision researchers, who depend on well-described data for analysis, modeling, and algorithm development
What does REMBI define?
REMBI is organized into eight main sections that cover the key aspects of biological imaging experiments.
This clear structure makes it flexible enough to support many different imaging methods and research areas, while still being easy to understand and use.
It covers, for example:
- The microscope used and acquisition settings
- The sample type and how it was prepared
- Image processing and analysis steps
- Relevant experimental conditions
- Time points and spatial scales
Together, these elements help ensure that imaging data can be properly understood and reused.
Where is REMBI used?
- BioImage Archive – REMBI is used as the metadata model for submissions
- EMPIAR (Electron Microscopy Public Image Archive) – REMBI recommendations are adopted
- Cell-IDR (Cell Image Data Resource) – REMBI metadata are implemented
- Tissue-IDR – REMBI recommendations are adopted
- Image Data Resource (IDR) – REMBI and related metadata recommendations are used
The 8 sections of REMBI
- Study represents the highest level of metadata. It provides an overarching description of your project, including related funding sources and associated publications.
- Study component functions as an organizational unit that structures your data, for example by experiment type or sample category. Each Study Component may contain one or more of the following elements: Biosample, Specimen, Image Acquisition, Image Data, Image Correlation, and Analyzed Data (the latter two only if applicable).
- Biosample describes the biological material that was imaged, such as the organism and species, specific cell lines, or genetic background.
- Specimen metadata outlines how the sample was prepared prior to imaging.
- Image acquisition documents the technical details of how the images were captured, including methods and acquisition parameters.
- Image data contains metadata at the individual image level.
- Image correlation (optional) provides metadata on the alignment or integration of images obtained from different imaging modalities.
- Analyzed data (if applicable) includes metadata describing the image processing and analysis workflows applied to the data.
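To make the hierarchy above more concrete, the following minimal sketch models the eight sections as Python data classes. It is only an illustration: the class names, field names, and the `missing_sections` helper are assumptions made for this example, not an official REMBI schema, and REMBI itself lists many more attributes per section.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical layout for illustration only. REMBI does not prescribe a
# Python representation, and each section holds more attributes than shown.


@dataclass
class StudyComponent:
    name: str
    biosample: dict            # e.g. organism, cell line, genetic background
    specimen: dict             # how the sample was prepared before imaging
    image_acquisition: dict    # imaging method and acquisition parameters
    image_data: dict           # metadata at the individual image level
    image_correlation: Optional[dict] = None  # optional: multimodal alignment
    analyzed_data: Optional[dict] = None      # optional: processing and analysis


@dataclass
class Study:
    title: str
    description: str
    funding: list = field(default_factory=list)
    publications: list = field(default_factory=list)
    components: list = field(default_factory=list)  # StudyComponent objects


def missing_sections(component):
    """Return the names of the non-optional sections that were left empty."""
    required = ["biosample", "specimen", "image_acquisition", "image_data"]
    return [name for name in required if not getattr(component, name)]


# Placeholder values, not taken from a real dataset.
component = StudyComponent(
    name="example component",
    biosample={"organism": "placeholder organism"},
    specimen={},
    image_acquisition={"method": "placeholder method"},
    image_data={},
)
study = Study(title="example study", description="placeholder",
              components=[component])
print(missing_sections(component))
```

A check like `missing_sections` can serve as a simple reminder of which sections still need to be filled in before a dataset is submitted to a repository.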
Need Help? We’re Here for You!
If you have further questions about REMBI, need support implementing the REMBI guidelines for your datasets, or would like assistance preparing your data for submission to a repository, we are here to support you.
Simply contact our Help Desk, and we will work with you to find the best way to assist you.
MIFA
Metadata, Incentives, Formats, and Accessibility
What is MIFA?
MIFA (Metadata, Incentives, Formats, and Accessibility) is a community-driven framework of guidelines designed to improve how AI-ready datasets, especially annotated image sets used in bioimage analysis, are shared and reused. It was developed by experts from multiple research institutions to address common barriers in dataset reuse and to support better, more standardized data sharing practices in the scientific community.
Check out the MIFA publication!
Why use MIFA?
AI methods, particularly in biological image analysis, depend on access to high-quality annotated data. However, such datasets are often difficult to reuse because they lack consistent standards for metadata, organization, and sharing. MIFA helps overcome these challenges by providing recommendations that make datasets easier to find, understand, and repurpose, which can accelerate the development of new AI tools and improve reproducibility in research.
What does MIFA define?
The MIFA guidelines outline recommendations in four key areas:
- Metadata – what descriptive information should accompany a dataset to make it understandable and interoperable.
- Incentives – suggestions for encouraging researchers to create, document and share high-quality datasets.
- Formats – guidance on preferred data and annotation formats so datasets work across tools and platforms.
- Accessibility – how datasets should be published and made discoverable so others can access and reuse them easily.
These components work together to ensure datasets are well-described, usable by others, and organized in a way that promotes reuse in AI development.
Where is MIFA used?
MIFA is intended for use by researchers and data curators in fields that rely on annotated image datasets for AI training and analysis, particularly in biological imaging. It serves as a guideline for dataset creators, repository managers, and tool developers to standardize how data is described and shared, ultimately enabling easier reuse across the scientific community.
- BioImage Archive – MIFA guidelines are implemented in the annotation files
The 4 main categories of accompanying metadata
To maximize the reuse and long-term value of an AI imaging dataset, documenting metadata in four key areas is recommended:
- Study-Level Metadata
Provides the overall scientific context:
– Short description of the biological question or application
– Authors and related publications
– Use of persistent identifiers, ontologies, and controlled vocabularies
– Clear licensing information (open licenses such as CC0 or CC BY are recommended)
– Links to trained AI models stored in dedicated repositories (e.g., BioImage Model Zoo)
- Image Metadata
Describes how the images were generated:
– Use the REMBI guidelines to document imaging details
- Annotation Metadata
Explains how the data were annotated:
– Authors and contact information
– Annotation overview, type, method, confidence level, criteria and coverage
– Link to the corresponding source images
– Information about transformations and spatial properties
– Date of last modification
- Versioning Metadata
Tracks dataset development over time:
– Metadata and timestamps for each dataset version
– Clear description of changes between versions
– Proper credit to original annotators
– Maintained links to original images, including creator information and licensing
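As a rough illustration of how the four categories above could be tracked for a dataset, the sketch below stores them as a nested Python dictionary and reports which fields are still empty. The key names, the `MIFA_CATEGORIES` table, and the `find_gaps` helper are assumptions made for this example. MIFA describes what should be documented, not this particular layout.

```python
# Hypothetical key names that paraphrase the four categories listed above.
# This is not an official MIFA schema.
MIFA_CATEGORIES = {
    "study": ["description", "authors", "publications", "license"],
    "image": ["rembi_metadata"],
    "annotation": ["authors", "annotation_type", "annotation_method",
                   "source_image_link", "last_modified"],
    "versioning": ["version", "timestamp", "changes", "original_annotators"],
}


def find_gaps(record):
    """Map each category to the fields that are missing or empty in record."""
    gaps = {}
    for category, fields in MIFA_CATEGORIES.items():
        section = record.get(category, {})
        missing = [name for name in fields if not section.get(name)]
        if missing:
            gaps[category] = missing
    return gaps


# Placeholder record, not taken from a real dataset.
example_record = {
    "study": {
        "description": "placeholder description",
        "authors": ["placeholder author"],
        "publications": ["placeholder reference"],
        "license": "CC0",
    },
    "image": {"rembi_metadata": {"image_acquisition": "placeholder"}},
    "annotation": {
        "authors": ["placeholder annotator"],
        "annotation_type": "placeholder type",
    },
}

for category, missing in find_gaps(example_record).items():
    print(category, "is missing:", ", ".join(missing))
```

In this placeholder record the study and image categories are complete, so only the annotation and versioning gaps are reported.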
