ISB Datasets

Manually Curated Integrated Structural Biology Multi-Technique Datasets

About this portal

This site lists manually curated examples of integrated structural biology work that combines multiple measurement and modelling approaches on the same biological question. The goal is to make multimodal studies easier to discover, to highlight how techniques are combined in practice, and to point to publications and public data where they exist. Each listed dataset includes at least one modality (or primary data component) collected at SciLifeLab together with complementary methods from partner labs or external repositories. Entries are not an exhaustive catalogue of all Swedish or SciLifeLab structural biology; they illustrate coherent multimodal datasets and how they are described and shared.

What is a structural biology multimodal dataset?

A structural biology multimodal dataset is a coherent collection of primary (raw) and derived data from two or more distinct structural biology measurement modalities (e.g., cryo-EM, NMR, XL-MS, HDX-MS). All data refer to the same biological system under defined experimental conditions, and the modalities are explicitly linked so that they can be jointly analysed, compared, or integrated into a unified structural or biophysical interpretation.

In practice, it is not simply “a folder with different files”. A multimodal dataset has a few defining properties:

  • Common referent: one sample or entity (or a controlled set of variants), with consistent identifiers, composition, construct sequences, buffers, states, and timepoints.
  • Multiple modalities: at least two techniques that measure complementary aspects (shape, distances, dynamics, heterogeneity, interactions, and similar).
  • Explicit linkage metadata: machine-readable relationships between modalities, plus shared coordinate frames or mappings when relevant.
  • FAIR-by-design packaging: standardised metadata and sharing conditions so the dataset is findable and reusable across platforms and repositories.

What is not a multimodal dataset

  • Single-modality variation only: the same technique under different conditions, replicates, or timepoints.
  • Unlinked multi-technique data: two modalities measured “somewhere in the project” but without explicit sample or condition mapping. This is multi-technique work, not a multimodal dataset in this sense.
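
The criteria in the two lists above can be sketched as a small machine-readable check. This is only an illustration, not a format used by this portal: the class names, field names, and the classify function are hypothetical. It covers just the common referent and the modality count; it does not model coordinate-frame mappings or FAIR packaging.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """One data component. All field names are hypothetical."""
    modality: str              # e.g. "cryo-EM", "NMR", "XL-MS", "HDX-MS"
    sample_id: Optional[str]   # None means no explicit sample mapping was recorded
    condition: str             # buffer, state, or timepoint label


@dataclass
class Dataset:
    # The common referent: one sample, or a controlled set of variants.
    sample_ids: set[str]
    measurements: list[Measurement] = field(default_factory=list)


def classify(dataset: Dataset) -> str:
    """Classify a dataset using the criteria from the lists above."""
    # A measurement counts as linked only if it names a declared sample.
    linked = [m for m in dataset.measurements if m.sample_id in dataset.sample_ids]
    linked_modalities = {m.modality for m in linked}
    all_modalities = {m.modality for m in dataset.measurements}

    if len(linked_modalities) >= 2:
        return "multimodal"
    if len(all_modalities) >= 2:
        # Several techniques, but fewer than two are mapped to the referent.
        return "unlinked multi-technique"
    return "single-modality"
```

For example, under these assumptions a cryo-EM and an XL-MS measurement that both name sample "S1" would be classified as multimodal, while two cryo-EM measurements of "S1" under different conditions would be classified as single-modality.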