Electronic Laboratory Notebook

 

The growing popularity of Open Science has led to more and more data being shared publicly across the community. Openly shared data require rich metadata in order to satisfy the FAIR principles. Further, as larger numbers of smaller datasets are shared, it becomes critical to be able to combine these datasets in order to use them successfully. This requires a structured way to describe the experimental setup and context.

Our goal is to provide an open electronic laboratory notebook (ELN) platform that enables the description, capture, and search of metadata related to a neuroimaging experiment. The platform consists of three metadata components:

Description | A structured metadata specification, based on controlled terminology, that describes the experimental context (e.g., project, lab), the laboratory setup (devices used), and the paradigm (stimuli used, responses, etc.).

Capture | A user-friendly way for users to enter metadata into the platform. This is integrated into the XNAT data management tool.

Discovery | The data search component will enable semantically enabled queries across the metadata via the NEXUS platform.

With this platform, we hope to provide the community with a toolset that will enable researchers to describe their experiments in a machine-readable manner, share datasets with rich, semantically enabled metadata, and perform in-depth searches across shared datasets.
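As a rough illustration of what a machine-readable experiment description could look like, the sketch below models the three parts named under Description (experimental context, laboratory setup, paradigm) as Python dataclasses. All class names, field names, and example values are hypothetical and chosen for illustration only; they are not the ELN's actual metadata specification.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ExperimentalContext:
    # Hypothetical fields for the "experimental context" part.
    project: str
    lab: str


@dataclass
class LaboratorySetup:
    # Hypothetical field for the "laboratory setup" part.
    devices: List[str] = field(default_factory=list)


@dataclass
class Paradigm:
    # Hypothetical fields for the "paradigm" part.
    stimuli: List[str] = field(default_factory=list)
    responses: List[str] = field(default_factory=list)


@dataclass
class ExperimentDescription:
    context: ExperimentalContext
    setup: LaboratorySetup
    paradigm: Paradigm

    def to_json(self) -> str:
        """Serialize the whole description to JSON text."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


def uses_device(description: ExperimentDescription, device: str) -> bool:
    """Return True if the described setup lists the given device."""
    return device in description.setup.devices


# Illustrative values only; they do not describe a real experiment.
example = ExperimentDescription(
    context=ExperimentalContext(project="example-project", lab="example-lab"),
    setup=LaboratorySetup(devices=["MEG scanner", "response box"]),
    paradigm=Paradigm(
        stimuli=["face image", "object image"],
        responses=["button press"],
    ),
)
```

Keeping the description in typed fields rather than free text is what makes a query such as `uses_device(example, "MEG scanner")` possible; a real specification would additionally constrain the values to a controlled terminology.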

 
 

Access ELN Source Code

Meet the Team

  • Adeel Ansari

    Knowledge Engineer Lead

  • Mohanna Ramaratnam

    Sr Programmer Analyst

  • Praveen Sripad

    Scientific Software Engineer

  • Niccolò Bonacchi

    Data Architect

  • Tanya Brown

    Scientific Project Manager

  • James Dickson

    Senior Director of Clinical Solutions and Support

  • Blake Griggs

    Sr Account Manager

  • Sean Hill

    Scientist

  • Daniel Marcus

    Chief Scientific Officer

  • Lucia Melloni

    Scientist

The aim is reproducibility.

We developed an ELN platform that enables reproducible workflows for COGITATE. The platform enables researchers to define experiment protocols, capture and manage the experiment metadata associated with these protocols, and publish the resulting datasets to a semantically enabled search engine.

 

Supported by the Templeton World Charity Foundation


TWCF0486: A Collaborative ELN for Open, Shared and Reproducible Data-Driven Science; https://doi.org/10.54224/20486


TWCF0485: A user-friendly ELN to accelerate brain research (Pilot Study)

 

Key Features of our ELN

 
  • Making the ELN open will encourage adoption by the scientific community, in contrast to paywalled commercial subscriptions. Engaging the community also has the potential to define standards, e.g., the minimal information to be ingested in an ELN. Datasets could then be given a DOI and their provenance traced, and higher-quality datasets that lead to further discoveries could be acknowledged with citations, which can further incentivize data sharing and documentation. In sum, a well-made open source electronic laboratory notebook is necessary to foster open and reproducible science.

  • We are committed to FAIR principles to increase the transparency, interpretability, and reproducibility of brain research. In accordance with the FAIR Guiding Principles (Wilkinson et al., 2016), our software enables researchers to clearly define, annotate, curate, and share their studies with rich metadata and defined protocols that enable better discovery and reuse. This work complements existing resources supported by the GO FAIR initiative and other programs committed to FAIR principles. Currently there is no FAIR software that provides adequate tools for defining neuroscience experimental protocols and then capturing associated laboratory data from MRI, EEG, MEG, and other common neuroscience modalities. Tools like the Castor Electronic Data Capture (EDC) platform focus on form-based clinical assessments rather than instrument-based data collection. Similarly, the Data Stewardship Wizard guides researchers in developing FAIR data management plans but does not focus on defining specific experimental protocols and capturing associated study data. We therefore believe our proposed work fills a critical unmet need in the neuroscience community. Specific implementation strategies to support FAIR science are detailed below.

  • We propose to develop an open source laboratory notebook that incorporates the neuroimaging modalities of COGITATE. These are also the most common techniques used in human neuroscience, i.e., functional magnetic resonance imaging (fMRI), electro- and magnetoencephalography (EEG/MEG), and electrocorticography (ECoG). We aim to use this study as a live testbed for the proposed ELN, to ensure its utility and usability, while also following best practices for open source software development to ensure scalability to other projects.

  • The existing software allows a study coordinator to define a Protocol within the XNAT web application, consisting of a set of visits and, for each visit, a set of experiments to be conducted. Optionally, assessments on these experiments (e.g., QA or processing) can be specified as well. The defined Protocol guides data collection, offering an intuitive interface for end users to easily initiate experiments per its guidelines. The work proposed here will extend our current Protocol concept so that it can be integrated with protocols.io data structures. Protocols.io is an emergent technology for documenting scientific protocols, predominantly as machine-readable, structured content, that is highly complementary to our proposed platform. Protocols.io was initially used to document molecular and computational protocols, which can be cited through a DOI from a paper’s materials and methods section. Our focus is primarily on human cognitive neuroscience, for which the capabilities of protocols.io can be expanded and complemented with our current Protocol concept. Users will thus be able to import protocols.io definitions created elsewhere for use in our app and to export Protocols from our app into the protocols.io format, obtaining a DOI. Utilizing the underlying protocols.io structure will enable our app and the downstream data management and search components to validate data compliance and search criteria.

  • The data capture component of the ELN will be implemented as a web interface as well as an iPad app that provides an elegant, user-friendly experience that users are eager to interact with as they collect data.

    The database will also track and record the processing and analysis workflows performed by scientists on the data as machine-readable provenance records. The database system will build on the open source XNAT informatics platform, which is widely used for managing experimental data and related metadata across many areas of neuroscience research.

    XNAT is a web-based software platform designed to facilitate common management and productivity tasks for imaging and associated data. It consists of an image repository to store raw and post-processed images, a database to store metadata and non-imaging measures, and user interface tools for accessing, querying, visualizing, and exploring data. XNAT supports all common imaging methods, and its data model can be extended to capture virtually any related metadata. XNAT includes a DICOM workflow to enable exams to be sent directly from scanners, PACS, and other DICOM devices. XNAT’s web application provides a number of productivity features, including data entry forms, searching, reports of experimental data, upload/download tools, access to standard laboratory processing pipelines, and an online image viewer.

  • The database component will be searchable through a semantically enabled search engine, to be built on Blue Brain Nexus - Knowledge Graph. Nexus is currently used by CAMH for its BrainHealth Databank, the Blue Brain Project, the Human Brain Project, and eBrains.

    Nexus combines a data store (Cassandra), triple store (BlazeGraph) and document store (ElasticSearch) to provide a high performance, scalable and flexible metadata store, data management and search engine capability. The architecture uses an event stream to asynchronously maintain multiple optimized search indices while supporting large numbers of concurrent users. Nexus uses JSON-LD, RDF, and SHACL natively to represent and validate data schemas. This enables highly general data representations, supporting community developed and maintained data schema standards, including schema.org, bioschemas.org, and neuroshapes.org.

    Existing data schema standards are adopted wherever possible, including those developed for BIDS and the NeuroImaging Data Model (NIDM). These community models support provenance tracking and reproducible pipeline analysis of brain imaging data. Other schemas, including DATS (supported by schema.org), enable the registration and search of datasets and are widely adopted, including by Google Dataset Search. Neuroshapes.org is a community effort (and INCF special interest group) to establish open standards for neuroscience data, including single cell morphology, electrophysiology, connectivity, image stacks, brain atlases, and computational models. These schema repositories will be integrated to adopt standard, web-compatible, interoperable data schemas.

    Nexus provides a web interface, command line interface, REST API, Python software development kit (SDK), and JavaScript SDK to facilitate integration with other platforms and developing web interfaces. For security, Nexus supports granular permissions systems to provide access across data repositories, allowing for combinations of private, protected and public information, depending on a user’s role within a study, accessing only specific data types or cohorts as determined by data access governance. Building on Nexus, we will develop a web portal and search engine to discover, access, track and share data.

    The search engine will address the core functional layers of interoperability, for sharing data between teams and allowing a central source for open datasets:

    Findability: Nexus ensures that data are findable using PIDs, detailed metadata, and a flexible graph database for search and discovery. Unique persistent identifiers are used throughout the Knowledge Graph structure. These identifiers ensure that data can be uniquely located within the graph and recalled rapidly when needed. The core and extensible schemas include highly detailed metadata, to ensure data are findable as volumes scale. The semantic structure of the Knowledge Graph supports flexible semantic search to rapidly traverse the database. Data are coordinated using a naturalistic structure, allowing for easy and rapid identification of pertinent datasets.
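The findability ideas above (persistent identifiers, metadata-based search, and traversal of links between records) can be sketched with a small in-memory example. This is a toy model under stated assumptions and does not use the Nexus API: the `MetadataGraph` class, the `mint_pid` helper, the `https://example.org/eln/` namespace, and the property names are all hypothetical. Records borrow the JSON-LD convention of `@id` and `@type` keys.

```python
import uuid
from typing import Any, Dict, List

# Hypothetical namespace used only to derive example identifiers.
BASE = "https://example.org/eln/"


def mint_pid(kind: str, name: str) -> str:
    """Derive a stable identifier: the same inputs always yield the same PID."""
    return "urn:uuid:" + str(uuid.uuid5(uuid.NAMESPACE_URL, BASE + kind + "/" + name))


class MetadataGraph:
    """Toy in-memory store of records keyed by persistent identifier."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, Any]] = {}

    def add(self, kind: str, name: str, **properties: Any) -> str:
        """Store a record and return its persistent identifier."""
        pid = mint_pid(kind, name)
        self._records[pid] = {"@id": pid, "@type": kind, "name": name, **properties}
        return pid

    def resolve(self, pid: str) -> Dict[str, Any]:
        """Look a record up directly by its persistent identifier."""
        return self._records[pid]

    def find(self, criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Return every record whose metadata matches all given key/value pairs."""
        return [
            record
            for record in self._records.values()
            if all(record.get(key) == value for key, value in criteria.items())
        ]

    def follow(self, pid: str, link: str) -> Dict[str, Any]:
        """Traverse one link: the property `link` holds another record's PID."""
        return self._records[self._records[pid][link]]


# Illustrative records only; names and properties are made up.
graph = MetadataGraph()
protocol_pid = graph.add("Protocol", "example-protocol", visits=["visit-1", "visit-2"])
dataset_pid = graph.add("Dataset", "example-dataset", modality="MEG", protocol=protocol_pid)

meg_datasets = graph.find({"@type": "Dataset", "modality": "MEG"})
source_protocol = graph.follow(dataset_pid, "protocol")
```

Deriving identifiers deterministically means a record can be referenced before or after it is stored without the identifier changing, and storing links as identifiers rather than embedded copies is what allows a dataset to be traced back to the protocol that produced it.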