Covid19-NMR is a collaborative, worldwide initiative that aims to accelerate NMR characterization of SARS-CoV-2 proteins and RNA and to make the results publicly available, supporting further research on the structure, dynamics, and interactions of these viral biomolecules. These studies will deepen understanding of SARS-CoV-2 biology at the molecular level and help speed drug discovery efforts.
Goethe University, together with the company SIGNALS, has launched a Covid19-NMR homepage (www.covid19-nmr.com). Under the rules of participation, members of the Covid19-NMR consortium have agreed to provide immediate updates of research results on this internet platform.
The success of the initiative hinges on knowledge sharing, including plasmids and constructs, expression and purification protocols, NMR sample conditions, empirical NMR data, and assignments. This whitepaper outlines a data stewardship plan that addresses three needs: sharing NMR data within the consortium, public access to the data, and long-term data persistence. The plan follows the FAIR data principles: Findable, Accessible, Interoperable, and Reusable. It entails a hierarchy of approaches with varying levels of FAIRness, and each level of the hierarchy offers data persistence. The three components of the data stewardship plan are summarized in the table below.
Each group participating in the project will be assigned a unique identifier, which should be associated with all of that group's data uploads. The mechanism for including the identifier varies by platform.
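As an illustration of attaching a group identifier to every upload, the following is a minimal Python sketch and not part of any platform's actual interface. The function `tag_upload`, the identifier format `C19NMR-007`, and the LOGS field name `group_id` are hypothetical; only the BMRbig "Organization" field is named in this whitepaper.

```python
from typing import Dict

# Which metadata field carries the group identifier on each platform.
# "Organization" for BMRbig comes from this whitepaper; the LOGS field
# name below is a placeholder assumption, not a documented LOGS field.
ID_FIELD_BY_PLATFORM: Dict[str, str] = {
    "LOGS": "group_id",        # hypothetical field name
    "BMRbig": "Organization",  # field named in the whitepaper
}


def tag_upload(metadata: Dict[str, str], platform: str, group_id: str) -> Dict[str, str]:
    """Return a copy of `metadata` with the group identifier added
    under the field appropriate for `platform`."""
    if not group_id:
        raise ValueError("every upload must carry a group identifier")
    if platform not in ID_FIELD_BY_PLATFORM:
        raise ValueError(f"unknown platform: {platform!r}")
    tagged = dict(metadata)  # leave the caller's dictionary untouched
    tagged[ID_FIELD_BY_PLATFORM[platform]] = group_id
    return tagged


# Example with a made-up identifier and sample description.
record = tag_upload({"sample": "example protein sample"}, "BMRbig", "C19NMR-007")
print(record["Organization"])
```

Keeping the platform-to-field mapping in one place reflects the point that the identifier is constant while the mechanism for recording it differs per platform.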
Level 1: LOGS repository
LOGS is a scientific data management system that the consortium uses to manage and share NMR data among its members. For the Covid19-NMR project, it is hosted on central servers in Frankfurt. LOGS supports automatic upload directly from NMR spectrometers.
Participants with access to LOGS will use its built-in automatic upload, which is configurable for each instrument used in the measurements. Participants without access to LOGS will receive LOGS accounts and upload data using the functionality provided.
LOGS has built-in functionality for exporting NMR results to the NMR-STAR file format, which facilitates BMRB depositions.
Level 2: BMRbig upload (BMRbig.org)
BMRbig is a write-once “front porch” for BMRB that accommodates arbitrary data and can serve as a staging area for assembling subsequent BMRB entries. Covid19-NMR consortium data should be identified by entering the Covid19-NMR ID in the Organization field on the main upload page. Each upload is assigned a Digital Object Identifier (DOI) that can be used to access the data. Data can be augmented by linking subsequent uploads to the “parent” upload, but once uploaded, data cannot be modified or deleted.
Data are uploaded to BMRbig through a web interface that harvests minimal metadata. The Covid19-NMR ID should be included in the “Organization” field on the upload web page. Direct upload from LOGS to BMRbig will be implemented.
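The write-once behavior with parent linking can be pictured with a small in-memory model. This is a conceptual Python sketch only and does not represent BMRbig's implementation or interface. The names `Upload` and `WriteOnceStore`, the plain string identifiers standing in for DOIs, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a record cannot be altered after creation
class Upload:
    upload_id: str                   # stand-in for the DOI assigned on upload
    organization: str                # carries the Covid19-NMR ID
    files: Tuple[str, ...]
    parent_id: Optional[str] = None  # link to an earlier "parent" upload


class WriteOnceStore:
    """Append-only collection: uploads can be added and linked,
    but never modified or deleted."""

    def __init__(self) -> None:
        self._uploads: Dict[str, Upload] = {}

    def add(self, upload: Upload) -> None:
        if upload.upload_id in self._uploads:
            raise ValueError("uploads are write-once and cannot be replaced")
        if upload.parent_id is not None and upload.parent_id not in self._uploads:
            raise KeyError(f"unknown parent upload: {upload.parent_id!r}")
        self._uploads[upload.upload_id] = upload

    def children(self, parent_id: str) -> List[Upload]:
        """Return the uploads that augment the given parent upload."""
        return [u for u in self._uploads.values() if u.parent_id == parent_id]


# Example with made-up identifiers: a later upload augments the first one.
store = WriteOnceStore()
store.add(Upload("upload-1", "C19NMR-007", ("raw_spectra.zip",)))
store.add(Upload("upload-2", "C19NMR-007", ("peak_list.txt",), parent_id="upload-1"))
print([u.upload_id for u in store.children("upload-1")])
```

In this model, a correction is expressed as a new upload linked to its parent rather than as a change to the existing record, which mirrors the augmentation mechanism described above.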
Level 3: BMRB deposition
Deposition of data into BMRB entails collection of more metadata than the other data stewardship levels, but achieves the highest level of FAIRness as the data is findable based on metadata content. BMRB entries are also subject to curation, including the use of controlled vocabularies, to ensure the consistency and accuracy of the archive. Deposition of data into BMRB is accomplished via BMRBdep, or via OneDep for depositions accompanied by structural data. BMRB IDs and DOIs are assigned to BMRB depositions on submission but are not valid until entry release. LOGS supports export of NMR data and metadata in NMR-STAR format, to facilitate BMRB depositions.
Deposition of the consortium's NMR data in BMRB and of its structure data in PDB is the ultimate goal, and BMRB will serve as the NMR data repository of record for publications emanating from the consortium.