Terms of use

GigaScience Data Use Policies

GigaScience Database

GigaScience and the BGI impose no restrictions on access to, or use of, the data provided. Data generated by the BGI and its collaborators is released into the public domain in accordance with community norms for scholarly communication. Data associated with the GigaScience journal is specifically released under a Creative Commons CC0 license. Such policies are only sustainable if scientific credit is generated for all parties involved, and the BGI is playing its part in developing a global research environment that rewards data sharing.

BGI Data

As one of the world’s largest biological data producers, the BGI’s goal is to maximize the use of its data by providing it to the research community in a timely manner. At the same time, BGI recognizes the need for researchers to be appropriately credited for their scientific contribution and investment in data generation. It is therefore expected that all researchers both honor agreements in line with the Fort Lauderdale and the Toronto International Data Release Workshop data sharing principles and appropriately acknowledge the contributions of others.

Accordingly, raw data such as individual sequence read traces are submitted to the relevant database as soon as they have exited our quality control pipelines. Whole genome sequence assemblies are released as soon as possible following appropriate quality analysis. Our repository contains draft versions of genome sequence assemblies, and we ask that you understand that these represent preliminary data, subject to omissions and errors. In addition, whole genome assemblies are likely to change upon the availability of new data, and our website will document new assembly versions as they are released.

In recognition of the extensive effort that underlies these projects, we ask that you appropriately acknowledge the use of any preliminary data. To help researchers find, access, and reuse data, we have issued citable digital object identifiers for each dataset. Our recommended format for citing a dataset is listed on that dataset's page. This recommendation follows the guidelines adopted by the genome sequencing community in a statement of principles for the distribution and use of large-scale sequencing data (Community Resource Projects) and the resulting NHGRI policy statement. If you have any questions regarding the use of this data, please contact us. In line with research norms, we request that you ask us before publishing analyses of the sequences on a genome scale. We welcome collaborative interaction to provide the community with improved whole genome analyses and annotations.


Some of the data provided from external sources may be subject to third-party constraints. Users are solely responsible for establishing the nature of and complying with any such intellectual property restrictions.

GigaScience and the BGI provide this data in good faith, but make no warranty, express or implied, nor assume any legal liability or responsibility for any purpose for which it is used.