Shandong University: NCycDB, a database dedicated to profiling nitrogen cycling genes in metagenomes
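  • NCycDB is a manually curated database of nitrogen cycling genes; compared with general-purpose databases such as COG, eggNOG, KEGG, and Subsystems, it covers more of these genes and assigns them more accurately;
  • It contains 68 gene (sub)families spanning eight nitrogen cycle processes, with 219,146 representative sequences at a 100% identity cutoff (84,759 at 95%); 1,958 homologous orthology groups were also identified and their sequences included to reduce false-positive assignments;
  • NCycDB was applied to nitrogen cycling gene families in 52 marine metagenome samples from the Global Ocean Sampling expedition, where the structure and composition of these gene families correlated most strongly with latitude and temperature;
  • These results suggest that NCycDB can facilitate studies of nitrogen cycling genes in metagenomic samples.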
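The role of the homologous orthology groups can be illustrated with a minimal sketch. It assumes each read has already been searched against the database and that a lookup table maps every database sequence to either a target gene family or a homolog group. The function name, the input shapes, and the toy data are all hypothetical and are not taken from the NCycDB tooling.

```python
from collections import Counter

def profile_families(best_hits, seq_to_group, target_families):
    """Count reads per target gene family (hypothetical helper).

    best_hits: iterable of (read_id, subject_id) pairs, best hit listed first.
    seq_to_group: dict mapping subject_id -> gene family or homolog group label.
    target_families: set of labels that are genuine N cycle gene (sub)families.
    """
    counts = Counter()
    seen = set()
    for read_id, subject_id in best_hits:
        if read_id in seen:  # keep only the first hit recorded for each read
            continue
        seen.add(read_id)
        group = seq_to_group.get(subject_id)
        # A read whose best hit is a homologous, non-target group is dropped
        # instead of being forced onto the nearest target family.
        if group in target_families:
            counts[group] += 1
    return counts

# Toy data, invented for illustration only.
seq_to_group = {"s1": "nifH", "s2": "amoA", "s3": "homolog_group_1"}
best_hits = [("r1", "s1"), ("r2", "s3"), ("r3", "s1"), ("r4", "s2"), ("r1", "s2")]
counts = profile_families(best_hits, seq_to_group, {"nifH", "amoA"})
print(dict(counts))
```

Without the homolog entry for `s3`, read `r2` would have no closer match than a target family and could be counted as a false positive, which is the "small database" issue the abstract describes.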
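Editor's recommendation
Gao Chunhui
The Institute of Marine Science and Technology at Shandong University has published a paper in Bioinformatics introducing a database built specifically for analyzing nitrogen cycling genes in environmental samples.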
Bioinformatics [IF:5.61]

NCycDB: a curated integrative database for fast and accurate metagenomic profiling of nitrogen cycling genes

10.1093/bioinformatics/bty741

2018-08-28, Article

Abstract:
Motivation: The nitrogen (N) cycle is a collection of important biogeochemical pathways in the Earth ecosystem and has gained extensive foci in ecology and environmental studies. Currently, shotgun metagenome sequencing has been widely applied to explore gene families responsible for N cycle processes. However, there are problems in applying publically available orthology databases to profile N cycle gene families in shotgun metagenomes, such as inefficient database searching, unspecific orthology groups, and low coverage of N cycle genes and/or gene (sub)families.
Results: To solve these issues, this study built a manually curated integrative database (NCycDB) for fast and accurate profiling of N cycle gene (sub)families from shotgun metagenome sequencing data. NCycDB contains a total of 68 gene (sub)families and covers eight N cycle processes with 84,759 and 219,146 representative sequences at 95% and 100% identity cutoffs, respectively. We also identified 1,958 homologous orthology groups and included corresponding sequences in the database to avoid false positive assignments due to “small database” issues. We applied NCycDB to characterize N cycle gene (sub)families in 52 shotgun metagenomes from the Global Ocean Sampling expedition. Further analysis showed that the structure and composition of N cycle gene families were most strongly correlated with latitude and temperature. NCycDB is expected to facilitate N cycle studies via shotgun metagenome sequencing approaches in various environments. The framework developed in this study can be served as a good reference to build similar knowledge-based functional gene databases in various processes and pathways.

First Authors:
Qichao Tu

Corresponding Authors:
Qichao Tu, Lu Lin

All Authors:
Qichao Tu, Lu Lin, Lei Cheng, Ye Deng, Zhili He
