CAMISIM: Software for Generating Simulated Metagenome Data
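  • Metagenome sequencing data sets of microbial communities are highly diverse, so effectively assessing the performance of the many metagenomic analysis tools requires a large number of benchmark data sets;
  • CAMISIM is a program that generates simulated data for microbial communities and metagenome samples;
  • Its steps are: community design (selecting community members and their genomes, and assigning relative abundances), generation of simulated metagenome sequencing data, and post-processing (creating gold standards for binning and assembly);
  • Simulated data produced by CAMISIM can be used to test how metagenome data with different evolutionary divergence, sequencing depth, and error rates perform across analysis tools, guiding the choice of the best method.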
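The three steps listed above can be sketched in a few lines of Python. This is only an illustrative toy, not CAMISIM's own code or API: the function names, the genome IDs, the log-normal draw with `mu` and `sigma`, and the read ID format are all assumptions made for this example, and reads are allocated by abundance alone, ignoring genome length and actual sequence simulation.

```python
import random


def design_community(genome_ids, mu=1.0, sigma=2.0, seed=42):
    """Assign relative abundances by normalizing log-normal draws.

    The log-normal distribution and its parameters are assumptions
    chosen for illustration.
    """
    rng = random.Random(seed)
    raw = {g: rng.lognormvariate(mu, sigma) for g in genome_ids}
    total = sum(raw.values())
    return {g: value / total for g, value in raw.items()}


def allocate_reads(abundances, total_reads):
    """Split a read budget across genomes in proportion to abundance.

    Simplification: genome length is ignored.
    """
    return {g: round(a * total_reads) for g, a in abundances.items()}


def gold_standard_binning(read_counts):
    """Map every simulated read ID to its source genome (the binning truth)."""
    mapping = {}
    for genome, n_reads in read_counts.items():
        for i in range(n_reads):
            mapping[f"{genome}|read{i}"] = genome  # hypothetical read ID format
    return mapping


genomes = ["genome_A", "genome_B", "genome_C"]  # hypothetical community members
abundances = design_community(genomes)
read_counts = allocate_reads(abundances, total_reads=1000)
truth = gold_standard_binning(read_counts)
```

Because the truth table is built at the same time as the reads are assigned, any binning tool run on such data can be scored by comparing its output against `truth`, which is the idea behind shipping gold standards with simulated data sets.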
Editor's Recommendation
高春辉
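CAMISIM is software designed specifically for generating simulated metagenome and microbial community sequencing data. The data sets it produces can be used to benchmark different metagenomic analysis tools, guide parameter settings when using them, and optimize analysis workflows.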
Further reading: original article information and source link.
Microbiome [IF:11.607]

CAMISIM: simulating metagenomes and microbial communities

DOI: 10.1186/s40168-019-0633-6

2019-02-08, Article

Abstract:
BACKGROUND: Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required.
RESULTS: We describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series, and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes, we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM.
CONCLUSIONS: CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with standards of truth for method evaluation. All data sets and the software are freely available at https://github.com/CAMI-challenge/CAMISIM.

First Authors:
Adrian Fritz,Peter Hofmann

Correspondence Authors:
Alice C McHardy

All Authors:
Adrian Fritz,Peter Hofmann,Stephan Majda,Eik Dahms,Johannes Dröge,Jessika Fiedler,Till R Lesker,Peter Belmann,Matthew Z DeMaere,Aaron E Darling,Alexander Sczyrba,Andreas Bremges,Alice C McHardy
