# generand
Generate random genomic data in FASTA/FASTQ, SAM, or VCF format and write
it to standard output.
Output data is completely random and suitable for basic testing and
benchmarking of new programs, or for quickly generating small samples for
academic purposes.
Generand is especially useful for generating large test inputs on-the-fly
so that they need not be manually generated and/or stored.
Currently the randomization is very simplistic, but we will likely add
parameters in the future to allow generating more realistic data for specific
situations.
# Usage
generand fasta sequences sequence-length
generand fastq sequences sequence-length
generand sam chromosomes alignments-per-chromosome sequence-length
generand vcf chromosomes calls-per-chromosome samples
# Description
generand fast[aq] sequences sequence-length
generates a FASTA or FASTQ stream of
"sequences" sequences, each of length "sequence-length". The sequence
content is random with a uniform distribution of bases, so that GC content
should be very close to 50%.
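
The uniform base distribution described above can be illustrated with a minimal Python sketch. This is not generand's own code: the function names, the `>seqN` header format, and the `seed` parameter are assumptions made for illustration only.

```python
import random

def random_sequence(length, rng):
    """Return a DNA string with each base drawn uniformly from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def gc_fraction(seq):
    """Return the fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def fasta_records(sequences, sequence_length, seed=0):
    """Yield (header, sequence) pairs.

    The ">seqN" header format is hypothetical, not generand's actual output.
    """
    rng = random.Random(seed)
    for i in range(1, sequences + 1):
        yield f">seq{i}", random_sequence(sequence_length, rng)

if __name__ == "__main__":
    for header, seq in fasta_records(3, 60):
        print(header)
        print(seq)
```

With four equally likely bases, two of which are G or C, the expected GC fraction is 0.5, and `gc_fraction` approaches that value as the total number of bases grows.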
PHRED scores in FASTQ streams are generated in blocks of equal scores and
are mostly high-quality. The last few scores are lower quality and
independent to simulate Illumina sequencing, where quality tends to drop
near the end of each read.
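
One way to picture the block-wise quality scheme is the following Python sketch. The block size, tail length, score ranges, and Phred+33 encoding are all assumptions chosen for illustration; the README does not state the values generand actually uses.

```python
import random

def random_quality_string(length, rng, block_size=10, tail_length=5):
    """Build a quality string of the given length, encoded as Phred+33.

    block_size, tail_length, and both score ranges are illustrative
    assumptions, not values taken from generand.
    """
    tail = min(tail_length, length)
    body = length - tail
    scores = []
    # Body: runs of identical high scores, one random score per block.
    while len(scores) < body:
        score = rng.randint(30, 40)
        scores.extend([score] * min(block_size, body - len(scores)))
    # Tail: lower scores drawn independently for each base.
    scores.extend(rng.randint(2, 20) for _ in range(tail))
    return "".join(chr(score + 33) for score in scores)

if __name__ == "__main__":
    print(random_quality_string(50, random.Random(0)))
```

Drawing one score per block keeps most of the string uniform within each run, while the independent tail mimics the quality drop near the end of a read.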
generand sam chromosomes alignments-per-chromosome sequence-length
generates a SAM stream with chromosomes * alignments-per-chromosome total
alignments. It outputs increasing indexes for QNAME and CHROM (the RNAME
column in SAM), randomly increasing POS, random QUAL scores, and random
sequences and PHRED scores as stated for FASTQ above.
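
The record count and the "randomly increasing POS" behavior can be sketched in Python as follows. This is an illustration under stated assumptions, not generand's implementation: the step bound `max_step`, the strictly increasing positions, and the restart of POS for each chromosome are all guesses.

```python
import random

def increasing_positions(count, rng, max_step=1000):
    """Return `count` strictly increasing 1-based positions.

    Each position is the previous one plus a random step in [1, max_step];
    max_step is a hypothetical parameter.
    """
    positions = []
    pos = 0
    for _ in range(count):
        pos += rng.randint(1, max_step)
        positions.append(pos)
    return positions

def alignment_coordinates(chromosomes, alignments_per_chromosome, seed=0):
    """Yield (chromosome index, POS) pairs in sorted order.

    Assumes POS starts over for every chromosome.
    """
    rng = random.Random(seed)
    for chrom in range(1, chromosomes + 1):
        for pos in increasing_positions(alignments_per_chromosome, rng):
            yield chrom, pos

if __name__ == "__main__":
    for chrom, pos in alignment_coordinates(2, 3):
        print(chrom, pos)
```

Accumulating random positive steps yields coordinate-sorted output without a separate sorting pass, and the total number of pairs is chromosomes * alignments-per-chromosome.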
generand vcf chromosomes calls-per-chromosome samples
generates a VCF stream with chromosomes * calls-per-chromosome calls.
It outputs chromosomes with increasing indexes, randomly increasing POS,
uniformly random REF and ALT, uniformly random QUAL scores, and random
sample columns including GT (genotype), AD (allelic depth) and DP (depth).
REF counts are always >= ALT counts in the AD data and DP = REF count + ALT
count.
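
The AD and DP constraints can be expressed as a short Python sketch. The `GT:AD:DP` field order, the `max_depth` bound, and the way GT is chosen are assumptions for illustration; only the two invariants (REF count >= ALT count, DP = REF count + ALT count) come from the description above.

```python
import random

def random_sample_field(rng, max_depth=100):
    """Return one sample column formatted as GT:AD:DP.

    max_depth and the genotype choices are hypothetical.
    """
    ref_count = rng.randint(0, max_depth)
    # Drawing ALT from [0, ref_count] guarantees REF count >= ALT count.
    alt_count = rng.randint(0, ref_count)
    depth = ref_count + alt_count
    # Genotype picked uniformly at random; generand's actual rule is not documented here.
    genotype = rng.choice(["0/0", "0/1", "1/1"])
    return f"{genotype}:{ref_count},{alt_count}:{depth}"

if __name__ == "__main__":
    rng = random.Random(0)
    print("\t".join(random_sample_field(rng) for _ in range(3)))
```

Choosing the ALT count conditionally on the REF count enforces the ordering by construction, so no rejection or swapping step is needed.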