RNA-seq & tools


[Repost] RNA-seq & tools (2012-07-02 16:28:56)
Reposted from nacoo2000; last edited by 亲自走路

http://woldlab.caltech.edu/rnaseq/
http://rna-
http://rna-
http://hannonlab.cshl.edu/fastx_toolkit/galaxy.html#fasta_collapser
http://www.ebi.ac.uk/Tools/rcloud/
http://rna-
http://rna-
http://bioinformatics.oxfordjournals.org/content/early/2011/01/13/bioinformatics.btr012.full.pdf

splice RNA-Seq
http://www.web-
Alternative splicing detection using RNA-seq

The GATK can be used with any BAM file for analysis. The CountLoci and CountReads walkers are really for QC. The Pileup and DepthOfCoverage walkers we actually use for analysis, and may be informative for RNA-seq data analysis. But we do not have any specific analysis tools for RNA-seq data, and this is not a development priority for us. Perhaps another member of the community is writing GATK-based tools for RNA-seq analysis, though, and will chime in.

http://en.wikipedia.org/wiki/RNA-Seq

RNA-Seq is used for transcript structure studies (gene boundary identification, alternative splicing, etc.), transcript variation studies (such as gene fusions and coding-region SNPs), functional studies of non-coding regions (non-coding RNA, microRNA precursors, etc.), gene expression level studies, and the discovery of novel transcripts.

In the broad sense, the transcriptome is the set of all transcription products in a cell under a given physiological condition; in the narrow sense, it is the set of all mRNAs. Proteins carry out most cellular functions, and the proteome is the most direct description of a cell's function and state, but because of the current limits of protein experimental techniques, the transcriptome has become the main means of studying gene expression. The transcriptome is the necessary link between the genetic information of the genome and the proteome that performs biological functions. Regulation at the transcriptional level is the most studied, and also the most important, mode of regulation in organisms.

Put another way, the transcriptome is the sum of all RNA that a particular cell can transcribe in a given functional state, including mRNA and non-coding RNA. Transcriptome research is the basis and starting point for studying gene function and structure. With next-generation high-throughput sequencing, nearly all transcripts and gene sequences of a given tissue or organ of a species in a given state can be obtained comprehensively and quickly, and the approach is widely used in basic research, clinical diagnosis, and drug development. Transcriptome sequencing yields the abundance of all mRNA transcripts under a given condition, and can also reveal novel transcripts and alternative splice variants.

Application areas:
1. Functional studies of non-coding regions: non-coding RNA, microRNA precursors, etc.
2. Transcript structure studies: UTR identification, intron boundary identification, alternative splicing, start codon identification, etc.
3. Gene transcription level studies
4. Studies of novel transcribed regions

Transcriptome sequencing on the Illumina high-throughput sequencing platform makes it possible to survey the overall transcriptional activity of any species at single-nucleotide resolution. While analyzing transcript structure and expression levels, it can also discover unknown and rare transcripts and precisely identify alternative splice sites and cSNPs (coding-sequence single nucleotide polymorphisms), providing the most comprehensive transcriptome information. Compared with traditional microarray hybridization platforms, transcriptome sequencing needs no probes designed in advance against known sequences, so it can survey the overall transcriptional activity of any species, and it offers a more precise digital signal, higher throughput, and a wider detection range. It is currently a powerful tool for studying transcriptome complexity in depth.

http://rna-
http://rna-=tools

A reasonably thorough table of next-gen-seq software available in the commercial and public domain

Integrated solutions

* CLCbio Genomics Workbench - de novo and reference assembly of Sanger, Roche FLX, Illumina, Helicos, and SOLiD data. Commercial next-gen-seq software that extends the CLCbio Main Workbench software. Includes SNP detection, ChIP-seq, browser and other features. Commercial. Windows, Mac OS X and Linux.
* Galaxy - Galaxy = interactive and reproducible genomics. A job web portal.
* Genomatix - Integrated Solutions for Next Generation Sequencing data analysis.
* JMP Genomics - Next gen visualization and statistics tool from SAS. They are working with NCGR to refine this tool and produce others.
* NextGENe - de novo and reference assembly of Illumina, SOLiD and Roche FLX data. Uses a novel Condensation Assembly Tool approach where reads are joined via "anchors" into mini-contigs before assembly. Includes SNP detection, ChIP-seq, browser and other features. Commercial. Win or MacOS.
* SeqMan Genome Analyser - Software for Next Generation sequence assembly of Illumina, Roche FLX and Sanger data integrating with Lasergene Sequence Analysis software for additional analysis and visualization capabilities. Can use a hybrid templated/de novo approach. Commercial. Win or Mac OS X.
* SHORE - SHORE, for Short Read, is a mapping and analysis pipeline for short DNA sequences produced on an Illumina Genome Analyzer. A suite created by the 1001 Genomes project. Source for POSIX.
* SlimSearch - Fledgling commercial product.

Align/Assemble to a reference

* BFAST - Blat-like Fast Accurate Search Tool. Written by Nils Homer, Stanley F. Nelson and Barry Merriman at UCLA.
* Bowtie - Ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of 25 million reads per hour on a typical workstation with 2 gigabytes of memory. Uses a Burrows-Wheeler-Transformed (BWT) index. Written by Ben Langmead and Cole Trapnell. Linux, Windows, and Mac OS X.
* BWA - Heng Li's BWT Alignment program, a progression from Maq. BWA is a fast, lightweight tool that aligns short sequences to a sequence database, such as the human reference genome. By default, BWA finds an alignment within edit distance 2 to the query sequence. C++ source.
* ELAND - Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome. Written by Illumina author Anthony J. Cox for the Solexa 1G machine.
* Exonerate - Various forms of pairwise alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. Authors are Guy St C Slater and Ewan Birney from EMBL. C for POSIX.
* GenomeMapper - GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. A tool created by the 1001 Genomes project. Source for POSIX.
* GMAP - GMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences. Developed by Thomas Wu and Colin Watanabe at Genentech. C/Perl for Unix.
* gnumap - The Genomic Next-generation Universal MAPper (gnumap) is a program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. It seeks to align reads from nonunique repeats using statistics. From authors at Brigham Young University. C source/Unix.
* MAQ - Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina with preliminary functions to handle ABI SOLiD data. Written by Heng Li from the Sanger Centre. Features extensive supporting tools for DIP/SNP detection, etc. C++ source.
* MOSAIK - MOSAIK produces gapped alignments using the Smith-Waterman algorithm. Features a number of support tools. Support for Roche FLX, Illumina, SOLiD, and Helicos. Written by Michael Strömberg at Boston College. Win/Linux/MacOSX.
* MrFAST and MrsFAST - mrFAST & mrsFAST are designed to map short reads generated with the Illumina platform to reference genome assemblies in a fast and memory-efficient manner. Robust to INDELs, and MrsFAST has a bisulphite mode. Authors are from the University of Washington. C as source.
* MUMmer - MUMmer is a modular system for the rapid whole genome alignment of finished or draft sequence. Released as a package providing an efficient suffix tree library, seed-and-extend alignment, SNP detection, repeat detection, and visualization tools. Version 3.0 was developed by Stefan Kurtz, Adam Phillippy, Arthur L Delcher, Michael Smoot, Martin Shumway, Corina Antonescu and Steven L Salzberg, most of whom are at The Institute for Genomic Research in Maryland, USA. POSIX OS required.
* Novocraft - Tools for reference alignment of paired-end and single-end Illumina reads. Uses a Needleman-Wunsch algorithm. Can support Bis-Seq. Commercial. Available free for evaluation, educational use and for use on open not-for-profit projects. Requires Linux or Mac OS X.
* PASS - It supports Illumina, SOLiD and Roche-FLX data formats and allows the user to modulate very finely the sensitivity of the alignments. Spaced seed initial filter, then NW dynamic algorithm to a SW(like) local alignment. Authors are from CRIBI in Italy. Win/Linux.
* RMAP - Assembles 20-64 bp Illumina reads to a FASTA reference genome. By Andrew D. Smith and Zhenyu Xuan at CSHL (published in BMC Bioinformatics). POSIX OS required.
* SeqMap - Supports up to 5 or more bp mismatches/INDELs. Highly tunable. Written by Hui Jiang from the Wong lab at Stanford. Builds available for most OSs.
* SHRiMP - Assembles to a reference sequence. Developed with Applied Biosystems' colourspace genomic representation in mind. Authors are Michael Brudno and Stephen Rumble at the University of Toronto. POSIX.
* Slider - An application for the Illumina Sequence Analyzer output that uses the probability files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences. Authors are from BCGSC.
* SOAP - SOAP (Short Oligonucleotide Alignment Program). A program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The updated version uses a BWT. Can call SNPs and INDELs. Author is Ruiqiang Li at the Beijing Genomics Institute. C++, POSIX.
* SSAHA - SSAHA (Sequence Search and Alignment by Hashing Algorithm) is a tool for rapidly finding near exact matches in DNA or protein databases using a hash table. Developed at the Sanger Centre by Zemin Ning, Anthony Cox and James Mullikin. C++ for Linux/Alpha.
* SOCS - Aligns SOLiD data. SOCS is built on an iterative variation of the Rabin-Karp string search algorithm, which uses hashing to reduce the set of possible matches, drastically increasing search speed. Authors are Ondov B, Varadarajan A, Passalacqua KD and Bergman NH.
* SWIFT - The SWIFT suite is a software collection for fast index-based sequence comparison. It contains: SWIFT, a fast local alignment search, guaranteeing to find epsilon-matches between two sequences; and SWIFT BALSAM, a very fast program to find semiglobal non-gapped alignments based on k-mer seeds. Authors are Kim Rasmussen (SWIFT) and Wolfgang Gerlach (SWIFT BALSAM).
* SXOligoSearch - SXOligoSearch is a commercial platform offered by the Malaysian-based Synamatix. Will align Illumina reads against a range of Refseq RNA or NCBI genome builds for a number of organisms. Web Portal. OS independent.
* Vmatch - A versatile software tool for efficiently solving large scale sequence matching tasks. Vmatch subsumes the software tool REPuter, but is much more general, with a very flexible user interface, and improved space and time requirements. Essentially a large string matching toolbox. POSIX.
* Zoom - ZOOM (Zillions Of Oligos Mapped) is designed to map millions of short reads, produced by next-generation sequencing technology, back to the reference genomes, and carry out post-analysis. ZOOM is developed to be highly accurate, flexible, and user-friendly with speed being a critical priority. Commercial. Supports Illumina and SOLiD data.

De novo Align/Assemble

* ABySS - Assembly By Short Sequences. ABySS is a de novo sequence assembler that is designed for very short reads. The single-processor version is useful for assembling genomes up to 40-50 Mbases in size. The parallel version is implemented using MPI and is capable of assembling larger genomes. By Simpson JT and others at Canada's Michael Smith Genome Sciences Centre. C++ as source.
* ALLPATHS - ALLPATHS: De novo assembly of whole-genome shotgun microreads. ALLPATHS is a whole genome shotgun assembler that can generate high quality assemblies from short reads. Assemblies are presented in a graph form that retains ambiguities, such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. Broad Institute.
* Edena - Edena (Exact DE Novo Assembler) is an assembler dedicated to processing the millions of very short reads produced by the Illumina Genome Analyzer. Edena is based on the traditional overlap layout paradigm. By D. Hernandez, P. François, L. Farinelli, M. Osteras, and J. Schrenzel. Linux/Win.
* EULER-SR - Short read de novo assembly. By Mark J. Chaisson and Pavel A. Pevzner from UCSD (published in Genome Research). Uses a de Bruijn graph approach.
* MIRA2 - MIRA (Mimicking Intelligent Read Assembly) is able to perform true hybrid de-novo assemblies using reads gathered through 454 sequencing technology (GS20 or GS FLX). Compatible with 454, Solexa and Sanger data. Linux OS required.
* SEQAN - A Consistency-based Consensus Algorithm for De Novo and Reference-guided Sequence Assembly of Short Reads. By Tobias Rausch and others. C++, Linux/Win.
* SHARCGS - De novo assembly of short reads. Authors are Dohm JC, Lottaz C, Borodina T and Himmelbauer H. from the Max-Planck-Institute for Molecular Genetics.
* SSAKE - The Short Sequence Assembly by K-mer search and 3' read Extension (SSAKE) is a genomics application for aggressively assembling millions of short nucleotide sequences by progressively searching for perfect 3'-most k-mers using a DNA prefix tree. Authors are René Warren, Granger Sutton, Steven Jones and Robert Holt from Canada's Michael Smith Genome Sciences Centre. Perl/Linux.
* SOAPdenovo - Part of the SOAP suite. See above.
* VCAKE - De novo assembly of short reads with robust error correction. An improvement on early versions of SSAKE.
* Velvet - Velvet is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454. Needs about 20-25X coverage and paired reads. Developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI).

SNP/Indel Discovery

* ssahaSNP - ssahaSNP is a polymorphism detection tool. It detects homozygous SNPs and indels by aligning shotgun reads to the finished genome sequence. Highly repetitive elements are filtered out by ignoring those kmer words with high occurrence numbers. More tuned for ABI Sanger reads. Developers are Adam Spargo and Zemin Ning from the Sanger Centre. Compaq Alpha, Linux-64, Linux-32, Solaris and Mac.
* PolyBayesShort - A re-incarnation of the PolyBayes SNP discovery tool developed by Gabor Marth at Washington University. This version is specifically optimized for the analysis of large numbers (millions) of high-throughput next-generation sequencer reads, aligned to whole chromosomes of model organism or mammalian genomes. Developers at Boston College. Linux-64 and Linux-32.
* PyroBayes - PyroBayes is a novel base caller for pyrosequences from the 454 Life Sciences sequencing machines. It was designed to assign more accurate base quality estimates to the 454 pyrosequences. Developers at Boston College.

Genome Annotation/Genome Browser/Alignment Viewer/Assembly Database

* EagleView - An information-rich genome assembler viewer. EagleView can display a dozen different types of information including base quality and flowgram signal. Developers at Boston College.
* LookSeq - LookSeq is a web-based application for alignment visualization, browsing and analysis of genome sequence data. LookSeq supports multiple sequencing technologies, alignment sources, and viewing modes; low or high-depth read pileups; and easy visualization of putative single nucleotide and structural variation. From the Sanger Centre.
* MapView - MapView: visualization of short reads alignment on desktop computer. From the Evolutionary Genomics Lab at Sun Yat-sen University, China. Linux.
* SAM - Sequence Assembly Manager. Whole Genome Assembly (WGA) Management and Visualization Tool. It provides a generic platform for manipulating, analyzing and viewing WGA data, regardless of input type. Developers are Rene Warren, Yaron Butterfield, Asim Siddiqui and Steven Jones at Canada's Michael Smith Genome Sciences Centre. MySQL backend and Perl-CGI web-based frontend/Linux.
* STADEN - Includes GAP4. GAP5, once completed, will handle next-gen sequencing data. A partially implemented test version is available.
* XMatchView - A visual tool for analyzing cross_match alignments. Developed by Rene Warren and Steven Jones at Canada's Michael Smith Genome Sciences Centre. Python/Win or Linux.

Counting e.g. ChIP-Seq, Bis-Seq, CNV-Seq

* BS-Seq - The source code and data for the Nature paper "Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA Methylation Patterning" by Cokus et al. (Steve Jacobsen's lab at UCLA). POSIX.
* CHiPSeq - Program used by Johnson et al. (2007) in their Science publication.
* CNV-Seq - CNV-seq, a new method to detect copy number variation using high-throughput sequencing. Chao Xie and Martti T Tammi at the National University of Singapore. Perl/R.
* FindPeaks - Performs analysis of ChIP-Seq experiments. It uses a naive algorithm for identifying regions of high coverage, which represent Chromatin Immunoprecipitation enrichment of sequence fragments, indicating the location of a bound protein of interest. Original algorithm by Matthew Bainbridge, in collaboration with Gordon Robertson. Current code and implementation by Anthony Fejes. Authors are from Canada's Michael Smith Genome Sciences Centre. JAVA/OS independent. Latest versions available as part of the Vancouver Short Read Analysis Package.
* MACS - Model-based Analysis for ChIP-Seq. MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction. Written by Yong Zhang and Tao Liu from Xiaole Shirley Liu's Lab.
* PeakSeq - PeakSeq: Systematic Scoring of ChIP-Seq Experiments Relative to Controls. A two-pass approach for scoring ChIP-Seq data relative to controls. The first pass identifies putative binding sites and compensates for variation in the mappability of sequences across the genome. The second pass filters out sites that are not significantly enriched compared to the normalized input DNA and computes a precise enrichment and significance. By Rozowsky J et al. C/Perl.
* QuEST - Quantitative Enrichment of Sequence Tags. Sidow and Myers Labs at Stanford. From the 2008 publication "Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data". (C++)
* SISSRs - Site Identification from Short Sequence Reads. BED file input. Raja Jothi, NIH. Perl.

Alternate Base Calling

* Rolexa - R-based framework for base calling of Solexa data.
* Alta-cyclic - A novel Illumina Genome-Analyzer (Solexa) base caller.

Transcriptomics

* ERANGE - Mapping and Quantifying Mammalian Transcriptomes by RNA-Seq. Supports Bowtie, BLAT and ELAND. From the Wold lab.
* G-Mo.R-Se - G-Mo.R-Se is a method aimed at using RNA-Seq short reads to build de novo gene models. First, candidate exons are built directly from the positions of the reads mapped on the genome (without any ab initio assembly of the reads), and all the possible splice junctions between those exons are tested against unmapped reads. From CNS in France.
* MapNext - MapNext: A software tool for spliced and unspliced alignments and SNP detection of short sequence reads. From the Evolutionary Genomics Lab at Sun Yat-sen University, China.
* QPalma - Optimal Spliced Alignments of Short Sequence Reads. Authors are Fabio De Bona, Stephan Ossowski, Korbinian Schneeberger, and Gunnar Rätsch. A paper is available.
* RSAT - RSAT: RNA-Seq Analysis Tools. RNASAT is developed and maintained by Hui Jiang at Stanford University.
* TopHat - TopHat is a fast splice junction mapper for RNA-Seq reads. It aligns RNA-Seq reads to mammalian-sized genomes using the ultra high-throughput short read aligner Bowtie, and then analyzes the mapping results to identify splice junctions between exons. TopHat is a collaborative effort between the University of Maryland and the University of California, Ber
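
Several aligners in the list (Bowtie, BWA, and the updated SOAP) are described as using a Burrows-Wheeler transform (BWT) index. As a rough illustration of why such an index supports fast exact matching, the following is a minimal Python sketch of the transform and of backward search over it. It is a toy under stated assumptions, not the implementation used by any of these tools: the function names `bwt` and `count_matches` are hypothetical, the transform is built by naively sorting all rotations, and occurrence counts are computed by linear scans where real indexes use precomputed tables.

```python
def bwt(text):
    """Return the Burrows-Wheeler transform of text, using '$' as the end marker."""
    assert "$" not in text, "'$' is reserved as the end-of-text marker"
    s = text + "$"
    # Naive construction: sort every rotation and keep the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_matches(bwt_str, pattern):
    """Count exact occurrences of pattern in the original text via backward search."""
    # first[c] = number of characters in the text that sort strictly before c.
    first = {}
    total = 0
    for c in sorted(set(bwt_str)):
        first[c] = total
        total += bwt_str.count(c)

    def occ(c, i):
        # Occurrences of c in bwt_str[:i]; a linear scan keeps the sketch short.
        return bwt_str[:i].count(c)

    lo, hi = 0, len(bwt_str)  # current half-open range of matching sorted rotations
    for c in reversed(pattern):
        if c not in first:
            return 0
        lo = first[c] + occ(c, lo)
        hi = first[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    index = bwt("ACGTACGT")
    print(index, count_matches(index, "ACG"))
```

The pattern is consumed from its last character to its first, and each step narrows a range of sorted rotations, so the work per query depends on the pattern length rather than on a scan of the reference.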
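
MOSAIK is described in the list as producing gapped alignments with the Smith-Waterman algorithm, and Exonerate and PASS mention Smith-Waterman variants as well. The sketch below shows the basic local-alignment recurrence in Python, returning only the best score. The scoring values (match 2, mismatch -1, linear gap -2) and the function name `smith_waterman_score` are assumptions chosen for illustration; the listed tools use their own scoring schemes, affine gaps, and heavy optimization that this toy does not attempt.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming matrix, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                           # start a new local alignment here
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Clamping each cell at zero is what makes the alignment local: a poorly matching prefix cannot drag down the score of a good region later in the read.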
