|
Does anyone know an easy way to get a BED/GFF (or any text) file with annotation information for a transcriptome? The exact issue is the following: I have aligned mRNA-seq data to the Ensembl cDNA v56 FASTA file and created a custom genome in IGV so I can look at the alterations in my pileup file. What would help is being able to load a BED/GFF file that identifies at least the CDS of each transcript and, better yet, the exons and amino acids, if I can have my cake and eat it too. I've found lots of downloads that annotate the transcriptome in genome coordinates, but none that annotate the entire transcriptome in transcript coordinates. Clarification: I already have data that maps the transcript segments to the various genome versions. What I'm hoping to find is annotation akin to a GenBank file for a specific transcript (i.e. 1-2300=Gene, 46-2134=CDS, 1-500=Exon1, 501-1300=Exon2, 1301-2300=Exon3, etc.), but in a table format covering the entire transcriptome rather than one transcript at a time. This does not seem possible on UCSC, for instance, as all tracks are mapped to the genome. |
|
The UCSC Table Browser is one place to obtain gene and mRNA tracks in BED format: http://genome.ucsc.edu/cgi-bin/hgTables?org=Human&db=hg19&hgsid=155229247&hgta_doMainPage=1 The coordinates there are genomic, so you will need to parse the files to get transcript-relative positions, but they seem to have the information you are looking for. |
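As a rough idea of what that parsing could look like, here is a minimal sketch that turns one BED12 line (genomic coordinates) into transcript-relative rows. It assumes a 12-column BED export in which thickStart/thickEnd mark the CDS, and that the name column holds the same transcript IDs as the cDNA FASTA headers used for the IGV genome; neither is guaranteed for every track, and the annotation release behind the track may not match Ensembl v56. The function name and the example record (`ENST_TEST`) are made up for illustration.

```python
def bed12_to_transcript_rows(line):
    """Convert one BED12 line to transcript-relative (name, start, end, feature) rows.

    Output coordinates are 0-based, half-open, like BED itself.
    Assumes thickStart/thickEnd (columns 7-8) delimit the CDS.
    """
    f = line.rstrip("\n").split("\t")
    chrom_start = int(f[1])
    name, strand = f[3], f[5]
    thick_start, thick_end = int(f[6]), int(f[7])
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    starts = [int(x) for x in f[11].rstrip(",").split(",")]

    # Genomic exon intervals, then ordered 5' -> 3' along the transcript.
    exons = [(chrom_start + s, chrom_start + s + n) for s, n in zip(starts, sizes)]
    if strand == "-":
        exons.reverse()

    rows = [(name, 0, sum(sizes), "transcript")]
    offset = 0          # transcript position of the current exon's first base
    cds_start = None    # transcript position where the CDS begins
    cds_len = 0
    for i, (g_start, g_end) in enumerate(exons, 1):
        length = g_end - g_start
        rows.append((name, offset, offset + length, f"exon{i}"))

        # Part of this exon that lies inside the genomic CDS interval.
        ov_start, ov_end = max(g_start, thick_start), min(g_end, thick_end)
        if ov_start < ov_end:
            if cds_start is None:
                # Non-coding bases between the exon's 5' end and the CDS.
                lead = (g_end - ov_end) if strand == "-" else (ov_start - g_start)
                cds_start = offset + lead
            cds_len += ov_end - ov_start
        offset += length

    # Non-coding transcripts (thickStart == thickEnd) get no CDS row.
    if cds_start is not None:
        rows.append((name, cds_start, cds_start + cds_len, "CDS"))
    return rows


# Made-up record mirroring the example in the question:
# three exons of 500, 800 and 1000 bases separated by 100-base introns.
example = "chr1\t1000\t3500\tENST_TEST\t0\t+\t1045\t3334\t0\t3\t500,800,1000,\t0,600,1500,"
for row in bed12_to_transcript_rows(example):
    print("\t".join(str(x) for x in row))
```

Because the first column of each row is the transcript ID, the rows can be written out as a plain BED file against a transcript-based genome. Note that the coordinates are 0-based and half-open, so the CDS row 45-2134 for the example corresponds to 46-2134 in the 1-based notation used in the question.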
Hi Jon, nice to see you over here! :)