Bgee: Gene Expression Evolution

Section name

Subsection 1 - H2 section

H3 title

The raw data are downloaded in .sra format from the Sequence Read Archive (SRA) database. The reads are extracted in FASTQ format and mapped to regions of the reference genome specified in a .gtf file: i) transcribed regions; ii) selected intergenic regions (see below); iii) exon junction regions.

H3 title

The reads are mapped using TopHat2, which internally uses the Bowtie2 aligner. The maximum number of mappings allowed for a read is set to 1. The intergenic regions are chosen so that the distribution of their lengths matches the distribution of transcript lengths in the transcriptome, and the boundaries of each intergenic region lie at least 5 kb from the nearest gene. Reads that map to each feature are counted using the htseq-count software. The RPK (reads per kilobase) value for a feature is obtained by dividing the number of reads that map to the feature by the feature's length in kilobases.
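
The RPK calculation described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's actual code: the function names, the feature identifiers, and the example counts and lengths are all hypothetical. It assumes the usual htseq-count output layout of one tab-separated feature identifier and read count per line, with summary rows whose identifiers start with two underscores.

```python
from io import StringIO


def parse_htseq_counts(handle):
    """Parse htseq-count style output: one 'feature_id<TAB>count' pair per line."""
    counts = {}
    for line in handle:
        line = line.strip()
        if not line:
            continue
        feature_id, count = line.split("\t")
        # Summary rows such as __no_feature are not features; skip them.
        if feature_id.startswith("__"):
            continue
        counts[feature_id] = int(count)
    return counts


def compute_rpk(counts, lengths_bp):
    """Return reads per kilobase: read count divided by feature length in kb."""
    rpk = {}
    for feature_id, n_reads in counts.items():
        length_kb = lengths_bp[feature_id] / 1000.0
        rpk[feature_id] = n_reads / length_kb
    return rpk


# Hypothetical example data (not taken from any real library).
example_counts = StringIO(
    "geneA\t200\n"
    "geneB\t50\n"
    "intergenic_1\t3\n"
    "__no_feature\t120\n"
)
example_lengths_bp = {"geneA": 2000, "geneB": 500, "intergenic_1": 1500}

counts = parse_htseq_counts(example_counts)
rpk = compute_rpk(counts, example_lengths_bp)
for feature_id, value in rpk.items():
    print(feature_id, value)
```

In this made-up example, a 2 kb gene with 200 reads and a 500 bp gene with 50 reads both have an RPK of 100, which shows how dividing by length makes features of different sizes comparable.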

Subsection 2 - H2 section

Subsection 3 - H2 section
