How do AltAnalyze probeset-to-gene and probeset-to-exon mappings differ from those in the Affymetrix annotation files?
Answer: We have found some discrepancies in how the Affymetrix annotation files map probesets and transcript clusters to Ensembl genes. Mainly, these issues relate to a probeset or transcript cluster missing the proper mapping. For this reason, AltAnalyze aligns probesets to Ensembl genes based on their relative genomic position. If a probeset maps to only one Ensembl gene, or just outside that gene (e.g., in the 5' or 3' UTR of the gene), in a one-to-one manner, then the probeset is associated with that gene. When assigning a UTR, exon or intron annotation to a probeset, AltAnalyze checks whether the probeset coordinates overlap first with an exon, next with an intron and last with a UTR region. Currently, a probeset only has to overlap with an exon to receive an exon annotation; in some cases, however, the probeset will only partially overlap with the exon and can bleed into the adjacent intron. Although this can lead to improper exon alignments, it does not affect the AltAnalyze analysis, since the probeset-to-gene assignment will still be valid. It also does not affect the probeset to microRNA binding site and protein associations, since these are made by direct sequence alignment. One visible consequence remains: because some probesets do not align directly to an Ensembl mRNA but are annotated as overlapping with an Ensembl exon, these probesets can be called by AltAnalyze as regulated but may not appear in DomainGraph (since the alignment strategies vary slightly).
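The exon, intron, UTR check order described above can be illustrated with a minimal Python sketch. This is not the AltAnalyze source code: the function names (`overlaps`, `annotate_probeset`), the inclusive-coordinate convention and the example gene model are all assumptions made for illustration. It only shows why a probeset that partially overlaps an exon and bleeds into the intron still receives an exon annotation.

```python
from typing import List, Optional, Tuple

# Assumption: regions are inclusive (start, end) genomic coordinates
# on the same chromosome and strand.
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def annotate_probeset(
    probeset: Interval,
    exons: List[Interval],
    introns: List[Interval],
    utrs: List[Interval],
) -> Optional[str]:
    """Hypothetical helper: return the first region type the probeset overlaps.

    The check order follows the text: exon first, then intron, then UTR.
    Any overlap is enough, so a partial exon overlap is labeled "exon".
    """
    for label, regions in (("exon", exons), ("intron", introns), ("utr", utrs)):
        if any(overlaps(probeset, region) for region in regions):
            return label
    return None  # no overlap with any annotated region of this gene


# Made-up gene model, for illustration only.
exons = [(100, 200), (300, 400)]
introns = [(201, 299)]
utrs = [(50, 99), (401, 450)]

partial = annotate_probeset((190, 215), exons, introns, utrs)   # spans exon/intron boundary
intronic = annotate_probeset((220, 250), exons, introns, utrs)  # fully inside the intron
utr_only = annotate_probeset((60, 80), exons, introns, utrs)    # inside the 5' UTR region
outside = annotate_probeset((500, 520), exons, introns, utrs)   # beyond the gene model
```

In this sketch the probeset at 190-215 is labeled as exonic even though most of it lies in the intron, which mirrors the "improper exon alignment" case mentioned in the answer; the gene-level assignment would be unchanged either way.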