• I went right on to the practical applications and the results, without spending enough time motivating.
    • First, we want a reliable set of benchmarks that we can run a genome annotation through to get a quick first look at a genome. Is it normal? Are the genes much longer/shorter than related genomes? What about the lengths of intergenic regions?
    • Second, we want to really understand how genes are organized in the eukaryotic genome.
      • What do we really know about how genes are situated?
      • There are well-known cases of densely packed genes, such as virulence resistance genes and HOX clusters, the latter resulting from gene duplications that have held position over evolutionary time.
      • But how widespread is this throughout the genome? To what extent are genes densely packed in the genome? What implications does this have for transcription and for molecular evolution?
    • Third, genome assemblies and annotations vary widely in quality. iLoci provide a granular representation of the genome, which we can use to select those parts we're confident in. Want to train a hidden Markov model for gene finding? Want to analyze codon usage? Want to calculate dN/dS ratios? Don't do it on the whole genome, select the iLoci you're confident in!
      • Independent of how good the annotation is generally, these XXXX gene models are definitely reliable.
  • The iLocus shuffling procedure might need to be refined.
    • Currently, the delta extensions are retained when re-positioning the giLoci.
    • This will definitely bias the results away from ziLoci and miLoci.
    • Better to randomly distribute giLoci with delta=0.

Comments