Efficient Simulation of Allele-Specific Expression
DOI:
https://doi.org/10.55632/pwvas.v92i1.720
Keywords:
Epigenetics, Transcriptomics, Simulation
Abstract
Diploid organisms such as animals and plants carry maternal and paternal variants of most of their genes. Preferential transcription of one variant over the other is called allele-specific expression (ASE). In plant seeds, ASE has been observed at select genes and at select developmental stages, so the process is presumably regulated by epigenetic mechanisms such as genomic imprinting. The Informative Reads Pipeline (IRP) is software that we developed previously to detect ASE in RNA sequencing data obtained from plant seeds. To help us validate and generalize the software, we developed a sequence data simulator that incorporates a parameterized model of ASE. Whereas the maternal/paternal ratio per gene is always unknown in real data, the simulator makes it possible to quantify IRP’s ability to recover preset ratios from the data provided. The simulator generates and maps sequences using standard software. Simulating ASE at all combinations of all genes would be computationally prohibitive. Therefore, we introduced an optimization that reduces the generate+map computation from exponential to constant time. Correctness of the optimized simulator is demonstrated here.
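The idea of recovering a preset ratio can be illustrated with a minimal sketch. It is not the simulator or IRP described above: it generates and maps no sequences, and it simply assumes that each informative read for a gene is maternal with a preset probability. The gene names, fractions, read depth, and function names are all hypothetical.

```python
import random


def simulate_counts(maternal_fraction, n_reads, rng):
    """Draw n_reads informative reads; each is maternal with the preset probability."""
    maternal = sum(1 for _ in range(n_reads) if rng.random() < maternal_fraction)
    return maternal, n_reads - maternal


def recovered_fraction(maternal, paternal):
    """Estimate the maternal fraction from observed read counts."""
    total = maternal + paternal
    return maternal / total if total else float("nan")


def run_simulation(preset, n_reads=2000, seed=0):
    """Simulate counts for every gene and return the recovered maternal fractions."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = {}
    for gene, fraction in preset.items():
        maternal, paternal = simulate_counts(fraction, n_reads, rng)
        results[gene] = recovered_fraction(maternal, paternal)
    return results


# Hypothetical genes with preset maternal fractions (assumed values, not from the paper).
preset = {"geneA": 0.5, "geneB": 0.67, "geneC": 0.9}
estimates = run_simulation(preset)
for gene, fraction in preset.items():
    print(gene, "preset:", fraction, "recovered:", round(estimates[gene], 3))
```

Because the preset fractions are known, the gap between preset and recovered values gives a direct measure of estimation error, which is the kind of comparison that real data cannot provide.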
License
Proceedings of the West Virginia Academy of Science applies the Creative Commons Attribution-NonCommercial (CC BY-NC) license to works we publish. By virtue of their appearance in this open access journal, articles are free to use, with proper attribution, in educational and other non-commercial settings.