How the study of model organisms helps to diagnose and treat rare diseases.

Sequencing is becoming more affordable and more common in genetic screening and scientific research, but it uncovers genes of unknown function and many variants whose functional effect is unknown. Research on rare human diseases using model organisms is important because patients with diseases caused by genetic defects need to know what is causing their disease and how it can be treated. A review article in Genetics by Wangler et al. (2017) gives examples of how model organisms are being used to discover the causes of diseases and the functions of genes. The authors also show how scientists are putting human genes and gene variants into model organisms, particularly the fruit fly Drosophila and the zebrafish, and they describe new efforts and organizations that bring scientists and clinicians together to diagnose patients with rare undiagnosed diseases.

 

Genetic research using model organisms is important because many diseases have a genetic origin. Studying genes with conserved sequence or function in model organisms reveals similarities between proteins and disease pathways that are not obvious. Even when the phenotypes differ between species, the underlying molecular pathways or mechanisms are often the same (Wangler 2017). For example, the study of Drosophila wing development identified genetic machinery that, when defective, causes skeletal and craniofacial defects in humans. Pathogenic variants of genes in the Wnt pathway, catalogued in the Online Mendelian Inheritance in Man database (www.omim.org), cause disease in humans, specifically Robinow syndrome. Many evolutionarily conserved genes are involved in important biological processes. Genes that are essential in Drosophila and have more than one homologue in humans are eight times more likely to be associated with Mendelian disease (Wangler 2017, Yamamoto 2014). This suggests that research in model organisms has predictive power: it can identify genes likely to be associated with disease and prioritize them for further research.

 

Figure 2 of the review shows the workflow of the collaboration between the Undiagnosed Diseases Network (UDN) and the Model Organism Screening Center (MOSC). First, patients apply through the UDN website (UDN Gateway), which is hosted by the UDN coordinating center. The UDN has 7 clinical sites in the US. Once accepted, patients have sequencing done at one of the UDN’s two sequencing cores and sometimes metabolomics at its metabolomics core. If the UDN cannot reach a diagnosis by combining genotype and phenotype, the patient’s gene and variant information and a description of the condition are sent to the MOSC. The MOSC first runs a MARRVEL (Model organism Aggregated Resources for Rare Variant ExpLoration) search of human and model organism online databases at www.marrvel.org and looks for other patients who have similar gene variants. Once a candidate gene variant has been prioritized, it is studied in the Drosophila and zebrafish cores. Interpreting genetic variants is complicated, so variants suspected of causing disease need to be functionally evaluated in vivo in a model organism before clinicians can correctly identify the problem and understand why it is happening.

 

Whole exome sequencing (WES) is used as a diagnostic tool to find mutations in the exons, the parts of genes that code for proteins. There are about 180,000 exons, which total about 30 million base pairs, or 1% of the genome (Ng et al. 2009). The DNA in these sections is transcribed into mRNA, and the mRNA is later translated into protein. The review describes a young child, a UDN patient, whose mutation was identified by whole exome sequencing and whose data were then shared with other genomic scientists through the online GeneMatcher tool (genematcher.org). The authors use this case to emphasize the importance of the prescreening step.
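The "1% of the genome" figure can be checked with a quick calculation. The short Python sketch below uses the exon count and exome size quoted above; the total genome size of about 3.1 billion base pairs is my own assumption (it is not stated in this post), and the function names are made up for illustration.

```python
# Figures quoted in the post (Ng et al. 2009).
EXON_COUNT = 180_000
EXOME_BP = 30_000_000

# Assumption, not from the post: a haploid human genome of roughly 3.1 billion base pairs.
GENOME_BP = 3_100_000_000


def exome_fraction(exome_bp, genome_bp):
    """Fraction of the genome covered by exons."""
    return exome_bp / genome_bp


def mean_exon_length(exome_bp, exon_count):
    """Average exon length in base pairs."""
    return exome_bp / exon_count


fraction = exome_fraction(EXOME_BP, GENOME_BP)
print(f"Exome share of genome: {fraction:.2%}")
print(f"Mean exon length: {mean_exon_length(EXOME_BP, EXON_COUNT):.0f} bp")
```

Under that assumed genome size, the exome comes out just under 1% of the genome, and the average exon is well under 200 base pairs long.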

 

But what about the noncoding DNA, found in introns within genes and in the regions between genes, that makes up the other 99% of the human genome? Whole genome sequencing (WGS) with next-generation methods covers these sequences too, but it produces much more information to analyze. Genome-wide studies have even found that RNAs are produced from enhancer regions, and the act of transcription there may itself be important. Mutations in noncoding regulatory regions such as enhancers, silencers, and promoters can also cause disease, depending on the mutation. Single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) are often in noncoding regions. In one systematic genomic study (Liu et al. 2017, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5651703/), the authors identified regulatory variants and linked them to human cancers.

As stated in the review, teamwork is paramount for helping the patient. Scientists who study model organisms help explain how a disease is caused, and clinicians can use that information to help the patient directly. Patients, for their part, describe their own symptoms and seek out a diagnosis. It is through the teamwork of all these groups (scientists, clinicians, and patients) that rare diseases can be diagnosed and treated.

 

References

Wangler, M. F., Yamamoto, S., Chao, H.-T., Posey, J. E., Westerfield, M., Postlethwait, J., … Bellen, H. J. (2017). Model Organisms Facilitate Rare Disease Diagnosis and Therapeutic Research. Genetics, 207(1), 9–27. http://doi.org/10.1534/genetics.117.203067

Yamamoto, S., Jaiswal, M., Charng, W.-L., Gambin, T., Karaca, E., Mirzaa, G., … Bellen, H. J. (2014). A Drosophila genetic resource of mutants to study mechanisms underlying human genetic diseases. Cell, 159(1), 200–214. http://doi.org/10.1016/j.cell.2014.09.002

Ng, S. B., Turner, E. H., Robertson, P. D., Flygare, S. D., Bigham, A. W., Lee, C., … Shendure, J. (2009). Targeted capture and massively parallel sequencing of 12 human exomes. Nature, 461(7261), 272–276. http://dx.doi.org/10.1038/nature08250

Liu, S., Liu, Y., Zhang, Q., Wu, J., Liang, J., Yu, S., … Wang, X. (2017). Systematic identification of regulatory variants associated with cancer risk. Genome Biology, 18(1), 194. http://doi.org/10.1186/s13059-017-1322-z

The control of transcription in development and human disease

Transcription, the process by which RNA polymerase copies DNA into RNA, is an important control point in development because it is the first step of gene expression. “Thousands of proteins (transcription factors, cofactors, and chromatin regulators) control gene expression programs that start and maintain specific cell states” (Lee and Young 2013).

In the early embryonic development of the fruit fly, Drosophila melanogaster, a hierarchy of genes controls segmentation and determines the body plan. The gene cascade starts with the maternal genes, whose mRNA is deposited in the egg by the mother and expressed early in the development of the embryo. The protein Bicoid forms a gradient that is most concentrated at the anterior of the embryo, decreases toward the posterior, and levels off. Bicoid activates transcription of the zygotic gap gene hunchback in the anterior of the embryo. Other gap genes are expressed along the gradient in broad, overlapping domains; they regulate the pair-rule genes and set their boundaries. Both the gap genes and the pair-rule genes code for DNA-binding transcription factors. The pair-rule genes in turn regulate the segment polarity genes. Together, these genes establish the 14 segments of the fruit fly’s body. Segmentation is a process that also occurs in vertebrates.
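To make the idea of a gradient switching on a target gene concrete, here is a toy Python sketch. It is not a measured model: the exponential shape, the length constant, the baseline, and the activation threshold are all values I made up for illustration, and the function names are hypothetical. It only shows how a concentration that falls from anterior to posterior and levels off could turn hunchback on in the anterior and leave it off in the posterior.

```python
import math


def bicoid_level(x, length_constant=0.2, baseline=0.02):
    """Relative Bicoid concentration at position x (0 = anterior tip, 1 = posterior tip).

    Assumed shape: exponential decay that levels off at a small baseline.
    All parameter values are illustrative, not measurements.
    """
    return baseline + (1.0 - baseline) * math.exp(-x / length_constant)


def hunchback_on(x, threshold=0.1):
    """hunchback is treated as 'on' wherever Bicoid exceeds an assumed threshold."""
    return bicoid_level(x) > threshold


# Walk along the anterior-posterior axis in ten steps.
for i in range(11):
    x = i / 10
    state = "on" if hunchback_on(x) else "off"
    print(f"x={x:.1f}  Bicoid={bicoid_level(x):.3f}  hunchback={state}")
```

Changing the threshold or the length constant moves the position where hunchback switches off, which is the basic way a single gradient can set a boundary of gene expression.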

A segmentation gene, Runt, which is important for the proper expression of other segmentation genes, is involved in three developmental pathways in flies: sex determination, segmentation, and neurogenesis. Runt is the founding member of the RUNX transcription factor family, which has homologues in all animals. In mice, Runx1 is required for hematopoiesis, and its loss is embryonic lethal. In humans, some mutations in RUNX1 (also known as AML1) are associated with leukemia. For example, one mutation associated with acute myeloid leukemia is a chromosomal translocation, t(8;21), which produces an AML1-ETO fusion protein (Reikvam et al. 2011). The fusion protein has the N-terminal domain of AML1 and the C-terminal domains of ETO. Because it carries the wrong C-terminal domains, it misregulates transcription.

As a graduate student researcher in Dr. John Peter Gergen’s lab, I studied transcription factors and enhancers using Drosophila genetics. Runt, Opa, Eve, and Ftz are pair-rule transcription factors that regulate the early expression of the gene sloppy-paired. This regulation is mediated by two enhancers, the distal early stripe element (DESE) and the proximal early stripe element (PESE) (Hang and Gergen 2017, Prazak et al. 2010). Enhancers are stretches of DNA, usually a few hundred base pairs long, that contain clusters of transcription factor binding sites and mediate regulation by transcription factors (Sakabe et al. 2012). They can be located upstream or downstream of a gene or within one of its introns, and they can regulate the gene from a distance. Enhancers are often essential for the expression of developmental genes (Sakabe et al. 2012).

Protein complexes are involved in transcription. Some enhancers affect transcription activation through a protein complex called the Mediator. Another protein complex, Negative Elongation Factor (NELF), causes RNA Polymerase II (Pol II) to pause near the promoter and stop making RNA, which represses the gene. P-TEFb, a cyclin-dependent kinase, phosphorylates NELF and the C-terminal domain of Pol II, allowing the transition from promoter-proximal pausing to transcription elongation. I will write more on this topic in another blog post.

One question I asked in my research was this: if an enhancer mediates repression by preventing Pol II release, does that prevent another enhancer near the same promoter from activating transcription? Runt and Ftz repress DESE by preventing Pol II release, and Eve represses PESE in the same way. When each of these enhancers was combined with an enhancer from the short gastrulation (sog) gene, Runt and Ftz repressed the DESE-sog-lacZ gene, and Eve repressed the PESE-sog-lacZ gene. This demonstrated that repression by preventing release of paused Pol II is dominant.
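The logic of "dominant" repression can be written as a toy Boolean model. The Python sketch below is only an illustration of the reasoning, not the analysis used in the papers cited here; the Enhancer class, its fields, and the reporter_expressed function are names I invented for this example. The assumption is that Pol II sits at a shared promoter, so one enhancer that blocks Pol II release stops expression even when another enhancer is activating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enhancer:
    name: str
    activates: bool            # enhancer drives Pol II recruitment and initiation
    blocks_pol2_release: bool  # a bound repressor holds Pol II in the paused state


def reporter_expressed(enhancers):
    """Toy rule: the reporter is expressed only if at least one enhancer activates
    and no enhancer blocks release of paused Pol II at the shared promoter."""
    initiated = any(e.activates for e in enhancers)
    release_blocked = any(e.blocks_pol2_release for e in enhancers)
    return initiated and not release_blocked


sog = Enhancer("sog", activates=True, blocks_pol2_release=False)
dese_without_repressors = Enhancer("DESE", activates=True, blocks_pol2_release=False)
dese_with_runt_and_ftz = Enhancer("DESE", activates=True, blocks_pol2_release=True)

# Where the repressors are absent, the composite reporter is expressed.
print(reporter_expressed([dese_without_repressors, sog]))
# Where Runt and Ftz act on DESE, the sog enhancer cannot rescue expression.
print(reporter_expressed([dese_with_runt_and_ftz, sog]))
```

If the repression were not dominant, the second case would still give expression, because the sog enhancer is active on its own.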

 

References

Lee, T. I., & Young, R. A. (2013). Transcriptional regulation and its misregulation in disease. Cell, 152(6), 1237–1251.

Hang, S., & Gergen, J. P. (2017). Different modes of enhancer-specific regulation by Runt and Even-skipped during Drosophila segmentation. Molecular Biology of the Cell, 28(5), 681–691. http://doi.org/10.1091/mbc.E16-09-0630

Prazak, L., Fujioka, M., & Gergen, J. P. (2010). Non-additive interactions involving two distinct elements mediate sloppy-paired regulation by pair-rule transcription factors. Developmental Biology, 344(2), 1048–1059. http://doi.org/10.1016/j.ydbio.2010.04.026

Reikvam et al. (2011). Acute Myeloid Leukemia with the t(8;21) Translocation: Clinical Consequences and Biological Implications. Journal of Biomedicine and Biotechnology, 2011, Article ID 104631, 23 pages. http://dx.doi.org/10.1155/2011/104631

Sakabe et al. (2012). Transcriptional enhancers in development and disease. Genome Biology. https://genomebiology.biomedcentral.com/articles/10.1186/gb-2012-13-1-238