The role of structural DNA properties in transcription regulation.
Role of structural properties in transcription regulation
Meysman, Pieter; S0105269
Transcriptional regulation is an essential biological process, as it is one of the main mechanisms available to all living organisms to control the expression of their genes, to direct their development, and to adapt to the ever-changing conditions of the world that they live in. Research has suggested that the structure of the DNA molecule may impact transcription and its regulation in several different ways. The goal of this PhD is to investigate the role that DNA structure plays in transcription and how this information can be used to the advantage of biological research.
To achieve this goal, we created a general framework for representing structural DNA properties of functional genomic elements involved in transcription, which we have called CRoSSeD. The CRoSSeD framework was designed to use structural scales to derive the structural information and Conditional Random Fields to find relevant common characteristics in the DNA structure of the functional element. The advantages of this framework were demonstrated on the prediction of transcription factor (TF) binding sites. We found that CRoSSeD improved the accuracy of the classification of binding sites for given transcription factors, and we were able to make a set of novel target gene predictions for the model organism Escherichia coli. Furthermore, the model produced by the CRoSSeD framework could be directly linked to the binding mechanisms of the TF, and this feature was used to construct the family-wide motif for the well-known LacI TF family.
We also investigated the transcriptional behavior of the genes in several well-characterized bacterial organisms, with the eventual goal of linking this analysis back to the influence of the DNA structure. To this end, we created comprehensive organism-specific cross-platform expression compendia for three bacterial model organisms (E. coli, Bacillus subtilis, and Salmonella enterica serovar Typhimurium).
These compendia were made available to the general scientific community through an access portal, dubbed COLOMBOS, which provides a suite of tools for exploring, analyzing, and visualizing the data within these compendia. Additionally, we developed a compendium management and creation system called COMMAND to support and expand the data contained within COLOMBOS. The expression compendia were then used to analyze the expression divergence of the orthologous genes of E. coli and S. Typhimurium. Here we found that the genes involved in essential cellular processes had conserved their expression domains, while those associated with pathogenesis had diverged the most.