Elucidating regulatory mechanisms downstream
Chapter 3 PYICOS: A VERSATILE TOOLKIT FOR THE ANALYSIS OF HIGH-THROUGHPUT SEQUENCING DATA
We calculated the peak densities in different regions related to genes and found that the region around the proximal promoter and the first exon were the most
Figure 4.2: Distribution of PRbs in breast cancer cells. A) We describe the distribution of PRbs in different regions related to genes.
4.2 Methods
A cluster was defined as block-like if the ratio between the length covered by the maximum of the peak and the total length of the peak was greater than 0.25.
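The block-like criterion can be illustrated with a minimal sketch. It assumes that a cluster is available as a list of per-base coverage values, and the function name `is_block_like` is hypothetical, not part of the Pyicos API; only the 0.25 threshold comes from the text.

```python
def is_block_like(coverage, threshold=0.25):
    """Return True if the plateau at maximum height spans more than
    `threshold` of the cluster length.

    `coverage` is assumed to be a list of per-base read depths for one
    cluster (an assumption made for this illustration).
    """
    if not coverage:
        raise ValueError("coverage profile must not be empty")
    peak_height = max(coverage)
    # Length covered by the maximum: number of positions at peak height.
    length_at_max = sum(1 for depth in coverage if depth == peak_height)
    return length_at_max / len(coverage) > threshold


# Illustrative profiles (made-up values, not data from the thesis).
sharp_peak = [1, 2, 3, 5, 8, 5, 3, 2, 1, 1]  # maximum covers 1 of 10 positions
block_like = [4, 4, 4, 4, 4, 4, 2, 1]        # maximum covers 6 of 8 positions

print(is_block_like(sharp_peak))
print(is_block_like(block_like))
```

The comparison is strict, so a cluster whose maximum covers exactly one quarter of its length is not classified as block-like.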
1.5 CLIP-Seq
Being able to determine the location of specific RNA-binding proteins can give insight into the regulation of alternative splicing.
Thus, alignment of these tags to the genome results in two peaks, one on each strand, flanking the location where the protein or nucleosome of interest was bound.
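Because the two strand-specific peaks flank the bound location, a simple way to use them is to take the midpoint between their summits. The sketch below assumes the summit coordinates of the forward-strand and reverse-strand peaks are already known; the function name and the midpoint rule are illustrative assumptions, not the procedure used in this work.

```python
def estimate_binding_site(plus_summit, minus_summit):
    """Estimate the bound location from a pair of strand-specific peaks.

    Assumes the forward-strand summit lies upstream (lower coordinate)
    of the reverse-strand summit. Returns the midpoint between the two
    summits and the distance separating them.
    """
    if plus_summit > minus_summit:
        raise ValueError("forward-strand summit must not lie downstream "
                         "of the reverse-strand summit")
    distance = minus_summit - plus_summit
    midpoint = plus_summit + distance // 2
    return midpoint, distance


# Hypothetical summit coordinates, for illustration only.
site, distance = estimate_binding_site(1000, 1150)
print(site, distance)
```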
A) Before filtering
The relative contribution of marks to the epigenetic code
With the aim of finding the most relevant attributes that appear to determine the regulation of expression, we calculated the information gain (IG) for all attributes in the subsets HCG-IC and LCG-IC on pair P1 for the unfiltered and the filtered sets (Table 1).
Chapter 6 DISCUSSION
Unlike other
Next, we applied the protocol for differential expression analysis on RNA-Seq data to liver and kidney samples, and compared the results to those of other methods, using the corresponding microarray data as a benchmarking set.
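The information gain (IG) used above to rank attributes can be sketched as the reduction in entropy of the expression class once the value of an attribute is known. The example below assumes that attributes have already been discretized and that each gene carries a class label; the attribute values and labels are invented for illustration and are not taken from Table 1.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(attribute_values, labels):
    """IG = H(class) - H(class | attribute) for one discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Entropy of the class within each attribute value, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: expression change per gene and two hypothetical attributes.
expression = ["up", "up", "down", "down"]
informative = ["gain", "gain", "loss", "loss"]    # separates the classes
uninformative = ["gain", "loss", "gain", "loss"]  # independent of the class

print(information_gain(informative, expression))
print(information_gain(uninformative, expression))
```

An attribute that separates the classes perfectly recovers the full class entropy, while an attribute that is independent of the class yields an IG of zero, which is what makes IG usable for ranking.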
Chapter 1 INTRODUCTION
All cells of one individual accommodate identical genomes (genotypes) in the
This can be explained by the fact that in different cell types genes are differentially expressed.
The abundance of these different gene products is crucial for the identity of the cell.
The IG analysis confirms the role of some of the histone marks, like H3K9ac and H3K27ac, in the promoter and around the transcription start site in expression regulation, as described before in the literature, and uncovers new regions, like the first intron for H3K36me3, the first exon for H3K4me3 and downstream of the polyadenylation site for H3K36me3, where changes in these marks associate strongly with expression regulation.
After having shown a high accuracy of expression change predictions between the cell lines K562 and Gm12878 (training set), we could show that the model trained on this data was generic enough to perform accurate predictions on Hsmm and Hmec (test set), a different pair of cell lines.
6.4 Limitations and future directions
Using our pipeline, we have associated epigenetic variations with expression changes on ENCODE data sets and provided the processed data to the public.
1.3 High-Throughput Sequencing (HTS)
The data is released to the public, thereby enabling the scientific community to interpret the human genome and apply it to medical research with the aim of improving health.