Anna Maria Langmüller,
"Detecting CNVs in the 1000 Genomes Project Data Using cn.MOPS and Relating the Results to Transcriptome Sequencing Data"
Copy number variations (CNVs) are structural variants in the human genome that range from approximately 100 base pairs up to several megabases. The detection and study of CNVs is of great scientific interest because some CNVs can alter susceptibility to several diseases. Moreover, CNVs can influence gene expression levels.
Although CNVs play an important role in human genetic variation, determining copy number variation regions (CNVrs) from next-generation sequencing (NGS) data remains challenging.
In this master's thesis, we applied the data processing algorithm cn.MOPS (Copy Number estimation by a Mixture of PoissonS) to the low-coverage sequencing files of the 1000 Genomes Project to obtain an accurate map of human CNVs.
Limiting the survey to a CNV map based on the findings of a single algorithm, cn.MOPS, significantly reduces the number of CNV candidates compared to previous surveys and therefore increases the power of association tests. Associating the integer copy numbers with the expression values of genes located within a predefined search range of 200 kilobases upstream and downstream of a copy number variation region revealed a number of CNV-associated eQTLs (expression Quantitative Trait Loci) in the human genome.
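The search-range step can be illustrated with a minimal sketch. It is not the thesis code: the `Region` type, the function name `genes_near_cnvr`, the example coordinates, and the choice to count a gene as "located in" the range when it overlaps the extended window are all assumptions made for illustration. Only the 200 kilobase range comes from the text.

```python
from typing import Dict, List, NamedTuple

WINDOW_BP = 200_000  # 200 kilobases, the search range stated in the abstract


class Region(NamedTuple):
    """A genomic interval (hypothetical helper type, not from the thesis)."""
    chrom: str
    start: int
    end: int


def genes_near_cnvr(cnvr: Region, genes: Dict[str, Region],
                    window: int = WINDOW_BP) -> List[str]:
    """Return names of genes overlapping the CNV region extended by `window`
    base pairs on both sides. Overlap (rather than full containment) is an
    assumption of this sketch."""
    lo = max(0, cnvr.start - window)
    hi = cnvr.end + window
    return [
        name
        for name, gene in genes.items()
        if gene.chrom == cnvr.chrom and gene.start <= hi and gene.end >= lo
    ]


# Made-up coordinates, for illustration only.
cnvr = Region("chr1", 1_000_000, 1_050_000)
genes = {
    "GENE_A": Region("chr1", 850_000, 870_000),      # inside the upstream range
    "GENE_B": Region("chr1", 1_300_000, 1_320_000),  # 250 kb downstream, outside
    "GENE_C": Region("chr2", 1_000_000, 1_010_000),  # different chromosome
}
nearby = genes_near_cnvr(cnvr, genes)
print(nearby)
```

Each gene returned this way would then be a candidate for an association test between the copy numbers at the region and the expression values of that gene.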
The results of this work confirm many eQTLs from previous surveys. In addition, using a one-way Analysis of Variance (ANOVA) instead of a simple rank correlation coefficient test, in combination with a thorough and accurate CNV map, enabled us to detect new CNV-associated eQTLs.
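To make the ANOVA step concrete, the following is a minimal sketch of a one-way ANOVA F statistic in which the integer copy number defines the groups and the expression values are the observations. The function name `one_way_anova_f` and the toy data are hypothetical, and the sketch is not the thesis implementation. One plausible reason for preferring ANOVA, offered here as an assumption, is that it treats copy number as a categorical factor and so does not require expression to change monotonically with copy number, which a rank correlation test does.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def one_way_anova_f(copy_numbers: Sequence[int],
                    expression: Sequence[float]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for expression grouped by copy number."""
    if len(copy_numbers) != len(expression):
        raise ValueError("copy_numbers and expression must have equal length")

    # Group the expression values by integer copy number.
    groups: Dict[int, List[float]] = defaultdict(list)
    for cn, value in zip(copy_numbers, expression):
        groups[cn].append(value)

    n, k = len(expression), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more samples than groups")

    grand_mean = sum(expression) / n
    means = {cn: sum(v) / len(v) for cn, v in groups.items()}

    # Between-group and within-group sums of squares.
    ss_between = sum(len(v) * (means[cn] - grand_mean) ** 2
                     for cn, v in groups.items())
    ss_within = sum((x - means[cn]) ** 2
                    for cn, v in groups.items() for x in v)

    df_between, df_within = k - 1, n - k
    if ss_within == 0:
        return float("inf"), df_between, df_within
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy data, for illustration only: three samples each at copy number 1, 2, 3.
copy_numbers = [1, 1, 1, 2, 2, 2, 3, 3, 3]
expression = [1.0, 2.0, 3.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0]
f_stat, df_b, df_w = one_way_anova_f(copy_numbers, expression)
print(f_stat, df_b, df_w)
```

The sketch stops at the F statistic. Turning it into a p-value requires the F distribution with the returned degrees of freedom, which a statistics library (for example `scipy.stats.f_oneway`) would provide, and testing many region-gene pairs would additionally call for a multiple-testing correction.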
Further experimental analyses are recommended to validate the results of this thesis.