Abstract: | Elucidating the mechanisms that link genotypes to expressed phenotypes is one of the main challenges facing the life sciences today. One step toward that goal is mapping the protein-protein interaction (PPI) networks of various species, especially human. To that end, tens of thousands of experiments have been conducted to date, each uncovering parts of these vast networks. Their results are collected and recorded by primary PPI databases. Unfortunately, these databases exhibit limited overlap, use incompatible terminology and, above all, describe their recorded interactions at different levels of genetic reference. Because of how genetic information is organized in living organisms, the mappings from one level of reference to another are non-reversible; the resulting projections are non-isomorphic and unavoidably introduce ambiguous and false-positive interactions. The goal of this thesis is to develop a novel modeling and integration methodology, called ontological integration, that can be applied to multilayered, interconnected domains and is intended for challenges of this kind. By applying this method, we developed a meta-database for the human protein interactome, called PICKLE 2.0 (Protein InteraCtion KnowLedgebasE). To facilitate the generation and maintenance of this database, we developed an algorithm based on novel data structures designed specifically to provide crucial optimizations for biological data. PICKLE is available at: http://www.pickle.gr/.
|