Building large and complex software systems from small, well-defined components improves comprehension and management, and leads to reusable software modules and a scalable overall system. Designing ontologies in a modular way therefore promises the same advantages. However, most publicly available ontologies are still monolithic. As a result, not only the number but also the size of available ontologies has grown with their increasing use in recent years. To make their usage more efficient (e.g. through distributed and scoped reasoning), to simplify maintenance (e.g. through refactoring support) and to enable reusable components (e.g. through better human understandability), large ontologies need to be partitioned into well-sized building blocks in a (semi-)automatic way. Reusability is especially crucial from the viewpoint of the Semantic Web, because an agreed common semantic model allows easy data integration and interoperability. If ontologies are regarded as networks of concepts connected through properties, network analysis techniques become a promising way to analyze and partition them. Network analysis is a well-established discipline that offers many sophisticated methods, algorithms and tools. This work is driven by the belief that these methods can be adapted and applied to ontologies, so that the structure of an ontology can be used to analyze its content and to identify regions that can be seen as network "communities" representing subdomains of the ontology. Furthermore, analyzing the structure enables a first evaluation of usability by offering different views, so that ontology engineers can comprehend existing ontologies more easily. This is very important because refactoring and reusing existing models presuppose that these models are understood.
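The idea of treating an ontology as a weighted network and looking for communities can be illustrated with a minimal sketch. It is not the framework described in this work: the toy triples, the per-property edge weights and the choice of a deterministic label propagation as the community detection algorithm are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical toy ontology as (subject, property, object) triples.
TRIPLES = [
    ("Student", "subClassOf", "Person"),
    ("Teacher", "subClassOf", "Person"),
    ("Teacher", "teaches", "Student"),
    ("Lecture", "partOf", "Course"),
    ("Exam", "partOf", "Course"),
    ("Exam", "covers", "Lecture"),
    ("Student", "attends", "Course"),
]

# Assumed edge weights per property type; properties not listed get DEFAULT_WEIGHT.
WEIGHTS = {"subClassOf": 1.0, "partOf": 1.0}
DEFAULT_WEIGHT = 0.3


def build_graph(triples, weights, default=DEFAULT_WEIGHT):
    """Build an undirected weighted graph: node -> {neighbour: weight}."""
    graph = defaultdict(dict)
    for subj, prop, obj in triples:
        w = weights.get(prop, default)
        graph[subj][obj] = graph[subj].get(obj, 0.0) + w
        graph[obj][subj] = graph[obj].get(subj, 0.0) + w
    return graph


def label_propagation(graph, max_rounds=20):
    """Deterministic label propagation: each node adopts the label with the
    highest total edge weight among its neighbours. Ties keep the current
    label if possible, otherwise the alphabetically smallest label wins."""
    labels = {node: node for node in graph}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            score = defaultdict(float)
            for neighbour, w in graph[node].items():
                score[labels[neighbour]] += w
            if not score:
                continue
            best = max(score.values())
            candidates = sorted(l for l, s in score.items() if s == best)
            if labels[node] not in candidates:
                labels[node] = candidates[0]
                changed = True
        if not changed:
            break
    return labels


def communities(labels):
    """Group nodes by label and return the groups as sorted lists."""
    groups = defaultdict(list)
    for node, label in labels.items():
        groups[label].append(node)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    graph = build_graph(TRIPLES, WEIGHTS)
    print(communities(label_propagation(graph)))
    # [['Course', 'Exam', 'Lecture'], ['Person', 'Student', 'Teacher']]
```

In this toy case the strongly weighted hierarchical edges hold each group together, while the weakly weighted `attends` edge is not enough to merge the two groups.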
To this end, an adaptable structure-based ontology partitioning framework has been designed and implemented that uses community detection algorithms from the field of social network analysis. The framework exposes configurable parameters so that it can be tuned to the purpose of the partitioning and the best-suited solution for that purpose can be obtained. The framework has been evaluated with a gold-standard approach for two concrete ontology partitioning cases. First, it was analyzed how well the term chunks from the documentation pages of thirteen ontologies can be reconstructed. Second, it was investigated how well the modules of four selected modularly built ontologies can be re-identified. For both cases, 480 different combinations of configurations were applied to each ontology. The performance of the framework was measured with the F-measure, computed between the reference model and the produced partitions. Depending on the configuration, the results ranged from very good to very poor. The next problem was therefore to define a strategy for selecting the best configuration for the partitioning process, based on the structure of the ontology and the purpose of the partitioning. Two approaches were used. In the first, the results for all ontologies and all configurations were analyzed statistically, and the parameter values that led to the best results were selected. In the second, which assumes that similar ontologies should be partitioned alike, each new ontology to be partitioned was compared with already partitioned ontologies using a distance metric based on structural metrics. Once the most similar ontology had been identified, the configuration that had led to the best results for that known ontology was applied to the new ontology.
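The gold-standard comparison can be sketched as follows. The text does not spell out the exact F-measure variant, so this sketch assumes a common clustering formulation: each reference module is matched with the produced partition that gives the best F1 score, and the per-module scores are averaged weighted by module size. The function names and the toy data are hypothetical.

```python
def f1(reference_module, produced_partition):
    """Harmonic mean of precision and recall between two sets of terms."""
    overlap = len(reference_module & produced_partition)
    if overlap == 0:
        return 0.0
    precision = overlap / len(produced_partition)
    recall = overlap / len(reference_module)
    return 2 * precision * recall / (precision + recall)


def partition_f_measure(reference, produced):
    """Size-weighted average of the best F1 score of each reference module."""
    total = sum(len(module) for module in reference)
    return sum(
        len(module) / total * max(f1(module, part) for part in produced)
        for module in reference
    )


if __name__ == "__main__":
    reference = [{"Person", "Student", "Teacher"}, {"Course", "Lecture", "Exam"}]
    produced = [{"Person", "Student"}, {"Teacher", "Course", "Lecture", "Exam"}]
    print(round(partition_f_measure(reference, produced), 4))  # 0.8286
```

A produced partitioning that matches the reference exactly scores 1.0 under this definition, which makes it easy to compare many configurations against the same reference model.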
Both approaches significantly outperformed comparable tools, with the similarity-based approach yielding marginally better results than the statistical approach. Overall, for both the reconstruction of term chunks and the re-identification of modular ontologies, the reference models could be reproduced up to sixty percent. Although this value is twice as good as the performance of the comparable tools, it does not justify a fully automatic approach to ontology partitioning. However, it could be demonstrated that the proposed framework at least enables a semi-automatic approach that produces an acceptable first result, which should then be refined manually.
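The similarity-based selection of a configuration can be sketched as a nearest-neighbour lookup. The structural metrics (`density`, `avg_degree`, `depth`), their normalization to comparable ranges, the Euclidean distance and the configuration keys are all assumptions chosen for illustration; the text only states that a distance metric over structural metrics was used.

```python
import math


def distance(metrics_a, metrics_b):
    """Euclidean distance between two metric vectors with the same keys."""
    return math.sqrt(sum((metrics_a[k] - metrics_b[k]) ** 2 for k in metrics_a))


def select_configuration(new_metrics, known_ontologies):
    """Return the name of the most similar known ontology and its best configuration."""
    nearest = min(
        known_ontologies,
        key=lambda name: distance(new_metrics, known_ontologies[name]["metrics"]),
    )
    return nearest, known_ontologies[nearest]["best_config"]


if __name__ == "__main__":
    # Hypothetical, already partitioned ontologies with normalized metrics.
    known = {
        "onto_a": {
            "metrics": {"density": 0.10, "avg_degree": 0.30, "depth": 0.50},
            "best_config": {"algorithm": "label_propagation", "hierarchy_weight": 1.0},
        },
        "onto_b": {
            "metrics": {"density": 0.60, "avg_degree": 0.80, "depth": 0.20},
            "best_config": {"algorithm": "other_algorithm", "hierarchy_weight": 0.5},
        },
    }
    new_ontology = {"density": 0.15, "avg_degree": 0.35, "depth": 0.45}
    print(select_configuration(new_ontology, known)[0])  # onto_a
```

The design choice here is that no new search over configurations is needed for an unseen ontology: the configuration is simply reused from its closest known neighbour.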
Component-based development of complex software systems improves maintainability and leads to reusable software modules. Based on this experience, it is assumed that component-based development of ontologies brings similar benefits. However, most ontologies are built monolithically, so that their size and complexity have grown along with the increasing number of ontologies available online. Efficient usage, simple maintenance and reusable components therefore require suitable partitioning techniques. Particularly in the context of the Semantic Web, the reuse of ontologies is essential, because ontologies enable web-wide data integration and the interoperability of heterogeneous systems. This work pursues a structure-based approach to ontology partitioning in which ontologies are represented as networks. These networks are given an edge weighting that takes the semantic relationships within the ontologies into account. Building on this, a configurable approach to partitioning ontologies is developed using community detection algorithms from the field of social networks. The main focus is on two concrete use cases for partitioning, namely the modularization of existing complex ontologies to simplify maintenance, and the generation of term groupings for documentation pages to support reusability. Requirements for both cases are extracted from existing solutions, which later also serve as reference models for the evaluation in a gold-standard approach. In experimental analyses of the proposed approach, the best parameter values for the respective use cases are determined.
With these parameter values, the system is then compared with the existing ontology partitioning tools SWOOP and Pato. This direct comparison shows that the approach developed here can deliver significantly better results than both competitors. However, the results are not good enough to assume that a fully automatic partitioning process is possible. The structure-based approach can only be used for semi-automatic partitioning, so users have to refine the results manually.