dc.contributor.author
Schäfermeier, Ralph
dc.contributor.author
Todor, Alexandru-Aurelian
dc.contributor.author
La Fleur, Alexandra
dc.contributor.author
Hasan, Ahmad
dc.contributor.author
Einhaus, Johannes
dc.contributor.author
Paschke, Adrian
dc.date.accessioned
2018-06-08T07:37:33Z
dc.date.available
2016-06-20T10:17:16.079Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/18369
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-22072
dc.description.abstract
Nowadays, a wide range of information sources is available due to the
evolution of the web and the collection of data. Much of this information is
consumable and usable by humans but not understandable or processable by
machines. Some data may be directly accessible in web pages or via data feeds,
but most of the meaningful existing data is hidden within deep web databases
and enterprise information systems. Besides the inability to access a wide
range of data, manual processing by humans is laborious, error-prone and
outdated. Semantic web technologies deliver capabilities for
machine-readable, exchangeable content and metadata for automatic processing
of content. The enrichment of heterogeneous data with background knowledge
described in ontologies promotes reusability and supports automatic processing
of data. The establishment of “Corporate Smart Content” (CSC) - semantically
enriched data with high information content and sufficient benefits in
economic areas - is the main focus of this study. We describe three current
research areas in the field of CSC with regard to scenarios and datasets
applicable to corporate applications, algorithms and research.
Aspect-oriented Ontology Development advances modular ontology development and
partial reuse of existing ontological knowledge. Complex Entity Recognition
enhances traditional entity recognition techniques to recognize clusters of
related textual information about entities. Semantic Pattern Mining combines
semantic web technologies with pattern learning to mine for complex models by
attaching background knowledge. This study introduces the aforementioned
topics by analyzing applicable scenarios with an economic and industrial
focus as well as a research emphasis. Furthermore, a collection of existing datasets
for the given areas of interest is presented and evaluated. The target
audience includes researchers and developers of CSC technologies - people
interested in semantic web features, ontology development, automation, and
the extraction and mining of valuable information in corporate environments.
The aim of this study is to provide a comprehensive and broad overview of the
three topics and to assist in decision making for relevant scenarios and in
choosing practical datasets for evaluating custom problem statements. Detailed
descriptions of the attributes and metadata of the datasets should serve as
a starting point for individual ideas and approaches.
en
dc.relation.ispartofseries
urn:nbn:de:kobv:188-fudocsseries000000000021-2
dc.rights.uri
http://www.fu-berlin.de/sites/refubium/rechtliches/Nutzungsbedingungen
dc.subject.ddc
000 Informatik, Informationswissenschaft, allgemeine Werke::000 Informatik, Wissen, Systeme::004 Datenverarbeitung; Informatik
dc.title
Corporate Smart Content Evaluation
refubium.affiliation
Mathematik und Informatik
de
refubium.affiliation.other
Institut für Informatik
refubium.mycore.fudocsId
FUDOCS_document_000000024703
refubium.mycore.reportnumber
TR-B-16-02
refubium.resourceType.isindependentpub
no
refubium.series.name
Freie Universität Berlin, Fachbereich Mathematik und Informatik
refubium.series.reportNumber
16-2
refubium.mycore.derivateId
FUDOCS_derivate_000000006788
dcterms.accessRights.openaire
open access