dc.contributor.author
Iosifidis, Vasileios
dc.contributor.author
Papadopoulos, Symeon
dc.contributor.author
Rosenhahn, Bodo
dc.contributor.author
Ntoutsi, Eirini
dc.date.accessioned
2023-02-02T10:22:24Z
dc.date.available
2023-02-02T10:22:24Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/37528
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-37242
dc.description.abstract
Class imbalance poses a major challenge for machine learning, as most supervised learning models may exhibit bias towards the majority class and under-perform on the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, typically via a user-defined, fixed misclassification cost matrix provided as input to the learner. Tuning these costs is a challenging task that requires domain knowledge; moreover, wrong adjustments might degrade overall predictive performance. In this work, we propose a novel cost-sensitive boosting approach for imbalanced data that dynamically adjusts the misclassification costs over the boosting rounds in response to the model's performance, instead of using a fixed misclassification cost matrix. Our method, called AdaCC, is parameter-free: it relies on the cumulative behavior of the boosting model to adjust the misclassification costs for the next boosting round, and it comes with theoretical guarantees regarding the training error. Experiments on 27 real-world datasets from different domains with high class imbalance demonstrate the superiority of our method over 12 state-of-the-art cost-sensitive boosting approaches, with consistent improvements across different measures, for instance in the range of [0.3–28.56%] for AUC, [3.4–21.4%] for balanced accuracy, [4.8–45%] for gmean and [7.4–85.5%] for recall.
en
dc.format.extent
38 pages
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
Class imbalance
en
dc.subject
Cost-sensitive learning
en
dc.subject
Cumulative costs
en
dc.subject
Dynamic costs
en
dc.subject.ddc
000 Computer science, information and general works::000 Computer science, knowledge and systems::004 Data processing; computer science
dc.title
AdaCC: cumulative cost-sensitive boosting for imbalanced classification
dc.type
Scientific article
dcterms.bibliographicCitation.doi
10.1007/s10115-022-01780-8
dcterms.bibliographicCitation.journaltitle
Knowledge and Information Systems
dcterms.bibliographicCitation.number
2
dcterms.bibliographicCitation.pagestart
789
dcterms.bibliographicCitation.pageend
826
dcterms.bibliographicCitation.volume
65
dcterms.bibliographicCitation.url
https://doi.org/10.1007/s10115-022-01780-8
refubium.affiliation
Mathematik und Informatik
refubium.affiliation.other
Institut für Informatik
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
0219-3116
refubium.resourceType.provider
WoS-Alert