dc.contributor.author
Takano, Atsuko
dc.contributor.author
Cole, Theodor C. H.
dc.contributor.author
Konagai, Hajime
dc.date.accessioned
2024-03-14T10:23:17Z
dc.date.available
2024-03-14T10:23:17Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/42819
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-42535
dc.description.abstract
Digital extraction of label data from natural history specimens along with more efficient procedures of data entry and processing is essential for improving documentation and global information availability. Herbaria have made great advances in this direction lately. In this study, using optical character recognition (OCR) and named entity recognition (NER) techniques, we have been able to make further advancements towards fully automatic extraction of label data from herbarium specimen images. This system can be developed and run on a consumer grade desktop computer with standard specifications, and can also be applied to extracting label data from diverse kinds of natural history specimens, such as those in entomological collections. This system can facilitate the digitization and publication of natural history museum specimens around the world.
en
dc.format.extent
8 pages
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
Classification and taxonomy
en
dc.subject
Computational biology and bioinformatics
en
dc.subject
Data acquisition
en
dc.subject
Machine learning
en
dc.subject.ddc
500 Natural sciences and mathematics::570 Life sciences; biology::570 Life sciences; biology
dc.title
A novel automated label data extraction and data base generation system from herbarium specimen images using OCR and NER
dc.type
Scientific article
dcterms.bibliographicCitation.articlenumber
112
dcterms.bibliographicCitation.doi
10.1038/s41598-023-50179-0
dcterms.bibliographicCitation.journaltitle
Scientific Reports
dcterms.bibliographicCitation.number
1
dcterms.bibliographicCitation.volume
14
dcterms.bibliographicCitation.url
https://doi.org/10.1038/s41598-023-50179-0
refubium.affiliation
Biologie, Chemie, Pharmazie
refubium.affiliation.other
Institut für Biologie
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.isPartOf.eissn
2045-2322
refubium.resourceType.provider
WoS-Alert