A novel automated label data extraction and data base generation system from herbarium specimen images using OCR and NER