OCR text derived from digitised books (precise publication dates unknown), in ALTO XML format