I work at the National Library in Slovenia on the IMPACT project, which digitises and OCRs books from the 19th century. The project aims to significantly improve access to historical text and to remove the barriers that stand in the way of the mass digitisation of European cultural heritage.
We also work with XML files; there are about 5,000 of them.
We are fixing some mistakes in them by find-and-replace with TextCrawler.
We have already corrected some mistakes (background color, font color, etc.), but we can't find a regular expression that finds and deletes the superfluous white space at the ends of lines in some parts of the text.
Here is an example:
<Unicode>This is an example
of our text to show you
what we need to do.</Unicode></TextEquiv></TextRegion>
The white spaces we mean are the ones at the end of each line; we want to find them and delete them with the correct regular expression.
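To make the goal concrete, here is a rough sketch in Python (not TextCrawler) of the behaviour we are after. It assumes that "useless" white space means spaces and tabs directly before a line break or at the very end of the text. The function name strip_trailing_whitespace and the pattern are only our illustration, and we do not know whether TextCrawler accepts the same syntax.

```python
import re

# Assumption: superfluous white space = spaces/tabs right before a line
# break (Unix "\n" or Windows "\r\n") or at the very end of the text.
# The lookahead keeps the line break itself untouched.
TRAILING_WS = re.compile(r"[ \t]+(?=\r?$)", re.MULTILINE)


def strip_trailing_whitespace(text: str) -> str:
    """Return text with spaces and tabs at the end of each line removed."""
    return TRAILING_WS.sub("", text)


if __name__ == "__main__":
    sample = (
        "<Unicode>This is an example  \n"
        "of our text to show you \t\n"
        "what we need to do.</Unicode></TextEquiv></TextRegion>"
    )
    # repr() makes the remaining white space and line breaks visible
    print(repr(strip_trailing_whitespace(sample)))
```

In a find-and-replace tool, the equivalent would presumably be to search for a pattern like this and replace it with nothing, provided the tool treats $ as end of line.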
Thank you in advance for your answer.