Dec 3, 2021
Sup Alberto, reading from PDFs can indeed give you some headaches. Honestly, my approach is that when I can’t read a PDF, or the output is not ideal (for example, the text comes out altered in some way), I either skip it or run some OCR on it, like I did in this article: https://towardsdatascience.com/faster-notes-with-python-and-deep-learning-b713bbb3c186
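
In case it's useful, here is a rough sketch of that "try to read it, otherwise fall back to OCR" idea. It is not the code from the article. It assumes the `pypdf`, `pdf2image`, and `pytesseract` packages (plus the Tesseract and Poppler binaries) are installed. The function names and the `min_chars` threshold are made up for illustration, and "too little text" is only a crude stand-in for "the output is not ideal".

```python
def text_with_pypdf(pdf_path):
    """Read the embedded text layer. Assumes the pypdf package is installed."""
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() can come back empty for scanned pages, hence the `or ""`.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def text_with_ocr(pdf_path):
    """Render pages to images and OCR them.

    Assumes pdf2image (needs Poppler) and pytesseract (needs Tesseract).
    """
    from pdf2image import convert_from_path
    import pytesseract

    images = convert_from_path(pdf_path)
    return "\n".join(pytesseract.image_to_string(image) for image in images)


def read_pdf(pdf_path, extract=text_with_pypdf, ocr=text_with_ocr, min_chars=20):
    """Try normal extraction first and fall back to OCR.

    `min_chars` is an arbitrary, hypothetical threshold: if extraction fails
    or returns fewer non-whitespace-trimmed characters than this, OCR is used.
    """
    try:
        text = extract(pdf_path)
    except Exception:
        # Treat any reader error as "could not read the PDF".
        text = ""
    if len(text.strip()) >= min_chars:
        return text
    return ocr(pdf_path)
```

Calling `read_pdf("notes.pdf")` (a placeholder path) would use the text layer when there is enough of it and OCR otherwise. The `extract` and `ocr` arguments are there so you can swap in whatever reader or OCR engine you prefer.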
Cheers, hope it helps! :)