Sep 7, 2022
Yes, I definitely noticed that the quality was not on par with what I would expect from a working model. I haven't seen any particularly slick ways, just basic text cleaning techniques that you could get from a basic NLP course on Coursera. I think what you can do is set up your experiments to first figure out what the ideal prompts are (amount of text, types of words and characters to avoid, etc.) and then spend some time on the ideal cleaning procedures to get clean text afterwards. Cheers! :)
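
For what it's worth, here is a rough sketch of the kind of basic cleaning I mean. The function name `clean_generated_text` and the specific rules are just my assumptions for illustration, not a recommended pipeline; you would tune them to whatever your model actually outputs.

```python
import re
import unicodedata


def clean_generated_text(text: str) -> str:
    """Apply a few basic, illustrative cleanup rules to model output."""
    # Normalize Unicode so compatibility variants collapse to one form.
    text = unicodedata.normalize("NFKC", text)
    # Drop characters in the Unicode "Other" categories (control, format,
    # unassigned, etc.), but keep newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or not unicodedata.category(ch).startswith("C")
    )
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse three or more newlines into one blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Remove a stray space before common punctuation.
    text = re.sub(r" ([,.;:!?])", r"\1", text)
    return text.strip()


if __name__ == "__main__":
    raw = "Hello   world !\x00\n\n\n\nBye  "
    print(repr(clean_generated_text(raw)))
```

You could run the same function over the outputs of each prompt variant, so the prompt experiments and the cleaning step stay separate.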