(And, ironically, this problem is much easier now that we have LLMs to help us clean and massage textual data.)