Despite all the complaints about AI slop, there is something ironic about the fact that simply being exposed to it might be a net positive influence for most of society. Discord often begins from the simplest of communication errors after all...
Our experience (https://arxiv.org/abs/2410.16107) is that LLMs like GPT-4o have a particular writing style, including both vocabulary and distinct grammatical features, regardless of the type of text they're prompted with. The style is informationally dense, features longer words, and favors certain grammatical structures (like participles; GPT-4o loooooves participles).
With Llama we're able to compare base and instruction-tuned models, and it's the instruction-tuned models that show the biggest differences. Evidently the AI companies are (deliberately or not) introducing particular writing styles with their instruction-tuning process. I'd like to get access to more base models to compare and figure out why.