Had a call with Reddit to discuss pricing

>>robbie+(OP)
I have a story to tell, about the demise of one of the largest internet forums in my language.

About ten years ago, when smartphones had just started appearing, the forum did not have a mobile version, and there were various 3rd party clients on the App Store and Android Market.

Later on, one of the largest 3rd party clients was blocked, supposedly because it was hammering the forum's servers too hard. Or something about caching and stealing ad revenue.

Then a couple of years later, in 2017, the 3rd party client's devs launched their own forum, reusing the client's name. It exploded in popularity and quickly took over as the most popular message board among the youth.

The old forum now has a sort of boomer or mentally ill stigma to it.

I hope to see Apollo go down this route.

Oh, and I don't think either forum in the story monetized as hard as reddit has with its paid awards and memberships.

One more thought: Keep the Apollo UI or whatever thing the users are most familiar with. Most of them do not care if it is fediverse or open source or backed by web-scale k8s, they only want it to just work (tm) good enough to post things on it. Eat the lunch you prepared yourself.

>>gaudat+XX
The tech isn't the challenge with something like Reddit, even the comically inept Reddit leadership could figure it out after all.

The difficult part is finding a few hundred mods willing to work for you for free, filtering all the filth that tries to be posted.

Only if they have a solution for that can they try going their own way.

>>Devast+Lf1
It became a lot more possible this year to start doing AI content moderation. It won't be perfect, of course, but human mods are probably worse.

And yes, I've used content moderation AIs in the past (like Google's Perspective API) and they're not really usable. The OpenAI moderation endpoint, embeddings classification, or even just gpt3.5-turbo would work marvelously.

>>tornat+Sg1
Have you tested this on a small scale?

>>Michae+Ar1
I have a reddit post database with ~6 million unique post titles and through various manual means I've identified ~100,000 of these post titles that _are_ spam.

First, I parsed the posts for the most common phrases at varying lengths, hand identifying 3,730 individual strings that I felt indicated spam within the post title, post body, reddit username, reddit user description or comment bodies.
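
For anyone curious, here is a minimal sketch of what counting "the most common phrases at varying lengths" could look like. This is not the actual code from the project described above; the function name, the n-gram lengths, and the sample titles are all made up for illustration, and it only produces candidates that a human would still have to review by hand.

```python
from collections import Counter

def common_phrases(titles, min_len=2, max_len=4, top=10):
    """Count word n-grams of several lengths across all titles.

    Returns the `top` most frequent phrases as (phrase, count) pairs.
    """
    counts = Counter()
    for title in titles:
        words = title.lower().split()
        for n in range(min_len, max_len + 1):
            # Slide a window of n words across the title.
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(top)

# Hypothetical sample data, not taken from any real dataset.
sample_titles = [
    "Buy cheap followers now",
    "buy cheap followers today",
    "My cat learned a new trick",
]

for phrase, count in common_phrases(sample_titles, top=5):
    print(count, phrase)
```

Phrases that show up far more often than chance would suggest are the ones worth eyeballing as possible spam indicators.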

These strings are then checked against new or updated records and things are flagged as spam as needed.
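
A rough sketch of that checking step, assuming plain case-insensitive substring matching (the original may well do something smarter). The field names, the `find_spam_strings` helper, and the example strings are hypothetical.

```python
def find_spam_strings(record, spam_strings):
    """Return the spam strings that occur in any text field of the record."""
    fields = ("title", "body", "username", "user_description")
    # Join with newlines so a phrase is unlikely to match across two fields.
    text = "\n".join(str(record.get(f, "")) for f in fields).lower()
    return [s for s in spam_strings if s.lower() in text]

# Made-up indicator strings and record, for illustration only.
spam_strings = ["buy cheap followers", "free crypto giveaway"]
record = {"title": "BUY cheap followers now", "username": "totally_real_user"}

hits = find_spam_strings(record, spam_strings)
is_spam = bool(hits)  # flag the record if any indicator string matched
print(is_spam, hits)
```

With a few thousand strings and millions of records, a loop like this gets slow, so something like an Aho-Corasick automaton or a single compiled regex would probably be the next step.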

It's been weeks since I've had to manually intervene and identify more spam strings - that's not to say I won't need to eventually as trends and techniques change (or, as it happens - reddit's api changes), but this was a fantastically successful means for identifying and analyzing obvious spam.

Beyond the above, I used what was a relatively simple approach to identify similar post titles to those that were determined to be spam for a "if you thought that was spam then you'll probably think this is too..." type feature that was very effective.
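
The comment doesn't say which similarity measure was used, so the following is only one plausible "relatively simple approach": Jaccard overlap between the word sets of two titles. The function names, the 0.5 threshold, and the sample titles are assumptions for the sake of the example.

```python
def jaccard(a, b):
    """Jaccard similarity between the word sets of two titles (0.0 to 1.0)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def similar_to_spam(title, known_spam_titles, threshold=0.5):
    """Return the known spam titles that overlap heavily with `title`."""
    return [t for t in known_spam_titles if jaccard(title, t) >= threshold]

# Hypothetical known-spam titles, for illustration only.
known_spam = ["buy cheap followers now", "free crypto giveaway click here"]

matches = similar_to_spam("Buy cheap followers today", known_spam)
print(matches)
```

Comparing every new title against ~100,000 known spam titles this way is brute force; at that scale MinHash or an embedding index would likely be needed, but the idea is the same.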

If reddit's api changes weren't happening I'd have already started training an ML model/NN or whatever chatGPT told me was the best one to use in order to classify these objects from the existing data.

Ironically, all of this was in order to offer moderation bots to subreddits to help handle the spam problem.

I started with scraping the API to play with meilisearch as a search engine but was just awestruck at the amount of _obvious_ spam that was getting through automod/reddit's own spam filtering (if there is any?) before being published/available via the API. I just didn't want to store all the metadata I was generating for all the spam posts and couldn't depend on reddit to police the issues on their end.

Now they're still unable to get a handle on spam - but also cutting off the developers trying to help them.
