zlacker

[return to "AI agents are starting to eat SaaS"]
1. Oarch+65[view] [source] 2025-12-15 00:24:57
>>jnord+(OP)
Earlier this year I thought that rare proprietary knowledge and IP were a safe haven from AI, since LLMs can only scrape public data.

Then it dawned on me how many companies are deeply integrating Copilot into their everyday workflows. It's the perfect Trojan Horse.

◧◩
2. Aurorn+S5[view] [source] 2025-12-15 00:31:19
>>Oarch+65
Using an LLM on data does not ingest that data into the training corpus. LLMs don’t “learn” from the information they operate on, contrary to what a lot of people assume.

None of the mainstream paid services ingest operating data into their training sets. You will find a lot of conspiracy theories claiming that companies are saying one thing but secretly stealing your data, of course.

◧◩◪
3. lwhi+c8[view] [source] 2025-12-15 00:50:49
>>Aurorn+S5
Information about the way we interact with the data (RLHF-style feedback signals) can be used to refine agent behaviour.

While this isn't LLM training per se, it can involve aggregating insights from customer behaviour.
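
Something like this, to make it concrete (Python; FeedbackEvent and log_feedback are names I've made up for illustration, not any vendor's real API):

    # Illustrative only: a made-up FeedbackEvent record and log_feedback helper,
    # not any vendor's real API. The point is that the signal being collected is
    # about the interaction, not the underlying proprietary data itself.
    import json
    import time
    from dataclasses import dataclass, asdict

    @dataclass
    class FeedbackEvent:
        session_id: str
        prompt_hash: str   # hash of the prompt rather than the raw text
        response_id: str
        accepted: bool     # did the user accept or reject the suggestion?
        timestamp: float

    def log_feedback(event: FeedbackEvent, path: str = "feedback.jsonl") -> None:
        # Append one interaction record; aggregated later, this is the kind of
        # signal that can feed an RLHF-style refinement step, separately from
        # ordinary inference.
        with open(path, "a") as f:
            f.write(json.dumps(asdict(event)) + "\n")

    log_feedback(FeedbackEvent("sess-1", "ab12cd", "resp-42", accepted=False,
                               timestamp=time.time()))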

◧◩◪◨
4. Aurorn+Of[view] [source] 2025-12-15 01:50:13
>>lwhi+c8
That’s a training step. It requires explicitly collecting the data and using it in the training process.

Merely using an LLM for inference does not train it on the prompts and data, as many incorrectly assume. There is a surprising lack of understanding of this separation even on technical forums like HN.

◧◩◪◨⬒
5. lwhi+ra1[view] [source] 2025-12-15 11:11:12
>>Aurorn+Of
That's definitely a fair point.

However, let's say I record human interactions with my app; for example, when a user accepts or rejects an AI-synthesised answer.

I can use that data to influence the behaviour of an LLM via RAG, or to alter the application's behaviour.

It's not going to change the model's weights, but it would influence its behaviour.
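
Roughly like this toy Python sketch; record_acceptance, retrieve and build_prompt are invented names, and a real system would use embeddings rather than keyword overlap:

    # Toy sketch of the loop I'm describing: accepted answers get stored and
    # retrieved back into the prompt (RAG), so the model's output shifts without
    # any weight update. All names are invented for illustration.
    from typing import List, Tuple

    accepted_answers: List[Tuple[str, str]] = []   # (question, accepted answer)

    def record_acceptance(question: str, answer: str) -> None:
        # Called when a user accepts an AI-synthesised answer.
        accepted_answers.append((question, answer))

    def retrieve(question: str, k: int = 3) -> List[str]:
        # Naive keyword-overlap retrieval; a real system would use embeddings.
        q_terms = set(question.lower().split())
        scored = sorted(
            accepted_answers,
            key=lambda qa: len(q_terms & set(qa[0].lower().split())),
            reverse=True,
        )
        return [answer for _, answer in scored[:k]]

    def build_prompt(question: str) -> str:
        # Prepend previously accepted answers as context: the weights never
        # change, but the output is steered by what users approved before.
        context = "\n".join(retrieve(question))
        return (f"Context from previously accepted answers:\n{context}\n\n"
                f"Question: {question}")

    record_acceptance("How do I rotate an API key?",
                      "Use the admin console, then update the secret store.")
    print(build_prompt("What is the process to rotate an API key?"))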

[go to top]