You’ll need showdead enabled on your profile to see the whole thread, which speaks to the controversial nature of this issue on HN.
I agree that your point about “encouraging accusatory behavior” is well taken, and in the absence of evidence such accusations would themselves likely run afoul of the Guidelines. But it's worth noting that dang has said LLM output itself is generally against the Guidelines, which could lead to a feedback loop: disinterested parties posting LLM content, only to be confronted with interested parties posting uninteresting takedowns of that content and of the people who posted it.
> There are lot of grey areas; for example, your GP comment wasn't just generated—it came with an annotation that you're a lawyer and thought it was sound. That's better than a completely pasted comment. But it was probably still on the wrong side of the line. We want comments that the commenters actually write, as part of curious human conversation.
This doesn't leave much room for AI non-slop:
> We want comments that the commenters actually write, as part of curious human conversation.
I think HN is trying to be good at being HN, not just to provide the most utility to its users in general. So those wanting something like what HN would be if it started in 2030 may want to try to build a new site.
1. For whatever reason*, large swaths of copy-pasted LLM output are easily detectable.
2. If you're restrained and polite, and your signal on this is accurate, you can point it out without getting downvoted heavily. (e.g. I'll post "my internal GPT detector went off, [1-2 sentence clipped version of why I think it's wrong even if it wasn't GPT]")
3. People tend to downvote said content, as an ersatz vote.
In general, I don't think there needs to be a blanket ban against it, in the sense that I have absolutely no problem with LLM output per se, just with lazy invocations of it, e.g. large entry-level arguments that were copy-pasted.
e.g. I've used an LLM to sharpen an already-written, rushed, poor example, and it didn't result in low-perplexity, standard-essay-formatted content.
Additionally, IMHO it's not bad, per se, if someone invests in replying to an LLM. The fact that they are replying indicates it's an argument worth furthering with their own contribution.
* a strong indicator that a fundamental goal other than perplexity minimization may increase perceived quality
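To make the perplexity point concrete, here's a rough sketch of what "low perplexity" means in practice. This is only an illustration in Python using the Hugging Face transformers package with GPT-2 as a stand-in scorer; it's not anyone's actual detector, and nothing in this thread says HN uses anything like it.

    # Rough sketch: score a comment's perplexity under a small open model.
    # Assumes the "transformers" and "torch" packages are installed; gpt2 is
    # only a stand-in scorer, not anything HN or anyone else actually uses.
    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    def perplexity(text: str) -> float:
        # Let the model predict each token from its prefix; passing labels
        # makes it return the average cross-entropy loss over the text.
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc, labels=enc["input_ids"])
        return torch.exp(out.loss).item()

    # Lower score = the text is more "expected" by the model.
    print(perplexity("As an AI language model, I believe this is a nuanced issue."))
    print(perplexity("eh, my internal GPT detector went off, fwiw."))

Template-ish, essay-shaped prose tends to score lower than quirky human comments, which is roughly what people's "internal GPT detector" seems to be picking up on.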
The odds that LLMs are being used to produce content on HN are approaching 100%.
The odds that LLMs are being trained on, or queried against, data scraped from HN or HNSearch are even closer to 100%.
I know you don't like the "LLMs are allowed..." part, but they're here and they literally cannot be gotten rid of. However, this rule:
> As soon as possible, people should be made aware if they are interacting with, or their activity is being seen by, a LLM. Consider using line prefixes, channel topics, or channel entry messages.
could be something that is strongly encouraged and helpful, and the “good” LLM users might actually follow it.
Mostly it's just a formalization of the established status quo. But the changes re: allowing training on chat logs have caused some unintended consequences.
For one, the classic IRC MegaHAL bots, which have been around for decades, are now technically not allowed unless you get permission from Libera staff (and the channel ops). They are Markov chains that continuously train on channel contents as they operate (a toy sketch of the idea is below).
But hopefully, as in the past, the Libera staffers will enforce the spirit of the rules intelligently and avoid silly situations like the above, caused by imprecise language.
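For context on why that distinction matters: a MegaHAL-style channel bot is roughly this simple. This is a toy sketch, not MegaHAL's actual code (IIRC the real thing also runs a backward chain from a keyword and scores candidate replies); "training" here is just counting which word follows which in the lines it sees.

    # Toy sketch of a channel-trained Markov bot (not actual MegaHAL code).
    import random
    from collections import defaultdict

    class MarkovBot:
        def __init__(self):
            # For each word, count the words observed to follow it.
            self.transitions = defaultdict(lambda: defaultdict(int))

        def learn(self, line: str) -> None:
            # "Training" is just counting adjacent word pairs in channel lines.
            words = line.lower().split()
            for a, b in zip(words, words[1:]):
                self.transitions[a][b] += 1

        def speak(self, seed: str, length: int = 12) -> str:
            # Walk the chain, choosing each next word by observed frequency.
            word, out = seed, [seed]
            for _ in range(length):
                followers = self.transitions.get(word)
                if not followers:
                    break
                word = random.choices(list(followers), weights=list(followers.values()))[0]
                out.append(word)
            return " ".join(out)

    bot = MarkovBot()
    bot.learn("the bots train on channel contents as they operate")
    bot.learn("the bots are markov chains not large language models")
    print(bot.speak("the"))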
By its wording, the policy is specifically about training LLMs. A classic Markov chain may be a language model, but it’s not a large language model. The same rules might not apply.
Yeah, you'd think, but this one was run by the staff in #libera the other night after the announcement, and it sounded like they believed Markovs technically counted. But I imagine that as long as no one is rocking the boat, they'll be left alone. Perhaps there was some misunderstanding on my part.