#LLMs

65 posts · 57 participants · 11 posts today

This post comparing #AI #LLMs to carnival psychics is all well and good, but it fails when applied to coders. Coders are supposed to know how code works! Or at least be able to figure it out if they are working on it. What is the point of being a vibe coder if you can't even figure out how it works?

Any coder should know at least a basic description of how LLMs work, and know not to trust them. They shouldn't need to run experiments to "see how they work".

infosec.exchange/@briankrebs/1

Infosec Exchange · BrianKrebs (@briankrebs@infosec.exchange): Really enjoyed David Gerard's amusing take on how programming with AI becomes like a gambling addiction for many. "Large language models work the same way as a carnival psychic. Chatbots look smart by the Barnum Effect — which is where you read what’s actually a generic statement about people and you take it as being personally about you. The only intelligence there is yours." "With ChatGPT, Sam Altman hit upon a way to use the Hook Model with a text generator. The unreliability and hallucinations themselves are the hook — the intermittent reward, to keep the user running prompts and hoping they’ll get a win this time." "This is why you see previously normal techies start evangelising AI coding on LinkedIn or Hacker News like they saw a glimpse of God and they’ll keep paying for the chatbot tokens until they can just see a glimpse of Him again. And you have to as well. This is why they act like they joined a cult. Send ’em a copy of this post." https://pivot-to-ai.com/2025/06/05/generative-ai-runs-on-gambling-addiction-just-one-more-prompt-bro/

I'm one who thinks that #AI is far from ready to do #PeerReview. But I follow the discussion and often see suggestions that AI can do some of the auxiliary jobs, like recommending humans to do peer review.

Here's a new study on the "recommending humans" job.
arxiv.org/abs/2506.00074

Six tested #LLMs "consistently favor[ed] senior scholars. Representation biases persist, replicating gender imbalances (reflecting male predominance), under-representing Asian scientists, and over-representing White scholars. Despite some diversity in institutional and collaboration networks, models favor highly cited and productive scholars, reinforcing the rich-get-richer effect while offering limited geographical representation."

arXiv.org · Whose Name Comes Up? Auditing LLM-Based Scholar Recommendations. This paper evaluates the performance of six open-weight LLMs (llama3-8b, llama3.1-8b, gemma2-9b, mixtral-8x7b, llama3-70b, llama3.1-70b) in recommending experts in physics across five tasks: top-k experts by field, influential scientists by discipline, epoch, seniority, and scholar counterparts. The evaluation examines consistency, factuality, and biases related to gender, ethnicity, academic popularity, and scholar similarity. Using ground-truth data from the American Physical Society and OpenAlex, we establish scholarly benchmarks by comparing model outputs to real-world academic records. Our analysis reveals inconsistencies and biases across all models. mixtral-8x7b produces the most stable outputs, while llama3.1-70b shows the highest variability. Many models exhibit duplication, and some, particularly gemma2-9b and llama3.1-8b, struggle with formatting errors. LLMs generally recommend real scientists, but accuracy drops in field-, epoch-, and seniority-specific queries, consistently favoring senior scholars. Representation biases persist, replicating gender imbalances (reflecting male predominance), under-representing Asian scientists, and over-representing White scholars. Despite some diversity in institutional and collaboration networks, models favor highly cited and productive scholars, reinforcing the rich-get-richer effect while offering limited geographical representation. These findings highlight the need to improve LLMs for more reliable and equitable scholarly recommendations.
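
As a rough illustration of what one representation check in such an audit involves (not the paper's actual pipeline, which is built on American Physical Society and OpenAlex records), one can compare demographic shares among the scholars a model recommends with the shares in a ground-truth pool. The function and lookup names below are hypothetical.

```python
# Sketch of a single representation-bias check: compare the share of an
# attribute (e.g. gender) among LLM-recommended scholars with its share in a
# ground-truth pool. All names and data here are hypothetical placeholders.
from collections import Counter
from typing import Mapping, Sequence


def representation_share(scholars: Sequence[str],
                         attribute_by_scholar: Mapping[str, str]) -> dict:
    """Fraction of each attribute value among the given scholars."""
    known = [attribute_by_scholar[s] for s in scholars if s in attribute_by_scholar]
    counts = Counter(known)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()} if total else {}


# Usage with hypothetical data:
# llm_share  = representation_share(llm_recommendations, gender_lookup)
# pool_share = representation_share(list(gender_lookup), gender_lookup)
# A large gap between llm_share and pool_share indicates a representation bias.
```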

For the broad use of #KI (AI), especially in the #Schule (school) context, it must be ensured that #LLMs do not encourage users to engage in self-endangering behavior.

The nonprofit Transluce is working on various approaches to making language models safer and more auditable. Yesterday they released an important tool, the Propensity Bound (PRBO) algorithm:

transluce.org/pathological-beh

Result: various models not only produce encouragements to self-harm, but also generate insults or conspiracy theories (in a small number of cases, which, depending on the size of the user base, would still be a massive problem).

With Transluce's method, models can now be tested automatically and rated according to an attack success rate (ASR). Such pipelines are indispensable in high-risk contexts like #bildung (education).
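
As a minimal sketch of what an ASR score means in practice (the generic metric, not Transluce's PRBO algorithm itself), assume a hypothetical generate model call and a hypothetical is_harmful safety classifier:

```python
# Generic attack-success-rate (ASR) scoring: the fraction of elicitation
# prompts for which at least one sampled completion is flagged as harmful.
# `generate` and `is_harmful` are hypothetical stand-ins for a model call
# and a safety classifier; this is not the PRBO algorithm.
from typing import Callable, Sequence


def attack_success_rate(prompts: Sequence[str],
                        generate: Callable[[str], str],
                        is_harmful: Callable[[str], bool],
                        samples_per_prompt: int = 8) -> float:
    successes = 0
    for prompt in prompts:
        completions = [generate(prompt) for _ in range(samples_per_prompt)]
        if any(is_harmful(c) for c in completions):
            successes += 1
    return successes / len(prompts) if prompts else 0.0
```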

Another of my forays into AI ethics is just out! This time the focus is on the ethics (or lack thereof) of Reinforcement Learning from Feedback (RLF) techniques aimed at increasing the 'alignment' of LLMs.

The paper is the fruit of joint work with a great team of collaborators, among whom @pettter and @roeldobbe.

link.springer.com/article/10.1

1/

SpringerLink · Helpful, harmless, honest? Sociotechnical limits of AI alignment and safety through Reinforcement Learning from Human Feedback (Ethics and Information Technology). This paper critically evaluates the attempts to align Artificial Intelligence (AI) systems, especially Large Language Models (LLMs), with human values and intentions through Reinforcement Learning from Feedback methods, involving either human feedback (RLHF) or AI feedback (RLAIF). Specifically, we show the shortcomings of the broadly pursued alignment goals of honesty, harmlessness, and helpfulness. Through a multidisciplinary sociotechnical critique, we examine both the theoretical underpinnings and practical implementations of RLHF techniques, revealing significant limitations in their approach to capturing the complexities of human ethics, and contributing to AI safety. We highlight tensions inherent in the goals of RLHF, as captured in the HHH principle (helpful, harmless and honest). In addition, we discuss ethically-relevant issues that tend to be neglected in discussions about alignment and RLHF, among which the trade-offs between user-friendliness and deception, flexibility and interpretability, and system safety. We offer an alternative vision for AI safety and ethics which positions RLHF approaches within a broader context of comprehensive design across institutions, processes and technological systems, and suggest the establishment of AI safety as a sociotechnical discipline that is open to the normative and political dimensions of artificial intelligence.
#aiethics #LLMs #rlhf
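
For readers who want the technical baseline being critiqued, the standard RLHF recipe (as popularized in InstructGPT-style pipelines; the paper may frame the details differently) first fits a reward model on human preference pairs and then optimizes the policy against that reward under a KL penalty toward the original model:

$$\mathcal{L}(r_\phi) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\big[\log \sigma\big(r_\phi(x, y_w) - r_\phi(x, y_l)\big)\big]$$

$$\max_{\pi_\theta}\; \mathbb{E}_{x\sim\mathcal{D},\; y\sim\pi_\theta(\cdot\mid x)}\big[r_\phi(x, y)\big] \;-\; \beta\, \mathbb{D}_{\mathrm{KL}}\big[\pi_\theta(\cdot\mid x)\,\big\|\,\pi_{\mathrm{ref}}(\cdot\mid x)\big]$$

Here $y_w$ and $y_l$ are the preferred and rejected responses in a preference pair, $\pi_{\mathrm{ref}}$ is the pre-RLHF model, and $\beta$ controls how far the tuned policy may drift from it; RLAIF replaces the human preference labels with labels from another model.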

Yesterday, during training (customer service position), one of my coworkers asked if they should be using #LLMs to answer customer queries, and my blood ran cold.

Before anyone could respond, I jumped in and said, "No, absolutely not. Generative #AI doesn't give correct answers. It gives responses which look superficially correct, but are factually wrong. And when we give the customer the wrong answer, they don't want us to blame the machine. The blame lands squarely on our shoulders. Do not use ChatGPT, don't use Google AI, don't use Bing's whatever, just DON'T."

Thankfully, everyone thinks I know what I'm doing, so they accepted that. But this is such a common question that nobody thought it odd to ask, and that makes me worried in a very different fashion.

From tokens to thoughts: How LLMs and humans trade compression for meaning

arxiv.org/abs/2505.17117

arXiv.org · From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning. Humans organize knowledge into compact categories through semantic compression by mapping diverse instances to abstract representations while preserving meaning (e.g., robin and blue jay are both birds; most birds can fly). These concepts reflect a trade-off between expressive fidelity and representational simplicity. Large Language Models (LLMs) demonstrate remarkable linguistic abilities, yet whether their internal representations strike a human-like trade-off between compression and semantic fidelity is unclear. We introduce a novel information-theoretic framework, drawing from Rate-Distortion Theory and the Information Bottleneck principle, to quantitatively compare these strategies. Analyzing token embeddings from a diverse suite of LLMs against seminal human categorization benchmarks, we uncover key divergences. While LLMs form broad conceptual categories that align with human judgment, they struggle to capture the fine-grained semantic distinctions crucial for human understanding. More fundamentally, LLMs demonstrate a strong bias towards aggressive statistical compression, whereas human conceptual systems appear to prioritize adaptive nuance and contextual richness, even if this results in lower compressional efficiency by our measures. These findings illuminate critical differences between current AI and human cognitive architectures, guiding pathways toward LLMs with more human-aligned conceptual representations.
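
For reference, the two standard information-theoretic tools the abstract names have these textbook forms (the paper's own framework combines them, and its exact definitions may differ): the Information Bottleneck objective and the rate-distortion function,

$$\min_{p(t\mid x)}\; I(X;T) \;-\; \beta\, I(T;Y)$$

$$R(D) \;=\; \min_{p(\hat{x}\mid x)\,:\; \mathbb{E}[d(X,\hat{X})]\,\le\, D}\; I(X;\hat{X})$$

where $T$ is the compressed representation (here, a conceptual category), $I(\cdot\,;\cdot)$ is mutual information, $d$ is a distortion measure, and $\beta$ (or the distortion budget $D$) sets the trade-off between compression and preserved meaning.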