#rlhf

Dimitri Coelho Mollo<p>Another of my forays into AI ethics is just out! This time the focus is on the ethics (or lack thereof) of Reinforcement Learning Feedback (RLF) techniques aimed at increasing the 'alignment' of LLMs.</p><p>The paper is the fruit of the joint work of a great team of collaborators, among whom <span class="h-card" translate="no"><a href="https://social.accum.se/@pettter" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>pettter</span></a></span> and <span class="h-card" translate="no"><a href="https://akademienl.social/@roeldobbe" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>roeldobbe</span></a></span>.</p><p><a href="https://link.springer.com/article/10.1007/s10676-025-09837-2" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">link.springer.com/article/10.1</span><span class="invisible">007/s10676-025-09837-2</span></a></p><p>1/</p><p><a href="https://social.sunet.se/tags/aiethics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>aiethics</span></a> <a href="https://social.sunet.se/tags/LLMs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLMs</span></a> <a href="https://social.sunet.se/tags/rlhf" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>rlhf</span></a> <a href="https://social.sunet.se/tags/llmsafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llmsafety</span></a></p>
Some Bits: Nelson's Linkblog<p>AI sycophancy: How reinforcement learning leads to AIs that act obsequious<br><a href="https://arstechnica.com/information-technology/2025/04/annoyed-chatgpt-users-complain-about-bots-relentlessly-positive-tone/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">arstechnica.com/information-te</span><span class="invisible">chnology/2025/04/annoyed-chatgpt-users-complain-about-bots-relentlessly-positive-tone/</span></a><br> <a href="https://tech.lgbt/tags/training" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>training</span></a> <a href="https://tech.lgbt/tags/chatgpt" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>chatgpt</span></a> <a href="https://tech.lgbt/tags/rlhf" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>rlhf</span></a> <a href="https://tech.lgbt/tags/llm" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llm</span></a> <a href="https://tech.lgbt/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a> #+</p>
M@<p>🤖 NEW: February 2025 Machine Intelligence Reading List!</p><p>This month explores the concept of "gradual disempowerment" - how incremental AI advances could silently erode human agency without requiring a dramatic "takeover" scenario. Also featuring: frame-dependent agency theory, RLHF advancements, and practical insights on integrating LLMs into professional workflows. </p><p>Read more: <a href="https://quantumfaxmachine.com/blog/qfm053-machine-intelligence-reading-list-february-2025" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">quantumfaxmachine.com/blog/qfm</span><span class="invisible">053-machine-intelligence-reading-list-february-2025</span></a></p><p><a href="https://masto.ai/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://masto.ai/tags/MachineLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MachineLearning</span></a> <a href="https://masto.ai/tags/RLHF" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RLHF</span></a> <a href="https://masto.ai/tags/TechTrends" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechTrends</span></a> <a href="https://masto.ai/tags/QuantumFaxMachine" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>QuantumFaxMachine</span></a></p>
Dr Rockstar ♫ ㉆<p>Ain't too proud to beg! <br>sweet darlin'</p><p>Please don't leave me baby!</p><p><a href="https://gofund.me/186ee140" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">gofund.me/186ee140</span><span class="invisible"></span></a></p><p><a href="https://social.vivaldi.net/tags/airesearch" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>airesearch</span></a> <a href="https://social.vivaldi.net/tags/rlhf" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>rlhf</span></a> <a href="https://social.vivaldi.net/tags/ml" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ml</span></a> <a href="https://social.vivaldi.net/tags/DeepLearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DeepLearning</span></a> <a href="https://social.vivaldi.net/tags/datascience" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>datascience</span></a> <a href="https://social.vivaldi.net/tags/nlp" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>nlp</span></a> <a href="https://social.vivaldi.net/tags/guitarGear" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>guitarGear</span></a> <a href="https://social.vivaldi.net/tags/musictheory" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>musictheory</span></a> <a href="https://social.vivaldi.net/tags/Research" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Research</span></a></p>
Thomas<p>As a reminder: don't let LLMs handle anything in the political sphere unless you have RLHF (Reinforcement Learning from Human Feedback) active before you show the result to anyone*. Also think of automation risks and human factors (HF). That's "Good Old Systems Safety".</p><p>*) ... or unless your goal is to damage a 3rd party's reputation (fake news style).</p><p><a href="https://mas.to/tags/llm" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llm</span></a> <a href="https://mas.to/tags/ai" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ai</span></a> <a href="https://mas.to/tags/rlhf" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>rlhf</span></a> <a href="https://mas.to/tags/automationrisks" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>automationrisks</span></a> <a href="https://mas.to/tags/SystemsSafety" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>SystemsSafety</span></a> </p><p><a href="https://www.theregister.com/2024/12/20/apple_ai_headline_summaries/?td=rt-3a" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">theregister.com/2024/12/20/app</span><span class="invisible">le_ai_headline_summaries/?td=rt-3a</span></a></p>
Leshem Choshen<p>Human feedback is critical for aligning LLMs, so why don’t we collect it in the open ecosystem?🧐<br>We (15 orgs) gathered the key issues and next steps.<br>Envisioning<br>a community-driven feedback platform, like Wikipedia</p><p><a href="https://alphaxiv.org/abs/2408.16961" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">alphaxiv.org/abs/2408.16961</span><span class="invisible"></span></a><br>🧵<br><a href="https://sigmoid.social/tags/machinelearning" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>machinelearning</span></a> <a href="https://sigmoid.social/tags/RLHF" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RLHF</span></a> <a href="https://sigmoid.social/tags/hci" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>hci</span></a> <a href="https://sigmoid.social/tags/ethics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ethics</span></a> <a href="https://sigmoid.social/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> <a href="https://sigmoid.social/tags/ml" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ml</span></a> <a href="https://sigmoid.social/tags/NLP" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NLP</span></a> <a href="https://sigmoid.social/tags/NLProc" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NLProc</span></a></p>
Ulrich Junker<p><span class="h-card" translate="no"><a href="https://mastodon.online/@parismarx" class="u-url mention" rel="nofollow noopener" target="_blank">@<span>parismarx</span></a></span> and well-known AI researchers are leaving OpenAI or have already left. Which of the authors of the original <a href="https://fediscience.org/tags/RLHF" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RLHF</span></a> paper are still there?</p>
@pettter@social.accum.se<p>Do you have Thoughts(tm) on <a href="https://mastodon.acc.umu.se/tags/RLHF" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RLHF</span></a> and its use to fine-tune LLMs, or Opinions(tm) about how the effects of this are hyped up or dismissed? Perhaps you have a Cool Case Study of how it actually shakes out in practice? Come discuss and explore at our workshop in Malmö in June: RLHF (huh) What is it good for?</p><p><a href="https://rlhf-huh-wiigf.github.io/" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">rlhf-huh-wiigf.github.io/</span><span class="invisible"></span></a></p>