Harald Sack<p>Interesting (short) paper on game-based training and evaluation of agentic behaviour in LLMs: Leon Guertler, Bobby Cheng, Simon Yu, Bo Liu, Leshem Choshen, Cheston Tan: "TextArena"</p><p><a href="https://arxiv.org/html/2504.11442v1" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">arxiv.org/html/2504.11442v1</span><span class="invisible"></span></a></p><p><a href="https://sigmoid.social/tags/llms" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>llms</span></a> <a href="https://sigmoid.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://sigmoid.social/tags/generativeai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>generativeai</span></a> <a href="https://sigmoid.social/tags/agents" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>agents</span></a> <a href="https://sigmoid.social/tags/agenticAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>agenticAI</span></a> <a href="https://sigmoid.social/tags/evaluation" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>evaluation</span></a></p>