<p>VLMs don't actually "see": due to bias, they rely on memorized knowledge (e.g. that the Adidas logo has 3 stripes and that a dog has 4 legs) instead of visual analysis. <a href="https://vlmsarebiased.github.io/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">vlmsarebiased.github.io/</span><span class="invisible"></span></a></p><p><a href="https://techhub.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://techhub.social/tags/GenAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>GenAI</span></a> <a href="https://techhub.social/tags/LLM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLM</span></a> <a href="https://techhub.social/tags/VLM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>VLM</span></a></p>