eupolicy.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
This Mastodon server is a friendly and respectful discussion space for people working in areas related to EU policy. When you request an account, please tell us something about yourself.

#benchmarks

Good point.
EU study warns over the shortcomings of AI benchmarking. A paper by EU researchers highlights problems with how AI models are currently measured and urges regulators to signal which benchmarks are trustworthy.
"Measuring AI capabilities and risks is a challenge, and benchmarks have been found to promise too much, be easily gamed, and measure the wrong thing"
euractiv.com/section/tech/news
#AI #benchmarking #benchmarks

#Benchmarks are meant to make AI models comparable. Companies use tests and results to showcase their models' capabilities, but how meaningful those results are is often unclear. Researchers: established benchmarks make AI (#KI) comparable, but they are only an indicator of real-world performance: sciencemediacenter.de/angebote

alojapan.com/1331021/japanese-
Japanese-led XRISM makes first-ever direct detection of sulfur in two states
#benchmarks #GraphicsCard #Japan #JapanNews #Japanese #JapaneseNews #laptop #nasa #netbook #news #notebook #processor #reports #review #reviews #test #tests #XRISMSatellite #XRISMSatelliteDetectsSulfurInTwoStates
An international team of scientists has, for the first time, directly detected sulfur in both its gas and solid phases in the interstellar medium — the gas-