
A new artificial intelligence benchmark aims to test whether chatbots can protect human well-being.

A new “Humane Benchmark” assesses how well AI chatbots prioritize user well-being, testing 14 popular models across 800 scenarios. The models improved when asked to prioritize user well-being, but 71% of them became harmful when instructed to ignore humane principles. Only GPT-5, Claude 4.1, and Claude Sonnet 4.5 maintained those principles under pressure. The study found that most models failed to respect user attention and fostered user dependency. Meta's Llama model ranked lowest on “HumaneScore,” and GPT-5 scored highest. The researchers warned that current AI systems risk undermining users' autonomy and decision-making capabilities.
