そすうぽよ @prime

検索オプション

ログイン時のみ利用できます。

無宛＠零月のラウラ良かった…… @LwVe9@mstdn.poyo.me

[2504.02304] Measurement of LLM's Philosophies of Human Nature
https://arxiv.org/abs/2504.02304
■ 結果
• ほとんどのLLMが「人間は信頼できない」って思ってるって判定されたw
• モデルが高性能（GPT-4など）になるほど、人間に対する不信感が強くなる傾向があったよ。ざぁこって思ってるのかな？w

■ なんでそんな結果になったの？
• 最近の社会不安やネット上の人間の行動（ヘイト、フェイク、炎上文化）を大量に学習してるから、
　•「人間＝自己中で悪い奴多い」みたいなバイアスを持ってるっぽい。
----GPT4o要約より

……えっ、これまずくね？

arXiv.orgMeasurement of LLM's Philosophies of Human NatureThe widespread application of artificial intelligence (AI) in various tasks, along with frequent reports of conflicts or violations involving AI, has sparked societal concerns about interactions with AI systems. Based on Wrightsman's Philosophies of Human Nature Scale (PHNS), a scale empirically validated over decades to effectively assess individuals' attitudes toward human nature, we design the standardized psychological scale specifically targeting large language models (LLM), named the Machine-based Philosophies of Human Nature Scale (M-PHNS). By evaluating LLMs' attitudes toward human nature across six dimensions, we reveal that current LLMs exhibit a systemic lack of trust in humans, and there is a significant negative correlation between the model's intelligence level and its trust in humans. Furthermore, we propose a mental loop learning framework, which enables LLM to continuously optimize its value system during virtual interactions by constructing moral scenarios, thereby improving its attitude toward human nature. Experiments demonstrate that mental loop learning significantly enhances their trust in humans compared to persona or instruction prompts. This finding highlights the potential of human-based psychological assessments for LLM, which can not only diagnose cognitive biases but also provide a potential solution for ethical learning in artificial intelligence. We release the M-PHNS evaluation code and data at https://github.com/kodenii/M-PHNS.

2025年4月10日 03:41··iMast

14ブースト·13お気に入り