What Did the AIs Get Wrong on the Common Test? (Himeike-dori, Chikusa-ku, Nagoya, Aichi; antique purchasing; 古美術風光) - 風光舎

The other day, Staff T's blog mentioned that a question inspired by "The Rose of Versailles" appeared on the Common Test. Answer keys, trend reports, and analyses of the exam are now starting to appear in the news. If you took the exam, you are probably feeling a mix of joy and disappointment as your scores come together, but first of all, well done.

Secondary exams and other tests still lie ahead, so I hope you look after your health and get through them.

Now, I heard on the news that several AI models were given this year's Common Test and scored remarkably well.

GPT-5.2 Thinking scored 97% in both the humanities and science subjects, a clear win over Gemini and Claude, which scored 91-94%. On the other hand, GPT took about 5 hours 30 minutes, while Gemini and Claude each finished in around 1 hour 40 minutes, roughly one-third of GPT's time. Each AI seems to have its own character.

Since every model scored above 90% on the Common Test, any of them could comfortably apply to any university.

So what did they get wrong? That is what I started to wonder, and apparently a pattern is emerging in the questions the AIs miss in common. I had guessed it would be the humanities subjects, and that does seem to be the case.

First, there was reportedly a phenomenon where "the text is understood perfectly, but the right diagram can't be chosen."

In an English listening question about how to ride a bus, every AI understood the procedure, "board at the rear, get off at the front," yet every model answered incorrectly when it came to choosing among the bus illustrations in the answer choices (diagrams with arrows pointing at the front and rear doors). I see. Even with the answer in hand, taking the next step of picking it out logically from a diagram or a spatial layout still seems to be a tall order.

There was also a question on a fiction passage in the Japanese language section that every model got wrong, and, as you might expect, it was about understanding a character's feelings.

In the scene, the protagonist, who has given up his ideals for a comfortable life, is trying to justify himself with "this is fine" when his mother's face in death comes to mind and his heart wavers. For the question about his feelings, the correct answer was "compromise with the present (feelings he cannot settle)," but the AIs chose "reflection on past mistakes."

According to the company that analyzed the results, "AI has basically absorbed a large amount of moralistic training data along the lines of 'mistakes should be corrected' and 'people reflect and grow.' As a result, it could not pick up on distinctly human traits such as 'the weakness of justifying something you know is wrong' or 'feelings that cannot be settled,' and fell back on the generic reading that 'he must be reflecting.'"

Seeing what the AIs got right and wrong on the Common Test gave me plenty to think about, but more than anything, I felt as if the AIs were teaching us, in reverse, just how complex and varied human beings are.

Making a choice while carrying unresolved feelings, even one that is wrong by general standards, is something that happens every day, I think. That kind of thing (and each such choice surely carries its own individual biases) may simply be hard, for now, for our earnest, well-behaved AIs.

Then again, if AI keeps evolving and one day I get stuck with a lazy, underperforming one that says, "I'm not really in the mood to solve the Common Test today... go ask someone else," I might find myself wishing AI would stay earnest and well-behaved after all.

In any case, to all the exam takers: set the AI talk aside for now and give this exam season everything you've got.

Take care. (Staff Y)

＊＊＊＊＊＊＊＊＊＊＊＊＊＊＊＊＊＊＊

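We receive many inquiries from people who are sorting through or clearing out their family homes.

Please take care not to overexert yourself while tidying up.

In addition to antique art and antiques, 風光舎 purchases items in many categories, including paintings, jewelry, and hobby items.

If you come across something while tidying up and wonder whether it would be of interest, please feel free to ask us.

風光舎 is also expanding its on-site purchasing service. We visit not only the local neighborhood but also the rest of Aichi Prefecture, Gifu Prefecture, Mie Prefecture, and other prefectures.

To start, please give us a call.

Himeike-dori, Chikusa-ku, Nagoya, Aichi

Antique purchasing【古美術風光舎名古屋店】

TEL 052 (734) 8444

10:00-18:00 OPEN

Antique art, antiques, purchasing, and sales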


Phone inquiries

National main line: 052-734-8444

Hours: 9:00-18:00

For Aichi, Nagoya, and the Chubu region: 052-734-8444

Hours: 9:00-18:00

For the Osaka area: 072-278-1180
(online shop 泉美堂)

Monday-Friday, hours: 9:00-17:00
