<p>Against <a href="/tags/gpt/" rel="tag">#GPT</a>-5, Google's best strategy may be to release <a href="/tags/gemini/" rel="tag">#Gemini</a>-2.6 and edge just past it.</p>
<p>The apparatus of a large language model really is remarkable. It takes in billions of pages of writing and figures out the configuration of words that will delight me just enough to feed it another prompt. There’s nothing else like it.<br></p>From ChatGPT Is a Gimmick,<br><a href="https://hedgehogreview.com/web-features/thr/posts/chatgpt-is-a-gimmick" rel="nofollow" class="ellipsis" title="hedgehogreview.com/web-features/thr/posts/chatgpt-is-a-gimmick"><span class="invisible">https://</span><span class="ellipsis">hedgehogreview.com/web-feature</span><span class="invisible">s/thr/posts/chatgpt-is-a-gimmick</span></a><br><br><a href="/tags/ai/" rel="tag">#AI</a> <a href="/tags/genai/" rel="tag">#GenAI</a> <a href="/tags/generativeai/" rel="tag">#GenerativeAI</a> <a href="/tags/llm/" rel="tag">#LLM</a> <a href="/tags/chatgpt/" rel="tag">#ChatGPT</a> <a href="/tags/gpt/" rel="tag">#GPT</a> <a href="/tags/gimmick/" rel="tag">#gimmick</a><br>
<p>In the <a href="/tags/gpt/" rel="tag">#GPT</a> tokenizer vocabulary, more than 23% of long Chinese tokens are either <a href="/tags/成人内容/" rel="tag">#成人内容</a> (adult content) or online gambling.<br><a href="https://www.solidot.org/story?sid=82247" rel="nofollow" class="ellipsis" title="www.solidot.org/story?sid=82247"><span class="invisible">https://</span><span class="ellipsis">www.solidot.org/story?sid=8224</span><span class="invisible">7</span></a></p>
<p><a href="/tags/gpt/" rel="tag">#GPT</a>’s vocabulary behaves the worst: more than 23% long <a href="/tags/chinese/" rel="tag">#Chinese</a> tokens (i.e., a token with more than two Chinese characters) are either <a href="/tags/porn/" rel="tag">#porn</a> or online <a href="/tags/gambling/" rel="tag">#gambling</a>. <br><a href="https://arxiv.org/html/2508.17771v1" rel="nofollow"><span class="invisible">https://</span>arxiv.org/html/2508.17771v1</a></p>
<p>After <a href="/tags/grok/" rel="tag">#Grok</a>-3, <a href="/tags/gpt/" rel="tag">#GPT</a>-4.5 once again confirms that the marginal returns of model scale are diminishing sharply.</p>
<p><a href="/tags/deepseek/" rel="tag">#DeepSeek</a> 还不支持多模态,和 <a href="/tags/gemini/" rel="tag">#Gemini</a> <a href="/tags/gpt/" rel="tag">#GPT</a> 还没法同台竞技。</p>
<p><a href="/tags/gpt/" rel="tag">#GPT</a>-5果然不行,怪不得 <a href="/tags/gemini/" rel="tag">#Gemini</a> 连2.6都懒得出了。</p>
<p>For code-assisted data analysis, <a href="/tags/claude/" rel="tag">#Claude</a> 3.5 Sonnet (New) > <a href="/tags/gpt/" rel="tag">#GPT</a>-4o >>> <a href="/tags/gemini/" rel="tag">#Gemini</a> 1.5 Pro 002.<br>That is probably one reason Gemini can be used for free without limits.<br>The free tiers of Claude and ChatGPT, though, hit their quotas after only two or three analyses.</p>
<p>> How would you explain the following problem to a primary school student who hasn't learned equations? There are four numbers: the first is 1/2 of the sum of the other three, the second is 1/3 of the sum of the other three, the third is 1/4 of the sum of the other three, and the fourth is 26. What is the sum of the four numbers?<br>This <a href="/tags/小学数学题/" rel="tag">#小学数学题</a> (primary-school math problem) stumped <a href="/tags/gpt/" rel="tag">#GPT</a>-4o, <a href="/tags/claude/" rel="tag">#Claude</a> 3.5 Sonnet, and <a href="/tags/gemini/" rel="tag">#Gemini</a> 1.5 Pro 002.<br>Without the "no equations" restriction, they can all basically get it right. Once equations are ruled out, every one of them falls apart and produces wildly different answers.<br>How were those <a href="/tags/数学基准跑分/" rel="tag">#数学基准跑分</a> (math benchmark scores) ever achieved?</p>
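<p>For reference (this reasoning is not from the post): "the first number is 1/2 of the sum of the other three" means the first is 1/3 of the total; likewise the second is 1/4 and the third is 1/5 of the total. The fourth number, 26, is then 1 − 1/3 − 1/4 − 1/5 = 13/60 of the total, so the total is 26 × 60 ÷ 13 = 120. A quick Python check of that arithmetic:</p>
<pre><code># Verify the "no equations" reasoning for the four-number puzzle.
# first  = 1/2 of the other three  ->  first  = 1/3 of the total
# second = 1/3 of the other three  ->  second = 1/4 of the total
# third  = 1/4 of the other three  ->  third  = 1/5 of the total
from fractions import Fraction

fourth_share = 1 - Fraction(1, 3) - Fraction(1, 4) - Fraction(1, 5)  # 13/60 of the total
total = 26 / fourth_share                                            # 120
first, second, third = total / 3, total / 4, total / 5               # 40, 30, 24

assert first == (second + third + 26) / 2
assert second == (first + third + 26) / 3
assert third == (first + second + 26) / 4
print(total, first, second, third)  # 120 40 30 24
</code></pre>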
Regarding the last couple of boosts: among other downsides, LLMs encourage people to take long-term risks for perceived, but not always actual, short-term gains. They bet the long-term value of their education on a chance at short-term grade inflation, or they bet the long-term security and maintainability of their software codebase on a chance at short-term productivity gains. My read is that more and more data is suggesting that these are bad bets for most people.<br><br>In that respect they're very much like gambling. The messianic fantasies some ChatGPT users have been experiencing fit this picture as well.<br><br><a href="/tags/ai/" rel="tag">#AI</a> <a href="/tags/genai/" rel="tag">#GenAI</a> <a href="/tags/generativeai/" rel="tag">#GenerativeAI</a> <a href="/tags/llm/" rel="tag">#LLM</a> <a href="/tags/tech/" rel="tag">#tech</a> <a href="/tags/dev/" rel="tag">#dev</a> <a href="/tags/chatgpt/" rel="tag">#ChatGPT</a> <a href="/tags/gpt/" rel="tag">#GPT</a> <a href="/tags/gemini/" rel="tag">#Gemini</a> <a href="/tags/gamblingaddiction/" rel="tag">#GamblingAddiction</a> <a href="/tags/nihilism/" rel="tag">#nihilism</a><br>
<p><a href="/tags/gemini/" rel="tag">#Gemini</a> 2.5 Pro绝对是跟 <a href="/tags/gpt/" rel="tag">#GPT</a>-4.5、 <a href="/tags/grok/" rel="tag">#Grok</a>-3一个级别的,能领会你的想法,能悟出你没说出来的那些意思。</p>
<p><a href="/tags/gemma/" rel="tag">#Gemma</a>-3n-E4B-it(2025-6-26)在小米14(8Gen3, 16GB RAM)上流畅运行,表现和 <a href="/tags/gpt/" rel="tag">#GPT</a>-4 Turbo(2024-04-09)、<a href="/tags/gemini/" rel="tag">#Gemini</a> 1.5 Pro (2024-2-15)、<a href="/tags/claude/" rel="tag">#Claude</a> 3 Opus(20240229)这些16个月前的旗舰一个级别。<br><a href="https://lmarena.ai/leaderboard/text" rel="nofollow"><span class="invisible">https://</span>lmarena.ai/leaderboard/text</a></p>