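<p>These past two days have been a real eye-opener. This is probably old news to fellow Mastodon users who are already fluent with AI tools, but until now I had only used ChatGPT for general-knowledge Q&amp;A. Over these two days I learned to:<br>- Search for material with Perplexity AI<br>- Feed a PDF to NotebookLM and generate podcast audio from it<br>- Feed the audio to otter.ai to turn it into a transcript that can be played back word by word</p><p>If you know of any other good tools, please recommend them!<br><a href="/tags/长毛象安利大会/" rel="tag">#长毛象安利大会</a> <a href="/tags/长毛象安利交换大会/" rel="tag">#长毛象安利交换大会</a> <a href="/tags/长毛象安利中心/" rel="tag">#长毛象安利中心</a> <a href="/tags/generativeai/" rel="tag">#generativeAI</a> <a href="/tags/ai/" rel="tag">#AI</a> <a href="/tags/perplexity/" rel="tag">#perplexity</a> <a href="/tags/notebooklm/" rel="tag">#notebooklm</a></p>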
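<p>Cloudflare has found that Perplexity's stealth crawlers spoof their User-Agent to crawl other websites and do not honor robots.txt.<br><br>- Cloudflare registered a new domain, configured a robots.txt that disallows all crawlers, and blocked GenAI crawlers on Cloudflare.<br>- When Perplexity was asked for details about a subdomain of this domain, it was still able to give some answers.<br>- Cloudflare found that Perplexity disguised itself with the User-Agent of Chrome on macOS and accessed the domain from IPs in multiple ASNs that are not listed in the IP ranges its documentation declares as its own.<br>- Cloudflare uses machine learning and other techniques to try to identify and block this kind of crawling. Testing showed that when the crawl was blocked, Perplexity switched to other data sources, which indicates the blocking does work.<br>- The article also notes that OpenAI's ChatGPT fetches and obeys robots.txt, respecting site owners' crawler directives.<br>- Cloudflare Bot Management customers can now use the relevant rules to block Perplexity's stealth crawlers.<br><br><a href="https://blog.cloudflare.com/perplexity-is-using-stealth-undeclared-crawlers-to-evade-website-no-crawl-directives/" rel="nofollow">blog.cloudflare.com/~</a><br><br>[Thanks to Pop for the tip.]<br><br><a href="/tags/genai/" rel="tag">#GenAI</a> <a href="/tags/cloudflare/" rel="tag">#Cloudflare</a> <a href="/tags/perplexity/" rel="tag">#Perplexity</a><br><br><a href="https://t.me/outvivid/4741" rel="nofollow">Original Telegram post</a></p>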
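<p>A minimal sketch of why a spoofed User-Agent matters for robots.txt, using Python's standard <code>urllib.robotparser</code>. The robots.txt bodies, the URL, and both user-agent strings below are illustrative assumptions, not the actual values from Cloudflare's test. The check runs on the crawler's side, so it only has an effect if the crawler chooses to perform it and to identify itself honestly.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt bodies, not the actual files from Cloudflare's test.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_NAMED = "User-agent: PerplexityBot\nDisallow: /\n"

# Assumed user-agent strings, for illustration only.
DECLARED_UA = "PerplexityBot"
SPOOFED_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)
URL = "https://example.com/secret-page"  # placeholder URL


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for name, rules in (("block all", BLOCK_ALL), ("block named bot", BLOCK_NAMED)):
        for label, ua in (("declared", DECLARED_UA), ("spoofed", SPOOFED_UA)):
            print(f"{name:16} {label:9} allowed={is_allowed(rules, ua, URL)}")
```

<p>Under the wildcard rule both identities are disallowed, so a crawler that fetches anyway is simply ignoring the file. Under the name-specific rule the browser-like string passes the check, which is one reason server-side detection of the kind described above is needed in addition to robots.txt.</p>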
Am I to understand from this that SearXNG is in the process of becoming AI poisoned?<br><br><p><a href="https://github.com/searxng/searxng/issues/2163" rel="nofollow" class="ellipsis" title="github.com/searxng/searxng/issues/2163"><span class="invisible">https://</span><span class="ellipsis">github.com/searxng/searxng/iss</span><span class="invisible">ues/2163</span></a><br><a href="https://github.com/searxng/searxng/issues/2008" rel="nofollow" class="ellipsis" title="github.com/searxng/searxng/issues/2008"><span class="invisible">https://</span><span class="ellipsis">github.com/searxng/searxng/iss</span><span class="invisible">ues/2008</span></a><br><a href="https://github.com/searxng/searxng/issues/2273" rel="nofollow" class="ellipsis" title="github.com/searxng/searxng/issues/2273"><span class="invisible">https://</span><span class="ellipsis">github.com/searxng/searxng/iss</span><span class="invisible">ues/2273</span></a></p>The last issue has been inactive since 2023, but the first one has seen recent activity and the middle one was active last summer.<br><br><a href="/tags/searx/" rel="tag">#SearX</a> <a href="/tags/searxng/" rel="tag">#SearXNG</a> <a href="/tags/searchengines/" rel="tag">#SearchEngines</a> <a href="/tags/alternatesearchengines/" rel="tag">#AlternateSearchEngines</a> <a href="/tags/metasearchengines/" rel="tag">#MetaSearchEngines</a> <a href="/tags/web/" rel="tag">#web</a> <a href="/tags/dev/" rel="tag">#dev</a> <a href="/tags/tech/" rel="tag">#tech</a> <a href="/tags/foss/" rel="tag">#FOSS</a> <a href="/tags/opensource/" rel="tag">#OpenSource</a> <a href="/tags/ai/" rel="tag">#AI</a> <a href="/tags/aipoisoning/" rel="tag">#AIPoisoning</a> <a href="/tags/aislop/" rel="tag">#AISlop</a> <a href="/tags/genai/" rel="tag">#GenAI</a> <a href="/tags/generativeai/" rel="tag">#GenerativeAI</a> <a href="/tags/llm/" rel="tag">#LLM</a> <a href="/tags/chatgpt/" rel="tag">#ChatGPT</a> <a href="/tags/claude/" rel="tag">#Claude</a> <a 
href="/tags/perplexity/" rel="tag">#Perplexity</a><br>
<p>A large international study coordinated by the <a href="/tags/ebu/" rel="tag">#EBU</a> and led by the <a href="/tags/bbc/" rel="tag">#BBC</a> found that AI assistants misrepresent news content 45% of the time across different languages and platforms, with <a href="/tags/gemini/" rel="tag">#Gemini</a> performing the worst.</p><p>[…] Key findings: </p><p>• 45% of all AI answers had at least one significant issue.<br>• 31% of responses showed serious sourcing problems – missing, misleading, or incorrect attributions.<br>• 20% contained major accuracy issues, including hallucinated details and outdated information.<br>• Gemini performed worst, with significant issues in 76% of responses, more than double the other assistants, largely due to its poor sourcing performance.<br>• A comparison between the BBC’s results from earlier this year and this study shows some improvements but still high levels of errors.</p><p><a href="https://www.bbc.co.uk/mediacentre/2025/new-ebu-research-ai-assistants-news-content" rel="nofollow" class="ellipsis" title="www.bbc.co.uk/mediacentre/2025/new-ebu-research-ai-assistants-news-content"><span class="invisible">https://</span><span class="ellipsis">www.bbc.co.uk/mediacentre/2025</span><span class="invisible">/new-ebu-research-ai-assistants-news-content</span></a></p><p><a href="/tags/aihype/" rel="tag">#aihype</a> <a href="/tags/llm/" rel="tag">#llm</a> <a href="/tags/openai/" rel="tag">#openai</a> <a href="/tags/perplexity/" rel="tag">#perplexity</a> <a href="/tags/chatgpt/" rel="tag">#chatgpt</a></p>