R.A. Fisher wrote that the purpose of statisticians was "constructing a hypothetical infinite population of which the actual data are regarded as constituting a random sample" (p. 311 <a href="https://royalsocietypublishing.org/doi/epdf/10.1098/rsta.1922.0009" rel="nofollow">here</a>). In <a href="https://www.tandfonline.com/doi/abs/10.1080/00031305.1998.10480528" rel="nofollow">The Zeroth Problem</a>, Colin Mallows wrote "As Fisher pointed out, statisticians earn their living by using two basic tricks--they regard data as being realizations of random variables, and they assume that they know an appropriate specification for these random variables."<br><br>Some of the pathological beliefs we attribute to techbros were already present in this view of statistics that started forming over a century ago. Our writing is just data; the real, important object is the “hypothetical infinite population” reflected in a large language model, which at base is a random variable. Stable Diffusion, the image generator, is called that because it is based on <a href="https://proceedings.mlr.press/v37/sohl-dickstein15.pdf" rel="nofollow">latent diffusion models</a>, which are a way of representing complicated distribution functions--the hypothetical infinite populations--of things like digital images. Your art is just data; it’s the latent diffusion model that’s the real deal. The entities that are able to identify the distribution functions (in this case tech companies) are the ones who should be rewarded, not the data generators (you and me).<br><br>So much of the dysfunction in today’s machine learning and AI points to how problematic it is to give statistical methods a privileged place that they don’t merit.
We really ought to be calling out Fisher for his trickery and seeing it as such.<br><br><a href="/tags/ai/" rel="tag">#AI</a> <a href="/tags/genai/" rel="tag">#GenAI</a> <a href="/tags/generativeai/" rel="tag">#GenerativeAI</a> <a href="/tags/llm/" rel="tag">#LLM</a> <a href="/tags/stablediffusion/" rel="tag">#StableDiffusion</a> <a href="/tags/statistics/" rel="tag">#statistics</a> <a href="/tags/statisticalmethods/" rel="tag">#StatisticalMethods</a> <a href="/tags/diffusionmodels/" rel="tag">#DiffusionModels</a> <a href="/tags/machinelearning/" rel="tag">#MachineLearning</a> <a href="/tags/ml/" rel="tag">#ML</a><br>
Edited 137d ago
<p>ON THE HOUR<br><a href="/tags/lispygopherclimate/" rel="tag">#lispyGopherClimate</a> with <span class="h-card"><a href="https://climatejustice.social/@kentpitman" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>kentpitman</span></a></span> <a href="/tags/live/" rel="tag">#live</a> follow-up to last week's two interviews:<br><a href="https://archives.anonradio.net/202511190000_screwtape.mp3" rel="nofollow" class="ellipsis" title="archives.anonradio.net/202511190000_screwtape.mp3"><span class="invisible">https://</span><span class="ellipsis">archives.anonradio.net/2025111</span><span class="invisible">90000_screwtape.mp3</span></a><br>Loose plan:<br>- Ask a new question here<br>- lisp view on unix promoting tool reuse over new <a href="/tags/programming/" rel="tag">#programming</a> - does this relate to lisp's advice?<br>- <a href="/tags/ml/" rel="tag">#ML</a> using symbols (only?).<br>- Hypercard<br>- Juxtaposing assembly and <a href="/tags/lisp/" rel="tag">#lisp</a></p><p><a href="/tags/climatecrisis/" rel="tag">#climateCrisis</a> this week is getting to 29C here and it's too hot<br>In the second half I would like to cool down, also featuring <span class="h-card"><a href="https://mastodon.linkerror.com/@jns" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>jns</span></a></span> <span class="h-card"><a href="https://mastodon.sdf.org/@Cat" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>Cat</span></a></span> <span class="h-card"><a href="https://fe.disroot.org/users/ramin_hal9001" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>ramin_hal9001</span></a></span> <br><a href="/tags/lambdamoo/" rel="tag">#lambdaMOO</a> <a href="/tags/vr/" rel="tag">#VR</a></p>
Edited 138d ago
<p>The present perspective outlines how epistemically baseless and ethically pernicious paradigms are recycled back into the scientific literature via machine learning (ML) and explores connections between these two dimensions of failure. We hold up the renewed emergence of physiognomic methods, facilitated by ML, as a case study in the harmful repercussions of ML-laundered junk science. A summary and analysis of several such studies is delivered, with attention to the means by which unsound research lends itself to social harms. We explore some of the many factors contributing to poor practice in applied ML. In conclusion, we offer resources for research best practices to developers and practitioners.<br></p>From "The reanimation of pseudoscience in machine learning and its ethical repercussions", available here: <a href="https://www.cell.com/patterns/fulltext/S2666-3899(24)00160-0" rel="nofollow" class="ellipsis" title="www.cell.com/patterns/fulltext/S2666-3899(24)00160-0"><span class="invisible">https://</span><span class="ellipsis">www.cell.com/patterns/fulltext</span><span class="invisible">/S2666-3899(24)00160-0</span></a>. It's open access.<br><br>In other words, ML--which includes generative AI--is smuggling long-disgraced pseudoscientific ideas back into "respectable" science, and rejuvenating the harms such ideas cause.<br><br><a href="/tags/ai/" rel="tag">#AI</a> <a href="/tags/genai/" rel="tag">#GenAI</a> <a href="/tags/generativeai/" rel="tag">#GenerativeAI</a> <a href="/tags/llms/" rel="tag">#LLMs</a> <a href="/tags/machinelearning/" rel="tag">#MachineLearning</a> <a href="/tags/ml/" rel="tag">#ML</a> <a href="/tags/aiethics/" rel="tag">#AIEthics</a> <a href="/tags/science/" rel="tag">#science</a> <a href="/tags/pseudoscience/" rel="tag">#pseudoscience</a> <a href="/tags/junkscience/" rel="tag">#JunkScience</a> <a href="/tags/eugenics/" rel="tag">#eugenics</a> <a href="/tags/physiognomy/" rel="tag">#physiognomy</a><br>
Edited 127d ago
<p>Can someone clarify: in academia and industry, are LLM hallucinations considered the result of overfitting, or simply false positives?</p><p>I'm beginning to think that hallucinations are evidence of overfitting. It seems surprising that there are so few attempts to articulate the underlying cause of hallucinations. Also, if the issue is overfitting, then increasing training time and dataset size may not be an appropriate solution to the problem of hallucinations.</p><p><a href="/tags/ai/" rel="tag">#AI</a> <a href="/tags/ml/" rel="tag">#ML</a> <a href="/tags/llm/" rel="tag">#llm</a></p>
<p>Well, my book on TDDA has become slightly more real:</p><p>It’s not expected to be available until April, but you can see it on the publisher’s website at</p><p><a href="https://www.routledge.com/Test-Driven-Data-Analysis/Radcliffe/p/book/9781032897158" rel="nofollow" class="ellipsis" title="www.routledge.com/Test-Driven-Data-Analysis/Radcliffe/p/book/9781032897158"><span class="invisible">https://</span><span class="ellipsis">www.routledge.com/Test-Driven-</span><span class="invisible">Data-Analysis/Radcliffe/p/book/9781032897158</span></a></p><p>Although the publisher won’t let you pre-order till the end of March, the paper copy is listed on Blackwells and Waterstones:</p><p><a href="https://blackwells.co.uk/bookshop/product/Test-Driven-Data-Analysis-by-Nicholas-J-Radcliffe/9781032897158" rel="nofollow" class="ellipsis" title="blackwells.co.uk/bookshop/product/Test-Driven-Data-Analysis-by-Nicholas-J-Radcliffe/9781032897158"><span class="invisible">https://</span><span class="ellipsis">blackwells.co.uk/bookshop/prod</span><span class="invisible">uct/Test-Driven-Data-Analysis-by-Nicholas-J-Radcliffe/9781032897158</span></a></p><p><a href="https://www.waterstones.com/book/test-driven-data-analysis/nicholas-j-radcliffe/9781032897158" rel="nofollow" class="ellipsis" title="www.waterstones.com/book/test-driven-data-analysis/nicholas-j-radcliffe/9781032897158"><span class="invisible">https://</span><span class="ellipsis">www.waterstones.com/book/test-</span><span class="invisible">driven-data-analysis/nicholas-j-radcliffe/9781032897158</span></a></p><p>and Amazon will let you pre-order paper or Kindle copies.</p><p><a href="/tags/tdda/" rel="tag">#TDDA</a> <a href="/tags/books/" rel="tag">#books</a> <a href="/tags/data/" rel="tag">#data</a> <a href="/tags/analysis/" rel="tag">#analysis</a> <a href="/tags/testing/" rel="tag">#testing</a> <a href="/tags/datascience/" rel="tag">#datascience</a> <a href="/tags/quality/" rel="tag">#quality</a> <a href="/tags/ai/" 
rel="tag">#AI</a> <a href="/tags/ml/" rel="tag">#ML</a></p>
Edited 110d ago
<p>I realise that on the fediverse this is maybe asking for a flaming, but yesterday, out of sheer curiosity, I tried Claude for a simple-ish coding task that I'd been putting off (largely inspired by <span class="h-card"><a href="https://fediscience.org/@hausfath" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>hausfath</span></a></span> 's latest on <a href="/tags/theclimatebrink/" rel="tag">#theclimatebrink</a>). The performance of Claude was seriously impressive. I am convinced the AI cycle is more than hype (and have been for a while). The chatbots have been a huge attention hog, misleadingly so, while the serious work has been done elsewhere. (We are developing ML tools to supplement parts of our climate model workflows.) </p><p>Now I'm wondering if there is any serious EU competition to Anthropic? Mistral's Codestral, perhaps? <br>Because this kind of performance changes everything and we can't afford to lag behind... <br><a href="/tags/aicoding/" rel="tag">#AIcoding</a> <a href="/tags/ml/" rel="tag">#ML</a></p><p>Edit: here is the Climate Brink post I mentioned</p><p><a href="https://www.theclimatebrink.com/p/the-ai-augmented-scientist" rel="nofollow" class="ellipsis" title="www.theclimatebrink.com/p/the-ai-augmented-scientist"><span class="invisible">https://</span><span class="ellipsis">www.theclimatebrink.com/p/the-</span><span class="invisible">ai-augmented-scientist</span></a></p>
Edited 36d ago
<p><a href="/tags/lispygopherclimate/" rel="tag">#lispyGopherClimate</a> <a href="/tags/technology/" rel="tag">#technology</a> <a href="/tags/podcast/" rel="tag">#podcast</a> <a href="/tags/live/" rel="tag">#live</a> <a href="https://toobnix.org/w/5QbQiLw7zrFiETgbv32kJk" rel="nofollow" class="ellipsis" title="toobnix.org/w/5QbQiLw7zrFiETgbv32kJk"><span class="invisible">https://</span><span class="ellipsis">toobnix.org/w/5QbQiLw7zrFiETgb</span><span class="invisible">v32kJk</span></a> first half missing?</p><p><span class="h-card"><a href="https://climatejustice.social/@kentpitman" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>kentpitman</span></a></span> <a href="/tags/climate/" rel="tag">#climate</a> <a href="/tags/haiku/" rel="tag">#haiku</a></p><p>- One week from today is <a href="/tags/lambdamoo/" rel="tag">#LambdaMOO</a>'s annual festival, April Fool's.<br>- Let's have a text based <a href="/tags/mud/" rel="tag">#MUD</a> <a href="/tags/bonkwave/" rel="tag">#bonkwave</a> pool party / concert</p><p>Notably /after/ submitting my <a href="/tags/els2026/" rel="tag">#ELS2026</a> <a href="/tags/commonlisp/" rel="tag">#commonLisp</a> <a href="/tags/deeplearning/" rel="tag">#deepLearning</a> article <a href="https://european-lisp-symposium.org/2026" rel="nofollow" class="ellipsis" title="european-lisp-symposium.org/2026"><span class="invisible">https://</span><span class="ellipsis">european-lisp-symposium.org/20</span><span class="invisible">26</span></a> </p><p>we just had this great thread: <a href="https://gamerplus.org/@screwlisp/116286095082069619" rel="nofollow" class="ellipsis" title="gamerplus.org/@screwlisp/116286095082069619"><span class="invisible">https://</span><span class="ellipsis">gamerplus.org/@screwlisp/11628</span><span class="invisible">6095082069619</span></a> I will read <a href="/tags/bookstodon/" rel="tag">#bookstodon</a> and <a href="/tags/programming/" rel="tag">#programming</a> suggestions by <span 
class="h-card"><a href="https://toot.cat/@riley" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>riley</span></a></span> and others. <a href="/tags/ai/" rel="tag">#ai</a> <a href="/tags/dl/" rel="tag">#DL</a> <a href="/tags/ml/" rel="tag">#ML</a></p>
Edited 12d ago