#bayes

I got an email from the author promoting this benchmark comparison of #Julialang + StanBlocks + #Enzyme vs #Stan runtimes.

StanBlocks is a macro package for Julia that mimics the structure of a Stan program. This is the first I've heard about it.

A considerable number of these models, maybe even most of them, run faster in Julia than in Stan.

nsiccha.github.io/StanBlocks.j

nsiccha.github.io · StanBlocks.jl - Julia vs Stan performance comparison

„calling something logic doesn’t make it so. Calling someone rational doesn’t make it so“

I’ve been thinking for a while that, as someone who works on human rationality and rational argument, I should write a blog post on what that actually means (and, maybe more importantly, doesn’t mean).

in the meantime, though, I found much to agree with in this piece:

Title: The magical thinking of guys who love logic
theoutline.com/post/7083/the-m

The Outline · The magical thinking of guys who love logic: Why so many men online love to use “logic” to win an argument, and then disappear before they can find out they're wrong.
Replied in thread

@AeonCypher @paninid

"A p-value is an #estimate of p(Data | Null Hypothesis). " – not correct. A p-value is an estimate of

p(Data or other imagined data | Null Hypothesis)

so not even just of the actual data you have. Which is why p-values depend on your stopping rule (and do not satisfy the "likelihood principle"). In this regard, see Jeffreys's quote below.

Imagine you design an experiment this way: "I'll test 10 subjects, and in the meantime I apply for a grant. At the time the 10th subject is tested, I'll know my application's outcome. If the outcome is positive, I'll test 10 more subjects; if it isn't, I'll stop". Not an unrealistic situation.

With this stopping rule, your p-value will depend on the probability that you get the grant. This is not a joke.
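For what it's worth, here is a rough numerical sketch of that dependence, with made-up numbers that are not from the thread: a binomial experiment with H0: θ = 0.5, 9 successes observed among the first 10 subjects, and "at least as extreme" judged by the final success proportion over however many subjects the design ends up testing.

```python
# Hypothetical illustration: p-value under the "grant" stopping rule.
# With probability q the design would test 10 more subjects, so the imagined
# data over which "at least as extreme" is computed mix two sample sizes.
from scipy.stats import binom

obs_successes, n_small, n_large = 9, 10, 20   # made-up numbers
obs_prop = obs_successes / n_small            # observed success proportion

def p_value(q):
    """Tail probability of a final proportion >= obs_prop under H0: theta = 0.5."""
    tail_small = binom.sf(obs_successes - 1, n_small, 0.5)               # P(X >= 9  | n = 10)
    tail_large = binom.sf(round(obs_prop * n_large) - 1, n_large, 0.5)   # P(X >= 18 | n = 20)
    return (1 - q) * tail_small + q * tail_large

for q in (0.0, 0.5, 0.9):
    print(f"P(grant) = {q:.1f}  ->  p-value = {p_value(q):.4f}")
```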

"*What the use of P implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred.* This seems a remarkable procedure. On the face of it the fact that such results have not occurred might more reasonably be taken as evidence for the law, not against it." – H. Jeffreys, "Theory of Probability" § VII.7.2 (emphasis in the original) <doi.org/10.1093/oso/9780198503>.

OUP Academic · Theory of Probability – Abstract: Jeffreys' Theory of Probability, first published in 1939, was the first attempt to develop a fundamental theory of scientific inference based on …
Replied in thread

@ctesta Hi Christian, this topic has always fascinated me, and its history seems always to be incomplete; so efforts and essays like this are a good thing :)

There's one person that I'd like to see cited more often in essays of this kind: *James Clerk Maxwell*. He seemed to have a very clear distinction in mind about probability and statistics, which one can gather from some of his letters and papers. He was after all one of the founders of the statistical method in physics.

See for instance "The Life of James Clerk Maxwell" (archive.org/details/lifeofjame), especially the passage on p. 143:

"They say that Understanding ought to work by the rules of right reason. These rules are, or ought to be, contained in Logic; but the actual science of Logic is conversant at present only with things either certain, impossible, or *entirely* doubtful, none of which (fortunately) we have to reason on. Therefore the true Logic for this world is the Calculus of Probabilities, which takes account of the magnitude of the probability (which is, or which ought to be in a reasonable man's mind). [...] When the probability (there is no better word found) in a man's mind of a certain proposition being true is greater than that of its being false, he believes it with a proportion of faith corresponding to the probability, and this probability may be increased or diminished by new facts."

And "Molecules" (doi.org/10.1038/008437a0), especially around p. 440:

"The modern atomists have therefore adopted a method which is I believe new in the department of mathematical physics, though it has long been in use in the Section of Statistics. When the working members of Section F get hold of a Report of the Census, or any other document containing the numerical data of Economic and Social Science, they begin by distributing the whole population into groups, according to age, income-tax, education, religious belief, or criminal convictions. The number of individuals is far too great to allow of their tracing the history of each separately, so that, in order to reduce their labour within human limits, they concentrate their attention on small number of artificial groups. The varying number of individuals in each group, and not the varying state of each individual, is the primary datum from which they work."

The distinction that is made here, in more modern terms, could be stated like this: Probability theory is the theory that describes and norms the quantification and propagation of uncertainty. Statistics is the study of collective properties of the variates of populations or, more generally, of collections of data.

From a Bayesian point of view such a distinction makes a lot of sense: one discipline studies the degrees of beliefs of agents; the other studies objective, measurable properties of collections. Of course the two go hand in hand because measured collective properties can change an agent's beliefs, and the agent must have beliefs on unknown collective properties.

This is my strictly personal take – everyone seems to have their own, and I expect few to agree with mine. But pedagogically I've found it very useful. The students seem to find it very clear and helpful (pglpm.github.io/ADA511/statist).

Internet Archive · The Life of James Clerk Maxwell, with a selection from his correspondence and occasional writings and a sketch of his contributions to science (Campbell, Lewis, 1830-1908)

@mzloteanu
Just to offer a different perspective on the stackexchange question: one could calculate the mutual information between the two variates, including an assessment of how much it could differ from the value one would obtain with an infinite sample size. Its advantage is that it's basically model-free.

In fact, a full nonparametric analysis of the underlying "full-population distribution" and its uncertainty owing to the finite sample could also be made. That would give full and model-free information about the problem.
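As a minimal sketch of the first suggestion (a plug-in estimate only, without the uncertainty assessment mentioned above, and with invented toy data):

```python
# Plug-in estimate of the mutual information I(X;Y), in shannons (bits),
# from paired samples of two discrete variates.  Toy data, for illustration.
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in Sh from paired discrete samples."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1)          # joint contingency table
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
x = rng.integers(0, 3, size=1000)
y = (x + rng.integers(0, 2, size=1000)) % 3      # y depends on x
print(f"I(X;Y) ≈ {mutual_information(x, y):.3f} Sh")
```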

When reporting a credibility interval, maybe you too, like me, are sometimes undecided between a 95%, a 90%, and an 89% interval (the last is common in the Bayesian literature). Well, it turns out that the 89% interval has the following special property – for what it's worth:

Knowing whether the true value is within or without the 89% interval corresponds to almost exactly *0.5 shannons* of uncertainty (more precisely 0.4999 Sh). That is, the uncertainty is half that of a 50% credibility interval, measured on the log scale of Shannon information.

The 90% interval corresponds to 0.469 Sh. The 95% one, to 0.286 Sh.

So if one reports 50% and 89% credibility intervals, one is reporting 1 Sh and 0.5 Sh of uncertainty.
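These figures are just the binary entropy of the event "true value inside vs outside a C% interval"; a quick check:

```python
# Shannon entropy (in Sh) of the yes/no event "inside vs outside a C% interval".
import numpy as np

def binary_entropy_sh(p):
    """Entropy in shannons of a binary event with probability p."""
    return float(-(p * np.log2(p) + (1 - p) * np.log2(1 - p)))

for c in (0.50, 0.89, 0.90, 0.95):
    print(f"{c:.0%} interval: {binary_entropy_sh(c):.4f} Sh")
# prints 1.0000, 0.4999, 0.4690, 0.2864 – the values quoted above
```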

The remarks above don't pretend to be more than a curiosity :)

The `posterior` package (Tools for Working with Posterior Distributions) has a new CRAN release, which includes additional Pareto-khat diagnostics support and a corresponding vignette illustrating its use for diagnosing the reliability of estimating any expectation with an empirical mean: mc-stan.org/posterior/articles
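As a very rough sketch of the idea behind the khat diagnostic (not the posterior package's implementation, and with an invented heavy-tailed example): fit a generalized Pareto distribution to the largest draws and look at its shape parameter.

```python
# Sketch only: scipy's generic GPD fit stands in for the posterior package's
# Pareto-khat estimator.  A khat well below 0.7 suggests the empirical mean of
# the draws is a reliable estimate of the expectation; larger values signal
# tails heavy enough to make it unreliable.
import numpy as np
from scipy.stats import genpareto, t as student_t

rng = np.random.default_rng(1)
draws = student_t(df=2).rvs(size=10_000, random_state=rng)   # heavy-tailed toy "posterior"

tail = np.sort(draws)[-int(0.2 * draws.size):]                # largest 20% of the draws
khat, _, _ = genpareto.fit(tail, floc=tail.min())             # shape parameter of the fitted GPD
print(f"khat ≈ {khat:.2f}")                                   # roughly 0.5 for a t(2) tail
```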

See the other enhancements in the NEWS: cran.r-project.org/web/package

mc-stan.org · Pareto-khat diagnostics (posterior)

Is climate science biased by gender? 🔥

The Guardian writes in a recent article (theguardian.com/environment/ar):
"Female scientists were also more downbeat than male scientists, with 49% thinking global temperature would rise at least 3C, compared with 38%."

Think about the implications of that sentence: assuming that all surveyed climate scientists are experts on the topic with essentially the same scientific information, they come to different conclusions depending on their gender. One way of interpreting this is that climate science is not objective, which would be troubling.

When we put uncertainty bars (from a Bayesian log-linear model) on the percentages reported by The Guardian, the message becomes less troubling in one sense (see figure below).

It looks as if the data is consistent with a fraction of roughly 40–45% of climate scientists – irrespective of gender – thinking that temperature will rise by more than 3°C, which is troubling enough.
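For a flavour of what such uncertainty bars do, here is a minimal sketch using a simple flat-prior Beta posterior per group rather than the blog's log-linear model; the group sizes below are invented, since the survey's are not quoted here.

```python
# Hypothetical sketch: credible intervals around the two reported shares.
# The group sizes are assumptions for illustration, not the survey's.
from scipy.stats import beta

groups = {
    "female scientists": (0.49, 120),   # reported share, assumed group size
    "male scientists":   (0.38, 250),
}

for name, (share, n) in groups.items():
    successes = round(share * n)
    posterior = beta(successes + 1, n - successes + 1)   # flat-prior Beta posterior
    lo, hi = posterior.ppf([0.025, 0.975])
    print(f"{name}: {share:.0%} reported, 95% interval {lo:.0%}-{hi:.0%}")
```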

More on this on the blog: manywordsandnumbers.org/2024/0
#climatecrisis #bayes #uncertainty

My Nabiximols case study users.aalto.fi/~ave/casestudie has a nice set of PIT-ECDF plots that illustrate typical cases. PIT stands for probability integral transform, that is, the cumulative probability up to the observation. Given well-calibrated predictive distributions, the PIT values should have a uniform distribution.

The first one shows a PIT-ECDF plot for a binomial model with too many PIT values close to 0 and 1, which indicates that the predictive distributions are too narrow.
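For readers unfamiliar with PIT, here is a minimal sketch of the computation with an invented continuous (Gaussian) example rather than the case study's binomial model, showing how too-narrow predictive distributions push PIT values towards 0 and 1:

```python
# Toy example (not the Nabiximols case study): PIT values under a
# deliberately too-narrow predictive distribution.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
true_sd, predictive_sd = 1.0, 0.5            # predictive distribution too narrow
y = rng.normal(0.0, true_sd, size=500)       # observations
pit = norm(0.0, predictive_sd).cdf(y)        # predictive CDF evaluated at each observation

# Empirical CDF of the PIT values; for a calibrated model it follows the diagonal.
grid = np.linspace(0.0, 1.0, 101)
ecdf = (pit[:, None] <= grid[None, :]).mean(axis=0)
print(np.round(ecdf[::25], 2))               # compare with 0, 0.25, 0.5, 0.75, 1 for a calibrated model
```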