#correlation

#TIL at work: sometimes, it may be okay to force a linear regression through the origin.

Normally, you should use "y=ax+b" as the model for your linear regressions and correlation indices, because fixing "b=0" biases the estimate of "a" whenever the true intercept is not zero.

However, today, someone argued that if you're using a regression to measure the predictive power of a model with respect to measurement data, you should fix "b=0".

Now, intuitively this makes some sense, but I haven't been able to find a clear argument for or against it.

So, if you know more about this *and* have relevant references to papers to back up your claim, I'd very much love to hear from you. Statistics is not really my field.

#correlation #regression #stats #statistics #math #maths
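
To make the bias concrete, here is a minimal sketch in plain Python. It assumes made-up, noise-free data generated from y = 2x + 5; the function names and the data are illustrative only and do not come from the post. It fits the same points once with a free intercept and once forced through the origin.

```python
def fit_with_intercept(x, y):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def fit_through_origin(x, y):
    """Least squares for y = a*x with the intercept fixed at zero; returns a."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Hypothetical data: true slope 2, true intercept 5, no noise.
x = [1, 2, 3, 4, 5]
y = [2 * xi + 5 for xi in x]

slope_free, intercept_free = fit_with_intercept(x, y)
slope_origin = fit_through_origin(x, y)

print("free intercept:", slope_free, intercept_free)
print("through origin:", slope_origin)
```

With a non-zero true intercept, the slope forced through the origin absorbs part of the offset and overshoots the true slope. Whether that is acceptable when comparing model predictions against measurements is exactly the open question in the post; the sketch only shows the mechanics.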

@data @datadon 🧵

How to assess a statistical model?
How to choose between variables?

Pearson's #correlation is a poor choice if you suspect that the relationship is not a straight line, because it only measures linear association.

If the relationship is monotonic:
"#Spearman’s rho is particularly useful for small samples where weak correlations are expected, as it can detect subtle monotonic trends." It is "widespread across disciplines where the measurement precision is not guaranteed".
"#Kendall’s Tau-b is less affected [than Spearman’s rho] by outliers in the data, making it a robust option for datasets with extreme values."
Ref: statisticseasily.com/kendall-t
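
As a rough illustration of the difference, the sketch below computes all three coefficients in plain Python on made-up data where y = x^3, a monotonic but non-linear relationship. The helper names are hypothetical, and the implementations are simple textbook versions (average ranks for ties, O(n^2) pair counting for tau-b), not the ones from the referenced article.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b by counting concordant and discordant pairs."""
    conc = disc = ties_x = ties_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both terms
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                conc += 1
            else:
                disc += 1
    return (conc - disc) / math.sqrt(
        (conc + disc + ties_x) * (conc + disc + ties_y)
    )


# Hypothetical data: strictly increasing, but far from a straight line.
x = list(range(1, 11))
y = [v ** 3 for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
tau = kendall_tau_b(x, y)
print(r_pearson, r_spearman, tau)
```

Both rank-based coefficients report a perfect monotonic association here, while Pearson's coefficient stays below 1 because the points do not lie on a line.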

Linked article (Learn Statistics Easily): "Kendall Tau-b vs Spearman: Which Correlation Coefficient Wins?"

▶️ LabPlot - A free OriginPro alternative for #Researchers (Scatter Plots)

@labplot@lemmy.kde.social

Catalyst Nanomaterials Lab has published another video tutorial that shows how to create stunning scatter plots, analyze correlations, and uncover hidden patterns in your data using #LabPlot. Go check it out!

➡️ youtube.com/watch?v=r1Fv1lAEf2

What do you do when you encounter a weird thing in your research? Blog about it!

Inspired by today's sleuthing about weird-looking correlation histograms, here is a post that digs into it at least a little (one could probably go much deeper, but this was good enough for me).

Weird Correlation Patterns
rmflight.github.io/posts/2024-

Original Mastodon post: mastodon.social/@rmflight/1122

#RStats #statistics #correlation help?!

I've got a bunch (3.5 x 10^6) of gene-gene _Kendall-tau_ correlations, where each correlation is based on 12 samples. When I visualize the histogram of correlations with the default of 30 bins, I see pic #1, which looks fine to me.

When I go to my personal favorite number of bins, 100, I get pic #2, with weird lows below the peaks.

Am I just being paranoid, or is something weird causing the alternating high and low values?

Edit: added the correlation type.
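
One possible contributor, offered as an assumption and not as the post's own conclusion: with 12 samples and no ties, Kendall's tau can only take 67 distinct values (steps of 2/66 between -1 and 1), so narrow histogram bins cannot all hold the same number of attainable values. The sketch below counts attainable values per bin for equal-width bins over [-1, 1]; the function names are hypothetical, and real histogram tools usually bin over the observed data range, so the exact pattern will differ.

```python
from fractions import Fraction


def possible_taus(n):
    """All values Kendall's tau can take for n samples without ties."""
    pairs = n * (n - 1) // 2
    # tau = (concordant - discordant) / pairs, with concordant + discordant = pairs
    return [Fraction(pairs - 2 * d, pairs) for d in range(pairs + 1)]


def bin_counts(values, bins):
    """Count values per equal-width bin on [-1, 1]; the last bin includes 1."""
    counts = [0] * bins
    for v in values:
        idx = min(int((v + 1) * bins / 2), bins - 1)
        counts[idx] += 1
    return counts


taus = possible_taus(12)
counts_30 = bin_counts(taus, 30)
counts_100 = bin_counts(taus, 100)

print("distinct tau values:", len(taus))
print("attainable values per bin, 30 bins:", sorted(set(counts_30)))
print("attainable values per bin, 100 bins:", sorted(set(counts_100)))
```

When neighbouring bins hold different numbers of attainable values, their heights differ even if the underlying distribution is smooth, which would produce an alternating pattern of the kind described.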

Two blog posts that try to explain, in plain language, the problem of selection bias in statistics. The first uses Berkson's paradox: how a correlation (#corrélation) can appear between two diseases that are in fact unrelated, simply by studying data collected from people admitted to hospital. dbao.leo-varnet.fr/2020/05/01/
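
The effect can be reproduced with a small simulation. The sketch below is an illustration under invented assumptions, not taken from the linked post: two independent diseases each affect 10% of a synthetic population, each disease leads to hospital admission with probability 0.6, and 2% of people are admitted for other reasons.

```python
import math
import random


def phi(a, b):
    """Pearson correlation of two equal-length lists of 0/1 values."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return cov / math.sqrt(mean_a * (1 - mean_a) * mean_b * (1 - mean_b))


rng = random.Random(42)  # fixed seed so the run is repeatable
people = []
for _ in range(20000):
    has_a = rng.random() < 0.10  # disease A, independent of B
    has_b = rng.random() < 0.10  # disease B, independent of A
    admitted = (
        (has_a and rng.random() < 0.60)  # admitted because of A
        or (has_b and rng.random() < 0.60)  # admitted because of B
        or rng.random() < 0.02  # admitted for unrelated reasons
    )
    people.append((int(has_a), int(has_b), admitted))

# Correlation in the whole population versus among hospital patients only.
r_population = phi([p[0] for p in people], [p[1] for p in people])
patients = [p for p in people if p[2]]
r_hospital = phi([p[0] for p in patients], [p[1] for p in patients])

print("whole population:", round(r_population, 3))
print("hospital patients only:", round(r_hospital, 3))
```

In the whole population the correlation stays close to zero, while among admitted patients it becomes clearly negative: a patient without disease A is more likely to be in hospital because of disease B. The selection step alone creates the association.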

In the early 2000s, #SvenHenkel and I developed an #IDMEF/#IDXP-compliant security event message pipelining framework for collecting and consolidating log messages, e.g., from network #IDS and #EDR products.

In the message stream, we were able to match multi-stage #correlation #DetectionRules in near real-time (in-memory), before everything was stored in a central database. Structural graph-based #AnomalyDetection was developed later by some colleagues.

We called it #MetaIDS.