eupolicy.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
This Mastodon server is a friendly and respectful discussion space for people working in areas related to EU policy. When you request to create an account, please tell us something about you.

#statistical


ggplot2 is the gold standard when it comes to data visualization.

The image in this post showcases examples of ggplot2 visualizations, demonstrating its versatility in creating a wide range of plots with nearly limitless customization options.

Check out my online course, "Data Visualization in R Using ggplot2 & Friends," for a deeper dive into creating stunning plots with ggplot2.

More info: statisticsglobe.com/online-cou

At our #statistical consult, I helped a #master student with her meta-analysis in #R . She "admitted" she wrote all code with the help of #chatGPT . (of course, it's the only one, right?)

One hour in, I typed ?forest (the basic R command that brings up the documentation for the function she was using, forest), because I needed to know something about a function argument.

She was shocked to see the help file being displayed, saying "Wow, where did you find all this information?"

When performing multiple imputation of missing data, it is essential to evaluate how the imputed values compare to the observed data.

The attached image, created with the bwplot() function, showcases how the distributions of observed and imputed values vary across different imputations for multiple variables.
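As a rough illustration of the check described above, here is a minimal Python sketch (not the R workflow behind the image). It assumes a single hypothetical variable with values missing completely at random, and it swaps in a very simple imputation method: each missing value is drawn at random from the observed values (a random hot-deck draw), repeated for `m = 5` imputations. The five-number summaries it prints are the quantities a box plot would display, so observed and imputed distributions can be compared side by side.

```python
import random
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a box plot displays."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return (ordered[0], q1, median, q3, ordered[-1])


rng = random.Random(42)

# Hypothetical variable; roughly 30% of values are set to missing at random.
complete = [rng.gauss(50, 10) for _ in range(200)]
data = [v if rng.random() > 0.3 else None for v in complete]

observed = [v for v in data if v is not None]
n_missing = data.count(None)

# Assumption: a random hot-deck draw stands in for a real imputation model.
m = 5
imputations = [
    [rng.choice(observed) for _ in range(n_missing)] for _ in range(m)
]

print("observed   ", [round(x, 1) for x in five_number_summary(observed)])
for i, imputed in enumerate(imputations, start=1):
    print(f"imputation {i}", [round(x, 1) for x in five_number_summary(imputed)])
```

If the imputed summaries drift far from the observed ones, that is a cue to look more closely at the imputation model, although differences can also be legitimate when data are not missing completely at random.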

I’ll be hosting an 8-week online workshop on Missing Data Imputation in R: statisticsglobe.com/online-wor

Dimensionality reduction simplifies high-dimensional data while retaining its essential features. It’s a powerful tool for improving data analysis, visualization, and machine learning performance.
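To make the idea concrete, here is a minimal Python sketch of PCA, one common dimensionality reduction technique, reducing two-dimensional points to a single score per point. The data are synthetic and hypothetical, and the function name `pca_2d` is made up for this example. It uses the closed-form eigen decomposition of a 2x2 covariance matrix, so it is an illustration rather than a general-purpose implementation.

```python
import math
import random


def pca_2d(points):
    """Project 2-D points onto their first principal component.

    Returns (direction, scores, explained), where `explained` is the share
    of total variance captured by the first component.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    c = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    mid = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1, lam2 = mid + radius, mid - radius

    # Eigenvector belonging to the larger eigenvalue.
    if b != 0:
        vx, vy = b, lam1 - a
    elif a >= c:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    direction = (vx / norm, vy / norm)

    scores = [
        (x - mean_x) * direction[0] + (y - mean_y) * direction[1]
        for x, y in points
    ]
    explained = lam1 / (lam1 + lam2)
    return direction, scores, explained


rng = random.Random(0)
# Hypothetical data: y is roughly 2 * x plus a little noise.
xs = [rng.gauss(0, 1) for _ in range(1000)]
points = [(x, 2 * x + rng.gauss(0, 0.3)) for x in xs]

direction, scores, explained = pca_2d(points)
print("first component direction:", direction)
print("share of variance retained:", round(explained, 4))
```

Because the two coordinates are almost perfectly linearly related, one component retains nearly all of the variance, which is the sense in which the reduced data keep their essential features.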

Image credit to Wikipedia: en.wikipedia.org/wiki/Dimensio

I've developed an in-depth course on PCA theory and its application in R programming. Check out this link for more details: statisticsglobe.com/online-cou

Misinterpretation of correlation and causation is a common issue in data analysis. Correlation measures the strength and direction of a relationship between two variables, but it does not imply that one variable causes the other.

Consider the statement, "Dinosaurs didn't read. Now they are extinct."
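A small simulation can show how a strong correlation arises without any causal link. In this Python sketch, a hidden variable `z` drives both `x` and `y`, while neither of them affects the other. The variables and noise levels are hypothetical and chosen only for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)

# Hidden common cause (confounder).
z = [rng.gauss(0, 1) for _ in range(2000)]
# x and y each depend on z plus independent noise; x never influences y.
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

r_xy = pearson(x, y)

# Remove the confounder's contribution (its coefficient is 1 by construction
# in this simulation) and correlate what is left.
r_resid = pearson(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)

print("correlation of x and y:", round(r_xy, 3))
print("correlation after removing z:", round(r_resid, 3))
```

The first correlation is high even though changing `x` would do nothing to `y`. Once the shared cause is accounted for, the relationship largely disappears.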

For regular tips on data science, statistics, Python, and R programming, check out my free email newsletter: eepurl.com/gH6myT

Continued thread

Krugman:
This is, by the way, [what] standard autocratic regimes are known for.

⭐️In some ways, among their first targets are #statistical #agencies because ➡️ they want the numbers to say what they want the numbers to say.

I’ve been at conferences in Asia where the Chinese government announces that the economy grew 5.3 percent.

And everyone at the conference asks not “why did the Chinese economy grow by 5.3 percent?” but “why did the Chinese government decide to say that it grew by 5.3 percent?”

⭐️The numbers are our political statements, not reality.

🆘And if I were a federal employee at the Bureau of Labor Statistics,
I would be extremely frightened;
quite quickly they’re going to be in the line of fire.

The last time around, back during the Obama years, there were a lot of "inflation truthers" claiming that the inflation numbers were being manipulated to make it look like there was less inflation than there was.

❌Such accusations are always projections
—it’s what they would do, not what was actually happening.

We turned to various kinds of private sector independent measures of inflation, many of which were originally developed by economists in places like Argentina,
where manipulation of the data was standard so they developed their own ways to measure.

We’re going to be having to do that.

⚠️ My guess is by sometime next year, we’re going to be having to look at proxies for what’s actually happening to the economy,
possibly for what’s actually happening to crime,
because the official numbers are going to be corrupted.

In Bayesian inference, a credible interval is a range of values within which a parameter lies with a certain probability, given the observed data and prior beliefs. The image in this post (based on this Wikipedia image: en.wikipedia.org/wiki/Credible) represents a 90% highest-density credible interval of a posterior probability distribution.
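One way to approximate a 90% highest-density interval from posterior draws is to take the narrowest window that contains 90% of the sorted samples. The Python sketch below does this for a hypothetical Beta(3, 12) posterior, which is an assumption made only for illustration and is not taken from the image. The shortcut is suitable for unimodal posteriors.

```python
import math
import random


def highest_density_interval(samples, mass=0.9):
    """Narrowest interval containing `mass` of the samples (unimodal case)."""
    ordered = sorted(samples)
    n = len(ordered)
    k = math.ceil(mass * n)  # number of samples the interval must contain
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[best], ordered[best + k - 1]


rng = random.Random(7)
# Hypothetical posterior draws; Beta(3, 12) is skewed to the right.
posterior = [rng.betavariate(3, 12) for _ in range(10000)]

hdi_low, hdi_high = highest_density_interval(posterior, mass=0.9)

# Equal-tailed 90% interval for comparison (same number of samples inside).
ordered = sorted(posterior)
n = len(ordered)
k = math.ceil(0.9 * n)
tail = (n - k) // 2
et_low, et_high = ordered[tail], ordered[tail + k - 1]

print("90% highest-density interval:", (round(hdi_low, 3), round(hdi_high, 3)))
print("90% equal-tailed interval:   ", (round(et_low, 3), round(et_high, 3)))
```

For a skewed posterior, the highest-density interval is never wider than the equal-tailed one, and it is typically shifted toward the mode.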

More details: eepurl.com/gH6myT

Hypothesis testing is a key statistical method that allows us to draw conclusions about populations based on sample data. Choosing the right test is essential for obtaining accurate and reliable results.
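As one illustration of drawing a conclusion from sample data, here is a minimal Python sketch of a two-sample permutation test for a difference in means. This is only one of many possible tests, chosen here because it makes few distributional assumptions. The two groups are synthetic, and the function name `permutation_test` is hypothetical.

```python
import random
import statistics


def permutation_test(a, b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are exchangeable.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[: len(a)])
            - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return (extreme + 1) / (n_permutations + 1)


rng = random.Random(11)
# Hypothetical samples from two populations whose means differ by 1.5.
group_a = [rng.gauss(0.0, 1.0) for _ in range(30)]
group_b = [rng.gauss(1.5, 1.0) for _ in range(30)]

p_value = permutation_test(group_a, group_b)
print("permutation p-value:", round(p_value, 4))
```

A small p-value suggests that a difference this large would be unusual if both samples came from the same population. Whether a permutation test, a t-test, or something else is appropriate depends on the design and the data.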

Interested in learning more? Check out my online course on Statistical Methods in R, starting September 9, 2024, where we dive deeper into hypothesis testing and other key statistical methods.

Take a look here for more details: statisticsglobe.com/online-cou

Continued thread

2/

"When #ComplexSystems, such as the overturning circulation, undergo critical transitions by changing a control parameter λ through a critical value λᶜ, a structural change in the #dynamics happens. The previously statistically stable state ceases to exist and the system moves to a different statistically stable state. The system undergoes a bifurcation [...] there are #EarlyWarning signals (EWSs), #statistical quantities, which also change before the tipping happens" [1]
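Two statistical quantities commonly used as early warning signals are the variance and the lag-1 autocorrelation of a time series, both of which tend to rise as a system approaches a tipping point. The Python sketch below is a toy illustration, not a model of the overturning circulation: it assumes a simple AR(1) process whose memory parameter `phi` slowly increases toward 1, standing in for a control parameter drifting toward its critical value.

```python
import random
import statistics


def lag1_autocorrelation(series):
    """Lag-1 autocorrelation of a sequence of numbers."""
    mean = statistics.fmean(series)
    num = sum(
        (series[i] - mean) * (series[i + 1] - mean)
        for i in range(len(series) - 1)
    )
    den = sum((v - mean) ** 2 for v in series)
    return num / den


rng = random.Random(3)
n_steps = 4000

# Toy assumption: phi drifts from 0.2 to 0.97, so the system recovers from
# random perturbations more and more slowly as the run goes on.
x = 0.0
series = []
for t in range(n_steps):
    phi = 0.2 + 0.77 * t / (n_steps - 1)
    x = phi * x + rng.gauss(0, 1)
    series.append(x)

early, late = series[:1000], series[-1000:]

var_early, var_late = statistics.variance(early), statistics.variance(late)
ac_early, ac_late = lag1_autocorrelation(early), lag1_autocorrelation(late)

print("variance:        early", round(var_early, 2), "late", round(var_late, 2))
print("lag-1 autocorr.: early", round(ac_early, 2), "late", round(ac_late, 2))
```

Both indicators are higher in the late window than in the early one, which is the kind of change in statistical quantities the quoted passage refers to.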