Exploring BlueSky's Domain Handles
Hot new social networking site BlueSky has an interesting approach to usernames. Rather than just being @example, you can verify your domain name and be @example.com! Isn't that exciting?
Some people are @whatever.tld and others are @cool.subdomain.funny.lol.fwd.boring.tld.
I wanted to know how these domain names are distributed. For example, are there more .uk users than .org users?
Shut up and show me the results
You can play with the interactive data
Oh, and the large number of .gy domains is due to The Fediverse Bridge.
Getting the data
BlueSky has an open "firehose" of the data passing through it. Following the sample code, I listened for public interactions - people posting, liking, or following.
From there, I grabbed every username which wasn't on the default .bsky.social domain. I left the code running for a few days until I had over 22,000 usernames.
Note, these data are all public - although I'm not sure if users necessarily realise that. It doesn't include lurkers (people who don't interact). Some of the accounts may have been moved, banned, or deleted.
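The filtering step might look something like this. It's only a rough sketch: it assumes you already have a stream of handles as plain strings (the firehose listener itself isn't shown), and the function names are made up for illustration rather than taken from the real code.

```python
# Sketch: keep only handles which aren't on the default BlueSky domain.
# Assumes handles arrive as plain strings; the names here are hypothetical.
DEFAULT_SUFFIX = ".bsky.social"


def normalise(handle: str) -> str:
    """Strip whitespace, a leading @, and a trailing dot, then lowercase."""
    return handle.strip().lstrip("@").rstrip(".").lower()


def custom_handles(handles):
    """Return the set of unique handles that use their own domain name."""
    seen = set()
    for raw in handles:
        handle = normalise(raw)
        # Skip anything that isn't a domain, or that is on the default domain.
        if "." not in handle or handle.endswith(DEFAULT_SUFFIX):
            continue
        seen.add(handle)
    return seen
```

Using a set means an account which posts a hundred times is still only counted once.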
Drawing a TreeMap
I used Plotly's TreeMap library to draw a static map of all the Top Level Domains (TLD).
As you can see, .com dominates the landscape - but there are quite a few country code TLDs in there as well.
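For the curious, here's a minimal sketch of counting the TLDs and handing them to Plotly. It assumes the handles are already collected in a list, and it uses Plotly Express's treemap function with a flat, single-level hierarchy. The helper names are hypothetical and the real code may well do this differently.

```python
# Sketch: count handles per TLD and draw a single-level treemap.
from collections import Counter


def tld_of(handle: str) -> str:
    """Return the last label of a handle, e.g. 'com' for 'example.com'."""
    return handle.lower().rstrip(".").rsplit(".", 1)[-1]


def count_tlds(handles) -> Counter:
    """Count how many handles sit under each TLD."""
    return Counter(tld_of(handle) for handle in handles)


def draw_tld_treemap(counts: Counter):
    """Build a Plotly treemap figure with one box per TLD."""
    # Imported here so the counting code works without Plotly installed.
    import plotly.express as px

    labels = ["." + tld for tld in counts]
    return px.treemap(
        names=labels,
        parents=[""] * len(labels),  # every TLD sits at the top level
        values=list(counts.values()),
    )
```

Something like `draw_tld_treemap(count_tlds(handles)).write_html("tlds.html")` would then save the map as a web page.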
Public Suffixes
Domain names have the concept of Public Suffixes. For example, users can register domains at .co.uk and .org.uk as well as just plain .uk. The Python tldextract library allowed me to see which domains were public suffixes, so I could attach them to their parent TLD.
I then drew a TreeMap showing this.
Note! You'll need to hack your Plotly installation to allow empty leaf nodes to get in the same style as the first map.
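Here's a rough sketch of how the grouping could work. It relies on the `suffix` field that `tldextract.extract()` returns. Everything else (the function names, the row layout) is my own assumption for illustration, not necessarily how the real code does it.

```python
# Sketch: attach each public suffix to its parent TLD.
from collections import Counter


def suffix_with_tldextract(handle: str) -> str:
    """Return the public suffix, e.g. 'co.uk' for 'example.co.uk'."""
    # Imported lazily; by default tldextract may fetch and cache the
    # Public Suffix List the first time it is used.
    import tldextract

    return tldextract.extract(handle).suffix


def suffix_counts(handles, get_suffix=suffix_with_tldextract) -> Counter:
    """Count handles per (tld, public suffix) pair."""
    counts = Counter()
    for handle in handles:
        suffix = get_suffix(handle)
        if not suffix:
            continue  # not a recognisable domain name
        tld = suffix.rsplit(".", 1)[-1]
        counts[(tld, suffix)] += 1
    return counts


def treemap_rows(counts: Counter):
    """Turn the counts into (label, parent, value) rows for a treemap."""
    rows = {}
    for (tld, suffix), n in counts.items():
        tld_label = "." + tld
        rows.setdefault(tld_label, ["", 0])
        if suffix == tld:
            rows[tld_label][1] += n  # registered directly under the TLD
        else:
            rows["." + suffix] = [tld_label, n]  # e.g. .co.uk under .uk
    return [(label, parent, value) for label, (parent, value) in rows.items()]
```

The three columns can be passed to Plotly as `names`, `parents`, and `values`. Taking the suffix lookup as a parameter also makes it easy to swap in a stub when testing without network access.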
So what? What next?
- Not everyone from, say, Brazil will have a .br domain name - but it is fascinating to see which countries dominate.
- It might be fun to go full "Information Is Beautiful" and turn each ccTLD into its country's flag.
- Are there ethical implications of recording the fact that an account has publicly shared themselves on a social network?
- What percentage of all users have a domain name handle?
Get the code
Everything is open source on GitHub.