Introducing the ASML ActivityPub Fuzzer: Improving Testing in the Fediverse - Applied Social Media Lab at Harvard University

I’m happy to announce the release of the ActivityPub Fuzzer.

The ActivityPub Fuzzer is a small program to help developers build social media software on the Fediverse with the ActivityPub protocol. It uses data collected by the Fediverse Schema Observatory to emulate known Fediverse software, solving the problem where developers have to manually test compatibility with dozens of other projects. The Fuzzer runs in a local development environment. You can tell it to locally emulate a public fire hose, or to send you messages formatted from every known version of a specific software project.

The problem

The Fediverse is a decentralized, interoperable network of social media sites like Mastodon.social, Threads.net, and Pixelfed.social. These apps and services are built on the ActivityPub protocol. For users, this means that accounts on different services can interact across services: subscribing to each other, replying to posts, and liking them. Critically, users can move from service to service without losing contact with anyone. There’s no lock-in: if you don’t like the policies on one service, you can move to another and keep your friends and feeds. An account on Threads.net can talk to an account on Mastodon.social, which can talk to an account on Pixelfed.social. They’re able to do this because they all speak the same technical language. That language is called ActivityPub.

ActivityPub is very flexible by design, and different software packages put their own spin on the kind of content that goes out there in the world. I like to say that if ActivityPub is a language, then every social media service in the Fediverse speaks its own dialect of the language. Like linguistic dialects, they are mostly mutually understandable to one another, but there are some pieces that don’t make sense. That’s not a problem for humans, but we have to remember that this is computers talking to computers, and that computers take things very literally. Imagine a British person asking an American to put something in the boot of their car, and the American frantically looking for footwear. We need to be literal when dealing with computers, otherwise errors will happen and messages will fail to be delivered.
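To make the dialect idea concrete, here is a minimal sketch of one well-known variation: a property such as attributedTo may arrive as a plain URI string, as an embedded object with an id, or as a list of either. The helper name actor_ids and the three example notes are made up for illustration; they are not taken from the Fuzzer or from any particular project.

```python
from typing import Any


def actor_ids(value: Any) -> list[str]:
    """Return actor IDs from a property that may be a URI string,
    an embedded object with an "id", or a list mixing both."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    if isinstance(value, dict):
        # Embedded object: keep its "id" if it has one.
        return [value["id"]] if isinstance(value.get("id"), str) else []
    if isinstance(value, list):
        ids: list[str] = []
        for item in value:
            ids.extend(actor_ids(item))
        return ids
    return []


# Three hypothetical notes that say the same thing in different "dialects".
ALICE = "https://a.example/users/alice"
note_a = {"type": "Note", "attributedTo": ALICE}
note_b = {"type": "Note", "attributedTo": {"type": "Person", "id": ALICE}}
note_c = {"type": "Note", "attributedTo": [ALICE]}

for note in (note_a, note_b, note_c):
    print(actor_ids(note.get("attributedTo")))
```

A receiver that only expects the first form would break on the other two, which is the kind of literal-mindedness described above.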

Let’s say I’m building a new social media service and I want it to be able to talk to other services via the Fediverse. If I want my software to be compatible with Mastodon, Threads, and Pixelfed, that’s tricky but not impossible to do. When I’m making my software, I can hook it up to accounts on the websites mastodon.social, threads.net, and pixelfed.social and see what works and what breaks.

But the big problem is that there aren’t just three services I need to be compatible with. The Fediverse Schema Observatory has observed 70 software projects out there in the Fediverse. Different servers will run different versions of those software projects, so if we want to support every known version of every known software project, that is 663 potentially different ActivityPub “dialects” out there.

We want to support as many software versions as possible so the users of our new social media service can talk to as many users of other services as possible. This is good for users because it solves the “empty room” problem of showing up to a new service without a lot of users on it. It’s good for developers because it lets us work around the network effect that keeps users on entrenched platforms.

So how do we make our software compatible with hundreds of dialects?

The solution

Since September 2024, the Applied Social Media Lab has been running the Fediverse Schema Observatory. It has been collecting data on the different dialects that are used in ActivityPub and knows which software version speaks each dialect. Since we know how each message is shaped, we can fake data that is shaped like a message from Mastodon 4.2.0, or Misskey 2024.10.0, or WordPress 6.7.1.

So instead of having to hook our in-development software up to a WordPress 6.7.1 server, or to run a WordPress 6.7.1 server of our own, we can just run the Fuzzer and say, “Please pretend to be WordPress 6.7.1 and send me some data.”
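As a rough illustration of what “faking data that is shaped like a message” can mean, here is a small sketch that fills a shape description with placeholder values. The shape format, the function names, and the fuzzer.example base URL are all assumptions invented for this example; this is not the Observatory’s real data format or the Fuzzer’s actual implementation.

```python
import uuid

# Hypothetical shape description: property name -> spec for its value.
# NOT the Observatory's real format, only an illustration.
EXAMPLE_SHAPE = {
    "id": {"kind": "uri"},
    "type": {"kind": "literal", "value": "Note"},
    "attributedTo": {"kind": "uri"},
    "content": {"kind": "html"},
    "to": {"kind": "uri_list"},
}


def fake_value(spec: dict, base: str) -> object:
    """Produce a placeholder value matching one property spec."""
    kind = spec["kind"]
    if kind == "literal":
        return spec["value"]
    if kind == "uri":
        return f"{base}/{uuid.uuid4()}"
    if kind == "uri_list":
        return [f"{base}/{uuid.uuid4()}"]
    if kind == "html":
        return "<p>Placeholder content</p>"
    raise ValueError(f"unknown kind: {kind}")


def fake_message(shape: dict, base: str = "https://fuzzer.example") -> dict:
    """Build a message with the same keys and value types as the shape."""
    return {key: fake_value(spec, base) for key, spec in shape.items()}


message = fake_message(EXAMPLE_SHAPE)
print(sorted(message))
```

The point is only that the keys and value types mirror what a given software version sends, while the values themselves are invented.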

The ActivityPub Fuzzer lets you emulate software that you are likely to encounter in the wild. And you can do it without having to connect to public servers.

The other day, I tried to improve Article rendering in Hometown. I was running a local Hometown server that was not connected to the Internet. I ran the Fuzzer and said, “Send me all known message formats that contain an Article object.” It sent me dozens of different messages from Bookwyrm, Bridgy Fed, Friendica, WordPress, WriteFreely, NodeBB, Ghost, and Hacker’s Pub.

And then I just looked at the feed that my software was rendering. I could see immediately that messages from WriteFreely and Ghost looked good, but there was a weird duplication error with messages from NodeBB. At that point, I was able to go to the ActivityPub Fuzzer and inspect the JSON of the messages that were not rendering well, and I determined there was an issue with how I was displaying summary vs content. I made a post to an ActivityPub developer discussion community and we came up with a reasonable solution to my display problem. (I even got help from Julian Lam, the author of the NodeBB ActivityPub code.)
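For readers unfamiliar with the summary vs content distinction, here is a sketch of one possible display rule for an Article: use summary as the preview unless it merely repeats the start of content, in which case show the content once. This is an assumption for illustration only. It is not the solution reached in that discussion and not Hometown’s code, and the function names are made up.

```python
import html
import re


def strip_tags(markup: str) -> str:
    """Very rough HTML-to-text conversion, good enough for comparison."""
    text = re.sub(r"<[^>]+>", " ", markup)
    return " ".join(html.unescape(text).split())


def article_preview(article: dict, limit: int = 200) -> dict:
    """Pick a title and a single preview text for an Article object."""
    title = article.get("name") or ""
    summary = strip_tags(article.get("summary") or "")
    content = strip_tags(article.get("content") or "")
    if summary and not content.startswith(summary):
        # The summary adds something new, so use it as the preview.
        preview = summary
    else:
        # No summary, or it duplicates the opening of the content.
        preview = content[:limit]
    return {"title": title, "preview": preview}


duplicated = {
    "type": "Article",
    "name": "Hello",
    "summary": "Hello world",
    "content": "<p>Hello world</p><p>More text</p>",
}
print(article_preview(duplicated))
```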

I did that all on my laptop, without having to connect my work-in-progress code to the wider Fediverse.

Design safety considerations

This is a piece of software that can create a fire hose of fake data. When designing it, I thought, “Could this be used for a DDoS attack or spam?” But all it’s doing is posting JSON data to an HTTP endpoint. A programmer could run:

while true
do
  curl -X POST -H "Content-Type: application/activity+json" -d '{"foo":"bar"}' 'https://social.example/inbox'
done

and it would have the same effect, as far as bad behavior goes. So this isn’t really increasing risk or providing new capability to bad actors.

The other problem was: if we provided this as an online service, then we’d also be providing a low-effort tool for people to spam arbitrary inboxes on the Fediverse. Or we’d need to implement abuse-prevention measures. The solution here was to simply not provide it as an online service. The Fuzzer runs on your computer, so if you want to be a bad actor and have your IP banned for running a low-effort and easily thwarted spam campaign, that’s on you.

So generally speaking, this tool on its own does not increase attack capabilities on the Fediverse.

How to use it

The ActivityPub Fuzzer is now open for use, and it is open source so that it can keep improving. Installation instructions are available with the project. It’s meant to run as a basic, tiny JavaScript server. The overall flow is:

Clone the Fuzzer from GitHub
Download the latest Fediverse Schema Observatory database snapshot and include it in the Fuzzer directory
Run a reverse proxy tunneling service like ngrok or fedify tunnel
Configure the Fuzzer to identify itself via the tunneling service
Run the Fuzzer on a local port
Load up the Fuzzer website and start telling it what to do

The Fuzzer will continue to be actively developed, and issues and pull requests can be filed on the project’s GitHub repository.

Darius Kazemi, Senior Engineer