Transparency Hub is a user-friendly platform that aggregates and archives the policy documents (and associated data) that social media companies release. The platform is focused on increasing visibility into how data flows through the social media ecosystem.

Vision

To facilitate platform transparency and public-interest research by offering researchers and the general public an accessible interface to explore the policies and public-facing documents released by social media companies.

Goals

The project aims to

  • Aggregate and archive public-facing social media company documents, such as privacy policies, terms of service, transparency reports, and community guidelines, in real time and all in one place.
  • Provide tooling that lets people—whether they are researchers, teachers, or parents—easily search, analyze, export, and compare documents.

Why This Matters

Social media platforms have enormous influence over public life. Unfortunately, their policy documents are often hard to find, buried behind obscure URLs. The documents also lack a standard cross-company format, use verbose and confusing language, and rarely include a version history. These shortcomings make it difficult to track how platform governance changes over time, across companies, across jurisdictions, and in response to external pressure. Transparency Hub facilitates transparency research by making policy documents easier to find, analyze, and compare.

How It Works

Transparency Hub consists of a user-facing website backed by a database that stores the hundreds of policy documents we have downloaded.

Data collection: We have automated mechanisms for detecting and downloading updated policy documents from over 300 social media companies. We also have automated mechanisms for identifying the launch of new social media companies so that we can incorporate their policy documents into our database. This eliminates the tedious task of gathering these documents and helps researchers evaluate both how these policies have changed over time and how they are impacting users like you right now.
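One common way to detect that a policy document has been updated is to compare a content fingerprint against the most recently archived version. The sketch below illustrates that idea only; it is not a description of Transparency Hub's actual pipeline. The `PolicyArchive` class, the `record_if_changed` method, and the example URL are all hypothetical names chosen for illustration.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(text: str) -> str:
    """Return a SHA-256 digest of the text after collapsing whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


@dataclass
class PolicyArchive:
    """Hypothetical in-memory archive mapping a document URL to its versions."""

    # Each version is stored as a (digest, text) pair, oldest first.
    versions: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def record_if_changed(self, url: str, text: str) -> bool:
        """Archive `text` as a new version only if it differs from the latest one."""
        digest = fingerprint(text)
        history = self.versions.setdefault(url, [])
        if history and history[-1][0] == digest:
            return False  # unchanged since the last download
        history.append((digest, text))
        return True


archive = PolicyArchive()
url = "https://example.com/privacy"  # placeholder URL, not a real source
first = archive.record_if_changed(url, "We collect your email.")
repeat = archive.record_if_changed(url, "We  collect your   email.")
update = archive.record_if_changed(url, "We collect your email and phone number.")
```

Normalizing whitespace before hashing keeps purely cosmetic reformatting from being archived as a new version. A real collector would likely need further normalization, for example stripping page navigation and other markup before fingerprinting.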

The company view shows a platform's metadata and lets users browse its document history in different formats.

Tools for users: Transparency Hub provides several interfaces for querying our database. For example, users can browse documents by platform, exploring a platform’s current documents and comparing those to archived policies to understand how the policies have evolved. There’s also a helpful tool to compare documents across platforms. Using AI models, Transparency Hub also intends to offer automated document summarization and recommendations for how users can modify their platform settings to better safeguard personal data.

We have built a tool for comparing documents across time or across companies.
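To give a sense of what comparing two versions of a policy can look like, here is a minimal sketch using Python's standard `difflib` module. It is an assumption-laden illustration, not Transparency Hub's comparison tool: the function names `compare_versions` and `similarity` and the sample policy text are invented for this example.

```python
import difflib


def compare_versions(old_text: str, new_text: str,
                     old_label: str = "archived",
                     new_label: str = "current") -> str:
    """Return a unified diff between two versions of a document."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(diff)


def similarity(old_text: str, new_text: str) -> float:
    """Return a rough similarity score between 0.0 and 1.0."""
    return difflib.SequenceMatcher(None, old_text, new_text).ratio()


# Invented sample text, used only to exercise the functions.
old_policy = "We collect your email.\nWe never sell data."
new_policy = "We collect your email.\nWe may share data with partners."

report = compare_versions(old_policy, new_policy)
score = similarity(old_policy, new_policy)
print(report)
```

The same two functions work for comparing documents from different companies, although a line-based diff is most informative when the two texts share a common ancestor, as successive versions of one policy do.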

Who Can Benefit

We created this platform for

  • Researchers, journalists, and analysts tracking changes in platform policies over time or across a variety of social media platforms
  • Advocates, educators, and students studying or teaching digital rights and platform governance
  • Developers and technologists building tools that require structured, accessible policy data
  • Everyday consumers trying to understand platform rules, how they evolve, and how to better protect their personal data on these platforms

Get Involved

Help take back control of your data today! Reach out to us at asml@cyber.harvard.edu to suggest platforms to include or to tell us which Transparency Hub features would help you better understand how your social media data is used. If you are a researcher interested in accessing the full dataset, which is in closed beta, please reach out to discuss a potential collaboration.

Team

  • James Mickens, Principal Investigator
  • Meg Marco, Senior Director
  • Johnny Richardson, Senior Engineer
  • Zoe Robert, Principal Engineer
  • Teagan D’Addeo, Research Assistant