Social Media Comment Map

Project Overview
The Intervention
Demo
Method
Findings and Implications
Resources

Project Overview

When people scroll social media, they often spend more time reading the comments than the original post.

Yet what we see in comment sections is rarely a neutral sample of public opinion. It is shaped by opaque ranking algorithms, engagement metrics, and visibility dynamics that amplify certain voices over others. As a result, readers may overestimate extreme views, misjudge the distribution of opinions, or misperceive social norms.

This project asks:

Can we make collective voices visible without flattening individual ones?

Instead of treating comments as engagement signals or raw text data, this approach treats them as a space of collective sensemaking, where representation itself shapes perception.

Why Comment Spaces Matter

Comment sections are a major site of social inference. People form impressions of what “most people think” based on what they can see, and that visibility is platform-mediated.

If ranking systems disproportionately surface outrage, slogan-like rhetoric, or high-engagement edge cases, readers may infer skewed social norms. This design problem is also a social cognition problem.

The Intervention

I built an interactive prototype that transforms a comment thread into a semantic map.

Each comment is embedded using a language model and projected into a two-dimensional space. Comments with similar meanings appear closer together, forming clusters that reflect recurring themes, framings, or narrative styles.

Crucially, this map does not summarize away individual voices. Users can click any point to view:

the full original comment
number of likes
timestamp
cluster affiliation

The goal is not to replace comments with AI summaries, but to make their structure visible.
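As a rough illustration of what each point on the map carries, the record below holds the four fields listed above plus the 2D coordinates. This is a minimal sketch: the class name, the field names, the `detail_panel` helper, and the use of -1 for "no cluster" are all assumptions made for the example, not the prototype's actual code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CommentPoint:
    """One point on the map (hypothetical field names)."""
    text: str            # full original comment, kept verbatim
    likes: int           # number of likes
    timestamp: datetime  # when the comment was posted
    cluster: int         # cluster label; -1 is assumed to mean "no cluster"
    x: float             # horizontal position on the 2D map
    y: float             # vertical position on the 2D map


def detail_panel(point: CommentPoint) -> str:
    """Build the text shown when a user clicks a point."""
    group = "unclustered" if point.cluster == -1 else f"cluster {point.cluster}"
    return "\n".join([
        point.text,
        f"Likes: {point.likes}",
        f"Posted: {point.timestamp:%Y-%m-%d %H:%M}",
        f"Group: {group}",
    ])
```

Keeping the full text on the record, rather than a summary, is what lets the map show structure without replacing the individual comment.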

Demo

Alternatively, open the interactive map in a new tab

Method

This system uses a lightweight NLP and visualization pipeline:

Data Collection: Comment threads are exported as CSV or Excel files (currently from Xiaohongshu and Reddit-style formats).
Text Embedding: Sentence-level embeddings are generated using OpenAI embedding models.
Dimensionality Reduction: UMAP projects the high-dimensional embeddings into a 2D semantic space.
Clustering: HDBSCAN identifies dense thematic clusters without requiring a predefined number of clusters.
Interactive Visualization: Plotly renders an interactive scatterplot in which each point represents a comment.
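The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code. It assumes a CSV export with a `comment` column, the `text-embedding-3-small` model, clustering on the 2D UMAP coordinates rather than on the raw embeddings, and arbitrary values for `min_cluster_size`, `metric`, and `random_state`. None of these details come from the write-up. The third-party packages (`numpy`, `openai`, `umap-learn`, `hdbscan`, `plotly`) are imported inside the functions that need them, so the CSV helper works on its own.

```python
import csv
import io


def load_comments(csv_text: str, text_column: str = "comment") -> list[dict]:
    """Parse an exported thread and drop rows whose comment is empty."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [row for row in rows if (row.get(text_column) or "").strip()]


def embed(texts: list[str], model: str = "text-embedding-3-small"):
    """Embed comments with the OpenAI API (reads OPENAI_API_KEY from the environment)."""
    import numpy as np
    from openai import OpenAI

    client = OpenAI()
    response = client.embeddings.create(model=model, input=texts)
    return np.array([item.embedding for item in response.data])


def project_and_cluster(embeddings, min_cluster_size: int = 5):
    """Reduce embeddings to 2D with UMAP, then label dense regions with HDBSCAN."""
    import hdbscan
    import umap

    reducer = umap.UMAP(n_components=2, metric="cosine", random_state=42)
    coords = reducer.fit_transform(embeddings)
    # Assumption: cluster on the 2D projection; HDBSCAN labels noise points as -1.
    labels = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(coords)
    return coords, labels


def build_figure(comments: list[dict], coords, labels, text_column: str = "comment"):
    """Render one point per comment, colored by cluster, with the full text on hover."""
    import plotly.express as px

    return px.scatter(
        x=coords[:, 0],
        y=coords[:, 1],
        color=[str(label) for label in labels],
        hover_name=[row[text_column] for row in comments],
    )
```

A run would chain the four functions: load the rows, embed the comment column, project and cluster, then call `build_figure(...).show()`. Large threads may need to be embedded in batches, which this sketch does not handle.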

Findings and Implications

Across multiple datasets, several patterns emerged:

More narrative-driven responses formed larger, more diffuse semantic regions.
In some cases, discussions that appeared polarized looked more like continuous landscapes on the map than sharply separated camps.
TF-IDF-based maps produced sharper separations around keyword repetition, while embedding-based maps revealed deeper semantic continuity.
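The last point can be illustrated with a toy TF-IDF computation. The sketch below is a simplified, self-contained version written for this example. It uses whitespace tokenization and smoothed IDF, and the three comments are made up. It is not the code behind the maps. It shows that TF-IDF similarity depends entirely on shared words: two comments with opposite stances but the same vocabulary score as similar, while two comments with a similar stance but no words in common score zero.

```python
import math
from collections import Counter


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Made-up comments for illustration only.
comments = [
    "this policy is unfair",   # critical
    "this policy is great",    # supportive, but shares three words with the first
    "such rules seem unjust",  # critical, but shares no words with the first
]
vectors = tfidf_vectors(comments)
same_words = cosine(vectors[0], vectors[1])    # positive: driven by shared keywords
same_stance = cosine(vectors[0], vectors[2])   # zero: no overlapping terms
```

An embedding model has no such hard dependence on exact word overlap, which is consistent with the smoother, more continuous maps described above.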

Why This Matters

From a research perspective, comment presentation may shape:

perceived social norms
perceived polarization
perceived toxicity
willingness to express minority views

Social psychology literature shows that people infer majority opinion from visible cues. If comment ordering distorts the visible distribution of opinions, readers' perception of norms may be distorted as well.
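A small worked example makes the mechanism concrete. The thread below is invented for illustration, and the stance labels, like counts, and helper names are assumptions. It compares the share of each stance in the full thread with the share among the comments a like-based ranking would show first.

```python
from collections import Counter


def stance_shares(comments: list[dict]) -> dict[str, float]:
    """Fraction of comments holding each stance."""
    counts = Counter(comment["stance"] for comment in comments)
    total = sum(counts.values())
    return {stance: n / total for stance, n in counts.items()}


def top_by_likes(comments: list[dict], k: int) -> list[dict]:
    """The k comments a purely like-ranked view would surface first."""
    return sorted(comments, key=lambda c: c["likes"], reverse=True)[:k]


# Hypothetical thread: eight moderate comments, two heavily liked extreme ones.
thread = (
    [{"stance": "moderate", "likes": 2}] * 8
    + [{"stance": "extreme", "likes": 50}] * 2
)

full_view = stance_shares(thread)                      # moderate 0.8, extreme 0.2
ranked_view = stance_shares(top_by_likes(thread, 2))   # extreme 1.0
```

A reader who sees only the top two comments encounters a stance held by a fifth of the thread as if it were unanimous.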

From a product perspective, this raises design questions:

Could alternative representations reduce misperception and polarization?
Could visible distribution maps improve deliberative quality?
How might platforms expose diversity without suppressing engagement?

From a policy perspective, transparency in comment representation could become part of broader algorithmic accountability conversations.

Resources

This project was inspired by Talk to the City, built by the AI Objectives Institute, which explores how collective input can be summarized while preserving nuance.

Yi Zhang
