Publication - 2022

RumorLens: Interactive Analysis and Validation of Suspected Rumors on Social Media

Ran Wang, Kehan Du, Qianhe Chen, Yifei Zhao, Mojie Tang, Hongxi Tao, Shipan Wang, Yiyao Li, and Yong Wang^*.

CHI EA '22: CHI Conference on Human Factors in Computing Systems Extended Abstracts.

Paper Information

Authors: Ran Wang, Kehan Du, Qianhe Chen, Yifei Zhao, Mojie Tang, Hongxi Tao, Shipan Wang, Yiyao Li, Yong Wang

Venue: CHI EA '22: CHI Conference on Human Factors in Computing Systems Extended Abstracts.

Paper: https://doi.org/10.1145/3491101.3519712
Arxiv: https://arxiv.org/abs/2203.03098
Project: https://rumorlens.datavizu.app/
GitHub: https://github.com/DataVizU/RumorLens

Summary

RumorLens is an interactive visual analytics system for administrators who must inspect large queues of suspected rumors on social media. Built from four months of requirements work with platform administrators, it combines NLP-style preprocessing with coordinated views for spatial-temporal overview, feature projection, and detailed propagation analysis. Its distinctive contribution is a circular propagation view that displays retweet hierarchy, time, sentiment, keywords, and post details in a compact layout. The evaluation uses a real Sina Weibo dataset and a domain-expert case study, showing how the tool can guide a reasoned rumor judgment while leaving quantitative performance and broader deployment questions open.

Why This Paper Matters

The paper addresses this question: How can social media platform administrators combine automated feature extraction with interactive visualization to analyze and validate large sets of suspected rumors while understanding their spatial, temporal, feature, and propagation patterns?

It is especially relevant to:

Social media moderation teams that need to inspect queues of suspected rumors produced by user reports, crowdsourcing, or automated detectors.
Visual analytics research on misinformation workflows, especially studies concerned with coordinated multiple views and explainable human-computer decision support.
Social-media analysis projects that need to examine content, user, topic, and propagation features in one exploratory workflow.

Key Contributions

RumorLens offers a requirements-grounded visual analytics workflow for suspected rumor review. This matters because the paper positions rumor validation as a task where human judgment and computational support need to work together rather than as a purely automatic classification problem.
The system organizes rumor analysis across three coordinated levels: spatial-temporal overview, feature-based projection, and detailed propagation. This structure helps analysts move from broad filtering to case-level reasoning without losing the context of the larger collection.
The paper introduces a compact circular propagation design for retweet diffusion. Its value is that hierarchy, time, sentiment, keywords, and post details can be inspected together, supporting validation decisions that depend on how a message spreads.
The authors demonstrate the system with a real Sina Weibo dataset and a domain expert case study. This gives the design ecological grounding, while also making clear that the evaluation is exploratory rather than a quantitative benchmark.

Method

Derive workflow requirements with platform administrators: The authors worked with three administrators over four months and distilled the task into three needs: exploring space-time distributions, comparing suspected rumor characteristics, and inspecting propagation details.
Collect and prepare a real suspected-rumor dataset: RumorLens is demonstrated on Sina Weibo data containing suspected rumors, retweets, comments, and user profiles collected over roughly one year.
Extract textual, topical, influence, and projection features: The system enriches raw posts using TF-IDF, sentiment recognition, topic classification, influence calculation, and t-SNE projection so suspected rumors can be compared visually.
Support overview-level filtering: A choropleth map and temporal topic chart let administrators narrow the collection by location, topic, and time before examining detailed cases.
Represent cases in a feature projection: The projection view places suspected rumors in a two-dimensional feature map and uses glyphs to encode topic, influence, user fan and followee counts, tweet count, and profile integrity.
Inspect propagation and post details interactively: The propagation view uses a circular layout to show original tweets and multi-level retweets by hierarchy and date, with interactions for user information, full content, and side-by-side comparison.
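
The propagation step above can be illustrated with a small sketch. It is not the authors' implementation. It assumes, for illustration only, that each post stores the id of the post it retweets, that the retweet level selects a concentric ring, and that posting time maps linearly to an angle; the paper's actual encoding may differ. The function names, parameters, and sample ids are hypothetical.

```python
import math
from datetime import datetime


def retweet_depths(parent_of):
    """Map each post id to its depth in the retweet tree (original post = 0)."""
    depths = {}
    for post_id in parent_of:
        # Walk up the chain until reaching a known depth or the original post.
        chain = []
        current = post_id
        while current is not None and current not in depths:
            chain.append(current)
            current = parent_of[current]
        base = -1 if current is None else depths[current]
        # Assign depths from the top of the chain back down to post_id.
        for node in reversed(chain):
            base += 1
            depths[node] = base
    return depths


def polar_position(depth, posted_at, start, end, ring_width=1.0):
    """Assumed encoding: ring chosen by retweet depth, angle chosen by time."""
    span = (end - start).total_seconds()
    fraction = (posted_at - start).total_seconds() / span if span else 0.0
    angle = 2 * math.pi * fraction
    radius = (depth + 1) * ring_width
    return radius * math.cos(angle), radius * math.sin(angle)


# Hypothetical propagation data: post id -> id of the post it retweets.
parent_of = {"t0": None, "r1": "t0", "r2": "r1", "r3": "r2", "r4": "t0"}
depths = retweet_depths(parent_of)

start, end = datetime(2021, 1, 1), datetime(2021, 1, 5)
x, y = polar_position(depths["r1"], datetime(2021, 1, 2), start, end)
```

In this toy data the chain t0, r1, r2, r3 gives a maximum depth of 3, the kind of value an analyst might read as deep propagation. The iterative walk avoids recursion limits on long retweet chains.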

Evaluation and Findings

The evaluation uses a real Sina Weibo suspected-rumor dataset and a case study with one domain expert.

The expert used the overview, projection, and propagation views to select a world-news case, inspect suspicious account and propagation cues, and conclude that the tweet was probably a rumor.

The evaluation is primarily qualitative, so the findings should be read with these caveats in mind:

The evidence comes from a single case study with one experienced administrator, and the paper reports no controlled baseline or comparison.
The study supports the usefulness of the workflow for case reasoning, but it does not measure task time, validation accuracy, or comparative benefit.
The dataset and evaluation are limited to Chinese-language Sina Weibo data, so cross-platform and multilingual generalization are not established.

In the case study, an experienced administrator used RumorLens to progress from broad filtering to instance validation and concluded that a selected tweet was probably a rumor. This supports the system as a qualitative decision aid, but it does not establish measured accuracy or time savings.
The workflow surfaced interpretable cues that mattered to the expert: high influence despite an unauthenticated account and relatively few fans, unusually deep retweet propagation, strong sentiment signals, and comments suggesting the original tweet was misleading.
The authors argue that the circular propagation design combines retweet hierarchy and time-series information better than alternatives such as node-link diagrams, treemaps, spiral timelines, or sunburst charts, especially when compactness is important.
The requirements study shows that administrators wanted tools for prioritizing suspected rumors, comparing feature patterns, and inspecting propagation paths and post contents. RumorLens directly maps its main views to those three needs.
Beyond the evaluation caveats listed earlier, the approach focuses mainly on textual information rather than images or videos, which further bounds the paper’s conclusions.

Applications

This paper is most useful for:

Social media platform administrators and content-moderation analysts who already receive suspected-rumor candidates: The system is designed to help administrators analyze and validate suspected rumors through overview filtering, feature comparison, and propagation inspection.
Visualization and HCI researchers studying misinformation analysis workflows: The paper contributes a coordinated visual analytics system and a new circular propagation design for inspecting information diffusion.
Researchers working with Chinese microblog data or Sina Weibo-style rumor datasets: The dataset and evaluation are explicitly grounded in Sina Weibo suspected-rumor data.

It is less suitable for:

Projects that require a deployable automatic rumor classifier with reported accuracy, precision, or recall: The paper centers on interactive validation and reports a case study, not a classifier benchmark.
Deployments that must already be validated across languages and social platforms: RumorLens is designed for and evaluated on Chinese-language Sina Weibo data, with other languages left for future work.

Limitations

The evaluation is based on one domain-expert case study and does not provide controlled comparisons, task-time measurements, or validation-accuracy metrics.
The system needs richer user-history information, such as historical complaints and prior identified rumors connected to an account.
The paper leaves open how to choose rumor features and evaluate their effects on validation decisions.
The dataset and evaluation are limited to Chinese-language Sina Weibo data, so cross-platform and multilingual generalization are not established.
The approach focuses mainly on text and does not incorporate image information from rumor messages.

Frequently Asked Questions

What is the main idea of RumorLens?

RumorLens is an interactive visual analytics system for suspected rumors on social media. It lets administrators start with spatial and temporal overviews, compare suspected rumors through feature glyphs, and then inspect the detailed propagation of an individual message.
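
As a rough illustration of the overview stage, the sketch below narrows a queue of suspected rumors by location, topic, and date. It is a minimal example under stated assumptions, not the system's code: the SuspectedRumor record, its fields, the filter_overview function, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SuspectedRumor:
    # Hypothetical record; the real dataset schema is not given here.
    post_id: str
    province: str
    topic: str
    posted_on: date


def filter_overview(rumors, province=None, topic=None, start=None, end=None):
    """Keep rumors matching every filter that is set; None means no constraint."""
    return [
        r for r in rumors
        if (province is None or r.province == province)
        and (topic is None or r.topic == topic)
        and (start is None or r.posted_on >= start)
        and (end is None or r.posted_on <= end)
    ]


# Made-up sample queue for illustration only.
queue = [
    SuspectedRumor("p1", "Guangdong", "world news", date(2021, 3, 2)),
    SuspectedRumor("p2", "Guangdong", "health", date(2021, 3, 9)),
    SuspectedRumor("p3", "Beijing", "world news", date(2021, 4, 1)),
]
selected = filter_overview(queue, topic="world news", end=date(2021, 3, 31))
```

Treating each unset filter as "no constraint" mirrors how a map selection, a topic selection, and a time brush could be combined or relaxed independently.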

What problem does the paper try to solve?

The paper targets the gap between slow manual review and imperfect automatic rumor detection. It argues that platform administrators need tools that preserve human judgment while helping them handle many suspected rumors and inspect why a case may be suspicious.

How does the system support rumor validation?

It combines preprocessing methods such as keyword extraction, sentiment recognition, topic classification, influence calculation, and t-SNE projection with coordinated visual views. The most detailed view shows retweet depth, timing, sentiment, keywords, and post details for a selected suspected rumor.
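
To make the keyword-extraction step concrete, here is a minimal TF-IDF sketch in plain Python. It is an assumption-laden illustration rather than the paper's pipeline: it uses the textbook weighting tf × log(N / df), takes pre-tokenized input (real Weibo text would first need Chinese word segmentation), and the function name and sample posts are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each tokenized doc."""
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
        # Sort by descending score, then alphabetically for a stable order.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


# Hypothetical, already tokenized posts.
posts = [
    ["vaccine", "causes", "illness", "share", "now"],
    ["flood", "warning", "share", "now"],
    ["vaccine", "schedule", "update"],
]
keywords = tfidf_keywords(posts, top_k=2)
```

Terms that appear in several posts, such as "share" and "now", receive lower weights than terms unique to one post, which is why they drop out of the top keywords here.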

What evidence does the paper provide?

The authors report four months of requirements work with platform administrators, a real Sina Weibo dataset, and one case study with an experienced domain expert. The case study illustrates how the system can guide a validation process, but it is not a controlled comparison.

What are the main limitations?

The paper does not report quantitative task performance or classifier accuracy for RumorLens. It is evaluated on Chinese-language Sina Weibo data, uses one expert case study, and leaves richer user-history features, feature-effect evaluation, multilingual validation, and image-based rumor evidence for future work.