LLM-Based Detection of Manipulative Political Narratives

Authors: Sinclair Schneider, Florian Steuber, Gabi Dreo Rodosek
Published: 2026-05-14
arXiv: 2605.14354
Source: arXiv (cs.CL)

Abstract

We present a new computational framework for detecting and structuring manipulative political narratives, a task that has become more important due to the shift of political discussions to social media. One of the primary challenges is differentiating between manipulative political narratives and legitimate critiques. Some posts may also reframe actual events within a manipulative context. To achieve good clustering results, we filter manipulative posts beforehand using a detailed few-shot prompt that combines documented campaign narratives with legitimate criticisms to differentiate them. This prompt enables a reasoning model to assign labels, retaining only manipulative narrative posts for further processing. The remaining posts are subsequently embedded and dimensionality-reduced using UMAP, before HDBSCAN is applied to uncover narrative groups. A key advantage of this unsupervised approach is its independence from a predefined list of target categories, enabling it to uncover new narrative clusters. Finally, a reasoning model is employed to uncover the narrative behind each cluster. This approach, applied to over 1.2 million social media posts, effectively identified 41 distinct manipulative narrative clusters.


Why This Work Matters

The paper addresses a practical problem: distinguishing manipulative narratives from legitimate political criticism at scale, without a predefined target list. The unsupervised discovery of 41 narrative clusters from over 1.2 million posts illustrates both the scale at which modern information operations (IO) run and the feasibility of automated pattern extraction that does not depend on human experts pre-labeling every campaign type.

Independence from a predefined list of target categories is the key methodological contribution. It allows the framework to surface novel manipulation campaigns, which tend to evade rule-based and supervised-classification systems precisely because they are new. This is operationally significant for early-warning IO monitoring.

Core Concepts and Contributions

Pipeline architecture: A three-stage framework: (1) few-shot filtering with a reasoning LLM to separate manipulative posts from legitimate ones; (2) embedding of the retained posts, UMAP dimensionality reduction, and unsupervised HDBSCAN clustering; (3) a reasoning LLM that characterizes the narrative behind each cluster. Only stages 1 and 3 use an LLM directly, for classification and narrative synthesis respectively; stage 2 is conventional unsupervised pattern extraction over embeddings.
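
The three stages can be sketched as a small composition of pluggable functions. This is a minimal illustration, not the authors' implementation: `run_pipeline` and its three callable parameters are hypothetical names, and the convention that label -1 marks unclustered noise is borrowed from HDBSCAN.

```python
from typing import Callable, Dict, List, Sequence


def run_pipeline(
    posts: Sequence[str],
    is_manipulative: Callable[[str], bool],   # stage 1 stand-in (LLM filter)
    cluster: Callable[[List[str]], List[int]],  # stage 2 stand-in (embed + UMAP + HDBSCAN)
    describe: Callable[[List[str]], str],     # stage 3 stand-in (LLM narrative summary)
) -> Dict[int, str]:
    """Hypothetical skeleton: filter, cluster, then describe each cluster."""
    # Stage 1: keep only posts the filter flags as manipulative.
    kept = [p for p in posts if is_manipulative(p)]
    if not kept:
        return {}

    # Stage 2: one integer label per kept post; -1 means "noise" (assumed).
    labels = cluster(kept)
    if len(labels) != len(kept):
        raise ValueError("cluster() must return one label per post")

    groups: Dict[int, List[str]] = {}
    for post, label in zip(kept, labels):
        if label == -1:
            continue  # noise points belong to no narrative cluster
        groups.setdefault(label, []).append(post)

    # Stage 3: produce one narrative description per cluster.
    return {label: describe(members) for label, members in sorted(groups.items())}
```

Passing the stages in as callables keeps the sketch independent of any particular LLM client or embedding model, neither of which is specified in the abstract.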

Few-shot prompt design: The filtering prompt deliberately places examples of legitimate political criticism alongside documented campaign narratives, so the model has to attend to intent and framing rather than topic alone. This directly addresses the manipulation-vs-critique ambiguity that defeats naive keyword or topic-based detection.
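
A contrastive few-shot prompt of this kind might be assembled as follows. This is a sketch under stated assumptions: the label names, the prompt wording, and the helpers `build_prompt` and `parse_label` are hypothetical, the example texts are placeholders, and the paper's actual prompt is not reproduced here. It also assumes the model is told to put the label on the final line of its reply.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical label names; the paper's actual label set may differ.
LABELS = ("MANIPULATIVE", "LEGITIMATE")


@dataclass(frozen=True)
class Example:
    text: str
    label: str


def build_prompt(examples: List[Example], post: str) -> str:
    """Assemble a few-shot prompt that mixes both classes of example."""
    for ex in examples:
        if ex.label not in LABELS:
            raise ValueError(f"unknown label: {ex.label}")
    lines = [
        "Classify the post as MANIPULATIVE (pushes a documented campaign "
        "narrative) or LEGITIMATE (ordinary political criticism).",
        "Judge intent and framing, not topic.",
        "Reply with the chosen label alone on the final line.",
        "",
    ]
    for ex in examples:
        lines += [f"Post: {ex.text}", f"Label: {ex.label}", ""]
    lines += [f"Post: {post}", "Label:"]
    return "\n".join(lines)


def parse_label(reply: str) -> str:
    """Read the label from the last non-empty line of the model reply."""
    non_empty = [ln.strip() for ln in reply.splitlines() if ln.strip()]
    if not non_empty or non_empty[-1].upper() not in LABELS:
        raise ValueError("reply does not end with a known label")
    return non_empty[-1].upper()
```

Reading only the final line lets a reasoning model write out its deliberation first without that text being mistaken for the verdict.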

Unsupervised cluster discovery: Applied to the posts retained by the filter (out of more than 1.2 million collected), UMAP followed by HDBSCAN yields 41 clusters without requiring predefined narrative categories. This makes the framework applicable to novel campaigns and to analysts working in new information environments without established campaign taxonomies.
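
The reduce-then-cluster step could look roughly like this, assuming the `umap-learn` and `hdbscan` packages. The parameter values (`n_components=5`, `min_cluster_size=50`, cosine metric, fixed seed) are illustrative guesses, not the paper's settings, and both function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def reduce_and_cluster(embeddings, n_components: int = 5, min_cluster_size: int = 50):
    """Reduce post embeddings with UMAP, then cluster them with HDBSCAN."""
    # Imported lazily so summarize_labels() works without these packages.
    import hdbscan
    import umap

    reduced = umap.UMAP(
        n_components=n_components,  # illustrative value, not from the paper
        metric="cosine",
        random_state=42,
    ).fit_transform(embeddings)
    # HDBSCAN returns one label per row and marks noise points with -1.
    return hdbscan.HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(reduced)


def summarize_labels(labels: Sequence[int]) -> Tuple[Dict[int, int], float]:
    """Return the size of each cluster and the fraction of noise points."""
    labels = [int(label) for label in labels]
    if not labels:
        return {}, 0.0
    sizes = Counter(label for label in labels if label != -1)
    noise_fraction = sum(1 for label in labels if label == -1) / len(labels)
    return dict(sizes), noise_fraction
```

Because HDBSCAN does not take a target cluster count, the number of clusters is an output of the data and the density parameters rather than something chosen in advance; checking the noise fraction is a quick sanity test on those parameters.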

Scale validation: A corpus of over 1.2 million posts is operationally realistic for social media IO monitoring. Running the full pipeline at this volume suggests it can work beyond small lab datasets, although scale alone does not establish detection accuracy.

Connections