Network Analysis Methodology

BLUF

Network Analysis Methodology in OSINT is the systematic application of graph-theory techniques to map relationships between entities (individuals, organizations, infrastructure nodes, communications channels, or financial accounts) in order to identify hidden actors, key connectors, structural vulnerabilities, and attribution pathways that are invisible in linear source-by-source analysis. The methodology originated in sociology as social network analysis (SNA) and is now foundational to intelligence analysis, counterterrorism targeting, financial crime investigation, and information operations attribution. Its primary analytical products are link analysis charts (visual relationship maps), centrality assessments (identifying which nodes are most influential), community detection (identifying clusters of densely interconnected nodes), and path analysis (mapping shortest routes between two entities). The core analytical insight is that relationships are intelligence: who an actor communicates with, funds, travels alongside, or is geographically proximate to often reveals more about intent and capability than any individual node’s publicly available information.


Conceptual Foundations

Graph Theory Basics

A network consists of nodes (entities: people, organizations, online accounts, IP addresses, phone numbers, financial accounts) and edges (relationships: communication, financial transaction, co-occurrence, shared attribute, physical co-location).

Property | Definition | Intelligence application
Directed graph | Edges have direction (A → B ≠ B → A) | Communication direction; money flow; command-and-control hierarchy
Undirected graph | Edges are bidirectional (A - B) | Co-occurrence; shared attribute; geographic proximity
Weighted edge | Edge carries a value (frequency, amount, strength) | Communication frequency; transaction volume; relationship strength
Bipartite graph | Two distinct node types (e.g., persons and organizations) | Membership networks; funding networks; event participation
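
The directed, undirected, and weighted properties above can be illustrated with a minimal sketch. It assumes a plain dictionary-of-dictionaries adjacency structure; the account names, amounts, and the helper names build_directed and to_undirected are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical directed, weighted edges: (source, target, amount).
payments = [
    ("acct_A", "acct_B", 5000.0),
    ("acct_B", "acct_C", 4800.0),
    ("acct_A", "acct_B", 1200.0),  # repeat transfer: weights accumulate
]

def build_directed(edges):
    """Directed weighted graph: adj[src][dst] = summed weight."""
    adj = defaultdict(lambda: defaultdict(float))
    for src, dst, weight in edges:
        adj[src][dst] += weight
        adj[dst]  # touch the target so pure receivers also appear as nodes
    return adj

def to_undirected(adj):
    """Discard direction: A -> B becomes a single A - B relationship."""
    und = defaultdict(lambda: defaultdict(float))
    for src, targets in adj.items():
        und[src]  # keep nodes that have no outgoing edges
        for dst, weight in targets.items():
            und[src][dst] += weight
            und[dst][src] += weight
    return und

directed = build_directed(payments)
undirected = to_undirected(directed)

print(directed["acct_A"]["acct_B"])    # total sent from A to B
print("acct_A" in directed["acct_B"])  # is there a B -> A edge?
print(undirected["acct_B"]["acct_A"])  # same relationship, direction dropped
```

The directed form preserves who paid whom, which is what money-flow and command-and-control questions need; the undirected form is only appropriate when direction carries no meaning, as with co-occurrence.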

Centrality Measures

The four primary centrality measures used in intelligence analysis:

Measure | Definition | Intelligence interpretation
Degree centrality | Number of direct connections | Who has the most connections: potential hub or coordinator
Betweenness centrality | How often a node lies on the shortest path between other nodes | Broker, intermediary, or information chokepoint; high-value for interdiction
Closeness centrality | Inverse of the average shortest-path distance to all other nodes | Who can reach all others fastest: rapid dissemination capability
Eigenvector centrality | Degree weighted by the centrality of connected nodes | Who is connected to influential nodes: prestige or high-value association

Assessment (High): Betweenness centrality is the most operationally significant measure for counterterrorism and financial investigation. Nodes with high betweenness are the brokers that connect otherwise disconnected clusters: removing them has the highest network disruption value, and monitoring them has high collection value because traffic between clusters must pass through them.
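
The following sketch computes three of these measures on a small hypothetical network, using only the standard library. It assumes an undirected, unweighted graph; betweenness follows Brandes' accumulation scheme and is reported as a raw count of node pairs. Eigenvector centrality is omitted for brevity. The node names and the two-cell topology are invented for illustration.

```python
from collections import deque

def undirected_graph(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def degree_centrality(adj):
    # Fraction of the other nodes each node touches directly.
    n = len(adj)
    return {v: len(adj[v]) / (n - 1) for v in adj}

def closeness_centrality(adj):
    # (reachable nodes) / (sum of shortest-path distances), via BFS.
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result

def betweenness_centrality(adj):
    # Brandes' algorithm for unweighted graphs.
    score = {v: 0.0 for v in adj}
    for s in adj:
        order, dist = [], {s: 0}
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}  # number of shortest paths from s
        sigma[s] = 1
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

# Two tight cells (a, b, c) and (e, f, g) joined only through broker d.
adj = undirected_graph([
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "d"), ("d", "e"),
    ("e", "f"), ("e", "g"), ("f", "g"),
])
dc = degree_centrality(adj)
cc = closeness_centrality(adj)
bc = betweenness_centrality(adj)
for node in sorted(adj, key=bc.get, reverse=True):
    print(node, round(dc[node], 2), round(cc[node], 2), bc[node])
```

In this toy network the broker d has only two direct connections, fewer than c or e, yet it sits on every shortest path between the two cells (9 node pairs). Ranking by degree alone would miss it.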


Data Sources for Network Construction

Communication Networks

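  • Social media follower/following graphs: Twitter/X (Academic Research API, discontinued in 2023; alternatives include third-party scrapers such as Apify); Facebook (Platform API, restricted; CrowdTangle, Meta’s analytics tool, was shut down in August 2024, so only previously exported historical data remains usable); Telegram channel membership and forward chains; Instagram follower networks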
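  • Email headers: From, To, CC, Reply-To, and X-Originating-IP fields construct communication networks (BCC recipients are normally visible only in sender-side copies, not in delivered messages); leaked datasets (BlueLeaks, ICIJ-published leaks, various breach datasets) provide historical communication edges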
  • Phone records: Call detail records (CDRs) in disclosed legal proceedings; communication metadata in court filings

Financial Networks

  • Cryptocurrency transaction graphs: Bitcoin’s UTXO model makes transaction chains publicly traceable; Ethereum wallet interaction graphs; tools: Chainalysis (commercial), Elliptic (commercial), OXT (free for Bitcoin), Breadcrumbs (free), Etherscan (Ethereum)
  • Corporate ownership networks: OpenCorporates (global company registration data), ICIJ Offshore Leaks database, national corporate registries; construct beneficial ownership graphs
  • Sanctions networks: OFAC SDN list, EU Consolidated Sanctions List, UN Security Council Consolidated List — cross-reference against corporate and individual networks to identify sanctions exposure

Infrastructure Networks

  • Domain and IP infrastructure: WHOIS registrant email → domain cluster; passive DNS → IP hosting relationships; ASN ownership → hosting provider attribution; Shodan/Censys → co-hosted domains and shared SSL certificates
  • BGP routing: AS path analysis for IP address ownership; RIPEstat (free) for RIPE-region ASN data
  • Email infrastructure: MX record analysis; SPF/DKIM/DMARC configurations → hosting provider attribution
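
The pivots listed above (registrant email to domain cluster, shared IP to co-hosting) reduce to the same operation: index entities by an attribute value and link every pair that shares it. A minimal sketch follows, assuming enrichment records have already been collected into dictionaries. The field names, domains, and addresses are invented placeholders, not output from any real WHOIS or passive DNS service.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical enrichment records (all values invented).
records = [
    {"domain": "news-alpha.example", "registrant_email": "ops@mail.example", "ip": "203.0.113.10"},
    {"domain": "news-beta.example", "registrant_email": "ops@mail.example", "ip": "203.0.113.11"},
    {"domain": "news-gamma.example", "registrant_email": "other@mail.example", "ip": "203.0.113.11"},
    {"domain": "unrelated.example", "registrant_email": "solo@mail.example", "ip": "198.51.100.7"},
]

def shared_attribute_edges(records, attributes):
    """Link every pair of domains that share a value for any listed attribute."""
    index = defaultdict(set)  # (attribute, value) -> domains carrying it
    for rec in records:
        for attr in attributes:
            value = rec.get(attr)
            if value:
                index[(attr, value)].add(rec["domain"])
    edges = defaultdict(set)  # (domain_a, domain_b) -> attributes they share
    for (attr, _value), domains in index.items():
        for a, b in combinations(sorted(domains), 2):
            edges[(a, b)].add(attr)
    return dict(edges)

edges = shared_attribute_edges(records, ["registrant_email", "ip"])
for (a, b), shared in sorted(edges.items()):
    print(a, "<->", b, "via", sorted(shared))
```

Recording which attribute produced each edge matters: a shared IP on commodity hosting is weak evidence on its own, while a pair linked by several independent attributes is a stronger candidate for common control.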

Physical and Geographic Networks

  • Co-location: flight manifests (disclosed court exhibits), hotel registration records (litigation discovery), property records, geotag data from social media
  • Event co-attendance: conference attendee lists, published photographs, academic co-authorship networks

Collection Phase — Building the Entity List

Step 1: Seed Node Identification

Define the analytical question precisely before collection. The seed node(s) are the starting entities from which the network will be constructed. Ambiguous seed node definition produces unfocused graphs that cannot answer the analytical question.

Examples:

  • Criminal network: Known members as seeds; expand via communication and financial edges
  • Disinformation operation: Known amplifier accounts as seeds; expand via retweet/share relationships and account creation proximity
  • Corporate beneficial ownership: Named company as seed; expand via director, shareholder, and subsidiary relationships

Step 2: Structured Edge Collection

For each seed node, systematically collect all available relationships. Record in a structured edge list format:

Source_Node | Relationship_Type | Target_Node | Weight | Date | Source_Evidence | Confidence

Maintain source evidence for every edge — unsourced edges must be labeled Unverified and should not drive analytical conclusions.
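
A minimal sketch of how the edge list format and the Unverified rule could be enforced at ingest. It assumes one pipe-delimited record per line in the field order shown above; the EdgeRecord class, the parse_edge and reportable helpers, and the sample rows are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

FIELDS = ["source", "relationship", "target", "weight", "date", "evidence", "confidence"]

@dataclass
class EdgeRecord:
    source: str
    relationship: str
    target: str
    weight: float
    date: str
    evidence: Optional[str]
    confidence: str

def parse_edge(line: str) -> EdgeRecord:
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    row = dict(zip(FIELDS, parts))
    evidence = row["evidence"] or None
    # An edge without source evidence is forced to Unverified,
    # whatever confidence the collector typed in.
    confidence = row["confidence"] if evidence else "Unverified"
    weight = float(row["weight"]) if row["weight"] else 1.0
    return EdgeRecord(row["source"], row["relationship"], row["target"],
                      weight, row["date"], evidence, confidence)

def reportable(edges):
    """Edges allowed to drive analytical conclusions."""
    return [e for e in edges if e.confidence != "Unverified"]

# Hypothetical rows; the second has an empty Source_Evidence field.
lines = [
    "Person_A | called | Person_B | 14 | 2021-03-02 | court_exhibit_12 | High",
    "Person_B | funded | Org_C | 2500 | 2021-04-10 |  | Medium",
]
edges = [parse_edge(line) for line in lines]
for edge in edges:
    print(edge.source, edge.relationship, edge.target, edge.confidence)
print(len(reportable(edges)), "edge(s) usable for conclusions")
```

Downgrading at parse time, rather than relying on analysts to filter later, keeps unsourced edges visible in the graph while preventing them from silently feeding centrality or path results.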

Step 3: Entity Resolution

Resolve aliases, pseudonyms, and spelling variants to canonical identities before building the graph. A single individual appearing under three usernames as three separate nodes produces a fragmented, misleading network. Entity resolution techniques:

  • Username clustering: same username across platforms (Sherlock, WhatsMyName, Maigret)
  • Profile image clustering: reverse image search across platforms
  • Writing style analysis: idiolect comparison for text-based personas
  • Metadata linking: shared device fingerprints, IP addresses, posting time signatures

See Entity Resolution Methodology for the full protocol.
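
Once alias pairs have been established by any of the techniques above, merging them into canonical identities is a disjoint-set (union-find) problem. The sketch below is one possible way to do that, not the referenced protocol; the AliasResolver class, the usernames, and the rule of keeping the lexicographically smallest alias as the canonical label are all assumptions made for illustration.

```python
class AliasResolver:
    """Union-find over alias strings."""

    def __init__(self):
        self.parent = {}

    def find(self, name):
        self.parent.setdefault(name, name)
        while self.parent[name] != name:
            # Path halving keeps lookups short on long alias chains.
            self.parent[name] = self.parent[self.parent[name]]
            name = self.parent[name]
        return name

    def link(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            keep, drop = sorted((ra, rb))  # smallest label becomes canonical
            self.parent[drop] = keep

def resolve_edges(edges, resolver):
    """Rewrite edges onto canonical identities, dropping self-loops and duplicates."""
    resolved = set()
    for a, b in edges:
        ca, cb = resolver.find(a), resolver.find(b)
        if ca != cb:
            resolved.add((ca, cb))
    return resolved

resolver = AliasResolver()
# Hypothetical alias evidence, e.g. a username match and a profile-image match.
resolver.link("j_doe_84", "jdoe_trades")
resolver.link("jdoe_trades", "John D.")

raw_edges = [
    ("j_doe_84", "Org_X"),
    ("jdoe_trades", "Org_Y"),
    ("John D.", "Org_X"),
    ("j_doe_84", "John D."),  # alias-to-alias edge collapses to a self-loop
]
resolved = resolve_edges(raw_edges, resolver)
print(sorted(resolved))
```

Three apparent actors with one or two connections each become a single node connected to both organizations, which is the difference between a fragmented graph and one that shows the actual bridge.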


Analysis Phase — Graph Construction and Interpretation

Visualization Tools

Tool | Type | Strengths | Access
Gephi | Desktop, open-source | Force-directed layouts; centrality calculation; community detection; large graphs (100k+ nodes) | Free
Maltego | Desktop, commercial | Automated OSINT transforms; real-time entity expansion; integrated data sources | Freemium; Community edition free
Neo4j Browser | Database + visualization | Cypher query language; persistent graph database; complex path queries | Free (self-hosted)
Graphistry | Web, GPU-accelerated | Large-scale visual analytics; point-and-click exploration | Cloud (paid) / self-hosted
NodeXL | Excel add-in | Low barrier to entry; basic SNA; Twitter API integration (API tier dependent) | Free (academic)
Palantir Gotham | Enterprise | Government-grade; classified and unclassified data fusion | Commercial (government-contracted)

Layout Algorithms

  • Force-directed (Fruchterman-Reingold, ForceAtlas2): Nodes repel; connected nodes attract; produces organic cluster visualization. Default for most intelligence network analysis. Gephi’s ForceAtlas2 is the standard.
  • Circular layout: All nodes on a circle; edges visible as chords. Useful for bipartite networks (e.g., persons on one semicircle, organizations on the other).
  • Hierarchical layout: Nodes arranged in ranked tiers that follow edge direction. Best for command-and-control or organizational hierarchy analysis.

Community Detection Algorithms

  • Louvain method: Maximizes modularity; identifies clusters with denser internal than external connections. Standard for detecting organizational subgroups within large networks.
  • Girvan-Newman: Iteratively removes edges with highest betweenness; reveals community structure. Computationally expensive for large networks.
  • Label propagation: Fast; useful for first-pass community identification on large graphs.

Intelligence application: Community detection identifies operational cells (tight clusters with few external connections), support networks (clusters with high external connectivity), and isolated individuals (nodes that fall outside every multi-member cluster; potential cut-outs or unwitting participants).
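
Modularity, the quantity the Louvain method maximizes, can be computed directly for any proposed grouping. The sketch below does not implement Louvain itself; it only scores candidate partitions of a small hypothetical network so the "denser internal than external connections" criterion is concrete. It assumes an undirected, unweighted graph and uses the standard form Q = Σ_c [ L_c / m − (d_c / 2m)² ], where L_c is the number of edges inside community c, d_c is the sum of its node degrees, and m is the total edge count.

```python
def modularity(edges, communities):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    membership = {node: idx for idx, group in enumerate(communities) for node in group}
    internal = [0] * len(communities)    # L_c: edges with both ends in c
    degree_sum = [0] * len(communities)  # d_c: total degree of nodes in c
    for a, b in edges:
        degree_sum[membership[a]] += 1
        degree_sum[membership[b]] += 1
        if membership[a] == membership[b]:
            internal[membership[a]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in range(len(communities)))

# Hypothetical network: two triangles joined by a single c-d edge.
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("d", "e"), ("d", "f"), ("e", "f"),
    ("c", "d"),
]
q_cells = modularity(edges, [{"a", "b", "c"}, {"d", "e", "f"}])
q_single = modularity(edges, [{"a", "b", "c", "d", "e", "f"}])
q_mixed = modularity(edges, [{"a", "b", "d"}, {"c", "e", "f"}])

print("two cells:", round(q_cells, 3))
print("one group:", round(q_single, 3))
print("bad split:", round(q_mixed, 3))
```

The two-cell partition scores highest, the single-group partition scores zero, and a split that cuts across the triangles scores below zero. A detection algorithm's output can be sanity-checked the same way against an analyst's own hypothesis about cell boundaries.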


Analytical Products

Link Analysis Chart

The primary visual product. Elements:

  • Node shape/color: encode entity type (person = circle, organization = square, domain = diamond, financial account = triangle)
  • Edge thickness: encode relationship weight (transaction volume, communication frequency)
  • Edge color: encode relationship type (communication = blue, financial = green, co-location = orange, shared attribute = gray)
  • Node size: encode centrality measure (eigenvector or betweenness — the most analytically significant)
  • Date annotation: for temporal networks, timestamp edges and animate or layer the graph chronologically
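
The encoding scheme above can be applied mechanically when exporting a chart. The sketch below emits Graphviz DOT text as one possible target format; this is an assumption, since the tools listed in this document use their own styling interfaces. The shape, width, color, penwidth, and label attributes are standard DOT attributes; the to_dot helper, the entity names, and the scaling constants are hypothetical.

```python
# Entity type -> node shape, relationship type -> edge color (from the scheme above).
SHAPES = {"person": "circle", "organization": "square",
          "domain": "diamond", "account": "triangle"}
COLORS = {"communication": "blue", "financial": "green",
          "co-location": "orange", "shared attribute": "gray"}

def to_dot(nodes, edges, max_weight):
    """nodes: name -> (entity_type, centrality in 0..1);
    edges: (source, target, relationship_type, weight, date)."""
    lines = ["graph link_chart {"]
    for name, (kind, centrality) in sorted(nodes.items()):
        width = 0.5 + 1.5 * centrality          # node size encodes centrality
        lines.append(f'  "{name}" [shape={SHAPES[kind]}, width={width:.2f}];')
    for src, dst, rel, weight, date in edges:
        pen = 1 + 4 * weight / max_weight       # thickness encodes weight
        lines.append(f'  "{src}" -- "{dst}" '
                     f'[color={COLORS[rel]}, penwidth={pen:.1f}, label="{date}"];')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical two-node chart.
nodes = {"Person_A": ("person", 0.8), "Org_B": ("organization", 0.2)}
edges = [("Person_A", "Org_B", "financial", 50, "2021-04-10")]
dot = to_dot(nodes, edges, max_weight=100)
print(dot)
```

Keeping the mapping in two lookup tables means every chart produced for an investigation uses the same visual vocabulary, so readers do not have to relearn the legend per product.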

Centrality Report

Rank all nodes by each centrality measure; flag the top 5 by betweenness centrality for priority collection tasking. The betweenness-centrality top list is the primary output for interdiction planning or priority HUMINT targeting.

Path Analysis Report

For any two nodes of interest: shortest path analysis (what is the minimum number of edges connecting them?); all paths analysis (enumerate all routes between them). Path analysis drives association analysis in law enforcement and intelligence contexts.
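
Both queries can be sketched with the standard library: breadth-first search for the shortest path, and a depth-limited search for enumerating routes. The example assumes an undirected, unweighted graph and adds a max_edges cutoff, because enumerating every path in a dense network grows combinatorially. The node labels and the cutoff value are hypothetical.

```python
from collections import deque

def shortest_path(adj, start, goal):
    """Breadth-first search; returns one minimum-edge path or None."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in sorted(adj[node]):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None

def all_simple_paths(adj, start, goal, max_edges):
    """Every path without repeated nodes, up to max_edges long."""
    paths = []

    def extend(path):
        node = path[-1]
        if node == goal:
            paths.append(list(path))
            return
        if len(path) - 1 >= max_edges:
            return
        for nbr in sorted(adj[node]):
            if nbr not in path:
                path.append(nbr)
                extend(path)
                path.pop()

    extend([start])
    return paths

# Hypothetical association network.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "E"},
    "D": {"B", "E"},
    "E": {"C", "D"},
}
best = shortest_path(adj, "A", "D")
routes = all_simple_paths(adj, "A", "D", max_edges=3)
print("shortest:", best)
for route in routes:
    print("route:", " - ".join(route))
```

Here A reaches D in two steps through B, and two further three-step routes exist. Alternative routes matter for interdiction: severing the A-B-D path alone would not separate the two entities.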


Temporal Network Analysis

Static network analysis (all edges treated as simultaneous) misses the sequential development of relationships. Temporal analysis:

  • Edge timestamping: record the date each relationship was established
  • Sliding window analysis: analyze the network at T, T+30 days, T+60 days to observe emergence and dissolution of connections
  • Event-driven analysis: map relationship changes against known real-world events (arrests, operations, publications) to identify reaction patterns and candidate causal links

Intelligence application: In information operations analysis, temporal network analysis distinguishes a pre-existing coordinated network (edges pre-date the target event) from spontaneous organic amplification (edges emerge during or after the event). This is a primary discriminator for coordinated inauthentic behavior (CIB) attribution.
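
A minimal sketch of the windowing and pre-event checks described above. It assumes each edge carries the date it was first observed, uses consecutive 30-day windows starting at T, and reports the share of edges that pre-date a reference event. The account names, dates, and helper names are hypothetical, and any threshold for calling a network "pre-existing" is an analytical judgment this sketch does not make.

```python
from datetime import date, timedelta

# Hypothetical timestamped edges: (source, target, date first observed).
edges = [
    ("acct_1", "acct_2", date(2023, 1, 5)),
    ("acct_2", "acct_3", date(2023, 1, 20)),
    ("acct_1", "acct_3", date(2023, 2, 25)),
    ("acct_4", "acct_1", date(2023, 3, 2)),
    ("acct_5", "acct_1", date(2023, 3, 3)),
]

def snapshot(edges, start, end):
    """Edges established in the half-open window [start, end)."""
    return [(a, b) for a, b, when in edges if start <= when < end]

def window_snapshots(edges, start, window_days, steps):
    """Consecutive windows: T, T + window, T + 2 * window, ..."""
    size = timedelta(days=window_days)
    return [(start + i * size,
             snapshot(edges, start + i * size, start + (i + 1) * size))
            for i in range(steps)]

def preexisting_share(edges, event_date):
    """Fraction of edges established strictly before the event."""
    if not edges:
        return 0.0
    return sum(1 for _, _, when in edges if when < event_date) / len(edges)

windows = window_snapshots(edges, date(2023, 1, 1), window_days=30, steps=3)
counts = [len(found) for _, found in windows]
share = preexisting_share(edges, date(2023, 3, 1))

for begin, found in windows:
    print(begin.isoformat(), len(found), "new edge(s)")
print("share of edges pre-dating the event:", share)
```

In this invented example most edges exist before the event date, which is the pattern expected of a standing network; a graph whose edges cluster in the final window would instead suggest amplification that formed in reaction to the event.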


Counter-Network Analysis

For targeting analysis (law enforcement, military intelligence): identify which nodes, if removed, maximally disrupt network function:

  1. High betweenness nodes: removal severs or lengthens the largest number of shortest paths between node pairs
  2. Articulation points: nodes whose removal splits the network into disconnected components
  3. High-degree hubs: removal reduces connectivity most broadly

Assessment (High): Counter-network strategy should prioritize articulation points over high-degree hubs when the goal is network fragmentation rather than disruption. Removing an articulation point splits the network into disconnected subgraphs; removing a hub that is not an articulation point reduces connectivity but leaves the network in one piece. The distinction matters for both kinetic targeting and legal interdiction sequencing.
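
The difference between a hub and an articulation point can be checked directly. The sketch below finds articulation points with the standard depth-first low-link method and counts the connected components left after removing a node. It assumes an undirected simple graph; the topology (a densely wired cell around hub H, linked through courier X to a second cell) is invented for illustration.

```python
def undirected_graph(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def articulation_points(adj):
    """Nodes whose removal disconnects the graph (DFS low-link method)."""
    disc, low, cut = {}, {}, set()
    counter = [0]

    def visit(node, parent):
        disc[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for nbr in adj[node]:
            if nbr not in disc:
                children += 1
                visit(nbr, node)
                low[node] = min(low[node], low[nbr])
                # No back edge from nbr's subtree climbs above node.
                if parent is not None and low[nbr] >= disc[node]:
                    cut.add(node)
            elif nbr != parent:
                low[node] = min(low[node], disc[nbr])
        if parent is None and children > 1:
            cut.add(node)

    for node in adj:
        if node not in disc:
            visit(node, None)
    return cut

def components_without(adj, removed):
    """Connected components remaining after deleting one node."""
    seen, count = {removed}, 0
    for start in adj:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
    return count

adj = undirected_graph([
    ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"),       # ring
    ("H", "a"), ("H", "b"), ("H", "c"), ("H", "d"), ("H", "e"),       # hub
    ("d", "X"), ("X", "p"),                                           # courier link
    ("p", "q"), ("q", "r"), ("r", "p"),                               # second cell
])
hub = max(adj, key=lambda n: len(adj[n]))
cuts = articulation_points(adj)

print("highest-degree node:", hub, "articulation point?", hub in cuts)
print("articulation points:", sorted(cuts))
print("components after removing hub:", components_without(adj, hub))
print("components after removing X:", components_without(adj, "X"))
```

H has the most connections but its removal leaves a single connected network, because the ring around it carries the traffic. X has only two connections, yet removing it separates the two cells.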


Limitations and Failure Modes

Failure mode | Description | Mitigation
Selection bias | Seed node selection determines what the graph can reveal | Define analytical question before seeding; use multiple independent seeds
Missing data | Encrypted or dark-channel communications produce incomplete edge lists | Label graph as partial; explicitly state collection gaps
False edges | Two entities sharing an attribute (same ISP, same common name) may not be related | Require multiple independent evidence streams per edge for High-confidence links
Entity resolution failure | Same entity appearing as multiple nodes fragments the network | Systematic resolution protocol before graph construction
Temporal conflation | Edges from different time periods treated as simultaneous | Always timestamp edges; use temporal analysis for time-sensitive investigations

Case Studies

Case Study 1: ICIJ Panama Papers — Beneficial Ownership Network (2016)

The International Consortium of Investigative Journalists applied network analysis to 11.5 million documents from Mossack Fonseca, constructing a global beneficial ownership graph. The graph revealed hidden connections between political figures, sanctioned individuals, and offshore shell company structures that were undetectable from linear document review. Key finding: betweenness-centrality analysis identified intermediary law firms and formation agents as the critical brokers connecting clients to shell structures — a counter-network insight that regulatory reform subsequently targeted (beneficial ownership registries, introduced across EU member states 2017–2020, directly respond to this structural analysis).

Case Study 2: EU DisinfoLab Indian Chronicles Network (2019–2020)

EU DisinfoLab used network analysis to map a coordinated network of fake NGOs, fake news outlets, and resurrected defunct think tanks that had been amplifying Indian government positions in UN human rights forums for 15 years. Its report attributed the network to a single operator, the Srivastava Group, an Indian holding company. Network analysis identified shared website infrastructure (WHOIS registrant clustering), recycled content relationships (near-identical articles across nominally independent outlets), and domain-creation date clusters as the structural evidence for coordination. The network contained over 750 fake media outlets spanning 116 countries and more than 10 NGOs. It was invisible to linear source-by-source verification but structurally obvious in graph form.


Key Connections

Methodological dependencies:

  • Entity Resolution Methodology: prerequisite; entities must be resolved before graph construction
  • Disinformation Detection Methodology: Layer 2 (network pattern analysis) applies this methodology
  • Link Analysis: the specific analytical framework for structured intelligence products
  • Social Media Intelligence: primary data source for social network construction
  • Financial Intelligence: financial network construction methodology

Tools:

  • Maltego Guide: primary OSINT network expansion tool
  • Shodan-Censys Guide: infrastructure network data source
  • Crypto Tracing Tools Guide: financial network construction for cryptocurrency

Institutional applications:

  • Attribution: network analysis as attribution methodology
  • Active Investigations: apply to all open investigations requiring relationship mapping