POI Profiling Methodology
BLUF
Person of Interest (POI) profiling is the systematic, multi-source construction of a comprehensive intelligence profile of a named subject from open-source material. The output is a structured, source-anchored document that integrates biographical, financial, relational, behavioral, digital, and reputational dimensions into a single analytic artifact, with each constituent claim traceable to a primary or corroborated secondary source and tagged with an explicit confidence rating. POI profiling is a synthesis discipline rather than a collection discipline: it does not invent a new collection technique but orchestrates the disciplined application of biographical research, corporate-registry analysis, network mapping, digital footprint sweeping, and behavioral observation against a single, identified subject.
POI profiling is distinct from three adjacent activities with which it is frequently confused. Entity Resolution (see Entity Resolution Methodology) is the upstream disambiguation step that confirms whether records, accounts, or mentions refer to the same individual; it is a prerequisite to profiling, not a substitute for it. Pattern of Life Analysis (see Pattern of Life Analysis) is a downstream specialization focused on temporal-behavioral patterns (where the subject is, when, and doing what), typically used in operational or protective contexts. Target dossiers in the classified-intelligence sense may integrate signals intelligence, human sources, and other restricted material; POI profiling as practiced in OSINT contexts uses only lawfully obtainable open sources and is governed by the constraints set out under Ethical and Legal Constraints below.
Primary use cases include investigative journalism (exposing undisclosed financial interests, conflicts of interest, or hidden relationships of public figures); corporate due diligence (counterparty background investigation for M&A, hiring, partnership decisions); accountability OSINT (identification and documentation of human rights violators, war crimes suspects, and sanctions evaders); law enforcement support (lead development from open sources, never as a substitute for formal legal process); and security vetting (insider threat and pre-employment background where lawfully scoped).
The Profile Architecture — What a Complete POI Profile Contains
A defensible POI profile addresses six dimensions. Each is independently sourced and independently rated for confidence.
| Layer | Contents |
|---|---|
| Biographical | Full legal name in all variants and scripts (incl. patronymics, transliterations); date and place of birth; nationalities (incl. dual/multiple); educational history (institutions, degrees, dates); professional history; military or government service; family background |
| Financial | Declared wealth and income sources; property holdings; corporate directorships and shareholdings; beneficial ownership of opaque vehicles; salary, fees, and consultancy income; undisclosed assets identified via leaks or registries; cryptocurrency holdings; luxury assets (aircraft, yachts, real estate, art) |
| Network | Family ties; close personal associates; business partners and co-investors; legal and financial advisors; political connections; intelligence and government relationships; organized crime links where documented; known proxies and nominees |
| Behavioral | Typical daily and weekly patterns derivable from open data; travel patterns; public appearance schedule; media interaction style (combative, evasive, prolific); litigation behavior (SLAPP history, settlement patterns); reaction profile to public scrutiny |
| Digital | Social media presence across all platforms, including non-obvious (e.g., Strava, Goodreads, niche forums); email addresses and historical addresses; usernames and handle history; domain registrations; device signatures where detectable; preferred communication platforms |
| Reputation and Narrative | Subject’s self-constructed public narrative vs. externally documented record; contradictions between stated biography and documented history; arc of media coverage over time; prior investigations, regulatory findings, or adverse rulings |
The architecture is deliberately exhaustive. In any given profile, several layers will return thin or null results — those gaps are themselves analytically informative and must be documented rather than silently omitted.
The Six-Phase Methodology
Phase 1 — Subject Identification and Seed Collection
Profiling begins only after the subject’s identity is operationally fixed. The analyst is not investigating “a person named Ivan Petrov” but “this specific Ivan Petrov, with these anchoring attributes, distinguishable from the other 4,200 Ivan Petrovs in open records.”
Seed attribute set (minimum):
- Full legal name (and known variants)
- Jurisdiction of primary residence or activity
- Approximate date of birth (±2 years)
- Known current or recent employer/role
- At least one verified contact vector: a confirmed social media account, an email address tied to verified activity, or a known residential or business address
Each seed attribute is logged with its source and confidence (primary source vs. secondary report). If seed identity is uncertain — multiple plausible candidates, common name in target jurisdiction, transliteration ambiguity — the analyst halts profiling and runs the disambiguation procedure in Entity Resolution Methodology first. Assessment: most documented profiling errors trace to a failed or skipped Phase 1, not to errors in later collection.
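The seed check above can be sketched as a small completeness gate. This is a minimal illustration, not a prescribed schema: the field names in `REQUIRED_SEEDS`, the `SeedAttribute` class, and the `candidate_count` parameter are hypothetical, and all values are fictional.

```python
from dataclasses import dataclass

# Hypothetical field names mirroring the minimum seed attribute set above.
REQUIRED_SEEDS = {
    "legal_name", "jurisdiction", "birth_year",
    "employer_or_role", "contact_vector",
}

@dataclass(frozen=True)
class SeedAttribute:
    name: str         # one of REQUIRED_SEEDS
    value: str
    source: str       # where the value was found
    source_type: str  # "primary" or "secondary"

def missing_seeds(seeds):
    """Return the required seed attributes that have not been logged yet."""
    logged = {s.name for s in seeds}
    return sorted(REQUIRED_SEEDS - logged)

def ready_for_profiling(seeds, candidate_count):
    """Proceed only if the seed set is complete and exactly one candidate fits it."""
    return not missing_seeds(seeds) and candidate_count == 1

# Fictional example: three of five seeds logged, three plausible candidates.
seeds = [
    SeedAttribute("legal_name", "Example Person", "registry extract (fictional)", "primary"),
    SeedAttribute("jurisdiction", "Exampleland", "registry extract (fictional)", "primary"),
    SeedAttribute("birth_year", "1970 +/- 2", "press profile (fictional)", "secondary"),
]
print(missing_seeds(seeds))
print(ready_for_profiling(seeds, candidate_count=3))
```

When the gate fails, the analyst returns to entity resolution instead of proceeding to Phase 2.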
Phase 2 — Biographical Layer Collection
- Academic records: university websites (faculty pages, alumni directories), thesis and dissertation databases (ProQuest, DART-Europe, national equivalents), academic publications (ORCID, Google Scholar, Scopus, ResearchGate)
- Professional history: LinkedIn current profile plus Wayback Machine cached versions (job histories are frequently sanitized retroactively); corporate-registry director records (Companies House, OpenCorporates, national registries); professional licensing bodies (bar associations, medical boards, regulatory registers); published CVs on personal sites, conference bios, and university faculty pages
- Government/military service: official appointment records, parliamentary and congressional registers, military honor and personnel records (where declassified), gazettes and official journals, diplomatic lists
- Media archive sweep: Factiva, LexisNexis, ProQuest News & Newspapers; Google News across all name variants and transliterations; regional and language-specific archives (e.g., Integrum for Russian-language press); historical newspaper archives (Newspapers.com, British Newspaper Archive)
- Court records: civil and criminal litigation history via PACER (US federal), state court portals, BAILII (UK), CourtListener; divorce and probate records, which are frequently wealth-revealing; corporate liquidation and bankruptcy proceedings
Phase 3 — Financial Layer Collection
- Property: national land registries (HM Land Registry UK, county assessor databases US, equivalents elsewhere); offshore property registers where available (Dubai REST, scraped datasets from OCCRP); leaked datasets curated by ICIJ
- Corporate interests: Companies House (UK), SEC EDGAR (US), OpenCorporates (cross-jurisdictional aggregator), national business registries, ICIJ Offshore Leaks Database for beneficial-ownership exposure (Panama Papers, Pandora Papers, Paradise Papers, Cyprus Confidential)
- Political donations and lobbying: FEC (US), Electoral Commission (UK), EU Transparency Register, OpenSecrets, national equivalents
- Luxury assets: aircraft registration via FAA Registry, ICAO databases, EASA records, with movement tracking via ADS-B Exchange (which does not honor blocking requests); yacht registration via Lloyd’s Register and US Coast Guard documentation, with movement tracking via MarineTraffic and VesselFinder; vehicle registry where public
- Cryptocurrency: where wallet addresses are identifiable (via past disclosures, exchange leaks, or attribution research), conduct on-chain clustering using Chainalysis Reactor (commercial), Arkham Intelligence, Etherscan/Blockchain.com, and Breadcrumbs. See Crypto Tracing Tools Guide.
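To make "on-chain clustering" concrete, the sketch below applies one widely described heuristic, common-input ownership: addresses spent together as inputs to the same transaction are assumed to share a controller. This is an assumption chosen for illustration; it is not a description of how any of the named tools work internally, and it produces false merges in practice (for example with collaborative transactions). The transaction data is fictional.

```python
def cluster_addresses(transactions):
    """Group addresses that appear together as inputs to the same transaction.

    transactions: list of input-address lists, one list per transaction.
    Returns a sorted list of sorted address clusters (union-find under the hood).
    """
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving keeps trees shallow
            a = parent[a]
        return a

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for inputs in transactions:
        for addr in inputs:
            find(addr)                # register single-input addresses too
        for addr in inputs[1:]:
            union(inputs[0], addr)    # co-spent inputs join one cluster

    clusters = {}
    for addr in parent:
        clusters.setdefault(find(addr), set()).add(addr)
    return sorted(sorted(c) for c in clusters.values())

# Fictional input sets: A and B co-spend, B and C co-spend, D is alone.
transactions = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
print(cluster_addresses(transactions))
```

A cluster produced this way is a lead for attribution research, not an attribution in itself, and should carry a Low confidence rating until corroborated.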
Phase 4 — Network Layer Mapping
- Family: public birth and marriage records, genealogy databases where lawfully accessible, obituaries (frequently the richest single source of family-network data)
- Business associates: shared corporate directorships (OpenCorporates network views), co-signatories on filings, co-investors in disclosed deals, regulatory co-filings
- Political connections: donor records, lobbying disclosures, photographed appearances at political events, attendance at fundraisers and conferences, social media endorsements and tagged interactions
- Intelligence and security connections: former agency roles, advisory or board membership on security firms and defense contractors, think-tank affiliations, security-cleared-contractor histories
- Visualization: construct the network graph in Maltego (see Maltego Guide), programmatically in NetworkX, or interactively in Gephi; compute betweenness centrality and eigenvector centrality to identify intermediaries and brokers who may themselves merit profiling; see Network Analysis Methodology and Link Analysis
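The broker-identification step can be illustrated with a dependency-free sketch of betweenness centrality using Brandes' algorithm on an unweighted, undirected graph. In practice NetworkX exposes `betweenness_centrality` and `eigenvector_centrality` for this; the standalone version here only shows what the first metric measures, and it does not cover eigenvector centrality. The helper names and the toy graph are hypothetical.

```python
from collections import deque

def build_graph(edges):
    """Turn an undirected edge list into {node: set(neighbours)}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def betweenness(graph):
    """Unnormalised betweenness centrality (Brandes) for an unweighted graph."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                      # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                      # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}

# Fictional network: two tight groups joined only through node "X".
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "X"),
         ("X", "D"), ("D", "E"), ("D", "F"), ("E", "F")]
scores = betweenness(build_graph(edges))
print(max(scores, key=scores.get))
```

In the toy graph the node that bridges the two groups scores highest, which is the pattern that flags a potential intermediary for further review.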
Phase 5 — Behavioral and Digital Layer Collection
- Social media sweep: run Sherlock and Maigret across the subject’s known and probable usernames; manually verify candidate matches (automated tools produce false positives); archive every located account with Hunchly or equivalent before the subject can sanitize it; extract posting time distributions (revealing time zone and routine), linguistic style (forensic-linguistics indicators), and topic distributions
- Travel patterns: Instagram geotags, historical Twitter/X location metadata where preserved, Strava activity tracks (a recurring operational-security failure for political and military figures), ADS-B Exchange tail-number history for private aviation, MarineTraffic for yacht movements
- Public appearance patterns: conference and event speaker lists, press conference schedules, court appearance dates, parliamentary or board attendance records
- Digital infrastructure: WHOIS and historical WHOIS (DomainTools, SecurityTrails) on domains associated with the subject; passive DNS for infrastructure overlap; crt.sh for TLS certificate transparency history; Shodan and Censys for exposed services tied to subject infrastructure (see Shodan-Censys Guide)
- Counter-OSINT indicators: anomalous gaps in the digital footprint, accounts with implausibly clean histories, legend inconsistencies (claimed biographies that do not survive routine verification) — see Counter-OSINT Methodology. Assessment: a too-clean digital footprint on a person whose public role would generate routine traces is itself a strong indicator warranting deeper investigation.
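The posting-time distribution mentioned in the social media sweep can be sketched with the standard library. The function names and the six-hour window width are assumptions made for illustration, and the timestamps are fictional. A quiet window is only a weak indicator of time zone or routine, since scheduled posts and shared accounts distort it.

```python
from collections import Counter
from datetime import datetime, timezone

def hour_histogram(timestamps):
    """Count posts per UTC hour from ISO 8601 timestamps with explicit offsets."""
    hours = Counter()
    for ts in timestamps:
        dt = datetime.fromisoformat(ts).astimezone(timezone.utc)
        hours[dt.hour] += 1
    return hours

def quietest_window(hours, width=6):
    """Return the starting UTC hour of the contiguous window with the fewest posts.

    The window wraps around midnight; ties resolve to the earliest start hour.
    """
    def total(start):
        return sum(hours.get((start + i) % 24, 0) for i in range(width))
    return min(range(24), key=total)

# Fictional timestamps; the second one is 09:05 UTC once the offset is applied.
timestamps = [
    "2024-03-01T09:15:00+00:00",
    "2024-03-02T11:05:00+02:00",
    "2024-03-02T18:40:00+00:00",
    "2024-03-03T20:10:00+00:00",
    "2024-03-04T07:55:00+00:00",
]
hist = hour_histogram(timestamps)
print(sorted(hist.items()))
print(quietest_window(hist))
```

With a realistic sample size, the result should be logged as an analytical inference (Low confidence) unless independent evidence supports the same time zone.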
Phase 6 — Synthesis, Contradiction Analysis, and Profile Finalization
- Cross-check every collected attribute against every other. Contradictions are more analytically informative than confirmations; a single contradicted claim can collapse an entire claimed identity.
- Verify claimed biography against documented record at the level of dates, institutions, and roles. A CV claim of “MBA, Harvard, 1998” is testable against alumni records, thesis databases, and yearbook archives.
- Apply per-element confidence ratings: High (primary source verified, independently corroborated), Medium (corroborated secondary or single primary), Low (single secondary source or analytical inference).
- Document every source: URL, archive snapshot link (Wayback, archive.today), retrieval date, language, and confidence.
- Identify gaps explicitly: what specific evidence, if obtained, would materially change the profile? Naming the gap converts an unknown unknown into a known unknown and feeds the next collection cycle.
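The confidence scale and the contradiction cross-check above can be expressed as two small functions. This is one possible reading of the High/Medium/Low definitions, assuming the counts refer to sources that are genuinely independent of each other; the function names and the claim tuples are hypothetical, and the example data is fictional.

```python
def rate_confidence(primary_sources, secondary_sources):
    """Map counts of independent sources to the High/Medium/Low scale."""
    total = primary_sources + secondary_sources
    if primary_sources >= 1 and total >= 2:
        return "High"    # primary source, independently corroborated
    if primary_sources == 1 or secondary_sources >= 2:
        return "Medium"  # single primary, or corroborated secondary
    return "Low"         # single secondary source or analytical inference

def find_contradictions(claims):
    """Return attributes for which sources report more than one distinct value.

    claims: iterable of (attribute, value, source) tuples.
    """
    seen = {}
    for attribute, value, source in claims:
        seen.setdefault(attribute, {}).setdefault(value, []).append(source)
    return {attr: values for attr, values in seen.items() if len(values) > 1}

# Fictional claims: the published CV and the alumni record disagree on a date.
claims = [
    ("degree_year", "1998", "published CV (fictional)"),
    ("degree_year", "2001", "alumni directory (fictional)"),
    ("employer", "Example Corp", "registry filing (fictional)"),
]
print(rate_confidence(primary_sources=1, secondary_sources=1))
print(find_contradictions(claims))
```

The functions only surface candidates. Deciding whether a flagged discrepancy is a clerical variance or a material contradiction remains an analyst judgment.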
Profile Report Structure
The standardized output format ensures profiles are comparable, auditable, and reusable.
- Subject Identification — confirmed identity with explicit confidence statement and the anchoring attribute set from Phase 1
- Executive Summary — 2–3 paragraph synthesis covering the subject’s significance, the profile’s key findings, and primary risk indicators (financial, legal, reputational, security)
- Biographical Record — structured output from Phase 2, in chronological order with per-element confidence
- Financial Profile — structured output from Phase 3, separating declared, documented-but-undeclared, and assessed-but-unverified holdings
- Network Map — structured output from Phase 4 with the network graph visualization and an annotated table of top-N connections by analytical weight
- Behavioral and Digital Profile — structured output from Phase 5
- Contradiction Analysis — every identified gap between the subject’s claimed narrative and the documented record, with the supporting evidence for each contradiction
- Confidence Assessment — overall profile confidence; the three most material uncertainties; what specific additional evidence would upgrade or degrade the assessment
- Source Log — every source consulted, with URL, archive link, retrieval date, language, and confidence tag
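A source log with the fields listed above can be kept as structured records and exported for audit. The sketch below is one possible layout, not a mandated format: the `SourceLogEntry` class and its field names are hypothetical, and the URLs use the reserved example.org domain.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

@dataclass
class SourceLogEntry:
    url: str
    archive_url: str  # Wayback or archive.today snapshot; empty if not yet archived
    retrieved: str    # retrieval date, ISO format
    language: str
    confidence: str   # "High", "Medium" or "Low"

def to_csv(entries):
    """Serialise the source log to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SourceLogEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()

def unarchived(entries):
    """List URLs that still lack an archive snapshot."""
    return [e.url for e in entries if not e.archive_url]

# Fictional entries.
log = [
    SourceLogEntry("https://example.org/filing/1", "https://example.org/archive/1",
                   "2024-03-01", "en", "High"),
    SourceLogEntry("https://example.org/interview", "", "2024-03-02", "de", "Low"),
]
print(to_csv(log))
print(unarchived(log))
```

Checking for unarchived entries before finalization catches sources that could disappear once the subject becomes aware of the investigation.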
Ethical and Legal Constraints
POI profiling sits at the intersection of public interest and individual privacy and is governed by an ethical and legal framework that the analyst must apply before, during, and after collection.
Private versus public figures. The investigative threshold differs sharply. Public officials exercising public power, executives of regulated entities, and recipients of public funds operate under a reduced privacy expectation in respect of those public roles. Private individuals do not, and profiling them requires a substantially higher public-interest threshold and proportionality assessment. Profiling a private individual’s family members, who are not themselves the investigation target, requires an additional and independent justification.
Minimum necessary information. Collect only what the investigation requires. Over-collection creates legal exposure (data-protection liability), ethical exposure (intrusion beyond legitimate purpose), and operational exposure (larger handling footprint, larger spill risk) without proportional analytical gain. The discipline of writing the investigation question first, and then collecting only against that question, is the most reliable check on scope creep.
Data-protection regimes. Personal data collected in profiling exercises falls within GDPR (EU/UK), LGPD (Brazil), CCPA/CPRA (California), PIPEDA (Canada), POPIA (South Africa), and equivalents. The analyst must establish a lawful basis (typically legitimate interest, with journalistic exemptions where applicable), document data minimization, define retention and deletion schedules, and respect data-subject rights where the regime grants them.
Re-identification risk in publication. Publishing a profile that names the subject does not automatically license the publication of all collected material on third parties. Editorial review must explicitly assess re-identification risk to family members, associates, and bystanders who are not the investigation’s public-interest target. Redaction of third-party identifiers in published outputs is the default; un-redaction requires affirmative justification.
Berkeley Protocol compliance. For profiles that may enter accountability proceedings — war crimes investigations, sanctions designations, ICC matters — collection must follow the Berkeley Protocol on Digital Open Source Investigations: documented chain of custody, hashing of evidentiary artefacts, verification standards, and analyst-level documentation. See OSINT for Human Rights.
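Hashing of evidentiary artefacts can be illustrated with Python's `hashlib`. The sketch below records a SHA-256 digest at collection time and re-checks it later. The Berkeley Protocol does not prescribe these field names or this record layout; `custody_entry`, `verify`, and the example values are assumptions for illustration only.

```python
import hashlib
from datetime import datetime, timezone

def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of an artefact's raw bytes."""
    return hashlib.sha256(data).hexdigest()

def custody_entry(artifact: bytes, source_url: str, collector: str, collected_at=None) -> dict:
    """Build a chain-of-custody record for a captured artefact (hypothetical layout)."""
    collected_at = collected_at or datetime.now(timezone.utc)
    return {
        "sha256": sha256_of(artifact),
        "size_bytes": len(artifact),
        "source_url": source_url,
        "collector": collector,
        "collected_at": collected_at.isoformat(),
    }

def verify(artifact: bytes, entry: dict) -> bool:
    """Confirm the artefact still matches the digest recorded at collection."""
    return sha256_of(artifact) == entry["sha256"]

# Fictional capture.
capture = b"example page capture"
entry = custody_entry(capture, "https://example.org/post/1", "analyst-01")
print(entry["sha256"])
print(verify(capture, entry))
print(verify(b"altered capture", entry))
```

Recording the digest at the moment of capture, and storing it separately from the artefact, is what allows a later reviewer to detect alteration.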
The crowdsourcing failure mode. POI profiling exercises that are crowd-sourced through Reddit, 4chan, or open Discord servers have produced documented misidentifications with serious real-world harm — most notoriously the 2013 Boston Marathon misidentifications. Amateur profiling without trained editorial oversight, without verification standards, and without legal review is a significant misuse pathway. Methodological rigor is not optional decoration; it is the boundary between investigation and defamation.
Case Studies
Bellingcat, Der Spiegel, and The Insider: Skripal attackers (2018–2019). Following the Salisbury nerve-agent attack, a multi-outlet team constructed full POI profiles of the two GRU officers operating under the cover identities “Alexander Petrov” and “Ruslan Boshirov.” Profile construction integrated Russian passport database records, Aeroflot flight manifests, hotel registration records, customs and border records, leaked driver records, university yearbook archives, and social-media correlation. The profile work conclusively identified “Boshirov” as Col. Anatoliy Chepiga, a decorated GRU officer, and “Petrov” as Dr. Alexander Mishkin of GRU Unit 29155. It was a definitive accountability outcome produced entirely from open and leaked sources.
ICIJ: Panama Papers and successor leaks (2016–present). The Panama Papers, Paradise Papers, Pandora Papers, and Cyprus Confidential investigations have produced systematic POI profiling at scale, covering offshore entities linked to people in more than 200 countries and territories and integrating corporate-registry data, leaked Mossack Fonseca / Asiaciti / Trident Trust records, biographical cross-references, and beneficial-ownership inference. The methodology (register the entity, identify the human beneficial owner, profile the human against documented public role) is the canonical model for financial-conflict-of-interest profiling.
Human Rights Watch Digital Investigations Lab and Bellingcat: war-crimes suspect profiling. In Syria, Ukraine, Myanmar, and Sudan, suspect commanders have been profiled from unit-insignia identification in geolocated combat video, command-structure documentation from leaked or published military records, social-media exposure by subordinates, and biographical cross-referencing. Such work has supported case-building for universal-jurisdiction prosecutions in German, Swedish, and Dutch courts. The Koblenz conviction of Anwar Raslan (2022) is the landmark precedent for such prosecutions; for profiles intended to enter proceedings of this kind, Berkeley Protocol documentation standards are the benchmark for withstanding judicial scrutiny.
Strategic Implications
- POI profiling is the OSINT discipline most prone to crossing legal and ethical lines; methodological discipline is not a refinement, it is the precondition of legitimate practice.
- The defining unit of profile quality is contradiction analysis, not collection volume. A 200-source profile that does not surface contradictions in the subject’s self-presented narrative has failed analytically even if it is bibliographically impressive.
- Counter-profiling capabilities — sanitized digital legends, scrubbed corporate records, sophisticated proxies — are now standard tradecraft for hostile actors. Absence of expected traces is itself an analytic signal, not a null finding.
- Beneficial-ownership profiling has been transformed by the leaks-and-registries era; the analytical question has shifted from “what does the subject own?” to “what does the subject control through nominees and opaque vehicles?”
- Profiling pipelines are increasingly machine-assisted (entity-resolution ML, automated handle correlation, registry scrapers), but the synthesis, contradiction analysis, and confidence rating remain irreducibly analyst tasks. Automation expands collection; it does not replace judgment.
Sources
- Berkeley Protocol on Digital Open Source Investigations, UN OHCHR & UC Berkeley Human Rights Center, 2022. [primary, authoritative]
- Bellingcat, “Skripal Suspect Boshirov Identified as GRU Colonel Anatoliy Chepiga,” 26 September 2018. [primary, investigative]
- ICIJ Offshore Leaks Database documentation, International Consortium of Investigative Journalists, ongoing. [primary, authoritative]
- Human Rights Watch, “Digital Investigations Lab Methodology Notes,” ongoing publications. [primary, authoritative]
- Bazzell, M. Open Source Intelligence Techniques (11th ed., 2023), IntelTechniques. [primary, practitioner]
- European Data Protection Board, “Guidelines on the processing of personal data in the context of journalism,” 2023. [primary, authoritative]
- OCCRP Investigative Dashboard documentation and Aleph platform guides, Organized Crime and Corruption Reporting Project. [primary, authoritative]
- Akhgar, B., Bayerl, P.S., Sampson, F. (eds.) Open Source Intelligence Investigation: From Strategy to Implementation, Springer, 2017. [primary, academic]
Key Connections
OSINT, Entity Resolution Methodology, Pattern of Life Analysis, Network Analysis Methodology, Link Analysis, Source Verification Framework, Maltego Guide, Crypto Tracing Tools Guide, Shodan-Censys Guide, Dark Web Methodology, Counter-OSINT Methodology, OSINT Ethics, OSINT Legal Framework, OSINT for Human Rights, Attribution, Corporate OSINT and Due Diligence