Tarka — Prove every signal.

Tarka

Badges: CI · Security scan · Secret scan · Open in GitHub Codespaces

Prove every signal.

Open-source, modular fraud detection platform. Pick the components you need or run the full stack.

Tarka — from Sanskrit तर्क (tarka), the method of logical hypothesis testing in Nyaya Shastra (Indian analytical philosophy). Every signal is a hypothesis; every decision is proved.

Canonical repo: github.com/pamu512/tarka

Saarthi (Investigation Copilot): the OSS version ships in this repo as services/investigation-agent. Standalone paid: Saarthi Pro. Buyer / PMO summary: Saarthi Pro vs OSS.

| | OSS (investigation-agent) | Saarthi Pro |
|---|---|---|
| Best for | Full Tarka stack, self-hosted ops | Procurement, SLAs, governance roadmap, focused copilot SKU |
| You own | Upgrades, uptime, compliance mapping | Commercial terms + vendor support (where purchased) |
| Code | Here in services/investigation-agent | github.com/pamu512/Saarthi-pro |

What’s on trunk (shipping now)

These capabilities are in the codebase today and roll forward on master:

  • Decision API: normalized inference_context on evaluate responses (integrity, tamper, network trust, replay, geo-consistency, top signals) plus OpenAPI contract alignment; session geo merges optional browser GPS and server IP geo hints; sdk:geo_ip_mismatch / sdk:geo_tz_mismatch signal tags when inconsistent; /v1/ops/calibration-status and calibration_status on /v1/ops/governance for drift posture.
  • Ingress hardening: replay-style payload detection (short-lived Redis signatures) folded into scoring and audit context; optional HMAC on POST /v1/decisions/evaluate when REQUEST_SIGNATURE_SECRET is set (see TLS pinning & signed requests).
  • SDKs: Python and TypeScript clients typed for inference_context on evaluate responses; TypeScript optional enableGeo (browser GPS); Python server collector optional enable_ip_geo / ENABLE_IP_GEO_LOOKUP (public IP lookup is off by default).
  • Graph (lite path): default schema includes Place (quantized geo cells) and SEEN_AT edges for co-location–style graph context when enabled.
  • Frontend: case explainability surfaces inference metrics; API client can fall back to mock data when backends are down (demo-friendly).
  • Ops / planning: module project roadmaps under docs/docs/projects/, 30/60/90 plan, competitive notes, and OSS adoption backlog (issues + dependency order in docs).
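The hardened evaluate path above can be sketched as a signed HTTP request. Note that the header name (`X-Request-Signature`) and the signing scheme (hex SHA-256 HMAC over the raw body) are illustrative assumptions; the real contract is documented in the TLS pinning & signed requests guide.

```python
import hashlib
import hmac
import json

def build_signed_headers(secret: str, body: bytes) -> dict:
    """Build headers for POST /v1/decisions/evaluate when
    REQUEST_SIGNATURE_SECRET is set on the Decision API.

    The header name and signing scheme here are assumptions for
    illustration -- check the signed-requests guide for the exact contract.
    """
    signature = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return {
        "Content-Type": "application/json",
        "X-Request-Signature": signature,  # assumed header name
    }

body = json.dumps({"entity_id": "user-42", "event_type": "login"}).encode()
headers = build_signed_headers("request-signature-secret", body)
# The evaluate response then carries inference_context, e.g.
# ctx = response.json()["inference_context"]  ->  ctx["top_signals"], geo flags
```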

April 2026 — Investigation copilot, collaboration bridge, and ops

  • Investigation agent (Saarthi): GET /v1/ready (data-dir readiness), GET /v1/setup (first-run checklist), and a production object on GET /v1/health when production profiling is enabled; GET /v1/workflows to list workflows, with workflow_id / workflow_params (plus playbook_id / batch_id where applicable) accepted on POST /v1/chat; case-summary PDF and turn-bundle report routes; optional copilot rate limits and request body size cap. Reference env: services/investigation-agent/.env.reference.example. Hardening compose: deploy/docker-compose.production-hardening.yml. Integration notes: CHANGELOG_INTEGRATION.
  • Trust / ops, evidence summary, parity: Decision API GET /v1/ops/evaluation-posture + GET /v1/slo for the console readiness strip; POST /v1/evidence/summary (deterministic citations + next actions); Feature Service POST /v1/internal/parity/verify. Indexed in API Reference (Decision, Feature Service, Investigation Agent sections).
  • Collaboration chat bridge (services/collaboration-chat-bridge): Slack, Microsoft Teams, and Lark with optional per-source minute rate limits; Slack file text extraction (plain text, CSV, PDF, Excel .xlsx); SSRF-hardened fetch of the first public https:// URL in the user line; directives !wf, !wfp, !style; forwards workflow and batch fields to the agent. Details: services/collaboration-chat-bridge/README.md, Collaboration chat & cloud.
  • Frontend: Investigation page updates for copilot setup and workflows (frontend/src/pages/Investigation.tsx).
  • Observability & deploy: Grafana dashboard JSON for copilot metrics under deploy/observability/; optional deploy/docker-compose.host-ports.override.yml for local port mapping; guide Investigation CMS & ITSM.
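The chat-bridge directives above (!wf, !wfp, !style) can be sketched as a small parser that splits directives from the user line and produces the workflow fields forwarded to the agent. The argument syntax assumed here (space-separated `!wf <id>`, `!wfp key=value`) is illustrative; see services/collaboration-chat-bridge/README.md for the actual grammar.

```python
def parse_directives(line: str) -> tuple[dict, str]:
    """Split leading chat directives from a user message.

    Directive names come from the bridge docs; the argument syntax is an
    assumption for illustration.
    """
    fields: dict = {}
    rest: list[str] = []
    tokens = line.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "!wf" and i + 1 < len(tokens):
            fields["workflow_id"] = tokens[i + 1]
            i += 2
        elif tok == "!wfp" and i + 1 < len(tokens):
            key, _, value = tokens[i + 1].partition("=")
            fields.setdefault("workflow_params", {})[key] = value
            i += 2
        elif tok == "!style" and i + 1 < len(tokens):
            fields["style"] = tokens[i + 1]
            i += 2
        else:
            rest.append(tok)
            i += 1
    return fields, " ".join(rest)

fields, text = parse_directives("!wf triage !wfp depth=deep check this IP")
# fields -> {"workflow_id": "triage", "workflow_params": {"depth": "deep"}}
# text   -> "check this IP"
```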

v1.1.0 train — tests, CI/CD, security, onboarding

Mirrors docs/docs/releases/v1.1.0-2026-04-30.md and RELEASE_SCHEDULE.md.

Tests and validation

  • Unit coverage for inference_build (tiering, velocity, travel/colocation, derive_recommended_action).
  • pytest for /v1/replay paired trace_ids mode (order, missing_trace_ids, empty-window 404).

CI/CD, security hygiene, and first-run polish

  • GitHub Actions CI (main / master): Ruff; decision-api tests with coverage gate (≥48% as enforced in .github/workflows/ci.yml, path to 60%+); case-api, Python SDK; graph-service; integration-ingress; investigation-agent; graphql-gateway, event-ingest, analytics-sink, feature-service, ml-scoring; frontend npm run test then npm run build + TypeScript SDK npm run build; Alembic migrations for decision/case APIs on PostgreSQL startup; GraphQL /metrics via shared observability; benchmark-latency-evaluate job (lite compose + scripts/benchmarks/latency_evaluate.py artifact); coverage XML artifacts; Docker builds gated on all jobs.
  • Security scanning workflow: Trivy filesystem + decision-api image → SARIF upload (where code scanning is enabled); weekly schedule.
  • Secret scanning workflow: TruffleHog on push/PR/schedule (.github/workflows/secret-scan.yml).
  • Dependabot: grouped updates for GitHub Actions, pip (core services), npm (frontend).
  • Docs: SECURITY.md (responsible disclosure), LICENSE-DEPENDENCIES.md (Neo4j AGPL / lite and alternates), CODE_OF_CONDUCT.md, docs/docs/guides/security-scanning.md, docs/docs/guides/sandbox-five-minute.md (copy-paste evaluate + OSINT + UI path).
  • Onboarding: .devcontainer/devcontainer.json (Codespaces / Docker-outside-Docker); README badges (CI, security scan, Codespaces); Maintainer walkthrough (Loom, Tarka / this repo only): five-minute sandbox + Case Detail explainability. (Not Skuld or other repos — those are separate products.)
  • deploy/docker-compose.lite.yml: adds integration-ingress (8003) so lite stack matches the five-minute OSINT demo without full Neo4j.

Planned validation (release gate)

  • pytest (decision-api), frontend npm run test + npm run build, and TypeScript SDK npm run build green before tag.
  • CI workflow green on default branch: lint, all Python service test jobs, Node builds, Docker build matrix.
  • Trivy security workflow completes (SARIF upload may depend on org plan); Dependabot enabled for the repository.
  • Lite compose smoke: docker compose -f deploy/docker-compose.lite.yml up -d --build, then 8000 evaluate, 8003 OSINT health, 3000 frontend reachable.

Client SDKs (evaluate vs ingest)

  • Synchronous scoring: call Decision API POST /v1/decisions/evaluate via DecisionClient (Python / TypeScript under packages/).
  • Async high-volume path: send events to event-ingest POST /v1/events (NATS → worker → evaluate) via EventIngestClient; optional Idempotency-Key when REDIS_URL is configured on ingest.

Onboarding (ports, metrics, replay script): docs/docs/guides/ingest-replay-onboarding.md — see also docs/docs/sdks/python.md and docs/docs/sdks/typescript.md.
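The two paths can be sketched as raw HTTP requests. The payload shape and localhost ports below are assumptions drawn from the components table; the packaged DecisionClient / EventIngestClient wrap calls like these (see docs/docs/sdks/python.md for the real client signatures).

```python
import json
import uuid

DECISION_API = "http://localhost:8000"   # decision-api (sync scoring)
EVENT_INGEST = "http://localhost:8007"   # event-ingest (async path)

def evaluate_request(event: dict) -> tuple[str, dict, bytes]:
    """Synchronous path: score one event via POST /v1/decisions/evaluate."""
    url = f"{DECISION_API}/v1/decisions/evaluate"
    return url, {"Content-Type": "application/json"}, json.dumps(event).encode()

def ingest_request(event: dict) -> tuple[str, dict, bytes]:
    """Async path: enqueue via POST /v1/events. The Idempotency-Key header
    lets event-ingest (when REDIS_URL is configured) drop duplicate sends."""
    headers = {
        "Content-Type": "application/json",
        "Idempotency-Key": str(uuid.uuid4()),
    }
    return f"{EVENT_INGEST}/v1/events", headers, json.dumps(event).encode()
```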

Examples, benchmarks, and ops

| What | Where |
|---|---|
| Scripts index (CI gates, policy/ML validators, links to subtree READMEs) | scripts/README.md |
| Three walkthroughs (payments + ML, bot defense, IOC + graph) | docs/docs/guides/examples/README.md |
| Evaluate latency (stdlib script) | scripts/benchmarks/README.md |
| Simulation / A-B rules | docs/docs/guides/shadow-and-ab-testing.md |
| Prometheus + Grafana (compose add-on) | deploy/observability/README.md |
| Apache-friendly graph options (vs Neo4j AGPL) | docs/docs/guides/graph-backend-alternatives.md |

Shipping cadence & releases

| Artifact | Where |
|---|---|
| Version targets (v1.1.0–v1.3.0) | RELEASE_SCHEDULE.md |
| May 2026 Friday train (weekly commits / themes) | docs/docs/guides/release-calendar-2026-05.md — queue: scripts/release/release-queue-2026-05.json |
| OSS-pattern execution order (#31–#54 + graph) | docs/docs/guides/oss-ship-order-dependencies.md |
| Product milestones (Epics A–F) | docs/docs/guides/roadmap-30-60-90.md |

June 2026 milestones on GitHub group the borrowed-from-OSS workstream (policy DAG, typologies, parity gates, deployment profiles, scorecards, etc.) — see issues labeled borrowed-from-OSS and the module swimlanes project.

Who Should Choose Tarka

Choose Tarka if you need fraud controls that your team can own, audit, and evolve quickly.

  • Fintech, payments, lending, crypto, and marketplaces that need real-time decisions plus investigations.
  • Risk and fraud teams that want rules + ML + graph in one stack, with explainable decisions and evidence exports.
  • Engineering teams that prefer open, modular architecture over closed vendor lock-in.
  • Compliance-heavy organizations that need auditable controls, traceability, and regional privacy support.
  • Teams with existing tools that want to integrate KYC, sanctions, device, CRM, or dispute providers via one hub.

Tarka may be less ideal if you only need a very basic, single-rule workflow and do not require integrations, investigations, or governance.

Install

```bash
# Clone the repository
git clone https://github.com/pamu512/tarka.git
cd tarka

# Option 1: Interactive installer (pick modules)
python tarka.py install

# Option 2: Install everything
python tarka.py install --all

# Option 3: Minimal setup (5-minute quickstart — Decision + Case + OSINT ingress + UI; no Neo4j)
python tarka.py install --lite

# Option 4: Specific modules only
python tarka.py install --modules core,graph,ml,frontend
```

Try in five minutes (Decision API + inference + OSINT + UI)

Full copy-paste path: docs/docs/guides/sandbox-five-minute.md. Run docker compose -f deploy/docker-compose.lite.yml up -d --build, then curl the Decision API for live inference_context, hit Integration Ingress for parallel OSINT, and open the frontend (mock fallbacks cover graph-heavy views when Neo4j is not running).

Prebuilt images (optional)

```bash
docker compose -f https://raw.githubusercontent.com/pamu512/tarka/master/deploy/docker-compose.sandbox.yml up -d
```

  • http://localhost:3000 — frontend
  • http://localhost:8000/v1/health — decision-api
  • http://localhost:8003/v1/health — integration-ingress

GitHub Codespaces

Use the badge at the top of this README, then in the terminal:

```bash
docker compose -f deploy/docker-compose.lite.yml up -d --build
```

(Ports 3000, 8000, 8002, 8003 are forwarded from .devcontainer/devcontainer.json.)

Walkthrough video

Experience Tarka: Click Here.

Security & compliance (table stakes)

Requirements

  • Python 3.11+
  • Docker & Docker Compose

What Each Module Includes

CLI slugs stay stable; codenames are the product story (see Module codenames). Riti (gateway) draws on rīti (रीति) in the technical Sanskrit lexicon—often read in sources such as the Viṣṇudharmottarapurāṇa as iron rust, an ingredient of Vajralepa (a hard cement)—as a metaphor for the GraphQL layer that binds services into one API surface.

| Slug | Codename | What You Get | Infrastructure |
|---|---|---|---|
| core | Hetu | Decision API, rules engine, Redis tags/scores, OPA | Postgres, Redis |
| graph | Jaala | Neo4j entity graph, community detection, fraud rings | Neo4j |
| ml | Anumana | ONNX inference, adaptive autoencoder, feature engineering | |
| cases | Lekh | Case management, workflow automation, SAR generation | Postgres |
| integration | Setu | KYC adapters, 12-source OSINT enrichment | Postgres |
| agent | Saarthi | AI investigation copilot (LLM tool-use) | |
| streaming | Srotas | High-throughput event ingestion via NATS JetStream | NATS |
| analytics | Kala | ClickHouse OLAP, historical decision analytics | ClickHouse, NATS |
| gateway | Riti | Unified GraphQL API over all REST services | |
| frontend | Dwar | React dashboard (10 pages) | |

pip Install (Library Use)

```bash
# Install as Python library with specific extras
pip install tarka[core]              # Just decision engine deps
pip install tarka[core,graph,ml]     # Core + graph + ML
pip install tarka[full]              # Everything
pip install tarka[lite]              # Core + cases
pip install tarka[standard]          # Core + graph + ML + cases + OSINT
```

Managing Services

```bash
python tarka.py start              # Start all installed modules
python tarka.py stop               # Stop all services
python tarka.py status             # Show running services & health
python tarka.py logs -f            # Follow all logs
python tarka.py logs decision-api  # Logs for one service

# Add or remove modules later
python tarka.py add graph,ml       # Add graph and ML to existing install
python tarka.py remove analytics   # Remove analytics module

# Local development (no Docker)
python tarka.py dev decision-api   # Run decision-api with hot-reload

# List all available modules
python tarka.py list

# Show module details
python tarka.py info graph

# Clean uninstall
python tarka.py uninstall
```

Architecture

```text
SDK (Web/Android/iOS/Python) --> Decision API --> Redis (tags + scores)
                                     |
                   +-----------------+-----------------+
                   |                 |                 |
              Rule Engine       ML Scoring        OPA (optional)
              (no-code UI)    (ONNX + adaptive)
              (shadow mode)   (drift detection)
              (AI recommend)  (explainability)
                   |
              OSINT Enrichment
              (Shodan, AbuseIPDB, GreyNoise,
               EmailRep, HIBP, IPinfo, RDAP)
                   |
              Graph Service --> Neo4j
              (community detection, fraud rings,
               risk propagation)

Investigation UI --> Case API --> Graph Service
                       |
                  AI Agent (LLM tool-use)

Event Ingest --> NATS JetStream --> Analytics Sink --> ClickHouse
```

Components

| Service | Port | Description |
|---|---|---|
| decision-api | 8000 | Fraud scoring, attestation, rule + ML orchestration, simulation, recommendations |
| graph-service | 8001 | Entity graph (Neo4j), GDS algorithms, tag storage on nodes |
| case-api | 8002 | Investigation cases, workflow automation, SAR/STR generation |
| integration-ingress | 8003 | KYC webhooks, adapter registry, OSINT enrichment (12 sources) |
| feature-service | 8004 | Feature engineering, enrichment, OSINT signal injection |
| ml-scoring | 8005 | ONNX inference, adaptive autoencoder, drift detection, model registry |
| investigation-agent | 8006 | AI copilot with LLM tool-use loop |
| collaboration-chat-bridge | 8009 | Slack / Teams / Lark → investigation-agent (collab profile) |
| event-ingest | 8007 | NATS-based high-throughput event ingestion |
| analytics-sink | 8008 | ClickHouse analytics writer |
| graphql-gateway | 8010 | Unified GraphQL API |
| frontend | 3000 | React dashboard (10 pages) |

Cross-service env alignment: case-api uses DECISION_API_URL for downstream decision calls; investigation-agent uses CASE_API_URL, DECISION_API_URL, and optional GRAPH_SERVICE_URL / UPSTREAM_API_KEY. See docs/docs/guides/deployment.md for defaults, docs/docs/guides/service-ports.md for ports and OpenAPI mapping, and deploy/.env.example for compose-oriented URLs.
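For compose-style deployments, that alignment can be expressed as an env fragment. The hostnames below are assumptions based on the service names; deploy/.env.example holds the canonical values.

```shell
# Cross-service URLs (assumed compose-network hostnames; see deploy/.env.example)
DECISION_API_URL=http://decision-api:8000
CASE_API_URL=http://case-api:8002
GRAPH_SERVICE_URL=http://graph-service:8001   # optional (investigation-agent)
UPSTREAM_API_KEY=change-me                    # optional upstream auth key
```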

| SDK | Platform |
|---|---|
| packages/fraud-sdk-typescript | Web (browser) — device signals + behavioral biometrics |
| packages/fraud-sdk-python | Server-side Python — IP/geo signal collection |
| packages/fraud-sdk-android | Android (Kotlin) — io.tarka.sdk, Play Integrity, device_context (README) |
| packages/fraud-sdk-ios | iOS (Swift) — App Attest, device_context (README) |

SDK positioning (directional, mid-scale scores): docs/docs/guides/sdk-scorecard-2026-01.md.

Highly regulated sectors (fintech, banking, crypto-adjacent): optional regulated markets feature pack checklist — ingress integrity, attestation, audit, self-hosted boundaries. SOC 2 / PCI / ISO orientation: compliance readiness.

Frontend Pages

| Page | Description |
|---|---|
| Dashboard | Real-time decision stats, hourly charts, top entities |
| Cases | Investigation case list with workflow status |
| Rules | No-code visual rule builder with drag-and-drop conditions, templates |
| Shadow Mode | Observation dashboard: toggle packs active/shadow/disabled, divergence metrics |
| Simulation | Synthetic fraud scenarios, A/B rule testing, precision/recall/F1 analysis |
| Graph Explorer | Neo4j visualization, community detection, fraud ring discovery |
| OSINT | 12-source enrichment for email/phone/IP/domain with composite risk scoring |
| Analytics | ClickHouse-powered historical analytics |
| Investigation | AI agent chat with tool-use for case research |
| Case Detail | Full case view with timeline, evidence, comments; decision explainability includes inference_context when present |

OSINT Enrichment

Built-in OSINT enrichment queries 12 sources in parallel (9 work without API keys):

| Source | Type | Key Needed | Data |
|---|---|---|---|
| Shodan InternetDB | IP | No | Open ports, CVEs, tags |
| AbuseIPDB | IP | Optional | Abuse confidence score |
| GreyNoise | IP | Optional | Scanner classification |
| IPinfo Lite | IP | Optional | Geo, ASN, VPN/proxy/Tor |
| ip-api.com | IP | No | Geo, ISP, proxy, hosting |
| EmailRep.io | Email | Optional | Reputation, social profiles |
| Gravatar | Email | No | Avatar existence |
| Have I Been Pwned | Email | No | Breach count |
| DNS MX | Email | No | Mail server validation |
| NumVerify | Phone | Optional | Carrier, line type |
| RDAP | Domain | No | Registration age, nameservers |
| GitHub | Identity | No | Profile discovery |
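Composite risk scoring across sources can be sketched as a weighted blend. The field names and weights below are illustrative only, not Tarka's actual scoring; they show the shape of folding several of the sources above into one 0–100 score.

```python
def composite_risk(signals: dict) -> int:
    """Blend per-source OSINT signals into a 0-100 composite risk score.

    Keys and weights are assumptions for illustration -- the real
    composite scoring lives in integration-ingress.
    """
    score = 0.0
    # AbuseIPDB publishes an abuse-confidence percentage (0-100).
    score += min(signals.get("abuseipdb_confidence", 0), 100) * 0.4
    # GreyNoise classifies scanners; "malicious" is a strong signal.
    if signals.get("greynoise_classification") == "malicious":
        score += 30
    # Breach exposure from Have I Been Pwned, capped contribution.
    score += min(signals.get("hibp_breach_count", 0), 10) * 2
    # Missing MX records suggest a throwaway or typo domain.
    if not signals.get("mx_valid", True):
        score += 15
    return min(int(score), 100)

composite_risk({"abuseipdb_confidence": 90, "greynoise_classification": "malicious"})
# -> 66
```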

Configure optional keys in .env:

ABUSEIPDB_KEY=your-key
GREYNOISE_KEY=your-key
EMAILREP_KEY=your-key
NUMVERIFY_KEY=your-key
IPINFO_TOKEN=your-token

SDK Device Signals

All SDKs collect device signals and send them as device_context with each evaluation:

  • Emulator/simulator detection (WebDriver, headless browser, Android emulator, iOS simulator)
  • VPN detection (WebRTC leak, Android NET_CAPABILITY_NOT_VPN, iOS utun interfaces)
  • Bot detection (behavioral entropy, automation framework detection, bot User-Agent)
  • Behavioral biometrics (typing cadence, mouse dynamics, scroll patterns, session timing)
  • Location spoofing (mock location providers, GPS consistency)
  • App repackaging (certificate hash verification, Play Integrity, App Attest)
  • Security handshake (server nonce → SDK signs with platform attestation → server verifies)

Signals become sdk:* tags on Redis and graph nodes (e.g., sdk:emulator, sdk:vpn, sdk:bot).
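That mapping can be sketched as a pure function from device_context flags to sdk:* tags. The device_context key names and the sdk:location_spoof tag are assumptions for illustration; the sdk:emulator / sdk:vpn / sdk:bot tag names come from the section above.

```python
def device_context_to_tags(device_context: dict) -> list[str]:
    """Map boolean device_context signals to sdk:* tags.

    Key names are assumed; tag names for emulator/vpn/bot match the
    examples given in the README.
    """
    mapping = {
        "emulator_detected": "sdk:emulator",
        "vpn_detected": "sdk:vpn",
        "bot_detected": "sdk:bot",
        "location_spoofed": "sdk:location_spoof",  # assumed tag name
    }
    return sorted(tag for key, tag in mapping.items() if device_context.get(key))

device_context_to_tags({"vpn_detected": True, "bot_detected": True})
# -> ["sdk:bot", "sdk:vpn"]
```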

Configuration

Decision API supports configurable scoring:

  • DENY_THRESHOLD (default 80) — score at which to deny
  • REVIEW_THRESHOLD (default 50) — score at which to flag for review
  • SCORE_BLEND_STRATEGY — average (default), max, or rules_only

Set API_KEYS=key1,key2 on any service to require X-API-Key header. Leave empty to disable (development mode).
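The threshold and blend settings above can be sketched as follows. This is a minimal illustration of the three documented strategies and the default thresholds, not the Decision API's actual implementation.

```python
DENY_THRESHOLD = 80    # default; configurable via env
REVIEW_THRESHOLD = 50  # default; configurable via env

def blend(rule_score: float, ml_score: float, strategy: str = "average") -> float:
    """Combine rule and ML scores per SCORE_BLEND_STRATEGY (sketch)."""
    if strategy == "average":
        return (rule_score + ml_score) / 2
    if strategy == "max":
        return max(rule_score, ml_score)
    if strategy == "rules_only":
        return rule_score
    raise ValueError(f"unknown strategy: {strategy}")

def decide(score: float) -> str:
    """Apply DENY/REVIEW thresholds to a blended score."""
    if score >= DENY_THRESHOLD:
        return "deny"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "allow"

decide(blend(70, 95, "max"))      # -> "deny"
decide(blend(70, 40, "average"))  # -> "review" (blended score 55)
```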

License

Application code in this repository is Apache-2.0 unless otherwise noted. See LICENSE.

Third-party and copyleft components: Neo4j (when used) is AGPL-3.0 for the database in typical networked deployments. Use docker-compose.lite or review LICENSE-DEPENDENCIES.md before production architecture sign-off.
