Prove every signal.
Open-source, modular fraud detection platform. Pick the components you need or run the full stack.
Tarka — from Sanskrit तर्क (tarka), the method of logical hypothesis testing in Nyaya Shastra (Indian analytical philosophy). Every signal is a hypothesis; every decision is proved.
Canonical repo: github.com/pamu512/tarka
Saarthi (Investigation Copilot) — OSS ships in this repo as services/investigation-agent. Standalone paid: Saarthi Pro. Buyer / PMO summary: Saarthi Pro vs OSS.
| | OSS (investigation-agent) | Saarthi Pro |
|---|---|---|
| Best for | Full Tarka stack, self-hosted ops | Procurement, SLAs, governance roadmap, focused copilot SKU |
| You own | Upgrades, uptime, compliance mapping | Commercial terms + vendor support (where purchased) |
| Code | Here in `services/investigation-agent` | github.com/pamu512/Saarthi-pro |
These capabilities are in the codebase today and roll forward on master:
- Decision API: normalized `inference_context` on evaluate responses (integrity, tamper, network trust, replay, geo-consistency, top signals) plus OpenAPI contract alignment; session geo merges optional browser GPS and server IP geo hints; `sdk:geo_ip_mismatch` / `sdk:geo_tz_mismatch` signal tags when inconsistent; `/v1/ops/calibration-status` and `calibration_status` on `/v1/ops/governance` for drift posture.
- Ingress hardening: replay-style payload detection (short-lived Redis signatures) folded into scoring and audit context; optional HMAC on `POST /v1/decisions/evaluate` when `REQUEST_SIGNATURE_SECRET` is set (see TLS pinning & signed requests).
- SDKs: Python and TypeScript clients typed for `inference_context` on evaluate responses; TypeScript optional `enableGeo` (browser GPS); Python server collector optional `enable_ip_geo` / `ENABLE_IP_GEO_LOOKUP` (public IP lookup is off by default).
- Graph (lite path): default schema includes `Place` (quantized geo cells) and `SEEN_AT` edges for co-location-style graph context when enabled.
- Frontend: case explainability surfaces inference metrics; API client can fall back to mock data when backends are down (demo-friendly).
- Ops / planning: module project roadmaps under `docs/docs/projects/`, 30/60/90 plan, competitive notes, and OSS adoption backlog (issues + dependency order in docs).
- Investigation agent (Saarthi): `GET /v1/ready` (data-dir readiness), `GET /v1/setup` (first-run checklist), and a `production` object on `GET /v1/health` when production profiling is enabled; `GET /v1/workflows` with `workflow_id` / `workflow_params` (plus `playbook_id` / `batch_id` where applicable) on `POST /v1/chat`; case-summary PDF and turn-bundle report routes; optional copilot rate limits and request body size cap. Reference env: `services/investigation-agent/.env.reference.example`. Hardening compose: `deploy/docker-compose.production-hardening.yml`. Integration notes: CHANGELOG_INTEGRATION.
- Trust / ops, evidence summary, parity: Decision API `GET /v1/ops/evaluation-posture` + `GET /v1/slo` for the console readiness strip; `POST /v1/evidence/summary` (deterministic citations + next actions); Feature Service `POST /v1/internal/parity/verify`. Indexed in API Reference (Decision, Feature Service, Investigation Agent sections).
- Collaboration chat bridge (`services/collaboration-chat-bridge`): Slack, Microsoft Teams, and Lark with optional per-source minute rate limits; Slack file text extraction (plain text, CSV, PDF, Excel .xlsx); SSRF-hardened fetch of the first public `https://` URL in the user line; directives `!wf`, `!wfp`, `!style`; forwards workflow and batch fields to the agent. Details: `services/collaboration-chat-bridge/README.md`, Collaboration chat & cloud.
- Frontend: Investigation page updates for copilot setup and workflows (`frontend/src/pages/Investigation.tsx`).
- Observability & deploy: Grafana dashboard JSON for copilot metrics under `deploy/observability/`; optional `deploy/docker-compose.host-ports.override.yml` for local port mapping; guide Investigation CMS & ITSM.
Mirrors docs/docs/releases/v1.1.0-2026-04-30.md and RELEASE_SCHEDULE.md.
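The optional HMAC on `POST /v1/decisions/evaluate` can be pictured with a small sketch. It assumes HMAC-SHA256 over the raw request body, hex encoding, and a header named `X-Signature`; all three are illustrative guesses, and the payload fields are hypothetical. The TLS pinning & signed requests guide remains the authority on the real scheme.

```python
import hashlib
import hmac
import json

# Hypothetical header name; the real one is defined by the signed-requests guide.
SIGNATURE_HEADER = "X-Signature"


def sign_body(secret: str, body: bytes) -> str:
    """Return a hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_body(secret: str, body: bytes, signature: str) -> bool:
    """Compare expected and received signatures in constant time."""
    return hmac.compare_digest(sign_body(secret, body), signature)


# Illustrative payload only; these field names are not the evaluate schema.
body = json.dumps({"entity_id": "user-123", "amount": 42.0}, sort_keys=True).encode("utf-8")
headers = {SIGNATURE_HEADER: sign_body("example-secret", body)}
```

Signing the exact bytes that go on the wire matters: re-serializing the JSON on either side can reorder keys and change the digest.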
Tests and validation
- Unit coverage for `inference_build` (tiering, velocity, travel/colocation, `derive_recommended_action`).
- `pytest` for `/v1/replay` paired `trace_ids` mode (order, `missing_trace_ids`, empty-window 404).
CI/CD, security hygiene, and first-run polish
- GitHub Actions CI (`main` / `master`): Ruff; decision-api tests with coverage gate (≥48% as enforced in `.github/workflows/ci.yml`, path to 60%+); case-api, Python SDK; graph-service; integration-ingress; investigation-agent; graphql-gateway, event-ingest, analytics-sink, feature-service, ml-scoring; frontend `npm run test` then `npm run build` + TypeScript SDK `npm run build`; Alembic migrations for decision/case APIs on PostgreSQL startup; GraphQL `/metrics` via shared observability; `benchmark-latency-evaluate` job (lite compose + `scripts/benchmarks/latency_evaluate.py` artifact); coverage XML artifacts; Docker builds gated on all jobs.
- Security scanning workflow: Trivy filesystem + decision-api image → SARIF upload (where code scanning is enabled); weekly schedule.
- Secret scanning workflow: TruffleHog on push/PR/schedule (`.github/workflows/secret-scan.yml`).
- Dependabot: grouped updates for GitHub Actions, pip (core services), npm (frontend).
- Docs: `SECURITY.md` (responsible disclosure), `LICENSE-DEPENDENCIES.md` (Neo4j AGPL / lite and alternates), `CODE_OF_CONDUCT.md`, `docs/docs/guides/security-scanning.md`, `docs/docs/guides/sandbox-five-minute.md` (copy-paste evaluate + OSINT + UI path).
- Onboarding: `.devcontainer/devcontainer.json` (Codespaces / Docker-outside-Docker); README badges (CI, security scan, Codespaces); Maintainer walkthrough (Loom, Tarka / this repo only): five-minute sandbox + Case Detail explainability. (Not Skuld or other repos; those are separate products.)
- `deploy/docker-compose.lite.yml`: adds integration-ingress (8003) so lite stack matches the five-minute OSINT demo without full Neo4j.
Planned validation (release gate)
- `pytest` (decision-api), frontend `npm run test` + `npm run build`, and TypeScript SDK `npm run build` green before tag.
- CI workflow green on default branch: lint, all Python service test jobs, Node builds, Docker build matrix.
- Trivy security workflow completes (SARIF upload may depend on org plan); Dependabot enabled for the repository.
- Lite compose smoke: `docker compose -f deploy/docker-compose.lite.yml up -d --build` → 8000 evaluate, 8003 OSINT health, 3000 frontend reachable.
- Synchronous scoring: call Decision API `POST /v1/decisions/evaluate` via `DecisionClient` (Python / TypeScript under `packages/`).
- Async high-volume path: send events to event-ingest `POST /v1/events` (NATS → worker → evaluate) via `EventIngestClient`; optional `Idempotency-Key` when `REDIS_URL` is configured on ingest.
Onboarding (ports, metrics, replay script): docs/docs/guides/ingest-replay-onboarding.md — see also docs/docs/sdks/python.md and docs/docs/sdks/typescript.md.
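The SDK clients are the documented way to reach both paths. For readers who want to see the underlying HTTP shape, here is a minimal standard-library sketch that only builds the two requests and does not send them. The helper names and payload fields are hypothetical, and attaching `X-API-Key` assumes `API_KEYS` is configured on the target service.

```python
import json
import urllib.request
from typing import Optional


def _post(url: str, payload: dict, headers: dict) -> urllib.request.Request:
    """Build (not send) a JSON POST request."""
    data = json.dumps(payload).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **headers}
    return urllib.request.Request(url, data=data, headers=all_headers, method="POST")


def build_evaluate_request(base_url: str, payload: dict,
                           api_key: Optional[str] = None) -> urllib.request.Request:
    """Synchronous path: Decision API evaluate endpoint."""
    headers = {"X-API-Key": api_key} if api_key else {}
    return _post(base_url.rstrip("/") + "/v1/decisions/evaluate", payload, headers)


def build_event_request(base_url: str, event: dict,
                        idempotency_key: Optional[str] = None,
                        api_key: Optional[str] = None) -> urllib.request.Request:
    """Async path: event-ingest endpoint, with an optional Idempotency-Key."""
    headers = {}
    if api_key:
        headers["X-API-Key"] = api_key
    if idempotency_key:
        headers["Idempotency-Key"] = idempotency_key
    return _post(base_url.rstrip("/") + "/v1/events", event, headers)
```

Passing either request to `urllib.request.urlopen` would send it; the split into builders keeps the sketch testable without a running stack.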
| What | Where |
|---|---|
| Scripts index (CI gates, policy/ML validators, links to subtree READMEs) | scripts/README.md |
| Three walkthroughs (payments + ML, bot defense, IOC + graph) | docs/docs/guides/examples/README.md |
| Evaluate latency (stdlib script) | scripts/benchmarks/README.md |
| Simulation / A-B rules | docs/docs/guides/shadow-and-ab-testing.md |
| Prometheus + Grafana (compose add-on) | deploy/observability/README.md |
| Apache-friendly graph options (vs Neo4j AGPL) | docs/docs/guides/graph-backend-alternatives.md |
| Artifact | Where |
|---|---|
| Version targets (v1.1.0 … v1.3.0) | RELEASE_SCHEDULE.md |
| May 2026 Friday train (weekly commits / themes) | docs/docs/guides/release-calendar-2026-05.md; queue: scripts/release/release-queue-2026-05.json |
| OSS-pattern execution order (#31–#54 + graph) | docs/docs/guides/oss-ship-order-dependencies.md |
| Product milestones (Epics A–F) | docs/docs/guides/roadmap-30-60-90.md |
June 2026 milestones on GitHub group the borrowed-from-OSS workstream (policy DAG, typologies, parity gates, deployment profiles, scorecards, etc.) — see issues labeled borrowed-from-OSS and the module swimlanes project.
Choose Tarka if you need fraud controls that your team can own, audit, and evolve quickly.
- Fintech, payments, lending, crypto, and marketplaces that need real-time decisions plus investigations.
- Risk and fraud teams that want rules + ML + graph in one stack, with explainable decisions and evidence exports.
- Engineering teams that prefer open, modular architecture over closed vendor lock-in.
- Compliance-heavy organizations that need auditable controls, traceability, and regional privacy support.
- Teams with existing tools that want to integrate KYC, sanctions, device, CRM, or dispute providers via one hub.
Tarka may be less ideal if you only need a very basic, single-rule workflow and do not require integrations, investigations, or governance.
# Clone the repository
git clone https://github.com/pamu512/tarka.git
cd tarka
# Option 1: Interactive installer (pick modules)
python tarka.py install
# Option 2: Install everything
python tarka.py install --all
# Option 3: Minimal setup (5-minute quickstart — Decision + Case + OSINT ingress + UI; no Neo4j)
python tarka.py install --lite
# Option 4: Specific modules only
python tarka.py install --modules core,graph,ml,frontend

Full copy-paste path: docs/docs/guides/sandbox-five-minute.md. Run `docker compose -f deploy/docker-compose.lite.yml up -d --build`, then curl the Decision API for live `inference_context`, Integration Ingress for parallel OSINT, and open the frontend (mock fallbacks for graph-heavy views when Neo4j is not running).

docker compose -f https://raw.githubusercontent.com/pamu512/tarka/master/deploy/docker-compose.sandbox.yml up -d

- `http://localhost:3000` (frontend)
- `http://localhost:8000/v1/health` (decision-api)
- `http://localhost:8003/v1/health` (integration-ingress)
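Once the sandbox is up, the three URLs above can be checked in one pass. The following is a minimal sketch using only the standard library; the helper names are illustrative, and treating any 2xx status as "up" is an assumption about what the health routes return.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# URLs taken from the sandbox list above.
SANDBOX_ENDPOINTS = {
    "frontend": "http://localhost:3000",
    "decision-api": "http://localhost:8000/v1/health",
    "integration-ingress": "http://localhost:8003/v1/health",
}


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code, or 0 when the endpoint is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, timeout, etc.


def check_endpoints(endpoints: Dict[str, str],
                    fetch: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Map each service name to True when its URL returns a 2xx status."""
    return {name: 200 <= fetch(url) < 300 for name, url in endpoints.items()}
```

Calling `check_endpoints(SANDBOX_ENDPOINTS)` against a running sandbox returns a name-to-boolean map; the `fetch` parameter exists so the logic can be exercised without a network.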
Use the badge at the top of this README, then in the terminal:
docker compose -f deploy/docker-compose.lite.yml up -d --build
(Ports 3000, 8000, 8002, 8003 are forwarded from .devcontainer/devcontainer.json.)
- SECURITY.md — responsible disclosure
- LICENSE-DEPENDENCIES.md — Neo4j AGPL and Apache-friendly lite option
- CODE_OF_CONDUCT.md
- Dependabot + Trivy workflows — see docs/docs/guides/security-scanning.md
- Regional AI governance builds (US / EU+UK / global Investigation Copilot profiles) — docs/docs/guides/ai-governance-regional-builds.md · deploy/profiles/ai-governance/README.md
- Python 3.11+
- Docker & Docker Compose
CLI slugs stay stable; codenames are the product story (see Module codenames). Riti (gateway) draws on rīti (रीति) in the technical Sanskrit lexicon—often read in sources such as the Viṣṇudharmottarapurāṇa as iron rust, an ingredient of Vajralepa (a hard cement)—as a metaphor for the GraphQL layer that binds services into one API surface.
| Slug | Codename | What You Get | Infrastructure |
|---|---|---|---|
| `core` | Hetu | Decision API, rules engine, Redis tags/scores, OPA | Postgres, Redis |
| `graph` | Jaala | Neo4j entity graph, community detection, fraud rings | Neo4j |
| `ml` | Anumana | ONNX inference, adaptive autoencoder, feature engineering | — |
| `cases` | Lekh | Case management, workflow automation, SAR generation | Postgres |
| `integration` | Setu | KYC adapters, 12-source OSINT enrichment | Postgres |
| `agent` | Saarthi | AI investigation copilot (LLM tool-use) | — |
| `streaming` | Srotas | High-throughput event ingestion via NATS JetStream | NATS |
| `analytics` | Kala | ClickHouse OLAP, historical decision analytics | ClickHouse, NATS |
| `gateway` | Riti | Unified GraphQL API over all REST services | — |
| `frontend` | Dwar | React dashboard (10 pages) | — |
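To see which backing stores a given `--modules` selection implies, the Infrastructure column can be encoded as a lookup. This is a sketch that only restates the table; it is not how `tarka.py` resolves modules, and the function name is illustrative.

```python
# Slug -> infrastructure, copied from the module table.
MODULE_INFRA = {
    "core": {"Postgres", "Redis"},
    "graph": {"Neo4j"},
    "ml": set(),
    "cases": {"Postgres"},
    "integration": {"Postgres"},
    "agent": set(),
    "streaming": {"NATS"},
    "analytics": {"ClickHouse", "NATS"},
    "gateway": set(),
    "frontend": set(),
}


def required_infrastructure(modules_arg: str) -> list[str]:
    """Return the sorted set of infrastructure for a comma-separated slug list."""
    slugs = [s.strip() for s in modules_arg.split(",") if s.strip()]
    unknown = [s for s in slugs if s not in MODULE_INFRA]
    if unknown:
        raise ValueError(f"unknown module slug(s): {', '.join(unknown)}")
    infra: set[str] = set()
    for slug in slugs:
        infra |= MODULE_INFRA[slug]
    return sorted(infra)


print(required_infrastructure("core,graph,ml,frontend"))
```

For the `core,graph,ml,frontend` selection used in Option 4, the table implies Neo4j, Postgres, and Redis.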
# Install as Python library with specific extras
pip install tarka[core] # Just decision engine deps
pip install tarka[core,graph,ml] # Core + graph + ML
pip install tarka[full] # Everything
pip install tarka[lite] # Core + cases
pip install tarka[standard] # Core + graph + ML + cases + OSINT

python tarka.py start # Start all installed modules
python tarka.py stop # Stop all services
python tarka.py status # Show running services & health
python tarka.py logs -f # Follow all logs
python tarka.py logs decision-api # Logs for one service
# Add or remove modules later
python tarka.py add graph,ml # Add graph and ML to existing install
python tarka.py remove analytics # Remove analytics module
# Local development (no Docker)
python tarka.py dev decision-api # Run decision-api with hot-reload
# List all available modules
python tarka.py list
# Show module details
python tarka.py info graph
# Clean uninstall
python tarka.py uninstall

SDK (Web/Android/iOS/Python) --> Decision API --> Redis (tags + scores)
                                      |
                    +-----------------+-----------------+
                    |                 |                 |
               Rule Engine       ML Scoring       OPA (optional)
               (no-code UI)   (ONNX + adaptive)
               (shadow mode)  (drift detection)
               (AI recommend) (explainability)
                                      |
                               OSINT Enrichment
                        (Shodan, AbuseIPDB, GreyNoise,
                         EmailRep, HIBP, IPinfo, RDAP)
                                      |
                           Graph Service --> Neo4j
                      (community detection, fraud rings,
                       risk propagation)

Investigation UI --> Case API --> Graph Service
                        |
                 AI Agent (LLM tool-use)

Event Ingest --> NATS JetStream --> Analytics Sink --> ClickHouse
| Service | Port | Description |
|---|---|---|
| `decision-api` | 8000 | Fraud scoring, attestation, rule + ML orchestration, simulation, recommendations |
| `graph-service` | 8001 | Entity graph (Neo4j), GDS algorithms, tag storage on nodes |
| `case-api` | 8002 | Investigation cases, workflow automation, SAR/STR generation |
| `integration-ingress` | 8003 | KYC webhooks, adapter registry, OSINT enrichment (12 sources) |
| `feature-service` | 8004 | Feature engineering, enrichment, OSINT signal injection |
| `ml-scoring` | 8005 | ONNX inference, adaptive autoencoder, drift detection, model registry |
| `investigation-agent` | 8006 | AI copilot with LLM tool-use loop |
| `collaboration-chat-bridge` | 8009 | Slack / Teams / Lark → investigation-agent (collab profile) |
| `event-ingest` | 8007 | NATS-based high-throughput event ingestion |
| `analytics-sink` | 8008 | ClickHouse analytics writer |
| `graphql-gateway` | 8010 | Unified GraphQL API |
| `frontend` | 3000 | React dashboard (10 pages) |
Cross-service env alignment: case-api uses DECISION_API_URL for downstream decision calls; investigation-agent uses CASE_API_URL, DECISION_API_URL, and optional GRAPH_SERVICE_URL / UPSTREAM_API_KEY. See docs/docs/guides/deployment.md for defaults, docs/docs/guides/service-ports.md for ports and OpenAPI mapping, and deploy/.env.example for compose-oriented URLs.
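A quick way to catch a mismatched URL is to compare each variable's port with the service table. The sketch below assumes the listed ports are the ones each URL should carry, which may not hold behind a proxy or with remapped host ports; `misaligned` is an illustrative helper, not part of the repository.

```python
from urllib.parse import urlsplit

# Ports copied from the service table.
SERVICE_PORTS = {"decision-api": 8000, "graph-service": 8001, "case-api": 8002}

# Environment variable -> service it is expected to point at.
ENV_TARGETS = {
    "DECISION_API_URL": "decision-api",
    "CASE_API_URL": "case-api",
    "GRAPH_SERVICE_URL": "graph-service",
}


def misaligned(env: dict[str, str]) -> list[str]:
    """Return the variables whose URL port differs from the service table."""
    problems = []
    for var, service in ENV_TARGETS.items():
        url = env.get(var)
        if url is None:
            continue  # unset variables (e.g. optional GRAPH_SERVICE_URL) are skipped
        if urlsplit(url).port != SERVICE_PORTS[service]:
            problems.append(var)
    return problems
```

Running `misaligned(dict(os.environ))` inside a service container gives a short list to compare against `deploy/.env.example`.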
| SDK | Platform |
|---|---|
| `packages/fraud-sdk-typescript` | Web (browser): device signals + behavioral biometrics |
| `packages/fraud-sdk-python` | Server-side Python: IP/geo signal collection |
| `packages/fraud-sdk-android` | Android (Kotlin): `io.tarka.sdk`, Play Integrity, `device_context` (README) |
| `packages/fraud-sdk-ios` | iOS (Swift): App Attest, `device_context` (README) |
SDK positioning (directional, mid-scale scores): docs/docs/guides/sdk-scorecard-2026-01.md.
Highly regulated sectors (fintech, banking, crypto-adjacent): optional regulated markets feature pack checklist — ingress integrity, attestation, audit, self-hosted boundaries. SOC 2 / PCI / ISO orientation: compliance readiness.
| Page | Description |
|---|---|
| Dashboard | Real-time decision stats, hourly charts, top entities |
| Cases | Investigation case list with workflow status |
| Rules | No-code visual rule builder with drag-and-drop conditions, templates |
| Shadow Mode | Observation dashboard: toggle packs active/shadow/disabled, divergence metrics |
| Simulation | Synthetic fraud scenarios, A/B rule testing, precision/recall/F1 analysis |
| Graph Explorer | Neo4j visualization, community detection, fraud ring discovery |
| OSINT | 12-source enrichment for email/phone/IP/domain with composite risk scoring |
| Analytics | ClickHouse-powered historical analytics |
| Investigation | AI agent chat with tool-use for case research |
| Case Detail | Full case view with timeline, evidence, comments; decision explainability includes inference_context when present |
Built-in OSINT enrichment queries 12 sources in parallel (9 work without API keys):
| Source | Type | Key Needed | Data |
|---|---|---|---|
| Shodan InternetDB | IP | No | Open ports, CVEs, tags |
| AbuseIPDB | IP | Optional | Abuse confidence score |
| GreyNoise | IP | Optional | Scanner classification |
| IPinfo Lite | IP | Optional | Geo, ASN, VPN/proxy/Tor |
| ip-api.com | IP | No | Geo, ISP, proxy, hosting |
| EmailRep.io | Email | Optional | Reputation, social profiles |
| Gravatar | Email | No | Avatar existence |
| Have I Been Pwned | Email | No | Breach count |
| DNS MX | Email | No | Mail server validation |
| NumVerify | Phone | Optional | Carrier, line type |
| RDAP | Domain | No | Registration age, nameservers |
| GitHub | Identity | No | Profile discovery |
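The parallel fan-out can be sketched with a thread pool. This is a generic illustration, not the integration-ingress implementation: each source is a plain callable, the two sources shown are stubs that make no network calls, and a failing source is recorded instead of aborting the whole enrichment.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict

Source = Callable[[str], dict]


def enrich(indicator: str, sources: Dict[str, Source]) -> Dict[str, dict]:
    """Query every source concurrently and collect per-source outcomes."""
    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        futures = {pool.submit(fn, indicator): name for name, fn in sources.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = {"ok": True, "data": future.result()}
            except Exception as exc:  # one bad source must not sink the rest
                results[name] = {"ok": False, "error": str(exc)}
    return results


# Stub sources for illustration only.
def fake_shodan(ip: str) -> dict:
    return {"open_ports": [22, 443]}


def fake_rate_limited(ip: str) -> dict:
    raise RuntimeError("rate limited")
```

Isolating failures per source fits a setup where several providers are optional or keyed: a missing key or a throttled API degrades the result rather than failing the request.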
Configure optional keys in .env:
ABUSEIPDB_KEY=your-key
GREYNOISE_KEY=your-key
EMAILREP_KEY=your-key
NUMVERIFY_KEY=your-key
IPINFO_TOKEN=your-token

All SDKs collect device signals and send them as `device_context` with each evaluation:
- Emulator/simulator detection (WebDriver, headless browser, Android emulator, iOS simulator)
- VPN detection (WebRTC leak, Android NET_CAPABILITY_NOT_VPN, iOS utun interfaces)
- Bot detection (behavioral entropy, automation framework detection, bot User-Agent)
- Behavioral biometrics (typing cadence, mouse dynamics, scroll patterns, session timing)
- Location spoofing (mock location providers, GPS consistency)
- App repackaging (certificate hash verification, Play Integrity, App Attest)
- Security handshake (server nonce → SDK signs with platform attestation → server verifies)
Signals become sdk:* tags on Redis and graph nodes (e.g., sdk:emulator, sdk:vpn, sdk:bot).
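As a rough picture of that mapping, the sketch below turns boolean signals into `sdk:*` tags. The `device_context` keys are assumptions chosen to match the example tags; the real payload schema lives in the SDK packages.

```python
def sdk_tags(device_context: dict) -> list[str]:
    """Return sorted sdk:* tags for every signal that is exactly True."""
    return sorted(
        f"sdk:{name}" for name, value in device_context.items() if value is True
    )


# Hypothetical device_context; non-boolean fields are ignored.
context = {"emulator": True, "vpn": False, "bot": True, "os": "android"}
print(sdk_tags(context))
```

Only signals that fired produce a tag, so the absence of `sdk:vpn` here means the check ran and came back negative or was not reported at all.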
Decision API supports configurable scoring:
- `DENY_THRESHOLD` (default 80): score at which to deny
- `REVIEW_THRESHOLD` (default 50): score at which to flag for review
- `SCORE_BLEND_STRATEGY`: `average` (default), `max`, or `rules_only`
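How these three settings could interact is sketched below. Several details are assumptions rather than documented behavior: `average` is taken as an unweighted mean of a rule score and an ML score, thresholds are inclusive, and the outcome labels `deny` / `review` / `allow` are illustrative.

```python
import os
from typing import Mapping, Optional


def blend(rule_score: float, ml_score: float, strategy: str = "average") -> float:
    """Combine rule and ML scores according to the blend strategy."""
    if strategy == "average":
        return (rule_score + ml_score) / 2
    if strategy == "max":
        return max(rule_score, ml_score)
    if strategy == "rules_only":
        return rule_score
    raise ValueError(f"unknown SCORE_BLEND_STRATEGY: {strategy}")


def decide(score: float, deny_threshold: float = 80, review_threshold: float = 50) -> str:
    """Map a blended score to an outcome; thresholds are treated as inclusive."""
    if score >= deny_threshold:
        return "deny"
    if score >= review_threshold:
        return "review"
    return "allow"


def decide_from_env(rule_score: float, ml_score: float,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Read the three settings from the environment, using the documented defaults."""
    env = os.environ if env is None else env
    strategy = env.get("SCORE_BLEND_STRATEGY", "average")
    deny = float(env.get("DENY_THRESHOLD", 80))
    review = float(env.get("REVIEW_THRESHOLD", 50))
    return decide(blend(rule_score, ml_score, strategy), deny, review)
```

With the defaults, a rule score of 90 and an ML score of 60 average to 75 and land in review; switching the strategy to `max` pushes the same inputs to 90 and a deny.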
Set API_KEYS=key1,key2 on any service to require X-API-Key header. Leave empty to disable (development mode).
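The check this setting implies can be sketched as follows. It is an illustration of the described behavior (a comma-separated allow-list, with an empty value disabling the check), not the services' actual middleware; the function names are hypothetical.

```python
import hmac
from typing import Mapping, Optional


def parse_api_keys(raw: Optional[str]) -> list[str]:
    """Split the API_KEYS value on commas, dropping blanks and whitespace."""
    return [key.strip() for key in (raw or "").split(",") if key.strip()]


def is_authorized(headers: Mapping[str, str], raw_keys: Optional[str]) -> bool:
    """Allow everything when no keys are set; otherwise require a matching X-API-Key."""
    keys = parse_api_keys(raw_keys)
    if not keys:
        return True  # development mode: API_KEYS left empty
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    presented = lowered.get("x-api-key", "").encode("utf-8")
    # compare_digest avoids leaking key contents through timing differences.
    return any(hmac.compare_digest(presented, key.encode("utf-8")) for key in keys)
```

Because an empty `API_KEYS` silently opens the service, production deployments would want the variable set explicitly rather than relying on the default.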
Application code in this repository is Apache-2.0 unless otherwise noted. See LICENSE.
Third-party and copyleft components: Neo4j (when used) is AGPL-3.0 for the database in typical networked deployments. Use docker-compose.lite or review LICENSE-DEPENDENCIES.md before production architecture sign-off.
