| LLM agent | LangGraph, LangChain, langchain-openai, langchain-anthropic |
| Web framework | FastAPI, Pydantic v2, uvicorn, asyncpg, websockets |
| State store | PostgreSQL (HA: 1 primary + 1 replica), langgraph-checkpoint-postgres |
| Event bus | NATS JetStream (3-replica cluster, 3 streams: RESERVATIONS, PIPELINES, INFRASTRUCTURE) |
| GitOps | Flux v2 (source-controller, kustomize-controller, helm-controller, notification-controller) |
| Provisioning | Kustomize, Helm, Cluster API (CAPV for vSphere; Metal3 planned), Ansible (containerized as a K8s Job) |
| Container registry | GitHub Container Registry (ghcr.io/wesleypeng/...) |
| Secrets | Sealed Secrets (encrypted at rest in Git, decrypted in-cluster by the controller) |
| Web UI | React, TypeScript, Vite, Ant Design |
| CI/CD | Jenkins (with kubernetes-plugin, JCasC, shared library), GitHub Actions |
| Code quality | SonarQube Community, flake8, mypy, ESLint, Prettier |
| Test automation | agentic-taf (PyXTaf evolved): pytest, behave, Playwright, httpx, websockets, paramiko, langchain-openai, kubernetes |
| Metrics | Prometheus + Grafana via kube-prometheus-stack |
| Logs | OpenSearch + OpenSearch Dashboards (single-node), Fluent Bit DaemonSet |
| LLM observability | LangFuse self-hosted (chart 1.5.x, app 3.162.x): bundled ClickHouse + Valkey + MinIO + external PostgreSQL |
| Distributed tracing | OpenTelemetry SDK + Jaeger backend on OpenSearch |
| Bare-metal/VM | NetBox (CMDB), vSphere REST /api/ (vCenter), Ansible vSphere collection, Ansible IPMI module |