Solution Architect

Real-time data and AI for the systems that can’t afford to fail.

I design and troubleshoot mission-critical streaming architectures, often AI-augmented. A generalist engineer trained at Mines, with 15 years across fintech, telecom, marketplaces and gaming, I target the 20% of effort that delivers 80% of immediate impact. No endless refactors.

100K+ events/sec · 97% accuracy · 18× speedups · €10M+/day in production

Aurélien Courrèges-Clercq

Past collaborations

  • European Parliament
  • Thales
  • SIS / ISCG
  • Société Générale
  • ESL Gaming

Mission-critical streaming

Real-Time Data, Done Right

Streaming pipelines for fraud, payments, trading, and live ops.

Real-time systems break in expensive ways: lost transactions, missed fraud, SLAs breached at 3am. I design streaming pipelines that hold under pressure — and modernize the legacy Java that surrounds them, without rewrites. Sub-second latency, multi-site resilience, predictable costs.

What I bring to the table

  • Streaming pipelines sized for the workload — real-time event processing with an optional analytics layer on top: fraud detection, product telemetry, leaderboards, fault detection. p50/p99 measured under load, capacity that holds during spikes.
  • Cost engineering with trade-offs — same throughput on less infra; daily/monthly cost estimates based on your traffic.
  • Legacy modernization, two modes — Strangler Pattern for incremental cuts, or full rewrite covered by non-regression tests. No big-bang risk either way.
  • Observability that wakes you up for the right reasons — alerts that mean something, runbooks your team can actually follow.
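To make the latency bullet above concrete, p50 and p99 can be read off a set of measured request latencies with a nearest-rank percentile. This is a minimal sketch, not the tooling used on any engagement: the function name `percentile` and the sample latencies are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds, as if collected during a load test.
latencies_ms = [12, 15, 11, 14, 13, 240, 16, 12, 13, 15]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50={p50} ms, p99={p99} ms")  # p50=13 ms, p99=240 ms
```

A single slow request barely moves the median but dominates p99, which is why both figures are worth tracking under load rather than an average.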

PatternAlarm

Real-time fraud detection at scale

10K events/min · 97.5% accuracy · 79× faster on the same hardware · ~$280/month

Société Générale CIB

€10M+/day in equity derivatives, 3 years live on the trading floor

Java/Spring monolith · custom rules engine + AI solver · zero rewrites

ESL Gaming

5 years streaming for global esports tournaments

Multi-region · sub-second · 100K+ events/sec under live ops pressure

For Fintech CTOs · Heads of Data · Banking compliance leads · Live-ops teams

AI Integration

AI That Survives Production

From prototype to deployed system, with measured costs and audit trails.

Most LLM projects ship as demos and stay there: no observability, no cost ceilings, no fallback when the model is wrong. I deploy multi-agent systems with the boring 80% that production needs (auth, audit, cost discipline, runbooks) solved before launch.

What I bring

  • The right tool for the job — RAG with vector DBs, semantic caching, classical ML with Spark, graph algorithms and rule-based systems, or LLM agents — picked on fit, not hype.
  • LLM costs measured per session, per query — unit economics you can show the CFO, not "should be cheap".
  • Production AI in as little as 10 weeks: notebook to a Kubernetes deployment, solo. Spring Boot or FastAPI microservices, chosen case by case to fit your existing stack.
  • Audit-ready by design — rules engines for compliance-sensitive logic, AI for context, structured outputs.
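As an illustration of the per-session cost bullet above, the sketch below sums token usage into a cost per session. It is a simplified example under stated assumptions: the prices, the `LlmCall` record and the function names are hypothetical, and real prices depend on the provider and model.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens (assumption, not a real price list).
PRICE_PER_1K_INPUT = 0.003
PRICE_PER_1K_OUTPUT = 0.015


@dataclass(frozen=True)
class LlmCall:
    session_id: str
    input_tokens: int
    output_tokens: int
    cache_hit: bool = False  # a cached answer triggers no model call in this sketch


def call_cost(call):
    """Cost of one call in USD; cache hits are counted as free."""
    if call.cache_hit:
        return 0.0
    return (call.input_tokens / 1000 * PRICE_PER_1K_INPUT
            + call.output_tokens / 1000 * PRICE_PER_1K_OUTPUT)


def cost_per_session(calls):
    """Aggregate call costs by session id."""
    totals = defaultdict(float)
    for call in calls:
        totals[call.session_id] += call_cost(call)
    return dict(totals)


calls = [
    LlmCall("s1", input_tokens=1000, output_tokens=200),
    LlmCall("s1", input_tokens=1000, output_tokens=200, cache_hit=True),
    LlmCall("s2", input_tokens=2000, output_tokens=1000),
]
for session_id, usd in cost_per_session(calls).items():
    print(f"{session_id}: ${usd:.4f}")
```

Logging one such record per model call is usually enough to report cost per session and per query, and to see how much a cache actually saves.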

RansomRampage

Multi-agent simulator, notebook → AWS EKS in 10 weeks solo

$0.02/session · 60% cache hit · auto-scales to zero · ransomrampage.com

ESL Tournament Automation

RAG in production at gaming scale, 6 months

4-microservice architecture · $0.30/process measured

Société Générale CIB

AI solver inside Java/Spring monolith, 3 years live

€10M+/day · audit-ready · Spring AI bridge before the term existed

For AI/ML Heads with cost burn · CISOs with audit-trail needs · Heads of Data shipping AI products · Compliance officers

Delivery style

Prototype → production. Methodology, not magic.

Two real engagements, same playbook: smallest skateboard first, prod-shaped early, measure before scaling.

RansomRampage — 10 weeks, prototype → production

CTO crisis simulator. Three LLM agents around the table — CISO advises, SRE optimizes, the hacker attacks. Up to 20 turns to survive a live ransomware siege on your AI-generated fintech.

  • Week 1: Game design + tech arbitration. Defined rules, turn structure, win/lose conditions, UI layout. Compared different LangGraph + RAG strategies. Chose vector DB, observability stack, microservice boundaries.
  • Week 2: 5 domain knowledge bases. 70 FAISS chunks: MITRE ATT&CK, SRE patterns, offensive techniques, fintech archetypes, tech corpus. BGE-small over BGE-M3 (17× smaller).
  • Weeks 3–4: 3-agent LangGraph pipeline. Gateway → cache → RAG → generate → update. Semantic caching at cosine > 0.9999. Structured Pydantic output.
  • Week 5: Deterministic game engine + API. Revenue, compliance, breaches resolved by Python, not AI. Hacker queued this turn, resolved next. FastAPI backend.
  • Week 6: React frontend + first playtest. 1 user, 9 issues, 3 critical. 23 fixes in 1 commit. Added 20 adversary personas including Kevin from IT.
  • Weeks 7–9: EKS production deploy. 6 Terraform modules. ArgoCD GitOps. CI ~11 min. Cognito SSO at ALB level, zero auth code.
  • Week 10: Observability + launch. Prometheus, Grafana, Loki (256MB vs 2GB ELK). $0.02/game, $160/mo, $0.50 idle.
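The semantic caching step in weeks 3–4 can be sketched as follows: a new query reuses a cached response only when its embedding is nearly identical to a stored one. This is an illustrative sketch, not the project's code. The `SemanticCache` class is hypothetical, the vectors are hand-written stand-ins for real embeddings, and the linear scan stands in for a proper vector index; only the 0.9999 cosine threshold comes from the text above.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SemanticCache:
    """Returns a cached response when a query embedding is close enough to a stored one."""

    def __init__(self, threshold=0.9999):
        self.threshold = threshold
        self._entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        best_score, best_response = 0.0, None
        for cached_embedding, response in self._entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score > self.threshold else None

    def put(self, embedding, response):
        self._entries.append((list(embedding), response))


cache = SemanticCache()
cache.put([0.1, 0.9, 0.4], "cached answer")
print(cache.get([0.1, 0.9, 0.4]))  # same direction: cache hit
print(cache.get([0.9, 0.1, 0.4]))  # different direction: cache miss, returns None
```

A threshold this strict only matches near-duplicate queries, which trades hit rate for a low risk of serving an answer to a question that merely looks similar.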

PatternAlarm — Real-time fraud, 8 weeks solo

Multi-domain fraud detection on live streams. Gaming exploits, fintech transfers, ecommerce checkout — 10K events/min through one Flink topology, sub-second alerting with ML inline.

  • Step 1: Concept + target outcomes. Multi-domain fraud detection. Goals: sub-second alerting, 10K+ events/min, under $300/mo.
  • Step 2: Fraud pattern research. Studied velocity abuse, account takeover, payment manipulation. Built synthetic data generator.
  • Step 3: Model prototyping in notebook. Scikit-learn gradient-boosted Random Forest. Feature engineering across 3 domains. 97.5% accuracy.
  • Step 4: Local env matching prod. Real Kafka, PostgreSQL, Flink via Docker Compose. Evaluated MSK vs self-managed.
  • Step 5: End-to-end data flow. Generator → Kafka → Flink + model serving → alerts dashboard. Schema validated.
  • Step 6: AWS deploy + training pipeline. Terraform modules. Airflow DAG for automated retrain + promote.
  • Step 7: Bottleneck diagnosis. Latency spike under load. Root cause: single-record async inference, not capacity.
  • Step 8: Batched sync inference. Same hardware: 79× lower latency, 59× higher throughput. Scales to 35K predictions/min.
  • Step 9: Open-sourced. Terraform + logic + performance analysis on GitHub. Under $300/mo.
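The move from single-record to batched inference in steps 7 and 8 comes down to amortizing per-call overhead over many events. The sketch below shows only that mechanism: `StubModel` is a hypothetical stand-in for a trained classifier, the batch size of 64 is an assumption, and neither the Flink integration nor the measured speedups are reproduced here.

```python
from itertools import islice


class StubModel:
    """Hypothetical stand-in for a trained classifier; counts how often predict is called."""

    def __init__(self):
        self.calls = 0

    def predict(self, rows):
        self.calls += 1
        # Toy rule for illustration: flag any amount above 1000.
        return [1 if row["amount"] > 1000 else 0 for row in rows]


def batches(events, size):
    """Yield consecutive lists of up to `size` events."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def score_one_by_one(model, events):
    """One model call per event: per-call overhead is paid every time."""
    return [model.predict([event])[0] for event in events]


def score_batched(model, events, batch_size=64):
    """One model call per batch: the same overhead is shared by up to batch_size events."""
    scores = []
    for batch in batches(events, batch_size):
        scores.extend(model.predict(batch))
    return scores


events = [{"amount": amount} for amount in range(0, 2000, 10)]  # 200 synthetic events
single, batched = StubModel(), StubModel()
assert score_one_by_one(single, events) == score_batched(batched, events)
print(single.calls, batched.calls)  # 200 4
```

The predictions are identical either way; what changes is the number of model invocations, which is where fixed costs such as serialization and call dispatch tend to accumulate.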

skateboard first → prod-shaped local → cloud per service → measure before scaling → ship lean

Philosophy

Skateboard before scooter, scooter before car.

Most projects fail because they over-build before knowing what users actually need. Ship the smallest version that works, put it in front of one real user, fix what breaks. Scale comes later — only after the thing is proven worth scaling.

Three principles I apply on every engagement:

1. Build the skateboard.

Whatever proves the riskiest assumption, in the smallest form, with one real user touching it within weeks — not months.

2. Production-grade ≠ over-provisioned.

Every resource earns its cost. Caching added when traffic justifies it, not before. Smaller models when retrieval quality matches. Single instances when one handles the load. Over-sizing is paid every month; right-sizing is measured first.

3. Deterministic where it counts.

AI handles context and language. Plain code handles money, compliance, and anything that needs to be reproducible. Rules engines for decisions that get audited; LLMs for the parts where ambiguity is acceptable.
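A minimal sketch of what "plain code handles money" can look like: rules evaluated in a fixed order, each decision carrying the name of the rule that produced it for the audit trail. The limits, rule names and `decide_transfer` function are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Decision:
    approved: bool
    rule: str    # which rule decided, kept for the audit trail
    reason: str


# Hypothetical limits, for illustration only.
SINGLE_TRANSFER_LIMIT = Decimal("10000")
DAILY_LIMIT = Decimal("25000")


def decide_transfer(amount, spent_today):
    """Evaluate rules in a fixed order so the same inputs always give the same decision."""
    if amount <= 0:
        return Decision(False, "positive_amount", "amount must be positive")
    if amount > SINGLE_TRANSFER_LIMIT:
        return Decision(False, "single_transfer_limit",
                        f"{amount} exceeds {SINGLE_TRANSFER_LIMIT}")
    if spent_today + amount > DAILY_LIMIT:
        return Decision(False, "daily_limit",
                        f"daily total would reach {spent_today + amount}")
    return Decision(True, "all_rules_passed", "within limits")


audit_log = []
requests = [(Decimal("500"), Decimal("0")), (Decimal("9000"), Decimal("20000"))]
for amount, spent in requests:
    decision = decide_transfer(amount, spent)
    audit_log.append((amount, spent, decision))
    print(amount, decision.approved, decision.rule)
```

Because the function is pure and uses `Decimal` rather than floats, replaying the audit log through it reproduces every decision exactly, which is the property an auditor usually asks for.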

15 years of finding the 20% of work that moves the needle.

Read more on rapid iteration →

Stack

The toolbox below is here for engineering leaders who want to verify fit before a call.

Streaming & events: Kafka · Confluent · MSK · Kafka Streams · Flink · Spark ML · Lambda · Protobuf · gRPC · Avro

Backend & languages: Java 21 · Spring Boot · Spring Batch · Spring AI · Scala · Akka · Cats Effect · Python · FastAPI · Pydantic · REST · GraphQL · SOAP · JSON

AI & agentic systems: LangGraph · Claude · OpenAI · MCP · FAISS · Vector Databases · semantic caching · Scikit-learn · Keras · Generative AI · expert systems

Infrastructure & ops: AWS · EKS · ECS · Cognito · ALB · S3 · Docker · Kubernetes · Terraform · ArgoCD · Helm · GitHub Actions · GitLab CI · Jenkins · AWS CodePipeline · Prometheus · Grafana · Loki · Kibana

Data & storage: Postgres · MySQL · Oracle · Cassandra · DynamoDB · Redis · Apache Iceberg · Trino · Parquet · CSV

Domains served

Fintech & Capital Markets · Banking · Telecom · Gaming & live ops · Marketplace platforms · Public sector / GovTech

French/EU citizen · APAC base (Singapore Pte Ltd) · EU/UK/APAC follow-the-sun delivery

Mines engineering school (MSc) · AWS Solutions Architect Associate · Udacity Data Engineering · Udacity AI & Specializations

What people say

Exceptional problem-solving abilities. His capacity to quickly understand complex technical systems and develop creative, unconventional solutions was remarkable.

Engineering Leadership, ESL Gaming GmbH

Always attentive, he adapts to change and always finds a solution that achieves the expected result.

L. Jabre, Web Designer, France Télévisions

Let’s talk

Mission-critical or production AI? 20 minutes.

No prep, no slides. We figure out if there’s a fit — and if not, you walk away with one concrete suggestion. That’s the offer.

Book a 20-min call →