
Tech Stack
Java
Spring Boot
Apache Kafka
Kubernetes
AWS
PostgreSQL
Redis
CI/CD
Description
Designed and operated large-scale services processing tens of millions of events daily with sub-300 ms P95 latency targets and 99.9% uptime.
Focused on reliability engineering practices, operational readiness, and fault isolation in production systems.
- Scaled Kafka consumer throughput to absorb burst traffic without consumer-lag incidents.
- Introduced idempotent reconciliation and compensating workflows to reduce cross-service failure propagation.
- Established SLO-driven standards and production readiness processes backed by automation.
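As a rough illustration of the idempotent handling mentioned above, the sketch below applies each event at most once by claiming its ID before running the handler. It is not the production implementation: the class name `IdempotentProcessor`, the `process` method, and the in-memory map are assumptions made for this example, and a real service would more plausibly keep the processed-ID record in a durable store such as PostgreSQL or Redis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical sketch: runs the handler at most once per event ID. */
public class IdempotentProcessor {
    // Assumption: an in-memory map stands in for a durable dedupe store.
    private final Map<String, Boolean> processed = new ConcurrentHashMap<>();
    private final Consumer<String> handler;

    public IdempotentProcessor(Consumer<String> handler) {
        this.handler = handler;
    }

    /** Returns true if the handler ran, false if the event ID was already processed. */
    public boolean process(String eventId, String payload) {
        // putIfAbsent is atomic: only the first caller for a given ID sees null.
        if (processed.putIfAbsent(eventId, Boolean.TRUE) != null) {
            return false; // duplicate delivery, safe to skip
        }
        try {
            handler.accept(payload);
            return true;
        } catch (RuntimeException e) {
            // Release the claim so a later redelivery can retry the event.
            processed.remove(eventId);
            throw e;
        }
    }
}
```

With this shape, a redelivered event returns false without side effects, which is what keeps retries from propagating duplicate work to downstream services.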
Page Info
High-scale Messaging
Kafka scaling, DLQ handling, and backpressure controls for high-throughput outbound communications.
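The DLQ and backpressure ideas can be sketched in plain Java without a broker. This is a minimal, single-threaded illustration under stated assumptions: `OutboundDispatcher`, `submit`, `drain`, and the in-memory dead-letter list are hypothetical names invented for the example, and the actual system may be structured quite differently. A bounded buffer rejects new work when full (the backpressure signal), and messages that still fail after a fixed number of attempts are set aside rather than retried forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Predicate;

/** Hypothetical sketch: bounded buffer for backpressure plus a dead-letter list. */
public class OutboundDispatcher {
    private final BlockingQueue<String> buffer;
    // Assumption: a list stands in for a real dead-letter topic.
    private final List<String> deadLetters = new ArrayList<>();
    private final int maxAttempts;
    private final Predicate<String> sender; // returns true when delivery succeeds

    public OutboundDispatcher(int capacity, int maxAttempts, Predicate<String> sender) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
        this.maxAttempts = maxAttempts;
        this.sender = sender;
    }

    /** Returns false when the buffer is full, telling the caller to slow down. */
    public boolean submit(String message) {
        return buffer.offer(message);
    }

    /** Drains the buffer from a single thread; returns the number of messages delivered. */
    public int drain() {
        int delivered = 0;
        String message;
        while ((message = buffer.poll()) != null) {
            if (trySend(message)) {
                delivered++;
            } else {
                deadLetters.add(message); // retries exhausted: park for later inspection
            }
        }
        return delivered;
    }

    private boolean trySend(String message) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (sender.test(message)) {
                    return true;
                }
            } catch (RuntimeException e) {
                // Treat an exception as a failed attempt and try again.
            }
        }
        return false;
    }

    public List<String> deadLetters() {
        return List.copyOf(deadLetters);
    }
}
```

In a Kafka consumer loop, a false return from `submit` is the kind of signal that could be used to call `KafkaConsumer.pause` on the affected partitions and `resume` once the buffer drains, though whether the original system did this is not stated.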
