Hi, I'm Simone —

I build systems that hold under pressure.

I design pipelines and ML systems built to scale. Occasionally, I ship side projects that solve real problems.

Data Engineering · Streaming Pipelines · ML Infrastructure · Cloud (AWS/GCP) · Distributed Systems · Python · Spark · Airflow · dbt · Kafka · Terraform

Selected work

Featured Projects

Data · Backend · DevOps

Real-Time Event Pipeline

The legacy batch pipeline couldn't handle peak traffic spikes, causing 4-6 hour data delays for downstream ML models.

Handled 12M+ events/day

Reduced end-to-end latency from 4h to 90s

Apache Kafka · Apache Flink · Python · +2
ML · Data · Backend

ML Feature Store

Data science teams were duplicating feature engineering logic across 8+ models, causing inconsistencies and wasted compute.

Reduced feature computation time by 63%

Unified 40+ features across 8 models

Python · Feast · Redis · +3
Data · DevOps

Data Warehouse Migration

A 10TB legacy Redshift warehouse had to move to Snowflake with zero downtime and full historical parity.

Migrated 10TB+ data with zero data loss

Reduced query costs by 41%

dbt · Python · Snowflake · +3

Building in public

Apps & MVPs

Alongside the day job, I build micro-products — mostly data tooling and developer utilities. Some stay experiments; some ship.


Thinking out loud

Writing


System design breakdowns, data modeling decisions, performance tuning, and lessons learned building infrastructure at scale. First posts in progress.

Simone Benitozzi — Data & Software Engineer