Twitter: @morsapaes
I’ve been working with data, big and small, for the last 10 years: first as a Data Engineer slogging through ETL pipelines and big migration projects; and then in Product-shaped roles helping build Apache Flink (before it was cool) and Materialize. I’m currently a Product Manager at ClickHouse, focusing on our real-time data ingestion service.
In this talk, we’ll walk through the evolution of CDC as the enabler for high-performance, real-time analytics on transactional data, and explore what’s missing to make it work for the 99%.
A decade after Debezium entered the scene and commoditized Change Data Capture (CDC), we’re still struggling to bridge OLTP and OLAP for analytics. Expensive batch jobs evolved into overcomplicated streaming architectures; HTAP promised to ditch the need for data movement altogether; and someone told us to “just use Postgres”. How much progress have we made, and where do we go from here? We’ll share lessons learned from a decade of seeing hundreds of customers run CDC at scale, and how we can use them to build best-of-breed experiences for high-performance, real-time analytics.
Key Takeaways: