MLDM Monday | Stateful Stream Processing Explained w/ Apache Flink
November 6, 2017 @ 7:30 pm UTC+0
The Next “Holy Grail” of Data Engineering: Stateful Stream Processing Explained w/ Apache Flink
Distributed stream processing is not a new idea. Shortly after MapReduce gained recognition, stream processing systems emerged as a way to analyze data and produce results in real time for data-intensive applications. However, earlier stream processing engines (such as Storm, S4, and early versions of Spark Streaming) were relegated to merely “assisting” batch systems with “approximate, real-time” results.
Recently, the renewed hype around stream processing is no longer just about real-time analytics. The discussion now centers on the idea that stream processing is the superior computing paradigm for almost all of today’s data applications and should completely subsume batch processing. At the center of this “next big thing” in data engineering and big data systems is the ability of stream processors to efficiently handle stateful stream pipelines with exactly-once guarantees.
In this talk, I will explain in depth what stateful stream processing is all about, as well as the challenges it poses when designing a computational engine to realize it. Throughout, I will also explain Apache Flink’s approach to stateful stream processing and how it works internally.
Tzu-Li (Gordon) Tai
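Apache Flink Committer / PMC Member, currently a software engineer at data Artisans.
Works full-time on contributing to the Flink open source project, including driving Flink's ongoing development and reviewing code contributions.
Specialties: Java, Distributed Software Systems, Data Stream Processing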
Twitter / LinkedIn / GitHub: @tzulitai