Welcome to Checkpoint, the monthly newsletter for Apache Flink users.
Apache Flink 1.16 released
Flink 1.16 brings many new features and improvements.
Timestone: Netflix’s High-Throughput, Low-Latency Priority Queueing System with Built-in Support for Non-Parallelizable Workloads
Apache Flink is part of Netflix’s internal encoding job queueing and pipelining infrastructure. This article explains how they implemented "Exclusive Queues" to eliminate wasted compute cycles and eventual consistency issues.
Externalization of a Flink state to Aerospike at Contentsquare
Contentsquare relies on Apache Flink as a key technology for high-throughput, low-latency, and highly resilient stateful streaming. This article focuses on the process of externalizing one of the RocksDB-embedded states from their Flink application to an external database, as well as the motivations that led to this decision.
Comparing Stateful Stream Processing and Streaming Databases
How do these two technologies work? How do they differ, and when is the right time to use them? Dunith Dhanushka, Senior Developer Advocate at Redpanda, provides his take on the subject.
Writing to Apache Parquet from Apache Flink
This Apache Flink cookbook recipe explains how to consume event data from an Apache Kafka topic and write it to Apache Parquet files.
Iceberg Flink Sink: Stream Directly into your Data Warehouse Tables
This blog post explains how, by using a single shared catalog, both Flink and Spark can operate on the same Iceberg warehouse, combining the powerful streaming capabilities of Flink with the feature-rich batch framework provided by Spark.
When Streaming Needs Batch - Flink's Journey Towards a Unified Engine
How to use Flink to process a backlog (e.g., a big burst) of streaming data in batch.
Exploring Popular Open-source Stream Processing Technologies: Part 2 of 2
A brief demonstration of Apache Spark Structured Streaming, Apache Kafka Streams, Apache Flink, and Apache Pinot with Apache Superset.
rok create job immerok-cloud
Immerok announces the initial availability of Immerok Cloud, a serverless, cloud-native Apache Flink service for developers building real-time systems.
Amazon EMR release 6.8 now supports Apache Flink 1.15.1
Amazon announces that Amazon EMR release 6.8 includes Apache Flink 1.15.1, available on EMR on EC2.
OpenLineage adds Apache Flink support
OpenLineage is an open framework for data lineage collection and analysis. If you care about the end-to-end provenance and auditability of your data, this integration now documents how Flink jobs affect it.
Apache Iceberg Achieves Milestone 1.0 Release
Just-released Apache Iceberg version 1.0 ensures API stability and reinforces its status as a production-ready technology for data warehousing and data science use cases.
Flink Forward Asia - November 26-27
Two days of online sessions, with keynotes and technical presentations on real-time systems, best practices, integration, AI, and more.
Current 2022 on demand
The annual Kafka user conference (formerly known as Kafka Summit) was held in Austin in early October. Conference talks on streaming data topics are available to watch on demand.
Welcome new Apache Flink PMC member, Danny Cranmer
If you have news you’d like to see in Checkpoint, please send us the blog post, tweet, or story via email to ok@immerok.io