The Dataflow Model

杨朝坤 发布于

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in MassiveScale, Unbounded, OutofOrder Data Processing

最近看了Google发表的dataflow论文,其中两点颇有感触:

  1. dataflow提供了一个思考维度:从什么结果被计算、在事件时间的哪里计算、处理时间的什么时候观察到结果、以及早先的结果如何与之后的修正相关。这种思维方式的转变是很重要的。
  2. dataflow分析了lambda架构的思想,即实时计算结果不准确但是低延迟,然后在批处理中修正结果,达到最终正确性。它对其推广,在流处理中使用触发器低延时产生结果,这个结果不一定准确,然后在后续计算中,利用晚到的数据产生新的结果来修正之前的结果,从而实现低延迟和正确性,而不是依赖等待数据完整来实现正确性(这会造成延时)

下面是它的主要内容:

A single unified model

  • 允许在无界乱序数据源上,使用各种correctness、latency、cost的combinations和tradeoff,进行事件时间有序结果、按数据自身特征划分窗口进行计算
  • 分解data pipeline为四个维度,提供清晰性、可组合性和灵活性
    • What results are being computed
    • Where in event time they are being computed.
    • When in processing time they are materialized.
    • How earlier results relate to later re nements.
  • 分离数据处理的逻辑与下层的实现,允许基于correctness, latency, and cost选择batch, micro-batch, or streaming engine

Concrete contribution

  • A windowing model which supports unaligned event-time windows, and a simple API for their creation and use.
  • A triggering model that binds the output times of results to runtime characteristics of the pipeline, with a powerful and flexible declarative API for describing desired triggering semantics.
  • An incremental processing model that integrates retractions and updates into the windowing and triggering models described above

Dataflow Model

Core Primitives

  • ParDo
  • GroupByKey

Windowing

  • unaligned windows

  • windowing can be broken apart:

    • AssignWindows
      • Elements are initially assigned to a default global window, covering all of event time, providing semantics that match the defaults in the standard batch model.
      • Since windows are associated directly with the elements to which they belong, this means window assignment can happen anywhere in the pipeline before grouping is applied. This is important, as the grouping operation may be buried somewhere downstream inside a composite transformation
    • MergeWindows
      • All are initially placed in a default global window by the system.
      • Then implementation of AssignWindows puts each element into a single window.
      • DropTimestamps
      • GroupByKey
      • MergeWindows
      • GroupAlsoByWindow
      • ExpandToElements

Triggers & Incremental Processing

  • Triggers are complementary to the windowing model, in that they each affect system behaviour along a different axis of time:

  • Windowing determines where in event time data are grouped together for processing.

  • Triggering determines when in processing time the results of groupings are emitted as panes.

  • Triggers system provides a way to control how multiple panes for the same window relate to each other

  • Discarding

  • Accumulating

  • Accumulating & Retracting

Design Principles

  • Never rely on any notion of completeness.
  • Be flexible, to accommodate the diversity of known use cases, and those to come in the future.
  • Not only make sense, but also add value, in the context of each of the envisioned execution engines.
  • Encourage clarity of implementation.
  • Support robust analysis of data in the context in which they occurred.