Become a better Data Engineer with waitingforcode.com

Master stream processing

Do you have your first successful experience with batch data pipelines, and have you now been asked to implement your first stream processing jobs?

You shouldn't think of stream processing as batch processing on unbounded data. It's much more than that!

There are various stream-processing concepts. This 4-module course will lead you through them, from data ingestion to stateful stream processing!

What will I learn?

  1. Data ingestion - real-time
    • Plan
    • API Gateway vs. direct ingestion
    • API Gateway 💻 (theory + demo)
    • Change Data Capture 💻
    • Files streaming 💻
    • Delivery semantics 💻
    • Idempotency 💻
    • Batch layer
    • Real-time ad-hoc querying 💻
    • Polyglot persistence
    • Homework
  2. Data cleansing
    • Plan
    • Data enrichment 💻 (theory + demo)
    • Data anonymization
    • Deduplication 💻
    • Schema
    • Schema registry 💻
    • Schema management 💻
    • Metadata 💻
    • Binary file formats 💻
    • Monitoring and alerting 💻
    • Homework
  3. Stream processing
    • Plan
    • Patterns 💻 (theory + demo)
    • Architectures
    • Transformations 💻
    • Event time vs. processing time 💻
    • Scalability 💻
    • Auto-scaling 💻
    • Reprocessing 💻
    • Messaging patterns
    • Backpressure 💻
    • Debugging
    • Homework
  4. Stateful stream processing
    • Plan
    • Stateless vs. stateful 💻 (theory + demo)
    • State store 💻
    • Incremental and full state 💻
    • Triggers 💻
    • Watermarks and late data 💻
    • Fault-tolerance 💻
    • Idempotency 💻
    • Aggregations 💻
    • Arbitrary stateful processing 💻
    • Joins 💻
    • Windows 💻
    • Complex Event Processing 💻
    • Homework

Libraries and tools used in demos: Apache Flink, Apache Kafka, Apache Spark, Delta Lake, Debezium, Kafka Connect, KSQL, ScyllaDB

Enroll in

Watch 3 samples of the Master stream processing module

What will I get?

  • 10 hours of content (value: $600)

    Each lesson has a dedicated recording so that you can follow the course at your own pace.

  • hands-on homework exercises using modern Open Source technologies (value: $200)

    Each section ends with a homework assignment where you will have to implement the concepts presented in the lessons. If you get stuck, you can rely on the instructions document. If you lack inspiration, you can also check out the proposed solutions.

  • code snippets in Python and Scala (value: $600)

    Each demo lesson has a dedicated GitHub repository with code snippets so that you can play with the code samples on your own.

  • lifetime access to the course and material updates (value: priceless)

    Since data engineering keeps changing, the course will be updated and you'll automatically get access to the updates.

Total value: $1400. Only $98 during the first year.
