For data engineers and researchers new to stream processing, this practical book provides a professional reference that focuses on Apache Heron. Authors Huijun Wu and Maosong Fu from Twitter provide the basic knowledge you need to get started with this real-time processing engine. Learn how Heron serves as a general-purpose, modular, and extensible platform that you can use to support common real-time analytics use cases.
Through the course of this book, you'll discover approaches for tackling challenges in stream processing systems and applications. You'll also understand how to build streaming applications that can benefit from Heron's robustness, high performance, adaptability to cloud environments, and ease of use.
With this book, you'll examine:
A complete study path that shows you how to develop stream processing systems
Heron's data model, system, topology submission process, architecture, and components
How to compile the Heron source code
Methods for migrating Apache Storm topologies to Heron
Heron components, including state manager, scheduler, topology master, stream manager, instance, metrics manager, and metrics cache
Heron tools, including tracker, UI, and explorer
New features, such as health manager, Python topology, delivery semantics, and API server
Author Biography
Huijun Wu (https://www.linkedin.com/in/huijunwu/) is an engineer at Twitter, Inc. He earned his Ph.D. from the School of Computing, Informatics, and Decision Systems Engineering at Arizona State University. He previously worked for Microsoft, ARRIS, and Alcatel-Lucent, where he gained extensive project development experience. In this book, he shares the experience he gained from the Heron project.
Maosong Fu (https://www.linkedin.com/in/maosong-fu-63471a34/) is the Engineering Manager for the Real-Time Compute Team at Twitter. He is the author of several publications in the distributed systems area and has a master's degree from Carnegie Mellon University and a bachelor's degree from Huazhong University of Science and Technology.