When importing data with Flume, you might want to route events to multiple destinations (e.g., different directories in HDFS) based on their content. Flume provides a feature called multiplexing for this purpose, and this article is a guide to configuring it.
Suppose we have a source whose events carry a header named State. We only care about data from California and New York, so we want to filter the events and route those from CA and NY to different sinks. To do this, the configuration uses a multiplexing channel selector keyed on the State header, plus a null sink that discards the events we are not interested in.
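A minimal sketch of such a configuration is shown here. The agent name `a1`, the component names, the Avro source with its port, the memory channels, and the HDFS paths are all assumptions made for illustration. The sketch also assumes the State header holds the literal values `CA` and `NY`.

```properties
# Agent "a1": one source, three channels, three sinks (all names are illustrative)
a1.sources = r1
a1.channels = c_ca c_ny c_null
a1.sinks = k_ca k_ny k_null

# Source type is an assumption; any source whose events carry a State header works
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
a1.sources.r1.channels = c_ca c_ny c_null

# Multiplexing selector: choose the channel from the value of the State header
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = State
a1.sources.r1.selector.mapping.CA = c_ca
a1.sources.r1.selector.mapping.NY = c_ny
# Events with any other State value (or no State header) go to the discard channel
a1.sources.r1.selector.default = c_null

a1.channels.c_ca.type = memory
a1.channels.c_ny.type = memory
a1.channels.c_null.type = memory

# One HDFS sink per state; the paths are hypothetical
a1.sinks.k_ca.type = hdfs
a1.sinks.k_ca.channel = c_ca
a1.sinks.k_ca.hdfs.path = /flume/events/CA

a1.sinks.k_ny.type = hdfs
a1.sinks.k_ny.channel = c_ny
a1.sinks.k_ny.hdfs.path = /flume/events/NY

# Null sink: drains the discard channel and drops every event it receives
a1.sinks.k_null.type = null
a1.sinks.k_null.channel = c_null
```

The `selector.default` entry is what does the filtering: any event that matches no mapping lands in `c_null` and is dropped by the null sink.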
If the events do not have headers and the state information is inside the content, you can implement a custom Interceptor to modify the events.
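Attaching such an interceptor to the source could look roughly like the fragment below. The class `com.example.flume.StateInterceptor` is hypothetical: it stands for your own implementation of Flume's `Interceptor` interface that reads the state from the event body and sets the State header. The agent and source names (`a1`, `r1`) are likewise assumptions.

```properties
# Attach one interceptor, named i1, to source r1
a1.sources.r1.interceptors = i1
# For a custom interceptor, type is the fully qualified name of its Builder class
# (hypothetical class; replace with your own implementation)
a1.sources.r1.interceptors.i1.type = com.example.flume.StateInterceptor$Builder
```

Interceptors run before the channel selector, so a header set here is available for multiplexing.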