When importing data with Flume, you might want to route events to multiple destinations (e.g., different directories in HDFS) based on their content. Flume provides a feature called multiplexing for this purpose; this article is a guide to configuring it.
Example
We have a source whose events carry a header named State. Because we only care about data from California and New York, we want to route CA and NY events to separate sinks and drop everything else. The configuration uses a multiplexing channel selector keyed on the State header, plus a null sink that discards the events we are not interested in.
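A minimal sketch of such a configuration might look like the following. The agent name (a1), the component names (r1, c_ca, k_ca, and so on), the avro source with its port, the memory channels, and the HDFS paths are all illustrative assumptions. It also assumes the State header carries the literal values CA and NY. Only the selector and null sink settings relate directly to the scenario described above.

```properties
# Hypothetical agent "a1": one source, three channels, three sinks.
a1.sources = r1
a1.channels = c_ca c_ny c_null
a1.sinks = k_ca k_ny k_null

# Source: an avro source is assumed here; substitute your own source type.
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141
a1.sources.r1.channels = c_ca c_ny c_null

# Multiplexing selector: pick the channel from the value of the State header.
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = State
a1.sources.r1.selector.mapping.CA = c_ca
a1.sources.r1.selector.mapping.NY = c_ny
# Events with any other State value (or no State header) go to the default channel.
a1.sources.r1.selector.default = c_null

# Channels: memory channels are assumed for brevity.
a1.channels.c_ca.type = memory
a1.channels.c_ny.type = memory
a1.channels.c_null.type = memory

# Sinks for the events we keep; the HDFS paths are placeholders.
a1.sinks.k_ca.type = hdfs
a1.sinks.k_ca.hdfs.path = /flume/events/ca
a1.sinks.k_ca.channel = c_ca

a1.sinks.k_ny.type = hdfs
a1.sinks.k_ny.hdfs.path = /flume/events/ny
a1.sinks.k_ny.channel = c_ny

# Null sink: drains the default channel and discards its events.
a1.sinks.k_null.type = null
a1.sinks.k_null.channel = c_null
```

The default channel is what makes the filtering work: any event that does not match a mapping ends up in c_null, and the null sink drains it so the channel does not fill up.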
Note:
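If the events do not have headers and the state information is inside the event body, you can implement a custom Interceptor that copies the state into a header. Interceptors run before the channel selector, so the selector can then route on that header.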
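As a hedged illustration of how an interceptor is attached to the source: when the state can be matched with a regular expression, Flume's built-in regex_extractor interceptor may be enough, without writing a custom class. The sketch below assumes a hypothetical body format containing text such as state=CA, and reuses the hypothetical agent and source names a1 and r1. The commented-out line shows how a custom interceptor would be referenced instead; the class name com.example.StateInterceptor is invented for illustration.

```properties
# Attach one interceptor, "i1", to the source.
a1.sources.r1.interceptors = i1

# Built-in regex_extractor: copies regex capture groups into event headers.
a1.sources.r1.interceptors.i1.type = regex_extractor
# Assumed body format: the two-letter state code follows "state=".
a1.sources.r1.interceptors.i1.regex = state=([A-Z]{2})
# The first capture group is stored in the header named State.
a1.sources.r1.interceptors.i1.serializers = s1
a1.sources.r1.interceptors.i1.serializers.s1.name = State

# For a custom interceptor, point type at its Builder class instead
# (hypothetical class name):
# a1.sources.r1.interceptors.i1.type = com.example.StateInterceptor$Builder
```

A custom Interceptor remains the more flexible option when the state has to be parsed from a structured body (for example JSON) rather than matched with a single pattern.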