# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
This configuration defines a single agent named a1. a1 has a source that listens for data on port 44444, a channel that buffers
event data in memory, and a sink that logs event data to the console. The configuration file names the various components, then
describes their types and configuration parameters. A given configuration file might define several named agents; when a given
Flume process is launched a flag is passed telling it which named agent to manifest.
Given this configuration file, we can start Flume as follows:
$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
Note that in a full deployment we would typically include one more option: --conf=<conf-dir>. The <conf-dir> directory would include a
shell script flume-env.sh and potentially a log4j properties file. In this example, we pass a Java option to force Flume to log to the
console and we go without a custom environment script.
From a separate terminal, we can then telnet to port 44444 and send Flume an event:
$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK
The original Flume terminal will output the event in a log message.
12/06/19 15:32:19 INFO source.NetcatSource: Source starting
12/06/19 15:32:19 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
12/06/19 15:32:34 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D Hello world!. }
Congratulations - you’ve successfully configured and deployed a Flume agent! Subsequent sections cover agent configuration in
much more detail.
Using environment variables in configuration files
Flume has the ability to substitute environment variables in the configuration. For example:
a1.sources = r1
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = ${NC_PORT}
a1.sources.r1.channels = c1
NB: substitution currently works for values only, not for keys (i.e. only on the “right side” of the = sign in a config line).
This can be enabled via Java system properties on agent invocation by setting propertiesImplementation =
org.apache.flume.node.EnvVarResolverProperties.
For example:
$ NC_PORT=44444 bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console -DpropertiesImplementation=org.apache.flume.node.EnvVarResolverProperties
Note that the above is just an example; environment variables can be configured in other ways, including being set in conf/flume-env.sh.
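As a minimal sketch of the flume-env.sh alternative, the fragment below exports the variable so that the agent process inherits it. The default value 44444 is taken from the example above; placing this line in conf/flume-env.sh and relying on the variable being exported are assumptions of this sketch, not a prescribed layout.

```shell
# Hypothetical fragment of conf/flume-env.sh (sketch only).
# Keep a value already present in the environment; otherwise fall back to
# the port used in the example above.
export NC_PORT="${NC_PORT:-44444}"
```

With this in place, the NC_PORT=44444 prefix on the command line is no longer needed; the -DpropertiesImplementation option is still required for the substitution to happen.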
Logging raw data
Logging the raw stream of data flowing through the ingest pipeline is not desired behaviour in many production environments
because this may result in leaking sensitive data or security related configurations, such as secret keys, to Flume log files. By
default, Flume will not log such information. On the other hand, if the data pipeline is broken, Flume will attempt to provide clues
for debugging the problem.
One way to debug problems with event pipelines is to set up an additional Memory Channel connected to a Logger Sink, which will
output all event data to the Flume logs. In some situations, however, this approach is insufficient.
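The debugging setup described above might be sketched as follows. The commands write a stand-alone configuration to a hypothetical file, debug-tap.conf, in which the source feeds two memory channels: c1 for the regular sink and c2 for a Logger Sink. The names c2 and k2, the file name, and the use of a Null Sink as a stand-in for the pipeline's real sink are all assumptions made for illustration.

```shell
# Sketch only: write a debugging variant of the agent configuration.
# The file name debug-tap.conf and the component names c2 and k2 are hypothetical.
cat > debug-tap.conf <<'EOF'
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# With the default replicating channel selector, each event is copied
# to every channel listed here.
a1.sources.r1.channels = c1 c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# k1 stands in for the pipeline's real sink (assumed here to be a Null Sink).
a1.sinks.k1.type = null
a1.sinks.k1.channel = c1

# k2 is the debugging tap: it logs every event it receives.
a1.sinks.k2.type = logger
a1.sinks.k2.channel = c2
EOF

# The agent could then be started as in the earlier example, for instance:
#   bin/flume-ng agent --conf conf --conf-file debug-tap.conf --name a1 -Dflume.root.logger=INFO,console
```

Because the tap has its own channel, removing it later only means deleting the c2 and k2 entries and restoring the original channels line on the source.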
In order to enable logging of event- and configuration-related data, some Java system properties must be set in addition to log4j
properties.
To enable configuration-related logging, set the Java system property -Dorg.apache.flume.log.printconfig=true. This can be done either by
passing the property on the command line or by adding it to the JAVA_OPTS variable in flume-env.sh.
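A minimal sketch of the flume-env.sh route is shown below. It appends the property to whatever JAVA_OPTS already holds rather than overwriting it; preserving existing options this way is a choice made for this sketch. The command-line form is given as a comment, reusing the agent invocation from the earlier example.

```shell
# Hypothetical addition to conf/flume-env.sh (sketch only).
# Keep any options already set and append the configuration-logging property.
export JAVA_OPTS="${JAVA_OPTS:-} -Dorg.apache.flume.log.printconfig=true"

# Command-line alternative, based on the earlier example invocation:
#   bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console -Dorg.apache.flume.log.printconfig=true
```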