Logstash Avro output cannot be decoded by Apache avro-tools
I am a beginner with both Logstash and Avro. I am setting up a system with Logstash as the producer for a Kafka queue. However, I am running into the problem that the Avro-serialized events produced by Logstash cannot be decoded by the avro-tools jar (version 1.8.2) that Apache provides. Furthermore, I notice that the serialized output of Logstash differs from that of avro-tools.
We have the following setup:
- Logstash version 5.5
- Logstash Avro codec version 3.2.1
- Kafka version 0.10.1
- avro-tools jar version 1.8.2
As an example, consider the following schema:
    {
      "name" : "avrotestschema",
      "type" : "record",
      "fields" : [
        { "name" : "testfield1", "type" : "string" },
        { "name" : "testfield2", "type" : "string" }
      ]
    }

and the following JSON string:

    {"testfield1":"somestring","testfield2":"anotherstring"}

When serializing using Logstash.

Logstash config file:

    input {
      stdin {
        codec => json
      }
    }
    filter {
      mutate {
        remove_field => ["@timestamp", "@version"]
      }
    }
    output {
      kafka {
        bootstrap_servers => "localhost:9092"
        codec => avro {
          schema_uri => "/path/to/testschema.avsc"
        }
        topic_id => "avrotestout"
      }
      stdout {
        codec => rubydebug
      }
    }

Output (using cat):
    FHNvbWVzdHJpbmcaYW5vdGhlcnN0cmluZw==

When serializing using avro-tools.

Command:
    java -jar avro-tools-1.8.2.jar jsontofrag --schema-file testschema.avsc message.json

Output:

    somestringanotherstring

Command:

    java -jar avro-tools-1.8.2.jar fromjson --schema-file testschema.avsc message.json

Output:

    objavro.codenullavro.schema▒{"type":"record","name":"avrotestschema","fields":[{"name":"testfield1","type":"string"},{"name":"testfield2","type":"string"}]}▒▒▒▒&70▒▒hs▒u2somestringanotherstring▒▒▒▒&70▒▒hs▒u

So our question is: how do we configure Logstash such that its output becomes compatible with the Apache avro-tools jar?
Update: I found out that the Avro output produced by Logstash is Base64 encoded. However, I cannot find where this happens, or how to make the output compatible with avro-tools.
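One way to check the Base64 observation is to decode the Logstash output by hand and read it as a bare Avro datum. The following is only a minimal sketch using the Python standard library, not the codec's own implementation: the helper names (`read_long`, `read_string`, `decode_record`) are made up for illustration, and the decoder handles only a record whose fields are all strings, as in the example schema. The Base64 string is written in the case-sensitive form that corresponds to the example record.

```python
import base64


def read_long(buf, pos):
    """Read one Avro int/long (zigzag-encoded varint) starting at pos."""
    shift = 0
    acc = 0
    while True:
        byte = buf[pos]
        pos += 1
        acc |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            break
        shift += 7
    # Undo the zigzag mapping to recover the signed value.
    return (acc >> 1) ^ -(acc & 1), pos


def read_string(buf, pos):
    """Read one Avro string: a long length followed by UTF-8 bytes."""
    length, pos = read_long(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


def decode_record(payload_b64, field_names):
    """Base64-decode the payload and read one string per field, in schema order."""
    raw = base64.b64decode(payload_b64)
    pos = 0
    record = {}
    for name in field_names:
        record[name], pos = read_string(raw, pos)
    if pos != len(raw):
        raise ValueError("unexpected trailing bytes after the record")
    return record


# Output of the Logstash pipeline for the example message.
logstash_output = "FHNvbWVzdHJpbmcaYW5vdGhlcnN0cmluZw=="
print(decode_record(logstash_output, ["testfield1", "testfield2"]))
```

If the payload decodes cleanly with this sketch, the Logstash output is the same bare datum that jsontofrag writes, only Base64 encoded. The fromjson output differs for another reason: it is an Avro container file, which additionally carries a header with the schema and sync markers around the data.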
As mentioned in the update, I found out that the standard Logstash Avro codec adds a non-optional Base64 encoding to the Avro output. I found this undesirable, so I forked the codec and made the encoding configurable. I tested it and it worked out of the box on several of our systems.
The fork is available on GitHub: https://github.com/rubyan/logstash-codec-avro
To set (or unset) the Base64 encoding, add this to the Logstash config file:

    output {
      stdout {
        codec => avro {
          schema_uri => "schema.avsc"
          base64_encoding => false
        }
      }
    }
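Where switching codecs is not an option, the Base64 layer could instead be removed on the consuming side before the data is handed to avro-tools. The sketch below is an assumption about how that might look, not part of the fork: `write_fragments` is a hypothetical helper that takes Base64 lines (one message per line) and writes the raw Avro bytes to a binary stream.

```python
import base64
import io


def write_fragments(base64_lines, out):
    """Write the raw Avro datum for each non-empty Base64 line to out.

    Returns the number of messages written.
    """
    count = 0
    for line in base64_lines:
        text = line.strip()
        if not text:
            continue
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently skipping them.
        out.write(base64.b64decode(text, validate=True))
        count += 1
    return count


# Demo with the example message from the question, written to an
# in-memory buffer instead of a file.
buf = io.BytesIO()
written = write_fragments(["FHNvbWVzdHJpbmcaYW5vdGhlcnN0cmluZw==\n"], buf)
print(written, buf.getvalue())
```

For real data, the same function could be called with `sys.stdin` and a file opened in binary mode. A file holding a single decoded message should then be readable with the avro-tools fragtojson command and the same --schema-file that was used for jsontofrag, although this has not been verified against version 1.8.2 here.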