Stream processing with Kafka and Go
June 16, 2016

Tamás Michelberger
Secret Sauce Partners, Inc

Agenda
- Let's talk about Kafka
- How to use Kafka from Go
- How we use Kafka at SSP

What is Kafka?
From the docs: "Kafka is a distributed, partitioned, replicated commit log service."
- Producers and consumers
- Topics for maintaining a feed of messages
- Originally comes from LinkedIn

Is it any good?

A typical microservice architecture

Apache Kafka + Zookeeper = 3.5 million writes per second (http://www.slideshare.net/hyderabadscalability/apache-kafka-zookeeper-35-million-writes-per-second)

Introducing Kafka to the mix

How does it work from a user's perspective?
- Topics are the main unit of organization
- Topics can have multiple partitions
- Partitions are the unit of parallelism
- Offsets for keeping track of consumer progress (see the sketch below)
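A rough sketch of how these pieces show up in sarama; the broker address and topic name are placeholders:

import "github.com/Shopify/sarama"

consumer, err := sarama.NewConsumer([]string{"localhost:9092"}, nil)
if err != nil {
    log.Fatal(err)
}
// A topic is split into numbered partitions; each can be read in parallel.
partitions, err := consumer.Partitions("mytopic")
if err != nil {
    log.Fatal(err)
}
for _, p := range partitions {
    // Within a partition, every message gets a monotonically increasing offset.
    log.Printf("topic %q has partition %d", "mytopic", p)
}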

"Dumb" server and smart clients Do one thing but do it well Server is basically a transport mechanism with some housekeeping Client does: partitioning consumer group orchestration (only in <0.9) o set tracking

Message formats
- Messages are just byte arrays; the server never tries to make sense of them
- Clients are free to choose whatever format they want
- Send some (semi-)structured data such as JSON (example below)
- Even better: messages with associated schemas (Avro, Protobuf, Thrift)
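Sending JSON, for example, is just marshalling into the message value; the Product type and topic here are made up, and producer is the async producer set up on a later slide:

import "encoding/json"

type Product struct {
    ID    string  `json:"id"`
    Price float64 `json:"price"`
}

payload, err := json.Marshal(Product{ID: "sku-42", Price: 19.99})
if err != nil {
    log.Fatal(err)
}
producer.Input() <- &sarama.ProducerMessage{
    Topic: "products",
    Key:   sarama.StringEncoder("sku-42"),
    Value: sarama.ByteEncoder(payload), // raw bytes, opaque to the broker
}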

Using Kafka from Go

Producers

import "github.com/Shopify/sarama"

producer, err := sarama.NewAsyncProducer([]string{"localhost:9092"}, nil)
if err != nil {
    log.Fatal(err)
}
// Errors come back on a channel; drain it or the producer can block.
go func() {
    for err := range producer.Errors() {
        log.Printf("producer couldn't send message: %v", err)
    }
}()
producer.Input() <- &sarama.ProducerMessage{
    Topic: "mytopic",
    Key:   sarama.StringEncoder("key"),
    Value: sarama.StringEncoder("message content"),
}
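sarama also has a synchronous producer that blocks until the broker acknowledges the write; a minimal sketch, simpler when throughput is not a concern:

producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, nil)
if err != nil {
    log.Fatal(err)
}
// SendMessage blocks and reports where the message ended up.
partition, offset, err := producer.SendMessage(&sarama.ProducerMessage{
    Topic: "mytopic",
    Value: sarama.StringEncoder("message content"),
})
if err != nil {
    log.Fatal(err)
}
log.Printf("stored on partition %d at offset %d", partition, offset)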

Consumers

import "github.com/Shopify/sarama"

consumer, err := sarama.NewConsumer([]string{"localhost:9092"}, nil)
if err != nil {
    log.Fatal(err)
}
partitionConsumer, err := consumer.ConsumePartition("my_topic", 0, sarama.OffsetNewest)
if err != nil {
    log.Fatal(err)
}
for message := range partitionConsumer.Messages() {
    log.Printf("Consumed message offset %d", message.Offset)
}
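One thing the loop above glosses over is shutdown: Messages() only closes once the partition consumer is asked to stop. A sketch of a clean exit, trapping SIGINT before entering the loop:

import (
    "os"
    "os/signal"
)

signals := make(chan os.Signal, 1)
signal.Notify(signals, os.Interrupt)

go func() {
    <-signals
    // AsyncClose drains in the background and then closes Messages(),
    // so the range loop exits cleanly.
    partitionConsumer.AsyncClose()
}()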

Consumer groups

This is a 0.8.2 example, but the API for 0.9 is very similar.

import "github.com/wvanbergen/kafka/consumergroup"

consumer, err := consumergroup.JoinConsumerGroup(
    "ExampleConsumerGroup",
    []string{"topic.with.single.partition", "topic.with.multiple.partitions"},
    []string{"localhost:2181"}, // Zookeeper
    nil)
if err != nil {
    log.Fatal(err)
}
for event := range consumer.Messages() {
    // Process event
    log.Println(string(event.Value))
    // Ack event
    consumer.CommitUpto(event)
}
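For the 0.9-style, broker-coordinated groups, the bsm/sarama-cluster package exposed a very similar API at the time; a sketch from memory, so treat the exact names as assumptions:

import cluster "github.com/bsm/sarama-cluster"

config := cluster.NewConfig()
consumer, err := cluster.NewConsumer(
    []string{"localhost:9092"}, // Kafka brokers, not Zookeeper
    "ExampleConsumerGroup",
    []string{"my_topic"},
    config)
if err != nil {
    log.Fatal(err)
}
for msg := range consumer.Messages() {
    log.Println(string(msg.Value))
    consumer.MarkOffset(msg, "") // commit progress for the group
}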

Kafka at SSP
- Single Kafka node
- A handful of topics
- Data pipeline for product and transaction feed processing
- A little under 500,000 messages a day

[Diagram slides: Fit Predictor product feed; product consumer and Style Finder]

References
- Kafka documentation: http://kafka.apache.org/documentation.html
- sarama godoc: https://godoc.org/github.com/Shopify/sarama
- Apache Kafka + Zookeeper = 3.5 million writes per second: http://www.slideshare.net/hyderabadscalability/apache-kafka-zookeeper-35-million-writes-per-second
- Apache Kafka 0.10: Evaluating Performance in Distributed Systems: https://engineering.heroku.com/blogs/2016-05-27-apache-kafka-010-evaluating-performance-in-distributed-systems/

Thank you
Tamás Michelberger, Secret Sauce Partners, Inc
@tmichelberger (http://twitter.com/tmichelberger)
