I’ve been working with Kafka for more than two years, and I wasn’t sure whether a Kafka consumer uses more RAM when a topic has more partitions. I couldn’t find any useful information on the internet, so I decided to measure everything myself.

Inputs

I started with one broker, since I am interested in the actual memory consumption for topics with 1 and 1000 partitions. I know that launching Kafka as a cluster can differ, because of replication, acknowledgments, and other cluster concerns, but let’s skip that for now. Two basic commands launch a single-node Kafka setup:

```
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
```

I created two topics: topic1 with 1 partition, and topic2 with 1000 partitions. I believe this difference in partition counts is enough to understand memory consumption.

```
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic topic1
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1000 --topic topic2
```

Conveniently, Kafka ships with kafka-producer-perf-test.sh, a performance script that lets us load-test Kafka:

```
bin/kafka-producer-perf-test.sh --topic topic1 --num-records 99999999999999 --throughput 1 --producer-props bootstrap.servers=localhost:9092 key.serializer=org.apache.kafka.common.serialization.StringSerializer value.serializer=org.apache.kafka.common.serialization.StringSerializer --record-size 100
```

So I launched load tests one after another, inserting data into the two topics at throughputs of 1, 200, 500, and 1000 messages/second. I collected all...
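On the consumer side, the bundled kafka-consumer-perf-test.sh can drive a consumer against each topic while you sample its resident memory. A minimal sketch, assuming the broker runs locally and a Kafka distribution where the perf tool takes --broker-list (the exact flags vary between Kafka versions):

```
# Consume a fixed number of records from topic1 (repeat with topic2)
bin/kafka-consumer-perf-test.sh --broker-list localhost:9092 \
  --topic topic1 --messages 1000000 &
CONSUMER_PID=$!

# Sample the consumer JVM's resident set size (RSS, in KB) once per second
while kill -0 "$CONSUMER_PID" 2>/dev/null; do
  ps -o rss= -p "$CONSUMER_PID"
  sleep 1
done
```

Comparing the sampled RSS for topic1 against topic2 gives a first approximation of how partition count affects consumer memory; for heap-level detail you could also watch the JVM over JMX.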
Hi, my name is Ivan Ursul, and I have been a freelance engineer since 2015. It’s been a while since I started my career as an independent freelancer. I began as an engineer at Upwork, in one of their teams, where I was involved first in a reporting backend service and then in the time-tracker pipeline, which serves as the backend for the Upwork Tracker Application (UTA) client. Today I continue my work with Upwork, but I am also actively working with other customers, who are all very different. I’ve successfully completed 22 projects since the very beginning. That’s why I decided to write an article about the different aspects of a freelance engineer’s everyday life. You may agree or disagree; either way, I encourage you to leave your thoughts under this article. This article is grounded in the Upwork platform; I haven’t used other platforms, but I am quite sure the approach is the same.

Learn your customer

You will have to figure out what your clients have in common. Are all of them technical? Do you prefer to work with non-technical people? What is your industry domain? These are questions you should have answers to. After you realize what your customers have in common...
A few days ago I had a problem on one of the projects I am working on: we had a memory leak. Over a two-day period our services crashed three times, so I decided to investigate. Nothing I’m going to talk about is rocket science; there are no clever, tricky tips, just a straightforward explanation of how you can find memory leaks.

Exposing JMX

I had the problem on a production instance, so I started my services with JMX enabled. Just start your apps with the following params:

```
-Djavax.management.builder.initial=
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=${whatever_port}
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
```

Starting JVisualVM

Just open your terminal and type jvisualvm. You should see the following screen. Add a remote connection, specify the JMX port, and connect.

Waiting

You have to wait some time for retained memory to build up before you can analyse it. How long to wait is up to you; in my case, 4-5 hours was enough to get 100% proof of which part of the system was leaking.

Getting a heap dump

Now go to the Monitor section, press the Heap Dump button, and specify the path where the heap dump should be saved. In my case it was /tmp/**.hprof. Then...
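If JVisualVM can’t attach (for example, on a headless production box), the JDK’s own command-line tools can capture the same heap dump. A minimal sketch, assuming the JDK tools are on the PATH and 12345 is a hypothetical PID of the leaking JVM:

```
# List running JVMs to find the target PID
jcmd -l

# Capture a heap dump with jcmd (JDK 7+)
jcmd 12345 GC.heap_dump /tmp/heap.hprof

# Or with jmap; 'live' triggers a full GC first, so the dump
# contains only reachable objects
jmap -dump:live,format=b,file=/tmp/heap.hprof 12345
```

The resulting .hprof file can then be loaded into JVisualVM or Eclipse MAT for the same retained-size analysis.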
Jekyll

Why Jekyll?

I decided to use Jekyll because I previously had a blog on the Ghost platform. I was waiting for its new 1.0 release, but then I suddenly realized that I didn’t want to use it, because:

- I have to maintain it on my own
- I have to pay $10 every week for a 1GB DigitalOcean instance
- SSL certificates
- Ghost is written in JavaScript, so it has its own scalability quirks

Jekyll, on the other hand, is hosted on GitHub and is a great, modern instrument for writing your blog. The idea is that you store all your images and posts on GitHub.

Speed

My first question was about performance. If these are static files hosted on GitHub, surely they have to be extremely slow? No, that’s not true; according to my measurements, the new Jekyll version is even faster than the Ghost version.

Convenience

Another question I asked myself was how I would write posts. Because Jekyll has no admin GUI, I needed to find a way to write posts. I find the MacDown tool very convenient for writing posts on my laptop. Another problem is previewing posts. For example, you want to see how your post will look...
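Previewing is usually solved by running Jekyll locally before pushing to GitHub. A minimal sketch, assuming a standard Bundler-based Jekyll setup (the port is Jekyll’s default):

```
# Install the site's dependencies once
bundle install

# Build and serve the blog locally;
# --watch rebuilds on file changes, --drafts also renders posts from _drafts/
bundle exec jekyll serve --watch --drafts

# Preview the site at:
# http://localhost:4000
```

This shows the post exactly as GitHub Pages will render it, without publishing anything.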
Intro

Why do people build microservices architectures? Let’s start with a review of a classical, old-school, n-tier application, which is often called a monolith. Everyone has worked with this type of application: we studied them at university, we did lots of projects with this architecture, and at first glance it’s a good architecture. We can immediately start building systems this way; they are perfectly simple and easy to implement. But as such a system becomes bigger, it starts running into lots of different issues.

The first issue is scalability: a monolith doesn’t scale well. If some part of the system needs to scale, the whole system has to be scaled. Even though the system is divided into modules, it runs as one physical process, so there’s no option to scale a single module. This can lead to expensive, large instances, which you will have to maintain.

The second issue is about engineers. As the system becomes enormously big, you will have to hire a really big team for it.

Evolution

How can you address these issues? One way to evolve is to make the transition to a microservices architecture. The way it works is that you have a...