Kafka differences: topics, partitions, publish/subscribe and queue.

When I first started looking at Kafka as an event publishing system, my first question was: how do I create pub/sub and queue-style delivery? There was no quick answer, because of the way Kafka works. My previous article covered the Kafka basics, which you can, of course, read to learn more about this cool commit log. In this article I'll try to explain how Kafka differs from similar systems and answer the interesting questions I had in the beginning. Why is Kafka a commit log? Simply because Kafka works differently from other pub/sub systems: it is a commit log to which new messages are constantly appended, and each message has its own unique id, called an offset. Okay, so how do you consume this so-called 'commit log'? A consumer stores one thing, the offset, and it is responsible for reading messages. Consider this console consumer: bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning As you can see, this consumer reads the whole log from the beginning. Are messages ever deleted? Yes, after some time; there's a retention policy. So, how do you create the pub/sub model? Every consumer should...
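The queue-versus-pub/sub question the excerpt raises comes down to Kafka consumer groups: consumers sharing a `group.id` split the partitions between them (queue semantics), while consumers in different groups each receive every message (pub/sub). Below is a minimal sketch of the relevant consumer configuration using plain `java.util.Properties`; the group names (`billing-workers`, `audit-group`) and broker address are illustrative assumptions, not from the article.

```java
import java.util.Properties;

public class ConsumerGroups {
    // Build the configuration for one consumer; group.id decides the semantics.
    static Properties consumerConfig(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed local broker
        props.put("group.id", groupId);
        // Consumers in the SAME group split partitions  -> queue semantics.
        // Consumers in DIFFERENT groups each read it all -> pub/sub semantics.
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Queue: two workers share one group, so each offset is consumed once overall.
        Properties workerA = consumerConfig("billing-workers");
        Properties workerB = consumerConfig("billing-workers");
        // Pub/sub: an independent group re-reads the whole log from its own offset.
        Properties auditor = consumerConfig("audit-group");
        System.out.println(workerA.getProperty("group.id")
                .equals(workerB.getProperty("group.id"))); // true
        System.out.println(auditor.getProperty("group.id"));
    }
}
```

These `Properties` would be passed to a `KafkaConsumer` (from the kafka-clients library) that then calls `subscribe("test")`; only the group assignment differs between the two delivery models.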

Apache Kafka. Basics

Preface Today we live in a world with no fixed IP addresses, one that changes dynamically minute by minute. As the number of services grows every day, new problems appear. Story Imagine you have a monolithic application that you want to rewrite into a microservices architecture. You start with a single service which, let's say, handles profile functionality, and you use MongoDB as its database. At this step you don't have any trouble, because it's a single service and it doesn't interact with the outside world. You develop the required endpoints and everything works fine. Next, imagine you start building another service, say a billing service, which uses PostgreSQL for storing transactions. Besides that, you need to know when the Profile service receives a PUT request and updates a profile. So you have two options: develop an external API for this, or work with the Profile service's database directly. Say you choose to work with the database, and this is where your problems begin: you have coupled the profile and billing services together. You are signing a contract that, from now on, you have to think about two databases. And this sucks. Here's why. After some time you receive...

Tee Streaming

A few days ago I faced an issue with using a Java InputStream in parallel. Imagine the following situation: you have an InputStream that you need to consume from two places at once. First, you can't read it in parallel, because an InputStream keeps an internal pointer that stores the current stream position. A more realistic scenario is to make the first call asynchronous and leave the second as it is. But again, if we are working with streams, once a stream has been read fully there shouldn't be anything left to read, right? So this article is about the problem of parallel reads and how to fix it. Look at this example to understand why a parallel stream read is a bad idea. The output will be similar to: main thread line: Number1 Number2 Number3 Number4 Number5 Number6 Number7 Number8 Number9 Number... t1 line: Number831 Number832 Number833 Number834 Number835 Number836 Number837 Number838 Number8... As we can see, some of the numbers end up in the first line and some in the second. The org.ivanursul.ghost.Main thread could also be executed first, and then the result is even funnier: main thread line: Number1 Number2 Number3 Number4 Numbe t1 line: Because the main thread read everything first, there was nothing left for the t1 thread to read. TeeInputStream The idea is...
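The tee approach the excerpt is heading toward can be sketched in a few lines: every byte read from the source is copied to a branch stream, so a second consumer can replay the data after the first pass. This is a simplified hand-rolled version of the idea (Apache Commons IO ships a full `TeeInputStream`); the class and demo names are mine, not necessarily the article's.

```java
import java.io.*;

// Minimal tee stream: bytes read from the source are also written to a branch,
// so whatever the first reader consumes remains available to a second reader.
class TeeInputStream extends FilterInputStream {
    private final OutputStream branch;

    TeeInputStream(InputStream source, OutputStream branch) {
        super(source);
        this.branch = branch;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) branch.write(b);          // copy the byte as it goes by
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) branch.write(buf, off, n);  // copy the chunk as it goes by
        return n;
    }
}

class TeeDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        InputStream in = new TeeInputStream(
                new ByteArrayInputStream("Number1 Number2 Number3".getBytes()), copy);
        String firstPass = new String(in.readAllBytes()); // "main thread" drains the source
        String secondPass = copy.toString();              // "t1" reads the copy afterwards
        System.out.println(firstPass.equals(secondPass)); // true
    }
}
```

The design trade-off: the second reader only sees data after the first has read it, so this avoids the shared-pointer race by serializing the reads rather than truly parallelizing them.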

Deploying your application to cloud using docker-machine

Problem As part of my investigation into what Docker is, I want to do a simple and useful thing: deploy my application in a completely convenient manner. Let's say I'm using DigitalOcean as a cloud provider. Because my application is too small to justify a complex deployment infrastructure, I'd like to be able to deploy everything from my laptop using a DigitalOcean token. Solution I'll deploy everything using docker-machine together with the digitalocean cloud driver, and I'll describe step by step how to do it. Set up a new DigitalOcean token. Go to cloud.digitalocean.com/settings/api/tokens and generate a new token. Then take your token and export the following variables; you'll see why the later commands need them. export DIGITALOCEAN_ACCESS_TOKEN=${newly-generated-token} export DIGITALOCEAN_PRIVATE_NETWORKING=true export DIGITALOCEAN_IMAGE=debian-8-x64 Create a new machine As simple as it can be: I'll create a new DigitalOcean instance. Open your terminal and type the following command. docker-machine create \ -d digitalocean \ my-application A few explanations: -d digitalocean means that you will use the digitalocean driver for deployment. Out of the box, the driver will pick up the variables that we exported a few minutes ago. While we wait for the machine to finish provisioning, let's open...
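Once `docker-machine create` finishes, the usual next step is to point the local docker CLI at the new droplet. This is a sketch of the standard docker-machine workflow under the assumption that the machine is named `my-application` as in the article; it requires docker-machine installed and a valid DigitalOcean token, so it is an ops fragment rather than something runnable in isolation.

```shell
# Verify the droplet was provisioned and is running.
docker-machine ls

# Export DOCKER_HOST and TLS settings so the local docker CLI
# talks to the remote daemon on the droplet.
eval "$(docker-machine env my-application)"

# From here, ordinary docker commands run against the droplet.
docker info
```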

Combining Spring Integration Testing with Mockito

Integration tests… they are perfect for testing your data flows. You send a request to your application and can verify how the data is processed throughout it: you see how the request is received by your controller, then passed on to the service, DAO, or whatever other layers your application has. Sometimes you don't want a layer to do real work in Spring. For instance, your DAO layer uses native queries to fetch data, and your embedded test database doesn't support that query syntax. Naturally, you still want your integration tests, but without a real call to the database. What can we do in this situation? I'd suggest mocking the DAO layer with Mockito. Let's demonstrate how it works. Project setup I use Spring Initializr to set up projects, so let's create a simple Spring Boot application. The code can be found here. Project structure ├── README.md ├── build.gradle ├── gradle │   └── wrapper │   ├── gradle-wrapper.jar │   └── gradle-wrapper.properties ├── gradlew ├── gradlew.bat ├── spring-integration-mockito.iml └── src ├── main │   ├── java │   │   └── org │   │   └── springmockito │   │   └── demo │   │   ├── DemoApplication.java │   │   ├── ExampleDao.java │   │  ...
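The principle behind mocking the DAO in an integration test is substituting a canned test double for the real data-access bean. The article does this with Mockito inside a Spring context (e.g. registering a `Mockito.mock(...)` as a bean); since that needs Spring and Mockito on the classpath, here is a dependency-free sketch of the same substitution, with hypothetical names (`ExampleDao`, `ExampleService`) that merely echo the project tree above.

```java
import java.util.List;

// In production this DAO would run a native query against the database.
interface ExampleDao {
    List<String> findNames();
}

// The layer under test: depends only on the DAO abstraction,
// so any implementation can be injected.
class ExampleService {
    private final ExampleDao dao;

    ExampleService(ExampleDao dao) {
        this.dao = dao;
    }

    String joinedNames() {
        return String.join(",", dao.findNames());
    }
}

class ExampleServiceTest {
    static String run() {
        // Hand-rolled stand-in for Mockito's
        // when(dao.findNames()).thenReturn(List.of("alice", "bob")):
        ExampleDao fakeDao = () -> List.of("alice", "bob");
        ExampleService service = new ExampleService(fakeDao);
        return service.joinedNames(); // exercises the service with no database call
    }
}
```

With Mockito the stub is created and verified for you, and Spring's test support lets the mocked bean replace the real one in the application context, so the rest of the flow (controller, service) still runs for real.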