Add Kafka Health Indicator #14088
Comments
It wasn't lost; it was reverted for the reasons explained in #12225. If you have something that addresses the concerns expressed there, I am more than happy to hear from you. Thanks for sharing, but a piece of code with no tests is not something we can use. As for the metrics support, this is unrelated and we don't deal with several topics in a single issue. There is already an issue in the Micrometer project that you could subscribe to. |
That's why it's a feature request and not a pull request.
Sorry. I thought both of them would be monitoring, but I'll use separate issues for that in the future. |
As I've already indicated, we've tried to implement this before. See #12225 for the reasons why it got reverted. If you can help in that area, we're most certainly interested. |
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed. |
I currently cannot help you with that. |
That's my personal solution.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

@Component
public class KafkaHealthIndicator implements HealthIndicator {

    private final Logger log = LoggerFactory.getLogger(KafkaHealthIndicator.class);

    private final KafkaTemplate<String, String> kafka;

    public KafkaHealthIndicator(KafkaTemplate<String, String> kafka) {
        this.kafka = kafka;
    }

    /**
     * Sends a probe record and reports DOWN if it is not acknowledged within 100 ms.
     *
     * @return the health of the Kafka connection
     */
    @Override
    public Health health() {
        try {
            // Blocks until the send is acknowledged or the timeout elapses.
            kafka.send("kafka-health-indicator", "❥").get(100, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the calling thread can still observe it.
            Thread.currentThread().interrupt();
            return Health.down(e).build();
        } catch (ExecutionException | TimeoutException e) {
            log.warn("Kafka health probe failed", e);
            return Health.down(e).build();
        }
        return Health.up().build();
    }
} |
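The bounded wait at the heart of the snippet above can be tried without Spring or a broker. The sketch below is only an illustration under stated assumptions: `BoundedProbe`, `Status` and `probe` are hypothetical names, and a plain `CompletableFuture` stands in for the future returned by `KafkaTemplate.send`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not part of Spring Boot or Spring Kafka.
final class BoundedProbe {

    enum Status { UP, DOWN }

    // Waits at most timeoutMillis for the pending operation and maps the outcome to a status.
    static Status probe(Future<?> pending, long timeoutMillis) {
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Status.UP;
        } catch (InterruptedException e) {
            // Restore the interrupt flag before reporting the failure.
            Thread.currentThread().interrupt();
            return Status.DOWN;
        } catch (ExecutionException | TimeoutException e) {
            // Either the operation failed or no acknowledgement arrived in time.
            return Status.DOWN;
        }
    }

    public static void main(String[] args) {
        // An already acknowledged "send".
        System.out.println(probe(CompletableFuture.completedFuture("ack"), 100));

        // A "send" that never completes, as when no broker answers.
        System.out.println(probe(new CompletableFuture<String>(), 50));
    }
}
```

Note that a timeout only says the acknowledgement did not arrive in time. With a limit as short as 100 ms, a slow but otherwise working cluster could plausibly be reported as down.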
I have been looking for an out-of-the-box health indicator for Kafka. It is worth noting that @MartinX3's solution provides only a connectivity health check, while @ST-DDT's solution provides a connectivity health check plus some meta info about the cluster. It would have to combine |
Any update about it? |
@snicoll @wilkinsona I would ❤️ to pick this work back up. At my company we have many, many apps all implementing their own copy/pasted variant of health checks for Kafka and KafkaStreams. I was getting ready to write an internal starter but of course wanted to first see if we could add it back into Spring Boot. I came across this issue (and the others related to the revert) and understand the reason for reverting. What are your thoughts about this in 2021? 😸 |
I'm afraid I don't know enough about Kafka and assessing its health to know for certain what we should do here. |
Yeah, it is really unfortunate that Kafka does not provide a mechanism to check health or at least give an opinion on a recommended approach. Looking back through the revert ticket, it seems like w/ the opt-in repl factor check and the updates to the admin client it was really close to being usable. It also seems that you and @snicoll came across another roadblock that made a case for reverting. I know there are 1001 other things to be dealing w/, so I'm not trying to re-hash the past, but rather to understand the limitations and see if there is something that would make sense out of the box. I am not super familiar w/ Cassandra, but I see in the CassandraDriverHealthIndicator that it is considered healthy if at least 1 node reports as "up". How is this case different? I am curious to get @garyrussell's opinion on what a "good" Kafka health indicator would be. He seems to dabble in Kafka from time to time ;) |
One of the stumbling blocks is when using transactions - the number of active brokers and in-sync replicas are broker configuration properties, which are not available on the client. If an application is using transactions and there are not enough brokers to publish a record to a particular partition, the producer hangs until a timeout occurs. It is made more complicated because There are just too many of these corner cases to come up with a single robust health indicator. |
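To make this corner case concrete, here is a toy model in plain Java. Everything in it is hypothetical (`HealthPolicySketch`, `connectivityOnly`, `writeCanSucceed`) and it does not call Kafka at all. It assumes a producer using `acks=all`, for which writes to a partition are not accepted while its in-sync replica count is below `min.insync.replicas`. The point is only that a check based on reachable brokers and the actual ability to write can disagree.

```java
// Toy model for illustration only; it does not talk to a Kafka cluster.
final class HealthPolicySketch {

    // "At least one node is up" policy, in the spirit of the Cassandra check mentioned in the thread.
    static boolean connectivityOnly(int reachableBrokers) {
        return reachableBrokers >= 1;
    }

    // Assumed broker-side rule: an acks=all write needs inSyncReplicas >= minInSyncReplicas.
    static boolean writeCanSucceed(int inSyncReplicas, int minInSyncReplicas) {
        return inSyncReplicas >= minInSyncReplicas;
    }

    public static void main(String[] args) {
        int reachableBrokers = 1;   // one broker still answers
        int inSyncReplicas = 1;     // so only one replica is in sync
        int minInSyncReplicas = 2;  // broker-side setting (assumed value)

        System.out.println("connectivity-only check: "
                + (connectivityOnly(reachableBrokers) ? "UP" : "DOWN"));
        System.out.println("acks=all write possible: "
                + writeCanSucceed(inSyncReplicas, minInSyncReplicas));
    }
}
```

With these assumed numbers the connectivity-only policy reports UP while the write cannot succeed. As the comment above points out, a client-side indicator would not readily have the value of `minInSyncReplicas` to hand, which is what makes the second check hard to implement in a generic way.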
Thanks for clarifying @garyrussell. I understand the concern in those other issues now. |
I promise not to turn this into a KafkaStreams thread discussion but am curious if there are hidden "stumbling blocks" for KafkaStreams in this area as well? For KafkaStreams health indicators at my company we have been using the |
If you are using exactly-once semantics, KafkaStreams will be affected by the insufficient in-sync replicas problem too. |
Thanks for that info @garyrussell , good to know. We are not using that currently in our streams app. It seems that "transaction" / "exactly once semantics" case is what adds a good deal of complexity to this (via need for in-sync replicas check). I wonder if it would make sense to add a simple health check that does not cover that case. Although, I don't think there is a good way to conditionally auto-configure that based on whether or not the app is configuring/using that feature of Kafka/KafkaStreams and it could be confusing to users if it works for all but that case. |
I think that's the crux of this one and, as such, I don't think we should try to provide one. I think the risk of giving an inaccurate status is too high. IMO, we should close this one. Let's see what the rest of the team thinks. |
After digging in more, I agree that there is not an easy way to provide a one-size-fits-all solution. Closing this ticket would probably solidify that decision. If something is made available by Kafka in the future that makes this feasible, then a ticket can be created at that time. |
+1 to closing. I'm going to go ahead and do that. Thanks for your efforts @Bono007 |
In previous versions of Spring Boot there was a built-in health indicator for Kafka; however, somewhere along the way it was lost.
Refs:
Please add the HealthIndicator for Kafka again and add metrics as well.
This can be achieved using the following code:
(includes both metrics and health)
Feel free to use or modify the code as you see fit.