Skip to content

ReactorClient is not refreshing the token, when response is 401 #1130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
kvmw opened this issue Jan 28, 2022 · 14 comments
Closed

ReactorClient is not refreshing the token, when response is 401 #1130

kvmw opened this issue Jan 28, 2022 · 14 comments
Labels
question triaged Initial triage of issue has been performed

Comments

@kvmw
Copy link
Contributor

kvmw commented Jan 28, 2022

Looking at Operator.java in cf-java-client source code, I see that in case of 401 responses, the reactor-client retries the requests with new token. So there should be no InvalidAccessToken (401) thrown out of the library. But as you can see in the below stack trace, we are catching such exceptions, frequently, in our code.

cf-java-client v4.16.0

stack trace:

2022-01-28 00:49:57.294 DEBUG 36 --- [ry-client-nio-1] cloudfoundry-client.response             : 401    /oauth/clients (16 ms)
2022-01-28 00:49:57.623  INFO 36 --- [undedElastic-20] s.c.s.c.ServiceInstanceBindingController : Deleting a service instance binding
2022-01-28 00:49:57.756  INFO 36 --- [ctor-http-nio-3] .s.WorkflowServiceInstanceBindingService : Deleting service instance binding
2022-01-28 00:49:57.761 DEBUG 36 --- [ry-client-nio-2] cloudfoundry-client.request              : DELETE /oauth/clients/config-client-dfc1041e-5e88-45d4-8311-2ab87b2620f1
2022-01-28 00:49:57.776 ERROR 36 --- [ry-client-nio-2] .s.WorkflowServiceInstanceBindingService : Error deleting service instance binding. error=invalid_token: Invalid access token: expired at Thu Jan 27 17:49:54 UTC 2022

org.cloudfoundry.uaa.UaaException: invalid_token: Invalid access token: expired at Thu Jan 27 17:49:54 UTC 2022
	at org.cloudfoundry.reactor.util.ErrorPayloadMappers.lambda$null$8(ErrorPayloadMappers.java:89) ~[cloudfoundry-client-reactor-4.16.0.RELEASE.jar!/:na]
	Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
Assembly trace from producer [reactor.core.publisher.MonoFlatMap] :
	reactor.core.publisher.Mono.checkpoint(Mono.java:1884)
	org.cloudfoundry.reactor.uaa.clients.ReactorClients.delete(ReactorClients.java:111)
Error has been observed at the following site(s):
	|_ Mono.checkpoint ⇢ at org.cloudfoundry.reactor.uaa.clients.ReactorClients.delete(ReactorClients.java:111)
Stack trace:
		at org.cloudfoundry.reactor.util.ErrorPayloadMappers.lambda$null$8(ErrorPayloadMappers.java:89) ~[cloudfoundry-client-reactor-4.16.0.RELEASE.jar!/:na]
		at org.cloudfoundry.reactor.util.ErrorPayloadMappers.lambda$null$11(ErrorPayloadMappers.java:112) ~[cloudfoundry-client-reactor-4.16.0.RELEASE.jar!/:na]
		at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:118) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:67) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxHandle$HandleSubscriber.onNext(FluxHandle.java:112) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:213) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext(FluxDoFinally.java:123) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxHandleFuseable$HandleFuseableSubscriber.onNext(FluxHandleFuseable.java:178) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxContextStart$ContextStartSubscriber.onNext(FluxContextStart.java:96) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1812) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:121) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxPeek$PeekSubscriber.onComplete(FluxPeek.java:252) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:136) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onComplete(FluxDoFinally.java:138) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:136) ~[reactor-core-3.3.16.RELEASE.jar!/:3.3.16.RELEASE]
		at reactor.netty.channel.FluxReceive.onInboundComplete(FluxReceive.java:383) ~[reactor-netty-0.9.19.RELEASE.jar!/:0.9.19.RELEASE]
		at reactor.netty.channel.ChannelOperations.onInboundComplete(ChannelOperations.java:373) ~[reactor-netty-0.9.19.RELEASE.jar!/:0.9.19.RELEASE]
		at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:429) ~[reactor-netty-0.9.19.RELEASE.jar!/:0.9.19.RELEASE]
		at reactor.netty.http.client.HttpClientOperations.onInboundNext(HttpClientOperations.java:655) ~[reactor-netty-0.9.19.RELEASE.jar!/:0.9.19.RELEASE]
		at reactor.netty.channel.ChannelOperationsHandler.channelRead(ChannelOperationsHandler.java:96) ~[reactor-netty-0.9.19.RELEASE.jar!/:0.9.19.RELEASE]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) ~[netty-codec-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) ~[netty-codec-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) ~[netty-codec-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.logging.LoggingHandler.channelRead(LoggingHandler.java:271) ~[netty-handler-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1504) ~[netty-handler-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1253) ~[netty-handler-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1300) ~[netty-handler-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:508) ~[netty-codec-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:447) ~[netty-codec-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) ~[netty-codec-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) ~[netty-transport-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) ~[netty-common-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[netty-common-4.1.63.Final.jar!/:4.1.63.Final]
		at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.63.Final.jar!/:4.1.63.Final]
		at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_222]
@dmikusa
Copy link
Contributor

dmikusa commented Jan 28, 2022

Can you provide some additional information?

  1. How do you have your Java client configured? Code snippets that show connection context, token provider and clients are configured. See https://github.com/cloudfoundry/cf-java-client/#cloudfoundryclient-dopplerclient-uaaclient-builders.
  2. How frequently does your application need to renew the token? What is the lifetime for the tokens that are being given to your user or client from UAA?
  3. If you have more examples of the exceptions, that would be helpful.

Thanks

@kvmw
Copy link
Contributor Author

kvmw commented Jan 28, 2022

  • Here is the client config:
@Configuration
@EnableConfigurationProperties({UaaClientProperties.class, AppUriProperties.class})
class ServiceInstanceBindingConfiguration {

	@Bean
	UaaBackingAppCredentialsClient uaaBackingAppUserClient(ReactorUaaClient uaaClient) {
		return new UaaBackingAppCredentialsClient(uaaClient);
	}

	@Bean
	ClientCredentialsGrantTokenProvider bindingTokenProvider(UaaClientProperties uaaProperties) {
		return ClientCredentialsGrantTokenProvider.builder()
					.clientId(uaaProperties.getClientId())
					.clientSecret(uaaProperties.getClientSecret())
					.build();
	}

	@Bean
	ReactorUaaClient bindingUaaClient(ConnectionContext connectionContext,
                                          ClientCredentialsGrantTokenProvider tokenProvider) {
		return ReactorUaaClient.builder()
					.connectionContext(connectionContext)
					.tokenProvider(tokenProvider)
					.build();
	}

	@Bean
	ConnectionContext uaaConnectionContext(UaaClientProperties uaaProperties) {
		return DefaultConnectionContext.builder()
					 .rootProvider(new RootProvider() {
						@Override
						public Mono<String> getRoot(ConnectionContext connectionContext) {
					              return Mono.just(uaaProperties.getUaaUri());
						}

					       @Override
					       public Mono<String> getRoot(String key, ConnectionContext connectionContext) {
					              return Mono.just(uaaProperties.getUaaUri());
					       }
					})
					.apiHost(uaaProperties.getHost())
					.secure(uaaProperties.getScheme().startsWith("https"))
					.port(Integer.parseInt(uaaProperties.getPort()))
					.build();
	}
}
  • Tokens have the life time of 1 day. so the client needs to renew them every day.

I noticed that there is only one log message about token renewal: Negotiating using refresh token in 5 days logs.
given that the default retry is 5, i don't understand why retry is not happening.

@dmikusa dmikusa added the triaged Initial triage of issue has been performed label Feb 2, 2022
@dmikusa dmikusa changed the title ReactorClient is not refreshing the token, when response is 4.01. ReactorClient is not refreshing the token, when response is 401 Feb 3, 2022
@taotao20082006
Copy link

Hi @dmikusa-pivotal Please let us know if there is any update on this issue. This is related to JIRA https://pivotal-io.atlassian.net/browse/SCS-49 which we opened few months ago. Currently customer is still waiting for a fix so far. Thanks.

@pivotal-david-osullivan
Copy link
Contributor

@kvmw I've merged a PR which fixes a comparison check between a 401 status and the response received - we are hoping this is the cause. Since you can reproduce the issue, could you try testing with the latest builds of the operations and reactor libs?

@kvmw
Copy link
Contributor Author

kvmw commented Feb 17, 2022

@pivotal-david-osullivan, Thanks for the update.
As far as I know, only one client has reported this issue. I've not been able to reproduce this issue myself. I'm afraid the only way to try this fix in the client environment is by releasing a new version of SCS ( sporing cloud services) which requires new release of the library.

@kvmw
Copy link
Contributor Author

kvmw commented Feb 22, 2022

@pivotal-david-osullivan, The only way i could reproduce the issue in my local env. was by providing different reason message for 401 status code from the mock UAA server. So if the UAA server returns anything other than Unauthorized reason message for 401 code, the current cf-java-client fails to compare it properly.

The code you've merged fixes this issue. But I don't have any way to verify if UAA has ever returned a 401 code with a custom reason message in our production env.

@pivotal-david-osullivan
Copy link
Contributor

Thanks for confirming @kvmw! We were wondering if something in the user's environment could be interfering with the response to create the scenario that is only being seen in this case?

@kvmw
Copy link
Contributor Author

kvmw commented Feb 23, 2022

@dmikusa-pivotal , Apparently this is the default behaviour of uaa.
I've cloned the uaa repo and ran it locally. when the response is 401 the reason-phrase is empty (not Unauthorized).

By changing the UAA access-token lifetime to 60 seconds, i can easily reproduce the issue now with following code:

		GetClientRequest request = GetClientRequest.builder().clientId("app").build();

		System.out.println(client.clients().get(request).block().getName());  // Successful call, first call  with new access token.
		Thread.sleep(30 * 1000);
		System.out.println(client.clients().get(request).block().getName());  // Successful call, access token still is valid.
		Thread.sleep(31 * 1000);
		System.out.println(client.clients().get(request).block().getName()); // Fails with 401. because access token has been expired after 61 seconds. There is not attempt to refresh the token. 

Now the more important question is: why only one of our customers complaining about this issue. Given above behaviour all of them should have seen such issue.

@kvmw
Copy link
Contributor Author

kvmw commented Feb 23, 2022

Found this in spring-boot repo: spring-projects/spring-boot#6548

Apparently tomcat 8.5 has stoped sending the reason phrase : https://bz.apache.org/bugzilla/show_bug.cgi?id=60362

Again, this does not answer why more clients are not complaining about this issue.

@eaglerainbow
Copy link

Again, this does not answer why more clients are not complaining about this issue.

I might have one possible reason: Most consumers of the library often have only a short runtime, i.e. below the exp's timestamp. 24hrs is a lot! That is why they always use the same JWT, never have to refresh and hence never run into this issue. Only applications (such as Promregator) run typically for weeks or months and therefore have to refresh their tokens.

@eaglerainbow
Copy link

@pivotal-david-osullivan Is the PR that you are referring to in #1130 (comment) the PR at #1136?
Background to this is that I would like to try out your assumed patch hoping to fix promregator/promregator#218.

If yes, then it is already released with 5.7.0 and consumption would be rather easy for me...
If no, things would become tricky...

Thanks for the info!

@dmikusa
Copy link
Contributor

dmikusa commented Apr 1, 2022

@eaglerainbow - Sorry, we should have followed up and closed out this issue. #1136 should be the fix for this issue, and yes, that is in 5.7.0.

The environment where this was reported initially only seldom hit the issue. Due to that, it's a little difficult to confirm 100% that the issue is resolved because it would probably take months to definitively confirm, but confidence in the solution is high.

The change uses .equals() instead of == in a comparison. The item being compared is an object, so it should really be using .equals(), otherwise it's just comparing the two references are the same. We believe that == check did work in some cases because often the library will use a static instance to represent the http status. Thus when a comparison is done it's comparing two references to the same static instance of the object and == would be true. In some cases, like if the reason phrase does not match the expected reason phrase then the library will create a new instance an http status object. This would then be a different instance and not match with == when compared to the static instance. Using .equals() works because the object comparison is done on the http status code which does match, regardless of the reason phrase.

Hope that helps! I'm going to close this out, but feel free to reopen if you use 5.7.0 and are still seeing issues.

@dmikusa dmikusa closed this as completed Apr 1, 2022
@eaglerainbow
Copy link

Thanks for the verbose response!
I have bumped Promregator with version 0.10.2 to cf-java-client 5.7.0. Also for us, it's non-trivial to trigger the issue to happen. However, with promregator/promregator#218 (comment) we will try to do so. In case we don't report back here, consider this issue solved.
(BTW: As I am not a member of this repository, I do not have authorities to reopen issues).

@dmikusa
Copy link
Contributor

dmikusa commented Apr 1, 2022

(BTW: As I am not a member of this repository, I do not have authorities to reopen issues).

Oh, ok. Good to know. I thought users could do that. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question triaged Initial triage of issue has been performed
Projects
None yet
Development

No branches or pull requests

5 participants