Describe the bug
Sometimes I start receiving "The refresh token has not been found: : not_found" during the refresh token flow. After that I can no longer refresh the token, and the end user has to go through the OAuth flow again.
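For context, the refresh itself is a plain OAuth2 refresh_token grant against Hydra's public token endpoint. Below is a minimal sketch of the raw request that starts failing; this is not my actual code (it assumes node-fetch, and clientId/clientSecret are placeholders), but client-oauth2 does the equivalent internally:

const fetch = require("node-fetch");

// Sketch of the raw refresh request against Hydra's public token endpoint.
async function rawRefresh(clientId, clientSecret, refreshToken) {
  const res = await fetch("https://auth.fibery.io/oauth2/token", {
    method: "POST",
    headers: {
      "Content-Type": "application/x-www-form-urlencoded",
      // client credentials are sent as HTTP Basic auth
      "Authorization": "Basic " + Buffer.from(`${clientId}:${clientSecret}`).toString("base64"),
    },
    body: new URLSearchParams({
      grant_type: "refresh_token",
      refresh_token: refreshToken,
    }).toString(),
  });
  // Once the bug hits, this returns {"error": "invalid_grant", ...} on every attempt.
  return res.json();
}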
Reproducing the bug
I didn't spend enough time to reproduce it on a local setup; my first attempts failed. Even on the production setup it is not reproduced every time, so I'll describe the steps on the production setup instead.
1.) I have a client that infinitely refreshes the OAuth token. Here is a Node.js code snippet of an Express app that handles the OAuth flow and then starts the infinite token refresh (logger, sendToSlack and getErrorMeta are app-specific helpers whose definitions are omitted):
const args = require("./args");
const express = require("express");
const ClientOAuth2 = require("client-oauth2");

const fiberyAuth = new ClientOAuth2({
  clientId: args.authClientId,
  clientSecret: args.authClientSecret,
  accessTokenUri: `${args.authUrl}/oauth2/token`,
  authorizationUri: `${args.authUrl}/oauth2/auth`,
  redirectUri: args.redirectUri,
  scopes: ["openid", "offline"],
  state: "asdfasfdsafd",
});

const app = express();

app.get("/", function (req, res, next) {
  const uri = fiberyAuth.code.getUri();
  res.redirect(uri);
});

const refresh = async (token) => {
  const oldToken = fiberyAuth.createToken(token.data.access_token, token.data.refresh_token);
  const time = Date.now();
  try {
    const refreshed = await oldToken.refresh();
    logger.info(`Token refreshed.`, { elapsedTime: Date.now() - time });
    return refreshed;
  } catch (e) {
    sendToSlack(`Token was not refreshed`, getErrorMeta(e));
    logger.error("Token was not refreshed", { ...getErrorMeta(e), oldToken: oldToken.accessToken });
    // Fall back to the old token so the loop keeps retrying with it.
    return oldToken;
  }
};

const repeatRefresh = async (token) => {
  const newToken = await refresh(token);
  setTimeout(() => repeatRefresh(newToken), args.refreshInterval);
};

app.get("/callback", async function (req, res, next) {
  try {
    const token = await fiberyAuth.code.getToken(
      args.redirectUri.includes("localhost") ? req.originalUrl : `/api/authCheck${req.originalUrl}`
    );
    logger.info(token);
    repeatRefresh(token);
    res.sendStatus(200);
  } catch (e) {
    res.sendStatus(401);
    logger.info("Failed to get token", getErrorMeta(e));
  }
});

module.exports = { app };
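(For completeness: the module above is mounted by a trivial entry point, roughly like the sketch below; the file name and port are placeholders, not my actual code.)

const { app } = require("./app"); // the module shown above
const args = require("./args");

const port = args.port || 3000;
app.listen(port, () => {
  console.log(`OAuth test client listening on port ${port}`);
});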
2.) In my Kubernetes cluster I run Hydra v1.4.5-alpine, deployed via the Helm chart of the same version, though I believe the chart is not important. I have replicaCount set to 3 and use Postgres as the database.
3.) At some point during the day my Postgres DB experiences downtime, which is related to the current cloud provider.
4.) After that, sometimes my client cannot refresh the token any more. Here is part of the Hydra logs:
{"debug":"failed to connect to `host=stolon-proxy-service.postgres.svc.cluster.local user=hydra database=hydra`: dial error (dial tcp 100.75.226.174:5432: operation was canceled)","description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)","error":"invalid_client","level":"error","msg":"An error occurred","time":"2020-06-25T09:04:36Z"}
{"debug":"failed to connect to `host=stolon-proxy-service.postgres.svc.cluster.local user=hydra database=hydra`: dial error (dial tcp 100.75.226.174:5432: operation was canceled)","description":"Client authentication failed (e.g., unknown client, no client authentication included, or unsupported authentication method)","error":"invalid_client","level":"error","msg":"An error occurred","time":"2020-06-25T09:04:40Z"}
{"error":"context canceled","level":"error","msg":"An error occurred","time":"2020-06-25T09:08:54Z"}
{"debug":"The refresh token has not been found: : not_found","description":"The provided authorization grant (e.g., authorization code, resource owner credentials) or refresh token is invalid, expired, revoked, does not match the redirection URI used in the authorization request, or was issued to another client","error":"invalid_grant","level":"error","msg":"An error occurred","time":"2020-06-25T09:08:54Z"}
I removed log records with similar content to keep the posted logs short.
After 09:08:54 every attempt to refresh the token gets the same not_found error.
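One way to confirm that the refresh token really disappeared from the store, rather than being rejected for some other reason, is to introspect it via Hydra's admin API. A sketch (not something from the logs above; ADMIN_HOST is a placeholder and node-fetch is assumed):

const fetch = require("node-fetch");

// POST /oauth2/introspect on the admin port (4445 in the configuration below).
async function introspect(token) {
  const res = await fetch("http://ADMIN_HOST:4445/oauth2/introspect", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({ token }).toString(),
  });
  return res.json(); // expected: { active: false } once the token is gone
}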
Server configuration
# Number of ORY Hydra members
replicaCount: 3

image:
  # ORY Hydra image
  repository: oryd/hydra
  # ORY Hydra version
  tag: v1.4.5-alpine
  # Image pull policy
  pullPolicy: IfNotPresent

# Image pull secrets
imagePullSecrets: []
# Chart name override
nameOverride: ""
# Full chart name override
fullnameOverride: ""

# Configures the Kubernetes service
service:
  # Configures the Kubernetes service for the proxy port.
  public:
    # En-/disable the service
    enabled: true
    # The service type
    type: ClusterIP
    # The service port
    port: 4444
    # If you do want to specify annotations, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'annotations:'.
    annotations:
      kong/request-host: ""
      kong/request-path: "/"
      kong/preserve-host: "true"
      kong/strip-request-path: "true"
      # kubernetes.io/ingress.class: nginx
      # kubernetes.io/tls-acme: "true"
  # Configures the Kubernetes service for the api port.
  admin:
    # En-/disable the service
    enabled: true
    # The service type
    type: ClusterIP
    # The service port
    port: 4445
    # If you do want to specify annotations, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'annotations:'.
    annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"

# Configure ingress
ingress:
  # Configure ingress for the proxy port.
  public:
    # En-/Disable the proxy ingress.
    enabled: false
  admin:
    # En-/Disable the api ingress.
    enabled: false

# Configure ORY Hydra itself
hydra:
  # The ORY Hydra configuration. For a full list of available settings, check:
  # https://github.com/ory/hydra/blob/master/docs/config.yaml
  config:
    dsn: "postgres://{{ .Secrets.HydraPgUser }}:{{ .Secrets.HydraPgPassword }}@{{ .Secrets.PgHostname }}:5432/{{ .Secrets.HydraPgDatabase }}?sslmode=disable"
    log:
      level: "error"
      format: "json"
    serve:
      public:
        port: 4444
      admin:
        port: 4445
      tls:
        allow_termination_from:
          - 0.0.0.0/0
    secrets:
      system: "{{ .Secrets.HydraSystemSecret }}"
      cookie: ""
    urls:
      self:
        issuer: "https://auth.fibery.io"
      login: "https://fibery.io/oauth-login"
      consent: "https://fibery.io/oauth-consent"
    ttl:
      access_token: 1h
    strategies:
      access_token: "jwt"
  autoMigrate: true
  dangerousForceHttp: true
  dangerousAllowInsecureRedirectUrls: false

deployment:
  resources:
    # We usually recommend not to specify default resources and to leave this as a conscious
    # choice for the user. This also increases chances charts run on environments with little
    # resources, such as Minikube. If you do want to specify resources, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
    limits:
      cpu: 500m
      memory: 128Mi
    requests:
      cpu: 100m
      memory: 128Mi
  labels: {}
  annotations: {}
  # Node labels for pod assignment.
  nodeSelector: {}
  # Configure node tolerations.
  tolerations: []
  # Configure node affinity
  affinity: {}

# Configures controller setup
maester:
  enabled: false
Expected behavior
The refresh flow does not break.
Environment
I believe it is described by the Helm chart configuration above.
Additional context
I was able to reproduce the issue locally on an older version of Hydra, but didn't find precise steps. What I tried was to turn the DB on/off and restart the Hydra container while running the infinite token refresh. After upgrading to the latest Hydra version I was no longer able to reproduce the issue and hoped everything would go well, but it didn't.
I fully understand that the described details are not precise and may not be enough; I will probably try to find exact steps. My only alternative is to move to some other OAuth server, which is of course time-consuming.
Maybe someone can suggest a workaround for this issue. At the moment I have several client apps, like the Zapier and Slack integrations, and clients are not happy about accidentally losing their auth info.