Realtime db triggers timeout after upgrading admin sdk to 9.6.0 #1231
Comments
We're facing this issue in both versions, 9.6.0 and 9.5.0. We just redeployed all functions with 9.5.0 to see if this helps. I'll post if we face this issue again. |
There was an incident that disrupted RTDB connectivity for some users on 04/16: https://status.firebase.google.com/incidents/REUgDGe1YiqLTQScxNGP Some of you may have been affected by that (unless you can consistently reproduce the issue). If the problem persists you might want to enable RTDB debug logs and see what's causing the timeouts. Usually you can do so by adding the following statement to your code (somewhere at the initialization):
|
The 04/16 incident you mentioned above did affect us. But it only caused internal server errors from the Firebase Node.js Admin SDK. It did not make any promises wait forever to resolve. |
This was not isolated to a particular incident. We had this issue for a while and kept trying to figure out what was going on, until we finally downgraded to 9.5.0 by pinning versions in package.json. No more timeouts after the downgrade. No other changes except 9.6.0 => 9.5.0. |
@sluramod would be great if you can also provide a debug log. How long does it take (or how many invocations) before you'd observe a timeout? Also have you gone back to v9.6.0 and confirmed the problem occurs there again? |
@hiranya911 I cannot identify a pattern in how long or how many invocations it takes for the issue to present itself; if I had this info I would have shared it in the original bug description. We are not going back to 9.6.0 to test whether the problem occurs there: our production environment is not for QA purposes. If you want to reach out personally, I can share the details we observed; I cannot do it in the context of this discussion. |
@hiranya911 Perhaps we are running into this from #1194 RELEASE NOTE: The periodic token refresher background task has been decoupled from the SDK core, and moved into the RTDB module. This task no longer starts automatically, unless the admin.database() API is explicitly invoked. The thing is, we don't want to initialize admin to improve cold startup time for our functions. |
firebase-functions uses firebase-admin under the hood, so it's likely still getting initialized. So far I've been able to repro a couple of timeouts. These seem to have been caused by the metadata server sending auth tokens with a very short expiry time. This behavior of the metadata server, coupled with some changes made in the above PR, can potentially cause tight retry loops. I'm not sure if that's the exact issue, but it's something to look at. Initial auth handshake:
Proactive refresh after 25 minutes (original token has 5 more minutes to expire):
Here the metadata server has sent the same token as before, with less than 5 minutes left on it. The SDK now enters a tight retry loop, degrading performance, which in some cases can end with a timeout. |
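The timing described above can be worked through with a small sketch. It assumes, as the comment states, that the proactive refresh fires 5 minutes before the token expires; `refreshDelayMs` and the constant names are hypothetical and are not taken from the SDK source.

```typescript
// Assumption (from the comment above): refresh happens 5 minutes before expiry.
const REFRESH_WINDOW_MS = 5 * 60 * 1000;
const MINUTE_MS = 60 * 1000;

// Hypothetical helper: delay in ms until the next proactive refresh
// for a token that expires at `expiresAtMs`.
function refreshDelayMs(nowMs: number, expiresAtMs: number): number {
  return Math.max(0, expiresAtMs - nowMs - REFRESH_WINDOW_MS);
}

// A token with 30 minutes left is refreshed after 25 minutes.
const healthyDelay = refreshDelayMs(0, 30 * MINUTE_MS);

// A token with only 4 minutes left is already inside the refresh window,
// so the computed delay is zero and the refresh fires immediately.
const staleDelay = refreshDelayMs(0, 4 * MINUTE_MS);
```

If the server keeps returning the same short-lived token, every refresh yields a zero delay again, which is the tight loop described in the comment.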
I'm having a similar issue, but I believe mine comes from Firebase Auth [although the possibility that the Realtime Database is the cause has not been ruled out]. I filed my own issue, #1233 [before finding this one]. In my analysis, the 30-minute interval between timeouts is very consistent.
How big is the problem of calling |
I believe I have isolated the issue with RTDB calls periodically timing out. This happens when the metadata server sends a token that has a TTL shorter than 5 minutes (which seems to happen every 30 minutes or so in Cloud Functions). Here's an example sequence of events leading to the problem:
The 2nd step above triggers the following event handler: firebase-admin-node/src/database/database-internal.ts Lines 118 to 120 in 011c530
That in turn calls the following logic: firebase-admin-node/src/firebase-app.ts Lines 69 to 72 in 011c530
But since the last fetched token has a short TTL, it causes the SDK to attempt a token refresh again. This results in a tight loop, with a separate RPC for each refresh attempt, and can result in a timeout in some cases. The loop continues until the metadata server eventually sends a fresh token with a TTL longer than 5 minutes. I have a fix in progress. |
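The loop can be reproduced in isolation with a small simulation. Everything here is an assumption made for illustration: `FakeMetadataServer`, the 4-minute renewal threshold, the 100 ms RPC latency and the `minRetryDelayMs` guard are all hypothetical. The guard is one possible mitigation and is not claimed to be the fix mentioned above.

```typescript
const FIVE_MINUTES_MS = 5 * 60 * 1000;
const TOKEN_TTL_MS = 30 * 60 * 1000;        // assumed TTL of a fresh token
const SERVER_RENEW_BELOW_MS = 4 * 60 * 1000; // assumed: server re-mints below 4 min
const RPC_LATENCY_MS = 100;                  // assumed cost of one refresh RPC

interface Token {
  expiresAtMs: number;
}

// Hypothetical stand-in for the metadata server: it returns its cached token
// until that token has at most SERVER_RENEW_BELOW_MS left, then mints a new one.
class FakeMetadataServer {
  private cached: Token = { expiresAtMs: TOKEN_TTL_MS };

  fetch(nowMs: number): Token {
    if (this.cached.expiresAtMs - nowMs <= SERVER_RENEW_BELOW_MS) {
      this.cached = { expiresAtMs: nowMs + TOKEN_TTL_MS };
    }
    return this.cached;
  }
}

// Counts refresh RPCs from the first proactive refresh until a token with
// more than 5 minutes of TTL arrives. minRetryDelayMs = 0 models the tight loop.
function countRefreshAttempts(minRetryDelayMs: number): number {
  const server = new FakeMetadataServer();
  let now = 0;
  let token = server.fetch(now);               // initial auth handshake
  now = token.expiresAtMs - FIVE_MINUTES_MS;   // proactive refresh at 25 minutes
  let attempts = 0;
  while (true) {
    now += RPC_LATENCY_MS;
    token = server.fetch(now);
    attempts++;
    if (token.expiresAtMs - now > FIVE_MINUTES_MS) {
      return attempts;                         // fresh token: the loop ends
    }
    now += minRetryDelayMs;                    // wait before retrying (0 = no wait)
  }
}

const tightLoopAttempts = countRefreshAttempts(0);
const guardedAttempts = countRefreshAttempts(30 * 1000);
```

With these made-up numbers the unguarded loop issues 600 refresh RPCs during the minute in which the server keeps returning the stale token, while a 30-second minimum retry delay cuts that to 3.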
I believe this issue also happens in other Firebase SDKs.
During the past month I looked for the error in many possible ways, always assuming the problem was with the Firebase rules related to the queries. But since you confirmed there really is an issue with token handling, that would easily explain the problem I face on the Android SDK. I'm using firebase-database 19.7.0 for Android. Could you forward this to the responsible team? |
I'm afraid it doesn't. Server-side Firebase SDKs are quite different from client SDKs, and the problem I've outlined above is extremely specific to server-side code running on Google Cloud Functions. It cannot explain any client-side problems.
Please file a bug report at https://github.com/firebase/firebase-android-sdk |
@hiranya911 I believe this specific feature is quite similar. The client SDK also has a token that is refreshed from time to time. As you said, the whole problem starts because the server sends a token with a TTL of less than 5 minutes. If both the client and Admin SDKs are connected to the same token server, it is very likely that the problem is the same. The difference is that in a Firebase function this causes a timeout, whereas in an Android app it causes the app to take forever to load something, so the user gets bored [and closes the app] or tries to use the app before it has loaded [possibly causing other errors]. My app certainly has dozens of database requests, but the ones that raise a red flag for me are those related to login. Think with me: if someone tries to use an app and the login page takes over 5 minutes, this person will immediately assume there is a bug, and lots of complaints appear on Google Play. I will open an issue in the Firebase Android SDK repository and tag you. If you can, please give the engineers over there a heads-up; it would help a lot, because many times when we open an issue here it bounces around a lot until the engineers accept that there is a problem on the Firebase side (since the first assumption is always that the problem is in our code). |
They are not. Admin SDKs use Google OAuth, whereas client SDKs use Firebase access tokens. These are entirely different services, and even different token standards. Generally speaking, a Firebase access token is always valid for 1 hour. Google OAuth tokens are also usually valid for 1 hour, unless they were minted by the metadata server on Google Cloud Functions -- in which case it can have a shorter TTL. |
@hiranya911 I've downgraded to 9.5.0 and I'm still facing the issue.
|
9.5.0 shouldn't be affected by the same issue. As @sluramod has reported above (and as also confirmed in our own testing), the problem only affects 9.6.0. The fix #1234 is for the SDK code. We have no control over the token server implementation. PS: Btw, your timestamps above are all ~25 minutes apart. That's exactly what I'd expect to see when affected by this bug. I think your function is somehow still deployed with v9.6.0. |
I'm using 9.4.2 [because it was the last release from last year, and I'm sure my problems started this year]. If older versions are not affected by this issue, then my original issue #1233 must be a different problem. If you can, please have a look; it is basically the same premise, but with auth token creation instead of a Firebase query.
|
[REQUIRED] Describe your environment
[REQUIRED] Describe the problem
After upgrading to firebase-admin 9.6.0, Realtime Database triggers time out periodically. Downgrading to 9.5.0 solved the issue.
Steps to reproduce:
Write a Realtime Database trigger. Use the root reference to the Realtime Database provided by the trigger (change.after.ref.root) to access data (read or write). Do not initialize firebase-admin (this may be irrelevant). The function works correctly for a number of invocations. Eventually, one invocation times out at whatever timeout value is configured for that function (60 seconds by default).
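A handler body along these lines might look roughly like the sketch below. It is not the reporter's code: `RefLike`, `ChangeLike`, `readElsewhere` and the path `settings/global` are hypothetical, and the small structural interfaces stand in for the firebase-functions types so the sketch compiles on its own.

```typescript
// Minimal structural types standing in for the trigger's snapshot reference.
interface RefLike {
  root: RefLike;
  child(path: string): RefLike;
  once(eventType: "value"): Promise<unknown>;
}

interface ChangeLike {
  after: { ref: RefLike };
}

// Hypothetical path: any location other than the one that fired the trigger.
const OTHER_PATH = "settings/global";

// Reads from a different location through the root reference supplied by
// the trigger, without calling admin.initializeApp() anywhere.
function readElsewhere(change: ChangeLike): Promise<unknown> {
  return change.after.ref.root.child(OTHER_PATH).once("value");
}
```

In a deployed function, a handler like this would be passed to a Realtime Database trigger, for example via `functions.database.ref(...).onWrite(...)` in firebase-functions; the trigger path is left out here because the report does not give one.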
Relevant Code:
The code is proprietary, but even the simplest code that reads data from a different location than triggered will exhibit this behavior.
I wish I could attach a screenshot of the functions Health tab showing spike in latencies after the update. Reverting to 9.5.0 eliminated timeouts.