First of all, this sort of cleanup isn't really the client's
responsibility, and secondly, by the point the client receives this
request to disconnect, *none of its requests will be honored anymore*
(currently the only scenario in which this happens is when another
client has connected - the server-side concurrency filter will reject
the request).
When disconnection is requested, the only valid thing to do with respect
to talking to the server is to stop doing it.
This will be moved server-side in a follow-up change, although I'm not
even strictly sure that's required - I'd like to think SignalR would
know to clean up a disconnecting client from all groups it was in.
There are suspicions that the straight 5s retry could have caused a
situation a few days ago for `osu-server-spectator` wherein it was
getting hammered by constant retry requests. This should make that
a little less likely to happen.
The numbers chosen are arbitrary, but mostly follow stable's bancho
retry intervals, because why not. Stable also skips the exponential
backoff for errors it considers transient, but I decided not to bother
with that for now.
The delay starts off at 3 seconds, then ramps up to a maximum of
2 minutes. An added stagger factor of 25% of the duration is applied in
either direction. The stagger helps because if the spectator server is
dead, each client has three separate connections to it which it will
retry on (one to each hub).
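For illustration, here's a minimal sketch of that kind of exponential backoff with stagger. The class and member names are hypothetical and this is not the actual connector code:

```csharp
// Hypothetical sketch of exponential backoff with a ±25% stagger factor.
// Names and structure are illustrative only, not the actual osu! implementation.
using System;

public class RetryDelaySketch
{
    private static readonly TimeSpan initial_delay = TimeSpan.FromSeconds(3);
    private static readonly TimeSpan maximum_delay = TimeSpan.FromMinutes(2);

    private readonly Random random = new Random();
    private int attempt;

    /// <summary>
    /// Returns the delay to wait before the next connection attempt.
    /// Doubles each time, capped at the maximum, with ±25% random stagger
    /// so that multiple hub connections don't all retry in lockstep.
    /// </summary>
    public TimeSpan GetNextDelay()
    {
        double seconds = Math.Min(
            initial_delay.TotalSeconds * Math.Pow(2, attempt),
            maximum_delay.TotalSeconds);

        attempt++;

        // Stagger factor of up to 25% of the duration, in either direction.
        double stagger = seconds * 0.25 * (random.NextDouble() * 2 - 1);

        return TimeSpan.FromSeconds(seconds + stagger);
    }

    /// <summary>Resets the backoff after a successful connection.</summary>
    public void Reset() => attempt = 0;
}
```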
Closes https://github.com/ppy/osu/issues/26835.
I must not have re-tested this correctly after all the refactors...
Basically the issue is that the websocket connection would only come
online when the API state changed to full `Online`. In particular
the connector would not attempt to connect when the API state was
`RequiresSecondFactorAuth`, giving the link-based flow no chance to
actually work.
The change in `WebSocketNotificationsClientConnector` is relevant in
that queueing requests does nothing before the API state changes to full
`Online`. It also cleans up things a bit code-wise so... win?
And yes, this means that the _other_ `PersistentEndpointClientConnector`
implementations (i.e. SignalR connectors) will also come online earlier
after this. Based on previous discussions
(https://github.com/ppy/osu/pull/25480#discussion_r1395566545) I think
this is fine, but if it is _not_ fine, then it can be fixed by exposing
a virtual that lets a connector decide when to come alive, I guess.
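If it ever comes to that, a sketch of what such a virtual could look like (hypothetical member and type names, not the actual `PersistentEndpointClientConnector` API):

```csharp
// Hypothetical sketch of the "virtual to decide when to come alive" idea.
// APIStateSketch stands in for the real APIState enum; this member does not
// exist on the real PersistentEndpointClientConnector.
public enum APIStateSketch
{
    Offline,
    Connecting,
    RequiresSecondFactorAuth,
    Online,
}

public abstract class ConnectorSketch
{
    /// <summary>
    /// Whether this connector should attempt to connect for the given API state.
    /// Default behaviour allows connecting while second factor auth is pending,
    /// so that the link-based flow (which relies on the websocket) can work.
    /// </summary>
    protected virtual bool ShouldConnect(APIStateSketch state)
        => state == APIStateSketch.Online || state == APIStateSketch.RequiresSecondFactorAuth;
}

// A connector that wants the old behaviour could opt back out:
public class OnlineOnlyConnectorSketch : ConnectorSketch
{
    protected override bool ShouldConnect(APIStateSketch state)
        => state == APIStateSketch.Online;
}
```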
Closes https://github.com/ppy/osu/issues/26824... I think?
Can be reproduced via something like
diff --git a/osu.Game/Online/API/OAuth.cs b/osu.Game/Online/API/OAuth.cs
index 485274f349..e6e93ab4c7 100644
--- a/osu.Game/Online/API/OAuth.cs
+++ b/osu.Game/Online/API/OAuth.cs
@@ -151,6 +151,11 @@ internal string RequestAccessToken()
         {
             if (!ensureAccessToken()) return null;
 
+            for (int i = 0; i < 10000; ++i)
+            {
+                _ = Token.Value.AccessToken;
+            }
+
             return Token.Value.AccessToken;
         }
The cause is `SecondFactorAuthForm` calling `Logout()`, which calls
`OAuth.Clear()`, _while_ the `APIAccess` connect loop is checking
whether `authentication.HasValidAccessToken` is true. That check
happens to internally read `Token.Value.AccessToken`, which the
clearing of tokens can brutally interrupt.
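A minimal standalone sketch of this class of race (hypothetical types, not the actual `OAuth`/`APIAccess` code): one thread keeps checking the token while another clears it out from under it.

```csharp
// Hypothetical sketch of the race described above - not the actual osu! code.
// One thread keeps checking the token (like the connect loop's
// HasValidAccessToken check), while another clears it (like OAuth.Clear()
// called from Logout()).
using System;
using System.Threading.Tasks;

public class TokenSketch
{
    public string AccessToken = "abc";
}

public static class TokenRaceSketch
{
    private static TokenSketch token = new TokenSketch();

    // The buggy pattern: dereferences `token` without guarding against it
    // being cleared concurrently, so it can throw NullReferenceException.
    private static bool hasValidAccessToken() => token.AccessToken != null;

    public static void Main()
    {
        var reader = Task.Run(() =>
        {
            // Mirrors the repro diff: hammer the check to widen the race window.
            for (int i = 0; i < 1_000_000; i++)
                _ = hasValidAccessToken();
        });

        // Clearing the token mid-check is what "brutally interrupts" the read.
        var clearer = Task.Run(() => token = null);

        try
        {
            Task.WaitAll(reader, clearer);
        }
        catch (AggregateException e) when (e.InnerException is NullReferenceException)
        {
            Console.WriteLine("connect loop check raced with token clear");
        }
    }
}
```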
Fixes this happening on staging:
[network] 2024-01-29 17:48:24 [verbose]: WebSocketNotificationsClientConnector connected!
[network] 2024-01-29 17:48:24 [verbose]: WebSocketNotificationsClientConnector connect attempt failed: Can't use WaitSafely from inside an async operation.
I'm not sure how I ever allowed that `.WaitSafely()` to be there. It
did feel rather dangerous, but then I must have forgotten about it and
never noticed it failing. Which is weird, because you'd think it would
have been caught by noticing that chat wasn't working, yet I'm pretty
sure I tested that chat *was* indeed working.
Anyway, that entire flow is now replaced by something that should
hopefully be somewhat more sane. There's a proper retry flow with
logging now, which should be more robust than what was there previously
(where failing to start listening to chat events would kill the entire
websocket connection for very little good reason).
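For illustration, a minimal sketch of the retry-with-logging shape being described (hypothetical names; not the actual connector implementation). The point, per the above, is that a failure to start listening gets logged and retried instead of tearing down the whole connection:

```csharp
// Hypothetical sketch of an async retry-with-logging flow of the kind
// described above - not the actual osu! connector code.
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartChatSketch
{
    /// <summary>
    /// Attempts an async startup step (e.g. starting to listen to chat events),
    /// retrying with a delay instead of letting a single failure take down the
    /// whole websocket connection.
    /// </summary>
    public static async Task RunWithRetryAsync(
        Func<CancellationToken, Task> startListening,
        CancellationToken cancellationToken)
    {
        int attempt = 0;

        while (!cancellationToken.IsCancellationRequested)
        {
            try
            {
                await startListening(cancellationToken).ConfigureAwait(false);
                Console.WriteLine("Started listening to chat events.");
                return;
            }
            catch (OperationCanceledException)
            {
                // Shutting down - nothing to retry.
                return;
            }
            catch (Exception ex)
            {
                attempt++;
                Console.WriteLine($"Failed to start listening (attempt {attempt}): {ex.Message}. Retrying in 5 seconds...");
            }

            // Back off before the next attempt rather than hammering the server.
            await Task.Delay(TimeSpan.FromSeconds(5)).ConfigureAwait(false);
        }
    }
}
```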