There are suspicions that the straight 5-second retry interval could have
contributed to a situation a few days ago wherein `osu-server-spectator`
was getting hammered by constant retry requests. This should make that
a little less likely to happen.
The numbers chosen are arbitrary, but mostly follow stable's bancho retry
intervals, because why not. Stable also skips the exponential backoff for
errors it considers transient, but I decided not to bother with that for
now.
The delay starts at 3 seconds, then ramps up to a maximum of 2 minutes. An
added stagger factor of 25% of the duration is applied in either direction.
The stagger helps because if the spectator server is dead, each client has
three separate connections to it (one per hub) which it will retry on.
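For reference, the resulting delay calculation looks roughly like this (an illustrative sketch assuming exact doubling between the 3-second floor and the 2-minute cap, not the literal connector code):

```csharp
using System;

// Sketch of the backoff described above.
public static class RetryBackoff
{
    private static readonly Random random = new Random();

    // Starts at 3 seconds, doubles on every failed attempt, caps at 2 minutes,
    // then applies a ±25% stagger so that multiple connections (e.g. the three
    // per-client hub connections) don't all retry in lockstep.
    public static TimeSpan GetDelay(int retryCount)
    {
        double seconds = Math.Min(3 * Math.Pow(2, retryCount), 120);
        seconds *= 1 + ((random.NextDouble() - 0.5) * 0.5); // random factor in [0.75, 1.25)
        return TimeSpan.FromSeconds(seconds);
    }
}
```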
Closes https://github.com/ppy/osu/issues/26835.
I must not have re-tested this correctly after all the refactors...
Basically, the issue is that the websocket connection would only come
online when the API state changed to full `Online`. In particular,
the connector would not attempt to connect while the API state was
`RequiresSecondFactorAuth`, giving the link-based flow no chance to
actually work.
The change in `WebSocketNotificationsClientConnector` is relevant in
that queueing requests does nothing before the API state changes to full
`Online`. It also cleans up things a bit code-wise so... win?
And yes, this means that the _other_ `PersistentEndpointClientConnector`
implementations (i.e. SignalR connectors) will also come online earlier
after this. Based on previous discussions
(https://github.com/ppy/osu/pull/25480#discussion_r1395566545) I think
this is fine, but if it is _not_ fine, then it can be fixed by exposing
a virtual that lets a connector decide when to come alive, I guess.
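If that ever becomes necessary, such a virtual could look roughly like this (purely a hypothetical sketch; `CanConnectWithState` is not an existing member):

```csharp
// Connectors that shouldn't come alive before full online state could override
// this to return true only for APIState.Online.
protected virtual bool CanConnectWithState(APIState state)
    => state == APIState.Online || state == APIState.RequiresSecondFactorAuth;
```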
Closes https://github.com/ppy/osu/issues/26824... I think?
Can be reproduced via something like
diff --git a/osu.Game/Online/API/OAuth.cs b/osu.Game/Online/API/OAuth.cs
index 485274f349..e6e93ab4c7 100644
--- a/osu.Game/Online/API/OAuth.cs
+++ b/osu.Game/Online/API/OAuth.cs
@@ -151,6 +151,11 @@ internal string RequestAccessToken()
 {
     if (!ensureAccessToken()) return null;
+    for (int i = 0; i < 10000; ++i)
+    {
+        _ = Token.Value.AccessToken;
+    }
+
     return Token.Value.AccessToken;
 }
The cause is `SecondFactorAuthForm` calling `Logout()`, which calls
`OAuth.Clear()`, _while_ the `APIAccess` connect loop is checking whether
`authentication.HasValidAccessToken` is true. That check happens to
internally read `Token.Value.AccessToken`, which the clearing of tokens
can brutally interrupt.
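The general shape of a safe read is to capture the bindable's value once before dereferencing it (a sketch of the idea, not necessarily the exact change made here):

```csharp
internal string RequestAccessToken()
{
    if (!ensureAccessToken()) return null;

    // Capture once: a concurrent Clear() can null out Token.Value between two
    // separate reads, but it can't affect this local.
    var token = Token.Value;
    return token?.AccessToken;
}
```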
Fixes this happening on staging:
[network] 2024-01-29 17:48:24 [verbose]: WebSocketNotificationsClientConnector connected!
[network] 2024-01-29 17:48:24 [verbose]: WebSocketNotificationsClientConnector connect attempt failed: Can't use WaitSafely from inside an async operation.
I'm not sure how I ever allowed that `.WaitSafely()` to be there. It
did feel rather dangerous, but then I must have forgotten about it and
never noticed it failing. Which is weird, because you'd think it would
have been caught by noticing that chat wasn't working, but I'm pretty
sure that I tested that chat *was* indeed working.
Anyway, that entire flow is now replaced by something that should
hopefully be somewhat more sane? It has a whole retry flow with logging
now, which should be more robust than what was there previously (where
failing to start listening to chat events would kill the entire websocket
connection for very little good reason).
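In broad strokes, the replacement looks something like the following (an illustrative sketch; `sendStartChatRequest` is a hypothetical helper, not the actual code):

```csharp
// Retry the chat start-up step asynchronously with logging, instead of blocking
// on it with WaitSafely() from inside async code.
// (Logger here is osu.Framework.Logging.Logger.)
private async Task startChatWithRetry(CancellationToken cancellationToken)
{
    while (!cancellationToken.IsCancellationRequested)
    {
        try
        {
            await sendStartChatRequest(cancellationToken).ConfigureAwait(false);
            return;
        }
        catch (Exception ex)
        {
            // Log and retry rather than tearing down the whole websocket connection.
            Logger.Log($"Failed to start listening to chat events ({ex.Message}). Retrying in 5 seconds...");
            await Task.Delay(5000, cancellationToken).ConfigureAwait(false);
        }
    }
}
```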
This is a prerequisite for https://github.com/ppy/osu/pull/25480.
The `WebSocketNotificationsClient` was tightly coupled to chat specifics,
making it difficult to use in the second factor verification flow.
This commit's goal is to separate the websocket connection and message
handling concerns from specific chat logic concerns.
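In very broad strokes, the intended split looks something like this (a hypothetical shape for illustration; the actual types and names may differ):

```csharp
// Connection concern: owns the websocket and surfaces raw messages.
public class WebSocketNotificationsClient
{
    // Raised for every message received on the socket, regardless of what it's for.
    public event Action<SocketMessage>? MessageReceived;
}

// Chat concern: consumes only the messages it understands.
public class WebSocketChatClient
{
    public WebSocketChatClient(WebSocketNotificationsClient client)
    {
        client.MessageReceived += onMessageReceived;
    }

    private void onMessageReceived(SocketMessage message)
    {
        if (message.Event == "chat.message.new")
        {
            // dispatch to chat-specific handling...
        }
    }
}
```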
Them being together always bothered me and led to the abject failure
that is `APIUser` and its sprawl. Now that I'm about to add a flag that
is unique to `/me` for verification purposes, I'm not repeating the
errors of the past by adding yet another flag to `APIUser` that is never
present outside of a single usage context.
Up until now, the `UserActivity` class hierarchy contained things like
beatmap info, room info, full replay info, etc. While this was
convenient, it is soon going to be less so, as the data will be sent over
the wire to the spectator server so that the user's activity can be
broadcast to other clients.
To counteract this without creating a second, separate slimmed-down
class hierarchy, slim down the `UserActivity` structure to contain the
bare minimum amount of data, such that the structures aren't overly
large and complex to serialise, but still contain enough data that
receiving clients can use them directly without having to do beatmap
or score lookups.
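As a purely illustrative example of the direction (all names hypothetical), an activity goes from carrying a full model to carrying just the fields a receiving client needs to display it:

```csharp
// Before: heavyweight and awkward to serialise over the wire.
public class InSoloGameBefore : UserActivity
{
    public BeatmapInfo? Beatmap { get; set; } // full model, including nested metadata
}

// After: only what a receiving client needs to link to and display the activity.
public class InSoloGameAfter : UserActivity
{
    public int BeatmapID { get; set; }
    public string BeatmapDisplayTitle { get; set; } = string.Empty;
}
```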
When the server requests a disconnect due to a user connecting
via a second device, the client will now log the user out on the first
device and show a notification informing them of the cause of
disconnection.
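For illustration only (handler and member names are hypothetical, not the actual client API), the handling is conceptually along these lines:

```csharp
// Invoked when the server signals that this session should be dropped because
// the account connected from another device.
private Task onDisconnectRequestedByServer()
{
    // Log out locally so the client stops attempting to reconnect...
    api.Logout();

    // ...and tell the user why it happened.
    notifications.Post(new SimpleNotification
    {
        Text = "You have been logged out because your account signed in from another device."
    });

    return Task.CompletedTask;
}
```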
Mostly places that can interact with imported replays.
There are other places that use the online ID as a sort tiebreaker, or
to check presence of a score on results screens, but they should
probably still continue to only use `OnlineID`, since all scores with a
legacy online ID should have an online ID, but the converse is not
generally true.
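Illustrative only: for code paths that touch imported replays, a presence check along these lines considers both identifiers, while sorting/tiebreaker code can keep using `OnlineID` alone (hypothetical helper, not actual game code):

```csharp
// A score imported from stable may only carry a legacy online ID, while
// lazer-era scores carry an online ID (and possibly no legacy one).
public static bool HasAnyOnlineID(ScoreInfo score)
    => score.OnlineID > 0 || score.LegacyOnlineID > 0;
```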