There are suspicions that the straight 5s retry could have caused a
situation a few days ago for `osu-server-spectator` wherein it was
getting hammered by constant retry requests. This should make that
a little less likely to happen.
Numbers chosen are arbitrary, but mostly follow stable's bancho retry
intervals because why not. Stable also skips the exponential backoff in
case of errors it considers transient, but I decided not to bother for
now.
Starts off from 3 seconds, then ramps up to up to 2 minutes. Added
stagger factor is 25% of duration, either direction. The stagger factor
helps given that if spectator server is dead, each client has three
separate connections to it which it will retry on (one to each hub).
Closes https://github.com/ppy/osu/issues/26835.
I must have not re-tested this correctly after all the refactors...
Basically the issue is that the websocket connection would only come
online when the API state changed to full `Online`. In particular
the connector would not attempt to connect when the API state was
`RequiresSecondFactorAuth`, giving the link-based flow no chance to
actually work.
The change in `WebSocketNotificationsClientConnector` is relevant in
that queueing requests does nothing before the API state changes to full
`Online`. It also cleans up things a bit code-wise so... win?
And yes, this means that the _other_ `PersistentEndpointClientConnector`
implementations (i.e. SignalR connectors) will also come online earlier
after this. Based on previous discussions
(https://github.com/ppy/osu/pull/25480#discussion_r1395566545) I think
this is fine, but if it is _not_ fine, then it can be fixed by exposing
a virtual that lets a connector to decide when to come alive, I guess.
Closes https://github.com/ppy/osu/issues/26824... I think?
Can be reproduced via something like
diff --git a/osu.Game/Online/API/OAuth.cs b/osu.Game/Online/API/OAuth.cs
index 485274f349..e6e93ab4c7 100644
--- a/osu.Game/Online/API/OAuth.cs
+++ b/osu.Game/Online/API/OAuth.cs
@@ -151,6 +151,11 @@ internal string RequestAccessToken()
{
if (!ensureAccessToken()) return null;
+ for (int i = 0; i < 10000; ++i)
+ {
+ _ = Token.Value.AccessToken;
+ }
+
return Token.Value.AccessToken;
}
The cause is `SecondFactorAuthForm` calling `Logout()`, which calls
`OAuth.Clear()`, _while_ the `APIAccess` connect loop is checking if
`authentication.HasValidAccessToken` is true, which happens to
internally check `Token.Value.AccessToken`, which the clearing of
tokens can brutally interrupt.
Fixes this happening on staging:
[network] 2024-01-29 17:48:24 [verbose]: WebSocketNotificationsClientConnector connected!
[network] 2024-01-29 17:48:24 [verbose]: WebSocketNotificationsClientConnector connect attempt failed: Can't use WaitSafely from inside an async operation.
I'm not sure how I ever allowed that `.WaitSafely()` to be there. It
did feel rather dangerous but then I must have forgotten and never
noticed it failing. Which is weird because you'd think that would be
caught by testing that chat isn't working but I'm pretty sure that I
tested that chat *was* indeed working.
Anyway now that entire flow is replaced by something that should
hopefully be somewhat more sane? It has a whole retry flow with logging
now which should be more robust than what was there previously (failing
to start to listen to chat events killing the entire websocket
connection for very little good reason).
This is a prerequisite for https://github.com/ppy/osu/pull/25480.
The `WebSocketNotificationsClient` was tightly coupled to chat specifics
making it difficult to use in the second factor verification flow.
This commit's goal is to separate the websocket connection and message
handling concerns from specific chat logic concerns.
Them being together always bothered me and led to the abject failure
that is `APIUser` and its sprawl. Now that I'm about to add a flag that
is unique to `/me` for verification purposes, I'm not repeating the
errors of the past by adding yet another flag to `APIUser` that is never
present outside of a single usage context.
Up until now, the `UserActivity` class hierarchy contained things like
beatmap info, room info, full replay info, etc. While this was
convenient, it is soon going to be less so, as the data is sent over the
wire to the spectator server so that the user's activity can be
broadcast to other clients.
To counteract this without creating a second separate and slimmed-down
class hierarchy, slim down the `UserActivity` structure to contain the
bare minimum amounts of data such that the structures aren't overly
large and complex to serialise, but also contain enough data that they
can be used by receiving clients directly without having to do beatmap
or score lookups.
When the server requests a disconnect due to a user connecting
via a second device, the client will now log the user out on the first
device and show a notification informing them of the cause of
disconnection.
Mostly places that can interact with imported replays.
There are other places that use the online ID as a sort tiebreaker, or
to check presence of a score on results screens, but they should
probably still continue to only use `OnlineID`, since all scores with a
legacy online ID should have an online ID, but the converse is not
generally true.
This time for `SocketException`s. I seem to recall looking at this and
deciding there was a reason to not catch socket exceptions, but on
revisiting it seems sane to do so.
This covers a fail case like reported:
```
2023-10-06 03:24:17 [verbose]: Request to https://lazer.ppy.sh/oauth/token failed with System.Net.Http.HttpRequestException: No such host is known. (lazer.ppy.sh:443)
2023-10-06 03:24:17 [verbose]: ---> System.Net.Sockets.SocketException (11001): No such host is known.
2023-10-06 03:24:17 [verbose]: at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
```
Closes https://github.com/ppy/osu/issues/24890 (again).
In discussion with nanaya, we likely want this now that we're adding the
ability for some icons to be extended. Historically mod icons have been
displayed in a stable but arbitrary order (based on their position in
the `Mods` enum).
This change aims to make them stable across lazer scores (where they are
stored based on .. the order the user selected them).
With the way `dev.ppy.sh` was previously configured, the `osu-web`
`/multiplayer` routes and `osu-server-spectator` `/multiplayer` routes
would conflict with each other, with the `osu-server-spectator` route
taking precedence. This would mean that some `osu-web` pages would just
straight up not be available on dev.
To counteract this, the `osu-server-spectator` routes are now exposed
under `https://dev.ppy.sh/signalr/`. This PR updates the client-side
configuration to use these new routes.
This API endpoint is intended for usage with the entire `solo_scores`
machinery and ID schema, rather than the legacy `*_scores_high` ID
schema. It also supports automagically falling back to downloading
legacy replays if a stable-imported score is requested for download
(internally this happens via `legacy_score_id` in the `data` json).
This change will allow replays to be downloaded, but it will still not
yield 100% correct behaviour, as there is further work to be done in
that respect. The download tracker is expecting score hashes to arrive
from web to verify the integrity of the incoming download, but the API
does not expose such a facility right now; we will have to decide as to
whether we want to add one web-side, or whether we want to disable the
checking client-side.
This created weird cases in logs which are very hard to understand. The
one which really got me was this:
```
[runtime] 2023-08-13 07:48:27 [verbose]: Invalidating working beatmap cache for unknown artist - unknown title (Dummy)
```
Which looks like a dummy working beatmap was invalidated, but it turns
out that's just the local user which was populated when creating a new
local beatmap.