Exposed by CI failures
([example](https://github.com/ppy/osu/actions/runs/23888446400#user-content-r0s0)).
The race occurs when a consumer calls `GetBindableDifficulty()` for the
first time and then a cache invalidation is triggered. The sequence of
events triggering the failure is as follows:
1. Consumer calls `GetBindableDifficulty()` to get a difficulty bindable
for a given beatmap tracking the game-global ruleset / mods. This
triggers difficulty calculation A.
2. In the meantime, another process requests a cache invalidation for
the same beatmap as the one supplied by the consumer in step (1). This
incurs a cache purge and triggers difficulty calculation B, but never
cancels difficulty calculation A.
3. Difficulty calculation B concludes and writes the correct, latest
difficulty value to the bindable.
4. Difficulty calculation A concludes and writes an incorrect, stale
difficulty value to the bindable.
See below for patch that reproduces this behaviour on my machine 100%
reliably:
```diff
diff --git a/osu.Game/Beatmaps/BeatmapDifficultyCache.cs b/osu.Game/Beatmaps/BeatmapDifficultyCache.cs
index https://github.com/ppy/osu/commit/d6b40639161e26af223f03761b3826b0cd7f4a87..c9604e0e58 100644
--- a/osu.Game/Beatmaps/BeatmapDifficultyCache.cs
+++ b/osu.Game/Beatmaps/BeatmapDifficultyCache.cs
@@ -252,17 +252,17 @@ private void updateBindable(BindableStarDifficulty bindable, IRulesetInfo? rules
GetDifficultyAsync(bindable.BeatmapInfo, rulesetInfo, mods, cancellationToken, computationDelay)
.ContinueWith(task =>
{
+ StarDifficulty? starDifficulty = task.GetResultSafely();
+
// We're on a threadpool thread, but we should exit back to the update thread so consumers can safely handle value-changed events.
- Schedule(() =>
+ Scheduler.AddDelayed(() =>
{
if (cancellationToken.IsCancellationRequested)
return;
- StarDifficulty? starDifficulty = task.GetResultSafely();
-
if (starDifficulty != null)
bindable.Value = starDifficulty.Value;
- });
+ }, starDifficulty?.Stars > 0 ? 400 : 0);
}, cancellationToken);
}
```
The goal of the patch is to reorder the write to the bindable in order
to trigger the scenario described above.
Due to the invasiveness of the patch it is not suitable to add as a
test, and chances are that the schedule delay may need to be tweaked if
anyone else wants to check that patch.
Anyway, the solution here is to use the same pattern of creating a
linked cancellation token even on the first retrieval of a bindable
difficulty, and registering it in the list of cancellation tokens that
already existed to service the ruleset- / mod-tracking flow.
Some extra rearranging in
https://github.com/ppy/osu/commit/9184299239a8b5e82957d46db33f1d26bab238fd
is needed to ensure the linked tokens created to do this don't stay
behind after they are no longer needed for anything.
Based on feedback I've heard more than once ([most
recently](https://github.com/ppy/osu/discussions/36933)), there's too
much overlap compared to stable. And it can sound a bit "jumbled" as a
result, especially when transitioning between two songs with loud and
"complex" audio.
This also fixes noticeable audio glitching that some users may have been
experiencing when switching tracks.
---
Unless there's any very loud objections, I'd ask that review is on the
code not the audible change. This is the kind of change that I tweak to
my personal spec and then push out to hear the general reaction.
### b8d0dfe2ed Better default preview time
based on global fade
This is an oversight. The preview point wasn't being offset to account
for the fade in. This is something we do when generating web-side
previews, which allows for the actual preview point to be audible,
rather than mid-fade.
### 8de323b68c Adjust fade in and fade out
during track transitions to reduce overlap
Yes, I'm aware that the delay is not accounted for in the constant. This
is intentional because it sounds better. I want this to be an internal
adjustment which is not to be considered by any external usages of those
constants. Just sounds better.
### aab01b0fe5 Remove weird/dated schedule
usage
This causes audible artifacts in playback. The track is not set to zero
volume early enough. Original change was made in [this ancient
PR](https://github.com/ppy/osu/pull/9792) (see
[commit](https://github.com/ppy/osu/commit/688e4479506645bdacee7cdc7ae9ee9deaa5f1b4)).
The original failure does not occur. Tests still pass. I'd argue that if
we encounter this again, it's an issue at the call site, not this code.
This is a drive-by fix, as I noticed this glitching while working on the
main issue here. Arguably should be it's own PR 🤷 .
The recent changes related to adding support for working beatmap load
cancellation exposed a flaw in the beatmap difficulty cache. With the
way the difficulty computation logic was written, any error in the
calculation process (including beatmap load timeout, or cancellation)
would result in a 0.00 star rating being permanently cached in memory
for the given beatmap.
To resolve, change the difficulty cache's return type to nullable.
In failure scenarios, `null` is returned, rather than
`default(StarDifficulty)` as done previously.
`DifficultyCacheLookup`s were storing raw `Mod` instances into their
`OrderedMods` field. This could cause the cache lookups to wrongly
succeed in cases of mods with settings. The particular case that
triggered this fix was Difficulty Adjust.
Because the difficulty cache is backed by a dictionary, there are two
stages to the lookup; first `GetHashCode()` is used to find the
appropriate hash bucket to look in, and then items from that hash bucket
are compared against the key being searched for via the implementation
of `Equals()`.
As it turns out, the first hashing step ended up being the saving grace
in most cases, as the hash computation included the values of the mod
settings. But the Difficulty Adjust failure case was triggered by the
quirk that `GetHashCode(0) == GetHashCode(null) == 0`.
In such a case, the `Equals()` fallback was used. But as it turns out,
because the `Mod` instance stored to lookups was not cloned and
therefore potentially externally mutable, it could be polluted after
being stored to the dictionary, and therefore breaking the equality
check. Even though all of the setting values were compared, the hash
bucket didn't match the actual contents of the lookup anymore (because
they were mutated externally, e.g. by the user changing the mod setting
values in the mod settings overlay).
To resolve, clone out the mod structure before creating all difficulty
lookups.
The previous code would run a calcaulation for the beatmap's own ruleset
if the current one failed. While this does make sense, with the current
way we use this component (and the implementation flow) it is quite unsafe.
The to the call on `.Result` in the `catch` block, this would 100%
deadlock due to the thread concurrency of the `ThreadedTaskScheduler`
being 1. Even if the nested run could be run inline (it should be), the
task scheduler won't even get to the point of checking whether this is
feasible due to it being saturated by the already running task.
I'm not sure if we still need this fallback lookup logic. After removing
it, it's feasible that 0 stars will be returned during the scenario that
previously caused a deadlock, but I don't necessarily think this is
incorrect. There may be another reason for this needing to exist which
I'm not aware of (diffcalc?) but if that's the case we may want to move
the try-catch handling to the point of usage.
To reproduce the deadlock scenario with 100% success (the repro
instructions in the linked issue aren't that simple and require some
patience and good timing), the main portion of the lookup can be changed
to randomly trigger a nested lookup:
```
if (RNG.NextSingle() > 0.5f)
return GetAsync(new
DifficultyCacheLookup(key.Beatmap, key.Beatmap.Ruleset,
key.OrderedMods)).Result;
else
return new StarDifficulty(attributes);
```
After switching beatmap once or twice, pausing debug and viewing the
state of threads should show exactly what is going on.