Came up as a failure when locally running tests for
ppy/osu-framework#6001 - but the test is actually a previously-known
flaky that I couldn't reproduce the failure of until the aforementioned
PR.
This appears to be a simple race; the test scene queries the track
length from update thread, but the length is actually set on the audio
thread. So it's not unreasonable that given unlucky timing, the length
will not be set by `TrackBass` before it is queried.
To fix, switch assert to until step. I'm generally not really willing
to give this more time of day until this change is proven insufficient.
If `createRandomScore()` happened to randomly pick the highest total
score when called with `friend` as the sole argument, that particular
score would not be pink.
`GetScoreByUsername()` would arbitrarily pick the first score for the
user, so in this particular case where a friend had the number 1 score,
the test would wrongly fail.
Fix by checking whether any of the 3 added friend scores have received
the pink colour. Because there is more than 1 friend score in the test,
doing so ensures that at least one of those should eventually become
pink (because, obviously, you can't have two scores at number 1).
Allows to clearly see what the failure is:
TearDown : System.TimeoutException : "friend score is pink" timed out: Expected: some item equal to "#FF549A"
But was: < "#FFFFFF", "#7FCC33", "#444444" >
The #7FCC33 colour is used for the first score on the leaderboard.
The full-stack test using the whole 9 `OsuGameTest` yards is unusable
for rapid development. I don't really get how you could ever design
anything using it without tossing your computer out the window.
`TestSceneScoring` included a local simulation of stable's Score V1
algorithm. One of the parts of said algorithm is a mysterious
"score multiplier", influenced by - among others - the beatmap's drain
rate, overall difficulty, circle size, object count, drain length,
and active mods. (An implementation of this already exists in lazer
source, in `OsuLegacyScoreSimulator`, but more on this later.)
However, `TestSceneScoring` had this multiplier in _two_ places, with
_two_ distinct values, one of which being 1 (i.e. basically off).
Unfortunately, the place that had 1 as the multiplier was the wrong one.
Stable calculates the score increase for every hit in two stages;
first, it takes the raw numerical value of the judgement, but then
applies a combo-based bonus on top of it:
scoreIncrease += (int)(Math.Max(0, ComboCounter.HitCombo - 1) * (scoreIncrease / 25 * ScoreMultiplier));
On the face of it, it may appear that the `ScoreMultiplier` factor
can be factored out and applied at the end only when returning total
score. However, once the above formula is rewritten as:
scoreIncrease = scoreIncrease + (int)(Math.Max(0, ComboCounter.HitCombo - 1) * (scoreIncrease / 25 * ScoreMultiplier));
= scoreIncrease * (1 + (Math.Max(0, ComboCounter.HitCombo - 1) / 25 * ScoreMultiplier))
it becomes clear that that assumption is actually _incorrect_,
and the `ScoreMultiplier` _must_ be applied to every score increase
individually.
The above was cross-checked experimentally against stable source
on an example test map with 100 objects, and a replay hitting them
perfectly.