`DifficultyCacheLookup`s were storing raw `Mod` instances in their
`OrderedMods` field. This could cause cache lookups to wrongly succeed
for mods with settings. The particular case that triggered this fix was
Difficulty Adjust.
Because the difficulty cache is backed by a dictionary, there are two
stages to the lookup: first, `GetHashCode()` is used to find the
appropriate hash bucket to look in, and then items from that hash bucket
are compared against the key being searched for via the implementation
of `Equals()`.
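
The two stages described above can be observed with a small sketch. It
uses a Python `dict` as a stand-in for the C# dictionary, on the
assumption that both follow the same hash-then-compare pattern. The
`Key` class and the `calls` log are hypothetical names made up for this
illustration and are not part of the actual code base.

```python
# Hypothetical illustration: log which method a dict lookup calls, and when.
calls = []


class Key:
    """Stand-in for a cache lookup key with one mod name and one setting."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __hash__(self):
        calls.append("hash")
        return hash((self.name, self.value))

    def __eq__(self, other):
        calls.append("eq")
        return isinstance(other, Key) and (self.name, self.value) == (other.name, other.value)


cache = {Key("DA", 5): "cached difficulty"}

# Stage 1 hashes the searched key, stage 2 compares it with the stored one.
calls.clear()
hit = cache.get(Key("DA", 5))
stages_on_hit = list(calls)

# A key whose hash differs is rejected by the hashing stage.
calls.clear()
miss = cache.get(Key("DA", 7))
stages_on_miss = list(calls)
```

In this sketch the equality check only runs for candidates whose hash
already matches, which is why the hashing step can mask a faulty
equality path most of the time.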
As it turns out, the first hashing step ended up being the saving grace
in most cases, as the hash computation included the values of the mod
settings. But the Difficulty Adjust failure case was triggered by the
quirk that `GetHashCode(0) == GetHashCode(null) == 0`, meaning that a
setting of 0 and an unset setting land in the same hash bucket.
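
A minimal sketch of that collision, again in Python as a stand-in. The
helper `setting_hash` is an assumption written to mimic the quirk quoted
above (Python's own `hash(None)` is not 0), and `ModKey` is a
hypothetical key type with a single setting.

```python
def setting_hash(value):
    # Assumption: mimic the quirk from the text, where an unset (None)
    # setting hashes to 0, the same as the numeric value 0.
    return 0 if value is None else hash(value)


class ModKey:
    """Hypothetical key holding a single nullable mod setting."""

    def __init__(self, setting):
        self.setting = setting

    def __hash__(self):
        return setting_hash(self.setting)

    def __eq__(self, other):
        return isinstance(other, ModKey) and self.setting == other.setting


unset = ModKey(None)
zero = ModKey(0)

same_hash = hash(unset) == hash(zero)  # both keys land in the same bucket
keys_equal = unset == zero             # only equality can tell them apart

cache = {unset: "difficulty for unset setting"}
lookup_with_zero = cache.get(zero)     # a miss, as long as equality is accurate
```

With the hash no longer separating the two keys, the correctness of the
lookup rests entirely on the equality comparison.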
In such a case, the `Equals()` fallback was used. But because the `Mod`
instance stored in the lookup was not cloned, it remained externally
mutable and could be polluted after being stored to the dictionary,
breaking the equality check. Even though all of the setting values were
compared, the hash bucket no longer matched the actual contents of the
lookup (because they had been mutated externally, e.g. by the user
changing the mod setting values in the mod settings overlay).
To resolve, clone out the mod structure before creating any difficulty
lookups, so that later external changes cannot reach the stored keys.
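
The failure and the fix can be sketched together as follows. This is a
simplified Python model under stated assumptions, not the actual
implementation: `Mod`, `Lookup`, `setting_hash` and `stale_hit` are
hypothetical names, `copy.deepcopy` stands in for cloning a mod, and the
hash helper mimics the 0/null quirk mentioned earlier.

```python
import copy


class Mod:
    """Hypothetical mutable mod with one nullable setting."""

    def __init__(self, setting=None):
        self.setting = setting


def setting_hash(value):
    # Assumption: an unset setting hashes to 0, like the value 0.
    return 0 if value is None else hash(value)


class Lookup:
    """Hypothetical cache key; optionally clones the mod it is given."""

    def __init__(self, mod, clone):
        self.mod = copy.deepcopy(mod) if clone else mod

    def __hash__(self):
        return setting_hash(self.mod.setting)

    def __eq__(self, other):
        return isinstance(other, Lookup) and self.mod.setting == other.mod.setting


def stale_hit(clone):
    """Return True if a lookup made after a setting change hits the old entry."""
    mod = Mod(setting=None)
    cache = {Lookup(mod, clone): "difficulty computed for unset setting"}

    # The user changes the setting after the entry was cached.
    mod.setting = 0

    # Same bucket either way (hash 0); only the stored contents decide.
    return cache.get(Lookup(mod, clone)) is not None


without_clone = stale_hit(clone=False)  # stored key was mutated along with the mod
with_clone = stale_hit(clone=True)      # stored key kept its original values
```

Without cloning, the stored key and the searched key share one mod
instance, so the comparison trivially succeeds and the stale entry is
returned. With cloning, the stored key keeps the values it was created
with and the lookup correctly misses.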
Intentionally not using `[Values]` as the scale modes can be applied to
the running game instance directly, rather than recreating it all over
again.
The same could probably be said for the notification overlay, but I'm
not sure; it seems like something that should be considered at an
`OsuGameTestScene` level instead (whether the same game instance can be
reused for further testing).
By using `Content` instead, the logic now gets the X position of the
settings overlay in `Content` space, which is scaled in
`ScalingMode.Everything` mode.
And in the case of `ScalingMode.ExcludeOverlays`, a subcontainer
somewhere inside `Content` that holds the screen stack is scaled
instead, but `Content` itself is unaffected, which is what we want in
that case.
Turns out we likely don't want this, as it means the testing user (using
a visual test browser) will not have access to their beatmaps. Can
revisit at a future date if the temporary files are still an issue.