Update split_one_chevs balancemode v2 #328

jauggy · 2024-06-12T23:33:06Z

Context

I played a game with split_one_chevs enabled and noticed a few issues. The 1Chevs were not split because Teiserver recognised Macwhite as a 2Chev, whereas Chobby showed them as a 1Chev. This is likely because Chobby fetches a player's rank on login and never updates it, so it falls out of sync with Teiserver.

How to resolve?

To resolve this, my algorithm will now group both 1Chevs and 2Chevs into the "Noob" bucket. There is also a slight change to how we draft players in the "Noob" bucket, detailed below.
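The bucketing rule can be sketched as below. This is only an illustration: the module name, the `bucket/1` function, and the use of a 1-based chevron count are assumptions made for this example and are not taken from the PR's code.

```elixir
defmodule ChevBucketSketch do
  # Hypothetical helper: `chevrons` is the 1-based chevron count shown in the lobby.
  # 1Chev and 2Chev players both land in the :noob bucket, so a player whom
  # Chobby shows as 1Chev but Teiserver sees as 2Chev is still treated as a noob.
  def bucket(chevrons) when chevrons in [1, 2], do: :noob
  def bucket(chevrons) when is_integer(chevrons) and chevrons >= 3, do: :experienced
end

IO.inspect(ChevBucketSketch.bucket(1))
IO.inspect(ChevBucketSketch.bucket(2))
IO.inspect(ChevBucketSketch.bucket(3))
```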

Other Findings

When putting the replay

https://www.beyondallreason.info/replays?gameId=39096966518c40a66b2130b057864aa5

into https://openskill-test.web.app/ I noticed that the library expects Team 1 to win, even though most humans would bet on Team 2.

Since the library expects Team 1 to win, if my team were to win I would gain a lot of OS (+1), whereas if my team were to lose I would lose only 0.37.

From this we can conclude that when choosing among the three "Noobs" (CindersFire, Macwhite, Victorious_Dead), you probably want to avoid the overrated players, who are likely the ones with the highest uncertainty. If an overrated player is on the other team, you stand to win more and lose less, and since they are overrated, you are also more likely to win. Therefore, among those in the "Noobs" category, we probably want to pick those with low uncertainty first, as they are less likely to be overrated.

split_one_chevs Algorithm v2

Based on these findings, the algorithm will now draft players based on these criteria:

Always pick experienced (3Chev+) players over noobs (1-2Chevs).
Prefer higher OS for experienced players.
Prefer lower uncertainty for noobs.

This draft mimics how a human might draft players using the information visible in a lobby. It's not super mathematical. Players generally look at chevron level to judge how overrated someone might be. Someone did complain in chat about the lobby balance in the game mentioned above. They were obviously eyeballing the chevron levels and assuming those two players were overrated.
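The three criteria above can be expressed as a single sort key. The sketch below is a minimal illustration under stated assumptions: the player maps, their field names, and the `DraftOrderSketch` module are hypothetical and do not reflect the data structures used in the PR.

```elixir
defmodule DraftOrderSketch do
  # Hypothetical player maps: %{name, chevrons, rating, uncertainty}.
  def noob?(%{chevrons: chevrons}), do: chevrons <= 2

  # Builds a key so that an ascending sort yields the pick order:
  #   experienced players first, highest rating first;
  #   then noobs, lowest uncertainty first.
  def sort_key(player) do
    if noob?(player) do
      {1, player.uncertainty}
    else
      {0, -player.rating}
    end
  end

  def pick_order(players), do: Enum.sort_by(players, &sort_key/1)
end

players = [
  %{name: "noob_a", chevrons: 1, rating: 18.0, uncertainty: 8.0},
  %{name: "vet_a", chevrons: 4, rating: 20.0, uncertainty: 2.0},
  %{name: "noob_b", chevrons: 2, rating: 12.0, uncertainty: 6.5},
  %{name: "vet_b", chevrons: 5, rating: 31.0, uncertainty: 1.5}
]

players
|> DraftOrderSketch.pick_order()
|> Enum.map(& &1.name)
|> IO.inspect()
```

Note that noob_a has a higher rating than noob_b but is still picked last, because within the noob bucket only uncertainty is considered.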

Further enhancements

Previously, when the balancer was called it would check the result, and if the deviation in ratings between teams was too high it would rerun the balancer with all parties split into solo players. Now the balance result has a new field: has_parties?. If this is false, we do not need to rerun the balancer.
Permissions have been reduced for fake users that are not admins. This lets you test that the Balance tab doesn't appear for normal users.
Anyone with a Staff role, e.g. Tester, Contributor, etc., can now see the Balance tab. The Balance tab now has a dropdown allowing you to switch to different balance modes.
fuzz_multiplier (randomness added to match rating) is now only enabled for Teifion's algos. This makes it easier to debug issues.
If there are no noobs, an alternate balancer (loser_picks) will be called. This is the default algo and it supports parties.
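The has_parties? check from the first item can be sketched as follows. Everything here other than the has_parties? field name is an assumption for illustration: the `RerunSketch` module, the shape of the result map, the `deviation` key, and the stand-in balancer function are hypothetical and are not the Teiserver API.

```elixir
defmodule RerunSketch do
  # `balance_fun` stands in for a real balancer. It receives the groups and
  # returns a map with :deviation and :has_parties? keys (assumed shape).
  def balance_with_retry(groups, balance_fun, max_deviation) do
    %{has_parties?: has_parties?, deviation: deviation} = result = balance_fun.(groups)

    if has_parties? and deviation > max_deviation do
      # Parties were present and the result is too uneven: retry with solo players.
      balance_fun.(split_parties(groups))
    else
      # No parties, or the result is already acceptable: keep the first result.
      result
    end
  end

  # Each group is a list of player names; a party is a group with several members.
  def split_parties(groups) do
    Enum.flat_map(groups, fn group -> Enum.map(group, &[&1]) end)
  end
end

# Toy balancer: pretends that any party makes the teams more uneven.
fake_balancer = fn groups ->
  has_parties? = Enum.any?(groups, fn group -> length(group) > 1 end)
  deviation = if has_parties?, do: 20, else: 5
  %{groups: groups, has_parties?: has_parties?, deviation: deviation}
end

IO.inspect(RerunSketch.balance_with_retry([["a", "b"], ["c"], ["d"]], fake_balancer, 10))
```

When no group has more than one member, the first result is returned directly, since splitting parties could not change the input.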

Known Bugs

Teiserver doesn't know the rank shown to the user in Chobby. Chobby gets the rank on login and then never updates. Teiserver, therefore, may classify a user as 2Chev but they might be shown as 1Chev in Chobby.

Unit Tests

Run this to execute the unit tests that relate to balance:

  mix test --only balance_test

Local Dev Tests

See comment here for test steps.

Theoretical Testing on past replays

Go here: https://balance-algo-web.web.app/
And enter a past replay. Change algorithm to Split One Chevs v2

lib/teiserver_web/templates/admin/match/index.html.heex

Add split-one-chevs-v2 Update moduledoc

This reverts commit 0dd77e3.

Check for bad balancer

jauggy · 2024-06-15T20:21:10Z

I have updated the PR, and you can now see a dropdown to change the balancer. I also removed the admin pages (they now redirect) to be consistent with the previous PR made by Perfi to remove them.

Refactor minor

Minor update

jauggy · 2024-06-25T00:14:33Z

Sample video of testing balancer tab in Integration server:
https://www.youtube.com/watch?v=KUqzvU6GBug

Note that the balancer tab only appears for rated games. Also, the chevron level of players is always based on current data (since we don't store a history of this).

More improvements

jauggy · 2024-06-28T16:48:30Z

Local Dev Tests

You must rerun the fake data task. If you ran it previously, the fake users will have permissions that are too high. I modified the fakedata task so that fake users have normal permissions. The task now also adds fake playtime data.

mix teiserver.fakedata

Launch the website

mix phx.server

Login to the website using root@localhost
Now go to Admin > Matches > Select a Match > Balance Tab.
You will see the logs for the loser_picks algo.

There is a dropdown with the label "Balance Algorithm" near the top. Change this to split_one_chevs
You will now see the logs for split_one_chevs.

Testing permissions of normal users

Log in as root admin and find a user. Copy their email, which should be a GUID. Log in again as that user with that email; the password is password. Check that they cannot see the balance tab.
Log in again as admin and give that user a contributor/tester role. Log in again as the user; they should now have access.

L-e-x-o-n

Teiserver doesn't know the rank shown to the user in Chobby. Chobby gets the rank on login and then never updates. Teiserver, therefore, may classify a user as 2Chev but they might be shown as 1Chev in Chobby.

I thought rank was only calculated on login, when/where else does Teiserver calculate rank?

L-e-x-o-n · 2024-06-28T16:32:59Z

lib/teiserver/mix_tasks/fake_data.ex

@@ -96,7 +96,7 @@ defmodule Mix.Tasks.Teiserver.Fakedata do
            name: generate_throwaway_name() |> String.replace(" ", ""),
            email: UUID.uuid1(),
            password: root_user.password,
-            permissions: ["admin.dev.developer"],


Why this change?

So in order to test that a user cannot see the balance tab, they must have zero permissions. That admin.dev.developer permission basically gives them admin role.

L-e-x-o-n · 2024-06-28T16:34:21Z

lib/teiserver/mix_tasks/fake_playtime.ex

Would it be good to integrate FakePlaytime into Fakedata to avoid the need for 2 commands?

I'll investigate to see if I can do it. This is much more about my Elixir skills not being strong enough to safely modify this code.

L-e-x-o-n · 2024-06-28T16:52:42Z

lib/teiserver_web/live/battles/match/show.ex

    socket
-    |> mount_require_any(["Reviewer"])
+    |> mount_require_any(["Reviewer", "Contributor"])


Ratings were locked behind Overwatch, balance behind Reviewer, neither available to Contributor which in my opinion would find the data more useful, mainly for debugging purposes, than Overwatch for moderation. This doesn't make much sense to me and I would change it all to Contributor but this is something for Beherith to decide.

L-e-x-o-n · 2024-06-28T16:52:44Z

lib/teiserver_web/live/battles/match/show.ex

            end)
        end)
        |> List.flatten()

      past_balance =
-        BalanceLib.create_balance(groups, match.team_count, mode: :loser_picks)
+        BalanceLib.create_balance(groups, match.team_count, algorithm: balancer)


Are balancer and algorithm the same thing in this context? Some places are using one, some the other, it might be nice to always use the same.

Yeah I can update my word usage to be more consistent. Will look into it.

L-e-x-o-n · 2024-06-28T17:05:17Z

lib/teiserver/battle/balance/split_one_chevs.ex

+  end
+
+  @spec has_enough_noobs?([BT.expanded_group()]) :: bool()
+  def has_enough_noobs?(expanded_group) do


has_enough_noobs only checks if there are noobs, not how many?
Is this intended or not?

I had some ideas for defining "enough noobs" as something like:

At least a single 1Chev, or

At least two 2Chevs

But I decided to keep it simple for now, so the function is more about allowing additional complexity in the future.

A single noob actually gets treated differently compared to Teifion's balancer. Teifion's balancer will pick the noob higher (since it's based on their OS). My algorithm will always pick the noob last irrespective of OS.
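For comparison, the simple check and the stricter rule that was considered but not adopted can be sketched side by side. This is a hypothetical illustration: the module, the function names, and the plain list of 1-based chevron counts are assumptions for this example, and the real has_enough_noobs? takes expanded groups rather than a list of integers.

```elixir
defmodule EnoughNoobsSketch do
  # `chevrons` is a list of 1-based chevron counts, one per player (assumed input).

  # Simple rule as described in the thread: any noob at all is enough.
  def any_noobs?(chevrons), do: Enum.any?(chevrons, &(&1 <= 2))

  # Stricter rule that was only an idea: one 1Chev, or at least two 2Chevs.
  def stricter?(chevrons) do
    one_chevs = Enum.count(chevrons, &(&1 == 1))
    two_chevs = Enum.count(chevrons, &(&1 == 2))
    one_chevs >= 1 or two_chevs >= 2
  end

  # With no noobs, fall back to the default algorithm.
  def algorithm_for(chevrons) do
    if any_noobs?(chevrons) do
      :split_one_chevs
    else
      :loser_picks
    end
  end
end

lobby = [4, 3, 2, 5]
IO.inspect(EnoughNoobsSketch.any_noobs?(lobby))
IO.inspect(EnoughNoobsSketch.stricter?(lobby))
IO.inspect(EnoughNoobsSketch.algorithm_for(lobby))
```

A lobby with a single 2Chev is where the two rules disagree: the simple rule counts it as having noobs, while the stricter rule would not.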

jauggy · 2024-06-28T17:18:47Z

Teiserver doesn't know the rank shown to the user in Chobby. Chobby gets the rank on login and then never updates. Teiserver, therefore, may classify a user as 2Chev but they might be shown as 1Chev in Chobby.

I thought rank was only calculated on login, when/where else does Teiserver calculate rank?

To be honest this bug really baffled me. I still have no idea why they are out of sync. From my searching it only gets calculated on login.

L-e-x-o-n · 2024-06-28T17:27:09Z

Teiserver doesn't know the rank shown to the user in Chobby. Chobby gets the rank on login and then never updates. Teiserver, therefore, may classify a user as 2Chev but they might be shown as 1Chev in Chobby.

I thought rank was only calculated on login, when/where else does Teiserver calculate rank?

To be honest this bug really baffled me. I still have no idea why they are out of sync. From my searching it only gets calculated on login.

That was my understanding as well. I checked again, not sure what we are missing...

jauggy · 2024-06-28T17:28:57Z

There is at least an issue for it now, #332, so the bug is recorded.

jauggy · 2024-06-28T17:47:37Z

@L-e-x-o-n I have updated the PR now with the following changes:

fakedata task now also calls task to add playtime stats
balancer renamed to be more consistent with other usage

jauggy · 2024-08-14T01:27:30Z

lib/teiserver/battle/libs/balance_lib.ex

    # which has an expiry of 60s
    # See application.ex for cache settings
    rating_type_id = MatchRatingLib.rating_type_name_lookup()[rating_type]
-    rating = get_user_balance_rating_value(userid, rating_type_id)
+    {skill, uncertainty} = get_user_rating_value_uncertainty_pair(userid, rating_type_id)
+    rating = calculate_rating_value(skill, uncertainty)
    rating = fuzz_rating(rating, fuzz_multiplier)


This is a mistake
calculate_rating_value should not be called since we already have the rating.

jauggy marked this pull request as ready for review June 12, 2024 23:57

Add split-one-chevs-v2

0dd77e3

Add split-one-chevs-v2 Update moduledoc

jauggy force-pushed the jauggy/split-one-chevs-v2 branch from fc47059 to 0dd77e3 Compare June 13, 2024 07:48

jauggy mentioned this pull request Jun 15, 2024

Chobby and teiserver not agree on chev levels #332

Open

Remove admin pages

093c8e4

This reverts commit 0dd77e3.

jauggy force-pushed the jauggy/split-one-chevs-v2 branch 2 times, most recently from 4f05643 to dacd9fd Compare June 15, 2024 11:19

Handle balancer via url

95d352e

Check for bad balancer

jauggy force-pushed the jauggy/split-one-chevs-v2 branch from c09fbef to 95d352e Compare June 15, 2024 13:24

jauggy force-pushed the jauggy/split-one-chevs-v2 branch from 0dd4639 to d638612 Compare June 15, 2024 20:23

Add balancer dropdown

a2c5976

Refactor minor

jauggy force-pushed the jauggy/split-one-chevs-v2 branch from d638612 to a2c5976 Compare June 15, 2024 22:42

Fix mix tasks

2cd66d9

Minor update

jauggy force-pushed the jauggy/split-one-chevs-v2 branch from 0497a07 to 2cd66d9 Compare June 16, 2024 00:11

jauggy force-pushed the jauggy/split-one-chevs-v2 branch 2 times, most recently from ed2850c to 21f8613 Compare June 25, 2024 06:30

Call teifion's algo when not enough noobs

182e034

More improvements

jauggy force-pushed the jauggy/split-one-chevs-v2 branch from 21f8613 to 182e034 Compare June 25, 2024 07:04

jauggy mentioned this pull request Jun 28, 2024

Balance tab is shown even if the user doesn't have permissions to access it #343

Closed

L-e-x-o-n reviewed Jun 28, 2024

Updates based on Lexon feedback

867ab5d

Minor updates

729d610

jauggy added 2 commits July 10, 2024 18:14

Merge remote-tracking branch 'upstream/master' into jauggy/split-one-…

21c1f11

…chevs-v2

Minor fixes to pass automated checks

f9a2ccb

L-e-x-o-n merged commit 4e28cd9 into beyond-all-reason:master Jul 10, 2024
3 checks passed
