Network Update Season & Master Transfer (Creator Feedback Wanted!)

Hi everyone!

We’re cooking up a special update for the VRChat Worlds SDK: Reworking how networking “ownership” is handled in VRChat and Udon. This should make multiplayer worlds more stable by fixing long-standing issues for world creators.

… But this means entering the :warning:Danger Zone :warning: since it may break existing VRChat worlds. That’s why we want to share our plans with creators to ensure we’re all on the same page.

We’ll release more detailed patch notes soon, but here’s a sneak peek at our current draft:

  • The timing of the OnPlayerLeft event has been fixed. The VRCPlayerApi instance is now guaranteed to be valid in the event’s callback.
  • LocalPlayer is now guaranteed to be valid in Start and OnEnable.
  • Fixed several internal issues with ownership and owner transfer.
  • Improved the stability of VRChat’s object sync.

The Quest to hunt a bug on live

Aside from the updates above, though, there is one particular issue that we are trying to tackle: Quest and Mobile players breaking instances by suspending the app or putting their devices into sleep mode.

This has been reported to us a lot over the past few weeks, and it often causes popular worlds such as No Time Two Talk or Super VR Ball to end up with “broken” instances.

The general problem is that when a device enters sleep mode or suspends the app, there is a timeout of approximately 4 minutes before the user gets kicked. During that time, the user will not run any Udon code in general, which includes networking.

More details

If you are wondering why we don’t time users out immediately: doing so would give mobile users a terrible experience.

Imagine you take off your Quest to wipe off some sweat, grab a drink, put it back on, and… you’ve been disconnected, need to rejoin, and lost all your progress.

Or worse: You’re playing a VRChat game on your phone when someone sends you a DM. You click the notification, respond with “lmao,” tab back into VRChat… and you’ve been disconnected. Yikes.

Changing the 4-minute grace period is still up for debate. For now, we’ve opted to leave it at the default values given to us by our internal systems.

The actual bug is that this timeout will not always apply. Players may stick around for hours instead of minutes. We are working on fixing this in a separate, standalone patch.¹

Once that bug has been fixed, the 4-minute timeout would work as expected again. But that’s the core issue: Instances may break even if the timeout works as expected.

So, we dug a bit deeper.

Mastering the Quest (Master Transfer)

The affected worlds seem to have one thing in common: They only break if the instance master (docs, more docs) gets stuck this way. Wouldn’t it be nice to just swap the master away when someone’s device goes to sleep?

Well, we can. Our systems allow us to switch the master client at runtime, even without the current master leaving.

This is fine for our own systems, but we don’t know how it’ll affect your systems. As such, we want to gather your feedback on this. Here are some details on what we’re talking about (warning, technical!):

  • Currently, the master behaves as follows: (Unfortunately, this is not fully explained in our documentation.)
    • The first user joining an instance becomes the network master.
    • The only time the master can switch is when the current master leaves the instance.
    • When switching, the player that has been in the instance the longest among the remaining players will be chosen as master.
  • This means, in your current scripts, it is viable to only ever check if you become master when someone else leaves.
    • You don’t need to check every frame, because it can only happen in that event.
    • Master status can only ever be given to a player, as removing it happens after Udon execution has already stopped.
    • As a side note, we already recommend not relying on this way of networking. Explicit ownership checks should be used instead.
  • Introducing what we call “Master Transfer” would change the above in a few ways:
    • Master can now be taken away from a player even if they remain in the instance.
    • This will transfer ownership of all implicitly owned objects. For example, if the master called SetOwner on something, that owner would keep it. Anything that never had SetOwner called will be transferred to the new master.
    • As before, when the current master leaves, a new master will still be chosen.
  • What could this break?
    • Any world that doesn’t expect master to be taken away from a player without them leaving (e.g. if an Udon script caches Networking.IsMaster at any point).
    • Any Udon scripts that only check for master changes in OnPlayerLeft.
    • Prefabs that rely on any other aspect of the existing master behavior.
  • Despite this, we’ve seen that it produces good results in the worlds mentioned above - that is, it might break things, but it definitely fixes others.
  • Since PC players never go to sleep², this would only affect cross-platform worlds (for now?)
  • The above also implies that when master switches, the new one will not be the oldest in the instance (since that, by definition, would be the one that just went to sleep)
    • We’re considering changing this aspect separately. When the current master leaves, the new master could instead be chosen based on other factors (for example, preferring PC clients or choosing players with a stable internet connection).
    • We believe the potential of breaking content with just this change is low enough that it could be worth releasing it unconditionally.
  • In theory, all of this can be dynamically toggled per instance, so there’s the possibility of gating this behind world settings (e.g. an option in the SDK, or even similar to how the avatar scaling toggle on the web works)
  • Whatever we decide on, we will document the specifications properly this time so you know what you can rely on (and which parts you shouldn’t!)
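To make the rules above a bit more concrete, here is a rough toy model in Python (not Udon, and not how VRChat implements any of this). The `Instance` class and all of its method names are made up for illustration. Two parts are pure assumptions: that the replacement master is the longest-present player who is awake, and that objects explicitly owned by a leaving player fall back to the master.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Instance:
    """Toy model of the master rules in this post (hypothetical, illustration only)."""
    players: list = field(default_factory=list)         # join order, oldest first
    asleep: set = field(default_factory=set)            # players whose device is suspended
    explicit_owner: dict = field(default_factory=dict)  # object -> player set via SetOwner
    master: Optional[str] = None

    def join(self, player):
        self.players.append(player)
        if self.master is None:  # the first user in the instance becomes master
            self.master = player

    def leave(self, player):
        self.players.remove(player)
        self.asleep.discard(player)
        # Assumption: objects explicitly owned by the leaver become implicitly owned again.
        self.explicit_owner = {o: p for o, p in self.explicit_owner.items() if p != player}
        if player == self.master:
            fallback = self.players[0] if self.players else None
            self.master = self._pick_awake() or fallback

    def sleep(self, player):
        """Proposed Master Transfer: a sleeping master loses the role without leaving."""
        self.asleep.add(player)
        if player == self.master:
            candidate = self._pick_awake()
            if candidate is not None:  # if everyone is asleep, nothing changes
                self.master = candidate

    def set_owner(self, obj, player):
        self.explicit_owner[obj] = player

    def owner_of(self, obj):
        # Objects that never had SetOwner called follow the current master.
        return self.explicit_owner.get(obj, self.master)

    def _pick_awake(self):
        # Assumption: the longest-present player that is awake is chosen.
        for p in self.players:
            if p not in self.asleep:
                return p
        return None


inst = Instance()
for name in ("quest_user", "pc_user", "late_user"):
    inst.join(name)
inst.set_owner("scoreboard", "quest_user")

# A script on quest_user's client caches its master status once (the risky pattern).
cached_is_master = inst.master == "quest_user"

inst.sleep("quest_user")
print(inst.master)                  # new master after the transfer
print(inst.owner_of("door"))        # implicitly owned: follows the new master
print(inst.owner_of("scoreboard"))  # explicitly owned: stays with quest_user
print(cached_is_master)             # the cached flag is now stale
```

The last line is the breakage case from the list: the cached flag still says "I am master" even though the role has moved on.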

We would love to hear your feedback on these changes! We recognize the potential for breaking things here. But we see this as a more reliable fix for an issue plaguing several high-profile worlds.

Drop us any notes below, and we’ll monitor this thread for a while to see what you say. Aside from fixing the “indefinitely stuck” bug, nothing has been decided here yet.

And stay tuned for more updates on the (hopefully less polarizing for existing content :crossed_fingers:) networking open beta mentioned in the intro :)

Footnotes
¹ If you have any details on the “indefinitely stuck” issue, or know how to reliably trigger it, we’ll take any hints!
² Clubs don’t fill themselves, and timezones are hard, you know?

12 Likes

Automatically switching the instance master to the user with the most stable connection and possibly the most stable application performance would be helpful for some of my content. I can imagine it breaking content though, so it might be good to make it an option for creators to keep the old behaviour? I can’t read LOL this was already mentioned

Furthermore, I do believe that there are some people who host events that rely on the master status being in the hands of a staff member to ensure they have the right permissions. So this may cause issues for some venues/events if the master suddenly switches even though the host first created the instance.

7 Likes

I believe a longer-running beta could be useful here, along with listening to creator feedback during it. It also provides a window for users to let creators know their creations could be broken by this behavior, if you choose to have the new behavior enabled by default. And an SDK toggle should solve most issues.

1 Like

Read it through, this shouldn’t break anything in LS Media.
Also, I welcome being able to remove all of my null checks on the player in OnPlayerLeft at long last :smiley:

Questions:
Will it be possible to assign master from Udon, or just automatically?
Will there be a new event solely for master transfer?

Now thinking for events:
It could be useful for world authors to list a set of preferred hosts, for keeping security systems well… secure.

I think as long as this is handled similarly to existing toggles as mentioned, this should pose no issues that I can think of.
Will this toggle be per upload, or per world?
Will it be able to be changed at runtime etc?

To expand on this, for events it would definitely be useful if preferred host could be a permission in groups.

6 Likes

I think Mixie’s concern regarding event hosting is very fair. On the flip side, the world itself can actually be designed around the new ability to transfer master. Players in event worlds can have a bit more autonomy if world creators can build their systems around the fact that the master of an instance can change (e.g. security systems and keeping track of the current world state).

It may be worth looking into introducing some other functions, such as something like GetInstanceOwner, which could return the VRCPlayerApi object of the instance owner to any player that calls it, instead of just checking whether the local player is the instance owner. Perhaps also a function that can check whether the world creator is present, or whether the local player is the world creator (IsWorldOwner)?

5 Likes

To expand on this, for groups “GetInstanceOwner” wouldn’t be very useful, but I believe some other method to check if the user is supposed to have some power (e.g. changing video URLs) should be introduced.

2 Likes

Let’s be clear: we’re talking about the instance master here!

Instance master or “networking host” is the host peer for the instance. You handle unowned objects and other networking constructs. This role grants no moderation power. If you leave, you aren’t the instance master/host anymore, and it is handed to the next person in the instance who’s been there the longest.

There isn’t importance assigned to this “instance master” role in a world or instance unless the world has been built to assign importance to it, although it benefits the networking performance of the instance if the person who has it has a decent system and low ping-to-relay.

Instance owner or instance creator is the person who created the instance. This person has moderation powers like kicking or warning. They can leave and return, and they’re still the instance owner, and still have those powers.

For completeness, world author is, well, the author of the world. The person who uploaded the world. They have moderation powers in public instances of their world.

We’re talking about instance master or networking host in this thread. The rest aren’t relevant to this discussion.

2 Likes

I would like to second this and say that multiple beta channels on Steam are completely normal. So I think it would be useful to provide a second beta branch specifically for this feature, this way you can be a little more fast and dirty with testing changes to find a good solution.

I believe you can also lock them behind access codes, so you can provide the access code in one of these threads so people don’t stumble upon the beta branch.

Logic like this should already be handled by checking for instance owner, not instance master - I do recognize that there are scenarios where that will not be the case though. For that I will note that our current proposal would only make this automatic switch when a device goes to sleep, hence it would only be able to trigger if the current master is on a non-PC device. This is probably a bad choice for an event host anyway.

That’s the plan!

For security reasons, it will not be possible to choose a specific user as a master. As of right now, there is also no API planned to trigger such a transfer from Udon in general.

We have already thought about introducing an “OnMasterSwitched” (or similar) event however, as well as an explicit “GetMaster” function (probably at that point also “GetInstanceOwner”).
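Until something like that event exists, the safe pattern is to re-check the current master regularly instead of caching it. Here is a minimal Python sketch of that idea (not Udon). `MasterWatcher`, `get_master` and `on_switched` are hypothetical names standing in for whatever the final API ends up being.

```python
from typing import Callable, Optional


class MasterWatcher:
    """Polls the current master and reports changes (hypothetical helper)."""

    def __init__(
        self,
        get_master: Callable[[], Optional[str]],
        on_switched: Callable[[Optional[str], Optional[str]], None],
    ) -> None:
        self._get_master = get_master
        self._on_switched = on_switched
        self._last = get_master()  # remember who was master when we started

    def poll(self) -> bool:
        """Call regularly (e.g. once per update); returns True if the master changed."""
        current = self._get_master()
        if current == self._last:
            return False
        previous, self._last = self._last, current
        self._on_switched(previous, current)
        return True


# Usage with a fake instance state instead of a real networking API.
state = {"master": "alice"}
events = []
watcher = MasterWatcher(lambda: state["master"], lambda old, new: events.append((old, new)))

watcher.poll()             # nothing changed yet
state["master"] = "bob"    # master is transferred while alice stays in the instance
watcher.poll()             # fires the callback with ("alice", "bob")
print(events)
```

Because it compares against the last known master on every poll, it catches transfers that happen outside of OnPlayerLeft as well.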

Up for debate, but most likely not at runtime (as in: must be set at instance creation time, somehow).

1 Like

Yeah, I think that’s why I didn’t have much to say about it. Before, instance master was simply the default user we all synced to, and we had really no control over it. (If the instance master had poor networking, the instance was kind of doomed until they left.) Introducing the ability to manually switch it over could be interesting in the long run (an example could be transferring it to someone with low ping, like you said). I can’t imagine too many current worlds that are really built around instance master specifically.

Unless you all are saying that this “master transfer” will be automated by the servers/clients and not manually through Udon?

That seems to be the case; it doesn’t look like it’s currently planned for world creators to have any control over who it gets transferred to, going by _tau’s response to my questions.

Regardless of how you’re gonna go about it, this is hard to test for a world creator working alone (even when starting multiple clients).

Consider providing world creators ways to force switch the instance master (with or without Udon, in local testing or debug-enabled instances) in order to challenge this behavior.

Also, it is unclear when master transfer occurs in a scenario where a user joins an instance that is only populated by sleeping clients, if that’s even possible.

5 Likes

That’s what tau noted above, yes. It’s all automated, and we won’t be allowing you to set networking host arbitrarily. You won’t get much benefit from doing that anyhow.

Our current swapping method served us well up to this point, and having a laggy master isn’t all that bad, depending on how the world is set up. It doesn’t actually affect that much outside of Udon, as IK, position, params, voice, etc don’t care about who is instance master (that traffic goes directly p2p thru the relay)

Since the world’s construction and networking design is what assigns importance to the master/host role, this kinda doesn’t matter too much beyond what we’ve established so far.

2 Likes

Good point. I’ll let tau answer, but I bet we could bake a simulation into ClientSim…

1 Like

Definitely agree that checking for the instance master is not the way to handle permissions like these. Unfortunately, I believe that there are still worlds that do it this way. And yeah, an event is definitely more likely to be hosted from PCVR.

I was more referring to the option where the person with the most stable connection automatically gets assigned as instance master. That might cause some confusion for some event hosting permissions here and there.

That logic would still only happen on transfer - e.g. we wouldn’t just randomly select someone with better internet, but when the current master leaves, the next one wouldn’t be the oldest, but the “best” for some definition.
As long as you stay in the world, on PC, nothing will take master away from you.
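As a rough illustration of what “best” could mean, here is a small Python sketch that picks the next master when the current one leaves. The criteria and their order (PC first, then lower ping, then join order as a tie-breaker) are purely an assumption for this example, and `Candidate` and `pick_next_master` are made-up names; nothing here reflects a decided design.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    join_order: int   # 0 = joined first
    on_pc: bool
    ping_ms: float


def pick_next_master(remaining: Sequence[Candidate]) -> Optional[Candidate]:
    """Pick a replacement master only when the current master leaves.

    Assumed ranking: PC clients first, then lowest ping, then earliest join.
    """
    if not remaining:
        return None
    return min(remaining, key=lambda c: (not c.on_pc, c.ping_ms, c.join_order))


remaining = [
    Candidate("quest_oldest", join_order=0, on_pc=False, ping_ms=40.0),
    Candidate("pc_laggy", join_order=1, on_pc=True, ping_ms=180.0),
    Candidate("pc_stable", join_order=2, on_pc=True, ping_ms=35.0),
]
print(pick_next_master(remaining).name)
```

Note that this only runs at the moment a new master is needed. It never takes master away from someone who is still present and awake.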

That is a good idea, and can definitely be done. As mentioned, you will never be able to select the new master, but a debug-button to give up your master position as if your device went to sleep is doable.

The current implementation (the one we are testing with internally, to be clear) would still be broken if all users in an instance are in a sleep state. This isn’t to say it couldn’t be fixed, but requires more changes on the backend.

2 Likes

This. I believe keeping a separate open beta for these changes would allow for more time to run tests, experiment, and keep making changes to the system as necessary. More time, especially for a change of this size, is definitely key for the changes to go over smoothly, and for creators to experiment and report how the changes are impacting their projects, if at all.

Definitely needs to be a per-world toggle similar to avatar scaling. Defaulting to old behavior for existing worlds and defaulting to new behavior for new ones feels sane enough.

Better network testing mechanisms in general would be welcomed; I’m not sure ClientSim can ever capture the nuances of real networking, but especially with Quest-specific behavior, testing on real devices is not very streamlined to say the least.

5 Likes