Developer Update - 28 August 2025

In addition, it wouldn’t be a good idea to support a paid asset in the SDK and leave that asset vulnerable to piracy. Instead, it’d be worthwhile for VRC to build their own in-house cloth system, one that would be miles more performant than Unity’s own system

eh, sort of but no?

Tile discard tells the GPU "don't show this", which is a huge performance gain; it's as if the triangles don't exist anymore. It's not as good as not having them to begin with, but toggles are not going anywhere, and making toggles better is superior to mesh object spam.
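For anyone unfamiliar with the idea, here is a rough sketch of the discard logic in Python rather than shader code. It assumes each toggleable item has its UVs placed in its own integer UV tile and that a triangle is dropped as soon as any of its vertices sits in a hidden tile. The names `uv_tile` and `visible_triangles` are made up for illustration and are not from any actual shader.

```python
import math


def uv_tile(u, v):
    """Return the integer UV tile (column, row) that a UV coordinate falls in."""
    return (math.floor(u), math.floor(v))


def visible_triangles(triangles, hidden_tiles):
    """Keep only the triangles that have no vertex in a hidden tile.

    triangles: list of triangles, each a list of three (u, v) pairs.
    hidden_tiles: set of (column, row) tiles that are currently toggled off.
    """
    kept = []
    for tri in triangles:
        # Assumption for this sketch: one hidden vertex drops the whole triangle.
        if any(uv_tile(u, v) in hidden_tiles for (u, v) in tri):
            continue
        kept.append(tri)
    return kept


# Two triangles: a "body" in tile (0, 0) and a "jacket" in tile (1, 0).
body = [(0.1, 0.2), (0.5, 0.5), (0.9, 0.1)]
jacket = [(1.2, 0.3), (1.5, 0.5), (1.9, 0.1)]

# Toggling the jacket off means hiding tile (1, 0); the body is still drawn.
remaining = visible_triangles([body, jacket], hidden_tiles={(1, 0)})
```

Note that both triangles are still part of the one mesh and still get submitted; the hidden one is only thrown away late, which is why this is cheaper than drawing it but not as cheap as never having it.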

I will say, UV-Tile Discard has its own disadvantages compared to separate meshes. In particular, every vertex in the combined mesh has to be calculated by the CPU every single frame, regardless of whether it gets culled in the shader or not. This could definitely cause a much higher performance impact than people might think, since every blendshape would need to be compared against every vertex every frame, even if that blendshape didn’t originally exist on the model. In the vast majority of cases, UV-Tile Discard will be more performant than separating every single toggle to a new mesh, but I could absolutely see some edge cases where it performs worse

Also, one of the biggest issues with UV-Tile Discard is that it requires the creator to have a certain degree of modelling knowledge, the same as other noteworthy optimizations like texture atlasing. Unless tools are developed to make these systems easy to implement and modify within Unity, most creators will simply ignore it by virtue of having to use a 3D modelling program to use it

All this isn’t to say that I necessarily disagree. Having UV-Tile discard persist through fallback and exist on Quest would fix what is probably its largest drawback currently. I simply just wanted to point out the other potential flaws with UV-Tile discard that don’t often get talked about

for sure! and that is understandable

It won't be as performant as just *not having it to start with*, aka just not having the item you want to toggle at all

but as you mentioned, there is a balance between item count, blendshapes and the combined overhead vs the same thing but with uv-tile discard.

There are always gonna be edge cases where something might perform worse. It's a tug of war where you've got to balance it and pick the middle option


Based on my own debugging and performance analysis in Unity3D, verified with Nsight, blend shapes do not add extra vertex computation overhead. While it’s complicated to explain the detailed hardware and software implementation of the geometry pipeline, the main issue is a continuous drop in GPU utilization. This problem stems from a very early stage of the rendering process, causing poor resource management for the stencil, which leads to two distinct but related issues:

Increased Draw Calls: It generates additional draw calls that cannot be batched, adding significant overhead to the CPU thread responsible for issuing them.

Inefficient GPU Execution: This also results in a separate issue where the individual shader pipeline threads are not efficiently batched on the GPU, leading to poor utilization.

Both of these problems degrade overall performance, with the CPU-side draw call overhead affecting the main thread, and the GPU-side thread inefficiency impacting rendering speed.

Starting with Unity 6, if VRChat’s problematic implementation is replaced with compute shaders, this could largely solve the issue. This approach would allow for the automatic merging of blend shape references when the overhead exceeds 24-32 blend shapes. By consolidating every five blend shapes into one, it would reduce the front-end reference switching overhead and optimize both the CPU and GPU sides.
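To make the merging idea a bit more concrete, here is a small CPU-side sketch in Python. It is only an illustration of the bookkeeping, not a compute shader and not how Unity or VRChat does it. The function name `merge_blend_shapes` is hypothetical, the default threshold of 24 is the lower end of the 24-32 range mentioned above, and the group size of five is the figure from the same paragraph. It also assumes the weights are baked into the merged deltas, so the merge would have to be redone whenever a weight changes.

```python
def merge_blend_shapes(deltas, weights, group_size=5, threshold=24):
    """Pre-combine weighted blend shape deltas into fewer buffers.

    deltas: list of blend shapes, each a list of per-vertex (dx, dy, dz) offsets.
    weights: one weight per blend shape, in the same order as deltas.
    Returns a list of (per_vertex_deltas, weight) pairs.
    """
    if len(deltas) != len(weights):
        raise ValueError("deltas and weights must have the same length")

    # At or below the threshold, leave the blend shapes untouched.
    if len(deltas) <= threshold:
        return list(zip(deltas, weights))

    merged = []
    for start in range(0, len(deltas), group_size):
        group_deltas = deltas[start:start + group_size]
        group_weights = weights[start:start + group_size]
        vertex_count = len(group_deltas[0])
        combined = [[0.0, 0.0, 0.0] for _ in range(vertex_count)]
        for shape, weight in zip(group_deltas, group_weights):
            for i, (dx, dy, dz) in enumerate(shape):
                combined[i][0] += weight * dx
                combined[i][1] += weight * dy
                combined[i][2] += weight * dz
        # The weights are already baked in, so the merged shape uses weight 1.0.
        merged.append(([tuple(c) for c in combined], 1.0))
    return merged


# 30 blend shapes on a two-vertex mesh, each moving every vertex by +1 on x.
shapes = [[(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)] for _ in range(30)]
result = merge_blend_shapes(shapes, [0.5] * 30)
```

With 30 input shapes this yields 6 merged buffers, so whatever consumes them has 6 references to switch between instead of 30.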

Ideally, this process could be made even more efficient through custom, programmable solutions. However, Unity3D has to account for the complexities of compute shaders, which are closely tied to specific hardware architectures (like the ratio of CUDA cores and other components). There is no single optimal solution for dynamically allocating and scheduling these computations, making it a relatively dynamic but not overly complex problem. For reference, AMD has been working on compute shaders for a long time, and large game studios have been researching these types of geometry pipeline optimizations for about 7-8 years now. I’m currently focused on AI work and haven’t followed these specific topics for a while.

Of course, template resource management is one thing, but I won’t go into the more intricate details. Simply put, it’s a relatively difficult part to describe, as it involves a complex interplay of memory, threads, sort-of-reference, and other finer details. Some might argue that blend shapes just run on the GPU and incur some overhead, but I would simply suggest they take a look at the profiling and trace data first. The real issue arises when blend shape values are active (i.e., greater than zero); otherwise, they just waste VRAM.
