I have found that some avatars now carry as many as 200 or even 250 animation layers.
This figure is the sum of all animation layers referenced by the avatar descriptor.
When animation controller overrides are active, execution is close to sequential.
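To make the counting concrete, here is a minimal sketch of how such a total could be tallied. The slot names and the per-controller counts are made up for illustration and do not come from a real avatar or from the VRChat SDK.

```python
# Hypothetical per-controller layer counts for one avatar.
# Slot names and numbers are illustrative assumptions only.
layers_per_controller = {
    "Base": 1,
    "Additive": 1,
    "Gesture": 3,
    "Action": 2,
    "FX": 218,
}


def total_animation_layers(counts: dict) -> int:
    """Sum the layer counts of every controller in the mapping."""
    return sum(counts.values())


total = total_animation_layers(layers_per_controller)
print(total)  # 225
```

In this made-up example almost the whole total sits in a single controller, which is the kind of distribution to look for when checking where the layers come from.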
VRChat's Shadow Clone and Mirror Clone systems, which were implemented for visibility and other reasons, duplicate PhysBones, constraints, and animations.
Currently, the overhead for PhysBones and constraints is much lower for remote avatars.
On the local client, however, they may stay active at all times unless they are disabled by default or switched off via animations.
This causes the CPU overhead for a single avatar to rapidly increase to 2.5 or 3 times its original cost.
The increase is nearly linear: with only the shadow clone or only the mirror clone present, the cost is about 2x.
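The scaling can be written as a simple linear model. This is only a sketch of the pattern described above, not a measurement: `relative_cpu_cost` and `per_clone_factor` are names I made up, and a factor of 1.0 assumes each clone costs as much as the original.

```python
def relative_cpu_cost(shadow_clone: bool, mirror_clone: bool,
                      per_clone_factor: float = 1.0) -> float:
    """Cost relative to the avatar alone (1.0), assuming each active
    clone adds the same fixed fraction of the original cost."""
    active_clones = int(shadow_clone) + int(mirror_clone)
    return 1.0 + per_clone_factor * active_clones


print(relative_cpu_cost(False, False))       # 1.0
print(relative_cpu_cost(True, False))        # 2.0
print(relative_cpu_cost(True, True))         # 3.0
print(relative_cpu_cost(True, True, 0.75))   # 2.5
```

A factor slightly below 1.0 gives the 2.5x end of the range, which fits the increase being nearly linear rather than exactly linear.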
When the avatar is invisible, so that rendering costs are excluded, the time spent on animation layers can reach or even exceed 90% of the CPU usage.
The remaining tasks occupy less than 10%.
Within that 10%, miscellaneous items excluding PhysBones and constraints account for only 2% to 3%.
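The split of the remainder follows from simple subtraction. The sketch below uses only the percentages quoted above and assumes the remainder consists solely of PhysBones, constraints, and miscellaneous items; the function name is made up for illustration.

```python
# Shares of CPU time in percent, using the bounds quoted above.
animation_share = 90.0                      # "90% or more", a lower bound
remaining_share = 100.0 - animation_share   # therefore at most 10%
misc_low, misc_high = 2.0, 3.0              # miscellaneous items


def physbone_constraint_share(remaining: float, misc: float) -> float:
    """What is left of the remainder after miscellaneous items."""
    return remaining - misc


upper = physbone_constraint_share(remaining_share, misc_low)
lower = physbone_constraint_share(remaining_share, misc_high)
print(f"PhysBones + constraints: at most about {lower:.0f}% to {upper:.0f}%")
```

Under that assumption, PhysBones and constraints together take no more than roughly 7% to 8% in this scenario.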
Most of the animation layer load does not come from Face Tracking (FT).
FT uses comparatively few layers, typically only a handful, and they are generally better optimized.
Instead, the vast majority of layers come from expansion features that tools such as Modular Avatar (MA) add automatically.
The sheer number of layers ultimately causes a very high L3 cache miss rate, potentially 50% to 60% or higher, which compounds with other overhead items to push processing time up further.
The load characteristic of a few animations differs from that of an extreme number of animations.
The data these animations generate in memory can exceed 100 MB. For instance, processing time might be only 3.6 ms when L3 misses are low but rise to 4 ms or even 4.5 ms when misses are high.
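Expressed as a relative slowdown, those timings work out as follows. This is plain arithmetic on the numbers quoted above; the helper name is made up.

```python
def slowdown_percent(baseline_ms: float, observed_ms: float) -> float:
    """Relative increase of observed_ms over baseline_ms, in percent."""
    return (observed_ms - baseline_ms) / baseline_ms * 100.0


baseline = 3.6  # ms, low L3 miss rate
for observed in (4.0, 4.5):  # ms, high L3 miss rate
    print(f"{observed} ms: +{slowdown_percent(baseline, observed):.1f}%")
```

So the cache misses alone add roughly 11% to 25% on top of the baseline processing time.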
Taken on its own, this impact is smaller than that of the earlier issue in which VRChat disabled the graphics worker thread. That change made draw call performance 4 to 12 times worse than in standard Unity3D and widened the X3D CPU performance gap from 1.25x to 1.5x under heavy rendering.
However, because the total volume of animation layers is so massive, their actual impact is far more significant than that of rendering overhead.
Shadows themselves usually do not generate significant CPU overhead from draw calls, but the lighting overhead is much more substantial.
In the game, a small number of avatars now use so many animation layers that the processing time for a single avatar can reach 10 ms or even exceed 12 ms in certain maps.
Occasionally the processing time drops to 4–5 ms when resources are released, but it quickly climbs back and stays elevated.
The data was obtained through profiling and with the assistance of others.
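For scale, these per-avatar times can be compared against a whole-frame budget. The 90 Hz target below is my own assumption for illustration and is not part of the measurements; the helper name is also made up.

```python
def frame_budget_ms(target_hz: float) -> float:
    """Time available per frame, in milliseconds, at a given refresh rate."""
    return 1000.0 / target_hz


budget = frame_budget_ms(90.0)  # assumed 90 Hz target, not from the data
for avatar_ms in (4.0, 10.0, 12.0):
    share = avatar_ms / budget * 100.0
    print(f"{avatar_ms} ms is {share:.0f}% of a {budget:.1f} ms frame")
```

Under that assumption, a single avatar at 12 ms would by itself take longer than the entire frame budget.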