3 Feature Requests - Additional audio options for performers

First, I want to say hi, as I’m new to these forums. I apologise in advance if this is the wrong place to post this. I am also TERRIBLE at being concise, and I want to get my FULL point across with reasoning.

To make this easier to read:

@ is my requested Feature

! is the details of my request

  • is my reasoning for requesting this feature

Now to my 3 requests.

As said in the title, I’d like to request more audio options for audio performers (primarily musicians) like myself and many other players. In order of priority, I’d ask for:

@ Additional simultaneous audio inputs (the ability to use multiple inputs from interfaces etc. at the same time), ideally a minimum of 4 (separate channels with individual volume sliders/mute buttons)

! For example, I and many others use a Focusrite 2i2 USB audio interface or similar for our mic audio. Physically this interface has 2 inputs; however, VRChat seems to kneecap its capability by only picking up and outputting audio coming from input 1. I am not a developer and have no comparable understanding of how VRChat or Unity works, but I’d assume this is either intentional or an oversight.

! While it is possible to get around this limitation with software like Virtual Audio Cable and VoiceMeeter Banana, these workarounds are NOT sufficient. They put more load on the user’s system (even if the extra load is minimal, VRC is heavy enough by itself, even on my i9 9900 and RTX 2080 Super), they can add latency, and they can add audio artifacts that reduce the clarity of the sound.

  • It is in the best interest of VRC musicians like me to present the best/cleanest audio possible, to avoid irritating the players who hear us. I have witnessed even great musicians getting blocked/muted simply because, in order to capture their singing alongside their guitar/piano, they have to use a single mic at a distance to pick up everything evenly. Unless you have a decently high-end audio setup and a good audio environment, this degrades the audio quality, as you’re sending a LOT of sonic information into a single microphone, into a single preamp, then into VRChat, which like many voice chat systems compresses it, so it comes out as some level of garbled mess for end listeners. By allowing the use of more inputs from audio interfaces, or multiple different inputs simultaneously (assuming users have the equipment required), we can separate our performance out to different inputs, possibly on different channels. For example, I could keep my mic on input/channel 1 to sing or communicate, guitar on input/channel 2, and a backing track on input/channel 3. This would allow us to offer cleaner audio to listeners, making us less of an irritation to those around us who aren’t actively listening but still hear us. Hopefully that avoids us getting unnecessarily muted/blocked (since most users pretend the per-user volume slider doesn’t exist), which reduces our trust ranks and future potential audiences.
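To make the request a bit more concrete, here is a minimal Python sketch of what "separate channels with individual volume sliders and mute buttons" could look like as logic. It is purely illustrative: `Channel` and `mix_to_mono` are hypothetical names, not anything from VRChat or Unity, and a real client would handle audio buffers very differently.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Channel:
    """One hypothetical input channel with its own slider and mute button."""
    name: str
    volume: float = 1.0   # slider position: 0.0 (silent) to 1.0 (full)
    muted: bool = False   # mute button state


def mix_to_mono(channels: Sequence[Channel],
                frames: Sequence[Sequence[float]]) -> List[float]:
    """Mix one sample buffer per channel (floats in -1.0..1.0) into one buffer."""
    if len(channels) != len(frames):
        raise ValueError("need exactly one sample buffer per channel")
    # Only mix as many samples as the shortest buffer provides.
    length = min((len(buf) for buf in frames), default=0)
    mixed = []
    for i in range(length):
        total = sum(ch.volume * buf[i]
                    for ch, buf in zip(channels, frames)
                    if not ch.muted)
        # Hard-clip so the summed signal stays inside the valid range.
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed
```

With a mic at full volume, a guitar at half volume, and a muted backing track, only the first two would contribute to what listeners hear, and each could be adjusted without touching the others.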

2nd Feature Request

@ Stereo panning of Inputs/ Channels 2, 3 and 4 (with panned audio sources up to 1 metre on either side of user, with individual PER-channel volume and panning sliders)

! Building off my previously requested feature, it would greatly help (and be really cool) to be able to pan audio from the extra channels. For both performance and quality reasons, I’d like to be able to take the input coming from Input/Channels 2, 3 and 4 and move it left or right of us, spreading out where the sonic information is coming from. This way we can create a more immersive performance audio-wise as well as avoid muddiness.

  • For example, if I am playing guitar and singing where both parts are in a lower tonal register, the bass in my voice on top of the bass in my guitar sound, compounded by the compression of the voice-over-IP system/voice chat, all coming from the EXACT same source/place, will result in a muddy mess of audio. By being able to separate IN SPACE, even to a limited degree, where each audio channel is coming from, we can reduce frequency muddiness and increase clarity. Not only that, but it would be MUCH more immersive for listeners, as the sound would not be as one-dimensional and flat.
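As a rough illustration of a per-channel panning slider, here is a small Python sketch using a constant-power pan law, which is a common mixing convention and my own assumption here, not something VRChat is known to use. The function names `pan_gains` and `pan_sample` are made up for the example.

```python
import math
from typing import Tuple


def pan_gains(pan: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a pan slider.

    pan = -1.0 is fully left, 0.0 is centre, 1.0 is fully right.
    Constant-power law: left^2 + right^2 stays at 1.0 for every position,
    so the channel does not get quieter or louder as it moves.
    """
    pan = max(-1.0, min(1.0, pan))          # clamp slider to valid range
    angle = (pan + 1.0) * math.pi / 4.0     # map -1..1 onto 0..pi/2
    return math.cos(angle), math.sin(angle)


def pan_sample(sample: float, pan: float) -> Tuple[float, float]:
    """Split one mono sample into a (left, right) stereo pair."""
    left_gain, right_gain = pan_gains(pan)
    return sample * left_gain, sample * right_gain
```

At the centre position both sides get a gain of roughly 0.707, so a guitar panned slightly left and a backing track panned slightly right would each keep their loudness while no longer sitting on top of the voice.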

Last Feature request (could address potential concerns of Request 2 and fix current problem)

@ Audio Proximity Barrier

! Pretty simple and straightforward: I’d like to ask for the ability to set a barrier with a VISUAL proximity indicator (visible while in the menu, and visible to ALL players in the menu) that audio coming from us doesn’t cross AT ALL. This way we can consciously limit how far our audio travels without having to drastically reduce our volume. If we’re chilling out with a group by a mirror and performing to them, we can stop our audio reaching beyond that local area, meaning those NOT in the group don’t have to hear us and don’t have to mute us.

  • I myself have been muted by people who even said they enjoyed it but were finding me distracting while they were trying to talk with others, despite other listeners saying my volume was fine.

! It would also be cool to have the option to block audio from outside the barrier getting in. Music/audio from other players who aren’t listening can be distracting for the performer too, and it’s not practical for us to stop mid-performance to deal with that.
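The barrier idea can be sketched as a simple distance check. This is only my rough illustration in Python, assuming a spherical barrier around a centre point; `passes_barrier` and its parameters are hypothetical and not part of any VRChat or Unity API.

```python
import math
from typing import Sequence


def passes_barrier(source: Sequence[float],
                   listener: Sequence[float],
                   center: Sequence[float],
                   radius: float,
                   block_inbound: bool = False) -> bool:
    """Decide whether audio from `source` should reach `listener`.

    Assumes a hypothetical spherical barrier of `radius` metres around
    `center` (for example, the performer's position).
    """
    source_inside = math.dist(source, center) <= radius
    listener_inside = math.dist(listener, center) <= radius

    # Audio made inside the barrier never leaves it.
    if source_inside and not listener_inside:
        return False
    # Optional second request: outside audio does not get in either.
    if block_inbound and listener_inside and not source_inside:
        return False
    return True
```

Unlike a volume falloff, this is a hard cut-off: anyone inside the radius hears the performance at normal volume, and anyone outside hears nothing from it.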

Hi Audio Phoenix, as others will say, welcome!

I’d guess feature requests are best placed under feedback reports for the relevant thing you want it added to, wherever that may be. I’d suspect this is aimed at the VRChat World SDK.

As a fellow DJ and musician with some audio engineering experience, I can understand your feature request. I also have a third-generation Focusrite Scarlett 2i2 in my studio, though as popular as they are, I rarely use mine because of its limitations: it is USB2 and has awful lag. Worth a look is their Clarett+, which is USB3, has zero lag, gives clean sound regardless of how many tracks you throw at it, and can power your headphones, but I digress…

Apologies for the long and detailed answer. I’m going from my own experience of dealing with the same issues, in the hopes it helps…

The good news is, most of what you’re asking for already exists. Unity, the VRChat world SDK, and the Oculus SDK have these things. How accessible they are to you depends on how you’re getting your signal into VRChat and on how well the world/venue you’re performing in can stream it, though technically some of this is out of our control… Everyone’s setup is going to be different.

On the first request, inputs: sorry, the onus is on you for this one. If you haven’t invested in a mixer and are just using the Scarlett’s 2 ins and outs for monitoring, then all you have is the USB from it going into your computer. It’s an audio interface giving you 24-bit 192k, which may be okay in most, but not all, circumstances. You provide the final mixdown; the mixing is done long before it gets online to the world… You choose to provide it as mono, stereo, 3.1, 5.1, or full immersive 360-degree sound, whatever your gear is capable of and appropriate for your set.

If you need more inputs, then your ASIO drivers and/or mixer should be doing the heavy lifting on your end. If you’re thinking of trying to remux 4 individual channels in real time, the Internet just doesn’t work that way, unfortunately. Your audio would lose sync and drift all over the place, like musicians who can’t keep time with each other. Believe me, I’ve tried it.

Nevertheless, once you get your source outs the way you want them, you then route that like you would to any other web streaming platform, such as Twitch or YouTube, passing through something like OBS. Most in-world TVs will accept this type of stream and play it without issue, and it will sound great for everyone.

(As an aside, I’ve asked for Shoutcast support, but that’s another topic).

As a last resort perhaps, some people will pipe their sound through their VR headset or desktop mic input in lieu of doing a proper stream, so instead of voice you’re hearing the person’s computer mix. IMHO it’s going to sound pretty awful, since that path was made for voice and never designed for high-quality audio. On top of that, because of earmuffs or personal volume sliders, you can’t really guarantee everyone will hear you. (Hence another reason we use world audio instead.)

Which brings us to request #2. I don’t know about Vive or Index users, but this is indeed built into the Oculus SDK. Some desktop users may have fully spatial sound while others could be completely mono with one speaker; there’s no telling who has what. But you can indeed pipe spatial data through the world SDK, as well as control your output range. We are talking about world audio here: the VRC Spatial Audio component can be added to or removed from any audio source in the world. You can pan these, change gain, add effects, define the range and priorities of sounds over each other, and dial it in as you like. However, it is done on the Unity end for the world map, and the world builder would need to set this up for you ahead of time if you had special considerations, unless they code in sliders to make these accessible in real time. Check out the SDK docs; there’s a lot there to consider.

For request #3, we have the earmuffs indicator in the client, but it’s up to the world maker to set up audio zones for how the world or other audio behaves. I suppose a world maker could put in sliders or toggles of sorts, or draw out the distance of the audio zones. Usually, though, if you’re at a venue, a reasonable person is going to expect to hear the performer, and anyone being disruptive while a live concert is going on could get muted and reported. If they don’t want to hear the music they can, as you say, just mute it or dial it down; it’s their choice really. I’ve seen some clubs that have quiet zones (or designated separate voice zones), but again, as with request #2, it’s really up to the world maker to build this.

I haven’t been to a lot of clubs yet myself, though most seem to just have a simple media player that they can lock down as needed, and that seems to work.

This is an interesting and important topic for me, and I hope others chime in with additional input. Perhaps we can all come to a good consensus and build out a guide for world makers on how to give the best venue to the many live performers out there? Or a guide for non-technical musicians on how to set up their studio for playing within VRChat? Thus far, I haven’t really seen any official documentation or recommendations on this; most of what I’ve learned has been word of mouth from other musicians and DJs.

Semi-related to this, I’ve been meaning to find a world that accepts an RTSP URL and plays it in low latency mode. iwaSync3 will accept RTSP, but it has high and variable latency: 7-15 seconds, and not consistent between watchers. I’ve seen one world by a DonK that accepts RTSP and does low latency, but it’s got the video output mounted a bit high. I think it’s 2 seconds in low latency?

I’ve seen Twitch streamed to VRChat. I’m not sure about YouTube Live; I’ve read one reference saying it won’t work, but I haven’t tried.

I think club worlds have the URL hard coded, and on days they’re live the video player knows to connect, and maybe on off days the world is a version without the URL, or it knows to not connect.

I guess secondary audio streaming within VRChat could work, but I’m already in the habit of greatly turning down anyone walking around playing music into VRChat. Some people might need a reminder about volume control rather than immediately reaching for mute. But maybe mute is remembered and lowering the volume is per instance. Maybe an icon in the nameplate could indicate an available secondary audio program, and then users could enable it.

In general VRChat focuses on making things possible client side, and it’s up to the community at large to do the rest. VRCDN, for example, is a service for live streaming into VR with no affiliation to VRChat.

I think in general there are three layers to figure out:

Venue, streaming service, and local setup.

TopazChat is a service I’ve seen hosted in Japan that seems intended for small performances. You can use either their program or OBS to stream to the service. The documentation is Japanese only, but it worked the time I tried.