Nokia has demonstrated the first spatial audio phone call over a cellular network. The call used the Immersive Voice and Audio Services (IVAS) codec, part of the next-generation 3GPP standards. While we’ve all heard spatial audio before, doing it over a cellular connection is an industry first.
Nokia is a big contributor to the IVAS codec standards, which are part of 5G Advanced—the next generation of 5G that companies and researchers are hashing out to give us even more capability when it comes to using both our phones and other products like VR and gaming devices.
I’ll admit, I think things like this are cool because I love seeing what smart people with huge budgets can come up with. But this is also the kind of news that needs a bit of explaining.
What is spatial audio?
You’ve probably heard or read about spatial audio. You’ve also experienced it even if you didn’t know it at the time.
The easy way to describe spatial audio is with an example: you’ve almost certainly listened to music through a pair of headphones and noticed different sounds in each ear. That’s a basic example of spatial audio at work.
Spatial audio is multidirectional sound that makes your brain think a sound is coming from a certain direction, and we hear it in movies and games. It’s well supported by Apple, Microsoft, and Google, so almost any device you buy can do it. Google is adding the feature to Chromebooks, too.
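If you want to see the basic idea in code, here is a minimal Python sketch of the simplest direction cue there is: playing the same sound at different volumes in each ear (constant-power panning). This is my own illustration, not how IVAS or any shipping product works. The function names `pan_gains` and `pan_mono` are made up for the example, and real spatial audio renderers typically add timing and filtering cues on top of plain level differences.

```python
import math

def pan_gains(azimuth_deg):
    """Return (left_gain, right_gain) for a constant-power pan.

    -90 is hard left, 0 is center, +90 is hard right.
    Hypothetical helper for illustration only.
    """
    # Clamp to the supported range.
    az = max(-90.0, min(90.0, azimuth_deg))
    # Map [-90, 90] degrees onto an angle in [0, pi/2].
    theta = (az + 90.0) / 180.0 * (math.pi / 2)
    # cos/sin keep left^2 + right^2 == 1, so perceived loudness stays steady.
    return math.cos(theta), math.sin(theta)

def pan_mono(samples, azimuth_deg):
    """Turn mono samples into (left, right) pairs placed at azimuth_deg."""
    left_gain, right_gain = pan_gains(azimuth_deg)
    return [(s * left_gain, s * right_gain) for s in samples]

if __name__ == "__main__":
    # A voice placed 45 degrees to the right is louder in the right ear.
    left, right = pan_gains(45)
    print(f"left={left:.3f} right={right:.3f}")
```

Feed `pan_mono` a list of mono samples and an angle, and the result is a stereo signal that seems to sit off to one side. That louder-in-one-ear trick is the same cue you notice in an ordinary stereo music mix.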
Some communications software already uses spatial audio, like Google Meet or Microsoft Teams. As long as you have hardware that supports it and software that can deliver it, things can sound like they are coming from the proper direction in real time.
Right now, the supporting “software” isn’t available for use over a phone’s voice connection. You might be able to experience it via apps like Meet mentioned above, but those use your phone’s data connection, not the regular voice channel.
Incorporating the IVAS codec into the 3GPP (3rd Generation Partnership Project) standards means any phone on any network that implements the standard can use spatial audio in both voice calls and “standard” video calls through your phone provider.
Do we really need this? Why is this important?
Do we need this? No. We also don’t need 4K displays on phones, expensive smartwatches, or phones that fold in half. Just as you may want one of those things, someone else may want spatial audio on a phone call.
Spatial audio in a phone call does sound like an unnecessary gimmick. Do you need to know which direction a passing car is coming from when someone is calling you, or do my coworkers need to know which shoulder my screaming parrot is sitting on? Nope. But it could be handy in conference calls where more than one person is in the room or in some other imaginary scenario that probably isn’t going to happen.
But that’s not why this is being worked on or why it might matter.
Think back to before we had 5G and all the promises we heard. 5G isn’t just about your phone. 5G is about the Internet of Things.
I hate that phrase as much as you do, but just because it gets abused and thrown around until we’re sick of hearing it doesn’t mean it’s not a real thing. Plenty of the products we use now could benefit from spatial audio over 5G, like a security camera with audio enabled. Other things could just be a little cooler with it enabled, like a gaming handheld that uses a SIM card.
With a standard in place, we wouldn’t have to depend on support from the people who make the products and the software that powers them. As long as they follow accepted worldwide standards, it will work.
I’m not going to get hyped over a spatial audio call like the Finnish Ambassador of Digitalization and New Technologies did, because I don’t feel a need for immersive and rich phone calls. Maybe I’m old-fashioned that way.
I do look forward to seeing the next cool stuff tech companies can invent, and the more tech available for them to leverage, the better. Bring on the immersion!