VR’s Biggest Problem and How We Could (Maybe) Solve It
When you see yourself move through a VR world while your inner ear doesn’t feel any movement, you get motion sickness.
And that’s a huge problem, because it doesn’t matter how good the gameplay or narrative or art is…if you feel like throwing up the entire time.
And, yes, you can get used to motion sickness and many VR players have, but good luck asking most people to go through weeks of genuine discomfort just to play games that may or may not be good.
Why not just play Baldur’s Gate 3 instead?
So, unless we want to immediately lose half of our potential audience, we end up having to either (a) not have any movement or (b) spend a lot of time designing a custom, restricted movement system that doesn’t cause motion sickness.
Either way, movement is one of the most important actions in all of videogames, so we’ll have to spend a lot of effort just to make up for what we lost by not being able to do proper movement.
Just imagine trying to make an FPS game without flanking, a MOBA without the ability to sidestep abilities, or an action-adventure game without running and jumping.
Yes, there’s always a solution. But it takes a lot of development time away from other things.
And so, without proper movement, it ends up being very difficult to create games with anywhere near the depth of a PC or console game.
And that’s not even the worst part. Because, even if we avoid movement entirely, we’ll still have to fight a never-ending battle against motion sickness.
As it turns out, motion sickness can be triggered by all sorts of seemingly random things, like a UI popping up in the wrong way or an enemy sword bouncing around your peripheral vision.
And even small amounts of motion sickness can stack up over 30 minutes or more to create an inexplicable feeling of discomfort that players won’t necessarily know is motion sickness.
They’ll just come away not enjoying your game “for some reason”, which makes testing a huge pain.
So, every time a developer makes a VR game, they’ll have to spend months of time just on testing for motion sickness and coming up with ways to mitigate it.
Imagine how much time the market has collectively wasted on this problem.
So, if we sum up all these problems and consider that VR is already a multi-billion dollar market, I don’t think it’s radical to propose that motion sickness is a billion dollar problem.
And that’s not even taking into account how much bigger the VR market could be if motion sickness wasn’t a thing and we’d be able to make much better games.
Ok, it’s clearly a problem. But what do we do about it?
A GENERAL approach
Remember that motion sickness was all about the mismatch between seeing yourself move with your eyes, but not feeling it with your inner ear.
So there are three important systems that matter when it comes to motion sickness:
The visual motion processing system that takes in inputs from your eyes
The vestibular motion processing system that takes in inputs from your inner ear
And the system that compares outputs from the two and detects a mismatch
Now, while the latter two are very important for motion sickness, we don’t really have easy access to them as VR developers:
To access the vestibular system, we’d have to exert influence inside your inner ear. That’s not impossible to do and there are already pills and vibrating devices that can do that. But we’d have to convince players to buy those devices or pop some pills, which wouldn’t be easy.
And, to access the comparison system, we’d have to affect something deep inside the brain, likely the medial superior temporal cortex (MST). And that seems way harder to access than your inner ear, especially since we don’t even know much about that system yet.
So we believe it’s best to focus on the visual motion processing system, since there’s already a screen on the player’s face and we won’t have to try and sell hardware to anyone. We already have free access to their eyes.
Our goal, then, is to trick that system into NOT seeing any movement when you’re moving around in VR, which should prevent the mismatch from occurring in the first place.
And we think that messing with this system should work, since we’re accidentally already doing that in VR — you’re not actually moving around in a 3D world, you’re just seeing pixels on a 2D screen.
Plus, it’s also just very easy to fake movement with optical illusions, so it should be possible to cheat the system with even relatively simple approaches.
That’s why we think it’s worth putting our eggs into this basket.
Nothing is actually moving in this illusion by Dr. Kitaoka.
Visual motion processing
To trick the visual motion processing system, then, we’ll have to understand EXACTLY what that system sees.
So that's what we've been looking into over the past year.
And it turns out that, according to the neuroscience literature, motion processing happens mostly through the magnocellular pathway (M-pathway) to the visual cortex.
But what’s more important is what that system “sees”:
It only sees fast movement (high temporal frequency)
It won’t see things that are moving very slowly. We can intuitively experience this in how animations stop feeling fluid below around 10 frames per second. Movement below a certain speed doesn’t register that strongly.
But it’ll see much faster things than most other systems involved in vision. That’s why you’re able to recognize that something moved, even if you didn’t recognize, say, its form or color.
We ran some unofficial tests in VR and, yeah, very slow movement doesn’t really cause motion sickness. Neither does insanely fast movement or movement over a very short period of time.
Studies have found that around 10 Hz is its preferred frequency (Mikellidou et al., 2018), that it starts dropping off hard above 30 Hz (Derrington and Lennie, 1986), and that above 20 Hz most other components of your vision stop being able to keep up (Skottun, 2013; Pawar et al., 2019; Chow et al., 2021; Edwards et al., 2021).
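To make this concrete, here’s a rough Python sketch of how you could estimate whether a given movement even lands in this system’s sensitive band. The speed × spatial frequency relation is the standard one for drifting patterns; the band edges are just the ballpark figures cited above, and the function names (plus the ~1 Hz lower bound) are our own assumptions:

```python
# Sketch: estimate the temporal frequency a drifting pattern produces and
# check it against the M-pathway's rough sensitive band. The band edges
# are ballpark figures from the studies cited above, not hard limits.

def temporal_frequency_hz(speed_deg_per_s: float, spatial_freq_cpd: float) -> float:
    """Temporal frequency of a drifting pattern = speed x spatial frequency."""
    return speed_deg_per_s * spatial_freq_cpd

def likely_motion_cue(speed_deg_per_s: float, spatial_freq_cpd: float) -> bool:
    """Heuristic: does this motion land in the band the system responds to?"""
    tf = temporal_frequency_hz(speed_deg_per_s, spatial_freq_cpd)
    # Below ~1 Hz barely registers (our assumption); ~30 Hz is the cited drop-off.
    return 1.0 <= tf <= 30.0

# A 1 cpd grating drifting at 10 deg/s produces 10 Hz: right at the sweet spot.
print(likely_motion_cue(10.0, 1.0))  # True
```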
It only sees big things (low spatial frequency)
It won’t see small details. Specifically, it stops responding to details finer than about 1 cycle per degree (cpd). That’s about the size of a small tree that’s 10 meters away (Skottun and Skoyles, 2010; Edwards et al., 2021).
Our unofficial tests in VR showed that the stripes that caused the most motion sickness were indeed around the 1-1.5 cpd range. Stripes smaller than that didn’t do much.
An approximation of what the motion processing system would see if it filtered for <1.5 cpd spatial frequency (Edwards et al., 2021)
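If you want to approximate this view yourself, here’s a minimal Python sketch using a Gaussian blur as a stand-in for a <1.5 cpd low-pass filter. The pixels-per-degree value depends on your headset (20 px/deg is our ballpark assumption), and the sigma formula just puts the blur’s half-amplitude point near the cutoff:

```python
import numpy as np
import cv2

def m_pathway_view(img_gray: np.ndarray, pixels_per_degree: float,
                   cutoff_cpd: float = 1.5) -> np.ndarray:
    """Approximate what <cutoff_cpd vision sees, via a Gaussian low-pass.

    pixels_per_degree varies per headset; ~20 px/deg is our rough
    assumption for current HMDs. Measure it for your device.
    """
    f_c = cutoff_cpd / pixels_per_degree                # cutoff in cycles/pixel
    # Pick sigma so the Gaussian's half-amplitude point lands near the cutoff.
    sigma = np.sqrt(np.log(2.0) / 2.0) / (np.pi * f_c)  # in pixels
    return cv2.GaussianBlur(img_gray, (0, 0), sigma)

img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
blurred = m_pathway_view(img, pixels_per_degree=20.0)
cv2.imwrite("m_pathway_view.png", blurred)
```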
It only sees brightness (luminance contrast)
It (mostly) does not see color, which we can experience when an object moves past us super fast and appears only as a shadow.
Color is still somewhat involved in motion processing, but brightness carries the overwhelming load (Aleci and Belcastro, 2016; Edwards et al., 2021).
It can apparently see brightness contrasts as low as 0.5% (Aleci and Belcastro, 2016) and responds clearly to contrasts of around 4-8% (Butler et al., 2007).
Our own experiments revealed that there was a pretty substantial drop-off in motion sickness when we went below 20% brightness contrast (in whatever units Unity uses) and an even bigger one below around 5%. Motion sickness also didn’t get much worse going from 40% to 100%.
Keep in mind that it isn’t tricked by global brightness changes, like what happens when the sun pops up from behind the clouds.
This is because it has cells that look for brightness changes in a particular direction, as well as cells that look for brightness changes that cover all directions. If the latter cells spot a brightness change that covers all directions, they send a signal that cancels out the signal from the directional cells and there’s no perception of movement (Im and Fried, 2016).
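And here’s a small sketch of how you could measure a patch’s contrast to compare against those thresholds. We’re using Michelson contrast as one standard definition; the studies above (and our Unity tests) may well use different contrast metrics, so treat the numbers as rough:

```python
import numpy as np

def michelson_contrast(luminance_patch: np.ndarray) -> float:
    """Michelson contrast of a luminance patch, 0..1 (x100 for %)."""
    lmax, lmin = float(luminance_patch.max()), float(luminance_patch.min())
    if lmax + lmin == 0:
        return 0.0
    return (lmax - lmin) / (lmax + lmin)

# Our (rough) playtest figures: above ~20% contrast clearly contributes to
# motion sickness, below ~5% much less so.
patch = np.random.rand(32, 32)
c = michelson_contrast(patch)
print(f"{c:.0%}", "above rough 20% threshold" if c > 0.20 else "below")
```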
What it sees depends on speed, spatial frequency, and temporal frequency
It sees fast motion at low spatial frequencies just as well as slow motion at high spatial frequencies (Mikellidou et al., 2018)
More reading: Wichmann and Henning, 1998; O’Carroll and Wiederman, 2014
Contrast sensitivity = how easily people can tell that something is moving (Mikellidou et al., 2018)
It’s more sure of what it sees if there are lots of edges in different orientations that move together
Intuitively, there’s a pretty low chance of random crap all moving together in the same way, unless it’s you that’s moving. So the system probably relies on a lot of stuff in different orientations moving together in a correlated way.
More on this: Diels and Howarth, 2009; Palmisano et al., 2015
Not directly related, but it probably feels much worse to move diagonally than straight forward in VR because you’re seeing things move along both the z- and x-axes, rather than just one. That’s double the visual information and therefore double (or more) the mismatch with what your inner ear is feeling.
Isotropic noise should theoretically be the most effective at causing motion sickness, since it’s noise “in all orientations”. Image from 3Delight Cloud
It sees things in peripheral vision
It’s much better at responding to stuff that’s happening in your peripheral vision than other systems involved in vision.
You can try this yourself by noticing how much harder it is to see an object far in your peripheral vision when it’s stationary than when it’s moving. Moving objects are much more noticeable in peripheral vision, since this system is taking care of that.
More reading: Baseler and Sutter, 1997
It gets used to things after they move at a constant speed for a while
Basically, it’ll stop responding as strongly to the constant movement and start looking for smaller relative speed differences between objects in a scene (Clifford and Wenderoth, 1999).
This is probably because relaying the same “you’re going this fast” information over and over again isn’t particularly useful, so it’s mostly communicating important changes (i.e., accelerations).
One study found that this adaptation started to happen after 3 seconds at the neuron level (van Wezel and Britten, 2002). Our crappy tests found that it started to happen after around 5-10 seconds at the same constant speed.
This effect is why games where you constantly run forward (like Pistol Whip) don’t cause much motion sickness.
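A minimal way to model this adaptation, if you want to play with it: a leaky baseline that slowly tracks constant speed, so the “response” mostly reflects changes. The ~5 second time constant and 72 fps refresh below are our rough assumptions, not published values:

```python
# Sketch of the adaptation idea: a leaky integrator tracks the recent
# average speed, and the "response" is the deviation from that baseline.

def adapted_response(speeds, dt=1/72, tau=5.0):
    """speeds: per-frame visual speed; returns a change-sensitive response."""
    baseline, out = 0.0, []
    alpha = dt / tau
    for s in speeds:
        baseline += alpha * (s - baseline)  # slowly adapt to constant speed
        out.append(s - baseline)            # respond mostly to changes
    return out

# Constant 3 m/s for 10 s at 72 fps: response decays from ~3.0 toward ~0.4.
resp = adapted_response([3.0] * 720)
print(round(resp[0], 2), round(resp[-1], 2))
```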
It sees nothing during eye movements and is most sensitive right after them
Our eyes do small, imperceptible movements multiple times a second to correct for drift. Plus, they’ll also do bigger movements as you focus on new things in your environment. These movements are called saccades.
To prevent us from confusing our own eye movements with actual movements in the scene, our motion processing turns off during that small period of time (Binda and Morrone, 2018).
There’s also a moment right after a saccade where the system tries to catch up to what it missed by being extra sensitive to motion (Frost and Niemeier, 2016).
The system also obviously turns off when you’re blinking.
So, if we do eye tracking to track blinks and saccades, we can use those periods to hide all sorts of stuff.
One well-known technique shifts the entire scene around during these moments, so that you can do what seems like walking forward, but actually be walking around in a circle perpetually (Sun et al., 2018).
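Here’s a hedged sketch of what using those windows could look like, in the spirit of Sun et al., 2018: when the eye tracker reports a blink or a saccade, sneak in a little extra camera yaw. The velocity threshold and rotation budget below are illustrative placeholders, not the paper’s exact values:

```python
# Saccade-masked redirection sketch. Thresholds are assumptions to tune.
SACCADE_SPEED_DEG_S = 180.0   # eye angular speed above this => saccade (assumption)
MAX_INJECTED_YAW_DEG = 0.5    # per-frame rotation budget (assumption)

def redirect_step(eye_speed_deg_s: float, is_blinking: bool,
                  desired_yaw_correction_deg: float) -> float:
    """Return how much extra camera yaw to apply this frame."""
    if is_blinking or eye_speed_deg_s > SACCADE_SPEED_DEG_S:
        # Motion processing is suppressed right now; sneak in some rotation.
        return max(-MAX_INJECTED_YAW_DEG,
                   min(MAX_INJECTED_YAW_DEG, desired_yaw_correction_deg))
    return 0.0
```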
And here are some other things we found:
It seems very good at dealing with visual noise. Noise at roughly 1 cpd spatial frequency seemed to have the strongest effect, but no amount of noise we tried would significantly prevent motion sickness. In fact, it would just cause more motion sickness unless that noise was exactly equal in all possible directions (see the sketch after this list).
It seems to care about perceptual depth. If the motion occurs in what we perceive to be the foreground against a stationary background, it apparently causes less motion sickness (Nakamura, 2006). This is likely because we don’t want to falsely think we’re moving if some objects in the foreground move. But the entire background is unlikely to move unless we’re the ones moving. We briefly tested this and it appears to be true, but more testing is needed.
Having no brightness contrast at all might make motion sickness worse (Bonato et al., 2004). But we couldn’t replicate this when we tried it.
It might be turned off if there’s diffuse red background and rapid flickering (Lee et al., 1989; Hugrass et al., 2018; Edwards et al., 2021). But we couldn’t replicate this either.
It might struggle when there are lots of objects right next to each other (Atilgan et al., 2020). Also couldn’t replicate this.
There are mathematical models of neurons in this system that I haven’t bothered to look into much, so maybe you can look into them. Here’s some reading on that: Adelson and Bergen, 1984; Simoncelli and Heeger, 1998; Mather, 2013
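If you want to experiment with isotropic noise like the kind mentioned above, here’s a minimal sketch: white noise band-passed with a radially symmetric (direction-free) filter centered near 1 cpd. The pixels-per-degree and bandwidth values are assumptions to tune per headset:

```python
import numpy as np

def isotropic_noise(size=512, pixels_per_degree=20.0, center_cpd=1.0,
                    bandwidth_cpd=0.5, seed=0):
    """White noise band-passed with a radially symmetric (isotropic) filter."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((size, size))
    f = np.fft.fftfreq(size)                    # frequencies in cycles/pixel
    fx, fy = np.meshgrid(f, f)
    radius_cpd = np.hypot(fx, fy) * pixels_per_degree
    # Gaussian ring in the frequency domain: same energy in every direction.
    ring = np.exp(-((radius_cpd - center_cpd) ** 2) / (2 * bandwidth_cpd ** 2))
    filtered = np.fft.ifft2(np.fft.fft2(noise) * ring).real
    return filtered / np.abs(filtered).max()    # normalize to [-1, 1]
```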
How to trick the system
Ok, so we have a system that only responds to specific things in the image you see. Let’s call these motion cues.
And, importantly, some of these motion cues are things that the rest of your vision either doesn’t see or sees poorly.
This means we can theoretically do two things to reduce the mismatch between your visual and vestibular systems:
Remove motion cues from the image you see, in a way that minimally affects the rest of your vision
Add motion cues from a stationary environment, in a way that minimally affects the rest of your vision
The reason why we’ll probably need the latter too is that, even if we remove most motion cues from the image, there are no counteracting cues when you’re fully in VR.
So the motion processing system may simply shift to relying on the few remaining cues and still trigger at least some motion sickness.
That’s probably why fully blocking off the player’s peripheral vision (with a vignette effect) doesn’t fully solve motion sickness. Some cues still seep in from the center of the screen.
Plus, it might also be the case that motion cues get increasingly difficult to remove beyond some threshold.
So, since adding in cues from a stationary room might reduce how much removal we’ll need to do, we might be able to fully stop motion sickness with less overall visual disruption.
But, anyways, here’s an example of what this adding and removing process might actually look like: we already know that motion processing is mostly about brightness contrast in your peripheral vision.
So, to reduce motion sickness, we could remove ONLY brightness contrast in your peripheral vision, keeping color the same in peripheral vision and not touching your central vision at all.
And we could also add brightness blobs from a stationary environment, so that you get the motion cues you’d expect if you were just chilling in your room and moving your head around.
If we do all that, we’d theoretically remove motion sickness in a way that isn’t very perceptible.
From top to bottom: a color gradient with all brightness differences removed > a more normal gradient > greyscale version of the gradient with no brightness differences > greyscale version of the normal gradient. Note how we just removed brightness without touching color. Image source: Ottosson, 2020
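As a sketch of what that removal step could look like in practice: convert to a perceptual color space, flatten the lightness channel in the periphery, and leave color alone. We use OpenCV’s CIELAB here as a stand-in for something like Oklab (Ottosson, 2020), and the fixed screen-centered falloff is a simplification; a real version would follow gaze via eye tracking:

```python
import numpy as np
import cv2

def flatten_peripheral_luminance(img_bgr: np.ndarray,
                                 center_radius_frac: float = 0.3) -> np.ndarray:
    """Flatten L (lightness) outside a central region, keeping color intact.

    center_radius_frac is an assumed screen-centered falloff; a gaze-
    centered mask from eye tracking would be the real implementation.
    """
    lab = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    L = lab[..., 0]
    h, w = L.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot((yy - h / 2.0) / h, (xx - w / 2.0) / w)  # normalized radius
    # 0 in the center (keep L), ramping to 1 in the periphery (flatten L).
    mask = np.clip((r - center_radius_frac) / 0.15, 0.0, 1.0)
    lab[..., 0] = L * (1 - mask) + L.mean() * mask
    return cv2.cvtColor(lab.astype(np.uint8), cv2.COLOR_LAB2BGR)
```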
So the general idea is to (1) do further research into what counts as a motion cue and then (2) go through each cue and figure out ways to add or remove it in a (3) minimally perceptible way.
Here are a few more examples to illustrate what that could look like:
Sees low spatial frequency data = use a high-pass filter to remove those low frequency motion cues from the image. And then add low frequency cues from a stationary room to maximally communicate that you’re not moving anywhere.
Sees high temporal frequency data = remove stuff that’s moving fast from the image and then add fast-moving stuff from your stationary room
Sees directional brightness changes = make brightness blobs move in the exact opposite direction of where the objects in the scene are moving (or use the earlier optical illusion to counteract the direction of movement)
Left = applying a high-pass filter to remove low frequency information. Right = if we now apply a blur that represents how your motion processing system sees things, it no longer sees anything from the filtered image.
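The high-pass version is just the complement of the earlier low-pass sketch: subtract the blurred (low-frequency) image from the original and re-center on gray. The sigma below is illustrative; you’d derive it from your headset’s pixels-per-degree as before:

```python
import cv2

# High-pass = original minus its low-pass content, re-centered on mid-gray.
img = cv2.imread("scene.png")
low = cv2.GaussianBlur(img, (0, 0), sigmaX=8.0)   # sigma: illustrative value
high = cv2.addWeighted(img, 1.0, low, -1.0, 128)  # img - low + gray offset
cv2.imwrite("scene_highpass.png", high)
```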
If we generalize even further, our goal is simply to make sure that the following basic inequality holds true:

S_mov ≤ S_stat

where S_mov is the total strength of motion cues from the moving game scene, and S_stat is the total strength of motion cues implying that you’re stationary.
It’s also possible that we’ll need S_mov itself to be below some threshold, even if there are enough stationary cues to theoretically counterbalance it.
It all depends on how the comparison between visual and vestibular inputs actually works.
Regardless, if we satisfy the above conditions, we should no longer have a mismatch strong enough to cause motion sickness.
But does this work? That’s what we’re testing right now.
It looks promising so far, but it’ll take a lot of time to test everything.
So maybe…you could help us? :)