Technology
Beyond Screens: Spatial computing and the dynamics of intent.
Warren Schramm | Technical Director
With more than two decades of information technology experience, ranging from product development to enterprise architecture, Warren provides a 360-degree perspective of technology.

As tech interfaces continue to evolve, spatial computing takes natural immersion further by joining the physical with the virtual. Unlike augmented or mixed reality, spatial computing places apps on virtual floating screens projected onto our real environment through a headset or other wearable device. The resulting interface resembles a full monitor, with the same functionality as an app’s desktop or mobile versions.
If you place a virtual movie screen over your coffee table and turn your head, the screen stays over the coffee table, as though it were an object in the real world. Supporting all this processing power in a tiny headset is a feat of techno-wizardry, but the real magic is spatial computing’s new predictive interaction model, which developers use to infer a user’s intent. Spatial computing will help computers understand the elements in our world and how we shift our attention, leading to new collaborative interfaces that feel as natural and conversational as a visit with a hairdresser.
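To make the idea concrete, here is a minimal sketch, in plain Python rather than any headset vendor’s API, of why a world-anchored screen stays put: its pose is stored in world coordinates, and every frame the renderer re-expresses that fixed pose in the headset’s moving frame. The Pose2D and world_to_head names are illustrative only.

```python
# A minimal sketch (not any vendor's API) of world anchoring: the screen's pose
# lives in world coordinates, and each frame it is re-expressed relative to the
# headset, so it appears fixed over the coffee table as the head turns.
import math
from dataclasses import dataclass

@dataclass
class Pose2D:
    x: float          # position in metres (top-down view for simplicity)
    z: float
    yaw: float        # heading in radians

def world_to_head(anchor: Pose2D, head: Pose2D) -> Pose2D:
    """Re-express a fixed world-space anchor in the headset's local frame."""
    dx, dz = anchor.x - head.x, anchor.z - head.z
    cos_y, sin_y = math.cos(-head.yaw), math.sin(-head.yaw)
    return Pose2D(
        x=dx * cos_y - dz * sin_y,
        z=dx * sin_y + dz * cos_y,
        yaw=anchor.yaw - head.yaw,
    )

# The movie screen is pinned above the coffee table, one metre ahead.
screen_anchor = Pose2D(x=0.0, z=1.0, yaw=0.0)

# As the user turns their head, the anchor's world pose never changes;
# only its position relative to the headset does.
for head_yaw in (0.0, math.pi / 4, math.pi / 2):
    head = Pose2D(x=0.0, z=0.0, yaw=head_yaw)
    print(world_to_head(screen_anchor, head))
```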
Spatial computing could understand context in the future.
Today, most applications act as a set of controls that lets a user do something, but there is a shift toward making applications predictive. For example, Microsoft Teams via Loop listens to your meetings and tries to generate action items. Many apps, from Google Docs to your phone’s touch keyboard, try to finish your sentences as you type. What would it be like if applications instead provided expertise or skills? Because spatial computing gives apps a way to understand the user’s intent, it has the potential to provide functional expertise.
Just as a film director tells the editor to use jump cuts or keep the pace of the action fast, a user could tell iMovie to bring in all the clips of their kid’s soccer match and do the same. Even beyond at-home projects, professional filmmakers and sound mixers could use similar algorithms to collaborate with advanced editing software. Natural language interfaces enable voice commands like, “Bring up Sally’s dialog, it’s hard to hear her over the background.” Or, even better, the app would realize Sally’s audio was quiet and make the change automagically. Interfaces as smart as these would need to understand context, and spatial computing provides a way to collect and process it.
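As a rough illustration of acting on intent rather than controls, the hypothetical sketch below measures each dialog track’s loudness and raises any that sit well below the mix, the way “bring up Sally’s dialog” implies. The Track class, the -23 dB target, and auto_balance_dialog are invented for this example, not a real editor’s API.

```python
# A hypothetical sketch of an editor acting on intent: find dialog tracks that
# sit noticeably below a target loudness and raise them automatically.
from dataclasses import dataclass

@dataclass
class Track:
    name: str
    loudness_db: float            # average loudness of the track
    gain_db: float = 0.0          # adjustment the app applies

def auto_balance_dialog(tracks: list[Track], target_db: float = -23.0,
                        tolerance_db: float = 3.0) -> list[str]:
    """Raise any dialog track that is noticeably quieter than the target."""
    actions = []
    for track in tracks:
        shortfall = target_db - track.loudness_db
        if shortfall > tolerance_db:
            track.gain_db += shortfall
            actions.append(f"Raised {track.name} by {shortfall:.1f} dB")
    return actions

session = [Track("Sally dialog", loudness_db=-31.0),
           Track("Background ambience", loudness_db=-24.0)]
print(auto_balance_dialog(session))   # ['Raised Sally dialog by 8.0 dB']
```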
Can an app learn culture?
Following cultural and social norms comes naturally to humans; creating algorithms that teach an app to do the same is far more difficult. For years, the tech industry has been on a quest to understand context and intent. Location-based services on our phones let apps guess when we are heading home, and can suggest destinations based on calendar invites. Using pattern recognition, Siri will suggest opening YouTube at noon on weekdays for my lunch break. Siri can also read incoming texts on my walks, but her response is always the same: “Would you like to reply?” With more information about my patterns and the context of the message, Siri should be able to know whether I’m likely to reply, and anticipate when I’ll start speaking. If Siri were incorporated into my routine through spatial computing, she could read my reactions in real time, and therefore predict my intentions and preferences over time.
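Purely as an illustration (this is not how Siri actually works), a prediction like that could start as nothing more than a weighted score over context signals: who sent the message, whether it asks a question, whether my hands are free, and how often I reply to that sender. The MessageContext fields and weights below are assumptions for the sketch.

```python
# An illustrative heuristic for the kind of prediction described above: given
# the context of an incoming text, estimate whether the user is likely to reply
# before asking "Would you like to reply?".
from dataclasses import dataclass

@dataclass
class MessageContext:
    sender_is_close_contact: bool   # family member or frequent thread
    is_question: bool               # the message asks something
    user_is_walking: bool           # hands and eyes are free
    replied_to_sender_rate: float   # past reply rate to this sender, 0..1

def likely_to_reply(ctx: MessageContext) -> bool:
    score = 0.0
    score += 0.3 if ctx.sender_is_close_contact else 0.0
    score += 0.3 if ctx.is_question else 0.0
    score += 0.1 if ctx.user_is_walking else 0.0
    score += 0.3 * ctx.replied_to_sender_rate
    return score >= 0.5

ctx = MessageContext(sender_is_close_contact=True, is_question=True,
                     user_is_walking=True, replied_to_sender_rate=0.9)
print(likely_to_reply(ctx))   # True: skip the prompt and just start listening
```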
Eye tracking presents potential.
Spatial computing makes three significant new pieces of data available for apps to work with: where the user is looking, what they’re doing with their hands and body, and what objects are around them. Eye tracking is likely the most significant of these three, identifying what the user is interested in without requiring them to move a mouse or touch anything. In spatial computing, simply looking at an app can bring it into focus.
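A sketch of those three signals as a simple per-frame data structure, along with the most basic use of the first one, might look like the following; SpatialFrame, window_to_focus, and the window names are illustrative, not any platform’s real API.

```python
# A sketch of the three new signals as a per-frame data structure, plus the
# simplest possible use of the first one: whichever app window the gaze ray
# lands on receives focus.
from dataclasses import dataclass, field

@dataclass
class SpatialFrame:
    gazed_window_id: str | None       # window the eye-tracking ray intersects
    hand_gesture: str | None          # e.g. "pinch", "point", or None
    nearby_objects: list[str] = field(default_factory=list)  # recognized items

def window_to_focus(frame: SpatialFrame, current_focus: str) -> str:
    """Bring the app the user is looking at into focus, else keep the current one."""
    return frame.gazed_window_id or current_focus

frame = SpatialFrame(gazed_window_id="mail", hand_gesture=None,
                     nearby_objects=["coffee table", "keyboard"])
print(window_to_focus(frame, current_focus="browser"))   # "mail"
```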
With gaze detection, a system would know when a user is reading information from a PDF in one window and when they’re writing about it in another. Imagine working on your laptop while waiting for a message to come in from a coworker in Slack. As you do your work, you’ll likely be glancing over to Slack to see if there is a new message. With eye tracking, the system could learn to recognize this distraction and offer to pause whatever you’re doing the second a new message comes in, so you can stay focused on your other tasks in the meantime.
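One hedged way to sketch that behavior: count how often the user’s gaze leaves the document for the messaging window within a short span, and once the glance rate suggests they are waiting on a reply, offer to alert them the moment a message arrives. The GlanceWatcher class, window names, and thresholds below are assumptions for illustration.

```python
# A sketch of the Slack-glancing scenario: track how often gaze enters the
# messaging window within the last minute, and once the glance rate suggests
# the user is waiting on a message, offer to handle the interruption for them.
from collections import deque
from time import monotonic

class GlanceWatcher:
    def __init__(self, watched_window: str, window_s: float = 60.0,
                 glance_threshold: int = 4):
        self.watched_window = watched_window
        self.window_s = window_s
        self.glance_threshold = glance_threshold
        self._glances: deque[float] = deque()

    def on_gaze_enter(self, window_id: str) -> bool:
        """Record each glance; return True once it's time to offer help."""
        if window_id != self.watched_window:
            return False
        now = monotonic()
        self._glances.append(now)
        while self._glances and now - self._glances[0] > self.window_s:
            self._glances.popleft()
        return len(self._glances) >= self.glance_threshold

watcher = GlanceWatcher("slack")
for gazed in ("doc", "slack", "doc", "slack", "doc", "slack", "doc", "slack"):
    if watcher.on_gaze_enter(gazed):
        print("Offer: keep working, I'll tell you the moment a message arrives.")
```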
The next generation of spatial computing.
In the absence of neural interfaces tracking our thoughts, observed behavior will train cooperative interfaces. AI that is empowered with context will lead to truly engaging, collaborative user experiences. Applications will be the experts in their field, proficient in editing videos, producing songs, writing code, and coordinating meetings. The next generation of spatial computing applications will feel more like a personal entourage supporting your interests and the way you move through the world, rather than a collection of widgets hovering in your field of vision.