For a long time now, it’s been clear in a broad sense what AI could do for us. AI was something we heard about. It worked from behind the scenes, powering investment decisions for quant traders, at the heart of computer vision for self-driving cars, and designing our next swipe on TikTok. But AI wasn’t something we experienced first-hand.
AI wasn’t something we interfaced with. It wasn’t something we felt.
Today, large language models have brought about an influx of AI applications that interface directly with end users. Now, we have real, palpable use cases of AI generating value for users at an individual level.
Today, we feel the presence of AI.
We’re at an inflection point. Everyone today has heard of generative AI, and most of us have touched it in one way or another. The question isn’t if we’ll make use of these capabilities – it’s how we’ll make the best use of them.
Imagine for a moment a world where people had no ability to understand or respond to the emotional needs of others. It would be cold, isolating, and incredibly difficult to navigate. While this may seem completely alien, it essentially describes the state of modern technology.
The goals of modern technology are not aligned with our goals as humans. This is evidenced by the loneliness epidemic we’re confronted with despite being in contact with one another more than at any other time in history. It’s evidenced by the fact that digitization has made healthcare the most accessible it has been in all of human history, and yet rates of mental illness and anxiety continue to climb.
As AI systems and software become ubiquitous, their lack of emotional intelligence will cause the world to feel increasingly inhuman. At UpBeing, we believe that software needs to be emotionally intelligent: it needs to consider how you’re feeling before and while interacting with you.
The question of whether or not AI understands people is a complicated one.
I believe that many AI models do understand people – at scale. When you prompt ChatGPT with a question on how to cope with stress, it understands that you’re stressed insofar as you are a person belonging to the macro-category of people who have experienced and documented stress through language. It understands, based on the prompts you’ve provided, that you are a stressed person.
But you’re not a stressed person in the general sense, you’re a stressed person in the individual sense. Your stress is the product of a combination of factors that only you have experienced: your sleep, your busy calendar, your motivation, your relationships, to name a few. The way that stress shows up, meanwhile, isn’t always easy to put into words. Stress is a psychological experience, but it’s also a physical, environmental, circumstantial and social experience. There are aspects of what it means to be stressed that we simply can’t capture in language.
By some estimates, as much as 90% of human communication is non-verbal. When language is the only source of knowledge, AI is operating on a (very) incomplete picture of human behaviour and wellbeing.
Imagine you’ve had a challenging day at work. You need to talk some things through. You need some advice.
You could step outside and go strike up a conversation with the first friendly stranger you see.
You could walk up and introduce yourself. You could explain to them that someone on your team at work recently left the company, so for the last two weeks you’ve been doing the work of two people – this has meant more meetings, longer hours, and less time for the things that normally bring you up.
They want to help you, but they don’t know you.
When you say you’re “busy”, this friendly stranger has no frame of reference. They don’t understand what your life looks like when it isn’t busy. They don’t understand what “busy” means for you. When you say you’re “stressed”, this friendly stranger has no context for what that means. They don’t know what your baseline is – are you often stressed, or is this unordinary for you? What does it look like when you aren’t stressed?
Even if they’re an especially observant stranger and they take note of your tone, your body language, and your general appearance, they don’t know how to interpret these signs without any prior understanding of you.
Many of us struggle to answer these questions even for ourselves.
This friendly stranger does the best they can with the knowledge available to them. Based on your description, they know that your stress is work-related, and they have some idea of what generally constitutes workplace stress. They have some idea of how you’re feeling based on the way you’ve described your thoughts and feelings to them at that moment. But the information they would need to truly understand you just doesn’t translate cleanly into language.
They’re a stranger to you, but you’re also a stranger to them. It’s an incomplete picture.
So, they do what a kind person ought to do. They infer what they can from your words. They offer sympathy. They offer the best advice they can come up with at that moment. They perform the actions that a compassionate person ought to when another person is stressed, but they can’t truly understand your stress.
What if, instead, you sought support from a thoughtful friend?
This friend knows that someone on your team at work recently quit, and that you’ve been having to compensate for their absence – this has meant more meetings, longer hours, and less time for the things that normally bring you up. They also understand what work usually looks like for you.
They know that this stress didn’t turn up yesterday, and they know that “stressed” here doesn’t only mean worried. So, they dig deeper. They investigate. They notice your body language: the shoulders hunched, the dragging footsteps. They notice the uncharacteristic mumble to your speech.
Stressed doesn’t only mean worried. Stressed, in this moment, means strained. It means exhausted, burnt out – at a breaking point.
Now, this thoughtful friend has known you for some time. They’ve seen you this way before. They’ve also seen you at your best: excited about work and full of energy. They know what your stress is and what it isn’t. They can infer the components of your stress that don’t translate cleanly into language.
They know what brings you up and they know what brings you down.
Together, you look back at what’s worked for you and what hasn’t. You think up a way to get through this challenging period of time.
A thoughtful friend wants to bring you up, but so does a friendly stranger. The difference between them isn’t their intention. The difference between seeking advice from a friendly stranger and a thoughtful friend is context.
Compassionate artificial intelligence doesn’t just need emotional alignment. It needs emotional alignment with emotional intelligence, and emotional intelligence can’t be built through language alone.
Language is only a representation of our experience as human beings. It’s a crucial connection point – but it’s nonetheless an act of translation and interpretation. One person attempts to translate the dynamic psychological, physiological, environmental and culturally-informed experience of “feeling stressed” into a limited selection of words all strung together neatly. The other person attempts to interpret and infer from that collection of words what those psychological, physiological, environmental, and cultural elements are that compose their wellbeing.
A thoughtful friend has the context to know what to look for and, as a thoughtful friend, they analyse what they’re observing. Their successful interpretation of your wellbeing isn’t just derived from language. It’s derived from a multitude of inputs and understandings layered on top of language.
As you remove any one of those inputs, the interpretation gets harder.
If you strip this friend of their ability to see you and hear your voice, the interpretation gets harder. If you strip them of their prior knowledge of you, it gets even harder. Combine this with the fact that only a tiny fraction of communication occurs through language, and interpretation seems near impossible. We’re left with 10%: language. And that 10% is imprecise at the best of times – especially when it comes to describing our feelings. The thoughtful friend becomes a friendly stranger.
This is how we interact with artificial intelligence today.
A friendly stranger, no matter how well intentioned, just doesn’t have the background knowledge to see the complete picture. They can listen, ask questions, and try to get a holistic idea of you, but they’re limited by their context. They have intention without understanding; alignment without intelligence.
But a thoughtful friend knows you. They’ve seen you through ups and downs, so they can be proactive and anticipate how you’re feeling. Through time and experience they’ve learned to see into the 90% that’s missing in language alone. They have intention with understanding; alignment with intelligence.
This is what we’re trying to accomplish at UpBeing.
We can create a language model intended to be a thoughtful friend: to help you feel your best, to listen, to bring you up. We can align it with the goals of human wellbeing. But intention alone doesn’t make the friendly stranger a thoughtful friend. Intent is only part of the equation. The other part is having the context to understand why you’re feeling a certain way – that’s why we built UpBeing.
UpBeing explores why you feel the way you do by finding connections between your behaviours, your environment, and your feelings. UpBeing is designed to understand your wellbeing.
But your wellbeing isn’t simple. The context in which your wellbeing exists isn’t simple. It’s physical. It’s psychological. It’s environmental. It’s social. That’s why UpBeing doesn’t just look at one of these things for context. It looks at all of them.
The more time you spend with UpBeing, the more meaningful the connections it can draw between what you do and how you feel, and the more concise, actionable its insights become.
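As an illustration only – the names, thresholds, and data below are ours, not UpBeing’s actual product code – the kind of connection-drawing described above can be sketched as a simple correlation between one behavioural signal and self-reported mood:

```python
# Hypothetical sketch: correlate a daily behavioural signal (e.g. sleep)
# with self-reported mood, then phrase the result as a plain-language insight.
from math import sqrt

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def insight(signal_name: str, signal: list[float], mood: list[float]) -> str:
    """Turn a correlation into a short, readable observation."""
    r = pearson(signal, mood)
    if r > 0.5:
        return f"On days with more {signal_name}, your mood tends to be higher (r={r:.2f})."
    if r < -0.5:
        return f"On days with more {signal_name}, your mood tends to be lower (r={r:.2f})."
    return f"No strong link between {signal_name} and mood yet (r={r:.2f})."

# Two weeks of made-up data: hours slept and a 1-10 mood rating.
sleep = [7.5, 6.0, 8.0, 5.5, 7.0, 8.5, 6.5, 7.5, 5.0, 8.0, 6.0, 7.0, 8.5, 5.5]
mood  = [7,   5,   8,   4,   6,   9,   5,   7,   3,   8,   5,   6,   9,   4]
print(insight("sleep", sleep, mood))
```

A real system would of course weigh many signals at once (physical, psychological, environmental, social) and control for confounders; this toy example only shows the shape of the idea.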
At scale, UpBeing has the potential to generate novel understanding of human wellbeing as a whole. With UpBeing providing the context, an LLM can do more than just understand people. It can understand the person interacting with it.
At the end of the day, AI exists in the cloud. AI isn’t a friend.
We’ve drawn a thread between emotionally intelligent software and a thoughtful friend – but it’s simply a thought exercise. It’s a way for us, as humans, to think about our mission with UpBeing. The analogy can only go so far – we can’t give AI intention in the way that we have it as people.
Computer intelligence doesn’t exist as language – it exists, in its purest form, as binary code. Human intelligence doesn’t exist only as language either – it exists in our neural circuitry, in abstract thought, in our ability to synthesise thousands of external and internal inputs all at once.
What we’ve identified through large language models is common ground: an interface through which computer intelligence can translate its knowledge to human intelligence, and through which human intelligence can translate its knowledge to computer intelligence.
Natural language is only the interface. Language alone is not the path to creating artificial intelligence that possesses emotional intelligence because language alone doesn’t capture what it is to be human. It’s only a translation – an interpretation – of the human experience.
To be able to communicate with and leverage computer intelligence through natural language is a groundbreaking accomplishment. This is undeniable. Large language models are an opportunity for us to access and interact with insights, knowledge, and intelligence that were previously inaccessible to most humans.
But generative AI and large language models aren’t the answer to all problems of human-computer interaction. They are, however, a vital part of the answer. They are a tool for us to interact with computer intelligence: the most powerful tool yet.
Today’s large language models aren’t built to understand you and your wellbeing. They’re built to understand people and wellbeing broadly; they’re built to understand the way that human wellbeing has been translated into language. Language alone, no matter how much has been collected over time or in your conversation, is only 10%. Especially when it comes to understanding behavioural health and wellbeing, what you do matters more than what you say.
Since day one at UpBeing, we’ve had one chief aim: to bridge the gap between artificial intelligence and emotional intelligence and, in doing so, help folks learn about, understand, and improve their wellbeing.
With or without this bridge, it seems clear that AI will continue to weave itself into more of our everyday experience. If we don’t understand the emotional impact and alignment of this AI, and if we don’t ensure that it has the capacity for compassion, the results could be disastrous.
But we can choose the path forward. We believe that we can create a future where technology and human wellbeing aren’t at odds with one another and instead create a path where technology supports human wellbeing.
This task is bigger than any one company. Thankfully, we’re not alone.
Inflection’s Pi is a mark of the massive strides that have been made toward creating LLMs that are emotionally intelligent. Pi actually adapts to a person’s unique interests and needs – over time, the Pi you interface with will be different from the Pi someone else interfaces with. Beyond Inflection, there are plenty of other important innovators working on emotional alignment and intelligence for AI. Hume AI is building a comprehensive tool for understanding nonverbal behaviour from any input: text, video, audio, or image. Kintsugi Health is driving important innovation using voice recognition to identify signs of depression and depressive episodes. Similarly, Affectiva is developing human perception AI that can identify cognitive states and nuanced emotions by analysing vocal and facial expressions, while iMotions captures human behaviour through eye tracking and biosensors. At UpBeing, we’re building Emotion Detection and Recognition (EDR) technology at scale, leveraging existing behavioural and environmental data.
Large language models are a tool that we’ve only begun to use. They’re a tool that we’ve only just started to see the full potential of. Maybe the way that we’ve been thinking of and using large language models needs to shift. They aren’t the source of intelligence. They’re the translators of intelligence.
The question, then, is how we make the most of this tool. How do we translate what it means to be human for computers? How do we create something analogous to emotional intelligence in computers, and how can we use that intelligence to better understand ourselves? How do we bridge the gap between emotional intelligence and artificial intelligence?
Sam, Andrew, and UpBeing