How Neat Rethought Speech Quality for Large Rooms

Øystein Birkenes, May 26, 2026

Instead of forcing microphones to compete for attention, Neat’s multiple-mic mixing allows the entire room to work together as one intelligent, coordinated audio system.

For years, the video collaboration industry has focused heavily on visuals. Better cameras. Smarter framing. More immersive experiences. But in large meeting spaces, speech quality is often the real challenge.

As rooms grow larger, conversations become harder to capture naturally. Voices arrive from different distances. Multiple people speak at once. Background noise becomes less predictable. Adding more microphones helps—but it also creates a new problem: how should the system decide which microphone to listen to at any given moment?

This is where large-room experiences often start to fall apart. Conversations stop feeling fluid and natural because the technology underneath is constantly trying to simplify something that is inherently dynamic.

At Neat, solving this problem has been a gradual evolution across several generations of products. What began as focused beamforming for smaller spaces has evolved into a fundamentally different way of thinking about room audio—one where microphones no longer compete with each other, but collaborate intelligently in real time.

Building an entire listening system

Our first-generation approach with Neat Bar focused on directional speech pickup for huddle spaces and small meeting rooms. In these environments, a single beam—a focused listening area aligned with the camera—works extremely well. People sit relatively close to the device, conversations tend to happen more sequentially, and fewer participants create less competing noise. As a result, isolating speech from background noise is much simpler.

As we expanded into larger spaces with Neat Bar Pro, we faced a different challenge: people often sat much farther away from the device. Capturing distant voices more clearly required sharper microphone beams with greater reach. But sharper beams also cover narrower areas, so a single beam was no longer enough. We introduced a 16-element microphone array capable of forming multiple simultaneous beams, each covering a different part of the room with greater precision and reach.

But introducing multiple beams created a new challenge. Video platforms still expect a single mono audio stream, which meant the system had to decide how those beams should be combined. Our first approach was dynamic mic selection: continuously analyzing speech levels across beams and selecting whichever beam contained the strongest voice at that moment.
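
The selection step described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Neat's implementation: it assumes each beam delivers one short frame of samples per update, and it uses plain RMS level as a stand-in for real speech-level analysis. The names `rms` and `select_strongest_beam` are hypothetical.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def select_strongest_beam(beam_frames):
    """Return (index, frame) of the beam with the highest RMS level.

    Assumption: RMS stands in for a speech-level estimate. A real system
    would need to tell speech apart from loud non-speech noise.
    """
    levels = [rms(frame) for frame in beam_frames]
    best = max(range(len(levels)), key=levels.__getitem__)
    return best, beam_frames[best]

# Three beams, one frame each; the second beam carries the loudest signal.
beams = [[0.1, -0.1], [0.5, -0.5], [0.2, 0.2]]
index, mono_frame = select_strongest_beam(beams)
```

Only one beam reaches the mono output per frame, which is the property the later sections revisit.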

Capturing speech from any direction

That approach worked remarkably well and became the foundation for our next evolution. With Neat Center, we expanded the concept even further. Instead of capturing audio primarily from the front of the room, Neat Center introduced full 360-degree beamforming, allowing speech to be picked up clearly from any direction.

But while Neat Center could capture audio from all angles, range still mattered. To extend coverage across larger spaces, we added support for multiple Neat Centers working together throughout the room. Because Neat Center operates alongside a primary device such as Neat Bar Pro, the system no longer manages audio from a single device—it coordinates audio across an entire network of microphones distributed throughout the space.

Beyond “best beam wins”

As the system evolved into a coordinated network of microphones and devices, the limitations of traditional mic selection became increasingly apparent.

At first, we used the same logic we had relied on previously: whichever beam or device detected the strongest speech became active. But large-room conversations rarely happen one speaker at a time. People talk over each other. Side comments overlap with main discussions. Someone laughs at one end of the table while another person begins speaking elsewhere. In these moments, traditional “best beam wins” systems struggle because only one microphone can dominate the mix at any given time.

The effect is subtle but important: one voice becomes clear while another becomes less intelligible. Localized noises can occasionally steal focus. Systems introduce switching delays to avoid instability, but those delays can make interactions feel less immediate and natural.
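
The switching delay mentioned above can be illustrated with a small hysteresis selector. This is a generic sketch, not a description of any shipping product: the class name `HysteresisSelector`, the 3 dB margin, and the hold length are all assumptions chosen for illustration. A challenger beam must beat the active beam by a margin for several consecutive frames before the selector switches.

```python
class HysteresisSelector:
    """Switch beams only after a challenger wins for several frames in a row."""

    def __init__(self, margin_db=3.0, hold_frames=5):
        self.margin_db = margin_db      # how much louder the challenger must be
        self.hold_frames = hold_frames  # how many consecutive frames it must win
        self.active = 0                 # index of the currently selected beam
        self._candidate = None
        self._count = 0

    def update(self, levels_db):
        """Take per-beam levels in dB for one frame; return the active beam."""
        best = max(range(len(levels_db)), key=levels_db.__getitem__)
        wins = (best != self.active
                and levels_db[best] >= levels_db[self.active] + self.margin_db)
        if wins:
            if best == self._candidate:
                self._count += 1
            else:
                self._candidate, self._count = best, 1
            if self._count >= self.hold_frames:
                self.active = best
                self._candidate, self._count = None, 0
        else:
            # Challenger lost a frame: reset the streak.
            self._candidate, self._count = None, 0
        return self.active

selector = HysteresisSelector(margin_db=3.0, hold_frames=3)
history = [selector.update([-20.0, -10.0]) for _ in range(4)]
# history == [0, 0, 1, 1]: the new talker is selected only on the third frame
```

The hold period keeps short noises from grabbing the output, but it also means the first frames of a new talker are picked up by the wrong beam.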

The issue was never the microphones themselves. It was the selection logic: forcing devices to compete for the “best” signal means only one of them can be heard at a time. We needed a better way to coordinate audio across the room.

From mic selection to intelligent mixing

That realization led to a completely different approach.

Instead of selecting a single microphone, we developed a system where multiple microphones and beams can contribute to the conversation simultaneously, balanced intelligently in real time depending on speech activity, beam geometry, and acoustic conditions.

Under the hood, this is a continuous, real-time blend of audio across microphones and beams. Externally, we call it multiple-mic mixing.

Conversations feel fundamentally more natural. If several people speak simultaneously from different parts of the room, they can now all be heard clearly without abrupt switching between microphones. Because the system no longer depends on hysteresis-based mic selection, speech transitions feel more immediate. And localized noises no longer hijack the conversation because noisy beams contribute only minimally to the final mix.
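
The difference between selecting and mixing can be sketched as a weighted sum. This is a minimal illustration, not Neat's mixer: it assumes a per-beam speech score between 0 and 1 is already available from some speech detector, and the names `mix_beams`, `speech_scores`, and `floor` are hypothetical. Every beam contributes in proportion to its score, so two talkers share the output while a noisy beam with little speech is heavily attenuated.

```python
def mix_beams(beam_frames, speech_scores, floor=0.01):
    """Blend one frame from each beam into a single mono frame.

    Assumptions: all frames have equal length, and speech_scores holds one
    value per beam in [0, 1]. The floor keeps every weight strictly positive
    so the normalization never divides by zero.
    """
    weights = [max(score, floor) for score in speech_scores]
    total = sum(weights)
    weights = [w / total for w in weights]  # weights now sum to 1
    length = len(beam_frames[0])
    mixed = [
        sum(w * frame[i] for w, frame in zip(weights, beam_frames))
        for i in range(length)
    ]
    return mixed, weights

# Two active talkers (beams 0 and 1) and one beam dominated by noise (beam 2).
frames = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
mixed, weights = mix_beams(frames, speech_scores=[0.9, 0.9, 0.0])
```

Because the weights change smoothly with the scores, there is no hard switch and no hold timer to wait out.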

This is less about drawing attention to the technology and more about removing friction from the conversation itself. Meetings feel calmer, clearer, and more inclusive as the room adapts naturally to the conversation.

Why traditional mixing still sounds artificial

While some in the audio industry may compare this to traditional gain-shared mixing, our implementation goes considerably further. The system is tuned specifically for speech collaboration and adapts continuously to device placement, beam overlap, and room geometry.
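
For comparison, the basic gain-sharing idea can be sketched as follows. This is a simplified, textbook-style version and an assumption about what “traditional” means here, not a description of any specific product: each channel's gain is its share of the total input level, so the gains always sum to one. The function name `gain_sharing_gains` is hypothetical.

```python
def gain_sharing_gains(levels, eps=1e-12):
    """Return one gain per channel, proportional to that channel's level.

    Assumption: levels are non-negative amplitude estimates (for example RMS).
    eps avoids division by zero when every channel is silent.
    """
    total = sum(levels) + eps
    return [level / total for level in levels]

one_talker = gain_sharing_gains([0.4, 0.0, 0.0])   # first channel takes nearly all gain
two_talkers = gain_sharing_gains([0.2, 0.2, 0.0])  # the two active channels split it evenly
```

Note that this rule looks only at level. It has no notion of device placement, beam overlap, or room geometry, which is where the text above says a speech-tuned mixer has to go further.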

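In some situations, a naive blend between beams can actually increase reverberation and reduce intelligibility, because a beam far from the talker picks up proportionally more reflected sound than direct sound. The mixer therefore adapts continuously, favoring the combination that sounds most natural.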

These details matter more than most people realize. They are often the difference between audio that constantly draws attention to itself and audio that simply feels natural.

A smarter approach to large-room audio

With our latest NeatOS 26.1.0 software update, we’ve extended multiple-mic mixing across devices. Multiple Neat Centers and the primary device can now contribute simultaneously within the same intelligent mixing system, allowing the entire room to operate as one coordinated audio environment rather than a collection of isolated listening points.

We’re moving beyond systems that capture meetings toward environments that understand conversations, with much more still to come.
