Jae Hoon Kim

BreathLink · drawing set · v1.0

Part I · Cover

U.S. Provisional Patent Application · Drawing set
Appl. No.
Sheets 5
Rev. F · v1.0
Title · Method and system for real-time computation of inter-participant respiratory coherence in a video-conferencing session using sub-pixel motion features of the existing camera stream
Inventor · J. H. Kim
Internal · BreathLink · descendant of RoomMirror
Filing target · bundled with RoomMirror parent
Classification (proposed) · H04N 7/15 · A61B 5/08
About this · Descendant filing · v1.0 figures aligned with v0.4 claim set · Companion to RoomMirror
BreathLink extends the RoomMirror biometric primitive from CSI-derived respiration to the video-conferencing camera stream. Two steps are novel: (i) speech-gated inter-participant respiratory coherence surfaced as an ambient indicator rather than a corrective nudge, following dyadic-synchrony findings that pushed inter-personal synchronization can decouple intra-personal cardiorespiratory rhythms; and (ii) an adaptive respiration band centered per participant. Per-caller respiration extraction itself is rPPG / Eulerian-VM prior art.

v1.0 honest acknowledgment (2026-05-18): single-participant rPPG from a webcam is a heavily commercialised field. Philips runs a Biosensing-by-rPPG patent-licensing program covering the foundational pulse-and-respiration-from-camera IP. NuraLogix Anura (launched 2023 specifically for video-call telehealth) computes per-participant vitals during video calls using patented Transdermal Optical Imaging: the same setting as BreathLink, but single-participant. Binah.ai and Lifelight are adjacent commercial rPPG SDKs (the latter holds Class II medical certification). The surviving claim is unchanged: inter-participant respiratory coherence, speech-gated, computed peer-to-peer with no server aggregator and surfaced through an ambient (non-corrective) UI. The cited single-participant rPPG art teaches none of these.
Abstract of the disclosure · Cover-sheet boilerplate · v1.0
A system and method compute, in real time during an active video-conferencing session, the respiratory coherence between two or more participants from sub-pixel motion features of the camera stream already running for that session, without any auxiliary physiological sensor and without any third-party aggregator. Per-participant respiration is recovered by Eulerian-style motion magnification or analogous sub-pixel-displacement methods applied to a region of interest comprising lower-face landmarks, with chest landmarks used as a fallback when in frame; the analysis band is adaptive, centered on each participant's detected respiration mode rather than fixed. Each host independently computes the pairwise magnitude-squared coherence γ (or an analogous phase-locking-value statistic) over a sliding analysis window, gated to intervals of bilateral non-speech detected from the existing conferencing audio stream so that ordinary turn-taking anti-phase respiration is not misread as decoherence. The surfaced UX is an ambient indicator reflecting the current coherence state, not a corrective nudge, consistent with dyadic-synchrony findings that pushed inter-personal synchronization can decouple a participant's own cardiorespiratory rhythms. For N > 2 participants, the indicator is derived from the median pairwise γ or a per-participant coherence-to-group score. The disclosed system is, by construction, symmetric and peer-to-peer: every host computes its own γ locally, and no server or aggregator sees the respiration traces. For the coherence computation, no per-user identifier, raw video, or raw audio is transmitted off-host; the only data exchanged is the per-participant respiration trace, sent between participating peers over the conferencing session's own data channel.
Field
H04N 7/15 · conferencing
Acknowledged prior art (rPPG-from-video)
  • Philips Biosensing-by-rPPG · foundational IP, licensing program
  • NuraLogix Anura (2023) · closest commercial · vitals during video calls, single-participant
  • Binah.ai SDK · Lifelight (Class II) · commercial rPPG SaaS
  • Wu et al. SIGGRAPH '12 — single-source Eulerian VM
  • Verkruysse '08 — single-source rPPG
Distinguished from (surviving novelty)
  • All cited rPPG art is single-participant · this is inter-participant coherence
  • Speech-gating to avoid turn-taking false decoherence
  • Peer-to-peer · no server aggregator (cf. cloud SaaS)
  • Ambient (non-corrective) UI per "synchrony decouples self" (bioRxiv '25)
  • Outbound rPPG suppression (claim 9) — separate novelty
  • Dyadic ANS synchrony (Frontiers '21) — lab dyads, contact sensors
  • Zoom AI Companion · Calm-app — different modalities entirely
Index of sheets · 5 sheets

Part II · Drawings

Sheet 1 / 5 · Representative · FIG. 1 · Two-tile video call · breath ROIs · 100
Reference numerals:
  • 102 · participant A tile · live · webcam A
  • 104 · participant B tile · live · webcam B
  • 106 · primary ROI · lower-face · Δ ≈ 0.4 px sub-pixel motion
  • 107 · fallback ROI · chest motion · only when chest in frame
  • 108 · respiration · A
  • 110 · respiration · B
  • 112 · γ(A, B) · see FIG. 2
Each webcam supplies sub-pixel motion; respiration (108 / 110) is recovered on-host. No added sensor.
FIG. 1
Sheet 2 / 5 · FIG. 2 · Coherence trace · breath in phase / out of phase · 200
Axes: session time in seconds, 0 to 600 (horizontal); respiration traces and γ(t) from 0.0 to 1.0 (vertical).
Reference numerals:
  • 202 · caller A respiration
  • 204 · caller B respiration
  • 206 · magnitude-squared coherence γ(t)
  • 208 · γ ≥ 0.8 · high coherence
  • 210 · γ ≤ 0.4 · indicator-state threshold
  • 212 · sustain ≥ 180 s · crossing γ = 0.4
  • 214 · low-coherence indicator · end of sustain window
  • 216 · legend
FIG. 2
Sheet 3 / 5 · FIG. 3 · Per-side pipeline · ROI → VM → respiration · 300
Reference numerals:
  • 300 · local host
  • 302 · webcam frames
  • 304 · ROI tracker · lower-face + chest
  • 306 · VM / sub-pixel · Eulerian-class
  • 308 · respiration · adaptive band
Annotation: raw video does not leave the host for this pipeline; the respiration trace leaves the host only toward peers.
Each host runs blocks 302–308 on its own webcam stream; only the per-host respiration trace (308) is forwarded to peer hosts of FIG. 4 via the existing conferencing transport.
FIG. 3
Sheet 4 / 5 · FIG. 4 · Peer-host coherence · symmetric · no aggregator · 400
Reference numerals:
  • 402 · host A
  • 404 · host B
  • 308 · respiration · A and respiration · B · per FIG. 3
  • 406 · audio · VAD module
  • 408 · audio · VAD module
  • γ(A, B) · A-host view · VAD-gated · non-speech only
  • γ(A, B) · B-host view · VAD-gated · non-speech only
  • link between hosts · respiration traces only · existing conferencing transport
Symmetric: each host computes γ from its own trace and the peer trace, exchanged on the existing conferencing data channel. No third-party aggregator required.
FIG. 4
Sheet 5 / 5 · FIG. 5 · Outbound rPPG suppression · local-only physiology · 500
Reference numerals:
  • 502 · webcam capture · single source
  • local path · unmodulated · used for γ computation
  • 503 · local ROI / VM · per FIG. 3
  • 308 · respiration trace · unmodulated
  • γ · per FIG. 4 · on-host only
  • outbound path · suppressed · what the peer can recover
  • 504 · rPPG / VM suppression · sub-pixel modulation
  • 506 · encoded outbound · perceptual quality kept
  • 508 · peer side · rPPG / VM fails
Local pipeline (top) operates on unmodulated webcam frames for γ computation; the outbound stream (bottom) is sub-pixel-modulated so a peer running rPPG or motion-magnification cannot reconstruct the sender's heart rate or respiration from the received video. Perceptual quality is preserved.
FIG. 5

Part III · Specification

Background of the invention · Prior-art context

Eulerian Video Magnification (Wu et al., SIGGRAPH 2012) and the broader sub-pixel motion-magnification family enable recovery of respiration and pulse signals from ordinary video. Remote photoplethysmography (rPPG; Verkruysse et al., 2008; and later) recovers heart rate from facial colour variation in webcam feeds. Both literatures focus on the per-source physiological signal.

A parallel literature establishes that interacting dyads spontaneously synchronize cardiac, respiratory, and electrodermal rhythms during naturalistic conversation, with the strength of synchrony correlating with engagement and reciprocity (Frontiers in Neuroscience 2021; Cognition 2022; and related work). That literature is uniformly built on contact sensors (chest belts, ECG, EDA) under laboratory conditions; it does not address recovery of the signals from a camera stream already in use for a video-conferencing session, nor the conversion of the synchrony measurement into an in-session UX surface.

Two findings from the dyadic-synchrony literature constrain the UX claim. First, interpersonal respiratory synchronization can coincide with intra-personal cardiorespiratory decoupling (bioRxiv 2025), so a UX that pushes participants toward synchrony may not be physiologically benign. Second, dyadic respiratory synchrony is driven by the predictability and stability of each participant's rhythm, not by online mutual adaptation (Cognition 2022); nudging toward a target γ may therefore target the wrong variable. The disclosed system accordingly surfaces an ambient indicator rather than a corrective nudge: the indicator reflects the current state to the participants, who may attend or ignore.

A second, distinct threat motivates the outbound suppression of physiological signal. The same recovery primitives, applied in reverse by a recipient of the conferencing video, allow a peer, a conferencing platform, or any downstream consumer of the transmitted stream to infer heart rate, respiration, and stress correlates of a participant who is unaware that such inference is tractable on ordinary video. Nature Communications Engineering (2025) demonstrates the attack on standard video streams and characterizes signal-suppression countermeasures. The disclosed system addresses this threat directly: the same primitive that enables the disclosed coherence UX, applied to the unmodulated local source, is rendered non-recoverable on the outbound stream by a sub-pixel modulation that preserves perceptual video quality. The local UX and the outbound privacy posture are thus two faces of the same observation: that respiration and pulse are present in any reasonable webcam stream and that the question is which observers are permitted to see them.

The disclosed system applies the per-source primitives in the multi-participant video-conferencing context and treats the pairwise coherence of the recovered signals as a UX-surfaceable metric, gated to intervals of bilateral non-speech detected from the existing conferencing audio stream so that ordinary turn-taking anti-phase respiration is not misread as decoherence.
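
The bilateral non-speech gating described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names `bilateral_non_speech` and `gated_runs`, the per-frame boolean representation of the voice-activity flags, and the minimum run length are all introduced here for illustration.

```python
from typing import List, Sequence, Tuple


def bilateral_non_speech(vad_a: Sequence[bool], vad_b: Sequence[bool]) -> List[bool]:
    """Return True for frames where neither participant is speaking.

    vad_a / vad_b are assumed per-frame voice-activity flags (True = speech).
    """
    if len(vad_a) != len(vad_b):
        raise ValueError("VAD tracks must have the same length")
    return [not a and not b for a, b in zip(vad_a, vad_b)]


def gated_runs(mask: Sequence[bool], min_len: int) -> List[Tuple[int, int]]:
    """Return [start, end) index runs of True in mask that span at least min_len frames."""
    runs: List[Tuple[int, int]] = []
    start = None
    for i, keep in enumerate(mask):
        if keep and start is None:
            start = i
        elif not keep and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the mask.
    if start is not None and len(mask) - start >= min_len:
        runs.append((start, len(mask)))
    return runs
```

Only respiration samples inside the returned runs would feed the coherence estimate; discarding very short runs is a design choice assumed here so that brief pauses inside a speaking turn do not contribute.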

Stated structurally — and this is the load-bearing framing of the disclosure — the system is symmetric and peer-to-peer, recruits no auxiliary physiological sensor, and depends on no third-party aggregator. Each host independently computes its own γ from its own respiration trace and the traces of its peers, exchanged only over the conferencing session's own data channel. There is no biometric server in the architecture. This distinguishes the disclosed method from any inter-participant physiological-coherence system that funnels traces to a backend (e.g. Apple Health cloud, hospital telemetry aggregation, or wellness-platform analytics), and from any chest-strap / wearable inter-participant coupling system that adds physical sensors to the conferencing setup.

Summary of the invention · per 37 CFR § 1.73

The disclosed system computes inter-participant respiratory coherence symmetrically and peer-to-peer, using only the webcam already running in the conferencing session, with no auxiliary physiological sensor and no third-party aggregator. At each participant's local host, ROI tracker (304) selects lower-face landmarks within the host's outgoing webcam frames (302), with chest landmarks used as a fallback when in frame. VM block (306) applies Eulerian-style sub-pixel motion magnification to the ROI to recover a respiration trace (308) within an adaptive band centered on each participant's detected respiration mode. Per-host respiration traces are exchanged peer-to-peer over the existing conferencing data channel; each host computes pairwise coherence γ symmetrically over a sliding analysis window, restricted to intervals of bilateral non-speech detected from the existing conferencing audio stream. Upon sustained crossing of γ below a configured threshold, the conferencing UI surfaces an ambient indicator reflecting the current coherence state — not a corrective nudge to alter behavior. Upon sustained high γ, a corresponding high-coherence indicator state may be surfaced. For N > 2 participants, the indicator is derived from the median pairwise γ or a per-participant coherence-to-group score.
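
The adaptive band mentioned above can be illustrated with a short sketch that locates a participant's dominant respiration frequency and centers a band on it. Everything specific here is an assumption for illustration: the names `dominant_frequency` and `adaptive_band`, the 0.1–0.7 Hz search range, the ±0.1 Hz half-width, and the use of a plain discrete Fourier transform as the mode detector.

```python
import cmath
import math
from typing import Sequence, Tuple


def dominant_frequency(trace: Sequence[float], fs: float,
                       f_lo: float = 0.1, f_hi: float = 0.7) -> float:
    """Return the frequency (Hz) of the strongest DFT bin inside [f_lo, f_hi]."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [v - mean for v in trace]  # remove the DC offset
    best_f, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if f < f_lo or f > f_hi:
            continue
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_f, best_power = f, power
    if best_power < 0:
        raise ValueError("no DFT bin falls inside the search range")
    return best_f


def adaptive_band(trace: Sequence[float], fs: float,
                  half_width: float = 0.1) -> Tuple[float, float]:
    """Center an analysis band on the detected respiration mode."""
    f0 = dominant_frequency(trace, fs)
    return (max(f0 - half_width, 0.0), f0 + half_width)
```

For a participant breathing at 15 breaths per minute (0.25 Hz), this sketch yields a band of roughly 0.15–0.35 Hz, whereas a fixed band would have to be wide enough to cover every participant.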

Brief description of the drawings · Sheets 1 – 5

Part IV · Claims

Claims · 2 independent · 7 dependent · 1 apparatus · Draft v1.0
What is claimed is:

1. A method for computing inter-participant respiratory coherence in an active video-conferencing session, comprising:

  (a) at each of a plurality of participant hosts, selecting a region of interest (304) in that host's outgoing webcam frames (302) comprising lower-face landmarks, optionally extended to chest landmarks when in frame;
  (b) recovering, by sub-pixel motion magnification (306) applied to said region of interest, a respiration signal (308) within an adaptive analysis band centered on said participant's detected respiration mode;
  (c) exchanging said respiration signal with at least one peer host over an existing conferencing data channel, without exchanging raw video or audio for said purpose;
  (d) computing, at each host, a pairwise magnitude-squared coherence γ of the local respiration signal and the received peer respiration signal over a sliding analysis window restricted to intervals of bilateral non-speech detected from the existing conferencing audio stream;
  (e) upon a sustained crossing of γ below a configured threshold, surfacing within the conferencing user interface an ambient indicator with a period or amplitude derived from a function of the participants' respiration signals over said analysis window, said indicator reflecting the current coherence state without prescribing a target respiration rate to any participant; and
  (f) wherein steps (a)–(e) are performed independently at each said participant host, no server or third-party aggregator computes or aggregates γ, and the only off-host transmission is, at step (c), of the per-host respiration signal between participating peers over the existing conferencing data channel.
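
For orientation, the magnitude-squared coherence of step (d) is conventionally estimated by averaging spectra over overlapping segments, γ(f) = |P_xy(f)|² / (P_xx(f) · P_yy(f)), and the sketch below averages that estimate over the bins of the analysis band. It is a hedged illustration, not the claimed implementation: the name `band_coherence`, the Hann window, the 50 % overlap, and the band-averaging step are assumptions, and the inputs are taken to be already speech-gated respiration samples.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def _windowed_segments(signal: Sequence[float], seg_len: int, step: int) -> List[List[float]]:
    """Split into overlapping segments, remove each segment's mean, apply a Hann window."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * t / (seg_len - 1)) for t in range(seg_len)]
    segments = []
    for start in range(0, len(signal) - seg_len + 1, step):
        chunk = signal[start:start + seg_len]
        mean = sum(chunk) / seg_len
        segments.append([(v - mean) * w for v, w in zip(chunk, hann)])
    return segments


def _dft_bin(segment: Sequence[float], k: int) -> complex:
    """Single DFT coefficient of a segment at bin k."""
    n = len(segment)
    return sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in enumerate(segment))


def band_coherence(x: Sequence[float], y: Sequence[float], fs: float,
                   band: Tuple[float, float], seg_len: int, overlap: float = 0.5) -> float:
    """Mean magnitude-squared coherence of x and y over the bins inside band."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    step = max(1, int(seg_len * (1.0 - overlap)))
    seg_x = _windowed_segments(x, seg_len, step)
    seg_y = _windowed_segments(y, seg_len, step)
    bins = [k for k in range(1, seg_len // 2) if band[0] <= k * fs / seg_len <= band[1]]
    # With a single segment the estimate is identically 1, so require at least two.
    if len(seg_x) < 2 or not bins:
        raise ValueError("need at least two segments and one frequency bin in the band")
    per_bin = []
    for k in bins:
        pxx = pyy = 0.0
        pxy = 0j
        for sx, sy in zip(seg_x, seg_y):
            fx, fy = _dft_bin(sx, k), _dft_bin(sy, k)
            pxx += abs(fx) ** 2
            pyy += abs(fy) ** 2
            pxy += fx * fy.conjugate()
        per_bin.append(abs(pxy) ** 2 / (pxx * pyy) if pxx > 0 and pyy > 0 else 0.0)
    return sum(per_bin) / len(per_bin)
```

Two traces with a constant phase lag score near 1 under this estimate, which is why a steady lag between participants does not by itself register as decoherence; only a phase relationship that drifts from segment to segment lowers γ.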

2. The method of claim 1, wherein the sustained-crossing condition at step (e) is satisfied by either (i) a continuous interval of at least 180 seconds with γ below the configured threshold, or (ii) an integrated coherence deficit ∫ (γ_threshold − γ(t)) dt, taken over the intervals where γ(t) < γ_threshold within the analysis window, exceeding a configured value.
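
The two alternative conditions of claim 2 can be sketched on a uniformly sampled γ series. The 180-second duration comes from the claim and the 0.4 threshold from FIG. 2; the function names, the uniform sampling interval `dt_s`, and the `deficit_limit` parameter are hypothetical and introduced only for this illustration.

```python
from typing import Optional, Sequence


def longest_run_below_s(gamma: Sequence[float], dt_s: float, threshold: float) -> float:
    """Duration in seconds of the longest continuous run with gamma below threshold."""
    longest = current = 0
    for g in gamma:
        current = current + 1 if g < threshold else 0
        longest = max(longest, current)
    return longest * dt_s


def coherence_deficit(gamma: Sequence[float], dt_s: float, threshold: float) -> float:
    """Rectangle-rule integral of (threshold - gamma) over samples where gamma < threshold."""
    return sum((threshold - g) * dt_s for g in gamma if g < threshold)


def sustained_crossing(gamma: Sequence[float], dt_s: float, threshold: float = 0.4,
                       min_run_s: float = 180.0,
                       deficit_limit: Optional[float] = None) -> bool:
    """True if condition (i) or, when deficit_limit is given, condition (ii) holds."""
    if longest_run_below_s(gamma, dt_s, threshold) >= min_run_s:
        return True
    if deficit_limit is None:
        return False
    return coherence_deficit(gamma, dt_s, threshold) > deficit_limit
```

Condition (ii) lets several shorter dips accumulate toward the indicator even when no single dip lasts 180 seconds, which condition (i) alone would miss.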

3. The method of claim 1, further comprising surfacing a distinct high-coherence indicator state upon a sustained interval of γ above a high-coherence threshold, said high-coherence indicator state likewise reflecting current state and not prescribing a target respiration rate.

4. The method of claim 1, wherein the only physiological-signal input to the system is sub-pixel motion features of frames produced by the camera primarily used to encode the participant's outgoing video for the conferencing session.

5. The method of claim 1 extended to N > 2 participants, wherein for each pair (i, j) the system computes γ_{i,j}, and the surfaced indicator is derived from either (i) the median of γ_{i,j} over all pairs, or (ii) for each participant i, said participant's mean coherence to all other participants, surfaced as a per-participant coherence-to-group score.
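
Both group statistics of claim 5 reduce to simple aggregations over the pairwise values. The sketch below assumes the pairwise γ values are already available in a dictionary keyed by unordered participant pairs; the names `median_pairwise` and `coherence_to_group` and the data layout are illustrative assumptions.

```python
from statistics import mean, median
from typing import Dict, FrozenSet, List


def median_pairwise(gammas: Dict[FrozenSet[str], float]) -> float:
    """Option (i): median of gamma over all participant pairs."""
    return median(gammas.values())


def coherence_to_group(gammas: Dict[FrozenSet[str], float]) -> Dict[str, float]:
    """Option (ii): each participant's mean gamma to all other participants."""
    per_participant: Dict[str, List[float]] = {}
    for pair, g in gammas.items():
        for participant in pair:
            per_participant.setdefault(participant, []).append(g)
    return {p: mean(values) for p, values in per_participant.items()}


# Usage with three hypothetical participants.
pairwise = {
    frozenset({"A", "B"}): 0.75,
    frozenset({"A", "C"}): 0.25,
    frozenset({"B", "C"}): 0.50,
}
group_median = median_pairwise(pairwise)
scores = coherence_to_group(pairwise)
```

The median is robust to one outlying pair, while the per-participant score shows which participant is least coupled to the rest of the group.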

6. The method of claim 1, wherein the ambient indicator surfaced at step (e) comprises one or more of: (i) a graphical element overlaid on or adjacent to one or more participant tiles, said graphical element animated with a period substantially equal to the median respiration rate of said one or more participants over the analysis window; (ii) a chromatic shift applied to a non-foreground region of the conferencing user interface, said shift varying monotonically with γ; or (iii) a subtle modulation of an audio cue distinct from the conferencing audio, said modulation reflecting current γ; and wherein no element of said ambient indicator includes a text prompt or other instruction directing any participant to alter their respiration rate.
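
Options (i) and (ii) of claim 6 can be illustrated with two small mappings: an animation period taken from the median respiration rate, and a tint that varies monotonically with γ. The function names, the two RGB endpoint colours, and the linear interpolation are assumptions chosen for illustration; the claim requires only that the shift be monotonic.

```python
from statistics import median
from typing import Sequence, Tuple

RGB = Tuple[int, int, int]


def animation_period_s(rates_bpm: Sequence[float]) -> float:
    """Period of the animated element, from the median respiration rate (breaths/min)."""
    return 60.0 / median(rates_bpm)


def coherence_tint(gamma: float, low_rgb: RGB = (90, 110, 160),
                   high_rgb: RGB = (120, 190, 150)) -> RGB:
    """Linear blend between two colours; each channel is monotonic in gamma."""
    g = min(max(gamma, 0.0), 1.0)  # clamp to the valid coherence range
    return tuple(round(lo + (hi - lo) * g) for lo, hi in zip(low_rgb, high_rgb))
```

Neither mapping produces text or a target rate, consistent with the claim's requirement that the indicator reflect state without instructing participants.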

7. The method of claim 1, wherein the coherence statistic γ computed at step (d) is replaced by, or supplemented with, a phase-locking value (PLV) computed over the same analysis window, said PLV being the magnitude of the mean of the unit complex vectors representing the instantaneous phase difference between the participants' respiration signals; and wherein step (e) is gated by the resulting statistic in the same manner.
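
The phase-locking value defined in claim 7 is PLV = |(1/N) Σ exp(i(φ_A(n) − φ_B(n)))|. A minimal sketch follows, assuming the instantaneous phases have already been extracted upstream (for example by a Hilbert transform, which is not shown and is an assumption of this illustration); the name `phase_locking_value` is hypothetical.

```python
import cmath
from typing import Sequence


def phase_locking_value(phase_a: Sequence[float], phase_b: Sequence[float]) -> float:
    """Magnitude of the mean unit vector of the instantaneous phase difference.

    phase_a / phase_b are instantaneous phases in radians, sampled at the same instants.
    """
    if len(phase_a) != len(phase_b) or not phase_a:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phase_a, phase_b))
    return abs(total) / len(phase_a)
```

A constant phase difference gives a PLV of 1 regardless of the size of the lag, while phase differences spread evenly around the circle give a PLV near 0. Unlike magnitude-squared coherence, the PLV ignores amplitude entirely.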

8. The method of claim 1, further comprising obtaining, at session start and prior to step (a) at any participant host, an explicit per-participant opt-in to the recovery and inter-participant exchange of said participant's respiration signal; and wherein non-opted-in participants are excluded from γ computation at all hosts, with the conferencing user interface reflecting their exclusion in the manner that an absent peer is reflected.

9. A method for suppressing physiological-signal leakage in outbound conferencing video, comprising:

  (a) capturing webcam frames (502) at a participant host;
  (b) computing, locally and from the unmodulated said frames, a per-participant physiological signal (503) comprising at least one of respiration and pulse;
  (c) applying, prior to outbound encoding of the conferencing video, a sub-pixel modulation (504) in the spatial and temporal bands recoverable by remote photoplethysmography (rPPG) and by sub-pixel motion magnification, said modulation configured to preserve perceptual quality of the encoded video;
  (d) transmitting the modulated outbound stream (506) over the existing conferencing transport; and
  (e) wherein the unmodulated local physiological signal of step (b) is reconstructable by the local host at full fidelity, while a peer running rPPG or motion magnification on the received stream (508) cannot reconstruct said physiological signal at signal-to-noise ratios comparable to said unmodulated source.
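
Step (e) compares signal-to-noise ratios between the local trace and whatever a peer could recover. One way to make that comparison concrete is sketched below, under several assumptions not stated in the disclosure: SNR is taken as in-band spectral power over out-of-band power, the two traces are assumed to have been recovered already (from the unmodulated frames and from the received stream respectively), and the names `band_snr_db` and `leakage_margin_db` are hypothetical. The sketch does not implement the sub-pixel modulation itself.

```python
import cmath
import math
from typing import Sequence, Tuple


def band_snr_db(trace: Sequence[float], fs: float, band: Tuple[float, float]) -> float:
    """In-band over out-of-band spectral power, in dB, from a plain DFT."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [v - mean for v in trace]  # remove the DC offset
    in_band = out_band = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power = abs(coeff) ** 2
        if band[0] <= k * fs / n <= band[1]:
            in_band += power
        else:
            out_band += power
    if in_band == 0.0 or out_band == 0.0:
        raise ValueError("SNR is undefined when either power is zero")
    return 10.0 * math.log10(in_band / out_band)


def leakage_margin_db(local: Sequence[float], received: Sequence[float],
                      fs: float, band: Tuple[float, float]) -> float:
    """How many dB of in-band SNR the peer-side trace loses relative to the local trace."""
    return band_snr_db(local, fs, band) - band_snr_db(received, fs, band)
```

A large positive margin would indicate that the received stream carries much less recoverable respiration signal than the local source; what margin counts as "not comparable" is left open by the claim and is not fixed here.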

10. A host computing device comprising a webcam, one or more processors implementing ROI tracker 304, a voice-activity-detection module reading the conferencing audio stream, VM block 306, coherence computation, an optional rPPG / VM suppression module 504, and a non-transitory memory storing instructions to perform claims 1 – 9.

Claims · 10 total · 2 independent · 7 dependent · 1 apparatus

Part V · Appendices

Prior-art bibliography · Selected; not exhaustive

Part VI · Execution

Version history · Draft · not filed

This descendant cites the RoomMirror parent specification (see /roommirror) for the underlying respiration-inference primitive and adds one narrow claim group (the pairwise-coherence UX trigger in a video-conferencing context).

/breathlink · v1.0 · drawing-stage · child of /roommirror