PalmEcho · drawing set · v1.0
Part I · Cover · Docket · Abstract · v1.0 skeleton
- Apple US 9,086,738 (Tsudik 2013) — chassis taps via accelerometer/gyroscope
- Apple Back Tap (iOS 14, 2020) — commercial, accelerometer-based
- Touch & Activate (Ono UIST '13) — active-acoustic, attached transducer
- Apple Back Tap uses accelerometer · this uses acoustic impulse-response
- Touch & Activate attaches transducer · this uses built-in commodity I/O
- EclipseTouch (UIST '25) — external surface, worn IR
- UbiTap / MM-Tap — external table
- Apple Force Touch — trackpad pressure only
Part II · Drawings · FIG. 1 – 4 · Sheets 1 – 4
Part III · Specification · Background · Summary · Brief description
Prior systems for acoustic touch input on commodity hardware include VibSense (SECON 2017), UbiTap and S-UbiTap (SenSys 2018; TMC 2023), MM-Tap (TMC 2023), and most recently EclipseTouch (UIST 2025). All such systems treat an external surface — a tabletop, wall, or ad-hoc plane — as the input region, and require the user to set down or position the host device adjacent to that surface.
The disclosed system inverts the assumption: the host computing device's own chassis surfaces (palmrest, lid, bezel) serve as the input region, using the device's built-in microphones and loudspeakers as acoustic ranging transducers. No external surface is needed; no setup ritual is performed; no environmental priors are required. The novelty lies in where the tap is detected (on the device that hosts the input pipeline), not in the acoustic-ToF physics, which builds on the cited prior art.
Capacitive trackpad input (e.g. Apple Force Touch, US 10,162,447 B2) covers a restricted spatial region of the chassis (the trackpad surface) and depends on a dedicated force sensor. The present disclosure expands the input region to substantially the entire upper chassis without additional hardware.
The broader concept of mapping a tap on a device's non-touchscreen chassis surfaces to an input event is anticipated by Apple US 9,086,738 (Tsudik, 2013, "Fine-tuning an operation based on tapping"), which discloses detection of taps on a device's sides and other non-touchscreen portions via the device's accelerometer and gyroscope, mapped to granular on-screen controls. Apple's Back Tap feature (iOS 14, 2020) is the commercial embodiment, detecting double- and triple-tap gestures on the back glass of an iPhone via accelerometer pattern matching. Both are acknowledged prior art on the chassis-as-input-region concept; PalmEcho is distinguished by sensor modality (active acoustic ranging via built-in speakers and microphones, recovering an impulse-response perturbation) rather than passive accelerometer pattern matching. Touch & Activate (Ono et al., UIST 2013) is the closest active-acoustic prior art and is similarly distinguished: it requires an attached vibration speaker and piezoelectric microphone affixed to the target object, whereas the disclosed system reuses the host device's built-in loudspeakers and microphones without any added transducer.
The disclosed system converts the host computing device's own chassis surfaces into an input region, using only the loudspeakers and microphones already integral to the device. No external sensor, no piezoelectric or accelerometer surface instrumentation, and no surface attachment is required.
A host computing device comprising at least one loudspeaker (102, 104), at least one microphone (110, 112, 114), and one or more processors is programmed to emit a periodic inaudible swept-sine chirp (202) in the 18–22 kHz band, receive the resulting acoustic signal at the microphone, recover an impulse-response estimate (204) by matched-filter deconvolution, and classify the region of the device chassis (R1, R2, R3, or none) at which a finger tap (126) has occurred within the impulse-response window.
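The emit-and-recover step can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the sample rate, the shortened chirp length, the Hann taper, and the helper `simulate_received` are all assumptions introduced here, and the matched filter is realised as a plain FFT cross-correlation scaled by the chirp energy.

```python
import numpy as np

FS = 48_000                    # assumed sample rate in Hz (not stated in the text)
F0, F1 = 18_000.0, 22_000.0    # sweep band from the text
DURATION = 0.05                # assumed chirp length in seconds, kept short for the demo


def make_chirp(fs=FS, f0=F0, f1=F1, duration=DURATION):
    """Linear swept sine from f0 to f1, Hann-tapered to soften the onset and offset."""
    t = np.arange(int(fs * duration)) / fs
    sweep_rate = (f1 - f0) / duration
    phase = 2.0 * np.pi * (f0 * t + 0.5 * sweep_rate * t**2)
    return np.sin(phase) * np.hanning(t.size)


def estimate_impulse_response(received, chirp, n_taps=512):
    """Cross-correlate the recording with the emitted chirp (matched filter)
    and keep the first n_taps non-negative lags, scaled by the chirp energy."""
    n = received.size + chirp.size - 1
    nfft = 1 << (n - 1).bit_length()          # next power of two, avoids wrap-around
    spectrum = np.fft.rfft(received, nfft) * np.conj(np.fft.rfft(chirp, nfft))
    xcorr = np.fft.irfft(spectrum, nfft)
    return xcorr[:n_taps] / (np.dot(chirp, chirp) + 1e-12)


def simulate_received(chirp, paths, n_extra=600, noise=0.01, seed=0):
    """Hypothetical test signal: delayed, scaled chirp copies plus white noise.
    `paths` is a list of (delay_in_samples, gain) pairs."""
    rng = np.random.default_rng(seed)
    out = noise * rng.standard_normal(chirp.size + n_extra)
    for delay, gain in paths:
        out[delay:delay + chirp.size] += gain * chirp
    return out


chirp = make_chirp()
received = simulate_received(chirp, [(40, 1.0), (100, 0.5)])  # direct + reflected path
ir = estimate_impulse_response(received, chirp)
print(int(np.argmax(np.abs(ir))))   # lag of the strongest arrival, in samples
```

In this sketch the two simulated arrivals stand in for the direct and reflected paths; on real hardware the recording would come from the device microphones rather than `simulate_received`.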
Region classification is performed by a small kNN classifier (214) operating on an eight-dimensional feature vector (212) extracted from the peak-pick window (210) of the impulse response (204). The classifier outputs are converted to host-device input events (e.g. application-launch, focus-toggle) by an event dispatch module (318). All processing is on-device; raw audio samples are not retained after feature extraction.
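One possible reading of the eight-dimensional feature vector, following the composition recited in claim 2 (lag and amplitude of the two strongest peaks, energy in three temporal sub-bands, temporal centroid), is sketched below. The function name `extract_features`, the minimum peak separation `min_sep`, the equal-thirds split of the window, and the energy normalisation are assumptions made for illustration only.

```python
import numpy as np


def extract_features(ir, fs=48_000, min_sep=8):
    """Return (lag1, amp1, lag2, amp2, e1, e2, e3, centroid) for one window.
    Lags and centroid are in seconds; e1..e3 are fractions of total energy."""
    env = np.abs(np.asarray(ir, dtype=float))
    if env.size <= 2 * min_sep + 1:
        raise ValueError("window too short for the chosen min_sep")

    # Strongest sample, then the strongest sample at least min_sep away from it,
    # so that ripple around the first peak is not mistaken for a second arrival.
    i1 = int(np.argmax(env))
    masked = env.copy()
    masked[max(0, i1 - min_sep): i1 + min_sep + 1] = -np.inf
    i2 = int(np.argmax(masked))

    energy = env**2
    total = energy.sum()
    if total == 0.0:
        return np.zeros(8)          # silent window: no usable features

    # Energy share of three equal-length temporal segments of the window.
    thirds = [seg.sum() / total for seg in np.array_split(energy, 3)]
    # Energy-weighted mean sample index, converted to seconds.
    centroid = np.dot(np.arange(env.size), energy) / total / fs
    return np.array([i1 / fs, env[i1], i2 / fs, env[i2], *thirds, centroid])


# Usage on a toy window with a direct arrival at sample 40 and an echo at 100.
window = np.zeros(300)
window[40], window[100] = 1.0, -0.5
features = extract_features(window)
```

Because only this short vector leaves the function, the raw window can be discarded immediately afterwards, which is consistent with the retention statement above.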
- FIG. 1: A top-down view of laptop chassis 100 showing loudspeakers 102, 104; microphones 110, 112, 114; out-of-scope keyboard region 116 (hatched); trackpad 118; candidate input regions R1, R2 (palmrests, dashed) and R3 (lid / bezel); hinge 130; finger approach trajectory 120; tap event 126 at t = 0; direct-path acoustic arrival 122 (Δt₁, solid) and reflected-path arrival 124 (Δt₂, dashed); chassis dimension 140; and legend 150.
- FIG. 2: The acoustic ToF pipeline across three panels: swept-sine chirp 202 (18 → 22 kHz over 1 s) shown as a time-frequency band; impulse response 204 with direct-path peak 206 (Δt₁), reflected-path peak 208 (Δt₂), and peak-pick window 210; 8-dimensional feature vector 212 (τ₁, τ₂, Δf, ρ, E₁, E₂, E₃, c) fed to kNN classifier 214 yielding region probabilities 216 (R1, R2, R3, ∅); timing budget 218 totalling ≤ 5 ms compute per chirp.
- FIG. 3: A two-lane functional block diagram of the on-device pipeline 300: emit path 322 comprising chirp generator 302, CoreAudio out 304, and speakers 306 (hardware, hatched); sense path 326 comprising mic array 308 (hardware), CoreAudio in 310, deconvolution 312, feature extractor 314, kNN classifier 316, and event dispatch 318; cross-flow 324 (in-air chassis-borne acoustic); OS target 320; clock domains and legend 328 / 330.
- FIG. 4: A per-chirp state machine 400 with initial-state indicator 412 and four states arranged as rounded rectangles: IDLE 402, LISTEN 404 (primary state, double-stroke emphasis), CLASSIFY 406, EMIT EVENT 408. Nominal transitions labelled with event/guard/action; hysteresis pair 414 (LISTEN ↔ IDLE with τ_hi / τ_lo) and recovery transition 416 (CLASSIFY → IDLE on low confidence) shown dashed. Legend 410 defines thresholds τ_hi, τ_lo, τ_p and arrow conventions.
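The per-chirp state machine of FIG. 4 might be sketched as below. The figure description names the four states, the hysteresis pair, and the low-confidence recovery transition, but not the exact guards, so several details here are assumptions: that IDLE → LISTEN fires above τ_hi and LISTEN → IDLE below τ_lo, that LISTEN → CLASSIFY fires once a peak-pick window is complete (the hypothetical `window_ready` flag), that τ_p is the classifier-confidence threshold, that EMIT EVENT always returns to IDLE, and the numeric threshold defaults.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTEN = auto()
    CLASSIFY = auto()
    EMIT_EVENT = auto()


@dataclass(frozen=True)
class Thresholds:
    tau_hi: float = 0.6   # assumed value: energy needed to enter LISTEN
    tau_lo: float = 0.3   # assumed value: energy below which LISTEN falls back to IDLE
    tau_p: float = 0.7    # assumed value: minimum classifier confidence to emit


def step(state, th, energy=0.0, window_ready=False, confidence=0.0):
    """Advance the state machine by one chirp period and return the next state."""
    if state is State.IDLE:
        return State.LISTEN if energy > th.tau_hi else State.IDLE
    if state is State.LISTEN:
        if energy < th.tau_lo:
            return State.IDLE                 # hysteresis: drop out only below tau_lo
        return State.CLASSIFY if window_ready else State.LISTEN
    if state is State.CLASSIFY:
        # Recovery transition: low confidence returns to IDLE without an event.
        return State.EMIT_EVENT if confidence >= th.tau_p else State.IDLE
    return State.IDLE                         # EMIT_EVENT always returns to IDLE


# Usage: one tap that is detected, classified with high confidence, and emitted.
th = Thresholds()
s = State.IDLE
s = step(s, th, energy=0.8)
s = step(s, th, energy=0.5, window_ready=True)
s = step(s, th, confidence=0.9)
```

Using two energy thresholds rather than one keeps the machine from chattering between IDLE and LISTEN when the measured energy hovers near a single cut-off.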
Part IV · Claims · 5 total · 1 indep · 3 dep · 1 apparatus
1. A method for detecting a tap event on a chassis surface of a host computing device, comprising:
- (a) emitting, from at least one loudspeaker (102, 104) integral to said host device, an inaudible swept-sine acoustic signal (202) in a frequency band above 18 kHz;
- (b) receiving, at at least one microphone (110, 112, 114) integral to said host device, an acoustic response to said emitted signal;
- (c) recovering an impulse-response estimate (204) by matched-filter deconvolution of said received signal against said emitted signal;
- (d) extracting a feature vector (212) from a temporal window (210) of said impulse-response estimate; and
- (e) classifying said feature vector to one of a plurality of regions of said chassis (R1, R2, R3) or to a no-tap state, and emitting a host-device input event (410) as a function of said classification;
- (f) wherein the loudspeaker(s) of (a) and the microphone(s) of (b) are integral to said host device and serve as the sole acoustic transducers for said tap detection, with no external sensor, no piezoelectric or accelerometer surface instrumentation, and no surface attachment recruited for said detection; and
- (g) wherein raw audio samples received at said microphone are not retained on-host beyond feature extraction, and no raw audio, no impulse-response sample stream, and no per-user identifier is transmitted off-host.
2. The method of claim 1, wherein said feature vector (212) comprises lag and amplitude of the two strongest impulse-response peaks within said temporal window, energy in three temporal sub-bands, and a temporal centroid.
3. The method of claim 1, wherein said classification is performed by a k-nearest-neighbor classifier (214, 316) trained on a small per-user calibration set covering each of said plurality of regions and a no-tap baseline.
4. The method of claim 1, wherein said host-device input event (410) is mapped via a configurable map to one of an application-launch command, a window-focus command, a media-control command, and an accessibility command.
5. A host computing device, comprising:
- (a) at least one loudspeaker (102, 104) and at least one microphone (110, 112, 114) integral to said device;
- (b) one or more processors implementing a chirp generator (302), deconvolution (312) and feature extractor (314), classifier (316), and event dispatch (318); and
- (c) a non-transitory memory storing instructions which, when executed by said processors, cause said device to perform the method of any of claims 1 – 4.
Part V · Appendices · Prior-art bibliography
- Wang, J. et al. VibSense: Sensing Touches on Ubiquitous Surfaces through Vibration. SECON 2017. (Best Paper.)
- Nandakumar, R. et al. UbiTap: Leveraging Acoustic Dispersion for Ubiquitous Touch Interfaces on Solid Surfaces. SenSys 2018.
- Liang, Y. et al. S-UbiTap: Scalable Ubiquitous Touch on Commodity Phones. IEEE TMC 2023.
- Zhang, Z. et al. MM-Tap: mm-Level Acoustic Touch on Ad-hoc Surfaces. IEEE TMC 2023.
- Su, Y. et al. EclipseTouch: Touch Segmentation on Ad Hoc Surfaces using Worn Infrared Shadow Casting. UIST 2025.
- Apple Inc. Force-Sensitive Input Device. US 10,162,447 B2.
- Lyon, R. F. Speech Recognition in Scale Space. Acoustic-ToF foundational text, IEEE ASSP 1982.
Part VI · Execution · Version · v1.0 · Skeleton
- v1.0 · 2026-05-12 · Skeleton draft. Cover, 4 drawings, summary, 5 claims, short bibliography. Gate-pending the W1 experiment.
The applicant retains this draft in personal records. No filing has been made; no priority claim is asserted. Promotion of this page to a full provisional-application draft (in the form of /echocast or /roommirror) is conditioned on the W1 gate reaching ≥ 70 % kNN(3) leave-one-out accuracy on a four-class chassis-tap task.
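The W1 gate criterion could be evaluated along the following lines. This is a sketch under assumptions: Euclidean distance on unscaled features, majority vote with ties broken toward the smallest label, and the hypothetical helper names `knn_loo_accuracy` and `passes_w1_gate`; only k = 3, leave-one-out evaluation, and the 70 % threshold come from the text.

```python
import numpy as np


def knn_loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a k-nearest-neighbour majority vote."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all calibration samples.
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)            # leave-one-out: a sample never votes for itself
    neighbours = np.argsort(dist, axis=1, kind="stable")[:, :k]

    correct = 0
    for i in range(len(y)):
        labels, counts = np.unique(y[neighbours[i]], return_counts=True)
        predicted = labels[np.argmax(counts)]  # ties resolve to the smallest label
        correct += int(predicted == y[i])
    return correct / len(y)


def passes_w1_gate(accuracy, threshold=0.70):
    """True when the measured accuracy meets the stated gate."""
    return accuracy >= threshold


# Usage on a toy four-sample set; real input would be 8-dim feature vectors
# labelled R1, R2, R3, or no-tap.
acc = knn_loo_accuracy([[0.0], [1.0], [2.0], [10.0]], [0, 0, 0, 1])
print(acc, passes_w1_gate(acc))
```

With a small per-user calibration set, leave-one-out makes use of every sample for both training and testing, at the cost of a quadratic distance computation that is negligible at this scale.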