Deep Room Recognition Using Inaudible Echos UbiComp '18, September 2018 Qun Song, Chaojie Gu, Rui Tan, Nanyang Technological University
ABSTRACT Mobile applications increasingly need localization; the speaker outputs a 2 ms single-tone inaudible chirp and the microphone captures the echoes in a narrow inaudible band for 0.1 s; deep learning captures the subtle fingerprints in the echoes; a two-layer convolutional neural network (CNN) achieves the best performance; a RoomRecognize cloud service and client are designed; infrastructure-free, with no add-on hardware; robust against interfering sounds.
Physical perspective
Software Framework
CONTENTS 01 INTRODUCTION 02 RELATED WORK 03 MEASUREMENT STUDY 04 DEEP ROOM RECOGNITION 05 DEEP ROOM RECOGNITION CLOUD SERVICE 06 PERFORMANCE EVALUATION 07 DISCUSSIONS 08 CONCLUSION
01 SECTION INTRODUCTION
1. INTRODUCTION Indoor localization uses various sensing modalities: RF, visible light (VL), imaging, acoustics, geomagnetism; each sensing modality has limitations; this paper designs a room-level localization approach for off-the-shelf smartphones using their audio systems.
1. INTRODUCTION Room-level localization is widely desirable. Requirements of existing indoor localization systems: R1. dedicated or existing infrastructure; R2. add-on equipment; R3. a (training) process for data collection. Acoustic-based room-level localization needs only R3; it is cast as a supervised multiclass classification problem; training data collection is easy (enter, click, key in) and needs no expertise.
1. INTRODUCTION Existing indoor localization systems that incorporate acoustic sensing: SurroundSense (its acoustic modality only outperforms random guessing; uses a wide audible band; susceptible to ambient sound). Two basic challenges for a room recognition system: privacy concerns and annoyance (hence 20 kHz); limited information about the measured room.
1. INTRODUCTION Emerging deep learning (DL) methods have been demonstrated in image classification, speech recognition, NLP, etc. This paper presents the design of a deep room recognition approach. Experiments show that a two-layer CNN fed with the spectrogram of the captured inaudible echoes achieves the best performance, using a 100 ms audio recording after a 2 ms, 20 kHz single-tone chirp. Note: Batphone needs a 10-second audio recording, which is not good for privacy.
1. INTRODUCTION Contributions: an in-depth measurement study of rooms' acoustic responses to a short single-tone inaudible chirp; design of the deep model; evaluation of the approach in the real world; engineering implementation. Note: a chirp signal is one whose frequency changes over time.
02 SECTION RELATED WORK
2. RELATED WORK Infrastructure-dependent: RF infrastructure, 802.11 [8,18], cellular [20], FM radio [10], aircraft broadcast [15]; WALRUS uses inaudible beacons to localize mobile devices. Infrastructure-free: geomagnetism [11], imaging [13], acoustics. Acoustics is divided into passive sensing: SurroundSense [36]; and active sensing: semantic location [16,24,34], RoomSense [31] (audible, 0.68 s, SVM, MFCC).
2. RELATED WORK Active acoustic sensing: ranging (BeepBeep [30], SwordFight [40]); moving object tracking (finger [38], breath [28], a human body using inaudible chirps [29]); gesture recognition (SoundWave [17], Doppler-shifted reflections).
03 SECTION MEASUREMENT STUDY
3. MEASUREMENT STUDY Lab setting: measured rooms are labeled 'Lx', open areas 'OA'.
3.1 Passive Acoustic Sensing Confusion matrix of Batphone, from which its performance can be seen. Guoguo, an infrastructure-based acoustic system, can obtain high accuracy.
3.2 Rooms' response to single-tone chirps Use the loudspeaker to emit an acoustic chirp and the microphone to capture the measured room's response; conduct a measurement study to obtain insightful observations on the rooms' responses; emit a 2 ms chirp every 100 ms; 44.1 kHz sampling rate; the chirp does not overlap echoes from objects at least 34 cm away. Note: choose two spots in each room, place a phone at each spot, and run the program for half an hour to reach sufficient statistical significance; Section 6 investigates the minimum amount of training data.
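A minimal sketch of the emission pattern on this slide: one 2 ms single-tone burst followed by silence, repeated every 100 ms at a 44.1 kHz sampling rate. The 20 kHz tone comes from the other slides; the function names and the 343 m/s speed of sound are assumptions made for illustration, and the authors' actual client code may differ.

```python
import math

FS = 44_100             # sampling rate in Hz (from the slide)
TONE_HZ = 20_000        # chirp frequency in Hz (from the slides)
CHIRP_S = 0.002         # chirp duration: 2 ms
PERIOD_S = 0.100        # one chirp every 100 ms
SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def make_period(fs=FS, tone_hz=TONE_HZ, chirp_s=CHIRP_S, period_s=PERIOD_S):
    """Return one emission period: a single-tone burst followed by silence."""
    n_chirp = round(chirp_s * fs)
    n_total = round(period_s * fs)
    burst = [math.sin(2 * math.pi * tone_hz * n / fs) for n in range(n_chirp)]
    return burst + [0.0] * (n_total - n_chirp)


def min_non_overlap_distance(chirp_s=CHIRP_S, c=SPEED_OF_SOUND):
    """Distance beyond which an echo returns only after the chirp has ended."""
    return c * chirp_s / 2  # sound travels to the reflector and back


period = make_period()
print(len(period), "samples per period")
print(round(min_non_overlap_distance() * 100, 1), "cm")
```

Under the assumed speed of sound, the round trip during a 2 ms chirp gives roughly 34 cm, consistent with the figure on the slide.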
3.2 Rooms' response to single-tone chirps Existing active acoustic sensing uses sine sweep chirps, maximum length sequences, and multi-tone chirps; this work proposes a single-tone inaudible chirp to avoid annoying the user and to improve the robustness of room recognition against interfering sounds.
3.2 Rooms' response to single-tone chirps 20 kHz and 21 kHz chirps. Note: above 20 kHz, the echoes are heavily distorted.
3.2 Rooms' response to single-tone chirps The decreasing trend indicates that the audio system's performance decreases with frequency; choose 20 kHz, the lowest inaudible frequency; Android Near Ultrasound Tests. Note: any sound can be decomposed into a superposition of sine waves of different frequencies and intensities; this transformation (or decomposition) is called the Fourier transform. Ordinary sounds always span a range of frequencies. The human ear can hear frequencies between 20 Hz and 20,000 Hz. Waves above this range are called ultrasound, and those below it are called infrasound.
3.2 Rooms' response to single-tone chirps 2 ms, 0.5 ms, 97.5 ms echo data period; 13.8 ms in L3 and outdoors; correlation relationship.
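3.2 Rooms' response to single-tone chirps Frequency-domain analysis: the hypothesis is that different rooms have different frequency responses; a 97.5 ms echo data period gives a frequency resolution of 10.3 Hz. Note: 4 s of audio used for data collection covers 40 echo data periods, which may raise privacy concerns and increase the computation load; minimize the duration of audio that is processed to reduce privacy concerns and computation load.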
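The numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes the 44.1 kHz sampling rate and the 100 ms chirp interval quoted on the earlier slides; the variable names are illustrative only.

```python
FS = 44_100               # sampling rate in Hz (from an earlier slide)
ECHO_PERIOD_S = 0.0975    # echo data period: 97.5 ms
CHIRP_INTERVAL_S = 0.100  # one chirp every 100 ms (from an earlier slide)
RECORDING_S = 4.0         # length of the longer recording discussed in the note

# Number of samples in one echo data period (4299.75, rounded to 4300).
echo_samples = round(ECHO_PERIOD_S * FS)

# Frequency resolution of a transform over one echo data period: 1 / T.
freq_resolution_hz = 1.0 / ECHO_PERIOD_S

# Number of echo data periods contained in a 4 s recording.
periods_in_recording = round(RECORDING_S / CHIRP_INTERVAL_S)

print(echo_samples)                  # 4300
print(round(freq_resolution_hz, 1))  # 10.3
print(periods_in_recording)          # 40
```

A single 97.5 ms period is therefore one fortieth of the 4 s recording, which is the motivation the note gives for processing as little audio as possible.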
04 SECTION DEEP ROOM RECOGNITION
4.1 Background and Problem Statement Traditional classification algorithms: Bayes classifier and SVM; dimension reduction: MFCC, PLP coefficients.
4.2 Raw Data Format/Deep Model PSD and spectrogram are two possible raw data formats for deep learning; apply an FFT to the 4300 data points in the echo data period; use the 147 points in [19.5, 20.5] kHz.
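As a rough illustration of the PSD format, the sketch below evaluates a direct DFT only at the bins that fall inside a chosen band and applies a periodogram-style scaling. The function name `band_psd`, the scaling, and the synthetic test tone are assumptions for illustration, not the authors' pipeline. The number of bins the sketch returns follows from its own transform length and band edges and need not match the 147 points quoted on the slide.

```python
import cmath
import math


def band_psd(samples, fs, f_lo, f_hi):
    """Return [(freq_hz, power)] for the DFT bins of `samples` in [f_lo, f_hi]."""
    n = len(samples)
    k_lo = math.ceil(f_lo * n / fs)   # first bin at or above f_lo
    k_hi = math.floor(f_hi * n / fs)  # last bin at or below f_hi
    out = []
    for k in range(k_lo, k_hi + 1):
        w = -2j * math.pi * k / n
        coeff = sum(x * cmath.exp(w * i) for i, x in enumerate(samples))
        # Periodogram-style power estimate for bin k.
        out.append((k * fs / n, abs(coeff) ** 2 / (fs * n)))
    return out


# Synthetic stand-in for an echo: 10 ms of a pure 20 kHz tone at 44.1 kHz.
fs = 44_100
tone = [math.sin(2 * math.pi * 20_000 * i / fs) for i in range(441)]
psd = band_psd(tone, fs, 19_500, 20_500)
peak_freq, _ = max(psd, key=lambda fp: fp[1])
print(len(psd), "bins, peak at", peak_freq, "Hz")
```

Computing only the in-band bins keeps the sketch dependency-free; with an FFT library one would normally transform the whole frame and slice the band afterwards.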
4.2 Raw Data Format/Deep Model 22,000 samples from 22 rooms, each sample 100 ms long; split into a training set, a validation set, and a testing set.
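One possible way to form the three sets is a per-room (stratified) split, sketched below. The slide does not give the split ratios or the per-room sample counts, so the 70/15/15 ratios, the equal 1000 samples per room, and the function name `stratified_split` are all assumptions.

```python
import random


def stratified_split(labels, train=0.7, val=0.15, seed=0):
    """Split sample indices room by room into train/validation/test lists."""
    rng = random.Random(seed)
    by_room = {}
    for idx, room in enumerate(labels):
        by_room.setdefault(room, []).append(idx)
    splits = {"train": [], "val": [], "test": []}
    for idxs in by_room.values():
        rng.shuffle(idxs)
        n_train = round(len(idxs) * train)
        n_val = round(len(idxs) * val)
        splits["train"] += idxs[:n_train]
        splits["val"] += idxs[n_train:n_train + n_val]
        splits["test"] += idxs[n_train + n_val:]  # remainder goes to testing
    return splits


# Assumed layout: 22 rooms with 1000 samples each (22,000 in total).
labels = [room for room in range(22) for _ in range(1000)]
splits = stratified_split(labels)
print({name: len(idxs) for name, idxs in splits.items()})
```

Splitting per room keeps every room represented in each set, which matters for a multiclass classifier evaluated on all rooms.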
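4.3 Hyperparameter Settings Note: in theory, the SLAM problem can be solved by graph optimization; in practice, however, SLAM frameworks are sensitive to false positives; a new method is proposed to address this.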
05 SECTION DEEP ROOM RECOGNITION CLOUD SERVICE
5.1 System Overview RoomRecognize supports a participatory learning mode, in which the CNN is retrained when a mobile client uploads labeled training samples; existing studies show that smartphones and even lower-end IoT platforms can run deep models. Note: 1. clients are often offline.
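The participatory learning flow (upload a labeled sample, retrain, then recognize) can be sketched with a toy in-memory service. Everything here is hypothetical: the class name `ParticipatoryRoomService`, its methods, and the nearest-centroid model, which stands in for the CNN purely to keep the sketch dependency-free. It does not reflect the real RoomRecognize API.

```python
class ParticipatoryRoomService:
    """Toy stand-in for a cloud service that retrains on every labeled upload."""

    def __init__(self):
        self.samples = {}    # room label -> list of feature vectors
        self.centroids = {}  # the "model": one mean feature vector per room

    def upload(self, room, feature):
        """Store a labeled training sample from a client, then retrain."""
        self.samples.setdefault(room, []).append(list(feature))
        self._retrain()

    def _retrain(self):
        # Placeholder for CNN retraining: recompute each room's mean vector.
        self.centroids = {
            room: [sum(col) / len(vecs) for col in zip(*vecs)]
            for room, vecs in self.samples.items()
        }

    def recognize(self, feature):
        """Return the room whose centroid is closest to `feature`."""
        if not self.centroids:
            raise ValueError("no training samples uploaded yet")

        def dist2(centroid):
            return sum((a - b) ** 2 for a, b in zip(feature, centroid))

        return min(self.centroids, key=lambda room: dist2(self.centroids[room]))


service = ParticipatoryRoomService()
service.upload("L1", [0.0, 0.0])
service.upload("L1", [0.2, 0.0])
service.upload("L2", [1.0, 1.0])
print(service.recognize([0.1, 0.1]))
```

Retraining on every upload is the simplest policy; a real service would more likely batch uploads before retraining.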
06 SECTION PERFORMANCE EVALUATION
6.1 Evaluation Methodology Because of the low performance of passive acoustic sensing, RoomRecognize is compared only with other approaches based on active acoustic sensing, namely RoomSense.
07 SECTION DISCUSSIONS
7 DISCUSSIONS Moving people in the target room degrade RoomRecognize's performance because of the human body; to address this issue, other sensing modalities such as geomagnetism can be used in future work; similar rooms are worth studying.
08 SECTION CONCLUSION
8 CONCLUSION To address the challenge of the limited information carried by a room's response in such a narrow band, deep learning is applied to capture the subtle differences in rooms' responses; a two-layer CNN fed with the spectrogram of the echo; RoomRecognize.
Software Framework
Thank you!