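Offline Data Processing for the Daya Bay Experiment
Miao He (何苗), Institute of High Energy Physics, Chinese Academy of Sciences
6 June 2017
Chengdu Documentation and Information Center, Chinese Academy of Sciences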
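The Daya Bay Reactor Neutrino Experiment
With joint support from the Ministry of Science and Technology, the National Natural Science Foundation of China, the Chinese Academy of Sciences, local governments, and China General Nuclear Power Group (CGN), the Daya Bay experimental facility was designed and built (completed in 2012). Its goal is to measure the neutrino mixing angle θ13, and its performance is world-leading.
China was responsible for all civil construction and half of the detectors, led by IHEP.
The United States was responsible for about half of the detectors; Russia, the Czech Republic, Hong Kong, and Taiwan made substantial contributions.
3000 m of tunnel
5 underground experimental halls
8 neutrino detectors of 110 tons each
3 water Cherenkov detectors (4400 tons of purified water)
3200 m² of resistive plate chamber (RPC) detectors
8000 channels of electronics readout
2017-06-06 Offline Data Processing for the Daya Bay Experiment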
Daya Bay operation history
One of the detectors has been withdrawn from normal operation and is dedicated to JUNO liquid scintillator studies.
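Physics results from Daya Bay
Discovery of a new type of neutrino oscillation, PRL 108, 171803 (2012); cited 1689 times (QSPIRES)
Detector performance and establishment of the analysis methods, NIM A685, 78 (2012); cited 100 times

Year | Paper | Citations | Remarks
2013 | Improved θ13 measurement, CPC 37, 011001 | 342 |
2014 | θ13 spectral analysis and mass-squared difference, PRL 112, 061801 | 234 | Editors' Suggestion
2014 | θ13 measurement with neutron capture on hydrogen, PRD 90, 071101 (2014) | 41 |
2014 | Search for sterile neutrinos, PRL 113, 141802 | 59 |
2015 | θ13 and mass-squared difference (8 ADs), PRL 115, 111802 | 103 |
2016 | Reactor neutrino spectrum, PRL 116, 061801 | 57 |
2016 | Improved θ13 measurement with neutron capture on hydrogen, PRD 93, 072011 | 15 | Highlight
2016 | Improved sterile neutrino search, PRL 117, 151802 | 13 |
2016 | Joint sterile neutrino analysis, PRL 117, 151801 | 14 |
2017 | Improved reactor neutrino spectrum, CPC 41, 013002 | 12 |
2017 | Improved θ13 measurement (4% precision), PRD 95, 072006 | 6 |

Citation counts as of March 2017.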
Daya Bay data taking
Multiple data streams from each experimental hall (EH)
Data taking efficiency
  Data taking time > 97%
  Physics data taking time > 95%
Typical trigger rate
  EH1: 1.3 kHz
  EH2: 1.0 kHz
  EH3: 0.6 kHz
Data volume
  320 raw data files per day, 1 GB per file
  Raw data volume: 100 TB/year
[Figure: online data-taking monitoring of trigger rate, with panels for EH1, EH2, and EH3. Annotations: physics run, ~48 hours/run; pedestal run and electronics diagnosis run.]
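As a quick cross-check of the data-volume figures on this slide, the file count and file size can be multiplied out. This is only a back-of-the-envelope sketch: it assumes every file is exactly 1 GB, that files are written every day of the year, and that 1 TB = 1000 GB. The function name is hypothetical.

```python
# Rough upper estimate of the yearly raw data volume from the slide's numbers.
# Assumptions (not from the slide): every file is exactly 1 GB, data are
# written 365 days a year, and decimal units are used (1 TB = 1000 GB).

FILES_PER_DAY = 320
GB_PER_FILE = 1.0
DAYS_PER_YEAR = 365


def raw_volume_tb_per_year(files_per_day, gb_per_file, days=DAYS_PER_YEAR):
    """Return the yearly raw data volume in TB."""
    return files_per_day * gb_per_file * days / 1000.0


if __name__ == "__main__":
    volume = raw_volume_tb_per_year(FILES_PER_DAY, GB_PER_FILE)
    print(f"Estimated raw data volume: {volume:.1f} TB/year")
```

Under these assumptions the estimate comes out somewhat above the quoted 100 TB/year, which is the same order of magnitude.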
A global picture of data processing
[Diagram: data flow between sites. Recoverable labels:
  Sites: Daya Bay onsite; IHEP/lxslc5; LBNL/pdsf, each with a spade instance handling raw data
  Processing: onsite data processing/PQM; Keep-Up Production
  Databases: Online DB; Onsite DB; Offline DB; DCS DB; Data Quality DB; DB scraper (shown twice)
  Monitoring: DQ strip charts at SJTU; Offline Data Monitor (ODM) at IHEP/LBNL]
Data transfer and storage
Raw data availability after a file is closed:
  Daya Bay onsite: within 5 minutes
  IHEP, Beijing: within 10~15 minutes
  LBNL, California: within 15~20 minutes
Data storage:
  Two copies of raw data on disk: one at IHEP, the other at LBNL
  Four copies of raw data on tape: two at IHEP, two at LBNL
  Disk at IHEP: 2.0 PB (1.4 PB used)
[Figure: data transfer monitoring, Daya Bay to IHEP and IHEP to LBNL]
Offline software (1)
Software framework (NuWa): Neutrino at Daya Wan (女娲, Nüwa)
Adoption of the LHCb/ATLAS Gaudi framework
  Provides a fully developed component system for simulation, reconstruction and analysis
Bitten-slave based automatic building and testing system running on multiple offline servers
Extension of the Gaudi Transient Event Store (TES) to an Archive Event Store (AES) for prompt-delayed analysis
  Keeps data objects in memory across execution cycles
  Allows users to look for correlated events in the past
  Configurable based on TES location
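The idea behind keeping past events in memory for prompt-delayed analysis can be illustrated with a small stand-alone sketch. This is not NuWa or Gaudi code: the `Event` and `EventArchive` classes, the `find_pairs` function, and the energy and time cuts are all hypothetical, illustrative choices.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    time_us: float      # trigger time in microseconds
    energy_mev: float   # reconstructed energy in MeV


class EventArchive:
    """Keeps recent events in memory so later events can look back at them."""

    def __init__(self, window_us):
        self.window_us = window_us
        self._events = deque()

    def add(self, event):
        self._events.append(event)
        # Drop events that are too old to pair with anything that follows.
        while event.time_us - self._events[0].time_us > self.window_us:
            self._events.popleft()

    def earlier_than(self, event):
        return [e for e in self._events if e.time_us < event.time_us]


def find_pairs(events, window_us=200.0, min_dt_us=1.0,
               prompt_range=(0.7, 12.0), delayed_range=(6.0, 12.0)):
    """Return (prompt, delayed) pairs from a time-ordered event sequence.

    The cut values are illustrative defaults, not the experiment's settings.
    """
    archive = EventArchive(window_us)
    pairs = []
    for ev in events:
        if delayed_range[0] <= ev.energy_mev <= delayed_range[1]:
            # Delayed candidate: search the archive for a matching prompt.
            for past in archive.earlier_than(ev):
                dt = ev.time_us - past.time_us
                in_time = min_dt_us <= dt <= window_us
                in_energy = prompt_range[0] <= past.energy_mev <= prompt_range[1]
                if in_time and in_energy:
                    pairs.append((past, ev))
        archive.add(ev)
    return pairs
```

The archive is pruned as new events arrive, so memory use is bounded by the coincidence window rather than by the length of the run.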
Offline software (2)
A Lightweight Analysis Framework (LAF)
Compatible with NuWa data objects, with higher I/O performance
Flexible data buffer allows access to events both backwards and forwards
Multiple analysis modules can run simultaneously
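A buffer that exposes events both before and after the current one, shared by several analysis modules, might look roughly like the sketch below. It does not reproduce the LAF interface: `BufferView`, `run_modules`, and the two example modules are hypothetical names, events are reduced to plain timestamps, and for brevity the whole event list is held in memory rather than streamed.

```python
class BufferView:
    """A window of events centred on the current one."""

    def __init__(self, window, pos):
        self._window = window
        self._pos = pos

    def get(self, offset=0):
        """Return the event at the given offset from the current event.

        Negative offsets look backwards, positive offsets look forwards.
        Returns None when the offset falls outside the buffered window.
        """
        i = self._pos + offset
        return self._window[i] if 0 <= i < len(self._window) else None


def run_modules(events, modules, lookback=2, lookahead=2):
    """Call every module once per event, each with the same buffer view."""
    events = list(events)
    for i in range(len(events)):
        lo = max(0, i - lookback)
        hi = min(len(events), i + lookahead + 1)
        view = BufferView(events[lo:hi], i - lo)
        for module in modules:
            module(view)


class Counter:
    """Example module: counts processed events."""

    def __init__(self):
        self.n = 0

    def __call__(self, view):
        self.n += 1


class IsolationTagger:
    """Example module: records events with no neighbour within `gap`."""

    def __init__(self, gap):
        self.gap = gap
        self.isolated = []

    def __call__(self, view):
        cur, prev, nxt = view.get(0), view.get(-1), view.get(1)
        quiet_before = prev is None or cur - prev > self.gap
        quiet_after = nxt is None or nxt - cur > self.gap
        if quiet_before and quiet_after:
            self.isolated.append(cur)
```

Running `run_modules(times, [Counter(), IsolationTagger(gap=2.0)])` processes each event once while both modules read from the same buffer.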
Database
Calibration and reconstruction
Calibration sources
  Weekly calibration runs: LED, radioactive sources
  Calibration samples in physics runs: PMT dark noise, spallation neutrons (spn)
Calibration automation
  File-by-file track (rolling gain, spn energy scale, channel quality): automated accumulation of calibration samples and generation of calibration constants
  Run-by-run track (LED gain, Co60 energy scale): semi-automated search for calibration data and generation of calibration constants
Calibration constants version control
  Uses the rollback date of DBI: choose the latest calibration constants that were inserted into the offline database before the rollback date
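The rollback-date rule for versioning calibration constants can be sketched as a simple selection over records that carry both a validity range and an insertion time. This is a simplified model and not the DBI API: the `CalibRecord` fields and the `select_constant` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CalibRecord:
    valid_from: datetime   # start of the data period the constant applies to
    valid_to: datetime     # end of that period (exclusive)
    inserted: datetime     # when the record was written to the database
    value: float


def select_constant(records, event_time, rollback_date):
    """Pick the constant to use for data taken at event_time.

    Only records inserted before rollback_date are visible; among those
    valid for event_time, the most recently inserted one wins.
    Returns None if no record qualifies.
    """
    visible = [
        r for r in records
        if r.inserted < rollback_date
        and r.valid_from <= event_time < r.valid_to
    ]
    if not visible:
        return None
    return max(visible, key=lambda r: r.inserted)
```

Fixing the rollback date makes the selection reproducible: constants inserted later do not change the result of a production that has already been frozen.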
Onsite data processing and monitoring
Performance Quality Monitoring (PQM): quasi-real-time onsite data processing using the offline software
Raw data are processed file by file using the Portable Batch System (PBS)
16 dedicated CPU cores and 40 additional cores shared with users
Web display
Latency: ~40 minutes
Example 1: number of blocked triggers in one event vs. run time
Example 2: reconstructed energy distribution for all triggers in one AD
Offsite data processing and monitoring
A "keep-up" production (KUP) runs at IHEP and LBNL, using the latest calibration constants and full reconstruction
KUP jobs are triggered by data transfer
Web display using an Offline Data Monitor (ODM)
Computing resources used: ~30 cores
Latency: ~3 hours
[Figures: the multifunctional Daya Bay ODM; example plots on the ODM]
Physics data production
Physics production (PP) uses validated and frozen calibration constants and reconstruction algorithms.
Software version and calibration version are kept separate.
A special production strategy was used for the first two publications:
  Fix the offline software, update the calibration constants, and extend the production on a weekly basis
Data volume: ~1.2 × raw data up to 2014; 0.6 × raw data now.
Physics production takes place once per year, with 1-2 months of processing time for each production.
[Figure labels: weekly data production; weekly calibration]
Data Management
Reconstruction data set: P15A
  ls /dybfs/rec/
    P11A/ P12A/ P12B/ P12C/ P12D/ P12E/ P13A/ P14A/ P14B/ P15A/
  ls /dybfs/rec/P15A/GoodRun_6ADv1_8ADv3/
    EH1/ EH2/ EH3/
  ls /dybfs/rec/P15A/GoodRun_6ADv1_8ADv3/EH1
  more /dybfs/rec/P15A/GoodRun_6ADv1_8ADv3/EH1/run45290.list
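The directory layout above lends itself to building run-list paths programmatically. The sketch below composes such a path and reads the list. The helper names are hypothetical, and the slide does not show what a `.list` file contains, so the code assumes one entry per line.

```python
from pathlib import PurePosixPath


def run_list_path(production, hall, run,
                  good_run_tag="GoodRun_6ADv1_8ADv3", root="/dybfs/rec"):
    """Compose the path of a run list following the layout on the slide."""
    return PurePosixPath(root) / production / good_run_tag / hall / f"run{run}.list"


def read_run_list(path):
    """Return the non-empty lines of a run list file.

    Assumption: the file holds one entry per line; the actual format is
    not shown on the slide.
    """
    with open(str(path)) as handle:
        return [line.strip() for line in handle if line.strip()]


if __name__ == "__main__":
    print(run_list_path("P15A", "EH1", 45290))
```

Calling `run_list_path("P15A", "EH1", 45290)` reproduces the path used in the last command on the slide.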
Data quality
Good/bad file tagging based on automatic and manual checks of data quality
Data quality strip charts: history of the detector performance
  Daily automatic check
  Manual check and comment
[Figures: antineutrino candidate rate; energy scale stability]
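Summary: characteristics of Daya Bay data and data processing
Multiple data streams: the three experimental halls take data independently, and their data are transferred, stored, and reconstructed independently.
Two offline data center sites: (1) IHEP; (2) Berkeley Lab (LBNL). Each stores the data and performs reconstruction and analysis independently.
Time-correlated event analysis: a reactor neutrino produces several signals that are correlated in time, so the software must provide an event buffer. This makes event selection more difficult.
Cumulative data analysis: every physics analysis uses all data accumulated so far. To keep reconstruction and analysis efficient, the raw data are always kept on disk.
Real-time data monitoring: three levels of monitoring, namely online monitoring (DQM), onsite offline monitoring (PQM), and offline site monitoring (ODM), ensure timely feedback on and control of data quality.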