Offline Data Processing for High Energy Physics Experiments. Li Weidong, Computing Center, 2016/11/04.


1 Offline Data Processing for High Energy Physics Experiments. Li Weidong, Computing Center, 2016/11/04

2 Contents
System overview
Monte Carlo simulation
Calibration and reconstruction
Gaudi software framework
Implementation of the BESIII software platform
Data and job management
References

3 System Overview

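4 BEPCII/BESIII
The major upgrade of the Beijing Electron Positron Collider (BESIII/BEPCII) has a total investment of 640 million yuan and aims to raise the collider luminosity by a factor of about 100.
A high energy physics experiment consists of the accelerator, the detector, the offline software and the physics analysis.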

5 BESIII Detector
Main drift chamber (MDC): σ_xy = 130 μm, σ_p/p = 0.5% @ 1 GeV, σ_dE/dx = 6-7%
Superconducting magnet: 1.0 Tesla
Time-of-flight counter (TOF): σ_T = 90 ps (barrel), 110 ps (endcap)
Muon counter: RPC based
Calorimeter (EMC): σ_E/E = 2.5% @ 1 GeV; σ_z,φ (value not preserved in the transcript)

6 Offline Data Processing
[Flow diagram. Labels: detector, event filter (online selection), raw data, event reconstruction, reconstructed data, event classifying, processed data, DST data, event simulation.]

7 Offline Data Processing Tasks
Traditional offline data processing:
- Physics generators
- Detector simulation
- Calibration and alignment
- Reconstruction
- Physics analysis

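8 Overall Software Design Approach (1)
Requirements: the BESIII experiment has a lifetime of about 10 years and will produce petabyte-scale experimental data.
- Stay in line with advanced international high energy physics software technology.
- Adopt the general-purpose GAUDI framework with component-oriented and object-oriented technology.
- Make wide use of external HEP libraries such as ROOT, CLHEP, Geant4 and GDML.
- Main programming languages: C++, Java, Python.
- Follow standardized software configuration management, testing and release procedures to ensure software quality.
- Use databases to manage the large number of experiment-related parameters, with automatic synchronization between remote sites.
- Let the software system adapt to upgrades of computer hardware and operating systems, and to grid/cloud computing environments.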

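9 Overall Software Design Approach (2)
Build a complete set of core algorithms for offline data processing and physics analysis, including detector simulation, data reconstruction and physics analysis tools.
At the same time, make offline data processing efficient and automated.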

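10 Scale of the Offline Software
- More than 300 offline software packages
- About 470,000 lines of source code
- 6 years of development at 30 people per year
- Trained 25 PhDs and 10 postdocs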

11 Hardware Environment
[Diagram. Labels: document management, web content management, work nodes (4200 CPU cores), disk storage (1.3 PB), tape library (5 PB), IaaS/PaaS/SaaS.]

12 Principles and Applications of Simulation in Particle Physics Experiments
Slides provided by Liu Huaimin (research fellow)

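13 Simulation in HEP: What Is Simulated?
Everything from the reaction to data acquisition.
- Reactions and physics events produced in beam collisions (event generator): compute the differential cross section from the reaction mechanism and produce the four-momenta of the final-state particles.
- Transport and interactions of particles in the detector (tracking): from the detector structure (geometry and material), compute the cross sections of all possible interactions and produce the hit information in the sensitive detectors.
- Detector response and signal generation (digitization): from the detector's working principle and the hit information, produce the output signals. This step relies on many beam-test results, such as efficiencies and resolutions.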

14 [Diagram. Labels: physics, accelerator, e+ e-, generator, event, tracking, detector, hit, digitization, digit, MC, data, reconstruction, track, efficiency, background, analysis, signal, uncertainty.]

15 [Diagram: the decay J/ψ → ρ0 π0 with ρ0 → π+ π- and π0 → γ γ. BESI or BESII: head-on e+ e- collision, J/ψ at rest. BESIII: crossing-angle e+ e- collision, J/ψ moving.]

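16 Signal and Background
J/ψ → ρ0 π0, ρ0 → π+ π-
Invariant mass: $E_1^2 = p_1^2 + m_1^2$; $E_2^2 = p_2^2 + m_2^2$; $M_{12}^2 = (E_1 + E_2)^2 - (\vec{p}_1 + \vec{p}_2)^2$
[Plot: signal and background as a function of $M_{12}$.]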
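
As an illustration of the invariant-mass formula on this slide, here is a minimal sketch in natural units (GeV, c = 1). The function name `invariant_mass`, the pion mass constant and the example momenta are assumptions made for the illustration; they do not come from the BESIII software.

```python
import math

def invariant_mass(p1, m1, p2, m2):
    """Invariant mass of two particles given 3-momenta (GeV) and masses (GeV)."""
    # E^2 = p^2 + m^2 for each particle
    e1 = math.sqrt(sum(c * c for c in p1) + m1 * m1)
    e2 = math.sqrt(sum(c * c for c in p2) + m2 * m2)
    # Vector sum of the two momenta
    p_sum = [a + b for a, b in zip(p1, p2)]
    m_sq = (e1 + e2) ** 2 - sum(c * c for c in p_sum)
    # Guard against tiny negative values from rounding
    return math.sqrt(max(m_sq, 0.0))

M_PI = 0.13957  # charged pion mass in GeV (illustrative constant)

# Two back-to-back pions: the total momentum vanishes, so M12 = E1 + E2.
m12 = invariant_mass((0.3587, 0.0, 0.0), M_PI, (-0.3587, 0.0, 0.0), M_PI)
```

In an analysis, `m12` would be filled into a histogram for every π+ π- candidate pair; signal events pile up near the ρ0 mass while background is spread out.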

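17 Simulation and Reconstruction
- Simulation produces data (detector signals) from a known reaction process and known detection principles.
- Reconstruction recovers, from the detector signals, the information (particle species and four-momentum) of the final-state particles (γ, e, μ, π, K, p).
- Analysis recovers the reaction process from the reconstructed information (intermediate-state particles, whether new particles are produced or new physics appears).
- Both simulated and experimental data must go through reconstruction before physics analysis users can use them.
- In a sense, simulation is the inverse of reconstruction and analysis.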

18 Event: the record of a physics reaction after it passes through the detector (MC)

19 Event: the record of a physics reaction after it passes through the detector (data)

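20 Simulation in HEP: How to Simulate?
Sampling experiments, simulating the main processes.
- Generator: written from theory, or taken (ported) from event generator libraries such as jetset. Final-state four-momenta are sampled according to the differential cross section. For e+ e- → μ+ μ-, see the p2mumu generator.
- Tracking: uses simulation packages such as GEANT (GEometry ANd Tracking) and GCALOR. Whether an interaction occurs is sampled according to the cross section; the final state of the interaction is sampled according to the differential cross section.
- Digitization: based on the detector's working principle and the electronics design. Whether a signal is produced is sampled according to the efficiency; the signal amplitude is sampled from the spectrum. This step depends on the specific detector and also accounts for trigger and noise.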
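
To make "sampled according to the differential cross section" concrete, here is a minimal accept-reject sketch. It assumes the lowest-order angular distribution for e+ e- → μ+ μ-, dσ/dcosθ ∝ 1 + cos²θ; this is a textbook shape chosen for illustration, not the p2mumu generator, and the function name `sample_cos_theta` is hypothetical.

```python
import random

def sample_cos_theta(rng):
    """Draw cos(theta) with density proportional to 1 + cos^2(theta) on [-1, 1]."""
    while True:
        c = rng.uniform(-1.0, 1.0)  # propose uniformly in cos(theta)
        # The target shape has maximum 2 (at |c| = 1), so accept the proposal
        # with probability (1 + c^2) / 2.
        if rng.uniform(0.0, 2.0) <= 1.0 + c * c:
            return c

rng = random.Random(12345)  # fixed seed so the run is reproducible
sample = [sample_cos_theta(rng) for _ in range(20000)]

# Fraction of events with |cos(theta)| > 0.5; the analytic value is 0.59375.
forward_fraction = sum(1 for c in sample if abs(c) > 0.5) / len(sample)
```

The same pattern (propose, compare with the target density, accept or reject) applies to sampling an interaction by its cross section or a signal amplitude from a measured spectrum.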

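21 Simulation in HEP: Three Stages
- Experiment design stage: simulation of the physics goals and of the detector performance, usually with fast approximate methods (fast simulation).
- Full simulation stage: the simulation software is developed according to the detector design. The resulting simulation is used to debug the offline reconstruction and analysis software.
- Validation stage: compare with experimental data, tune the parameters of the interaction models and check the distributions.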

22 Common Detector Simulation Packages
- EGS3/EGS4 (Electron Gamma Shower)
- (package name not preserved in the transcript) (Oak Ridge National Laboratory)
- FLUKA (FLUktuierende KAskade) (INFN and CERN)
- GEANT3/4 (GEometry ANd Tracking) (CERN)
- MCNP (Monte Carlo N-Particle) (Los Alamos National Laboratory)

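23 GEANT
A software toolkit for simulating the transport and interactions of particles in matter.
Provided by the toolkit:
1. Detector description (geometry, materials, basic particle definitions)
2. Physics interactions and particle tracking (hadronic interactions usually come with several models, using third-party software)
Developed by the user:
1. Building the detector (sensitive detectors)
2. Event generator
3. Recording hit information (HITS)
4. Producing signals from the hits (DIGITS)
5. Building the MC-truth information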

24 Application Example: Measuring a Branching Ratio
Phys. Rev. D 70, , 2004

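25 Event Selection (Cuts): Keep the Signal, Suppress the Background
- Good charged track: Vxy<2cm, VZ<20cm, Pt<60MeV, cosθ<0.8
- Good photon: Ebsc>60MeV, θ(γπ)>100
- Good event: number of good charged tracks = 2, number of good photons >= 2
- Particle identification: 5C kinematic fit, ensuring that the good photons come from a π0
- Remove K+ K- π0 background events with a kinematic fit
- Remove background events: γ → ee, θ(ππ)>100; γee, γη'
These selection criteria yield (number not preserved in the transcript) events.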

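26 MC: Selection Efficiency (Acceptance)
- Generate 100000 events with the J/ψ → ρπ generator and pass them through detector simulation and reconstruction.
- Analyze them with the same event selection as above; 17830 events are selected.
- The resulting efficiency is 17.83%.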

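27 MC: Background Estimate (Contamination Rate)
- Generate 58M events with the J/ψ → anything generator and pass them through detector simulation and reconstruction.
- Analyze them with the same event selection as above; 3799 events are selected (K*K, γη').
- After normalization the contamination rate is 1.7%, which gives a correction factor of 98.3%.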
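
The arithmetic behind the efficiency on slide 26 and the correction factor on slide 27 can be sketched as follows. The binomial uncertainty is a standard estimate added here for illustration; it is not quoted on the slides, and the function name `selection_efficiency` is hypothetical.

```python
import math

def selection_efficiency(n_selected, n_generated):
    """Efficiency and its binomial uncertainty from MC counts."""
    eff = n_selected / n_generated
    # Binomial standard deviation of the efficiency estimate
    err = math.sqrt(eff * (1.0 - eff) / n_generated)
    return eff, err

# Numbers from slide 26: 17830 of 100000 generated events pass the cuts.
eff, err = selection_efficiency(17830, 100000)

# Slide 27 quotes a contamination rate of 1.7% after normalization;
# the purity correction factor is its complement.
contamination = 0.017
correction_factor = 1.0 - contamination
```

The normalization that turns 3799 selected background MC events into the 1.7% contamination needs the number of events in data, which the transcript does not give, so it is taken as an input here.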

28 MC: Error Analysis and Results

29 Precise Simulation Software for the BESIII Detector (slide dated 07/05/2012)

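30 Complex Geometry and Material Description
In high energy physics experiments, simulation is crucial for physics analysis. The simulation is built on Geant4 and covers detector geometry description, digitization and realistic detector effects.
A large number of irregular, complex structures are described precisely. A unified geometry data service based on the GDML format keeps the geometry description consistent across the software system.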

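31 Realistic Detector Simulation
Random-trigger data are used to overlay background signals on the simulated signals at hit level. The effect of the varying collision luminosity on the background mixing is taken into account, reproducing the conditions of the real detector.
[Plots: total number of MDC hits; TOF hit spectrum; number of EMC photons; number of MUC hits.]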

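32 Consistency between Simulation and Data
The key physics quantities from simulation agree well with data. Simulation and data differ by about 1% in tracking efficiency and particle identification efficiency, which is at an internationally advanced level.
The simulation runs stably and meets the requirements of large-scale production of simulated data.
[Plots: pion particle identification efficiency, simulation vs data; pion tracking efficiency, simulation vs data; EMC energy spectrum shape, simulation vs data; MUC track reconstruction efficiency, simulation vs data.]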

33 High-Performance Data Reconstruction Software

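34 Drift Chamber (MDC) Reconstruction
Track reconstruction performs pattern recognition on the raw drift chamber hits and fits the tracks, computing physical quantities such as the momentum and direction of charged particles.
Features: BESIII drift chamber reconstruction combines two track finders to solve the problem of track recognition under high background. One is based on template matching of track segments, the other on a conformal transformation.

Experiment          Spatial resolution (μm)   dE/dx resolution
CLEOIII (USA)       110                       5.0%
BaBar (USA)         125                       7%
Belle (Japan)       130                       5.6%
BESIII              135                       (not preserved in the transcript)

[Plot label: σ = 135 μm.]
The reconstruction efficiency for barrel Bhabha events is about 99%.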
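
The idea behind conformal-transformation track finding can be sketched briefly. A track from the interaction point is, in the transverse plane, a circle through the origin; the map u = x/(x²+y²), v = y/(x²+y²) turns such a circle into a straight line, which is much easier to search for. The snippet below only demonstrates this geometric property; the circle parameters and the name `conformal_map` are illustrative assumptions, and this is not the BESIII implementation.

```python
import math

def conformal_map(x, y):
    """Map a hit (x, y) to conformal coordinates (u, v)."""
    r2 = x * x + y * y
    return x / r2, y / r2

# Hypothetical circle through the origin with centre (a, b),
# so its radius is sqrt(a^2 + b^2).
a, b = 30.0, 40.0
radius = math.hypot(a, b)

# A few hits along the circle (angles chosen to avoid the origin itself).
hits = []
for phi_deg in (200, 220, 240, 260):
    phi = math.radians(phi_deg)
    hits.append((a + radius * math.cos(phi), b + radius * math.sin(phi)))

mapped = [conformal_map(x, y) for x, y in hits]

# On the circle x^2 + y^2 = 2ax + 2by, so the mapped hits satisfy the
# straight-line equation 2a*u + 2b*v = 1.
residuals = [2 * a * u + 2 * b * v - 1.0 for u, v in mapped]
```

After the mapping, finding a track reduces to finding collinear points, for example with a histogram of line directions.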

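35 Time-of-Flight Counter (TOF) Reconstruction
TOF offline reconstruction uses
- the raw time and pulse height of charged particles measured by the TOF,
- the momentum of charged tracks from drift chamber reconstruction and the hit position obtained by track extrapolation,
- the flight path length and the event start time
to compute physical quantities such as the particle's time of flight and to perform particle identification.

Experiment       Time resolution
BELLE (Japan)    90 ps
CDF II (USA)     100 ps
BESIII           77 ps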
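
How a time-of-flight measurement separates particle species can be sketched as follows: for a track of momentum p and path length L, each mass hypothesis predicts t = L / (βc) with β = p / sqrt(p² + m²), and the hypothesis closest to the measured time is preferred. The function names and the simple nearest-time choice are assumptions for illustration; a real analysis would weigh the differences by the time resolution.

```python
import math

C_LIGHT = 0.299792458  # speed of light in metres per nanosecond
MASSES = {"e": 0.000511, "mu": 0.10566, "pi": 0.13957, "K": 0.49368, "p": 0.93827}  # GeV

def expected_tof(p, path_length, mass):
    """Expected flight time in ns for momentum p (GeV) over path_length (m)."""
    beta = p / math.sqrt(p * p + mass * mass)
    return path_length / (beta * C_LIGHT)

def best_hypothesis(p, path_length, t_measured):
    """Return the particle name whose expected time is closest to the measurement."""
    return min(
        MASSES,
        key=lambda name: abs(t_measured - expected_tof(p, path_length, MASSES[name])),
    )

# At 1 GeV over 1 m, a kaon arrives roughly 0.35 ns later than a pion,
# which is well above a time resolution of order 100 ps.
t_pi = expected_tof(1.0, 1.0, MASSES["pi"])
t_k = expected_tof(1.0, 1.0, MASSES["K"])
```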

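36 Calorimeter (EMC) Reconstruction
Calorimeter reconstruction finds the clusters formed by the energy that electromagnetic showers deposit in the crystals, and from them computes the total energy and hit position of the incident particle.
Features:
- The energy deposited in the TOF is reconstructed and matched with the EMC when computing the photon energy, which greatly improves the energy resolution and the photon detection efficiency.
- The EMC time information is used to remove background, a method first introduced here; the rejection rate for beam background at low energy reaches 75%.
[Figure labels: electromagnetic shower; fake photon after time cut; true photon; background rejection using time information.]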

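37 Muon Identifier (MUC) Reconstruction
MUC event reconstruction consists of the following steps: geometry construction, track finding, track fitting and track parameter calculation.
Features: an algorithm that extrapolates tracks from the main drift chamber, a standalone reconstruction algorithm, and an algorithm that combines EMC/MUC hits and extrapolates to the interaction point.
[Figure labels: extrapolation algorithm; standalone reconstruction algorithm.]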

38 Reliable Physics Analysis Tools

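39 Particle Identification
Each sub-detector provides characteristic physical quantities for particle identification.
An optimized combination algorithm merges the individual pieces of characteristic information.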

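40 Kinematic Fitting and Vertex Fitting
Tools that improve the measurement precision by exploiting the physical laws obeyed in particle interactions or decays.
Kinematic fitting: in BESIII, kinematic fitting uses the Lagrange multiplier method and the Kalman filter method. Besides the usual constraints, it can handle constraints that involve virtual particles.
Vertex fitting: uses the Kalman filter method and a global least-squares method to obtain the vertex position. It also removes fake tracks effectively and improves the measurement precision.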

41 Adoption
The software system is used by more than 300 scientists at 49 universities and research institutes in China and abroad. http://bes3.ihep.ac.cn
China (29): IHEP, CCAST, GUCAS, Univ. of Sci. and Tech. of China, Shandong Univ., Zhejiang Univ., Huazhong Normal Univ., Wuhan Univ., Zhengzhou Univ., Henan Normal Univ., Peking Univ., Tsinghua Univ., Zhongshan Univ., Nankai Univ., Shanxi Univ., Sichuan Univ., Hunan Univ., Liaoning Univ., Nanjing Univ., Nanjing Normal Univ., Guangxi Normal Univ., Guangxi Univ., Hong Kong Univ., Chinese Univ. of Hong Kong, Huangshan College, Lanzhou Univ., Hangzhou Normal Univ., Henan Univ. of Sci. and Tech., Sun Yat-sen Univ.
Europe (11): GSI, Germany; University of Bochum, Germany; University of Giessen, Germany; Johannes Gutenberg University of Mainz, Germany; Helmholtz Institute Mainz, Germany; JINR, Dubna, Russia; Budker Inst. of Nucl. Phys., Russia; KVI/University of Groningen, Netherlands; University of Turin, Italy; INFN, Laboratori Nazionali di Frascati, Italy; Turkish Accelerator Center, Turkey
USA (6): University of Hawaii, University of Washington, University of Minnesota, Carnegie Mellon University, University of Rochester, Indiana University
Other institutes in Asia (3): Tokyo University, Japan; Seoul National University, Korea; University of Punjab, Lahore, Pakistan

42 Calibration and Reconstruction

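43 MDC Calibration (1)
X-T relation, T0 and Q-T function calibrations.
[Flow diagram. Labels: Begin; event loop; TrkReco / MdcPatRec; KalFitAlg; MdcCalibAlg; MdcCalibFunSvc; new calibration data; End.]
T0 calibration determines the time zero of each electronics channel. The BESIII drift chamber has 6796 electronics time-measurement channels. The channels differ from one another, so each cell needs its own T0. To obtain an accurate T0, track reconstruction must correct for the particle's time of flight and for the signal propagation time along the wire.
X-T calibration determines the relation between drift time and drift distance. The drift chamber obtains the position of the incident particle by measuring the drift time of the ionization electrons, so this relation must be calibrated precisely. The BESIII drift chamber uses a small-cell design in which the electric field is highly non-uniform, which makes the X-T calibration more complex. Because the field differs considerably between wire layers, each layer uses its own X-T relation. The calibration also accounts for the left-right asymmetry of the X-T relation caused by the radially asymmetric field in a cell, and for the dependence of the X-T relation on the entrance angle.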

44 MDC dE/dx Calibration (1)
Physical principle: the energy loss function and the quantities that dE/dx depends on. [Formulas not preserved in the transcript.]

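45 MDC dE/dx Calibration (2)
[Flow diagram. Labels: MdcReconstruction; MDC track; corrections (drift distance, entrance angle, layer gain, wire gain, path length, saturation, Z deposit, global); truncated mean; DedxCorrecSvc; calibration constants; output.]
Hit-level dE/dx calibration mainly covers:
- Path length correction: corrects for the different sampling lengths of a track crossing a cell.
- Run-by-run calibration: corrects for environmental effects during data taking, such as pressure and temperature, and gives the gain for each run.
- Wire gain correction: because the electric field and the electronics gain differ between cells, a gain correction is applied for every signal wire.
- Joint drift distance and entrance angle correction: electron attachment by electronegative gas reduces the number of electrons collected on the signal wire as the drift distance grows, while the dependence of dE/dx on the entrance angle is mainly caused by the non-uniformity of the electric field at different azimuthal angles. Drift distance and entrance angle also influence each other, so a joint two-dimensional correction is used.
Track-level dE/dx calibration mainly covers the space charge effect, the energy loss curve and σ_dE/dx.
- Space charge correction: the space charge effect shows up mainly as a dependence of dE/dx on the polar angle. The difficulty is that its strength differs between energy-loss ranges. The calibration first uses Bhabha data to calibrate the space charge effect for electrons, then uses other samples (μ, π, K, p) to correct the remaining space charge effect for hadrons.
- Energy loss curve calibration: after the corrections above, clean single-particle samples are selected to calibrate the dE/dx energy loss curve, which gives the expected dE/dx used in reconstruction.
- σ_dE/dx calibration: to make better use of dE/dx for particle identification, the standard deviation σ_dE/dx must be studied carefully. It depends on many factors, including the track momentum, the polar angle θ and the number of samples N_hit. An empirical function is used to fit σ_dE/dx (formula not preserved in the transcript), in which the functions f, g and h are empirical formulas.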
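
The "truncated mean" step at the end of the correction chain can be sketched as follows. Individual dE/dx samples follow a distribution with a long high-side tail, so the largest samples are discarded before averaging. The kept fraction of 70% is an assumption chosen for illustration; the slide does not state the value used by BESIII, and the function name is hypothetical.

```python
def truncated_mean(samples, keep_fraction=0.7):
    """Average the lowest keep_fraction of the dE/dx samples of one track."""
    if not samples:
        raise ValueError("need at least one dE/dx sample")
    ordered = sorted(samples)
    # Small epsilon guards against floating-point rounding of the product.
    n_keep = max(1, int(len(ordered) * keep_fraction + 1e-9))
    kept = ordered[:n_keep]
    return sum(kept) / len(kept)

# Ten corrected hit-level samples (arbitrary units) with two large outliers
hits = [1.0, 1.1, 0.9, 1.2, 1.0, 0.95, 1.05, 3.5, 6.0, 1.15]
track_dedx = truncated_mean(hits)
plain_mean = sum(hits) / len(hits)
```

With these inputs the two outliers are dropped, so `track_dedx` stays close to 1 while the plain mean is pulled well above it.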

46 MDC Track Finding (1)
Data flow: MDC digi → segment finding → segment → track finding → track
Track recognition has two major steps: segment finding and track finding.

47 MDC Track Finding (2)
Segment finding method: track segments are searched for within each superlayer.
We have 8 4-hit patterns and 20 3-hit patterns.
[Figure: one segment pattern (pattern No. 0) in a cell of one superlayer. Wires are indexed (layer, wire) from (0,0) to (3,5); neighbouring wires are numbered 1 to 7 clockwise around the reference wire.]

48 MDC Track Fitting (1)
Track fitting algorithm with the Kalman filter method.
Material effects: energy loss, multiple scattering.
[Plots: before fitting; after fitting.]

49 MDC Track Fitting (2)
Non-uniform magnetic field.
[Plots: before fitting; after fitting.]

50 EMC Digi Calibration (1)
Converting ADC counts into MeV. [Formula not preserved in the transcript.]
- PED_i: pedestal value with respect to ADC_i
- e_i: the electronic gain constant, obtained online
- c_i: the energy conversion constant, obtained from the cosmic-ray calibration before the counters are installed into the container
Effects that influence c_i: pressure of the structure, geometry of the calorimeter, radiation damage, crystal non-uniformities, etc.
To achieve a more accurate c_i, a counter-by-counter calibration is done using Bhabha events.
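
The conversion formula itself is missing from the transcript. A minimal sketch, assuming the common form E_i = c_i · e_i · (ADC_i − PED_i) built from the three constants defined on the slide, could look like this. Both the formula and the function name `adc_to_mev` are assumptions, not the confirmed BESIII expression.

```python
def adc_to_mev(adc, pedestal, gain, conversion):
    """Convert one channel's ADC count to energy in MeV.

    Assumed form: E = conversion * gain * (adc - pedestal), where
    pedestal is PED_i, gain is e_i and conversion is c_i.
    """
    return conversion * gain * (adc - pedestal)

# Hypothetical constants for a single crystal
energy = adc_to_mev(adc=1100, pedestal=100, gain=0.5, conversion=0.2)
```

Whatever the exact form, the pedestal must be subtracted before any multiplicative constant is applied, because it is an offset of the raw ADC scale.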

51 EMC Digi Calibration (2)
Find the constants g_i that minimize χ². [Expression not preserved in the transcript.]
- E_e: electron or positron energy from kinematics
- f(E_e, θ, φ): shower leakage correction obtained from MC
- E_k^exp: expected (deposited) energy
Minimizing χ² yields a matrix equation. Q is a sparse matrix of order 6240.
All g_i are determined simultaneously by solving the matrix equation with a sparse matrix package (SLAP etc.).

52 EMC Digi Calibration (3)
- E_e: electron or positron energy from kinematics
- f(E_e, θ, φ): shower leakage correction obtained from MC
- E_k^exp: expected (deposited) energy
- Index k: shower range around the counter with maximum energy
[Plot label: E peak.]

53 EMC Reconstruction (1)
Cluster: a contiguous region of crystals above an energy threshold.
Clustering: recursive search of neighbours to find a cluster. Several clusters might be found.
[Figure labels: electromagnetic shower in EMC; a typical cluster for a 1 GeV photon.]
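
The neighbour search described on this slide can be sketched on a flat (row, column) crystal grid. The sketch uses an explicit stack instead of recursion, treats the 8 surrounding crystals as neighbours, and ignores the real EMC geometry (barrel wrap-around, endcaps); all of these choices and the name `find_clusters` are assumptions for illustration.

```python
def find_clusters(energies, threshold):
    """Group crystals above threshold into contiguous clusters.

    energies: dict mapping (row, col) -> deposited energy.
    Returns a list of sets of crystal indices.
    """
    above = {idx for idx, e in energies.items() if e > threshold}
    clusters = []
    while above:
        start = above.pop()
        cluster, stack = {start}, [start]
        while stack:
            r, c = stack.pop()
            # Visit the 8 surrounding crystals
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in above:
                        above.remove(nb)
                        cluster.add(nb)
                        stack.append(nb)
        clusters.append(cluster)
    return clusters

# Two separated energy deposits plus one crystal below threshold
grid = {(0, 0): 0.5, (0, 1): 0.2, (1, 1): 0.1, (5, 5): 0.3, (5, 6): 0.05, (9, 9): 0.001}
clusters = find_clusters(grid, threshold=0.01)
```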

54 EMC Reconstruction (2)
Cluster splitting
Seed: a local maximum.
If only one seed is found, cluster = shower.
If there are more seeds, the cluster is split into several showers. Each crystal in the cluster contributes a shared energy to each shower.
[Figure labels: seed finding (seed1, seed2); splitting (shower1, shower2).]
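
Seed finding as a search for local maxima can be sketched as follows, again on a flat crystal grid with 8 neighbours. How the energy of each crystal is then shared between showers is not specified on the slide, so the sketch stops at the seeds; `find_seeds` is a hypothetical name.

```python
def find_seeds(cluster):
    """Return the crystals of a cluster that are local energy maxima.

    cluster: dict mapping (row, col) -> energy for the crystals of one cluster.
    """
    seeds = []
    for (r, c), e in cluster.items():
        # Energies of the 8 surrounding crystals (0 if not part of the cluster)
        neighbours = [
            cluster.get((r + dr, c + dc), 0.0)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
        ]
        if all(e > n for n in neighbours):
            seeds.append((r, c))
    return sorted(seeds)

# One cluster with two bumps, e.g. two nearby photons
cluster = {(0, 0): 0.1, (0, 1): 0.6, (0, 2): 0.2, (0, 3): 0.4, (0, 4): 0.1}
seeds = find_seeds(cluster)
```

A cluster with a single seed is taken as one shower; this example has two seeds and would be split into two showers.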

55 EMC Reconstruction (3)
Shower energy
- E3x3: energy sum of the 9 crystals around the seed
- E5x5: energy sum of the 25 crystals around the seed
- EAll: energy sum of all crystals in the shower
Energy spectrum
- Peak: most probable energy
- FWHM: full width at half maximum
- Tail: caused by material in front of the calorimeter and by crystal leakage
Energy resolution: the most important quantity. [Definition not preserved in the transcript.]
[Plots: E5x5 spectrum for a 1 GeV photon, with peak and tail marked; energy resolution vs photon energy.]
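
The three energy sums defined on this slide can be sketched with a single window function on a flat crystal grid. The grid layout and the name `window_energy` are assumptions for illustration.

```python
def window_energy(energies, seed, half_width):
    """Sum the energy in a square window centred on the seed crystal.

    half_width=1 gives E3x3 (9 crystals), half_width=2 gives E5x5 (25 crystals).
    """
    r0, c0 = seed
    return sum(
        e
        for (r, c), e in energies.items()
        if abs(r - r0) <= half_width and abs(c - c0) <= half_width
    )

# Toy shower: a 7x7 block with one unit of energy per crystal
shower = {(r, c): 1.0 for r in range(7) for c in range(7)}
seed = (3, 3)
e3x3 = window_energy(shower, seed, 1)
e5x5 = window_energy(shower, seed, 2)
e_all = sum(shower.values())
```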

56 EMC Reconstruction (4)
Shower position
- Linear weighting function
- Logarithmic weighting function
Position correction
[Formulas not preserved in the transcript. Plots: before correction; after correction.]
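
The weighting formulas are missing from the transcript. The sketch below assumes the forms commonly used for calorimeters: the position is a weighted mean of crystal positions, with weights w_i = E_i (linear) or w_i = max(0, a + ln(E_i / E_tot)) (logarithmic). The offset a = 4.0 and the function name are illustrative assumptions, not BESIII values.

```python
import math

def shower_position(crystals, offset=None):
    """Weighted mean position of a shower.

    crystals: list of (x, y, energy) with energy > 0.
    offset=None uses linear weights; otherwise logarithmic weights
    max(0, offset + ln(E_i / E_tot)) are used.
    """
    e_tot = sum(e for _, _, e in crystals)
    if offset is None:
        weights = [e for _, _, e in crystals]
    else:
        weights = [max(0.0, offset + math.log(e / e_tot)) for _, _, e in crystals]
    w_sum = sum(weights)
    x = sum(w * cx for w, (cx, _, _) in zip(weights, crystals)) / w_sum
    y = sum(w * cy for w, (_, cy, _) in zip(weights, crystals)) / w_sum
    return x, y

# Two crystals sharing the energy 3:1
crystals = [(0.0, 0.0, 3.0), (1.0, 0.0, 1.0)]
x_lin, _ = shower_position(crystals)
x_log, _ = shower_position(crystals, offset=4.0)
```

Logarithmic weights give low-energy crystals more influence than linear weights do, which reduces the pull of the reconstructed position toward the centre of the hottest crystal.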

57 MUC Reconstruction
Simulation
- RPC as sensitive detector
- Detailed down to each read-out strip
Reconstruction
- The tracking algorithm is seeded by the tracks extrapolated from the MDC.
- Hits are searched for gap by gap within predefined windows.
- Reconstruction efficiency: (value not preserved in the transcript) at ~1 GeV
[Figure labels: Ext track; fired strips; window.]

58 Gaudi Software Framework

59 What's Gaudi?
What's Gaudi
- Originally developed by LHCb and used by LHCb, ATLAS, HARP, GLAST, BESIII, DayaBay etc.
- Defines standard interfaces for the common components necessary for event processing.
Gaudi's design criteria
- Data-centered architectural style
- All components have well-defined "interfaces": Algorithm, Data Object, Service, Converter ...
- Clear separation between "data" and "algorithms"
- Clear separation between "persistent data" and "transient data"
- Encapsulated "user code", localized in a few specific places: "Algorithms" and "Converters"

60 Gaudi Components
- Algorithm: data processing unit (visible to and controlled by the framework)
- Data Object: data unit (visible to and managed by the transient data store)
- Transient Data Store: central service and repository for data objects (data location, life cycle, load on demand, ...)
- Services: globally available software components providing framework functionality, e.g. JobOptions Service, Message Service, Event Data Service, Histogram Service, N-tuple Service, Random Number Generator
- Data Converter: provides explicit/implicit conversion between the persistent data format and transient data, in both directions

61 Gaudi Object Diagram
[Diagram. Labels: Application Manager; Algorithm; Event Data Service with Transient Event Store; Detector Data Service with Transient Detector Store; Histogram Service with Transient Histogram Store; Persistency Services with Converters and Data Files; Message Service; JobOptions Service; Particle Property Service; Other Services.]

62 Dataflow
[Diagram. Apparent dataflow: MDC digits → MDC Tracking → Tracks; Calorimeter digits → Calorimeter Clustering → Showers; Tracks, Showers → Electron/photon Identification → Electrons/photons. Real dataflow: each algorithm reads its input from, and writes its output to, the Transient Event Data Store.]

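63 Loading from Data Store
(1) Retrieve object: the Algorithm asks the Data Service.
(2) Search in Store: the Data Service looks in the Data Store. Unsuccessful if the requested object is not present.
(3) Request load: the Data Service asks the Persistency Service (request dispatcher: Objy, ROOT, ..).
(4) Request creation: the Conversion Service asks the Converter to create the object.
(5) Register: the new object is registered in the Data Store.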
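
The load-on-demand sequence on this slide can be sketched in a few lines. This is a toy model written in Python for brevity; the class `TransientStore`, its method `retrieve` and the path strings are hypothetical and do not reproduce the Gaudi C++ API.

```python
class TransientStore:
    """Toy transient data store with load on demand."""

    def __init__(self, converters):
        self._objects = {}            # objects already registered in the store
        self._converters = converters # path -> callable creating the object

    def retrieve(self, path):
        # (2) search in the store first
        if path in self._objects:
            return self._objects[path]
        # (3) not present: request a load through the matching converter
        converter = self._converters.get(path)
        if converter is None:
            raise KeyError(f"no object and no converter for {path}")
        # (4) the converter creates the transient object
        obj = converter()
        # (5) register it so later requests are served from the store
        self._objects[path] = obj
        return obj

calls = []

def load_tracks():
    calls.append("load")  # stands in for reading a persistent file
    return ["track1", "track2"]

store = TransientStore({"/Event/Tracks": load_tracks})
first = store.retrieve("/Event/Tracks")   # triggers the converter
second = store.retrieve("/Event/Tracks")  # served from the store
```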

64 Algorithm (1): Basic
- Users write concrete Algorithms derived from the base class Algorithm.
- A concrete Algorithm implements at least three methods in addition to the constructor and destructor: initialize(), execute(), finalize().
- execute() is called once per physics event.
[Diagram. Labels: Concrete Algorithm (IAlgorithm, IProperty); ApplicationMgr (ISvcLocator); EventDataSvc and DetectorDataSvc (IDataProviderSvc); HistogramSvc (IHistogramSvc); MessageSvc (IMessageSvc); ParticlePropertySvc (IParticlePropertySvc); Obj_A; Obj_B.]
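
The initialize / execute / finalize life cycle can be sketched as below. Real Gaudi algorithms are C++ classes that return status codes and fetch their data from the transient store; this Python toy, with the hypothetical names `Algorithm`, `HitCounter` and `run`, only shows the calling pattern.

```python
class Algorithm:
    """Toy base class mirroring the three life-cycle methods."""

    def initialize(self):
        pass

    def execute(self, event):
        raise NotImplementedError

    def finalize(self):
        pass

class HitCounter(Algorithm):
    """Example user algorithm: counts hits over all events."""

    def initialize(self):
        self.n_events = 0
        self.n_hits = 0

    def execute(self, event):
        self.n_events += 1
        self.n_hits += len(event["hits"])

    def finalize(self):
        self.mean_hits = self.n_hits / self.n_events if self.n_events else 0.0

def run(algorithms, events):
    """Minimal event loop: initialize once, execute per event, finalize once."""
    for alg in algorithms:
        alg.initialize()
    for event in events:
        for alg in algorithms:
            alg.execute(event)
    for alg in algorithms:
        alg.finalize()

counter = HitCounter()
run([counter], [{"hits": [1, 2, 3]}, {"hits": [4]}, {"hits": []}])
```

The framework, not the user code, owns the event loop; the user only fills in the three methods.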

65 Software Configuration Management (1)
CMT (Configuration Management Tool)
- structures software development (concepts of areas, packages, versions, constituents)
- organises software into packages
- describes package properties
- describes package constituents
- operates the software production (management, build, import/export, etc.)

66 Software Configuration Management (3)
[Diagram: packages in the user area, release area and external area linked by "use" relations. Labels: MDCGeomSvc; BesRelease; BesGeoMdc BesGeoMdc-00-*; External/CLHEP CLHEP *; BesPolicy BesPolicy-01-*; CLHEP.]

67 Part 3: Implementation of the BESIII Software Platform

68 Software Environment
- Underlying framework: GAUDI (originally developed by LHCb)
- Simulation: GEANT4
- Other external libraries: CERNLIB, CLHEP, ROOT, AIDA, XercesC, GDML ...
- Database: MySQL
- Software configuration management: CMT and CVS
- Computer language: C++ (BESII legacy code written in Fortran)
- Operating system: SLC4 / gcc3.4.6
- Reused code from Belle, BaBar, ATLAS, GLAST ...

69 Framework and Benefits
Framework definition: a skeleton of an application into which developers plug their code and which provides most of the common functionality.
Benefits
- Common vocabulary, better specification of what needs to be done, better understanding of the system.
- Low coupling between concurrent developments. Smooth integration. Organization of the development.
- Robustness, resilience to change (change-tolerant).
- Fostering code re-use.

70 Evolution of the BESIII Framework
BES Analysis Framework (BESF)
- Based on Belle software
- Supports two types of data management: Panther and ProxyDict
- Added more features, e.g. new dynamic library loading, and new software components, e.g. Service
- Implemented event I/O: Raw Data and NDST Data
- Build system: Automake and CMT
BOSS
- Based on the Gaudi framework
- A concrete implementation of the underlying architecture: Gaudi

71 Why Gaudi?
Benefits
- In general, a gain in simplicity, modularity, flexibility and extensibility.
- Clear separation of experiment-dependent and experiment-independent parts; developers only need to focus on BESIII-specific packages.
- Saves coding time: many common utilities and services are available.
- More control over the work flow: sequence, branch and filter.
- Support for multi-threaded implementation: no extra constraints if the online event filter system reuses offline packages.
Cost of migrating to Gaudi
- A steep learning curve for core developers, although the framework is well documented.

72 BESIII Offline Software Architecture
- Based on GAUDI, ROOT, GEANT4 etc.
- ~259 BESIII packages, 180 MB of source code, 2.7 GB of external libraries
- Developers are from IHEP, PKU, USTC, SDU, JINR etc.
[Diagram. Labels: Phy. Const. Svc; Calib; Material; Geom; Sim; Calib/Rec; Analysis; BESIII Offline DB; Event Data; RawData; RawData Cnv; RecData; DstData.]

73 Event Data Flow
[Diagram. Labels: BES III Generator; HepEvt; McTruth; Simulation; G4Event; Hits; Digitization; Digits; RawData; CnvSvc; Event Converters; Reconstruction Algorithms; Calibration; RecHits; RecTracks; Rec2DstAlg; DstTracks; RootDstCnvSvc; DstData; Analysis Tools; Histograms; Ntuples.]

74 Event Data
Generator HEPEVT Data
- Kinematics information only
RAW Data
- Delivered by DAQ for reconstruction
- Byte stream format
Simulated Event Data
- Contain digits, hits and other MC truth information
- ASCII file format and ROOT format
REC & DST Data
- Reconstructed Data is event data written as output of the reconstruction procedure
- DST Data is a reduced event representation suitable for analysis
- Both in ROOT format
[Diagram: HEPEVT Data → G4 Simulation → Simulated Event Data; RAW Data and Simulated Event Data → Reconstruction → REC & DST Data.]

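75 Event Data Conversion Services
[Diagram. Apparent data flow: Algorithm A (input: Data T1; output: Data T2, T3) → Algorithm B (input: Data T2; output: Data T4) → Algorithm C (input: Data T3, T4; output: Data T5). Real data flow: each algorithm exchanges T1 to T5 with the transient data store.]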

76 Raw Data
The Raw Data format was defined by the DAQ group.
- Byte stream data
- Mainly consists of detector identifiers, time channels, charge channels, status etc.
RawRoot Data is a data format for MC used as input to the reconstruction algorithms. It contains:
- Digits in MDC, TOF, EMC and MUC
- Hits in MDC, TOF, EMC and MUC
- MC truth information, e.g. particle ID, tracks and vertices
- Relationships between the above three
Converting Raw Data to RawRoot Data is done by a simple algorithm.

77 Reconstructed and DST Data
[Class diagram. TObject is the base class. TRawData: TMdcDigi, TTofDigi, TEmcDigi, TMucDigi. TRecEvent: TMdcTrack, TTofTrack, TEmcTrack, TMucTrack, TDedx.]
ROOT REC Data contain both Digits and "Track Lists".
ROOT DST Data only have "Track Lists".

78 Non-Event Data
Non-event data are needed
- for the overall geometry and structure of the detector, including information about magnetic fields, and, since the subdetectors are not perfectly still, to know their real positions;
- for the correct detector calibration, since the detector response also changes (e.g. with temperature);
- for the run conditions of the accelerator at the time of the collision, e.g. the beam energy.
All of these non-event data must also be stored and managed.

79 Detector Description (1)
Goals
- A single source of generic detector description information
- Independent of clients: simulation, reconstruction, event display etc.
- Not limited to geometrical information; also includes the description of materials
- Flexible and extensible description
Detector description
- Full detector description instead of a description of basic detector dimensions
- Uses GDML syntax

80 Detector Description (2)
- Based on GDML (Geometry Description Markup Language)
- Expanded the GEANT4 schema and developed a new ROOT schema for BESIII applications.
- The GDML detector description is used for simulation, event display and reconstruction.
[Diagram. Labels: XML description (geometry, materials, alignment ...); GEANT4 schema; ROOT schema; XML writer; objects for simulation; objects for reconstruction; objects for event display.]

81 Access to Geometry Data
[Diagram. Labels: XML description (geometry, materials); GeomSvc for MDC, TOF, EMC and MUC; objects for reconstruction; application layer.]

82 BESIII Simulation (Dr. Liu Huaimin's Lecture)
- BOOST (BESIII Object Oriented Simulation Tool) is based on Geant4.
- It was originally developed in an independent framework.
- Material and geometry data are read from GDML files.
- GENBES: BESII event generators.
[Diagram. Labels: GENBES generator; event in HepEvt format; BOOST (geometry, Geant4 tracking, detector response, digitization); hit objects; MC truth; raw data; transient; persistency.]

83 Simulation Integration
Generator
- BESII has ~30 event generators written in Fortran.
- A C++ Hepevt_Wrapper is used to access the kinematics information produced by the generators.
Simulation
- The integration with the BOOST simulation is based on ATLAS/Athena software.
Currently both the BES generators and BOOST have been integrated with the offline framework.
[Diagram label: Transient Event Store.]

84 Calibration
Calibration framework
[Diagram. Labels: MySQL database; constants (ROOT); CalibRootCnvSvc; Data Service; CalibFuncSvc; reconstruction algorithm; simulation algorithm; GUI client.]

85 Reconstruction Algorithms
Sub-detector reconstruction algorithms
- MDC
  MdcPatRec: tracking algorithm based on BaBar software
  TrkReco: tracking algorithm based on Belle software
  MdcDedxAlg: calculates dE/dx information for MDC tracks
  KalFit: track fitting algorithm using the Kalman filter method
- TOF
  TofRec: calculates the flight time of charged particles
- Calorimeter
  EmcRec: a clustering algorithm for the EM calorimeter
- Muon chamber
  MucRec: tracking algorithm for the muon chamber
Other algorithms
- Event time calculation: determines the event start time.
- Track extrapolation: takes into account the particles' deflection in the magnetic field and their ionization energy loss in the material.

86 Event Display Tool: BesVis
- Based on ROOT, OpenGL, X3D and XML
- Supports both 2D and 3D views
- Operations and controls are available through menu and toolbar items
- The first version was released in December 2005.

87 Data and Job Management

88 Advanced Data Management Software

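89 Offline Data Management
Over the lifetime of the experiment, BESIII will produce petabyte-scale experimental data.
To make data processing more reliable and automated, the BESIII data management system was developed with J2EE technology. It covers:
- Management of experimental data, including Raw, REC and DST data
- Data processing history
- Data quality management
- Calibration workflow and calibration constant management
- Management of the constants used by reconstruction and physics analysis software
- Software version management
The implementation is based on Java and uses web application technologies such as STRUTS/Spring/Hibernate.
[Diagram. Server side: Bookkeeping Server, DB, BookkeepingSvc, DAO, Web, XML-RPC servlet. Client side: Job Management Tools, Ganga application.]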

90 The Scheme for Data Production Job Management
Job Configuration and Submission Module
- Sets up the job running environment
- Converts the input dataset to a list of input data files
- Defines the policies for dividing the job into sub-jobs and for combining outputs
- Provides automatic/manual re-submission of failed jobs
- When all sub-jobs have finished successfully, the bookkeeping database can be updated through the web or the command line

91 The Scheme for Analysis Job
Job Management Tool
- Sets up the job running environment
- Contacts the Bookkeeping Service to get the input data files
- Depending on the need, the job can be split into sub-jobs by default or according to the user's own policy
- A monitor helps users keep track of their job status
[Diagram. Labels: User; Job Management Tool; Application; Splitter (job splitting); Task; PBS; Working Node; Monitor; Result Merger (result merging); Results.]
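
The split-and-merge pattern on this slide can be sketched as follows. The fixed number of files per sub-job, the function names and the merged quantity (event counts per category) are assumptions for illustration; the slide does not describe the actual splitting policy or the interface of the job management tool.

```python
def split_job(input_files, files_per_subjob):
    """Divide a list of input files into sub-jobs of at most files_per_subjob files."""
    if files_per_subjob < 1:
        raise ValueError("files_per_subjob must be at least 1")
    return [
        input_files[i:i + files_per_subjob]
        for i in range(0, len(input_files), files_per_subjob)
    ]

def merge_results(partial_results):
    """Combine per-sub-job counters into a single result."""
    merged = {}
    for result in partial_results:
        for key, value in result.items():
            merged[key] = merged.get(key, 0) + value
    return merged

# Hypothetical file names, as a bookkeeping service might return them
files = [f"run_{n:05d}.dst" for n in range(1, 8)]
subjobs = split_job(files, files_per_subjob=3)

# Each sub-job would run on a worker node and report its own counts.
partial = [{"selected": 10, "total": 300}, {"selected": 7, "total": 300}, {"selected": 2, "total": 100}]
summary = merge_results(partial)
```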

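92 Part 4: References
- Gaudi software framework (User Guide, Architecture Design Document)
- 北京谱仪(BESIII)的设计与研制 (Design and Construction of the Beijing Spectrometer BESIII), edited by Wang Yifang, Shanghai Scientific and Technical Publishers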

93 Thank you!

