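Institute of High Energy Physics (IHEP) Error Treatment and Signal Significance in High Energy Physics Experiments Shan Jin (金山) jins@mail.ihep.ac.cn
Physics is a science founded on experiment.
The "four goods" student: good at physics (solid theoretical foundation, clear physical concepts); good at computing (C++); good at English; good at statistics.
Good at statistics. Not everyone needs to become an expert in statistics (mathematics), and not everyone needs to be able to derive complicated formulas. Requirements: clear concepts, grasping the essence through physical reasoning (at the very least, understanding what others are talking about). Being able, at a minimum, to use existing software for the relevant calculations. Knowing clearly the conditions and range of applicability of each method.
The essence of statistics: probability. The understanding of probability involves many "paradoxes", so keeping the concepts and their interpretation correct is not easy. Example:
Contents of this talk: Statistical errors and (toy) Monte Carlo experiments. Systematic errors and the consistency between Monte Carlo simulation and real data. Signal "statistical significance" and the estimation of its systematic error.
Errors. Physics is a discipline founded on experiment: experiment, measurement, estimation of physical parameters, errors.
Criteria for judging a parameter estimation method. Unbiasedness: the expectation value of the estimate equals the true value. Consistency: the variance tends to 0 as the sample size tends to infinity. Efficiency: for the same measurement, the smaller the variance, the better the method.
Statistical Errors and (Toy) Monte Carlo Experiments
Statistical error: the uncertainty on the estimate of a measured parameter caused by statistical fluctuations of the experimental sample. Ideal experiment: repeat the experiment many times with the same sample size and compute the root mean square (RMS) spread of the measured values; this is the statistical error for that sample size. In practice only one measurement can be made, so the error has to be estimated.
A simple example, an event-counting experiment: 100 events are expected, and the count follows a Gaussian random distribution (a good approximation to the Poisson distribution at this size). The expected statistical error is sqrt(100) = 10 events. If N events are measured, the estimated error is sqrt(N). Parameters and their error estimates in more complicated measurements: the maximum likelihood (ML) method, the most commonly used method in high energy physics experiments.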
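The "ideal experiment" for the counting example can be imitated on a computer. The sketch below is only an illustration: it assumes Poisson-distributed counts with mean 100 (close to the Gaussian quoted on the slide), and the number of repetitions and the random seed are arbitrary choices, not values from the talk.

```python
import math
import random

def poisson(rng, mean):
    """Draw one Poisson-distributed count (Knuth's method; fine for mean ~ 100)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(12345)   # fixed seed so the sketch is reproducible
expected = 100               # expected number of events, as on the slide
n_experiments = 10000        # number of repeated experiments (an assumption)

counts = [poisson(rng, expected) for _ in range(n_experiments)]
mean = sum(counts) / n_experiments
rms = math.sqrt(sum((c - mean) ** 2 for c in counts) / n_experiments)

print(f"mean = {mean:.2f}, RMS = {rms:.2f}, sqrt(100) = {math.sqrt(expected):.2f}")

# A single real experiment only has its own count N and quotes sqrt(N).
single = counts[0]
print(f"one experiment: N = {single}, estimated error = {math.sqrt(single):.2f}")
```

The RMS over many repetitions comes out close to 10, while each single experiment quotes sqrt(N), which fluctuates slightly around that value.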
Likelihood. We have data x (could be a vector, discrete or continuous) and a probability model or probability density function f(x; θ) (θ could be a vector of parameters). Now evaluate the probability function using the data that we observed and treat it as a function of the parameters. This is the likelihood function: $L(\theta) = f(x;\theta)$ (here x is treated as constant). If we have n independent observations $x_1, \ldots, x_n$ of a random variable x, then $L(\theta) = \prod_{i=1}^{n} f(x_i;\theta)$.
Maximum Likelihood. The likelihood function plays an important role in statistics. E.g., to estimate the parameter θ, the method of maximum likelihood (ML) says to take the value that maximizes L(θ). ML and other parameter estimation methods would be a large part of a longer course on statistics.
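A minimal sketch of an ML estimate, under assumptions that are not in the talk: the model is an exponential f(t; tau) = (1/tau) exp(-t/tau), the data are simulated with a hypothetical true value tau = 2, and the maximum is found by a simple grid scan. The statistical error is read from the points where -ln L rises by 0.5 above its minimum, a common convention.

```python
import math
import random

def neg_log_likelihood(tau, data):
    """-ln L(tau) for n independent observations of f(t; tau) = exp(-t/tau) / tau."""
    return len(data) * math.log(tau) + sum(data) / tau

rng = random.Random(2024)
true_tau = 2.0                                  # hypothetical true parameter
data = [rng.expovariate(1.0 / true_tau) for _ in range(500)]

# Scan tau on a grid and keep the value that minimizes -ln L (maximizes L).
grid = [1.0 + 0.001 * i for i in range(2001)]   # 1.000 ... 3.000
nll = [neg_log_likelihood(tau, data) for tau in grid]
i_min = min(range(len(grid)), key=nll.__getitem__)
tau_hat = grid[i_min]

# Error estimate: the interval where -ln L stays within 0.5 of its minimum.
inside = [tau for tau, value in zip(grid, nll) if value <= nll[i_min] + 0.5]
err_lo, err_hi = tau_hat - inside[0], inside[-1] - tau_hat

print(f"tau_hat = {tau_hat:.3f} -{err_lo:.3f} +{err_hi:.3f}")
print(f"sample mean (analytic ML estimate) = {sum(data) / len(data):.3f}")
```

For this particular model the ML estimate is known analytically (the sample mean), which gives a cross-check of the scan.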
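How can an experiment be "repeated"? (Toy) Monte Carlo experiments
What is Monte Carlo? Monte Carlo is the name of a casino in Monaco. Gambling, hence probability. Monte Carlo is a simulation technique based on probability, using known theory, models and knowledge.
How can an experiment be "repeated"? Toy Monte Carlo experiments. Example: sampling from the known f(x; θ) to produce n sets of x gives n toy MC experiments, from which one obtains the RMS of the estimates of the parameter θ, confidence intervals, and so on. Toy MC is an advanced technique in modern high energy physics analysis.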
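The toy MC procedure can be sketched as follows. Everything numerical here is hypothetical (an exponential model with tau = 2, 500 events per experiment, 1000 toys); the point is only the structure: sample from the known model, estimate the parameter in each toy, then look at the spread of the estimates.

```python
import math
import random

def ml_estimate(sample):
    """ML estimate of tau for f(t; tau) = exp(-t/tau) / tau: the sample mean."""
    return sum(sample) / len(sample)

rng = random.Random(7)
tau_model = 2.0   # parameter of the known model f(x; theta) (hypothetical)
n_events = 500    # events per toy experiment (hypothetical)
n_toys = 1000     # number of toy MC experiments (hypothetical)

estimates = sorted(
    ml_estimate([rng.expovariate(1.0 / tau_model) for _ in range(n_events)])
    for _ in range(n_toys)
)

mean = sum(estimates) / n_toys
rms = math.sqrt(sum((e - mean) ** 2 for e in estimates) / n_toys)

# Central 68.3% interval read directly from the sorted toy results.
lo = estimates[int(0.1585 * n_toys)]
hi = estimates[int(0.8415 * n_toys)]

print(f"RMS of toy estimates = {rms:.4f}")
print(f"expected tau / sqrt(n) = {tau_model / math.sqrt(n_events):.4f}")
print(f"68.3% interval = [{lo:.3f}, {hi:.3f}]")
```

The RMS of the toy estimates plays the role of the statistical error that the "ideal" repeated experiment would give.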
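Systematic Errors and the Consistency between Monte Carlo Simulation and Real Data
Systematic error. What is a systematic error? An error caused by inaccuracy of the measuring system or of the method itself. High energy physics papers often call the systematic error a "systematic uncertainty". Simple example: measuring the lengths of a batch of products (ropes) with a ruler. Different ropes: statistical error. Different rulers: systematic error.
MC simulation in high energy physics: Generator (theoretical model simulation), Detector simulation
MC simulation software is crucial in high energy physics experiments. Detected result = theoretical model × detection efficiency. To obtain physics results that can be compared with theoretical models, a detection efficiency correction must be applied.
MC uncertainty estimation method (1): estimating the data/MC inconsistency. Example: systematic error estimate for particle identification. Key point: select a high purity control sample. This is a very important step in the analysis of systematic errors.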
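The relation "detected result = theoretical model × detection efficiency" can be turned around to correct an observed yield. The sketch below uses made-up numbers and hypothetical function names; the binomial error on the MC efficiency and the sqrt(N) error on the observed count are standard choices, not something stated on the slide.

```python
import math

def efficiency_from_mc(n_generated, n_selected):
    """Detection efficiency from an MC sample, with its binomial uncertainty."""
    eff = n_selected / n_generated
    err = math.sqrt(eff * (1.0 - eff) / n_generated)
    return eff, err

def correct_yield(n_observed, eff):
    """Efficiency-corrected yield; the sqrt(N_obs) error is scaled by 1/eff."""
    return n_observed / eff, math.sqrt(n_observed) / eff

# Hypothetical numbers, for illustration only.
eff, eff_err = efficiency_from_mc(n_generated=100000, n_selected=40000)
n_corr, n_corr_err = correct_yield(n_observed=1000, eff=eff)

print(f"efficiency = {eff:.3f} +- {eff_err:.4f}")
print(f"corrected yield = {n_corr:.0f} +- {n_corr_err:.0f} (stat)")
```

Any mismatch between the simulated and the true efficiency propagates directly into the corrected yield, which is why the data/MC consistency matters.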
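Example: TOF detector, π identification efficiency. TOF efficiency (pion) vs. momentum. The π sample is selected from J/ψ → 3π events. SIMBES is clearly better than SOBER, and the systematic error is significantly reduced.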
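After further improvements to SIMBES the error was reduced further: the systematic error on the π identification efficiency was finally set at about 1%.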
Example: Tracking eff. of pion from events. Data/MC agree well.
Lambda Lambdabar, missing trk

proton
CUT        Data    SIMBES   SOBER
Trk Rec.   94.7%   95.0%    99.1%
Good Trk   94.6%, 98.9% (only two values survive; column assignment not recoverable)

antiproton
CUT        Data    SIMBES   SOBER
Trk Rec.   94.4%   94.2%    99.3%
Good Trk   94.1%   93.8%    99.2%

pion+
CUT        Data    SIMBES   SOBER
Trk Rec.   88.8%   88.6%    93.0%
Good Trk   80.8%, 81.7% (only two values survive; column assignment not recoverable)

pion-
CUT        Data    SIMBES   SOBER
Trk Rec.   90.0%   90.2%    94.0%
Good Trk   85.6%   84.9%    90.5%
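One simple way to put a number on the data/MC inconsistency is the relative difference |data - MC| / MC. The sketch below applies it to the track reconstruction ("Trk Rec.") rows quoted above; whether the analysis used exactly this definition is an assumption.

```python
# (data, SIMBES, SOBER) track reconstruction efficiencies in percent,
# copied from the "Trk Rec." rows of the tables above.
trk_rec = {
    "proton":     (94.7, 95.0, 99.1),
    "antiproton": (94.4, 94.2, 99.3),
    "pion+":      (88.8, 88.6, 93.0),
    "pion-":      (90.0, 90.2, 94.0),
}

def relative_difference(data, mc):
    """Relative data/MC difference |data - MC| / MC, in percent."""
    return abs(data - mc) / mc * 100.0

results = {}
for particle, (data, simbes, sober) in trk_rec.items():
    results[particle] = (
        relative_difference(data, simbes),
        relative_difference(data, sober),
    )
    d_simbes, d_sober = results[particle]
    print(f"{particle:10s} SIMBES: {d_simbes:.1f}%   SOBER: {d_sober:.1f}%")
```

With these inputs the SIMBES differences stay at the few-per-mille level while the SOBER differences are several percent, in line with the statement that SIMBES describes the data better.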
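MC uncertainty estimation method (2): varying the MC simulation scheme of a "source quantity". Example: measurement of a branching ratio and of an angular distribution coefficient. Change in the measured results caused by different simulation schemes for the wire resolution of the main drift chamber (MDC).
Comparison of two different approaches to systematic errors.
Direct data/MC comparison of measured variables (e.g. trk eff., momentum, χ²). Advantage: intuitive. Disadvantages: requires fairly pure samples or complete knowledge of the background; correlations between variables are hard to take into account (e.g. the method is hard to apply in partial wave analysis).
Varying a "source quantity" (e.g. wire resolution). Advantages: includes the correlations between the measured variables; easy to carry out; used by more and more experiments (e.g. LEP WW, Higgs, B physics and most other physics studies). Disadvantages: not intuitive; requires experts to provide the basis and the method for how much to vary the "source quantity".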
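Signal "Significance" and the Estimation of Its Systematic Error
High energy physics experiments have always been at the frontier of searching for and discovering new signals: new particles and new physical phenomena. How can the likelihood of a new discovery be described quantitatively? (Being quantitative is the mark of a mature science.) How do you determine whether your result can be published and accepted?
International Convention in HEP Community. 3 sigma: evidence of a possible signal. 5 sigma: discovery of a signal. What is signal significance? How is a (credible) signal significance calculated?
Outline: The essence and expression of signal "significance". Several common methods for calculating signal "significance" and the estimation of their systematic errors.
The essence and expression of signal "significance". Essence: the probability (Prob.) or confidence level of a statistical hypothesis test. Expression: it can be quoted directly as a Prob. (e.g. BaBar X(3872)); commonly, a Gaussian distribution is used to translate this probability into the more intuitive "nσ".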
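The translation between a probability and "n sigma" is a Gaussian tail calculation. The sketch below shows both directions; the function names are hypothetical, and both the one-sided and the two-sided conventions are offered because different analyses use different ones.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard Gaussian, mean 0 and width 1

def nsigma_to_p(n, two_sided=False):
    """Gaussian tail probability beyond n sigma (one tail, or both tails)."""
    tail = 0.5 * math.erfc(n / math.sqrt(2.0))
    return 2.0 * tail if two_sided else tail

def p_to_nsigma(p, two_sided=False):
    """Translate a probability into the equivalent number of Gaussian sigma."""
    tail = p / 2.0 if two_sided else p
    return -_STD_NORMAL.inv_cdf(tail)

for n in (3, 5):
    print(f"{n} sigma: one-sided p = {nsigma_to_p(n):.3g}, "
          f"two-sided p = {nsigma_to_p(n, True):.3g}")
```

When quoting "n sigma" it helps to state which convention was used, since the same probability maps to different n in the two cases.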
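Several Common Methods for Calculating Signal "Significance" and the Estimation of Their Systematic Errors
Method I: CLb, the frequentist method (rigorous estimate). Applicable when knowledge of the background can be estimated well from MC. Statistical estimator (test statistic).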
Example: simple event counting. We expect B = 10000 events; we observe N0 = 10500 events. $P(N_b \ge N_0) = \int_{N_0}^{\infty} \frac{1}{\sqrt{2\pi B}}\, e^{-(N-B)^2/(2B)}\, dN \approx 2.9\times10^{-7}$ (1-sided probability of 5σ). This is the probability of observing Nb larger than N0 in the pure background distribution. Equivalently, it can be understood as a 5σ deviation from no signal.
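The numbers of the counting example can be checked in a few lines. The sketch assumes, as the slide does, that the background-only count is Gaussian with mean B and width sqrt(B).

```python
import math

B = 10000    # expected background events
N0 = 10500   # observed events

sigma_b = math.sqrt(B)                  # width of the background-only count
n_sigma = (N0 - B) / sigma_b            # deviation in units of sigma
# One-sided Gaussian tail probability P(Nb >= N0).
p_value = 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

print(f"deviation = {n_sigma:.1f} sigma, one-sided p = {p_value:.3g}")
```

An excess of 500 events over an expectation of 10000 is therefore a 5 sigma effect only if B itself is known precisely, which is where the systematic uncertainty on the background enters.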
Systematic Uncertainties for Method I. In this method, all possible factors causing the uncertainty of b should be taken into account. Example: ALEPH's observation of "3 golden Higgs candidate events" with ~3.0σ significance: the likelihood function includes the number of events, mass and b-tagging distributions, etc.
ALEPH Collaboration, PLB526 (2002) 191
Method II: Goodness of fit, χ² tests (with known background shape). d.o.f. = Nbin − Npara. In statistics books the p-value is defined as "the probability, under the assumption of a hypothesis H0, of obtaining data at least as incompatible with H0 as the data actually observed." Sometimes we can also "translate" this probability into an "nσ" deviation from hypothesis H0.
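A χ² goodness-of-fit p-value only needs the upper tail of the χ² distribution. The sketch below implements that tail for integer d.o.f. with the standard library (a recurrence for the incomplete gamma function); the function name is hypothetical, and the inputs 56/56 and 323/58 are χ²/dof values of the kind quoted in the examples of this talk.

```python
import math

def chi2_sf(x, ndf):
    """Upper-tail probability P(chi2 >= x) for an integer number of d.o.f."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if ndf % 2:   # odd ndf: start from ndf = 1
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:         # even ndf: start from ndf = 2
        q, a = math.exp(-half), 1.0
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1),
    # evaluated in logarithms to avoid overflow.
    while 2 * a < ndf:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

p_good = chi2_sf(56.0, 56)    # chi2/dof = 56/56
p_bad = chi2_sf(323.0, 58)    # chi2/dof = 323/58

print(f"chi2/dof = 56/56  -> p = {p_good:.2f}")
print(f"chi2/dof = 323/58 -> p = {p_bad:.1e}")
```

A p-value near 0.5 means the hypothesis describes the data well; a vanishingly small one means the hypothesis is rejected.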
Some examples
Observation of an anomalous enhancement near the threshold of the p p̄ mass spectrum at BES II, in J/ψ → γ p p̄. Fit with an acceptance-weighted BW: M = 1859 +3 −10 +5 −25 MeV/c², Γ < 30 MeV/c² (90% CL), χ²/dof = 56/56. Phys. Rev. Lett. 91, 022001 (2003). [Figure: M(p p̄) − 2m_p (GeV) spectrum with 3-body phase space and acceptance curves.]
Could it be a tail of a known resonance? 0-+ resonances in PDG tables: η(1760), M = 1760, Γ = 60 MeV; π(1800), M = 1801, Γ = 210 MeV. χ²/dof = 323/58; χ²/dof = 412/58.
Pure FSI disfavored. I=0 S-wave FSI CANNOT fit the BES data. FSI curve from A. Sibirtsev et al. (Phys. Rev. D71:054010, 2005). In the fit: (I=0) FSI * PS * eff + bck
Systematic Uncertainties for Method II. Only those that may change the background shape need to be taken into account. Some systematic errors, such as tracking efficiency, photon efficiency and 4C fit, which have very small impact on the background shapes and on the shape of the acceptance curve, can be ignored, since they contribute little to the χ² calculation.
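Method III: Likelihood Ratio Tests. This method can be applied to background shapes obtained from a sideband fit. It is widely used by many experiments. Two fits: with signal (fit1, likelihood L1) and without signal (fit0, likelihood L0). A rigorous statistical theorem (Wilks' theorem) tells us that $\Delta(2\ln L) = 2(\ln L_1 - \ln L_0)$ follows the χ² distribution with d.o.f. equal to the number of additional parameters in the fit with signal. So we have: $\mathrm{Prob} = \int_{\Delta(2\ln L)}^{\infty} \chi^2(t;\,\mathrm{d.o.f.})\,dt$.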
Using toy MC experiments, it can also be easily shown that Δ(2 ln L) = 2(ln L1 − ln L0) follows the χ² distribution with d.o.f. equal to the number of additional signal parameters. So, when we apply the likelihood ratio test, the number of d.o.f. must be taken into account. For a BW-like new signal we usually have at least 3 signal parameters (mass, width and amplitude), so using sqrt(Δ(2 ln L)) to estimate the signal significance is incorrect: it overestimates the significance by 0.7σ when claiming a 5σ discovery, i.e. the actual significance is only 4.3σ. (The probability is more than 10 times larger. BE CAREFUL!)
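The 5σ versus 4.3σ statement can be reproduced numerically. The sketch below takes 2(ln L1 − ln L0) = 25, evaluates the χ² tail probability for 1 and for 3 d.o.f., and converts each to "n sigma" with the two-sided Gaussian convention (the one under which 1 d.o.f. gives exactly the square root). The function names are hypothetical.

```python
import math
from statistics import NormalDist

def chi2_sf(x, ndf):
    """Upper-tail probability P(chi2 >= x) for an integer number of d.o.f."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    if ndf % 2:
        q, a = math.erfc(math.sqrt(half)), 0.5
    else:
        q, a = math.exp(-half), 1.0
    while 2 * a < ndf:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1.0))
        a += 1.0
    return q

def significance(delta_2lnl, ndf):
    """(n sigma, probability) of a likelihood ratio test with ndf extra parameters."""
    p = chi2_sf(delta_2lnl, ndf)
    return -NormalDist().inv_cdf(p / 2.0), p

delta_2lnl = 25.0   # 2 * (ln L1 - ln L0); its square root is the naive "5 sigma"
z1, p1 = significance(delta_2lnl, 1)
z3, p3 = significance(delta_2lnl, 3)

print(f"1 d.o.f.: p = {p1:.3g}, {z1:.2f} sigma")
print(f"3 d.o.f.: p = {p3:.3g}, {z3:.2f} sigma")
print(f"probability ratio = {p3 / p1:.0f}")
```

With three signal parameters the same likelihood difference corresponds to about 4.3σ, and the probability is more than ten times larger than the 1 d.o.f. reading suggests.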
BESII: Observation of X(1835) in J/ψ → γ π+π−η′. Statistical significance 7.7σ. The π+π−η′ mass spectrum for η′ decaying into π+π−η and γρ.
Systematic Uncertainties for Method III. The factors that change the likelihood ratio should be taken into account in the systematic uncertainties. Since we obtain the background shape from sideband information, the systematic uncertainties come mainly from the choice of fitting function and fitting range. Some systematic errors, such as tracking efficiency, photon efficiency and 4C fit, which have very small impact on the background shapes, can be ignored.
Example on Systematic Uncertainty of Significance: Observation of Y(2175) in J/ψ → η φ f0(980) at BESII
BESII preliminary. Fit with one resonance; BG shape is fixed to , f0 sideband BG. Significance 5.5σ. M = 2.186 ± 0.010 ± 0.006 GeV/c², Γ = 0.065 ± 0.023 ± 0.017 GeV/c², N events = 52 ± 12. [Figure axis: M(φ f0(980)), GeV/c².]
BESII preliminary. Fit with one resonance; BG is represented by a 3rd-order polynomial. Significance 4.9σ. M = 2.182 ± 0.010 GeV/c², Γ = 0.073 ± 0.024 GeV/c², N events = 61 ± 14.
BESII preliminary. Fit with two resonances; BG shape is fixed to , f0 sideband BG; the mass and width of the second peak are fixed to the values from BaBar. Significance 5.8σ and 2.5σ. M = 2.186 ± 0.010 GeV/c², Γ = 0.065 ± 0.022 GeV/c², N1 events = 47 ± 14, N2 events = 22 ± 11.
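Summary and discussion of signal statistical significance. The essence of significance is the Prob. of a statistical hypothesis test; this Prob. is commonly expressed as "nσ". When the likelihood ratio method is used to estimate the significance (or Prob.), the number of d.o.f. must be taken into account. Only when exactly 1 signal parameter is introduced can sqrt(Δ(2 ln L)) be used to estimate "nσ". How systematic errors are treated depends on what information is used.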
Comments on statistical significance (Talk by S. Jin at ICHEP04). Using to estimate statistical significance seems too optimistic. Even if we have firm knowledge on the background, CLb as used by LEP Higgs is recommended. When the background is estimated from a fit to the sideband, the likelihood ratio with the d.o.f. taken into consideration is a better estimator of statistical significance. In this case, the uncertainty from all possible background shapes should be included in the uncertainty of the significance. Do not optimize or tune the cuts on the data! Determine the cuts from MC optimization before looking at the data. The systematic uncertainty on the significance from a "biased" cut is hard to estimate. The "look elsewhere" effect may reduce the significance by 1~2σ.
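A rough feeling for the size of the "look elsewhere" effect comes from a trials-factor estimate. The sketch below assumes N independent places where a fluctuation could have appeared and uses p_global = 1 − (1 − p_local)^N; both this formula and the value N = 100 are illustrative assumptions, not the procedure of the talk.

```python
import math
from statistics import NormalDist

def global_significance(local_nsigma, n_trials):
    """Global n sigma after a simple trials-factor correction (one-sided)."""
    p_local = 0.5 * math.erfc(local_nsigma / math.sqrt(2.0))
    # 1 - (1 - p_local)**n_trials, written to stay accurate for tiny p_local.
    p_global = -math.expm1(n_trials * math.log1p(-p_local))
    return -NormalDist().inv_cdf(p_global)

n_trials = 100   # hypothetical number of independent search windows
z5 = global_significance(5.0, n_trials)
z4 = global_significance(4.0, n_trials)

print(f"local 5.0 sigma -> global {z5:.2f} sigma")
print(f"local 4.0 sigma -> global {z4:.2f} sigma")
```

With these assumptions a local 5σ effect drops to about 4σ, consistent in size with the 1~2σ reduction quoted above.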
Thanks!