方贤进 http://star.aust.edu.cn/~xjfang/virginia.pptx 利用重合指数法破解Virginia加密 方贤进 http://star.aust.edu.cn/~xjfang/virginia.pptx
Part1 : 重合指数及其无偏估计值 重合指数:设某种语言由n个字母组成,每个字母i发生的概率为pi(1≤i≤n),则重合指数就是指两个随机字母相同的概率,记为IC 一般用IC的无偏估计值IC’来近似计算IC. 其中的xi表示字母i出现的频次,L表示文本长度,n表示某种语言中包含的字母数。
IC’值的特点 随机英文文本的IC’总是大约为0.038. 而一段有意义的英文文本的IC’总是大约为0.065.
Example 1:随机英文文本明文及其IC’
Example 1:随机英文文本密文及其IC’ (移位加密key=17) 注:英文文本中字母的编码为 a~z…….0~25
Example 2:一个有意义的英文text Differential Privacy is the state-of-the-art goal for the problem of privacy-preserving data release and privacy-preserving data mining. Existing techniques using differential privacy, however, cannot effectively handle the publication of high-dimensional data. In particular, when the input dataset contains a large number of attributes, existing methods incur higher computing complexity and lower information to noise ratio, which renders the published data next to useless. This proposal aims to reduce computing complexity and signal to noise ratio. The starting point is to approximate the full distribution of high-dimensional dataset with a set of low-dimensional marginal distributions via optimizing score function and reducing sensitivity, in which generation of noisy conditional distributions with differential privacy is computed in a set of low-dimensional subspaces, and then, the sample tuples from the noisy approximation distribution are used to generate and release the synthetic dataset. Some crucial science problems would be investigated below: (i) constructing a low k-degree Bayesian network over the high-dimensional dataset via exponential mechanism in differential privacy, where the score function is optimized to reduce the sensitivity using mutual information, equivalence classes in maximum joint distribution and dynamic programming; (ii)studying the algorithm to compute a set of noisy conditional distributions from joint distributions in the subspace of Bayesian network, via the Laplace mechanism of differential privacy. (iii)exploring how to generate synthetic data from the differentially private Bayesian network and conditional distributions, without explicitly materializing the noisy global distribution. The proposed solution may have theoretical and technical significance for synthetic data generation with differential privacy on business prospects. 其IC’为:0.0659
Part2 : Virginia加密 密钥空间为26d 多表密码是利用多个单表代替密码构成的密码体制。它在对明文进行加密的过程中依照密钥的指示轮流使用多个单表代替密码。 明文M=(m1,m2,…,mn),密钥 K=(k1,k2,…,kd) ,密文C=(c1,c2,…,cn) 加密变换:ci+td=Eki(mi+td)=mi+td+ki mod n 解密变换: mi+td=Dki(ci+td)=ci+td - ki mod n 密钥空间为26d
Example 3: plaintext.txt differentialprivacyisthestateoftheartgoalfortheproblemofprivacypreservingdatareleaseandprivacypreservingdataminingexistingtechniquesusingdifferentialprivacyhowevercannoteffectivelyhandlethepublicationofhighdimensionaldatainparticularwhentheinputdatasetcontainsalargenumberofattributesexistingmethodsincurhighercomputingcomplexityandlowerinformationtonoiseratiowhichrendersthepublisheddatanexttouselessthisproposalaimstoreducecomputingcomplexityandsignaltonoiseratiothestartingpointistoapproximatethefulldistributionofhighdimensionaldatasetwithasetoflowdimensionalmarginaldistributionsviaoptimizingscorefunctionandreducingsensitivityinwhichgenerationofnoisyconditionaldistributionswithdifferentialprivacyiscomputedinasetoflowdimensionalsubspacesandthenthesampletuplesfromthenoisyapproximationdistributionareusedtogenerateandreleasethesyntheticdatasetsomecrucialscienceproblemswouldbeinvestigatedbelowiconstructingalowkdegreebayesiannetworkoverthehighdimensionaldatasetviaexponentialmechanismindifferentialprivacywherethescorefunctionisoptimizedtoreducethesensitivityusingmutualinformationequivalenceclassesinmaximumjointdistributionanddynamicprogrammingiistudyingthealgorithmtocomputeasetofnoisyconditionaldistributionsfromjointdistributionsinthesubspaceofbayesiannetworkviathelaplacemechanismofdifferentialprivacyiiiexploringhowtogeneratesyntheticdatafromthedifferentiallyprivatebayesiannetworkandconditionaldistributionswithoutexplicitlymaterializingthenoisyglobaldistributiontheproposedsolutionmayhavetheoreticalandtechnicalsignificanceforsyntheticdatagenerationwithdifferentialprivacyonbusinessprospects 利用Virginia加密, key=infosec
Example 3: ciphertext 计算该文本(1个串,串长=1609)的重合指数无偏估计值IC=0.0418 lvktwvgvgnodttqifqqmubujglevmbkhziczglcsphweyvwttwoqseshxenjsgaxejgwvxqalrsxczrqsswgiaidjmxipddjiumeawfkfigfaarkvtjlawvqalhwgjvvviwwwavsuvmhnrwsfxkiyufazcklmcoixmehofrqbrktwgvqijzqlcvqqsllgxhgzagcbvtbgjjqtmraqgvfncfenlnyoarrieywuyniebvwrvprnbhyvlnyokivkbshsmpanqojkgvhrpwvqnnyhjmdcgjgwbkagnbyqgbutrkmpkhwvakjmehcetwbvsuusoxyjlaxaiaizgagzvstgvoigncfxqvbngwvcbvtkzmepejbvitagmshydtvxvwhfigfbwbvbbzgwpgafyvawrzbuckenivrglstmqzqwgquczharikbrddizqgdofhuqtsodxqvbngwvcbvthziubnwharixbnblmubbfdhvqfvrolivprkidpfqfyfafwbvtbgjjqtmraqgvfncfenlnyokivevyvswgbbkzgafqzjbkmqvnqasviqafzvmubenpmxkwaxjaeqxgnaadkvtxqgvgnhsqlmqvnsrjifcpnbywgvfnhazkblnbolkkulsfitigncfshvbngqgqvqnhaspiyiwkxtqozhaspajnhzhknsjfwrvqnqdjmxipdwkgquczhwhkvnxslshtbbraqgvfncfenahggheemffbvxjmayvwwcucqslyrtrxtjsobujbgmugnudjszqzfhasplvxhjmdcgncfetmhxsvxqorssjevmnsrjinmnxsllgalshzivqpioleumgxceiezhhwspukvjbuirzbgzwquebzzvfgqaaskxkonysvfgtbbwuspagwiuxkvtfzgamlrlfwidiljgaepvrykgvmwijfllgpvlvvmomaxwgrctqfhswgbinowbrwajblmctzjqzepqfrwfhknsjfwrvqnqdjmxipdkzitmgmskgqzrkifgvqbswksrbvrwrifbbwsvyemgmskipavywnmvghxwfkocgzodmpnbwasxkwajemmxiyjbuietnxgwwkvzflaqwuwtwfxfqfyfafwbvtbsrfllsoemexetujeouvsuamubhimaribujodkqzvyvexqkbrdmxgifjhgjpwvxmusplvywgrctqnglvkjhywgrunetabskvgiwkxtqozhaspavshziucoxdsggwsgoqiuqnsbwxywepjaevprqohpckrrsulcvvxagjfqsksjipbvfzhvkdnhmamkmkuzgvkvtmcoxqorssjevmfdbllgbvhrsxcnetallglvktwvgvgnodpaxenjsxgjndskmcvajhostsnsrusplvywgrctqnglvkjhywgruevyvgyvmkuzagkbydasxgzvfzadkvtyvwrqqfdudsdiyiwkxtqozhaspbujdjsrwfjrksncgncfqcgufjwxjmbwslmeiyfbvxgkuswuenavlbajkknsqwjqzfdbllgbvhrsxcorssjevqbskaxjlvktwvgvgnodttqifqqspjhxwfiuacwcktgkgxvzlj 计算该文本(1个串,串长=1609)的重合指数无偏估计值IC=0.0418
Example 3:将ciphertext分成2个子串 计算2个子串的重合指数无偏估计值的平均值为IC=0.0419
Example 3:将ciphertext分成3个子串 计算3个子串的重合指数无偏估计值的平均值为IC=0.0419
Example 3:依此类推,将ciphertext分成7个子串 子串1:lvqbmzwwxxqziimivqvanikmbqvxbqvliiplkavncabkmbxizivbpatibazimukqqvbbxbfpqbqvlebqvqbwxvnvcvbkivviqanqiuvtvammutbgqlcmommaqmzkzeqotavlivwpmtbwtqnqimzqbbmagcnwitvuqblxubbzkiwltjnvqacwqwpkvqbdmvombnlvxjvsltjembzvqiqbwcgmikakzboqlvqjak 子串2:vgiubgeoeearapegtavvryleriqhvtfneernbnhngguhevyavgbvegvgbfbvqcbgtbvnbbvrfvtfnvbznaeagthnpflugbqyojsnpcnbfhfacrunzvghrnnlpghvbbanbgtrlrivaqiazfsnpgrbvbgvhgbaynzwfvlevhuvbfvvqhegovosnerrvsvnktrfvevgenanvqhvkyvtfyoufgubyuvnfvrbvgihcg 子串3:knfjklyqnjlqidafjlvswumhkjqgtmnyybnysqryjntwhsjisnntjmxfzyurzzrdsntwnfrkytmnyykjqfnxnxssnnnlnnniznjqdzxbngfyqxjufxnxssxsixhjgzaybwfljyjlxfnjjrjqdmksrwmyxzwjjxftytstsijyrjxynytizsxgspqrxkfhumsdhtknndjsynyyudfydizjjnfwfslsdhssknfxwx …… 子串7:gtuvchthaxcgxufkvjwhkcxqvcgcjgnrnvvvpgqdkgpjwoagoqcetdfvgrntqizuqcuiuqvfwjgnvgfqiukqkgqfgkkthqptpkvxqkhgnejcrouzpdtqvngvueurugkgpkmdpmgocgrcpkvxtqvrfepvopkxekwfwfeouiqqgppckuktpuguyvccfpkkkqvgcggagctpckuvkgkqdtprncjegnkqgcvjgtpug 计算7个子串的重合指数无偏估计值的平均值为IC=0.0657
将密文串划分成多个子串,分别求IC无偏估计值平均值 数 子串1 子串2 子串3 子串4 子串5 子串6 子串7 平 均 IC 长 1 1609 0.0419 2 805 0.0427 804 0.0411 3 537 0.0417 536 0.0424 4 403 0.0425 402 0.0.98 5 322 0.0414 0.0418 0.0413 321 0.0415 6 269 0.0402 268 0.0397 0.0441 0.0432 0.0416 7 230 0.0674 0.0677 0.0621 0.0584 0.0744 0.0666 229 0.0634 0.0657 8 … 0.0422 因为有意义的英文文本的明文IC ≈ 0.065,而移位加密不改变其IC值,所以对应的密文的IC ≈0.065。所以通过上表可知key size=7
Part3 : Virginia密码的破解 拟重合指数:设某种语言由n个字母组成,每个字母i的统计概率为pi(i=1…n),每个字母在密文子串Cj(j=1…keysize)中出现的频次为fi,j ,每个密文子串Cj的长度为ni,j ,则第j个子串的拟重合指数定义为:
明文中各个字母出现的统计概率(pi)
Virginia密码破解方法 Step1: 将Virginia密码分成若干子串测算IC无偏估计值后,分析Virginia加密的密钥长度为keysize,同时得到keysize个密文子串。 Step2:根据Virginia加密算法可知, 每个子串中的密文字母都是对明文中的字母经过相同的移位加密得到的。而移位加密的密钥空间仅为26。因此对每个密文子串测试26次移位算法进行解密,每次测试时计算该子串的拟重合指数,根据拟重合指数最高的那次移位数可得出该子串所对应的Virginia加密密钥中的一个字母。 Step3: 对Step2重复keysize次即可得到组成密钥的所有字母。
Example 4: 对密文子串3测试26次移位算法进行解密 移位数 密文子串3经过移位 加密后的拟重合指数 1 0.0387 14 0.0326 2 0.0325 15 0.0348 3 0.0324 16 0.0416 4 0.0368 17 0.0392 5 (f) 0.0615 18 0.0405 6 0.0433 19 0.0361 7 0.0332 20 0.0461 8 0.0279 21 0.0386 9 0.0468 22 0.0356 10 0.0384 23 0.0313 11 0.0365 24 0.0364 12 25 0.0429 13 26 0.0340 子串3:knfjklyqnjlqidafjlvswumhkjqgtmnyybnysqryjntwhsjisnntjmxfzyurzzrdsntwnfrkytmnyykjqfnxnxssnnnlnnniznjqdzxbngfyqxjufxnxssxsixhjgzaybwfljyjlxfnjjrjqdmksrwmyxzwjjxftytstsijyrjxynytizsxgspqrxkfhumsdhtknndjsynyyudfydizjjnfwfslsdhssknfxwx 计算密文子串3执行26次移位算法的26个拟重合指数! 所以Virginia加密密钥中的第三个字母为”f” 依此类推,可求出7个密文子串的所对应的Virginia加密的密钥为”infosec”
The End Thank you!