Instead of overdesigning on a hunch, run the experiment yourself


Conmajia
Jan. 29th, 2019

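Back in 2012 I ran an experiment on different ways of initializing C#'s System.Random and concluded that a single default initialization already yields random numbers of very good quality. Years later, with the .NET Framework having gone from 2.0 to 4.7, I still see plenty of newcomers (and even veterans) online being misled by odd ideas they take for granted, spending time and effort on work that accomplishes nothing.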

▲ Countless misled souls

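Some people feel that using something extra and obscure makes them look skilled and impressive. Just as rare ancient artifacts in cultivation novels usually outclass mass-produced modern gear, dropping a few uncommon patterns or callbacks into your code, ones nobody would normally reach for in the problem at hand, seems to confer noble blood and leave onlookers awestruck. But the most splendid Roman chariot is no match for the Wuling Hongguang vans threading through the back streets, and the flashiest development trick cannot redeem mindless overdesign.

There has long been a belief that seeding Random with a GUID, which is highly random, produces more random output than new Random(). This confidence, of unknown origin, persists to this day.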

▲ Never mind whether it fits, just show off at random

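Leaving aside what this family of misguided tricks does to performance, their authors clearly have some misconceptions about true versus pseudo-random numbers, and do not really understand how a computer generates random numbers (see note 1). They see that long, ever-changing GUID and a sense of reverence wells up. Then reality arrives like a bucket of ice water: for Random, any extra initialization is nothing but dead weight.

A flurry of moves as fierce as a tiger, and the scoreboard reads 0-5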

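Random is already good enough

The first criterion for judging a random number generator is whether its output is uniformly distributed over its range, that is, whether every value is equally likely. The second is performance. No matter what dazzling techniques your Random uses, if it is slow and eats resources, it is garbage. If producing one random number took 5 minutes, "shuffle" in any music player would be no fun at all. Schemes that seed with fancy values can never outperform the default constructor, because they must build that seed every time a number is generated; otherwise they would behave exactly like a single default initialization and the fancy seed would be pointless. Setting the mathematical theory aside, let's first show the show-offs whether the default Random is random enough.

The chart below is a probability histogram of data generated by a Random initialized with the default constructor; the \(y\) axis is the probability with which each value occurred. In total 10,000,000 (ten million) random numbers in the range 0-99 were generated, taking 0.76 seconds. Let \(\tau=0.76\) s; the later cases use \(\tau\) as the performance baseline, which factors out the computing platform.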

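Case #1
new Random(), elapsed: $\tau=0.76$ s
[Histogram: probability of each value, 0-99]

// Function definition: one shared generator, created once
static Random r = new Random();
static int GetRandomNumber() {
    return r.Next(0, 100);  // upper bound is exclusive, so values are 0-99
}

// Main program
...
for (int i = 0; i < 10000000; i++) {
    output[i] = GetRandomNumber();
}
...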

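Look: a textbook uniform distribution. Backed by ten million samples, each of the 100 possible values 0-99 comes out at almost exactly 1%; in the histogram data every bar lies within about ±0.00008 of that. What is there to complain about?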
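The uniformity check can be reproduced with a short program. The following is a minimal sketch, not the code used for the article's charts: the class name UniformityCheck and the method MaxDeviation are made up for illustration. It tallies the output of one shared Random into bins and reports the largest deviation of any bin from the ideal frequency.

```csharp
using System;

static class UniformityCheck
{
    // One shared generator, created once (the Case #1 setup).
    static readonly Random Shared = new Random();

    // Hypothetical helper: returns the largest absolute deviation of any
    // bin's observed frequency from the ideal frequency 1 / bins.
    public static double MaxDeviation(int samples, int bins)
    {
        long[] counts = new long[bins];
        for (int i = 0; i < samples; i++)
        {
            counts[Shared.Next(0, bins)]++;   // upper bound is exclusive
        }

        double ideal = 1.0 / bins;
        double worst = 0.0;
        foreach (long c in counts)
        {
            double deviation = Math.Abs((double)c / samples - ideal);
            if (deviation > worst) worst = deviation;
        }
        return worst;
    }

    static void Main()
    {
        double worst = MaxDeviation(10000000, 100);
        Console.WriteLine("max deviation from 1%: {0:E2}", worst);
    }
}
```

The result varies from run to run, since the default constructor picks its own seed; what matters is the order of magnitude.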

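Now for the much-praised approach of initializing the random number generator with a GUID. You could also describe it, tongue-twister style, as generating a random generator randomly; it is all for show anyway.

Take a fairly simple GUID example.

Again we generate 10,000,000 random numbers. The main program is unchanged; only GetRandomNumber() needs to be modified.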

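Case #2
new Random(GUID), elapsed: $52\tau$
[Histogram: probability of each value, 0-99]

static int GetRandomNumber() {
    // A new generator for every number, seeded with the first 4 bytes of a fresh GUID
    Random r = new Random(
        BitConverter.ToInt32(Guid.NewGuid().ToByteArray(), 0)
    );
    return r.Next(0, 100);
}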

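Sure enough, this does produce essentially random output that does not repeat from call to call. But in the histogram data its bars deviate from 1% by up to about ±0.0001, slightly more than with the default initialization, so it is certainly no better. And performance? For similar output with a slightly larger error, generating 10,000,000 numbers with a GUID seed took 52 times as long as the default initialization (39.5 seconds). With performance this bad, what exactly is there to brag about?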
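For scale, it helps to estimate how much scatter pure sampling noise produces. Assuming each draw lands in a given bin independently with probability \(p = 1/100\) (an idealization, not something measured in the article), the observed frequency of that bin over \(n = 10^{7}\) draws has standard deviation

\[
\sigma = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.01 \times 0.99}{10^{7}}} \approx 3.1 \times 10^{-5}
\]

So bar-to-bar deviations of a few times \(10^{-5}\), reaching roughly \(10^{-4}\) for the worst of 100 bars, are what even an ideal uniform source would show. Differences of that size between seeding schemes say nothing about one being more random than the other.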

Now for something even flashier, the example from the image at the top of this article, which uses the over-the-top GUID × Time × counter initialization scheme:

Case #3
new Random(GUID * Time * count), elapsed: $56\tau$
[Histogram: probability of each value, 0-99]

static int randomCount = 0;
static int GetRandomNumber() {
    randomCount++;
    Guid guid = Guid.NewGuid();
    int key1 = guid.GetHashCode();
    int key2 = unchecked((int)DateTime.Now.Ticks);
    int seed = unchecked(key1 * key2 * randomCount);  // overflow wraps silently
    Random r = new Random(seed);
    return r.Next(0, 100);
}

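Give it a rest!

The author of this code even thought to use unchecked to make it a bit more robust; it looks like a habit, so presumably he does this a lot in his day job. Then come the hash code, the time ticks, and the rest of the keys, all stirred together into a seed for Random. But what good does it do, my friend? Spending 56 times the computing time (43 seconds) for output that is practically identical to Case #1: does that seem reasonable to you?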
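The timing comparison is easy to repeat. The sketch below is an assumption about how such a measurement could be set up, not the author's harness: SeedingBenchmark, NextShared, NextGuidSeeded, and Measure are hypothetical names, and Stopwatch is used for timing. Absolute numbers depend on the machine and runtime, so only the ratio is meaningful.

```csharp
using System;
using System.Diagnostics;

static class SeedingBenchmark
{
    static readonly Random Shared = new Random();

    // Case #1 style: reuse one generator for every number.
    public static int NextShared()
    {
        return Shared.Next(0, 100);
    }

    // Case #2 style: build a new GUID-seeded generator for every number.
    public static int NextGuidSeeded()
    {
        int seed = BitConverter.ToInt32(Guid.NewGuid().ToByteArray(), 0);
        return new Random(seed).Next(0, 100);
    }

    // Calls `next` `count` times and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Func<int> next, int count)
    {
        long checksum = 0;                    // consume the results
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            checksum += next();
        }
        watch.Stop();
        GC.KeepAlive(checksum);
        return watch.Elapsed;
    }

    static void Main()
    {
        const int count = 1000000;            // smaller than the article's run
        TimeSpan shared = Measure(NextShared, count);
        TimeSpan guid = Measure(NextGuidSeeded, count);
        Console.WriteLine("shared:      {0:F1} ms", shared.TotalMilliseconds);
        Console.WriteLine("GUID-seeded: {0:F1} ms", guid.TotalMilliseconds);
        Console.WriteLine("ratio:       {0:F1}",
            guid.TotalMilliseconds / shared.TotalMilliseconds);
    }
}
```

Any other seeding scheme, such as the Case #3 function, can be passed to Measure in the same way.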

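The quality of code is not measured by how many tricks it uses or how much knowledge it shows off; with a little effort, that is not hard to do. On the contrary, achieving the right functionality and good performance in the simplest possible way is the hardest part. Nor can you judge whether code is practical by its length. Any conclusion needs either a theoretical derivation or an experiment to back it. The pronouncements of half-baked gurus that people treat as gospel often turn out to be nonsense.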

The End. \(\Box\)

Histogram data (x: generated value 0-99, y: observed probability over 10,000,000 samples): Case #1 (canvas-uniform) bars range from about 0.00992 to 0.01006; Case #2 (canvas-guid) from about 0.00990 to 0.01008; Case #3 (canvas-guid-complex) from about 0.00991 to 0.01007.

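  1. All true random number generators require dedicated hardware, and most of them are protected by patents. System.Random is based on Donald E. Knuth's subtractive random number generator algorithm; for practical purposes it is random enough.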
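For reference, the subtractive method as it is usually presented in textbooks is the lagged recurrence below. The lags 55 and 24 and the modulus \(m\) come from that textbook presentation; the exact constants used inside System.Random are not taken from this article.

\[
X_n = (X_{n-55} - X_{n-24}) \bmod m, \qquad n \ge 55
\]

Each new value depends only on two earlier values of the same sequence, so the whole output is fully determined by the seed that fills the first 55 entries.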