Choosing ordered distinct random integers

This is a response to a puzzle posted by Daniel Lemire.

On first reading, I missed that Daniel wanted an ordered set, so offered a ‘shuffle’ algorithm.

On second reading, noting the power-of-two bias in the benchmark, took advantage and offered a ‘cheat’ algorithm, as an upper limit. The ‘cheat’ algorithm does not generate all possible sets. But it is fast. Which matters most depends on your usage.

Posted the sources to GitHub.

… a diversion. :)

Reporting speed as millions of values per second.  Max: 1048576
              hash  hash/neg    hybrid reservoir      tree   shuffle     cheat
  4096        0.27      1.21      5.37      0.01      0.53      5.28      7.01
  2048        4.33      2.59      5.12      0.02      2.65     10.09     92.93
  1024        4.27      6.53      3.14      0.04      4.39     12.50    108.37
   512        5.07      6.32     11.11      0.09      5.09     12.17    111.78
   256        5.22      5.86     16.16      0.18      4.46     11.52    108.13
   128        5.31      5.48     19.95      0.35      4.32     10.83    107.33
    64        5.32      5.60     22.89      0.68      3.95     10.22    107.44
    32        5.36      5.44     26.49      1.29      3.91      9.81    107.45
    16        4.91      5.04     28.51      2.35      3.19      9.39    107.37
     8        3.84      3.94     28.28      4.03      2.47      8.76    107.63
     4        2.72      2.80     26.85      6.54      1.51      8.50    107.10
     2        2.08      2.11     21.98     10.71      0.94      8.06    106.16
Reporting speed as millions of values per second.  Max: 262144
              hash  hash/neg    hybrid reservoir      tree   shuffle     cheat
  1024        7.22      7.30      9.00      0.05      6.77     15.14    105.88
   512        6.99      6.95     12.57      0.09      6.83     14.04    105.59
   256        6.78      6.74     15.81      0.18      5.88     10.09    103.17
   128        6.20      6.28     20.68      0.35      5.75     12.61    112.66
    64        6.19      6.12     24.73      0.69      4.71     12.16    111.99
    32        5.96      5.97     28.04      1.32      4.74     11.51    110.57
    16        5.51      5.85     30.23      2.43      4.32     10.96    108.58
     8        5.48      5.57     30.78      4.14      3.73     10.42    112.01
     4        5.25      5.26     28.93      6.89      3.09      9.61    104.60
     2        3.78      3.80     23.74     11.56      1.93      9.19    106.82
Reporting speed as millions of values per second.  Max: 65536
              hash  hash/neg    hybrid reservoir      tree   shuffle     cheat
   256        6.52      6.66     16.45      0.17      7.11     15.54     94.37
   128        6.56      7.01     21.25      0.35      6.74     14.11    111.87
    64        6.62      6.56     24.87      0.67      5.75     12.59    102.90
    32        5.63      5.96     26.33      1.23      5.67     12.40    108.15
    16        6.21      6.09     30.36      2.46      5.10     12.10    110.82
     8        5.60      5.75     30.36      4.48      4.56     11.55    110.59
     4        5.99      5.79     28.92      7.44      3.95     10.96    110.19
     2        5.61      5.76     23.79     12.63      2.99     10.44    109.16

blogroll

social