How can Python randomly shuffle a list of 100,000 items?























I'm trying to perform some statistical analysis on long sequences of numbers, which requires a randomised shuffle of the list. The tests are sensitive, so fairness and randomness are very important. The list holds 100,000 integers, but I would like to try 1 million.





NB.




  • Fairness trumps efficiency or speed.


  • I have access to /dev/urandom.


  • The USA's NIST laboratory does this in C++ within its SP 800-90B entropy assessment suite, which shuffles sequences of 1 million bytes. See https://github.com/usnistgov/SP800-90B_EntropyAssessment.
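
As a point of reference for the /dev/urandom note above, here is a minimal sketch that shuffles a plain Python list with the standard library's random.SystemRandom, which draws its randomness from os.urandom (backed by /dev/urandom on Linux). The list contents and size are placeholders, not the actual data from the question.

```python
import random

# Placeholder data: 100,000 integers standing in for the real sequence.
data = list(range(100_000))

# SystemRandom draws from os.urandom rather than the Mersenne Twister,
# so there is no seed to manage.
rng = random.SystemRandom()

shuffled = data.copy()
rng.shuffle(shuffled)  # shuffles the copy in place; `data` is unchanged
```

This is likely slower than a seeded generator, which should be acceptable here since fairness is said to matter more than speed.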































  • What have you tried so far?
    – BernardL
    Nov 22 at 16:38










  • Can you use NumPy?
    – Nils Werner
    Nov 22 at 16:38










  • @NilsWerner I can, but won't that be subject to the 2080 limit?
    – Paul Uszak
    Nov 22 at 16:40










  • No, of course not!
    – Nils Werner
    Nov 22 at 16:40
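
For context on the "2080 limit" raised in the comments: the documentation for Python's standard-library random.shuffle notes that a sequence of length 2080 is the largest whose permutations can all fit within the period of the Mersenne Twister (2**19937 - 1). The sketch below reproduces that figure by comparing log2(n!) with 19937. The helper names are made up for illustration. The bound describes which permutations are reachable within one generator period; it is not a cap on how long a list can be shuffled.

```python
import math

MT_PERIOD_BITS = 19937  # the Mersenne Twister period is 2**19937 - 1


def log2_factorial(n: int) -> float:
    """Return log2(n!), the bits needed to index every permutation of n items."""
    return math.lgamma(n + 1) / math.log(2)


def largest_fully_covered_length(period_bits: int) -> int:
    """Largest n with log2(n!) <= period_bits (hypothetical helper name)."""
    n = 1
    while log2_factorial(n + 1) <= period_bits:
        n += 1
    return n


limit = largest_fully_covered_length(MT_PERIOD_BITS)
print(limit)  # 2080

# Bits needed to index every permutation of a 100,000-item list.
print(round(log2_factorial(100_000)))
```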















Tags: python, random, shuffle














edited Nov 22 at 17:05

























asked Nov 22 at 16:36









Paul Uszak





















2 Answers




















accepted










You can easily shuffle millions of numbers in NumPy:



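import numpy as np

data = np.arange(1e6)  # one million values (floats; use np.arange(10**6) for integers)
# np.random.shuffle works in place. %timeit is an IPython/Jupyter magic, not plain Python.
%timeit np.random.shuffle(data)
# 32.7 ms ± 2.25 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)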
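
A variant of the snippet above for newer NumPy releases: since version 1.17, NumPy also provides the Generator interface through np.random.default_rng(), which pulls fresh entropy from the operating system when called without a seed. This is only a sketch; the array size mirrors the example above, and the integer data is an assumption based on the question.

```python
import numpy as np

# Unseeded: default_rng() seeds itself from OS entropy.
rng = np.random.default_rng()

data = np.arange(1_000_000)  # one million integers, as in the question
rng.shuffle(data)            # shuffles the array in place
```

Passing an explicit seed to default_rng() would make the shuffle reproducible, which may or may not be wanted for the statistical tests described.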




























  • I misunderstood what the "Maximal" link was actually telling me. Of course you're right, of course :-)
    – Paul Uszak
    Nov 22 at 17:05































Have you tried using numpy's shuffle?



https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.shuffle.html



Or use permutation if you don't want to shuffle in place: https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.permutation.html
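
To illustrate the difference between the two functions linked above, here is a small sketch: np.random.shuffle reorders its argument in place and returns None, while np.random.permutation leaves the input untouched and returns a shuffled copy. The ten-element array is just a placeholder.

```python
import numpy as np

original = np.arange(10)

# permutation returns a shuffled copy; `original` is left unchanged.
shuffled_copy = np.random.permutation(original)

# shuffle works in place and returns None.
in_place = original.copy()
result = np.random.shuffle(in_place)
```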



























Accepted answer by Nils Werner: answered Nov 22 at 16:41, edited Nov 22 at 17:09






















Second answer by Dan: answered Nov 22 at 16:39





























