Divide a set of strings into those which contain “a” and those which don't











up vote
4
down vote

favorite
1












I am beginning to learn Python on my own through edX and I was given this basic problem: "Given a list of cities, separate the list into two lists, namely one with cities containing the letter 'a' in their name and one with cities not containing the letter 'a'."



This is my approach. It works, but I can't help but feel like there is a much more concise way to get this result. If anybody could offer some other methods I would be interested:



cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

a_city, no_a_city = ,
for city in cities:
for x in range(0, len(city)):
if city[x] == "a":
a_city.append(city)
break
else:
if x == len(city) - 1:
no_a_city.append(city)



print("a_city:", a_city)
print("no_a_city:", no_a_city)


I am also interested in learning more about the interaction between nested loops/conditionals and complexity if anybody has some resources.










share|improve this question









New contributor




JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.
























    up vote
    4
    down vote

    favorite
    1












    I am beginning to learn Python on my own through edX and I was given this basic problem: "Given a list of cities, separate the list into two lists, namely one with cities containing the letter 'a' in their name and one with cities not containing the letter 'a'."



    This is my approach. It works, but I can't help but feel like there is a much more concise way to get this result. If anybody could offer some other methods I would be interested:



    cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

    a_city, no_a_city = ,
    for city in cities:
    for x in range(0, len(city)):
    if city[x] == "a":
    a_city.append(city)
    break
    else:
    if x == len(city) - 1:
    no_a_city.append(city)



    print("a_city:", a_city)
    print("no_a_city:", no_a_city)


    I am also interested in learning more about the interaction between nested loops/conditionals and complexity if anybody has some resources.










    share|improve this question









    New contributor




    JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.






















      up vote
      4
      down vote

      favorite
      1









      up vote
      4
      down vote

      favorite
      1






      1





      I am beginning to learn Python on my own through edX and I was given this basic problem: "Given a list of cities, separate the list into two lists, namely one with cities containing the letter 'a' in their name and one with cities not containing the letter 'a'."



      This is my approach. It works, but I can't help but feel like there is a much more concise way to get this result. If anybody could offer some other methods I would be interested:



      cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

      a_city, no_a_city = ,
      for city in cities:
      for x in range(0, len(city)):
      if city[x] == "a":
      a_city.append(city)
      break
      else:
      if x == len(city) - 1:
      no_a_city.append(city)



      print("a_city:", a_city)
      print("no_a_city:", no_a_city)


      I am also interested in learning more about the interaction between nested loops/conditionals and complexity if anybody has some resources.










      share|improve this question









      New contributor




      JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.











      I am beginning to learn Python on my own through edX and I was given this basic problem: "Given a list of cities, separate the list into two lists, namely one with cities containing the letter 'a' in their name and one with cities not containing the letter 'a'."



      This is my approach. It works, but I can't help but feel like there is a much more concise way to get this result. If anybody could offer some other methods I would be interested:



      cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

      a_city, no_a_city = ,
      for city in cities:
      for x in range(0, len(city)):
      if city[x] == "a":
      a_city.append(city)
      break
      else:
      if x == len(city) - 1:
      no_a_city.append(city)



      print("a_city:", a_city)
      print("no_a_city:", no_a_city)


      I am also interested in learning more about the interaction between nested loops/conditionals and complexity if anybody has some resources.







      python python-3.x






      share|improve this question









      New contributor




      JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.











      share|improve this question









      New contributor




      JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      share|improve this question




      share|improve this question








      edited 6 hours ago









      Toby Speight

      22.1k536108




      22.1k536108






      New contributor




      JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      asked 7 hours ago









      JacobCheverie

      1235




      1235




      New contributor




      JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.





      New contributor





      JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






      JacobCheverie is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






















          3 Answers
          3






          active

          oldest

          votes

















          up vote
          8
          down vote



          accepted










          This usecase is actually covered by one of the itertools recipes. itertools is a package in the Python standard library that supplies fast and efficient tools for iterating over things or creating certain iterable things (like the combination of all pairs and such). It is an often used library and well worth it to get acquainted with.



          The recipe is as follows:




          from itertools import filterfalse, tee

          def partition(pred, iterable):
          'Use a predicate to partition entries into false entries and true entries'
          # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
          t1, t2 = tee(iterable)
          return filterfalse(pred, t1), filter(pred, t2)



          In your specific case you would use it like this:



          if __name__ == "__main__":    
          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]
          no_a_city, a_city = map(list, partition(lambda city: "a" in city, cities))
          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The map(list, ...) part is needed because what the partition function returns are generators that generate values on the fly. They can be consumed into a list.



          The predicate used is a lambda function, an anonymous function which in this case returns truthy or falsy values. It is used to test each element of the iterable.



          Instead of manually iterating over each name (even worse, over each index of each name, have a look at Loop Like A Native), I used the fact that strings support the in operator.



          I also added a if __name__ == "__main__": guard to allow importing from this script from another script.





          One thing you could have used in your code is the fact that for loops have an optional else clause which is run if no break statement interrupted the loop:



          a_city, no_a_city = ,
          for city in cities:
          for char in city:
          if char == "a":
          a_city.append(city)
          break
          else:
          no_a_city.append(city)




          As for complexity, this has the same complexity as your code. You have two nested for loops, making this on average $mathcal{O}(nk)$ with $n$ being the number of cities and $k$ being the average length of the city names.



          The in operator for strings is $mathcal{O}(k)$ (it is just the same loop you wrote, but probably wirtten in C) and it is used once per city, so my code should also be $mathcal{O}(nk)$.






          share|improve this answer























          • Thank you Graipher, That is a lot to take in for a beginner, but I will work on dissecting some of it to improve my coding. I briefly checked out that link that you provided and already see some issues with what I am doing (as you pointed out, iterating over each name AND each index of each name) so thanks for that. You've given me a lot to look into. On complexity: using the link that you provided, I can see a way to write this same code using only one for loop. Does this imply that it is less complex than your code?
            – JacobCheverie
            6 hours ago








          • 1




            @JacobCheverie: Yes, there is still some road left ahead of you, but don't despair. And no, just because there is only one visible for loop does not make it linear in time (my solution has no visible for loop and yet it is not). I cannot think of a way that does not have to look at each character of each city name, but maybe you are more clever than I.
            – Graipher
            6 hours ago


















          up vote
          3
          down vote













          This is not a complete review, but rather two simpler alternatives that illustrate how you can cut the some of the inner control flow using the in keyword. This keyword tests for membership of an object in an iterable. In this case, you want to test for membership of 'a' in the name of each element of cities.



          The first takes the same approach as your code



          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

          a_city, no_a_city = ,
          for city in cities:
          if 'a' in city:
          a_city.append(city)
          else:
          no_a_city.append(city)

          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The membership test using in is a drop-in replacement for the harder-to-read and more error-prone explicit loop over the characters.



          An even cleaner solution makes use of the built-in set data type. In simplistic terms, a set is like a list, except that it is not ordered and does not contain duplicates.



          # a set is constructed with {}, unlike the the  used for lists
          cities = {"New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"}

          # we can also construct one using the `set` builtin function (analogous to `list`)
          a_city = set(city for city in cities if 'a' in city)

          # subtracting one set from another is well-defined and does the obvious thing
          no_a_city = cities - a_city


          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          Check out the docs for a flavor of the kinds of rich membership and comparison that sets allow. These operations can typically be expected to be more efficient than equivalent algorithms on lists, mainly due to the fact that sets are guaranteed not to have duplicate elements.






          share|improve this answer





















          • Thanks @Endulum, I like the set approach. I can see where I would have to be careful using this, however. If I was not guaranteed to have distinct members then I assume that the set approach would discard some information and thus not be appropriate. When you define a_city, do I read this as I would read a mathematical set (set of all 'city' such that for 'city' in 'cities' if 'a' in 'city')? Without reading as so, it appears to me as a typo.
            – JacobCheverie
            1 hour ago










          • Regarding the appropriateness of sets versus dictionaries: yes, if is desirable that you retain duplicates, then a set is likely not the right collection to choose.
            – Endulum
            1 hour ago










          • Regarding the relationship to mathematical set notation, it should be read like (the set of all 'city' in 'cities' such that 'a' in 'city'). It is an alternative syntax for consisely building lists, sets, or other collections without an explicit loop. It is called a "comprehension", in this case a "set comprehension". If you have not learned about these yet, I expect you will before long. They are very popular among Python programmers.
            – Endulum
            1 hour ago










          • I had touched on list comprehensions briefly last night while reading but I didn't recognize the form. I only asked about the relationship to mathematics because city is written twice at the start of the set definition which seems a bit hard to remember, but I think that this is just Python syntax I must get used to. Thanks again!
            – JacobCheverie
            36 mins ago


















          up vote
          3
          down vote













          The task here is to collate the list of cities according to a key. In this case the key can be 'a' in city, which is True if the city contains 'a' and False otherwise.



          It's common to encounter processing tasks which require collation, for example to find words that are anagrams of each other we could collate the words in a dictionary according to their sorted letters.



          There is a standard pattern for collation in Python, which is to use collections.defaultdict and a loop. In the cities case, it goes like this:



          from collections import defaultdict
          with_a = defaultdict(list)
          for city in cities:
          with_a['a' in city].append(city)


          After running this loop, with_a[True] is the list of cities with 'a' and with_a[False] is the list of cities without.



          I prefer this approach to itertools.partition because it iterates over the input just once (whereas partition iterates over the input twice), and it's clear how to generalize it to other kinds of key.






          share|improve this answer





















          • This is seemingly high-powered. I will look into defaultdict some more to gain a better understanding of the code. When you mention your preference of this over partition, does that imply that this would be a more efficient/less complex approach?
            – JacobCheverie
            1 hour ago











          Your Answer





          StackExchange.ifUsing("editor", function () {
          return StackExchange.using("mathjaxEditing", function () {
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
          });
          });
          }, "mathjax-editing");

          StackExchange.ifUsing("editor", function () {
          StackExchange.using("externalEditor", function () {
          StackExchange.using("snippets", function () {
          StackExchange.snippets.init();
          });
          });
          }, "code-snippets");

          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "196"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });






          JacobCheverie is a new contributor. Be nice, and check out our Code of Conduct.










           

          draft saved


          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f208130%2fdivide-a-set-of-strings-into-those-which-contain-a-and-those-which-dont%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          3 Answers
          3






          active

          oldest

          votes








          3 Answers
          3






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          8
          down vote



          accepted










          This usecase is actually covered by one of the itertools recipes. itertools is a package in the Python standard library that supplies fast and efficient tools for iterating over things or creating certain iterable things (like the combination of all pairs and such). It is an often used library and well worth it to get acquainted with.



          The recipe is as follows:




          from itertools import filterfalse, tee

          def partition(pred, iterable):
          'Use a predicate to partition entries into false entries and true entries'
          # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
          t1, t2 = tee(iterable)
          return filterfalse(pred, t1), filter(pred, t2)



          In your specific case you would use it like this:



          if __name__ == "__main__":    
          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]
          no_a_city, a_city = map(list, partition(lambda city: "a" in city, cities))
          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The map(list, ...) part is needed because what the partition function returns are generators that generate values on the fly. They can be consumed into a list.



          The predicate used is a lambda function, an anonymous function which in this case returns truthy or falsy values. It is used to test each element of the iterable.



          Instead of manually iterating over each name (even worse, over each index of each name, have a look at Loop Like A Native), I used the fact that strings support the in operator.



          I also added a if __name__ == "__main__": guard to allow importing from this script from another script.





          One thing you could have used in your code is the fact that for loops have an optional else clause which is run if no break statement interrupted the loop:



          a_city, no_a_city = ,
          for city in cities:
          for char in city:
          if char == "a":
          a_city.append(city)
          break
          else:
          no_a_city.append(city)




          As for complexity, this has the same complexity as your code. You have two nested for loops, making this on average $mathcal{O}(nk)$ with $n$ being the number of cities and $k$ being the average length of the city names.



          The in operator for strings is $mathcal{O}(k)$ (it is just the same loop you wrote, but probably wirtten in C) and it is used once per city, so my code should also be $mathcal{O}(nk)$.






          share|improve this answer























          • Thank you Graipher, That is a lot to take in for a beginner, but I will work on dissecting some of it to improve my coding. I briefly checked out that link that you provided and already see some issues with what I am doing (as you pointed out, iterating over each name AND each index of each name) so thanks for that. You've given me a lot to look into. On complexity: using the link that you provided, I can see a way to write this same code using only one for loop. Does this imply that it is less complex than your code?
            – JacobCheverie
            6 hours ago








          • 1




            @JacobCheverie: Yes, there is still some road left ahead of you, but don't despair. And no, just because there is only one visible for loop does not make it linear in time (my solution has no visible for loop and yet it is not). I cannot think of a way that does not have to look at each character of each city name, but maybe you are more clever than I.
            – Graipher
            6 hours ago















          up vote
          8
          down vote



          accepted










          This usecase is actually covered by one of the itertools recipes. itertools is a package in the Python standard library that supplies fast and efficient tools for iterating over things or creating certain iterable things (like the combination of all pairs and such). It is an often used library and well worth it to get acquainted with.



          The recipe is as follows:




          from itertools import filterfalse, tee

          def partition(pred, iterable):
          'Use a predicate to partition entries into false entries and true entries'
          # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
          t1, t2 = tee(iterable)
          return filterfalse(pred, t1), filter(pred, t2)



          In your specific case you would use it like this:



          if __name__ == "__main__":    
          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]
          no_a_city, a_city = map(list, partition(lambda city: "a" in city, cities))
          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The map(list, ...) part is needed because what the partition function returns are generators that generate values on the fly. They can be consumed into a list.



          The predicate used is a lambda function, an anonymous function which in this case returns truthy or falsy values. It is used to test each element of the iterable.



          Instead of manually iterating over each name (even worse, over each index of each name, have a look at Loop Like A Native), I used the fact that strings support the in operator.



          I also added a if __name__ == "__main__": guard to allow importing from this script from another script.





          One thing you could have used in your code is the fact that for loops have an optional else clause which is run if no break statement interrupted the loop:



          a_city, no_a_city = ,
          for city in cities:
          for char in city:
          if char == "a":
          a_city.append(city)
          break
          else:
          no_a_city.append(city)




          As for complexity, this has the same complexity as your code. You have two nested for loops, making this on average $mathcal{O}(nk)$ with $n$ being the number of cities and $k$ being the average length of the city names.



          The in operator for strings is $mathcal{O}(k)$ (it is just the same loop you wrote, but probably wirtten in C) and it is used once per city, so my code should also be $mathcal{O}(nk)$.






          share|improve this answer























          • Thank you Graipher, That is a lot to take in for a beginner, but I will work on dissecting some of it to improve my coding. I briefly checked out that link that you provided and already see some issues with what I am doing (as you pointed out, iterating over each name AND each index of each name) so thanks for that. You've given me a lot to look into. On complexity: using the link that you provided, I can see a way to write this same code using only one for loop. Does this imply that it is less complex than your code?
            – JacobCheverie
            6 hours ago








          • 1




            @JacobCheverie: Yes, there is still some road left ahead of you, but don't despair. And no, just because there is only one visible for loop does not make it linear in time (my solution has no visible for loop and yet it is not). I cannot think of a way that does not have to look at each character of each city name, but maybe you are more clever than I.
            – Graipher
            6 hours ago













          up vote
          8
          down vote



          accepted







          up vote
          8
          down vote



          accepted






          This usecase is actually covered by one of the itertools recipes. itertools is a package in the Python standard library that supplies fast and efficient tools for iterating over things or creating certain iterable things (like the combination of all pairs and such). It is an often used library and well worth it to get acquainted with.



          The recipe is as follows:




          from itertools import filterfalse, tee

          def partition(pred, iterable):
          'Use a predicate to partition entries into false entries and true entries'
          # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
          t1, t2 = tee(iterable)
          return filterfalse(pred, t1), filter(pred, t2)



          In your specific case you would use it like this:



          if __name__ == "__main__":    
          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]
          no_a_city, a_city = map(list, partition(lambda city: "a" in city, cities))
          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The map(list, ...) part is needed because what the partition function returns are generators that generate values on the fly. They can be consumed into a list.



          The predicate used is a lambda function, an anonymous function which in this case returns truthy or falsy values. It is used to test each element of the iterable.



          Instead of manually iterating over each name (even worse, over each index of each name, have a look at Loop Like A Native), I used the fact that strings support the in operator.



          I also added a if __name__ == "__main__": guard to allow importing from this script from another script.





          One thing you could have used in your code is the fact that for loops have an optional else clause which is run if no break statement interrupted the loop:



          a_city, no_a_city = ,
          for city in cities:
          for char in city:
          if char == "a":
          a_city.append(city)
          break
          else:
          no_a_city.append(city)




          As for complexity, this has the same complexity as your code. You have two nested for loops, making this on average $mathcal{O}(nk)$ with $n$ being the number of cities and $k$ being the average length of the city names.



          The in operator for strings is $mathcal{O}(k)$ (it is just the same loop you wrote, but probably wirtten in C) and it is used once per city, so my code should also be $mathcal{O}(nk)$.






          share|improve this answer














          This usecase is actually covered by one of the itertools recipes. itertools is a package in the Python standard library that supplies fast and efficient tools for iterating over things or creating certain iterable things (like the combination of all pairs and such). It is an often used library and well worth it to get acquainted with.



          The recipe is as follows:




          from itertools import filterfalse, tee

          def partition(pred, iterable):
          'Use a predicate to partition entries into false entries and true entries'
          # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
          t1, t2 = tee(iterable)
          return filterfalse(pred, t1), filter(pred, t2)



          In your specific case you would use it like this:



          if __name__ == "__main__":    
          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]
          no_a_city, a_city = map(list, partition(lambda city: "a" in city, cities))
          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The map(list, ...) part is needed because what the partition function returns are generators that generate values on the fly. They can be consumed into a list.



          The predicate used is a lambda function, an anonymous function which in this case returns truthy or falsy values. It is used to test each element of the iterable.



          Instead of manually iterating over each name (even worse, over each index of each name, have a look at Loop Like A Native), I used the fact that strings support the in operator.



          I also added a if __name__ == "__main__": guard to allow importing from this script from another script.





          One thing you could have used in your code is the fact that for loops have an optional else clause which is run if no break statement interrupted the loop:



          a_city, no_a_city = ,
          for city in cities:
          for char in city:
          if char == "a":
          a_city.append(city)
          break
          else:
          no_a_city.append(city)




          As for complexity, this has the same complexity as your code. You have two nested for loops, making this on average $mathcal{O}(nk)$ with $n$ being the number of cities and $k$ being the average length of the city names.



          The in operator for strings is $mathcal{O}(k)$ (it is just the same loop you wrote, but probably wirtten in C) and it is used once per city, so my code should also be $mathcal{O}(nk)$.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 6 hours ago

























          answered 7 hours ago









          Graipher

          22.2k53183




          22.2k53183












          • Thank you Graipher, That is a lot to take in for a beginner, but I will work on dissecting some of it to improve my coding. I briefly checked out that link that you provided and already see some issues with what I am doing (as you pointed out, iterating over each name AND each index of each name) so thanks for that. You've given me a lot to look into. On complexity: using the link that you provided, I can see a way to write this same code using only one for loop. Does this imply that it is less complex than your code?
            – JacobCheverie
            6 hours ago








          • 1




            @JacobCheverie: Yes, there is still some road left ahead of you, but don't despair. And no, just because there is only one visible for loop does not make it linear in time (my solution has no visible for loop and yet it is not). I cannot think of a way that does not have to look at each character of each city name, but maybe you are more clever than I.
            – Graipher
            6 hours ago


















          • Thank you Graipher, That is a lot to take in for a beginner, but I will work on dissecting some of it to improve my coding. I briefly checked out that link that you provided and already see some issues with what I am doing (as you pointed out, iterating over each name AND each index of each name) so thanks for that. You've given me a lot to look into. On complexity: using the link that you provided, I can see a way to write this same code using only one for loop. Does this imply that it is less complex than your code?
            – JacobCheverie
            6 hours ago








          • 1




            @JacobCheverie: Yes, there is still some road left ahead of you, but don't despair. And no, just because there is only one visible for loop does not make it linear in time (my solution has no visible for loop and yet it is not). I cannot think of a way that does not have to look at each character of each city name, but maybe you are more clever than I.
            – Graipher
            6 hours ago
















          Thank you Graipher, That is a lot to take in for a beginner, but I will work on dissecting some of it to improve my coding. I briefly checked out that link that you provided and already see some issues with what I am doing (as you pointed out, iterating over each name AND each index of each name) so thanks for that. You've given me a lot to look into. On complexity: using the link that you provided, I can see a way to write this same code using only one for loop. Does this imply that it is less complex than your code?
          – JacobCheverie
          6 hours ago






          Thank you Graipher, That is a lot to take in for a beginner, but I will work on dissecting some of it to improve my coding. I briefly checked out that link that you provided and already see some issues with what I am doing (as you pointed out, iterating over each name AND each index of each name) so thanks for that. You've given me a lot to look into. On complexity: using the link that you provided, I can see a way to write this same code using only one for loop. Does this imply that it is less complex than your code?
          – JacobCheverie
          6 hours ago






          1




          1




          @JacobCheverie: Yes, there is still some road left ahead of you, but don't despair. And no, just because there is only one visible for loop does not make it linear in time (my solution has no visible for loop and yet it is not). I cannot think of a way that does not have to look at each character of each city name, but maybe you are more clever than I.
          – Graipher
          6 hours ago




          @JacobCheverie: Yes, there is still some road left ahead of you, but don't despair. And no, just because there is only one visible for loop does not make it linear in time (my solution has no visible for loop and yet it is not). I cannot think of a way that does not have to look at each character of each city name, but maybe you are more clever than I.
          – Graipher
          6 hours ago












          up vote
          3
          down vote













          This is not a complete review, but rather two simpler alternatives that illustrate how you can cut the some of the inner control flow using the in keyword. This keyword tests for membership of an object in an iterable. In this case, you want to test for membership of 'a' in the name of each element of cities.



          The first takes the same approach as your code



          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

          a_city, no_a_city = ,
          for city in cities:
          if 'a' in city:
          a_city.append(city)
          else:
          no_a_city.append(city)

          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The membership test using in is a drop-in replacement for the harder-to-read and more error-prone explicit loop over the characters.



          An even cleaner solution makes use of the built-in set data type. In simplistic terms, a set is like a list, except that it is not ordered and does not contain duplicates.



          # a set is constructed with {}, unlike the the  used for lists
          cities = {"New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"}

          # we can also construct one using the `set` builtin function (analogous to `list`)
          a_city = set(city for city in cities if 'a' in city)

          # subtracting one set from another is well-defined and does the obvious thing
          no_a_city = cities - a_city


          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          Check out the docs for a flavor of the kinds of rich membership and comparison that sets allow. These operations can typically be expected to be more efficient than equivalent algorithms on lists, mainly due to the fact that sets are guaranteed not to have duplicate elements.






          share|improve this answer





















          • Thanks @Endulum, I like the set approach. I can see where I would have to be careful using this, however. If I was not guaranteed to have distinct members then I assume that the set approach would discard some information and thus not be appropriate. When you define a_city, do I read this as I would read a mathematical set (set of all 'city' such that for 'city' in 'cities' if 'a' in 'city')? Without reading as so, it appears to me as a typo.
            – JacobCheverie
            1 hour ago










          • Regarding the appropriateness of sets versus dictionaries: yes, if is desirable that you retain duplicates, then a set is likely not the right collection to choose.
            – Endulum
            1 hour ago










          • Regarding the relationship to mathematical set notation, it should be read like (the set of all 'city' in 'cities' such that 'a' in 'city'). It is an alternative syntax for consisely building lists, sets, or other collections without an explicit loop. It is called a "comprehension", in this case a "set comprehension". If you have not learned about these yet, I expect you will before long. They are very popular among Python programmers.
            – Endulum
            1 hour ago










          • I had touched on list comprehensions briefly last night while reading but I didn't recognize the form. I only asked about the relationship to mathematics because city is written twice at the start of the set definition which seems a bit hard to remember, but I think that this is just Python syntax I must get used to. Thanks again!
            – JacobCheverie
            36 mins ago















          up vote
          3
          down vote













          This is not a complete review, but rather two simpler alternatives that illustrate how you can cut the some of the inner control flow using the in keyword. This keyword tests for membership of an object in an iterable. In this case, you want to test for membership of 'a' in the name of each element of cities.



          The first takes the same approach as your code



          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

          a_city, no_a_city = ,
          for city in cities:
          if 'a' in city:
          a_city.append(city)
          else:
          no_a_city.append(city)

          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The membership test using in is a drop-in replacement for the harder-to-read and more error-prone explicit loop over the characters.



          An even cleaner solution makes use of the built-in set data type. In simplistic terms, a set is like a list, except that it is not ordered and does not contain duplicates.



          # a set is constructed with {}, unlike the the  used for lists
          cities = {"New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"}

          # we can also construct one using the `set` builtin function (analogous to `list`)
          a_city = set(city for city in cities if 'a' in city)

          # subtracting one set from another is well-defined and does the obvious thing
          no_a_city = cities - a_city


          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          Check out the docs for a flavor of the kinds of rich membership and comparison that sets allow. These operations can typically be expected to be more efficient than equivalent algorithms on lists, mainly due to the fact that sets are guaranteed not to have duplicate elements.






          share|improve this answer





















          • Thanks @Endulum, I like the set approach. I can see where I would have to be careful using this, however. If I was not guaranteed to have distinct members then I assume that the set approach would discard some information and thus not be appropriate. When you define a_city, do I read this as I would read a mathematical set (set of all 'city' such that for 'city' in 'cities' if 'a' in 'city')? Without reading as so, it appears to me as a typo.
            – JacobCheverie
            1 hour ago










          • Regarding the appropriateness of sets versus dictionaries: yes, if is desirable that you retain duplicates, then a set is likely not the right collection to choose.
            – Endulum
            1 hour ago










          • Regarding the relationship to mathematical set notation, it should be read like (the set of all 'city' in 'cities' such that 'a' in 'city'). It is an alternative syntax for consisely building lists, sets, or other collections without an explicit loop. It is called a "comprehension", in this case a "set comprehension". If you have not learned about these yet, I expect you will before long. They are very popular among Python programmers.
            – Endulum
            1 hour ago










          • I had touched on list comprehensions briefly last night while reading but I didn't recognize the form. I only asked about the relationship to mathematics because city is written twice at the start of the set definition which seems a bit hard to remember, but I think that this is just Python syntax I must get used to. Thanks again!
            – JacobCheverie
            36 mins ago













          up vote
          3
          down vote










          up vote
          3
          down vote









          This is not a complete review, but rather two simpler alternatives that illustrate how you can cut the some of the inner control flow using the in keyword. This keyword tests for membership of an object in an iterable. In this case, you want to test for membership of 'a' in the name of each element of cities.



          The first takes the same approach as your code



          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

          a_city, no_a_city = ,
          for city in cities:
          if 'a' in city:
          a_city.append(city)
          else:
          no_a_city.append(city)

          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The membership test using in is a drop-in replacement for the harder-to-read and more error-prone explicit loop over the characters.



          An even cleaner solution makes use of the built-in set data type. In simplistic terms, a set is like a list, except that it is not ordered and does not contain duplicates.



          # a set is constructed with {}, unlike the the  used for lists
          cities = {"New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"}

          # we can also construct one using the `set` builtin function (analogous to `list`)
          a_city = set(city for city in cities if 'a' in city)

          # subtracting one set from another is well-defined and does the obvious thing
          no_a_city = cities - a_city


          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          Check out the docs for a flavor of the kinds of rich membership and comparison that sets allow. These operations can typically be expected to be more efficient than equivalent algorithms on lists, mainly due to the fact that sets are guaranteed not to have duplicate elements.






          share|improve this answer












          This is not a complete review, but rather two simpler alternatives that illustrate how you can cut the some of the inner control flow using the in keyword. This keyword tests for membership of an object in an iterable. In this case, you want to test for membership of 'a' in the name of each element of cities.



          The first takes the same approach as your code



          cities = ["New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"]

          a_city, no_a_city = ,
          for city in cities:
          if 'a' in city:
          a_city.append(city)
          else:
          no_a_city.append(city)

          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          The membership test using in is a drop-in replacement for the harder-to-read and more error-prone explicit loop over the characters.



          An even cleaner solution makes use of the built-in set data type. In simplistic terms, a set is like a list, except that it is not ordered and does not contain duplicates.



          # a set is constructed with {}, unlike the the  used for lists
          cities = {"New York", "Shanghai", "Munich", "Tokyo", "Dubai", "Mexico City", "São Paulo", "Hyderabad"}

          # we can also construct one using the `set` builtin function (analogous to `list`)
          a_city = set(city for city in cities if 'a' in city)

          # subtracting one set from another is well-defined and does the obvious thing
          no_a_city = cities - a_city


          print("a_city:", a_city)
          print("no_a_city:", no_a_city)


          Check out the docs for a flavor of the kinds of rich membership and comparison that sets allow. These operations can typically be expected to be more efficient than equivalent algorithms on lists, mainly due to the fact that sets are guaranteed not to have duplicate elements.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered 3 hours ago









          Endulum

          22817




          22817












          • Thanks @Endulum, I like the set approach. I can see where I would have to be careful using this, however. If I was not guaranteed to have distinct members then I assume that the set approach would discard some information and thus not be appropriate. When you define a_city, do I read this as I would read a mathematical set (set of all 'city' such that for 'city' in 'cities' if 'a' in 'city')? Without reading as so, it appears to me as a typo.
            – JacobCheverie
            1 hour ago










          • Regarding the appropriateness of sets versus dictionaries: yes, if is desirable that you retain duplicates, then a set is likely not the right collection to choose.
            – Endulum
            1 hour ago










          • Regarding the relationship to mathematical set notation, it should be read like (the set of all 'city' in 'cities' such that 'a' in 'city'). It is an alternative syntax for consisely building lists, sets, or other collections without an explicit loop. It is called a "comprehension", in this case a "set comprehension". If you have not learned about these yet, I expect you will before long. They are very popular among Python programmers.
            – Endulum
            1 hour ago










          • I had touched on list comprehensions briefly last night while reading but I didn't recognize the form. I only asked about the relationship to mathematics because city is written twice at the start of the set definition which seems a bit hard to remember, but I think that this is just Python syntax I must get used to. Thanks again!
            – JacobCheverie
            36 mins ago


















          • Thanks @Endulum, I like the set approach. I can see where I would have to be careful using this, however. If I was not guaranteed to have distinct members then I assume that the set approach would discard some information and thus not be appropriate. When you define a_city, do I read this as I would read a mathematical set (set of all 'city' such that for 'city' in 'cities' if 'a' in 'city')? Without reading as so, it appears to me as a typo.
            – JacobCheverie
            1 hour ago










          • Regarding the appropriateness of sets versus dictionaries: yes, if is desirable that you retain duplicates, then a set is likely not the right collection to choose.
            – Endulum
            1 hour ago










          • Regarding the relationship to mathematical set notation, it should be read like (the set of all 'city' in 'cities' such that 'a' in 'city'). It is an alternative syntax for consisely building lists, sets, or other collections without an explicit loop. It is called a "comprehension", in this case a "set comprehension". If you have not learned about these yet, I expect you will before long. They are very popular among Python programmers.
            – Endulum
            1 hour ago










          • I had touched on list comprehensions briefly last night while reading but I didn't recognize the form. I only asked about the relationship to mathematics because city is written twice at the start of the set definition which seems a bit hard to remember, but I think that this is just Python syntax I must get used to. Thanks again!
            – JacobCheverie
            36 mins ago
















          Thanks @Endulum, I like the set approach. I can see where I would have to be careful using this, however. If I was not guaranteed to have distinct members then I assume that the set approach would discard some information and thus not be appropriate. When you define a_city, do I read this as I would read a mathematical set (set of all 'city' such that for 'city' in 'cities' if 'a' in 'city')? Without reading as so, it appears to me as a typo.
          – JacobCheverie
          1 hour ago




          Thanks @Endulum, I like the set approach. I can see where I would have to be careful using this, however. If I was not guaranteed to have distinct members then I assume that the set approach would discard some information and thus not be appropriate. When you define a_city, do I read this as I would read a mathematical set (set of all 'city' such that for 'city' in 'cities' if 'a' in 'city')? Without reading as so, it appears to me as a typo.
          – JacobCheverie
          1 hour ago












          Regarding the appropriateness of sets versus dictionaries: yes, if is desirable that you retain duplicates, then a set is likely not the right collection to choose.
          – Endulum
          1 hour ago




          Regarding the appropriateness of sets versus dictionaries: yes, if is desirable that you retain duplicates, then a set is likely not the right collection to choose.
          – Endulum
          1 hour ago












          Regarding the relationship to mathematical set notation, it should be read like (the set of all 'city' in 'cities' such that 'a' in 'city'). It is an alternative syntax for consisely building lists, sets, or other collections without an explicit loop. It is called a "comprehension", in this case a "set comprehension". If you have not learned about these yet, I expect you will before long. They are very popular among Python programmers.
          – Endulum
          1 hour ago




          Regarding the relationship to mathematical set notation, it should be read like (the set of all 'city' in 'cities' such that 'a' in 'city'). It is an alternative syntax for consisely building lists, sets, or other collections without an explicit loop. It is called a "comprehension", in this case a "set comprehension". If you have not learned about these yet, I expect you will before long. They are very popular among Python programmers.
          – Endulum
          1 hour ago












          I had touched on list comprehensions briefly last night while reading but I didn't recognize the form. I only asked about the relationship to mathematics because city is written twice at the start of the set definition which seems a bit hard to remember, but I think that this is just Python syntax I must get used to. Thanks again!
          – JacobCheverie
          36 mins ago




          I had touched on list comprehensions briefly last night while reading but I didn't recognize the form. I only asked about the relationship to mathematics because city is written twice at the start of the set definition which seems a bit hard to remember, but I think that this is just Python syntax I must get used to. Thanks again!
          – JacobCheverie
          36 mins ago










          up vote
          3
          down vote













          The task here is to collate the list of cities according to a key. In this case the key can be 'a' in city, which is True if the city contains 'a' and False otherwise.



          It's common to encounter processing tasks which require collation, for example to find words that are anagrams of each other we could collate the words in a dictionary according to their sorted letters.



          There is a standard pattern for collation in Python, which is to use collections.defaultdict and a loop. In the cities case, it goes like this:



          from collections import defaultdict
          with_a = defaultdict(list)
          for city in cities:
          with_a['a' in city].append(city)


          After running this loop, with_a[True] is the list of cities with 'a' and with_a[False] is the list of cities without.



          I prefer this approach to itertools.partition because it iterates over the input just once (whereas partition iterates over the input twice), and it's clear how to generalize it to other kinds of key.






          share|improve this answer





















          • This is seemingly high-powered. I will look into defaultdict some more to gain a better understanding of the code. When you mention your preference of this over partition, does that imply that this would be a more efficient/less complex approach?
            – JacobCheverie
            1 hour ago















          up vote
          3
          down vote













          The task here is to collate the list of cities according to a key. In this case the key can be 'a' in city, which is True if the city contains 'a' and False otherwise.



          It's common to encounter processing tasks which require collation, for example to find words that are anagrams of each other we could collate the words in a dictionary according to their sorted letters.



          There is a standard pattern for collation in Python, which is to use collections.defaultdict and a loop. In the cities case, it goes like this:



          from collections import defaultdict
          with_a = defaultdict(list)
          for city in cities:
          with_a['a' in city].append(city)


          After running this loop, with_a[True] is the list of cities with 'a' and with_a[False] is the list of cities without.



          I prefer this approach to itertools.partition because it iterates over the input just once (whereas partition iterates over the input twice), and it's clear how to generalize it to other kinds of key.






          share|improve this answer





















          • This is seemingly high-powered. I will look into defaultdict some more to gain a better understanding of the code. When you mention your preference of this over partition, does that imply that this would be a more efficient/less complex approach?
            – JacobCheverie
            1 hour ago













          up vote
          3
          down vote










          up vote
          3
          down vote









          The task here is to collate the list of cities according to a key. In this case the key can be 'a' in city, which is True if the city contains 'a' and False otherwise.



          It's common to encounter processing tasks which require collation, for example to find words that are anagrams of each other we could collate the words in a dictionary according to their sorted letters.



          There is a standard pattern for collation in Python, which is to use collections.defaultdict and a loop. In the cities case, it goes like this:



          from collections import defaultdict
          with_a = defaultdict(list)
          for city in cities:
          with_a['a' in city].append(city)


          After running this loop, with_a[True] is the list of cities with 'a' and with_a[False] is the list of cities without.



          I prefer this approach to itertools.partition because it iterates over the input just once (whereas partition iterates over the input twice), and it's clear how to generalize it to other kinds of key.






          share|improve this answer












          The task here is to collate the list of cities according to a key. In this case the key can be 'a' in city, which is True if the city contains 'a' and False otherwise.



          It's common to encounter processing tasks which require collation, for example to find words that are anagrams of each other we could collate the words in a dictionary according to their sorted letters.



          There is a standard pattern for collation in Python, which is to use collections.defaultdict and a loop. In the cities case, it goes like this:



          from collections import defaultdict
          with_a = defaultdict(list)
          for city in cities:
          with_a['a' in city].append(city)


          After running this loop, with_a[True] is the list of cities with 'a' and with_a[False] is the list of cities without.



          I prefer this approach to itertools.partition because it iterates over the input just once (whereas partition iterates over the input twice), and it's clear how to generalize it to other kinds of key.







          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered 2 hours ago









          Gareth Rees

          44.4k3100179




          44.4k3100179












          • This is seemingly high-powered. I will look into defaultdict some more to gain a better understanding of the code. When you mention your preference of this over partition, does that imply that this would be a more efficient/less complex approach?
            – JacobCheverie
            1 hour ago


















          • This is seemingly high-powered. I will look into defaultdict some more to gain a better understanding of the code. When you mention your preference of this over partition, does that imply that this would be a more efficient/less complex approach?
            – JacobCheverie
            1 hour ago
















          This is seemingly high-powered. I will look into defaultdict some more to gain a better understanding of the code. When you mention your preference of this over partition, does that imply that this would be a more efficient/less complex approach?
          – JacobCheverie
          1 hour ago




          This is seemingly high-powered. I will look into defaultdict some more to gain a better understanding of the code. When you mention your preference of this over partition, does that imply that this would be a more efficient/less complex approach?
          – JacobCheverie
          1 hour ago










          JacobCheverie is a new contributor. Be nice, and check out our Code of Conduct.










           

          draft saved


          draft discarded


















          JacobCheverie is a new contributor. Be nice, and check out our Code of Conduct.













          JacobCheverie is a new contributor. Be nice, and check out our Code of Conduct.












          JacobCheverie is a new contributor. Be nice, and check out our Code of Conduct.















           


          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f208130%2fdivide-a-set-of-strings-into-those-which-contain-a-and-those-which-dont%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          What visual should I use to simply compare current year value vs last year in Power BI desktop

          How to ignore python UserWarning in pytest?

          Alexandru Averescu