Delete successive rows under condition in R












1














I have a data frame



dt <- read.table(text = "
350 16
352 0
354 0
359 0
366 11
376 38
380 28
386 0
389 0
397 55
398 45
399 0
400 19
402 30")


When successive rows contain zero in the second column, I want to keep only the zero row which precedes the non zero row in the second column.



The result must be:



dt <- read.table(text = "
350 16
359 0
366 11
376 38
380 28
389 0
397 55
398 45
399 0
400 19
402 30")









share|improve this question



























    1














    I have a data frame



    dt <- read.table(text = "
    350 16
    352 0
    354 0
    359 0
    366 11
    376 38
    380 28
    386 0
    389 0
    397 55
    398 45
    399 0
    400 19
    402 30")


    When successive rows contain zero in the second column, I want to keep only the zero row which precedes the non zero row in the second column.



    The result must be:



    dt <- read.table(text = "
    350 16
    359 0
    366 11
    376 38
    380 28
    389 0
    397 55
    398 45
    399 0
    400 19
    402 30")









    share|improve this question

























      1












      1








      1







      I have a data frame



      dt <- read.table(text = "
      350 16
      352 0
      354 0
      359 0
      366 11
      376 38
      380 28
      386 0
      389 0
      397 55
      398 45
      399 0
      400 19
      402 30")


      When successive rows contain zero in the second column, I want to keep only the zero row which precedes the non zero row in the second column.



      The result must be:



      dt <- read.table(text = "
      350 16
      359 0
      366 11
      376 38
      380 28
      389 0
      397 55
      398 45
      399 0
      400 19
      402 30")









      share|improve this question













      I have a data frame



      dt <- read.table(text = "
      350 16
      352 0
      354 0
      359 0
      366 11
      376 38
      380 28
      386 0
      389 0
      397 55
      398 45
      399 0
      400 19
      402 30")


      When successive rows contain zero in the second column, I want to keep only the zero row which precedes the non zero row in the second column.



      The result must be:



      dt <- read.table(text = "
      350 16
      359 0
      366 11
      376 38
      380 28
      389 0
      397 55
      398 45
      399 0
      400 19
      402 30")






      r delete-row






      share|improve this question













      share|improve this question











      share|improve this question




      share|improve this question










      asked Nov 22 at 18:14









      Noura

      544




      544
























          3 Answers
          3






          active

          oldest

          votes


















          1














          Simple one line solution:



          dplyr::filter(dt, !(V2==0 & lead(V2)==0))

          V1 V2
          1 350 16
          2 359 0
          3 366 11
          4 376 38
          5 380 28
          6 389 0
          7 397 55
          8 398 45
          9 399 0
          10 400 19
          11 402 30





          share|improve this answer































            1














            Here is an option where we create a grouping variable with rleid based on the zero values and filter with the conditions mentioned in the OP's post



            library(tidyverse)
            library(data.table)
            dt %>%
            group_by(grp = rleid(V2 == 0)) %>%
            filter(all(V2== 0) & row_number()==n() | V2 != 0) %>%
            ungroup %>%
            select(-grp)
            # A tibble: 11 x 2
            # V1 V2
            # <int> <int>
            # 1 350 16
            # 2 359 0
            # 3 366 11
            # 4 376 38
            # 5 380 28
            # 6 389 0
            # 7 397 55
            # 8 398 45
            # 9 399 0
            #10 400 19
            #11 402 30




            Or using data.table, the same logic can be applied



            setDT(dt)[dt[, .I[(V2 == 0 & seq_len(.N) == .N) | V2 != 0], rleid(V2 == 0)]$V1]
            # V1 V2
            # 1: 350 16
            # 2: 359 0
            # 3: 366 11
            # 4: 376 38
            # 5: 380 28
            # 6: 389 0
            # 7: 397 55
            # 8: 398 45
            # 9: 399 0
            #10: 400 19
            #11: 402 30


            Or as @jogo mentioned in the comments, to create a grouping column with rleid and then subset the first row (that have only 0 values in 'V2') based on a if/else condition



            setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i] 


            NOTE: These are flexible solutions which can be generalized






            share|improve this answer



















            • 1




              next try: setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i]
              – jogo
              Nov 22 at 19:10



















            0














            Here is the data.table equivalent to the solution from @iod :



            library("data.table")

            dt <- fread(
            "350 16
            352 0
            354 0
            359 0
            366 11
            376 38
            380 38
            386 0
            389 0
            397 55
            398 45
            399 0
            400 19
            402 30")

            dt[V2!=0 | shift(V2, type="lead")!=0]





            share|improve this answer





















              Your Answer






              StackExchange.ifUsing("editor", function () {
              StackExchange.using("externalEditor", function () {
              StackExchange.using("snippets", function () {
              StackExchange.snippets.init();
              });
              });
              }, "code-snippets");

              StackExchange.ready(function() {
              var channelOptions = {
              tags: "".split(" "),
              id: "1"
              };
              initTagRenderer("".split(" "), "".split(" "), channelOptions);

              StackExchange.using("externalEditor", function() {
              // Have to fire editor after snippets, if snippets enabled
              if (StackExchange.settings.snippets.snippetsEnabled) {
              StackExchange.using("snippets", function() {
              createEditor();
              });
              }
              else {
              createEditor();
              }
              });

              function createEditor() {
              StackExchange.prepareEditor({
              heartbeatType: 'answer',
              autoActivateHeartbeat: false,
              convertImagesToLinks: true,
              noModals: true,
              showLowRepImageUploadWarning: true,
              reputationToPostImages: 10,
              bindNavPrevention: true,
              postfix: "",
              imageUploader: {
              brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
              contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
              allowUrls: true
              },
              onDemand: true,
              discardSelector: ".discard-answer"
              ,immediatelyShowMarkdownHelp:true
              });


              }
              });














              draft saved

              draft discarded


















              StackExchange.ready(
              function () {
              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53436337%2fdelete-successive-rows-under-condition-in-r%23new-answer', 'question_page');
              }
              );

              Post as a guest















              Required, but never shown

























              3 Answers
              3






              active

              oldest

              votes








              3 Answers
              3






              active

              oldest

              votes









              active

              oldest

              votes






              active

              oldest

              votes









              1














              Simple one line solution:



              dplyr::filter(dt, !(V2==0 & lead(V2)==0))

              V1 V2
              1 350 16
              2 359 0
              3 366 11
              4 376 38
              5 380 28
              6 389 0
              7 397 55
              8 398 45
              9 399 0
              10 400 19
              11 402 30





              share|improve this answer




























                1














                Simple one line solution:



                dplyr::filter(dt, !(V2==0 & lead(V2)==0))

                V1 V2
                1 350 16
                2 359 0
                3 366 11
                4 376 38
                5 380 28
                6 389 0
                7 397 55
                8 398 45
                9 399 0
                10 400 19
                11 402 30





                share|improve this answer


























                  1












                  1








                  1






                  Simple one line solution:



                  dplyr::filter(dt, !(V2==0 & lead(V2)==0))

                  V1 V2
                  1 350 16
                  2 359 0
                  3 366 11
                  4 376 38
                  5 380 28
                  6 389 0
                  7 397 55
                  8 398 45
                  9 399 0
                  10 400 19
                  11 402 30





                  share|improve this answer














                  Simple one line solution:



                  dplyr::filter(dt, !(V2==0 & lead(V2)==0))

                  V1 V2
                  1 350 16
                  2 359 0
                  3 366 11
                  4 376 38
                  5 380 28
                  6 389 0
                  7 397 55
                  8 398 45
                  9 399 0
                  10 400 19
                  11 402 30






                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Nov 22 at 18:42

























                  answered Nov 22 at 18:19









                  iod

                  3,4892721




                  3,4892721

























                      1














                      Here is an option where we create a grouping variable with rleid based on the zero values and filter with the conditions mentioned in the OP's post



                      library(tidyverse)
                      library(data.table)
                      dt %>%
                      group_by(grp = rleid(V2 == 0)) %>%
                      filter(all(V2== 0) & row_number()==n() | V2 != 0) %>%
                      ungroup %>%
                      select(-grp)
                      # A tibble: 11 x 2
                      # V1 V2
                      # <int> <int>
                      # 1 350 16
                      # 2 359 0
                      # 3 366 11
                      # 4 376 38
                      # 5 380 28
                      # 6 389 0
                      # 7 397 55
                      # 8 398 45
                      # 9 399 0
                      #10 400 19
                      #11 402 30




                      Or using data.table, the same logic can be applied



                      setDT(dt)[dt[, .I[(V2 == 0 & seq_len(.N) == .N) | V2 != 0], rleid(V2 == 0)]$V1]
                      # V1 V2
                      # 1: 350 16
                      # 2: 359 0
                      # 3: 366 11
                      # 4: 376 38
                      # 5: 380 28
                      # 6: 389 0
                      # 7: 397 55
                      # 8: 398 45
                      # 9: 399 0
                      #10: 400 19
                      #11: 402 30


                      Or as @jogo mentioned in the comments, to create a grouping column with rleid and then subset the first row (that have only 0 values in 'V2') based on a if/else condition



                      setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i] 


                      NOTE: These are flexible solutions which can be generalized






                      share|improve this answer



















                      • 1




                        next try: setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i]
                        – jogo
                        Nov 22 at 19:10
















                      1














                      Here is an option where we create a grouping variable with rleid based on the zero values and filter with the conditions mentioned in the OP's post



                      library(tidyverse)
                      library(data.table)
                      dt %>%
                      group_by(grp = rleid(V2 == 0)) %>%
                      filter(all(V2== 0) & row_number()==n() | V2 != 0) %>%
                      ungroup %>%
                      select(-grp)
                      # A tibble: 11 x 2
                      # V1 V2
                      # <int> <int>
                      # 1 350 16
                      # 2 359 0
                      # 3 366 11
                      # 4 376 38
                      # 5 380 28
                      # 6 389 0
                      # 7 397 55
                      # 8 398 45
                      # 9 399 0
                      #10 400 19
                      #11 402 30




                      Or using data.table, the same logic can be applied



                      setDT(dt)[dt[, .I[(V2 == 0 & seq_len(.N) == .N) | V2 != 0], rleid(V2 == 0)]$V1]
                      # V1 V2
                      # 1: 350 16
                      # 2: 359 0
                      # 3: 366 11
                      # 4: 376 38
                      # 5: 380 28
                      # 6: 389 0
                      # 7: 397 55
                      # 8: 398 45
                      # 9: 399 0
                      #10: 400 19
                      #11: 402 30


                      Or as @jogo mentioned in the comments, to create a grouping column with rleid and then subset the first row (that have only 0 values in 'V2') based on a if/else condition



                      setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i] 


                      NOTE: These are flexible solutions which can be generalized






                      share|improve this answer



















                      • 1




                        next try: setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i]
                        – jogo
                        Nov 22 at 19:10














                      1












                      1








                      1






                      Here is an option where we create a grouping variable with rleid based on the zero values and filter with the conditions mentioned in the OP's post



                      library(tidyverse)
                      library(data.table)
                      dt %>%
                      group_by(grp = rleid(V2 == 0)) %>%
                      filter(all(V2== 0) & row_number()==n() | V2 != 0) %>%
                      ungroup %>%
                      select(-grp)
                      # A tibble: 11 x 2
                      # V1 V2
                      # <int> <int>
                      # 1 350 16
                      # 2 359 0
                      # 3 366 11
                      # 4 376 38
                      # 5 380 28
                      # 6 389 0
                      # 7 397 55
                      # 8 398 45
                      # 9 399 0
                      #10 400 19
                      #11 402 30




                      Or using data.table, the same logic can be applied



                      setDT(dt)[dt[, .I[(V2 == 0 & seq_len(.N) == .N) | V2 != 0], rleid(V2 == 0)]$V1]
                      # V1 V2
                      # 1: 350 16
                      # 2: 359 0
                      # 3: 366 11
                      # 4: 376 38
                      # 5: 380 28
                      # 6: 389 0
                      # 7: 397 55
                      # 8: 398 45
                      # 9: 399 0
                      #10: 400 19
                      #11: 402 30


                      Or as @jogo mentioned in the comments, to create a grouping column with rleid and then subset the first row (that have only 0 values in 'V2') based on a if/else condition



                      setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i] 


                      NOTE: These are flexible solutions which can be generalized






                      share|improve this answer














                      Here is an option where we create a grouping variable with rleid based on the zero values and filter with the conditions mentioned in the OP's post



                      library(tidyverse)
                      library(data.table)
                      dt %>%
                      group_by(grp = rleid(V2 == 0)) %>%
                      filter(all(V2== 0) & row_number()==n() | V2 != 0) %>%
                      ungroup %>%
                      select(-grp)
                      # A tibble: 11 x 2
                      # V1 V2
                      # <int> <int>
                      # 1 350 16
                      # 2 359 0
                      # 3 366 11
                      # 4 376 38
                      # 5 380 28
                      # 6 389 0
                      # 7 397 55
                      # 8 398 45
                      # 9 399 0
                      #10 400 19
                      #11 402 30




                      Or using data.table, the same logic can be applied



                      setDT(dt)[dt[, .I[(V2 == 0 & seq_len(.N) == .N) | V2 != 0], rleid(V2 == 0)]$V1]
                      # V1 V2
                      # 1: 350 16
                      # 2: 359 0
                      # 3: 366 11
                      # 4: 376 38
                      # 5: 380 28
                      # 6: 389 0
                      # 7: 397 55
                      # 8: 398 45
                      # 9: 399 0
                      #10: 400 19
                      #11: 402 30


                      Or as @jogo mentioned in the comments, to create a grouping column with rleid and then subset the first row (that have only 0 values in 'V2') based on a if/else condition



                      setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i] 


                      NOTE: These are flexible solutions which can be generalized







                      share|improve this answer














                      share|improve this answer



                      share|improve this answer








                      edited Nov 22 at 19:25

























                      answered Nov 22 at 18:17









                      akrun

                      396k13187260




                      396k13187260








                      • 1




                        next try: setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i]
                        – jogo
                        Nov 22 at 19:10














                      • 1




                        next try: setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i]
                        – jogo
                        Nov 22 at 19:10








                      1




                      1




                      next try: setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i]
                      – jogo
                      Nov 22 at 19:10




                      next try: setDT(dt)[, i:=rleid(V2)][, if (any(V2!=0)) .SD else .SD[.N], i]
                      – jogo
                      Nov 22 at 19:10











                      0














                      Here is the data.table equivalent to the solution from @iod :



                      library("data.table")

                      dt <- fread(
                      "350 16
                      352 0
                      354 0
                      359 0
                      366 11
                      376 38
                      380 38
                      386 0
                      389 0
                      397 55
                      398 45
                      399 0
                      400 19
                      402 30")

                      dt[V2!=0 | shift(V2, type="lead")!=0]





                      share|improve this answer


























                        0














                        Here is the data.table equivalent to the solution from @iod :



                        library("data.table")

                        dt <- fread(
                        "350 16
                        352 0
                        354 0
                        359 0
                        366 11
                        376 38
                        380 38
                        386 0
                        389 0
                        397 55
                        398 45
                        399 0
                        400 19
                        402 30")

                        dt[V2!=0 | shift(V2, type="lead")!=0]





                        share|improve this answer
























                          0












                          0








                          0






                          Here is the data.table equivalent to the solution from @iod :



                          library("data.table")

                          dt <- fread(
                          "350 16
                          352 0
                          354 0
                          359 0
                          366 11
                          376 38
                          380 38
                          386 0
                          389 0
                          397 55
                          398 45
                          399 0
                          400 19
                          402 30")

                          dt[V2!=0 | shift(V2, type="lead")!=0]





                          share|improve this answer












                          Here is the data.table equivalent to the solution from @iod :



                          library("data.table")

                          dt <- fread(
                          "350 16
                          352 0
                          354 0
                          359 0
                          366 11
                          376 38
                          380 38
                          386 0
                          389 0
                          397 55
                          398 45
                          399 0
                          400 19
                          402 30")

                          dt[V2!=0 | shift(V2, type="lead")!=0]






                          share|improve this answer












                          share|improve this answer



                          share|improve this answer










                          answered Nov 25 at 17:05









                          jogo

                          9,82692135




                          9,82692135






























                              draft saved

                              draft discarded




















































                              Thanks for contributing an answer to Stack Overflow!


                              • Please be sure to answer the question. Provide details and share your research!

                              But avoid



                              • Asking for help, clarification, or responding to other answers.

                              • Making statements based on opinion; back them up with references or personal experience.


                              To learn more, see our tips on writing great answers.





                              Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                              Please pay close attention to the following guidance:


                              • Please be sure to answer the question. Provide details and share your research!

                              But avoid



                              • Asking for help, clarification, or responding to other answers.

                              • Making statements based on opinion; back them up with references or personal experience.


                              To learn more, see our tips on writing great answers.




                              draft saved


                              draft discarded














                              StackExchange.ready(
                              function () {
                              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53436337%2fdelete-successive-rows-under-condition-in-r%23new-answer', 'question_page');
                              }
                              );

                              Post as a guest















                              Required, but never shown





















































                              Required, but never shown














                              Required, but never shown












                              Required, but never shown







                              Required, but never shown

































                              Required, but never shown














                              Required, but never shown












                              Required, but never shown







                              Required, but never shown







                              Popular posts from this blog

                              How to ignore python UserWarning in pytest?

                              What visual should I use to simply compare current year value vs last year in Power BI desktop

                              Script to remove string up to first number