sed or awk: remove numbers after a symbol

I would like to remove just the leading number and the "_" that follow the ">" symbol. For example:



>1_CR-B_CR56_t
MTKIIKFVYFMTIFISPNHHCPVYNCTHPKQPWCKLVRLQLLFHGSLIGLCDCI
>2_R-B_R46_t
MVEVTKLVNVMLIFLTLSPLVYDCQAYECELPFKPDCLMVEYSPQFVALRCGCV
>3000_N-N274_M
MVEVTKLVNVMLIFLTLFVYTDSDCQAYACELPFKPDCLMVEYAPQFFRLACGCV


Expected Results:



>CR-B_CR56_t
MTKIIKFVYFMTIFISPNHHCPVYNCTHPKQPWCKLVRLQLLFHGSLIGLCDCI
>R-B_R46_t
MVEVTKLVNVMLIFLTLSPLVYDCQAYECELPFKPDCLMVEYSPQFVALRCGCV
>N-N274_M
MVEVTKLVNVMLIFLTLFVYTDSDCQAYACELPFKPDCLMVEYAPQFFRLACGCV


I used sed "s/>[0-9][_]//g" but it removed ">" as well.

asked 4 hours ago by Paul

2 Answers

Just a slight modification of your sed command:

sed 's/^>[0-9]*[_]/>/g'

s is sed's substitute command: it searches for a match of the pattern on the left-hand side and replaces it with the string on the right-hand side. Your command replaced the match, which included the ">", with nothing. Replacing it with ">" instead keeps the character you want to preserve.

^ anchors the match to the beginning of a line, so only a ">" in the first column is affected.

The * after [0-9] lets the pattern match any run of digits (including none) instead of exactly one, which is why multi-digit prefixes such as 3000 are also removed. Because the pattern is anchored with ^, it can match at most once per line, so the g flag is harmless but unnecessary.

answered 4 hours ago by Jesse_b, edited 4 hours ago
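
To make the difference between the command in the question and the one in the answer concrete, here is a minimal sketch that runs both on two of the sample headers. The function names sample, original_attempt and fixed are made up for this illustration, and the sequence lines are shortened; only the two sed expressions come from the posts above.

```shell
#!/bin/sh
# Two headers from the question; the sequence lines are shortened here.
sample() {
  printf '%s\n' '>1_CR-B_CR56_t' 'MTKIIKFVYF' '>3000_N-N274_M' 'MVEVTKLVNV'
}

# The attempt from the question: the match includes ">" and is replaced
# with nothing, and [0-9] matches exactly one digit.
original_attempt() { sed 's/>[0-9][_]//g'; }

# The command from the answer: anchored at the line start, any number of
# digits, and ">" written back by the replacement.
fixed() { sed 's/^>[0-9]*[_]/>/g'; }

sample | original_attempt
echo '---'
sample | fixed
```

The first run should drop the ">" from the ">1_" header and leave the ">3000_" header untouched, since a single [0-9] cannot span four digits. The second run should print both headers as ">CR-B_CR56_t" and ">N-N274_M".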

• Thanks. I tried so many options, but not this one.
  – Paul, 4 hours ago

• You might want the line-start anchor (^) as well, and + instead of * to require one or more digits.
  – glenn jackman, 4 hours ago
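
The suggestion in the last comment could be sketched as follows. Note that + is an extended-regex operator, so this assumes a sed that accepts -E (GNU and BSD sed both do); in basic regular expressions the portable spelling is [0-9][0-9]*. The function names and the two test headers (">12_AB_x" and ">_keep") are hypothetical and not taken from the question.

```shell
#!/bin/sh
# Require at least one digit between ">" and "_".
# -E selects extended regular expressions, where + means "one or more".
strip_ere() { sed -E 's/^>[0-9]+_/>/'; }

# The same idea in basic regular expressions, without relying on -E.
strip_bre() { sed 's/^>[0-9][0-9]*_/>/'; }

# Hypothetical headers: one with a numeric prefix, one starting with "_".
printf '%s\n' '>12_AB_x' '>_keep' | strip_ere
printf '%s\n' '>12_AB_x' '>_keep' | strip_bre
```

Both variants should turn ">12_AB_x" into ">AB_x" and leave ">_keep" alone, whereas the [0-9]* form would also strip the underscore from a header that has no number at all.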

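The same substitution in awk. sub() rewrites the first match on each line, and the trailing 1 prints every line. Matching [0-9]+ rather than a fixed number of arbitrary characters handles numeric prefixes of any length and leaves headers without a number alone:

awk '{sub(/^>[0-9]+_/,">")}1' file
>CR-B_CR56_t
MTKIIKFVYFMTIFISPNHHCPVYNCTHPKQPWCKLVRLQLLFHGSLIGLCDCI
>R-B_R46_t
MVEVTKLVNVMLIFLTLSPLVYDCQAYECELPFKPDCLMVEYSPQFVALRCGCV
>N-N274_M
MVEVTKLVNVMLIFLTLFVYTDSDCQAYACELPFKPDCLMVEYAPQFFRLACGCV
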
answered 1 hour ago by Claes Wikner
