find newline with words starting with underscore with specific pattern

I need to find the following in C code using a Python regular expression, but somehow I could not write it properly.

if(condition)
     /*~T*/
     {
        /*~T*/
        _getmethis = FALSE;
     /*~T*/
     }
..........
/*~T*/
     _findmethis = FALSE;
......
                    /*~T*/
_findthat = True;

I need to find all variables that follow /*~T*/ and start with an underscore, and write them to a new file. My code does not find them: I tried several regex patterns, but the output file is always empty.

import re

fh = open('filename.c', "r")
output = open("output.txt", "w")
pattern = re.compile(r'(/\*~T\*/)(\s*?\n\s*)(_[aA-zZ]*)')

for line in fh:
    for m in re.finditer(pattern, line):
        output.write(m.group(3))
        output.write("\n")

output.close()

edited 2 hours ago

asked 2 hours ago

fastlearner

2417

[aA-zZ] does not only match letters, it also matches [, \, ], ^, _, `. You must have meant [a-zA-Z]. All you need to do is remove for line in fh: and use re.finditer(pattern, fh.read())
– Wiktor Stribiżew
1 hour ago

regex python-3.x


2 Answers

The reason you do not find anything is that your pattern crosses multiple lines but you are only looking at your file one line at a time.
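To see the effect in isolation, here is a minimal sketch comparing a line-by-line search with a search over the whole text. The SAMPLE string and the two helper functions are made up for illustration; the pattern is a simplified form of the one in the question with the character class narrowed to letters.

```python
import re

# Hypothetical sample standing in for the contents of filename.c.
SAMPLE = "/*~T*/\n\n    _getmethis = FALSE;\n"

pattern = re.compile(r"/\*~T\*/\s*?\n\s*(_[a-zA-Z]+)")


def find_linewise(text):
    """Search each line separately, as the loop in the question does."""
    found = []
    for line in text.splitlines(keepends=True):
        found.extend(m.group(1) for m in pattern.finditer(line))
    return found


def find_whole(text):
    """Search the complete text in a single pass."""
    return [m.group(1) for m in pattern.finditer(text)]


print(find_linewise(SAMPLE))  # []
print(find_whole(SAMPLE))     # ['_getmethis']
```

No single line contains both the marker and the name, so the line-wise version can never match.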

Consider using this:

t = """
if(condition)
     /*~-*/
     {
        /*~T*/
        _getmethis = FALSE;
     /*~-*/
     }
..........
/*~T*/
     _findmethis = FALSE;

     /*~T*/
     do_not_findme_this = FALSE;
"""

import re

pattern = re.compile(r'/\*~T\*/.*?\n\s+(_[aA-zZ]*)', re.MULTILINE|re.DOTALL)

for m in re.finditer(pattern, t):  # use the whole file here - not line-wise
    print(m.group(1))

The pattern is compiled with two flags. re.DOTALL makes the dot . match newlines as well (by default it does not); combined with the non-greedy .*? this keeps the gap between /*~T*/ and the following group as small as possible. re.MULTILINE only changes how ^ and $ behave, so it has no effect on this particular pattern.
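As a quick check of what each flag does, here is a small sketch on a made-up two-line string (the variable names are only for illustration):

```python
import re

text = "a\nb"

plain = re.findall(r"a.b", text)                    # dot stops at the newline
dotall = re.findall(r"a.b", text, re.DOTALL)        # dot may match the newline
multiline = re.findall(r"a.b", text, re.MULTILINE)  # no ^ or $, so no change

# re.MULTILINE matters only once the pattern uses an anchor.
anchored_plain = re.findall(r"^b", text)
anchored_multi = re.findall(r"^b", text, re.MULTILINE)

print(plain, dotall, multiline)        # [] ['a\nb'] []
print(anchored_plain, anchored_multi)  # [] ['b']
```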

Printout:

_getmethis

_findmethis

Documentation:

re.MULTILINE

re.DOTALL

edited 33 mins ago

answered 2 hours ago

Patrick Artner

18.1k51940

Silly of me: I kept checking the regex but not the Python. I will try this.
– fastlearner
2 hours ago

But this also finds words where the underscore is in the middle of a variable name.
– fastlearner
50 mins ago

@fastlearner Then adjust the pattern so that (_[aA-zZ]*) is only allowed after a newline and spaces. See the edit. If you want to experiment with regexes, use regex101.com in Python mode: paste your text and pattern and modify the pattern until it fits. Your example text did not contain anything "to be excluded".
– Patrick Artner
31 mins ago
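A rough sketch of the difference being discussed in these comments. The "loose" pattern below is a hypothetical stand-in for a version that lets anything sit between the marker and the underscore (the pre-edit pattern is not shown on this page); the "strict" one only allows whitespace and a newline there. The sample text is made up.

```python
import re

# Hypothetical input: the first marker is followed by a name whose
# underscores are in the middle, the second by a leading-underscore name.
text = (
    "/*~T*/\n"
    "    do_not_findme_this = FALSE;\n"
    "/*~T*/\n"
    "    _findmethis = FALSE;\n"
)

# Loose: any characters may precede the underscore.
loose = re.compile(r"/\*~T\*/.*?(_[a-zA-Z]*)", re.DOTALL)

# Strict: only whitespace and a newline may precede the underscore.
strict = re.compile(r"/\*~T\*/[^\S\n]*\n\s*(_[a-zA-Z]*)")

loose_hits = loose.findall(text)
strict_hits = strict.findall(text)

print(loose_hits)   # ['_not', '_findmethis']
print(strict_hits)  # ['_findmethis']
```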


You need to read the file in as a whole with fh.read() and make sure you amend the pattern to only match letters since [aA-zZ] matches more than just letters.
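To check which extra characters [aA-zZ] lets through, one can run both classes over the printable ASCII range and compare the results. This is only an illustrative sketch; the variable names are made up.

```python
import re

# Printable ASCII characters (codes 32..126).
ascii_chars = "".join(chr(c) for c in range(32, 127))

broad = set(re.findall(r"[aA-zZ]", ascii_chars))     # 'a', the range A-z, 'Z'
letters = set(re.findall(r"[a-zA-Z]", ascii_chars))  # letters only

extras = sorted(broad - letters)
print(extras)  # ['[', '\\', ']', '^', '_', '`']
```

The range A-z spans code points 65 to 122, which includes the six punctuation characters between Z (90) and a (97).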

The pattern I suggest is

(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)

See the regex demo. Note that I deliberately subtracted \n from the first \s* to make matching more efficient.
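The class [^\S\n] reads as "not a non-whitespace character and not a newline", that is, any whitespace except \n. A short sketch of how it behaves, followed by the suggested pattern applied to a made-up sample string:

```python
import re

horizontal = re.compile(r"[^\S\n]")

checks = {ch: bool(horizontal.fullmatch(ch)) for ch in (" ", "\t", "\n", "x")}
print(checks)  # {' ': True, '\t': True, '\n': False, 'x': False}

pattern = re.compile(r"(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)")

# Hypothetical sample: trailing spaces after the marker, then a blank line.
sample = "/*~T*/  \n\n    _findmethis = FALSE;\n"
names = [m.group(3) for m in pattern.finditer(sample)]
print(names)  # ['_findmethis']
```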

When reading files in, it is more convenient to use with so that you do not have to use .close():

import re

pattern = re.compile(r'(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)')

with open('filename.c', "r") as fh:
    contents = fh.read()
    with open("output.txt", "w") as output:
        output.write("\n".join([x.group(3) for x in pattern.finditer(contents)]))

answered 11 mins ago

Wiktor Stribiżew

301k16122197
