find newline with words starting with underscore with specific pattern
I need to extract the following from C code using a regular expression in Python, but somehow I could not write it properly.
if(condition)
/*~T*/
{
/*~T*/
_getmethis = FALSE;
/*~T*/
}
..........
/*~T*/
_findmethis = FALSE;
......
/*~T*/
_findthat = True;
I need to find all variables that follow /*~T*/ and start with an underscore, and write them to a new file. My code does not find them: I tried several regex patterns, but the output file is always empty.
import re
fh = open('filename.c', "r")
output = open("output.txt", "w")
pattern = re.compile(r'(/\*~T\*/)(\s*?\n\s*)(_[aA-zZ]*)')
for line in fh:
    for m in re.finditer(pattern, line):
        output.write(m.group(3))
        output.write("\n")
output.close()
regex python-3.x
edited 2 hours ago
asked 2 hours ago
fastlearner
[aA-zZ] does not only match letters; it also matches [, \, ], ^, _ and `. You must have meant [a-zA-Z]. All you need to do is remove the for line in fh: loop and use re.finditer(pattern, fh.read()).
– Wiktor Stribiżew
1 hour ago
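The point about the character class can be checked directly. The following is a minimal sketch, not part of the original comment; the variable names are illustrative. It builds the six ASCII characters that lie between Z and a and tests both classes against them.

```python
import re

# The six ASCII characters that sit between 'Z' (90) and 'a' (97).
between = "".join(chr(c) for c in range(ord("Z") + 1, ord("a")))

loose = re.compile(r"[aA-zZ]")    # the class used in the question
strict = re.compile(r"[a-zA-Z]")  # letters only

loose_hits = loose.findall(between)
strict_hits = strict.findall(between)

print(between)      # the characters [ \ ] ^ _ `
print(loose_hits)   # every one of them matches, because A-z spans them
print(strict_hits)  # []
```

The range A-z runs from code point 65 to 122, which is why the punctuation in between is swept in.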
2 Answers
The reason you do not find anything is that your pattern crosses multiple lines but you are only looking at your file one line at a time.
Consider using this:
t = """
if(condition)
/*~-*/
{
/*~T*/
_getmethis = FALSE;
/*~-*/
}
..........
/*~T*/
_findmethis = FALSE;
/*~T*/
do_not_findme_this = FALSE;
"""
import re
# DOTALL lets .*? run across newlines; the name must start a line (after optional whitespace)
pattern = re.compile(r'/\*~T\*/.*?\n\s*(_[a-zA-Z]*)', re.MULTILINE|re.DOTALL)
for m in re.finditer(pattern, t):  # use the whole file here - not line-wise
    print(m.group(1))
The pattern is compiled with two flags. re.DOTALL makes the dot . match newlines as well (by default it does not). re.MULTILINE only changes where ^ and $ match, so it has no effect on this particular pattern and could be dropped. The non-greedy .*? keeps the gap between /*~T*/ and the following group as small as possible.
Printout:
_getmethis
_findmethis
Documentation:
- re.MULTILINE
- re.DOTALL
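To see why reading line by line yields nothing, here is a small sketch that runs one pattern both ways. The sample string and the pattern are assumptions made for illustration, not the code from the answer above.

```python
import re

# Hypothetical sample: the marker and the variable sit on different lines.
sample = "/*~T*/\n_getmethis = FALSE;\n/*~T*/\n_findmethis = FALSE;\n"

pattern = re.compile(r"/\*~T\*/\s*\n\s*(_[a-zA-Z]*)")

# Line by line: each chunk ends at its own newline, so the name is never in view.
per_line = [m.group(1)
            for line in sample.splitlines(keepends=True)
            for m in pattern.finditer(line)]

# Whole text: the match is free to span the line break.
whole = [m.group(1) for m in pattern.finditer(sample)]

print(per_line)  # []
print(whole)     # ['_getmethis', '_findmethis']
```

Iterating over a file object behaves like the per-line case here, which is consistent with the empty output file described in the question.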
edited 33 mins ago
answered 2 hours ago
Patrick Artner
Silly of me: I always check the regex but not the Python. I will try this.
– fastlearner
2 hours ago
But this also finds the words if the underscore is in the middle of a variable.
– fastlearner
50 mins ago
@fastlearner Then adjust the pattern, so that (_[aA-zZ]*) is only allowed after a newline and spaces? See the edit. If you want to play with regex, use regex101.com and put it in Python mode: copy your text and pattern into it and modify the pattern until it fits. Your example text did not contain any pattern "to be excluded".
– Patrick Artner
31 mins ago
You need to read the file in as a whole with fh.read() and make sure you amend the pattern to match only letters, since [aA-zZ] matches more than just letters.
The pattern I suggest is
(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)
See the regex demo. Note that I deliberately subtracted \n from the first \s* to make matching more efficient.
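As a rough illustration of what "subtracting \n" means, the sketch below compares the two classes on a made-up string. The names are hypothetical and this is not part of the original answer.

```python
import re

horizontal = re.compile(r"[^\S\n]*")  # whitespace, but never a newline
anything = re.compile(r"\s*")         # whitespace, newlines included

text = " \t\n\n  x"

h = horizontal.match(text).group()
a = anything.match(text).group()

print(repr(h))  # ' \t'
print(repr(a))  # ' \t\n\n  '
```

Because [^\S\n]* cannot consume the newline that the following \n needs, the engine presumably has less backtracking to do than with a plain \s* in that position.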
When reading files in, it is more convenient to use with so that you do not have to call .close():
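import re

# Group 3 is the variable name; [^\S\n]* is whitespace without newlines
pattern = re.compile(r'(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)')
with open('filename.c', "r") as fh:
    contents = fh.read()  # whole file at once, so matches can cross lines
with open("output.txt", "w") as output:
    output.write("\n".join([x.group(3) for x in pattern.finditer(contents)]))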
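For anyone who wants to try the whole round trip without touching real files, here is a self-contained sketch under stated assumptions: the C snippet is invented, the files live in a temporary directory, and the pattern uses a single capture group rather than the three groups shown above.

```python
import os
import re
import tempfile

pattern = re.compile(r"/\*~T\*/[^\S\n]*\n\s*(_[a-zA-Z]*)")

# Hypothetical C source; the last name has its underscores in the middle.
source = (
    "/*~T*/\n"
    "    _getmethis = FALSE;\n"
    "/*~T*/\n"
    "_findmethis = FALSE;\n"
    "/*~T*/\n"
    "do_not_findme_this = FALSE;\n"
)

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "filename.c")
    out_path = os.path.join(tmp, "output.txt")
    with open(src_path, "w") as fh:
        fh.write(source)

    # Read the whole file at once so a match can cross line breaks.
    with open(src_path) as fh:
        contents = fh.read()
    names = pattern.findall(contents)  # one group, so a list of strings
    with open(out_path, "w") as output:
        output.write("\n".join(names))

    with open(out_path) as fh:
        written = fh.read()

print(written)
```

The name do_not_findme_this is skipped because the capture group must begin with an underscore immediately after the newline and any whitespace.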
answered 11 mins ago
Wiktor Stribiżew