Beautifulsoup print attribute value for divs with same class
I've got the following code working that will print the text after value=
soup = BeautifulSoup(html, 'lxml')
name = soup.find('input')['value']
print(name)
However, the page has multiple divs with the same class. I've tried findAll, but I get errors and can only print the first field value, which is the Name.
Please see the attached screenshot.
<div class="control-group"><label class="control-label required" for="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe">What date do you want to make a decision?</label>
<div class="controls"><input type="text" id="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe" name="client_appbundle_prospecttype[ProspectFirstContact][decision_timeframe]" required="required" class="input-small text-bound datepicker hasDatepicker"></div>
</div>
</div>
</div>
</div>
</div>
<div class="tab-pane active" id="prospect_consultation">
<div class="widget row-fluid">
<div class="span12">
<div class="navbar">
<div class="navbar-inner">
<h6>Personal details</h6>
</div>
</div>
<div class="well">
<div class="control-group">
<label class="control-label">Name</label>
<div class="controls">
Sam Test-March 2018
</div>
</div>
<div class="control-group">
<label class="control-label">Address and postcode</label>
<div class="controls">
</div>
</div>
<div class="control-group">
<label class="control-label">Mobile number</label>
<div class="controls">
12345678
</div>
</div>
<div class="control-group">
<label class="control-label">Email address</label>
<div class="controls">
test@test.com
</div>
</div>
Thanks!
python selenium web-scraping beautifulsoup
Please use the snippet tool via edit to include HTML, not as an image. Also, your shown HTML does not have input tag elements visible.
– QHarr
Nov 21 at 16:08
I've uploaded the code to the snippet view now.
– Arron
Nov 21 at 16:36
what are you trying to extract from that snippet?
– QHarr
Nov 21 at 16:38
Hi QHarr, I'm trying to get the values in the fields under Name, Address and postcode, Mobile number, etc. The page has many of these fields with different labels, so I'm trying find_all, which produces errors. When I run the find code, it just prints the first label only, which is the Name.
– Arron
Nov 21 at 16:43
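The find_all errors are not quoted in the thread, so the cause is a guess. One common one is that find_all returns a list-like ResultSet, so it cannot be indexed with ['value'] the way the single tag returned by find can. The sketch below loops over the results and uses Tag.get, which returns None when an attribute is missing. The markup and the names first_name and decision_timeframe are made up for illustration (the posted HTML shows no value attributes), and html.parser is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second input is given a value attribute
# purely for illustration; the HTML in the question has none.
html = """
<div class="controls"><input type="text" name="decision_timeframe"></div>
<div class="controls"><input type="text" name="first_name" value="Sam"></div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all returns a ResultSet (a list of tags), so read the attribute
# from each tag rather than from the result as a whole.
inputs = soup.find_all("input")

# Tag.get returns None instead of raising KeyError when the attribute is absent.
values = [tag.get("value") for tag in inputs]
print(values)
```

If every input is expected to carry a value, tag["value"] inside the loop would fail loudly on the first one that does not, which may be preferable while debugging.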
edited Nov 22 at 6:57
ewwink
6,96422233
asked Nov 21 at 15:58
Arron
43
1 Answer
Maybe something like:
from bs4 import BeautifulSoup
html = '''
<html>
<head></head>
<body>
<div class="control-group">
<label class="control-label required" for="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe">What date do you want to make a decision?</label>
<div class="controls">
<input type="text" id="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe" name="client_appbundle_prospecttype[ProspectFirstContact][decision_timeframe]" required class="input-small text-bound datepicker hasDatepicker">
</div>
</div>
<div class="tab-pane active" id="prospect_consultation">
<div class="widget row-fluid">
<div class="span12">
<div class="navbar">
<div class="navbar-inner">
<h6>Personal details</h6>
</div>
</div>
<div class="well">
<div class="control-group">
<label class="control-label">Name</label>
<div class="controls">
Sam Test-March 2018
</div>
</div>
<div class="control-group">
<label class="control-label">Address and postcode</label>
<div class="controls">
</div>
</div>
<div class="control-group">
<label class="control-label">Mobile number</label>
<div class="controls">
12345678
</div>
</div>
<div class="control-group">
<label class="control-label">Email address</label>
<div class="controls">
test@test.com
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
'''
soup = BeautifulSoup(html, "lxml")
items = soup.select('.controls')
print([item.text.strip() for item in items if item.text.strip()])
Thanks! Could I specify particular fields, so that it only returns the value for 'Address and postcode', 'Name', etc.?
– Arron
Nov 22 at 10:47
Why would you want to do that? If the HTML has a constant layout, you could use position matching to do this, or an XPath that looks for specific strings.
– QHarr
Nov 22 at 12:40
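One possible way to pick out particular fields, instead of the position matching or XPath suggested above, is to pair each label with the controls div in the same control-group and then look values up by label text. This is a minimal sketch that assumes every group follows the label-plus-controls layout shown in the question. The names fields and wanted are illustrative, and html.parser is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the "Personal details" markup from the question.
html = """
<div class="well">
<div class="control-group">
<label class="control-label">Name</label>
<div class="controls">
Sam Test-March 2018
</div>
</div>
<div class="control-group">
<label class="control-label">Address and postcode</label>
<div class="controls">
</div>
</div>
<div class="control-group">
<label class="control-label">Mobile number</label>
<div class="controls">
12345678
</div>
</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Map each label's text to the text of the controls div in the same group.
fields = {}
for group in soup.select("div.control-group"):
    label = group.find("label", class_="control-label")
    controls = group.find("div", class_="controls")
    if label is None or controls is None:
        continue  # skip groups that do not follow the assumed layout
    fields[label.get_text(strip=True)] = controls.get_text(strip=True)

# Keep only the fields of interest; an empty field comes back as ''.
wanted = ["Name", "Address and postcode"]
print({name: fields.get(name) for name in wanted})
```

Looking values up by label text keeps working if the fields are reordered, but it breaks if a label is reworded, so it trades one kind of fragility for another.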
answered Nov 21 at 16:45
QHarr
27.2k81839