Unable to parse the href tag in Python

I get the following output from Beautiful Soup:

[Search over 301,944 datasets\n]

I need to extract only the number 301,944 from this. Please guide me on how this can be done. My code so far:



import requests
import re
from bs4 import BeautifulSoup

source = requests.get('https://www.data.gov/').text
soup = BeautifulSoup(source, 'lxml')
# print(soup.prettify())
images = soup.find_all('small')
print(images)
con = images.find_all('a')  # I am unable to get the anchor tag here. It says the anchor tag is not present.
print(con)
# for con in images.find_all('a', href=True):
#     print(con)
# content = images.split('metrics')
# print(content[1])
# images = soup.find_all('a', {'href': re.compile(r'\d+')})
# print(images)

Tags: beautifulsoup

asked Nov 10 at 19:22 by user1107731

1 Answer (accepted)

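There is only one <small> tag on the website.

Your images variable holds a list containing it, but you use it the wrong way to retrieve the anchor tag: find_all returns a list of matches (a ResultSet), and that list has no find_all method or .a attribute of its own.

If you want to retrieve the text from the anchor tag, you can get it with:

soup.find('small').a.text

where the find method returns the first small element it encounters on the page. If you use find_all, you get a list of all small elements (there is only one small tag here), so you have to pick an item from the list, for example soup.find_all('small')[0].a.text.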
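
The expression above returns the whole link text, not only the number. A minimal sketch of the remaining step, assuming the text has the form shown in the question ("Search over 301,944 datasets"); the helper name extract_count is hypothetical and not part of Beautiful Soup:

```python
import re


def extract_count(text):
    """Return the first comma-grouped integer found in text, or None."""
    # \d[\d,]* matches a digit followed by any run of digits and commas,
    # for example "301,944".
    match = re.search(r'\d[\d,]*', text)
    if match is None:
        return None
    # Drop the thousands separators before converting to int.
    return int(match.group().replace(',', ''))


# Assumed value of soup.find('small').a.text, taken from the question.
link_text = 'Search over 301,944 datasets\n'
print(extract_count(link_text))  # 301944
```

If the formatted string "301,944" is wanted instead of an integer, match.group() can be returned as-is.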






• It's not working. When I do that it reports:
  Traceback (most recent call last):
    File "C:vishwamyscriptsvalue_site.py", line 7, in <module>
      images = soup.find_all('small').a.text
    File "C:\Python27\lib\site-packages\bs4\element.py", line 1884, in __getattr__
      "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. " % key
  AttributeError: ResultSet object has no attribute 'a'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
  – user1107731, Nov 12 at 11:01

• I ran it on my PC and it printed just the text from the anchor tag. I first used find_all(), but since there is only one anchor tag inside the small tag, I used find() to retrieve just that one.
  – Dinko Pehar, Nov 12 at 11:25

• Thank you. Now I got it.
  – user1107731, Nov 12 at 15:24

• Thanks. Can you please mark the question as complete? Thank you.
  – Dinko Pehar, Nov 12 at 17:37
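
The error quoted in the comments comes from accessing .a on the list that find_all returns. A small sketch of the difference, run against a static snippet rather than the live site; the markup is an assumption about what the page contained at the time, and html.parser is used here only to avoid the lxml dependency:

```python
from bs4 import BeautifulSoup

# Assumed markup; the real page may differ.
html = '<small><a href="/metrics">Search over 301,944 datasets</a></small>'
soup = BeautifulSoup(html, 'html.parser')

matches = soup.find_all('small')  # ResultSet: a list of Tag objects
try:
    matches.a                     # the list itself has no .a attribute
except AttributeError as exc:
    print('find_all:', type(exc).__name__)

first = soup.find('small')        # a single Tag, or None if nothing matches
print(first.a.text)               # text of the anchor inside <small>
print(matches[0].a.text)          # same text, reached by indexing the list
```

Since find returns None when nothing matches, checking the result before accessing .a is a reasonable precaution on pages whose markup may change.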











answered Nov 11 at 0:04 by Dinko Pehar