xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0












I'm trying to parse a directory containing a collection of XML files saved from RSS feeds.
Similar code works fine for another directory, so I can't figure out the problem. I want to return the items so I can write them to a CSV file. The error I'm getting is:



xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0


Here is the site I've collected RSS feeds from: https://www.ba.no/service/rss



It worked fine for: https://www.nrk.no/toppsaker.rss and https://www.vg.no/rss/feed/?limit=10&format=rss&categories=&keywords=



Here is the function for this RSS:



import os
import xml.etree.ElementTree as ET
import csv

def baitem():
    basepath = "../data_copy/bergens_avisen"

    table = []

    for fname in os.listdir(basepath):
        if fname != "last_feed.xml":
            files = ET.parse(os.path.join(basepath, fname))
            root = files.getroot()
            items = root.find("channel").findall("item")
            #print(items)
            for item in items:
                date = item.find("pubDate").text
                title = item.find("title").text
                description = item.find("description").text
                link = item.find("link").text
                table.append((date, title, description, link))
    return table


I tested with print(items) and it prints all the item objects.
Could the problem be in how the XML files are written?










python-3.6 elementtree parse-error xml.etree python-os

asked Nov 19 '18 at 11:58 by Felisep (edited Nov 19 '18 at 13:43)
























          1 Answer

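I asked a friend, who suggested wrapping the parsing in a try/except statement to see which file was failing. That revealed a .DS_Store file in the directory, a hidden metadata file that macOS creates. It is not XML, so ET.parse fails on it at line 1, column 0. I'm providing the solution for those who might experience the same problem in the future.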



import os
import xml.etree.ElementTree as ET

def baitem():
    basepath = "../data_copy/bergens_avisen"

    table = []

    for fname in os.listdir(basepath):
        try:
            # Skip the feed snapshot and the hidden macOS metadata file.
            if fname != "last_feed.xml" and fname != ".DS_Store":
                files = ET.parse(os.path.join(basepath, fname))
                root = files.getroot()
                items = root.find("channel").findall("item")
                for item in items:
                    date = item.find("pubDate").text
                    title = item.find("title").text
                    description = item.find("description").text
                    link = item.find("link").text
                    table.append((date, title, description, link))
        except Exception as e:
            # Print the offending file name so it can be identified.
            print(fname, e)
    return table
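
As a possible variation on the fix above, the sketch below selects files by their .xml extension instead of listing names to exclude, so .DS_Store and any other stray non-XML file is never opened. The function name collect_items and its skip parameter are hypothetical, introduced only for this illustration; it also assumes every feed file in the directory ends in .xml. It catches only ET.ParseError, so unrelated errors are not hidden.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_items(basepath, skip=("last_feed.xml",)):
    """Return (date, title, description, link) tuples from each *.xml file."""
    table = []
    # Matching on "*.xml" leaves out .DS_Store and other non-XML files.
    for path in sorted(Path(basepath).glob("*.xml")):
        if path.name in skip:
            continue
        try:
            root = ET.parse(str(path)).getroot()
        except ET.ParseError as exc:
            # Report the malformed file and continue with the rest.
            print(f"skipping {path.name}: {exc}")
            continue
        for item in root.iterfind("channel/item"):
            # findtext returns None when a child element is missing.
            table.append((
                item.findtext("pubDate"),
                item.findtext("title"),
                item.findtext("description"),
                item.findtext("link"),
            ))
    return table
```

The returned list of tuples can then be written out with csv.writer(...).writerows(table), which matches the CSV goal stated in the question.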





answered Nov 19 '18 at 14:43 by Felisep (edited Jan 19 at 12:57)





























