xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
I'm trying to parse a directory with a collection of xml files from RSS feeds.
I have a similar code for another directory working fine, so I can't figure out the problem. I want to return the items so I can write them to a CSV file. The error I'm getting is:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
Here is the site I've collected RSS feeds from: https://www.ba.no/service/rss
It worked fine for: https://www.nrk.no/toppsaker.rss and https://www.vg.no/rss/feed/?limit=10&format=rss&categories=&keywords=
Here is the function for this RSS:
import os
import xml.etree.ElementTree as ET
import csv
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
if fname != "last_feed.xml":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
#print(items)
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
return table
I tested with print(items)
and it returns all the objects.
Can it be how the XML files are written?
python-3.6 elementtree parse-error xml.etree python-os
add a comment |
I'm trying to parse a directory with a collection of xml files from RSS feeds.
I have a similar code for another directory working fine, so I can't figure out the problem. I want to return the items so I can write them to a CSV file. The error I'm getting is:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
Here is the site I've collected RSS feeds from: https://www.ba.no/service/rss
It worked fine for: https://www.nrk.no/toppsaker.rss and https://www.vg.no/rss/feed/?limit=10&format=rss&categories=&keywords=
Here is the function for this RSS:
import os
import xml.etree.ElementTree as ET
import csv
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
if fname != "last_feed.xml":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
#print(items)
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
return table
I tested with print(items)
and it returns all the objects.
Can it be how the XML files are written?
python-3.6 elementtree parse-error xml.etree python-os
add a comment |
I'm trying to parse a directory with a collection of xml files from RSS feeds.
I have a similar code for another directory working fine, so I can't figure out the problem. I want to return the items so I can write them to a CSV file. The error I'm getting is:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
Here is the site I've collected RSS feeds from: https://www.ba.no/service/rss
It worked fine for: https://www.nrk.no/toppsaker.rss and https://www.vg.no/rss/feed/?limit=10&format=rss&categories=&keywords=
Here is the function for this RSS:
import os
import xml.etree.ElementTree as ET
import csv
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
if fname != "last_feed.xml":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
#print(items)
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
return table
I tested with print(items)
and it returns all the objects.
Can it be how the XML files are written?
python-3.6 elementtree parse-error xml.etree python-os
I'm trying to parse a directory with a collection of xml files from RSS feeds.
I have a similar code for another directory working fine, so I can't figure out the problem. I want to return the items so I can write them to a CSV file. The error I'm getting is:
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 1, column 0
Here is the site I've collected RSS feeds from: https://www.ba.no/service/rss
It worked fine for: https://www.nrk.no/toppsaker.rss and https://www.vg.no/rss/feed/?limit=10&format=rss&categories=&keywords=
Here is the function for this RSS:
import os
import xml.etree.ElementTree as ET
import csv
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
if fname != "last_feed.xml":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
#print(items)
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
return table
I tested with print(items)
and it returns all the objects.
Can it be how the XML files are written?
python-3.6 elementtree parse-error xml.etree python-os
python-3.6 elementtree parse-error xml.etree python-os
edited Nov 19 '18 at 13:43
Felisep
asked Nov 19 '18 at 11:58
FelisepFelisep
12
12
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
Asked a friend and said to test with a try except statement. Found a .DS_Store file, which only applies to Mac computers. I'm providing the solution for those who might experience the same problem in the future.
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
try:
if fname != "last_feed.xml" and fname != ".DS_Store":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
except Exception as e:
print(fname, e)
return table
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53374160%2fxml-etree-elementtree-parseerror-not-well-formed-invalid-token-line-1-colum%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Asked a friend and said to test with a try except statement. Found a .DS_Store file, which only applies to Mac computers. I'm providing the solution for those who might experience the same problem in the future.
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
try:
if fname != "last_feed.xml" and fname != ".DS_Store":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
except Exception as e:
print(fname, e)
return table
add a comment |
Asked a friend and said to test with a try except statement. Found a .DS_Store file, which only applies to Mac computers. I'm providing the solution for those who might experience the same problem in the future.
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
try:
if fname != "last_feed.xml" and fname != ".DS_Store":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
except Exception as e:
print(fname, e)
return table
add a comment |
Asked a friend and said to test with a try except statement. Found a .DS_Store file, which only applies to Mac computers. I'm providing the solution for those who might experience the same problem in the future.
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
try:
if fname != "last_feed.xml" and fname != ".DS_Store":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
except Exception as e:
print(fname, e)
return table
Asked a friend and said to test with a try except statement. Found a .DS_Store file, which only applies to Mac computers. I'm providing the solution for those who might experience the same problem in the future.
def baitem():
basepath = "../data_copy/bergens_avisen"
table =
for fname in os.listdir(basepath):
try:
if fname != "last_feed.xml" and fname != ".DS_Store":
files = ET.parse(os.path.join(basepath, fname))
root = files.getroot()
items = root.find("channel").findall("item")
for item in items:
date = item.find("pubDate").text
title = item.find("title").text
description = item.find("description").text
link = item.find("link").text
table.append((date, title, description, link))
except Exception as e:
print(fname, e)
return table
edited Jan 19 at 12:57
answered Nov 19 '18 at 14:43
FelisepFelisep
12
12
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53374160%2fxml-etree-elementtree-parseerror-not-well-formed-invalid-token-line-1-colum%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown