XML Files with a Byte Order Mark should work #165

Closed
wants to merge 2 commits into from
2 changes: 2 additions & 0 deletions .gitattributes
@@ -0,0 +1,2 @@
# Unit tests expect these files to have unix line endings, even on windows
tests/documents/* text eol=lf
12 changes: 10 additions & 2 deletions src/reader/parser/outside_tag.rs
@@ -2,6 +2,7 @@ use common::is_whitespace_char;

use reader::events::XmlEvent;
use reader::lexer::Token;
+use std::str;

use super::{
Result, PullParser, State, ClosingTagSubstate, OpeningTagSubstate,
@@ -16,8 +17,15 @@ impl PullParser {

Token::Whitespace(_) if self.depth() == 0 => None, // skip whitespace outside of the root element

-_ if t.contains_char_data() && self.depth() == 0 =>
-    Some(self_error!(self; "Unexpected characters outside the root element: {}", t)),
+_ if t.contains_char_data() && self.depth() == 0 => {
+    if let Token::Character(c) = t { // If the character is the UTF-8 byte order mark (BOM), ignore it
+        let bom = &[0xefu8, 0xbbu8, 0xbfu8];
+        if c.to_string() == str::from_utf8(bom).unwrap() {
+            return None;
+        }
+    }
+    Some(self_error!(self; "Unexpected characters outside the root element: {}", t))
+},

Token::Whitespace(_) if self.config.trim_whitespace && !self.buf_has_data() => None,
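
The three bytes `EF BB BF` checked in the hunk above decode to a single character, U+FEFF, so the same test can probably be written as a direct `char` comparison without building a `String`. The standalone sketch below is not part of this PR: `is_bom_via_bytes` mirrors the comparison in the patch, while `BOM`, `is_bom`, and `strip_bom` are hypothetical names introduced only for illustration.

```rust
use std::str;

/// U+FEFF, the code point that the UTF-8 byte order mark decodes to.
const BOM: char = '\u{feff}';

/// Mirrors the comparison in the patch: build a string from the three
/// BOM bytes and compare it with the character's string form.
fn is_bom_via_bytes(c: char) -> bool {
    let bom = [0xefu8, 0xbbu8, 0xbfu8];
    c.to_string() == str::from_utf8(&bom).unwrap()
}

/// Hypothetical alternative: compare the character directly,
/// which avoids allocating a String for every token.
fn is_bom(c: char) -> bool {
    c == BOM
}

/// Hypothetical helper: drop one leading BOM from already-decoded text
/// and return the input unchanged when no BOM is present.
fn strip_bom(text: &str) -> &str {
    text.strip_prefix(BOM).unwrap_or(text)
}
```

Both predicates should agree for every `char`. Note that the patched arm ignores the character wherever it appears at depth 0, whereas a helper like `strip_bom` would only remove it at the very start of the input; which behaviour is preferable is a design choice for the maintainers.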

2 changes: 1 addition & 1 deletion tests/documents/sample_4.xml
@@ -1,4 +1,4 @@
-<?xml version="1.0" encoding="utf-8"?>
+<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE data SYSTEM "abcd.dtd">
<p:data xmlns:p="urn:x" z=">">
<!-- abcd &lt; &gt; &amp; -->