This repository was archived by the owner on Jan 10, 2023. It is now read-only.

Generated XML may contain UTF-8 U+001B, which is an invalid XML character #85

Closed
marcw opened this issue Nov 22, 2013 · 3 comments

Comments

marcw commented Nov 22, 2013

If the source code contains the ESC character (U+001B), the generated XML will break Jenkins's XML parser, because XML files cannot contain this character.

14:54:56 ERROR: Publisher hudson.plugins.violations.ViolationsPublisher aborted due to exception
14:54:56 hudson.util.IOException2: Cannot parse pmd-cpd.xml
14:54:56    at hudson.plugins.violations.parse.ViolationsDOMParser.parse(ViolationsDOMParser.java:68)
14:54:56    at hudson.plugins.violations.ViolationsCollector.doType(ViolationsCollector.java:187)
14:54:56    at hudson.plugins.violations.ViolationsCollector.invoke(ViolationsCollector.java:114)
14:54:56    at hudson.plugins.violations.ViolationsCollector.invoke(ViolationsCollector.java:25)
14:54:56    at hudson.FilePath.act(FilePath.java:917)
14:54:56    at hudson.FilePath.act(FilePath.java:890)
14:54:56    at hudson.plugins.violations.ViolationsPublisher.perform(ViolationsPublisher.java:74)
14:54:56    at hudson.tasks.BuildStepMonitor$3.perform(BuildStepMonitor.java:45)
14:54:56    at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:781)
14:54:56    at hudson.model.AbstractBuild$AbstractBuildExecution.performAllBuildSteps(AbstractBuild.java:753)
14:54:56    at hudson.model.Build$BuildExecution.post2(Build.java:183)
14:54:56    at hudson.model.AbstractBuild$AbstractBuildExecution.post(AbstractBuild.java:706)
14:54:56    at hudson.model.Run.execute(Run.java:1704)
14:54:56    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
14:54:56    at hudson.model.ResourceController.execute(ResourceController.java:88)
14:54:56    at hudson.model.Executor.run(Executor.java:230)
14:54:56 Caused by: org.xml.sax.SAXParseException; systemId: file:pmd-cpd.xml; lineNumber: 89783; columnNumber: 30; An invalid XML character (Unicode: 0x1b) was found in the element content of the document.
14:54:56    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:253)
14:54:56    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:288)
14:54:56    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
14:54:56    at hudson.plugins.violations.parse.ViolationsDOMParser.parse(ViolationsDOMParser.java:57)
14:54:56    ... 15 more

Related Stack Overflow answer about this: http://stackoverflow.com/a/10121993/684801
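
For anyone who wants to reproduce the parser failure outside Jenkins, here is a minimal sketch in Python. It is an assumption-laden illustration, not the actual phpcpd output: the `<codefragment>` element name and the sample strings are made up for the demo, and Python's expat-based parser stands in for the Xerces parser shown in the trace above.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(text: str) -> bool:
    """Return True if `text` is accepted by a conforming XML 1.0 parser."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments; only the second one contains a raw ESC (U+001B).
clean = "<codefragment>echo 'hello';</codefragment>"
dirty = "<codefragment>echo '\x1b[00m';</codefragment>"

print(parses_as_xml(clean))
print(parses_as_xml(dirty))
```

The second document is rejected because ESC is outside the set of characters XML 1.0 allows, regardless of how the file is encoded.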

@clphillips

I'm having the exact same issue. It's the result of phpseclib's doc comment which contains the ESC ASCII value.

The line starts:

     * "[x1B][00m", you're seeing ANSI escape codes.  According to

Note: I've replaced the ESC value with "[x1B]" in the string above, as GitHub doesn't allow that character in comments.

Per the XML 1.0 spec, control characters below 0x20 are invalid, with the exception of tab (0x09), line feed (0x0A), and carriage return (0x0D). phpcpd should just strip them. It's not the doc comment's fault; it had no idea it was going to be rendered in an XML document.

aboks added a commit to aboks/phpcpd that referenced this issue Feb 21, 2014
The PMD logger now replaces all characters that are invalid in XML
by the Unicode replacement character (U+FFFD).
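
As a rough illustration of the approach the commit message describes, the sketch below replaces every character outside the XML 1.0 `Char` production with U+FFFD before the text is written into an XML document. It is written in Python rather than PHP and is not the code from the referenced commit; the function name `replace_invalid_xml_chars` and the `<codefragment>` wrapper are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Complement of the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    r"[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def replace_invalid_xml_chars(text: str, replacement: str = "\uFFFD") -> str:
    """Replace characters that XML 1.0 forbids (hypothetical helper)."""
    return _INVALID_XML_CHARS.sub(replacement, text)


# The doc comment line quoted earlier in this thread, with a real ESC restored.
line = ' * "\x1b[00m", you\'re seeing ANSI escape codes.'
cleaned = replace_invalid_xml_chars(line)

# escape() handles &, <, > so the cleaned text is safe as element content.
doc = "<codefragment>" + escape(cleaned) + "</codefragment>"
root = ET.fromstring(doc)  # parses without raising ParseError
```

Passing `replacement=""` strips the offending characters instead of substituting them; substitution keeps column offsets in the duplicated fragment stable, which may or may not matter for a given consumer.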
@clphillips

Can someone merge @aboks's fix and push this through? I would really like to see this resolved.

marcw commented Jan 24, 2017

Thank you!
