Igor Kromin |   Consultant. Coder. Blogger. Tinkerer. Gamer.

I was surprised to find that StringEscapeUtils in the Apache Commons Lang library doesn't let you specify whether it should double-encode existing XML entities. After all, even PHP lets you do this. There is a very simple workaround, however, so read on.

In PHP, if you want to avoid double-encoding, you simply pass false as the fourth argument (double_encode) to the htmlentities() function, like so:
$strOrig = "&amp;";
$strEnc = htmlentities($strOrig, ENT_XML1, "UTF-8", false);

This leaves $strEnc as &amp; instead of &amp;amp;, i.e. the string is not double-encoded.

To achieve the same result with Java and Apache Commons Lang StringEscapeUtils all you have to do is:
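String strOrig = "&amp;";
// Unescape first, so an already encoded "&amp;" becomes a plain "&"
String strTemp = StringEscapeUtils.unescapeXml(strOrig);
// Then escape once, so strEnc is "&amp;" rather than "&amp;amp;"
String strEnc = StringEscapeUtils.escapeXml(strTemp);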

That's simple after you see it! Just unescape the string first, then escape it. That will take care of any already encoded entities and will avoid double encoding.
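
To see the unescape-then-escape idea in isolation, here is a minimal sketch that runs without Commons Lang on the classpath. The class EscapeOnce and its three methods are hypothetical stand-ins written for this illustration, not the library's implementation: they only handle the five predefined XML entities (&amp;, &lt;, &gt;, &quot;, &apos;) and ignore numeric character references.

```java
class EscapeOnce {
    // Hypothetical stand-in for StringEscapeUtils.unescapeXml.
    // Assumption: only the five predefined XML entities are handled.
    static String unescapeXml(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&"); // last, so "&amp;lt;" becomes "&lt;" and not "<"
    }

    // Hypothetical stand-in for StringEscapeUtils.escapeXml.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;") // first, so ampersands added below are left alone
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Unescape, then escape: existing entities are not encoded a second time.
    static String escapeOnce(String s) {
        return escapeXml(unescapeXml(s));
    }

    public static void main(String[] args) {
        System.out.println(escapeOnce("&amp;"));              // &amp;
        System.out.println(escapeOnce("a & b"));              // a &amp; b
        System.out.println(escapeOnce("Tom &amp; Jerry <3")); // Tom &amp; Jerry &lt;3
    }
}
```

Running the result through escapeOnce again gives the same string, which is the whole point. One thing to keep in mind: raw input that legitimately contains the characters &lt; (say, a sentence about XML) is treated as already escaped, so the trick is only safe when you don't need to tell those two cases apart.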

