Convert String to XML
I'm using a web application from which I can send HTTP POST requests. I have configured this part, and the odd thing is that I'm getting a string back instead of XML. When I send a request I use the following content type: application/x-www-form-urlencoded; charset=UTF-8
The response looks like this:
<?xml version="1.0" encoding="utf-8"?>
<string xmlns="http://URL"><answer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><request>&lt;message&gt;&lt;authentication&gt;&lt;name&gt;user&lt;/name&gt;&lt;pass&gt;XXXX&lt;/pass&gt;&lt;sort&gt;SB-RDW-BASIC&lt;/sort&gt;&lt;/authentication&gt;&lt;parameters&gt;&lt;kent&gt;93NRSN&lt;/kent&gt;&lt;/parameters&gt;&lt;/message&gt;</request><result><code>00</code><description>Ok</description>
.....and so on
I have put the whole response in a text field so I can process it. I replaced all the weird characters by doing:
var test = "reference to the response text field";
test = test.replace(/&#39;/g, "'");
test = test.replace(/&quot;/g, '"');
test = test.replace(/&lt;/g, '<');
test = test.replace(/&amp;/g, '&');
test = test.replace(/&gt;/g, '>');
test = test.replace(/&lt;/g, '<');
In my web application there is a method to parse XML and also a function to evaluate XPath. But I couldn't get it to work, because:
- When trying to work with XPath I get: .SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
- When I open the field (after the character replacements) in the browser, I see the XML starting with the following:
<?xml version=\u00221.0\u0022 encoding=\u0022utf-8\u0022?>\n<string xmlns=\u0022URL\u0022>
and so on...
As you can see, there is also a \u0022 character (single quotes?), and I don't know how to handle these.
What is the best way to parse the XML? Within the application there is an XMLParse function, but when I pass it the string (after the replacements) it won't parse. I think the string is messed up somehow. Does anybody have an idea?
First off, you don't appear even to have well-formed, let alone valid, XML markup there.
Your other problem appears to be that you have character encodings set improperly.
You don't describe clearly how you are passing the XML content from client to server or from server to client.
The server should be transmitting well-formed and VALID XML to you. Those escape sequences (\u0022 is the double quote character) are what you would see inside a JavaScript or JSON string literal, but they are not for XML.
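Until the server side is fixed, one possible stopgap is to undo those escapes on the client. This is only a sketch, assuming the text field literally contains backslash-u sequences and backslash-n pairs as shown in the question; decodeUnicodeEscapes is a made-up helper name, not something your application provides.

```javascript
// Hypothetical helper: converts literal "\uXXXX" sequences (and "\n" pairs)
// back into the characters they stand for.
function decodeUnicodeEscapes(text) {
  return text
    // "\u0022" (six literal characters) becomes the single character U+0022
    .replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
      String.fromCharCode(parseInt(hex, 16)))
    // a literal backslash followed by "n" becomes a real line break
    .replace(/\\n/g, "\n");
}

// Sample modelled on the start of the text shown in the question.
const escaped =
  '<?xml version=\\u00221.0\\u0022 encoding=\\u0022utf-8\\u0022?>\\n<string>';
const decoded = decodeUnicodeEscapes(escaped);
console.log(decoded);
```

If the whole field turns out to be a JSON string literal (wrapped in double quotes), JSON.parse would probably be the more robust way to decode it.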
In fact, I make it a point NEVER to use the string/text such as the .responseText property of the XMLHttpRequest object when I know the object sent is valid XML. I ONLY use the .responseXML property, because if that property is NOT presented to me (that is, its value is null or undefined), the response did NOT parse as XML (it is not well-formed, or it was not served with an XML content type). If you get the file without using the XMLHttpRequest object, you can check whether it parses using the DOMParser constructor: if it does not, it is NOT well-formed XML.
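The DOMParser check could look roughly like this. It is a sketch, assuming a browser environment: there, parseFromString does not throw on bad input but returns a document containing a parsererror element. parseXmlOrNull and its optional second argument are names made up for this example.

```javascript
// Hypothetical helper: returns the parsed document, or null if the text is
// not well-formed XML. The parser constructor can be passed in; by default
// the browser's DOMParser is used.
function parseXmlOrNull(xmlText, ParserCtor) {
  const Parser = ParserCtor || globalThis.DOMParser;
  if (typeof Parser !== "function") {
    throw new Error("No DOMParser available in this environment");
  }
  const doc = new Parser().parseFromString(xmlText, "application/xml");
  // Browsers report failures through a <parsererror> element in the result.
  const errors = doc.getElementsByTagName("parsererror");
  return errors.length > 0 ? null : doc;
}

// Usage in a browser (skipped where DOMParser does not exist):
if (typeof DOMParser !== "undefined") {
  const doc = parseXmlOrNull("<result><code>00</code></result>");
  console.log(doc ? "well-formed" : "NOT well-formed");
}
```

Note that this only tells you whether the markup is well-formed; it does not validate it against a DTD or schema.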
You have an error in coding of your web application if it is generating encoded bytes in your XML markup. You should address that first.
You should also review the code generating all your XML markup and make sure it validates. You will have to learn how to write DTDs at a minimum. Don't worry, they're easy to learn, and if you don't want to learn, you can buy (or probably get for free) programs which will look at your WELL-FORMED but NOT valid XML markup and generate a DTD for that particular markup (just search "DTD generator"). I recommend that you learn how to write XML Schema instead of DTDs if you want to be even more careful about the validity of your XML markup. (There are probably XML Schema generators you can buy or get for free too, but using these generators might take away the fun of learning.)
On your HTTP client, for testing/debugging purposes, you can use the native developer tools in Chrome and the latest IE version, but I still use the Firebug extension in Firefox. Under the Net tab, you can see whether XML responses were parsed successfully, since it shows an XML tab for HTTP responses that were.
Post the relevant parts of your code so we can see why you are encoding quote characters and getting \uXXXX in the output. Make sure that your HTTP responses have the proper Content-Type header set. Make sure that client encodings (<meta>, <form> elements) are set properly. If you are using the XMLHttpRequest object, you should set the "Content-Type" request header too. Make sure that when you save PHP and HTML files, it is with the proper encoding. I recommend UTF-8 without the byte order mark (BOM) in everything you do. Most editor applications (I use RJ TextEd!) can easily save in UTF-8 without BOM.
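On the BOM point: a stray BOM or any other character in front of the XML declaration is a common cause of "Content is not allowed in prolog". Here is a small sketch for trimming it before parsing, assuming the rest of the text is already clean XML; stripLeadingJunk is a made-up name.

```javascript
// Hypothetical helper: removes a byte order mark (U+FEFF) and any
// whitespace that precedes the first character of the XML text.
function stripLeadingJunk(text) {
  return text.replace(/^[\uFEFF\s]+/, "");
}

// A BOM plus a line break in front of the declaration is dropped.
const cleaned = stripLeadingJunk('\uFEFF\n<?xml version="1.0" encoding="utf-8"?>');
console.log(cleaned.charAt(0)); // first character should now be "<"
```

This only hides the symptom on the client; the real fix is still to have the server send the document without the extra characters.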
Give us a look at how you are passing data back and forth between server and client and back.
Thanks for your clear and extensive reply. I will try and debug the stuff. Thanks a lot!
PS: Greetings to Ankara!