codedecoder

breaking into the unknown…

escaping special characters & < etc in xml


Some characters, such as &, > and <, have special meaning in XML. If you generate an XML payload from user input for an API call, you must escape these characters, or the generated payload will be malformed and break your application.

Here is sample code from one of my libraries, which generates an XML payload from user input and submits it to another API:

 
require 'rest_client'
require 'base64'

module LoanPath
  class Application
    def apply_credit(detail)
      details=detail.symbolize_keys
      uri = "http://mybank.com/esb-sunpower/outbound/SubmitForCreditService"
      payload =<<-eos
      <soap11:Envelope xmlns:soap11="http://schemas.xmlsoap.org/soap/envelope/">
      <soap11:Body>
        <Request>
           <Payload>
               <customer>
                  <id>#{details[:CustomerID]}</id>
                  <firstName>#{details[:first_name]}</firstName>
                  <lastName>#{details[:last_name]}</lastName>
                  <street>#{details[:street]}</street>
                  <city>#{details[:city]}</city>
                  <state>#{details[:state]}</state>
               </customer>
          </Payload>
        </Request> 
      </soap11:Body>
     </soap11:Envelope>
     eos
     # RestClient timeouts are plain numbers of seconds
     rest_resource = RestClient::Resource.new(uri, {:user => "arun", :password => "secret",
                                                   :timeout => 90, :open_timeout => 90})
     rest_resource.post payload, :content_type => "application/xml"
    end
  end
end

I call the apply_credit method from my controller as below:

LoanPath::Application.new.apply_credit(params) # params contains the values the user entered for name, city, state, etc.

Now, if a user enters the street address as "23 & main street", the application crashes with an internal server error, because the generated XML is malformed.
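The failure can be reproduced locally, without calling the remote API, by handing the same kind of fragment to a parser. This is only a minimal sketch: it assumes the receiving service rejects malformed XML in roughly the way REXML's own parser does, and `well_formed?` is a hypothetical helper, not part of the library above.

```ruby
require 'rexml/document'

# Hypothetical helper: true if REXML can parse the string as XML.
def well_formed?(xml)
  REXML::Document.new(xml)
  true
rescue REXML::ParseException
  false
end

broken = "<street>23 & main street</street>"      # raw user input
fixed  = "<street>23 &amp; main street</street>"  # ampersand escaped

puts well_formed?(broken)
puts well_formed?(fixed)
```

The first fragment is rejected because a bare "&" starts an entity reference that never ends with ";".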

The solution is to replace each special character with its correct XML notation:

"&" with "&amp;"
"<" with "&lt;"
">" with "&gt;"
and so on.
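Done by hand, that replacement might look like the sketch below. The constant `XML_ESCAPES` and the helper `escape_xml_by_hand` are hypothetical names used only for illustration; they are not part of any library.

```ruby
# Hypothetical lookup table for the five predefined XML entities.
XML_ESCAPES = {
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
  "'" => '&apos;'
}.freeze

# Hypothetical helper: replaces every special character in a single pass,
# so the "&" of a freshly inserted "&lt;" is never escaped a second time.
def escape_xml_by_hand(text)
  text.to_s.gsub(/[&<>"']/, XML_ESCAPES)
end

puts escape_xml_by_hand("23 & main street")
puts escape_xml_by_hand("a < b > c")
```

A single `gsub` with a hash avoids the ordering problem of chained replacements, where escaping "&" last would corrupt entities that were just inserted.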

Well, you do not need to do this by hand. You just need to encode every user-supplied value with the REXML::Text class.

NOTE: Ruby ships with its own library, REXML, for working with XML. Look up its documentation whenever you need a solution for your XML needs.

So the correct code will look like this:

require 'rest_client'
require 'base64'
require 'rexml/document' # provides REXML::Text

module LoanPath
  class Application
    def apply_credit(detail)
      details=detail.symbolize_keys
      uri = "http://mybank.com/esb-sunpower/outbound/SubmitForCreditService"
      payload =<<-eos
      <soap11:Envelope xmlns:soap11="http://schemas.xmlsoap.org/soap/envelope/">
      <soap11:Body>
        <Request>
           <Payload>
               <customer>
                  <id>#{details[:CustomerID]}</id>
                  <firstName>#{REXML::Text.new(details[:first_name], false, nil, false)}</firstName>
                  <lastName>#{REXML::Text.new(details[:last_name], false, nil, false)}</lastName>
                  <street>#{REXML::Text.new(details[:street], false, nil, false)}</street>
                  <city>#{REXML::Text.new(details[:city], false, nil, false)}</city>
                  <state>#{REXML::Text.new(details[:state], false, nil, false)}</state>
               </customer>
          </Payload>
        </Request> 
      </soap11:Body>
     </soap11:Envelope>
     eos
     # RestClient timeouts are plain numbers of seconds
     rest_resource = RestClient::Resource.new(uri, {:user => "arun", :password => "secret",
                                                   :timeout => 90, :open_timeout => 90})
     rest_resource.post payload, :content_type => "application/xml"
    end
  end
end
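One way to check that escaping really preserves the user's value is to parse the generated fragment back and compare. The sketch below uses a shortened, hypothetical payload rather than the full SOAP envelope, and does not call the remote service.

```ruby
require 'rexml/document'

street  = "23 & main <street>"
escaped = REXML::Text.new(street, false, nil, false).to_s

# Shortened, hypothetical payload with only one field.
payload = "<customer><street>#{escaped}</street></customer>"

# Parsing succeeds, and the parser hands back the original value.
doc = REXML::Document.new(payload)
parsed_street = doc.root.elements['street'].text

puts payload
puts parsed_street
```

The receiver sees exactly what the user typed, because the parser converts the entities back to the original characters.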

So basically, you need to wrap the value of each node in the REXML call below. See the REXML documentation for details.

REXML::Text.new("your string", false, nil, false)

Below is a description of each parameter.

First Argument

"your string" is the first argument and holds the string you want to escape. Be careful: the string can be empty but not nil.

1.9.3p194 :022 > REXML::Text.new( nil, false, nil, false) # throws the error below for nil
NoMethodError: undefined method `type' for nil:NilClass

Also, the first argument must be a string. Do not pass an integer:

REXML::Text.new( 1234, false, nil, false)
NoMethodError: undefined method `type' for 1234:Fixnum
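Since form fields are often missing or numeric, it may be convenient to coerce every value with `to_s` before escaping. A minimal sketch, where `xml_escape` is a hypothetical helper name and treating nil as an empty string is an assumption about what the API expects:

```ruby
require 'rexml/document'

# Hypothetical helper: to_s turns nil into "" and 1234 into "1234",
# so REXML::Text always receives a String.
def xml_escape(value)
  REXML::Text.new(value.to_s, false, nil, false).to_s
end

puts xml_escape(nil).inspect
puts xml_escape(1234)
puts xml_escape("23 & main street")
```

In the payload, `#{xml_escape(details[:street])}` could then stand in for the longer `REXML::Text.new(...)` call.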

Second Argument

It is a boolean for whitespace handling. If true, whitespace is retained; if false, runs of repeated whitespace are collapsed to a single character.

1.9.3p194 :018 > REXML::Text.new( "&     arun", true, nil, false) # see that white space is retained in the string
=> "&     arun"
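The two settings are easier to compare side by side. Note that irb prints the object's inspected form, which shows the unescaped string; calling `to_s` gives the escaped text that ends up in the payload. A small sketch:

```ruby
require 'rexml/document'

input = "&     arun"

# Second argument true: the run of spaces is kept.
kept = REXML::Text.new(input, true, nil, false).to_s

# Second argument false: the run of spaces is squeezed to one space.
collapsed = REXML::Text.new(input, false, nil, false).to_s

puts kept.inspect
puts collapsed.inspect
```

For address fields like the ones above, collapsing is usually harmless, but pass true if the exact spacing of the input matters.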

Third Argument:

Keep it nil or pass the parent object. I always keep it nil and have never tried passing a parent object.

Fourth Argument:

raw (default nil): this argument can take three values.

If true, then the value used to construct this object is expected to contain no unescaped XML markup, and REXML will not change the text.

1.9.3p194 :011 > REXML::Text.new( "Iam <&", false, nil, true ) # setting the last argument to true does not escape < &, so it throws an error
RuntimeError: Illegal character '<' in raw string "Iam <&"

If this value is false, the string may contain any characters, and REXML will escape any and all defined entities whose values are contained in the text.

1.9.3p194 :011 > REXML::Text.new( "Iam <&", false, nil, false ).to_s  # setting the last argument to false escapes < & etc.
=> "Iam &lt;&amp;"

If this value is nil (the default), then the raw value of the parent will be used as the raw value for this node. If there is no raw value for the parent, and no value is supplied, the default is false.

NOTE: always set the fourth argument to false if you want the special characters escaped.
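The effect of the fourth argument shows most clearly on text that is already escaped. The sketch below follows the examples given in the REXML::Text documentation; the variable names are only illustrative.

```ruby
require 'rexml/document'

already_escaped = "&lt;&amp;"

# raw = true: REXML trusts the text and leaves it unchanged.
as_is = REXML::Text.new(already_escaped, false, nil, true).to_s

# raw = false: every "&" is escaped again, giving double-escaped text.
doubled = REXML::Text.new(already_escaped, false, nil, false).to_s

# raw = true with unescaped markup is rejected outright.
rejected = begin
  REXML::Text.new("<&", false, nil, true)
  false
rescue RuntimeError
  true
end

puts as_is
puts doubled
puts rejected
```

So false is the right choice for raw user input, while true only fits text that has already been escaped elsewhere.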

Author: arunyadav4u

Over 7 years of experience in web development with Ruby on Rails. Involved in all stages of the development lifecycle: requirement gathering, planning, coding, deployment and knowledge transfer. I can adapt to any situation, mix easily with people and can be a great friend.

