Monday, 29 August 2011

Ruby ASCII-8BIT => UTF-8

I tried to parse an XML file with Ruby and ran into problems converting the data from ASCII-8BIT (which is really just binary) to UTF-8.

This was the error message:

/Applications/TextMate.app/Contents/SharedSupport/Bundles/Ruby.tmbundle/Support/RubyMate/catch_exception.rb:15:in `sub': incompatible character encodings: ASCII-8BIT and UTF-8 (Encoding::CompatibilityError)
from /Applications/TextMate.app/Contents/SharedSupport/Bundles/Ruby.tmbundle/Support/RubyMate/catch_exception.rb:15:in `block in '
/Users/x42/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/rexml/parsers/treeparser.rb:95:in `rescue in parse': # (REXML::ParseException)
/Users/x42/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/rexml/source.rb:212:in `match'
/Users/x42/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/rexml/source.rb:212:in `match'
/Users/x42/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/rexml/parsers/baseparser.rb:369:in `pull'
/Users/x42/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/rexml/parsers/treeparser.rb:22:in `parse'
/Users/x42/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/rexml/document.rb:230:in `build'
/Users/x42/.rvm/rubies/ruby-1.9.2-p180/lib/ruby/1.9.1/rexml/document.rb:43:in `initialize'
/Volumes/1500RAIDOPEN/x42/Documents/Projekte/weather_nagios_plugin/google_weather_nagios.rb:17:in `new'
/Volumes/1500RAIDOPEN/x42/Documents/Projekte/weather_nagios_plugin/google_weather_nagios.rb:17:in `'
...
Exception parsing
Line: 1
Position: 1396
Last 80 unconsumed characters:
>
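The first error in the trace can be reproduced without any network access. This is only a minimal sketch, assuming the response body contained ISO-8859-1 bytes; the sample strings are made up for illustration and are not taken from the real response:

```ruby
# Hypothetical sample: "Düren" as ISO-8859-1 bytes, tagged as binary
# (ASCII-8BIT), standing in for the HTTP response body.
binary = "D\xFCren".dup.force_encoding(Encoding::BINARY)

# A UTF-8 string that also contains a non-ASCII character.
utf8 = "Wetter f\u00FCr "

begin
  # Both strings hold non-ASCII data in different encodings,
  # so Ruby refuses to combine them.
  utf8 + binary
rescue Encoding::CompatibilityError => e
  error_message = e.message
end

puts error_message
```

As long as both strings contain only ASCII characters, the concatenation succeeds, which is why this kind of error tends to appear only once an umlaut shows up in the data.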


Here is my solution:

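require 'net/http'
require 'rexml/document'

# The response body arrives tagged as ASCII-8BIT (binary)
xml_data = Net::HTTP.get_response(URI.parse('http://www.google.com/ig/api?weather=Dueren')).body
p xml_data.encoding
# Interpret the bytes as ISO-8859-1 and transcode them to UTF-8
xml_data = xml_data.encode("utf-8", "iso-8859-1")
p xml_data.encoding
puts xml_data
doc = REXML::Document.new(xml_data)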
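Why this works: the second argument to `encode` tells Ruby which encoding the bytes are really in, so they get transcoded rather than just relabelled. The following offline sketch illustrates the difference; the XML snippet is a made-up stand-in for the real response, and ISO-8859-1 as the source encoding is an assumption:

```ruby
require 'rexml/document'

# Hypothetical stand-in for the response body: ISO-8859-1 bytes tagged as binary.
raw = "<weather city=\"D\xFCren\"/>".dup.force_encoding(Encoding::BINARY)

# Relabelling the bytes as UTF-8 converts nothing; 0xFC is not valid UTF-8.
relabelled = raw.dup.force_encoding("utf-8")
puts relabelled.valid_encoding?

# Transcoding from ISO-8859-1 rewrites the bytes into valid UTF-8.
converted = raw.encode("utf-8", "iso-8859-1")
puts converted.valid_encoding?

doc = REXML::Document.new(converted)
city = doc.root.attributes["city"]
puts city
```

If the server actually sends a different encoding, the second argument has to change accordingly; guessing wrong still yields valid UTF-8, just with the wrong characters.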

1 comment:

  1. Thanks! I had a similar problem with Base64-encoded HTML that was packed into JSON I received from a Joomla CMS (PHP). I got it working with:
    str.encode("utf-8", "iso-8859-1")

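The situation described in the comment might look roughly like the sketch below. The payload, the `"html"` key, and the assumption that the HTML was ISO-8859-1 before being Base64-encoded are all hypothetical; the comment does not give these details:

```ruby
require 'base64'
require 'json'

# Hypothetical payload: ISO-8859-1 HTML, Base64-encoded and wrapped in JSON.
html_latin1 = "<p>D\xFCren</p>".dup.force_encoding("iso-8859-1")
payload = JSON.generate("html" => Base64.strict_encode64(html_latin1))

# Base64.decode64 returns a binary (ASCII-8BIT) string.
str = Base64.decode64(JSON.parse(payload)["html"])
puts str.encoding

# Same fix as in the post: interpret the bytes as ISO-8859-1, convert to UTF-8.
html = str.encode("utf-8", "iso-8859-1")
puts html
```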