Martin Gross

Solution seeker, building things, often in software.

I had problems parsing a webpage with Nokogiri on JRuby 1.5.2. The cause was proxy authentication: open-uri did not pass the proxy user name and password on to Net::HTTP.

Here is how I solved this issue:

Set the HTTP_PROXY environment variable to your proxy URL, including user, password, host, and port, e.g.

HTTP_PROXY=http://prxuser:prxpasswd@prx-server:8080
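Before touching open-uri, it can help to confirm that the value splits into the parts you expect. The following is a minimal sketch using Ruby's standard URI library; the helper name proxy_parts is made up for this example, and the URL is the sample value from above rather than a real proxy.

```ruby
require 'uri'

# Hypothetical helper (not part of open-uri): split a proxy URL into its parts.
def proxy_parts(proxy_url)
  uri = URI.parse(proxy_url)
  { :host => uri.host, :port => uri.port, :user => uri.user, :password => uri.password }
end

# Sample value from this post; in practice you would pass ENV['HTTP_PROXY'].
parts = proxy_parts('http://prxuser:prxpasswd@prx-server:8080')
puts parts.inspect
```

If user or password comes back as nil, the credentials are missing from the URL and the change to open-uri described here will have nothing to send.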

Then, in lib\ruby\1.8\open-uri.rb (line 216), change:

klass = Net::HTTP::Proxy(proxy.host, proxy.port)

to:

klass = Net::HTTP::Proxy(proxy.host, proxy.port, proxy.user, proxy.password)
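If editing a file inside the Ruby installation is not an option, the same four-argument call can be made from your own code. This is only a sketch of that idea, not what open-uri does internally: the helper name proxied_http_class is hypothetical, the URL is the sample value from above, and building the class does not open any network connection.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build a Net::HTTP subclass that carries the proxy
# host, port, user, and password taken from a proxy URL.
def proxied_http_class(proxy_url)
  proxy = URI.parse(proxy_url)
  Net::HTTP::Proxy(proxy.host, proxy.port, proxy.user, proxy.password)
end

# Sample value from this post; in practice you would pass ENV['HTTP_PROXY'].
klass = proxied_http_class('http://prxuser:prxpasswd@prx-server:8080')
puts klass.proxy_address
puts klass.proxy_user
```

Requests started from the returned class, for example with klass.start('www.google.com', 80), should then go through the proxy with those credentials.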

And test it with:

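require 'rubygems'
require 'nokogiri'
require 'open-uri'

# open-uri takes the proxy from HTTP_PROXY; with the change above it also sends the credentials
html_doc = Nokogiri::HTML(open("http://www.google.com/search?hl=de&q=software"))

# Print the text of every link that is a direct child of an <h3>
html_doc.css('h3>a').each do |node|
  puts node.text
end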