root/trunk/lib/protocols/httpclient.rb

Revision 788, 8.5 kB (checked in by raggi, 8 months ago)

Merge of branches/raggi
Most notable work and patches by Aman Gupta, Roger Pack, and James Tucker.
Patches / Tickets also submitted by: Jeremy Evans, aanand, darix, mmmurf,
danielaquino, macournoyer.

  • Moved docs into docs/ dir
  • Major refactor of rakefile, added generic rakefile helpers in tasks
  • Added example CPP build rakefile in tasks/cpp.rake
  • Moved rake tests out to tasks/tests.rake
  • Added svn ignores where appropriate
  • Fixed jruby build on older java platforms
  • Gem now builds from Rakefile rather than directly via extconf
  • Gem unified for jruby, C++ and pure ruby.
  • Correction for pure C++ build, removing ruby dependency
  • Fix for CYGWIN builds on ipv6
  • Major refactor for extconf.rb
  • Working mingw builds
  • extconf optionally uses pkg_config over manual configuration
  • extconf builds for 1.9 on any system that has 1.9
  • extconf no longer links pthread explicitly
  • looks for kqueue on all *nix systems
  • better error output on std::runtime_error, now says where it came from
  • Fixed some tests on jruby
  • Added test for general send_data flaw, required for a bugfix in jruby build
  • Added timeout to epoll tests
  • Added fixes for java reactor ruby api
  • Small addition of some docs in httpclient.rb and httpcli2.rb
  • Some refactor and fixes in smtpserver.rb
  • Added parentheses where possible to avoid excess ruby warnings
  • Refactor of $eventmachine_library logic for accuracy and maintenance, jruby
  • EM::start_server now supports unix sockets
  • EM::connect now supports unix sockets
  • EM::defer @threadqueue now handled more gracefully
  • Added better messages on exceptions raised
  • Fix edge case in timer fires
  • Explicitly require buftok.rb
  • Add protocols to autoload, rather than require them all immediately
  • Fix a bug in pr_eventmachine for outbound_q
  • Refactors to take some of the use of defer out of tests.
  • Fixes in EM.defer under start/stop conditions. Reduced scope of threads.
  • Property svn:keywords set to Id
# $Id$
#
# Author:: Francis Cianfrocca (gmail: blackhedd)
# Homepage::  http://rubyeventmachine.com
# Date:: 16 July 2006
#
# See EventMachine and EventMachine::Connection for documentation and
# usage examples.
#
#----------------------------------------------------------------------------
#
# Copyright (C) 2006-07 by Francis Cianfrocca. All Rights Reserved.
# Gmail: blackhedd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of either: 1) the GNU General Public License
# as published by the Free Software Foundation; either version 2 of the
# License, or (at your option) any later version; or 2) Ruby's License.
#
# See the file COPYING for complete licensing information.
#
#---------------------------------------------------------------------------
#
#


module EventMachine
module Protocols

class HttpClient < Connection
  include EventMachine::Deferrable


  MaxPostContentLength = 20 * 1024 * 1024

  # USAGE SAMPLE:
  #
  # EventMachine.run {
  #   http = EventMachine::Protocols::HttpClient.request(
  #     :host => server,
  #     :port => 80,
  #     :request => "/index.html",
  #     :query_string => "parm1=value1&parm2=value2"
  #   )
  #   http.callback {|response|
  #     puts response[:status]
  #     puts response[:headers]
  #     puts response[:content]
  #   }
  # }
  #

  # TODO:
  # Add streaming so we can support enormous POSTs. Current max is 20 MB.
  # Timeout for connections that run too long or hang somewhere in the middle.
  # Persistent connections (HTTP/1.1), may need an associated delegate object.
  # DNS: Some way to cache DNS lookups for hostnames we connect to. Ruby's
  # DNS lookups are unbelievably slow.
  # HEAD requests.
  # Chunked transfer encoding.
  # Convenience methods for requests. get, post, url, etc.
  # SSL.
  # Handle status codes like 304, 100, etc.
  # Refactor this code so that protocol errors all get handled one way (an exception?),
  # instead of sprinkling set_deferred_status :failed calls everywhere.

  # === Arg list
  # :host => 'ip/dns', :port => fixnum, :verb => 'GET', :request => 'path',
  # :basic_auth => {:username => '', :password => ''}, :content => 'content',
  # :contenttype => 'text/plain', :query_string => '', :host_header => '',
  # :cookie => ''
  #
  # :method is accepted as an alternative to :verb, and :version overrides
  # the HTTP version sent in the request line (default "1.1").

  def self.request( args = {} )
    args[:port] ||= 80
    EventMachine.connect( args[:host], args[:port], self ) {|c|
      # According to the docs, we will get here AFTER post_init is called.
      c.instance_eval {@args = args}
    }
  end

  def post_init
    @start_time = Time.now
    @data = ""
    @read_state = :base
  end

  # We send the request when we get a connection.
  # AND, we set an instance variable to indicate we passed through here.
  # That allows #unbind to know whether there was a successful connection.
  # NB: This naive technique won't work when we have to support multiple
  # requests on a single connection.
  def connection_completed
    @connected = true
    send_request @args
  end

  def send_request args
    args[:verb] ||= args[:method] # Support :method as an alternative to :verb.
    args[:verb] ||= :get # IS THIS A GOOD IDEA, to default to GET if nothing was specified?

    verb = args[:verb].to_s.upcase
    unless ["GET", "POST", "PUT", "DELETE", "HEAD"].include?(verb)
      set_deferred_status :failed, {:status => 0} # TODO, not signalling the error type
      return # NOTE THE EARLY RETURN, we're not sending any data.
    end

    request = args[:request] || "/"
    unless request[0,1] == "/"
      request = "/" + request
    end

    qs = args[:query_string] || ""
    if qs.length > 0 and qs[0,1] != '?'
      qs = "?" + qs
    end

    version = args[:version] || "1.1"

    # Allow an override for the host header if it's not the connect-string.
    host = args[:host_header] || args[:host] || "_"
    # For now, ALWAYS tuck in the port string, although we may want to omit it if it's the default.
    port = args[:port]

    # POST items.
    postcontenttype = args[:contenttype] || "application/octet-stream"
    postcontent = args[:content] || ""
    raise "oversized content in HTTP POST" if postcontent.length > MaxPostContentLength

    # ESSENTIAL for the request's line-endings to be CRLF, not LF. Some servers misbehave otherwise.
    # TODO: We ASSUME the caller wants to send a 1.1 request. May not be a good assumption.
    req = [
      "#{verb} #{request}#{qs} HTTP/#{version}",
      "Host: #{host}:#{port}",
      "User-agent: Ruby EventMachine",
    ]

    if verb == "POST" || verb == "PUT"
      req << "Content-type: #{postcontenttype}"
      req << "Content-length: #{postcontent.length}"
    end

    # TODO, this cookie handler assumes it's getting a single, semicolon-delimited string.
    # Eventually we will want to deal intelligently with arrays and hashes.
    if args[:cookie]
      req << "Cookie: #{args[:cookie]}"
    end

    # Basic-auth stanza contributed by Mike Murphy.
    if args[:basic_auth]
      basic_auth_string = ["#{args[:basic_auth][:username]}:#{args[:basic_auth][:password]}"].pack('m').strip
      req << "Authorization: Basic #{basic_auth_string}"
    end

    req << ""
    reqstring = req.map {|l| "#{l}\r\n"}.join
    send_data reqstring

    if verb == "POST" || verb == "PUT"
      send_data postcontent
    end
  end


  def receive_data data
    while data and data.length > 0
      case @read_state
      when :base
        # Perform any per-request initialization here and don't consume any data.
        @data = ""
        @headers = []
        @content_length = nil # not zero
        @content = ""
        @status = nil
        @read_state = :header
        @connection_close = nil
      when :header
        # Prepend any partial header line buffered from an earlier chunk, so a
        # header split across two reads is parsed as one line.
        unless @data.empty?
          data = @data + data
          @data = ""
        end
        ary = data.split( /\r?\n/m, 2 )
        if ary.length == 2
          data = ary.last
          if ary.first == ""
            if (@content_length and @content_length > 0) || @connection_close
              @read_state = :content
            else
              dispatch_response
              @read_state = :base
            end
          else
            @headers << ary.first
            if @headers.length == 1
              parse_response_line
            elsif ary.first =~ /\Acontent-length:\s*/i
              # Only take the FIRST content-length header that appears,
              # which we can distinguish because @content_length is nil.
              # TODO, it's actually a fatal error if there is more than one
              # content-length header, because the caller is presumptively
              # a bad guy. (There is an exploit that depends on multiple
              # content-length headers.)
              @content_length ||= $'.to_i
            elsif ary.first =~ /\Aconnection:\s*close/i
              @connection_close = true
            end
          end
        else
          # No line terminator yet: hold the partial line until more data arrives.
          @data << data
          data = ""
        end
      when :content
        # If there was no content-length header, we have to wait until the connection
        # closes. Everything we get until that point is content.
        # TODO: Must impose a content-size limit, and also must implement chunking.
        # Also, must support either temporary files for large content, or calling
        # a content-consumer block supplied by the user.
        if @content_length
          bytes_needed = @content_length - @content.length
          @content += data[0, bytes_needed]
          data = data[bytes_needed..-1] || ""
          if @content_length == @content.length
            dispatch_response
            @read_state = :base
          end
        else
          @content << data
          data = ""
        end
      end
    end
  end


  # We get called here when we have received an HTTP response line.
  # It's an opportunity to throw an exception or trigger other exceptional
  # handling.
  def parse_response_line
    if @headers.first =~ /\AHTTP\/1\.[01] ([\d]{3})/
      @status = $1.to_i
    else
      set_deferred_status :failed, {
        :status => 0 # crappy way of signifying an unrecognized response. TODO, find a better way to do this.
      }
      close_connection
    end
  end
  private :parse_response_line

  def dispatch_response
    @read_state = :base
    set_deferred_status :succeeded, {
      :content => @content,
      :headers => @headers,
      :status => @status
    }
    # TODO, we close the connection for now, but this is wrong for persistent clients.
    close_connection
  end

  def unbind
    if !@connected
      set_deferred_status :failed, {:status => 0} # YECCCCH. Find a better way to signal no-connect/network error.
    elsif (@read_state == :content and @content_length == nil)
      dispatch_response
    end
  end
end


end
end
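The header/content state handling in receive_data can be tried outside a reactor. The following is a minimal sketch, not part of EventMachine: the class name ResponseParserSketch and its feed/done? methods are hypothetical, it handles only responses that carry a Content-Length header (no read-until-close, no chunked encoding), and it assumes ASCII content so that String#length equals the byte count. It shows how a partial header line is held back until its line terminator arrives in a later chunk.

```ruby
# Hypothetical stand-alone parser (an illustration, not EventMachine API).
# It mirrors the :header and :content states of HttpClient#receive_data.
class ResponseParserSketch
  attr_reader :status, :headers, :content

  def initialize
    @buffer = String.new   # partial header line carried between chunks
    @headers = []
    @content = String.new
    @content_length = nil  # nil means "no Content-Length header seen"
    @status = nil
    @state = :header
  end

  def done?
    @state == :done
  end

  # Feed one chunk, exactly as a network read might deliver it.
  def feed(chunk)
    data = @buffer + chunk
    @buffer = String.new
    until data.empty? || done?
      if @state == :header
        line, rest = data.split(/\r?\n/, 2)
        if rest.nil?
          @buffer = data   # no line terminator yet; wait for more bytes
          return self
        end
        data = rest
        handle_header_line(line)
      else
        needed = @content_length - @content.length
        @content << data[0, needed]
        data = data[needed..-1] || ""
        @state = :done if @content.length == @content_length
      end
    end
    self
  end

  private

  def handle_header_line(line)
    if line.empty?
      # A blank line ends the headers; a body follows only if a length was given.
      @state = (@content_length && @content_length > 0) ? :content : :done
    else
      @headers << line
      if @headers.length == 1
        @status = $1.to_i if line =~ /\AHTTP\/1\.[01] (\d{3})/
      elsif line =~ /\Acontent-length:\s*(\d+)/i
        @content_length ||= $1.to_i  # keep only the first Content-Length
      end
    end
  end
end

# Usage: the chunk boundaries deliberately fall inside header lines and
# between the CR and LF of a line ending.
parser = ResponseParserSketch.new
["HTTP/1.1 200 O", "K\r\nContent-Le", "ngth: 5\r", "\n\r\nhel", "lo"].each {|c| parser.feed(c)}
puts parser.status
puts parser.content
```

Concatenating the held-back bytes with the next chunk before splitting keeps a status line or header intact even when a read ends in the middle of it.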