A few months ago I published a post about writing a simple web server in Ruby using Ractors. It took only 20 lines of code and could use multiple CPUs without being serialized by the Global VM Lock (GVL, often called the GIL). That was a good preview of what the Ractor primitive is going to provide.
Since then, Ruby 3.0 has been released and the Ractor implementation has matured. In this post, we'll make our Ractor-based web server do more.
By the end of the post, you'll know the constraints of Ractors and be familiar with the three PRs to MRI that I had to open to make it work.
Getting started
Here's what we ended up with in the previous post:
require 'socket'

pipe = Ractor.new do
  loop do
    Ractor.yield(Ractor.receive, move: true)
  end
end

CPU_COUNT = 4
workers = CPU_COUNT.times.map do
  Ractor.new(pipe) do |pipe|
    loop do
      s = pipe.take
      puts "taken from pipe by #{Ractor.current}"
      data = s.recv(1024)
      puts data.inspect
      s.print "HTTP/1.1 200\r\n"
      s.print "Content-Type: text/html\r\n"
      s.print "\r\n"
      s.print "Hello world!\n"
      s.close
    end
  end
end

listener = Ractor.new(pipe) do |pipe|
  server = TCPServer.new(8080)
  loop do
    conn, _ = server.accept
    pipe.send(conn, move: true)
  end
end

loop do
  Ractor.select(listener, *workers)
  # if the line above returned, one of the workers or the listener has crashed
end
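The `move: true` option used by the pipe and the listener above transfers ownership of an object to the receiving Ractor instead of copying it. Here is a minimal sketch of that behavior using a plain string rather than a socket; the names `worker` and `message` are made up for illustration.

```ruby
# A worker Ractor that upcases whatever message it receives.
worker = Ractor.new do
  Ractor.receive.upcase
end

message = +"hello"                # unfrozen string, so it is not shareable
worker.send(message, move: true)  # ownership moves to the worker

# The sender's reference is unusable after the move.
moved_error =
  begin
    message.upcase
    nil
  rescue Ractor::MovedError => e
    e
  end

# Ractor#take is the Ruby 3.0 API used in this post; newer Rubies expose
# the block's return value through Ractor#value instead.
result = worker.respond_to?(:value) ? worker.value : worker.take

puts result
puts moved_error.class
```

This is why the listener never touches `conn` again after sending it: once moved, the connection belongs to whichever Ractor receives it.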
Our web server does not parse the incoming request and always responds with the hardcoded Hello world! string. Let's make it more dynamic.
We'll leverage WEBrick, a simple web server that shipped with Ruby until 3.0 and is now distributed as a separate gem, to parse HTTP requests. That should be as simple as:
req = WEBrick::HTTPRequest.new(WEBrick::Config::HTTP)
req.parse(sock)
# req is the HTTPRequest object with all attributes populated by `parse`
Let's try it out:
require 'webrick'

pipe = Ractor.new do
  loop do
    Ractor.yield(Ractor.receive, move: true)
  end
end

CPU_COUNT = 4
workers = CPU_COUNT.times.map do
  Ractor.new(pipe) do |pipe|
    loop do
      s = pipe.take
      req = WEBrick::HTTPRequest.new(WEBrick::Config::HTTP.merge(RequestTimeout: nil))
      req.parse(s)
      puts req.inspect
      s.print "HTTP/1.1 200\r\n"
      s.print "Content-Type: text/html\r\n"
      s.print "\r\n"
      s.print "Hello world!\n"
      s.close
    end
  end
end

listener = Ractor.new(pipe) do |pipe|
  server = TCPServer.new(8080)
  loop do
    conn, _ = server.accept
    pipe.send(conn, move: true)
  end
end

loop do
  Ractor.select(listener, *workers)
end
We'll see it fail with:
ractor_v0.rb:28:in `block (3 levels) in <main>': can not access non-shareable objects in constant WEBrick::Config::HTTP by non-main Ractor. (Ractor::IsolationError)
    from ractor_v0.rb:25:in `loop'
    from ractor_v0.rb:25:in `block (2 levels) in <main>'
Thankfully this is an easy fix: since WEBrick::Config::HTTP is not a frozen object, we need to explicitly freeze it to make it shareable across Ractors.
We'll have to prepend our server's code with something like this:
Ractor.make_shareable(WEBrick::Config::HTTP)
Ractor.make_shareable(WEBrick::LF)
Ractor.make_shareable(WEBrick::CRLF)
Ractor.make_shareable(WEBrick::HTTPRequest::BODY_CONTAINABLE_METHODS)
Ractor.make_shareable(WEBrick::HTTPStatus::StatusMessage)
I opened a fix upstream to make that work by default. While getting the rest of the code to work, I had to do the same thing for the Time class too.
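To see what `Ractor.make_shareable` does to a constant, here is a small sketch with a made-up `APP_CONFIG` hash standing in for something like WEBrick::Config::HTTP. It only assumes the `Ractor.shareable?` and `Ractor.make_shareable` methods available since Ruby 3.0.

```ruby
# APP_CONFIG is a hypothetical stand-in for a constant like WEBrick::Config::HTTP.
APP_CONFIG = { server_name: "demo", mime_types: ["text/html", "text/plain"] }

before = Ractor.shareable?(APP_CONFIG)  # a mutable Hash is not shareable

# Deep-freezes the hash and every object reachable from it.
Ractor.make_shareable(APP_CONFIG)

after = Ractor.shareable?(APP_CONFIG)

puts before
puts after
puts APP_CONFIG[:mime_types].frozen?
puts APP_CONFIG[:mime_types].first.frozen?
```

The price is that the constant can no longer be mutated. `Hash#merge` still works because it returns a new, unfrozen hash, which is why the workers can keep calling `WEBrick::Config::HTTP.merge(RequestTimeout: nil)`.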
The story of URI parsing
Once we've declared those objects shareable, the server fails with exceptions like:
/opt/rubies/3.0.0/lib/ruby/3.0.0/uri/common.rb:77:in `for': can not access class variables from non-main Ractors (Ractor::IsolationError)
    from /opt/rubies/3.0.0/lib/ruby/3.0.0/uri/rfc3986_parser.rb:72:in `parse'
    from /opt/rubies/3.0.0/lib/ruby/3.0.0/uri/common.rb:171:in `parse'
    from /Users/kir/.gem/ruby/3.0.0/gems/webrick-1.7.0/lib/webrick/httprequest.rb:504:in `parse_uri'
    from /Users/kir/.gem/ruby/3.0.0/gems/webrick-1.7.0/lib/webrick/httprequest.rb:218:in `parse'
We must remember that Ractors are strict about concurrent data access, and class variables are not safe to read concurrently.
We could boil that error down to:
require 'uri'

r = Ractor.new do
  res = URI.parse("https://ruby-lang.org/")
  puts res.inspect
end
r.take # wait for the Ractor so that its exception surfaces in the main Ractor
If we look at the URI implementation, we'll notice that it uses a class variable:
module URI
  # ...
  def self.for(scheme, *arguments, default: Generic)
    if scheme
      # @@schemes is a class variable
      uri_class = @@schemes[scheme.upcase] || default
    else
      uri_class = default
    end
    return uri_class.new(scheme, *arguments)
  end
end
There's nothing we can do to make that safe to access across multiple Ractors without changing the URI module's code. Here's my PR with the attempted fix.
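The restriction can be reproduced without URI. The sketch below uses two made-up modules, `ClassVarRegistry` and `ConstantRegistry`, to contrast a class variable lookup with one possible workaround: keeping the registry in a deeply frozen constant. This is an illustration of the constraint and not necessarily what the PR does.

```ruby
# Hypothetical stand-ins for URI's scheme classes.
class GenericScheme; end
class HttpScheme < GenericScheme; end

module ClassVarRegistry
  @@schemes = { "HTTP" => HttpScheme } # class variable, like URI's @@schemes

  def self.for(scheme)
    @@schemes[scheme.upcase] || GenericScheme
  end
end

module ConstantRegistry
  # A deeply frozen constant can be read from any Ractor.
  SCHEMES = Ractor.make_shareable({ "HTTP" => HttpScheme })

  def self.for(scheme)
    SCHEMES[scheme.upcase] || GenericScheme
  end
end

lookup = Ractor.new do
  class_var_result =
    begin
      ClassVarRegistry.for("http").name
    rescue Ractor::IsolationError => e
      e.class.name # class variables are off limits outside the main Ractor
    end

  [class_var_result, ConstantRegistry.for("http").name]
end

# Ractor#take is the Ruby 3.0 API; newer Rubies use Ractor#value instead.
results = lookup.respond_to?(:value) ? lookup.value : lookup.take
puts results.inspect
```

The trade-off of the constant approach is that the registry can no longer be extended at runtime, which matters for a library like URI where new schemes can be registered.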
Making it work
After the three changes above, the following code works:
require 'webrick'

# Fix: https://github.com/ruby/webrick/pull/65
Ractor.make_shareable(WEBrick::Config::HTTP)
Ractor.make_shareable(WEBrick::LF)
Ractor.make_shareable(WEBrick::CRLF)
Ractor.make_shareable(WEBrick::HTTPRequest::BODY_CONTAINABLE_METHODS)
Ractor.make_shareable(WEBrick::HTTPStatus::StatusMessage)

# To pick up changes from https://github.com/ruby/ruby/pull/4007
Object.send(:remove_const, :URI)
require '/Users/kir/src/github.com/ruby/ruby/lib/uri.rb'

pipe = Ractor.new do
  loop do
    Ractor.yield(Ractor.receive, move: true)
  end
end

CPU_COUNT = 4
workers = CPU_COUNT.times.map do
  Ractor.new(pipe) do |pipe|
    loop do
      s = pipe.take
      req = WEBrick::HTTPRequest.new(WEBrick::Config::HTTP.merge(RequestTimeout: nil))
      req.parse(s)
      puts req.inspect
      s.print "HTTP/1.1 200\r\n"
      s.print "Content-Type: text/html\r\n"
      s.print "\r\n"
      s.print "Hello world!\n"
      s.close
    end
  end
end

listener = Ractor.new(pipe) do |pipe|
  server = TCPServer.new(8080)
  loop do
    conn, _ = server.accept
    pipe.send(conn, move: true)
  end
end

loop do
  Ractor.select(listener, *workers)
end
Yay! Now our server can parse the HTTP protocol thanks to WEBrick's parser.
Serving Rack apps
Nearly all web apps in Ruby use Rack as a modular interface to web servers. Let's make our server compatible with the Rack interface.
We can peek into how Rack integrates with WEBrick and follow the same pattern. The service method is what we're interested in. It does three things:
- Take WEBrick::HTTPRequest as the input and transform it into a Rack env
- Call the Rack app with that env
- Put the response into WEBrick::HTTPResponse
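The three steps above can be sketched in plain Ruby with no gems at all. The env hash here is built by hand and `write_response` is a hypothetical helper standing in for WEBrick::HTTPResponse, so treat this as an outline of the Rack contract rather than a real handler.

```ruby
require 'stringio'

# A Rack app is any object that responds to `call(env)` and returns
# [status, headers, body], where body responds to `each`.
app = lambda do |env|
  [200, { 'Content-Type' => 'text/plain' }, ["You asked for #{env['PATH_INFO']}\n"]]
end

# Step 1: a hand-built env; a real server derives it from the parsed request.
env = {
  'REQUEST_METHOD' => 'GET',
  'PATH_INFO'      => '/hello',
  'QUERY_STRING'   => '',
  'rack.input'     => StringIO.new('')
}

# Step 2: call the app with that env.
status, headers, body = app.call(env)

# Step 3: serialize the triplet to an IO (hypothetical helper).
def write_response(io, status, headers, body)
  io.print "HTTP/1.1 #{status}\r\n"
  headers.each { |name, value| io.print "#{name}: #{value}\r\n" }
  io.print "\r\n"
  body.each { |part| io.print part }
ensure
  body.close if body.respond_to?(:close)
end

out = StringIO.new
write_response(out, status, headers, body)
puts out.string
```

Because the app only sees the env hash and returns plain Ruby objects, any server that can build the env and write the triplet back can host it, including one built on Ractors.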
We could borrow some of that code and make it work with something like this (see the full version):
# has to be explicitly required from the main thread:
# https://bugs.ruby-lang.org/issues/17477
require 'pp'

def env_from_request(req)
  env = req.meta_vars
  env.delete_if { |k, v| v.nil? }

  rack_input = StringIO.new(req.body.to_s)
  rack_input.set_encoding(Encoding::BINARY)

  env.update(
    Rack::RACK_VERSION => Rack::VERSION,
    Rack::RACK_INPUT => rack_input,
    Rack::RACK_ERRORS => $stderr,
    Rack::RACK_MULTITHREAD => true,
    Rack::RACK_MULTIPROCESS => false,
    Rack::RACK_RUNONCE => false,
    Rack::RACK_URL_SCHEME => ["yes", "on", "1"].include?(env[Rack::HTTPS]) ? "https" : "http"
  )

  env[Rack::QUERY_STRING] ||= ""
  unless env[Rack::PATH_INFO] == ""
    path, n = req.request_uri.path, env[Rack::SCRIPT_NAME].length
    env[Rack::PATH_INFO] = path[n, path.length - n]
  end
  env[Rack::REQUEST_PATH] ||= [env[Rack::SCRIPT_NAME], env[Rack::PATH_INFO]].join
  env
end

CPU_COUNT = 4
workers = CPU_COUNT.times.map do
  Ractor.new(pipe) do |pipe|
    app = lambda do |e|
      [200, {'Content-Type' => 'text/html'}, ['hello world']]
    end

    loop do
      s = pipe.take
      req = WEBrick::HTTPRequest.new(WEBrick::Config::HTTP.merge(RequestTimeout: nil))
      req.parse(s)
      env = env_from_request(req)

      status, headers, body = app.call(env)

      resp = WEBrick::HTTPResponse.new(WEBrick::Config::HTTP)
      begin
        resp.status = status.to_i
        io_lambda = nil
        headers.each { |k, vs|
          if k.downcase == "set-cookie"
            resp.cookies.concat vs.split("\n")
          else
            # Since WEBrick won't accept repeated headers,
            # merge the values per RFC 1945 section 4.2.
            resp[k] = vs.split("\n").join(", ")
          end
        }
        body.each { |part|
          resp.body << part
        }
      ensure
        body.close if body.respond_to? :close
      end

      pp env
      resp.send_response(s)
    end
  end
end
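The header loop above splits each value on newlines because the Rack version this handler targets encodes repeated headers as one newline-separated string. Here is that logic in isolation; `split_rack_headers` is a hypothetical helper written for this illustration, not part of Rack or WEBrick.

```ruby
# Hypothetical helper mirroring the header loop above.
# Returns [merged_headers, cookies].
def split_rack_headers(headers)
  merged  = {}
  cookies = []

  headers.each do |name, value|
    lines = value.split("\n")
    if name.downcase == "set-cookie"
      cookies.concat(lines)            # each cookie keeps its own header line
    else
      merged[name] = lines.join(", ")  # repeated values collapse into one line
    end
  end

  [merged, cookies]
end

merged, cookies = split_rack_headers({
  "Cache-Control" => "no-cache\nno-store",
  "Set-Cookie"    => "a=1\nb=2"
})

puts merged.inspect
puts cookies.inspect
```

Cookies are kept apart because a cookie value may itself contain commas, so joining several of them into one header line would be ambiguous.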
Now we have a tiny Rack app running on multiple CPUs powered by the Ractor primitive! This is huge because Ractor was nowhere near this point when I wrote the first post. By the Ruby 3.0 release it had matured to the point that we can integrate it with Rack with only a few patches.
Wrap up
I hope this post gave an overview of the current state of Ractors in Ruby, for both developers and Ruby contributors.
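If you are skimming the post and are just curious about the internals, you can see the final version of the code here. The bugs and patches that I reported upstream while writing it include:
- https://github.com/ruby/webrick/pull/65 (making WEBrick's constants shareable)
- https://github.com/ruby/ruby/pull/4007 (making URI usable from Ractors)
- https://bugs.ruby-lang.org/issues/17477 (pp has to be required explicitly)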
If you need a general refresher about Ractor, you should check out ractor.md in the Ruby repo.
The next step would be to try making our server run Sinatra apps. In theory, a Sinatra app is just a Rack app, but there's some global state in Sinatra and Rack that might make it trickier.