Feature Request: Support for IDNA #76

Open
@HoneyryderChuck

Currently, the biggest "missing feature" in the Ruby stdlib URI/DNS resolution chain is IDNA support. addressable, the open-source alternative to stdlib uri, has some support for it, which is, I believe, the main reason it is a transitive dependency of so many other gems (its other feature, URI templates, is just not as compelling).

This is a proposal for a way to solve this.

punycode

IDNA domains are translated to their punycode representation in order to be used in DNS queries (which require ASCII domains). Ruby's stdlib does not have a punycode converter, so this is where it should start IMO. For that, I propose either a new punycode stdlib gem (bundled?), or making its functionality available as a submodule of URI in the uri stdlib:

# as a bundled gem
require "punycode"
Punycode.encode("l♥️h.ws") #=> "xn--lh-t0xz926h.ws"
Punycode.decode("xn--lh-t0xz926h.ws") #=> "l♥️h.ws"

# as internal functionality
require "uri/punycode"
URI::Punycode.encode(...
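
To make the conversion step concrete, here is a minimal pure Ruby sketch of the Punycode encoder from RFC 3492. `PunycodeSketch`, `encode_label` and `encode_domain` are hypothetical names chosen for illustration, not the proposed API. The sketch deliberately skips the IDNA mapping, normalization and validation steps that libidn2 performs, so it may not reproduce a real IDNA 2008 conversion for every input.

```ruby
# Sketch only: raw RFC 3492 Punycode encoding, with no IDNA mapping,
# normalization or validation. All names here are hypothetical.
module PunycodeSketch
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128
  DIGITS = "abcdefghijklmnopqrstuvwxyz0123456789"

  module_function

  # Bias adaptation function from RFC 3492, section 6.1.
  def adapt(delta, num_points, first_time)
    delta = first_time ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= BASE - TMIN
      k += BASE
    end
    k + ((BASE - TMIN + 1) * delta) / (delta + SKEW)
  end

  # Encodes a single label (no dots), without the "xn--" prefix.
  def encode_label(label)
    input = label.codepoints
    basic = input.select { |cp| cp < INITIAL_N }
    output = basic.pack("U*")
    handled = basic.length
    output << "-" unless basic.empty?

    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS

    while handled < input.length
      # Smallest code point not yet handled.
      m = input.select { |cp| cp >= n }.min
      delta += (m - n) * (handled + 1)
      n = m
      input.each do |cp|
        delta += 1 if cp < n
        next unless cp == n

        # Emit delta as a generalized variable-length integer.
        q = delta
        k = BASE
        loop do
          t = if k <= bias
                TMIN
              elsif k >= bias + TMAX
                TMAX
              else
                k - bias
              end
          break if q < t

          output << DIGITS[t + (q - t) % (BASE - t)]
          q = (q - t) / (BASE - t)
          k += BASE
        end
        output << DIGITS[q]
        bias = adapt(delta, handled + 1, handled == basic.length)
        delta = 0
        handled += 1
      end
      delta += 1
      n += 1
    end
    output
  end

  # Encodes each non-ASCII label of a domain and adds the ACE prefix.
  def encode_domain(domain)
    domain.split(".").map { |label|
      label.ascii_only? ? label : "xn--#{encode_label(label)}"
    }.join(".")
  end
end

puts PunycodeSketch.encode_domain("bücher.example") # prints "xn--bcher-kva.example"
```

A pure Ruby port like this covers only the encoding itself, which is why the rest of this proposal leans on libidn2 for the full IDNA 2008 rules.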

implementation

addressable, as well as other (mostly abandoned) gems, supports the IDNA 2003 standard. You'll find both libidn-based extensions and pure Ruby ports. IDNA 2003 has since been superseded by the IDNA 2008 standard (which essentially supports the more recent Unicode versions, plus some edge cases). While I think that a pure Ruby implementation should be entertained at some point, for now Ruby would do best by adopting the most standardized implementation around, and that's libidn2: it's used by most other network libraries, including curl, and distributed as a package for most (all?) OSes supported by Ruby.

Integration of libidn2 can be done via either a C extension or FFI (I'm the maintainer of idnx, which already FFIs into libidn2, and into winnls on Windows). The advantage of FFI is that it works OOTB for Java. The disadvantage may be performance (?), for which a C extension may be a better fit, but then we'd need to know whether the Java stdlib contains an equivalent IDNA conversion supporting IDNA 2008.

This means that libidn2 would become a dependency when building Ruby. It could, however, be treated as an optional dependency, like openssl is: when it is available, URI::Punycode is defined, and when it isn't, URI::Punycode is not. Most Ruby installers could then opportunistically install the package as well, just as they already do with openssl.
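
From the caller's side, an optional dependency like this would probably be handled with feature detection. The following is a rough sketch assuming the proposed `URI::Punycode` constant (which does not exist in Ruby today) and a hypothetical helper named `idna_hostname`:

```ruby
require "uri"

# Hypothetical helper: converts a host only when the proposed
# URI::Punycode module was compiled in. URI::Punycode is the constant
# proposed in this issue, not an existing stdlib API.
def idna_hostname(host)
  # ASCII-only hosts never need conversion.
  return host if host.ascii_only?

  unless defined?(URI::Punycode)
    raise NotImplementedError, "IDNA support unavailable (ruby built without libidn2?)"
  end

  URI::Punycode.encode(host)
end

puts idna_hostname("example.com") # prints "example.com"
```

On a Ruby built without libidn2, a non-ASCII host would raise NotImplementedError here instead of silently producing an unresolvable name.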

(addressable is aware of its lack of IDNA 2008 support, and is working on it by FFI'ing into libidn2 as well).

API

uri could then transparently handle translation internally. I propose that, beyond the proposal made above, nothing else in the public API changes. Instead, URI::Generic would support translation OOTB when building objects:

uri = "https://l♥️h.ws"
uri = URI(uri)
uri.host #=> "l♥️h.ws"
uri.hostname #=> "xn--lh-t0xz926h.ws"

# the example above is inspired by how uri already handles IPv6 addresses
uri = URI("https://[::1]")
uri.host #=> "[::1]", cannot be used in Socket.new(host, port)
uri.hostname #=> "::1", can be used in Socket.new(host, port)

This could then be used internally in the resolv library, before issuing the DNS query.
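
As a rough sketch of that internal step, the helper below mirrors what `URI::Generic#hostname` already does for bracketed IPv6 literals and adds the proposed translation for non-ASCII hosts. `socket_hostname`, its `encoder:` parameter and `placeholder_encoder` are hypothetical names; the encoder stands in for whatever converter ends up being adopted, and the placeholder does not perform a real conversion.

```ruby
require "uri"

# Hypothetical helper: returns a name that is safe to hand to a socket
# or a DNS query. `encoder` is any callable that turns an IDN into its
# ASCII form (for example, the proposed URI::Punycode.encode).
def socket_hostname(host, encoder:)
  # Same unwrapping that URI::Generic#hostname performs for IPv6 today.
  return host[1..-2] if host.start_with?("[") && host.end_with?("]")
  # Plain ASCII hosts pass through untouched.
  return host if host.ascii_only?

  encoder.call(host)
end

# Placeholder only: marks where the real conversion would happen.
placeholder_encoder = ->(name) { "encoded(#{name})" }

puts URI("https://[::1]").hostname                                 # prints "::1"
puts socket_hostname("[::1]", encoder: placeholder_encoder)        # prints "::1"
puts socket_hostname("example.com", encoder: placeholder_encoder)  # prints "example.com"
```

Keeping the conversion behind `hostname` means callers such as resolv would not need to know whether a host was an IPv6 literal, an IDN, or plain ASCII.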
