Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change URI#to_s to handle default ports for more schemes than http/https #5233

Merged
merged 26 commits into from Jan 2, 2018

Conversation

lachlan
Copy link
Contributor

@lachlan lachlan commented Nov 2, 2017

Add support for more well known schemes, in addition to http and https, when determining if the URI is for the default port for the scheme in question, and therefore if the port should be omitted from the resulting string.

The following schemes now have the default port omitted when serialized to a string: ftp, ftps, gopher, http, https, ldap, ldaps, nntp, scp, sftp, ssh, and telnet.

src/uri.cr Outdated
@@ -426,4 +426,22 @@ class URI
URI.escape(password, io)
end
end

@@DEFAULT_PORTS = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a constant for this (no @@)

DEFAULT_PORTS = ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that. I've changed the class variable to be a constant.

src/uri.cr Outdated
}

private def is_default_port?
return port.nil? || port == DEFAULT_PORTS[scheme]?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor semantic thing: nil does not necessarily mean "default port", it might also be that the scheme doesn't support ports at all (for example file or mailto). Therefore I'd suggest to move the nil check out of this method.

src/uri.cr Outdated
"telnet" => 23,
}

private def is_default_port?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please lose the prefix is_, the question mark already describes the nature of this method.

@asterite
Copy link
Member

asterite commented Nov 3, 2017

Besides this, there should be a way to register new default ports, similar to how it's done in Elixir:

I wouldn't expose the constant for this, but do it with a method too.

@asterite
Copy link
Member

asterite commented Nov 3, 2017

And of course provide a method to get the default port of a given scheme, like in Elixir too.

be queried, registered, and unregistered, as follows.

Query the default port for a given scheme:

    URI.default_ports["http"]  #=> 80
    URI.default_ports["ponzi"] #=> nil

Register a default port for a given scheme:

    URI.default_ports["ponzi"] = 9999
    URI.default_ports["ponzi"] #=> 9999

Unregister a default port for a given scheme:

    URI.default_ports["ftp"] #=> 21
    URI.default_ports["ftp"] = nil
    URI.default_ports["ftp"] #=> nil
    URI.default_port "http"  # => 80
    URI.default_port "ponzi" # => nil
default ports to be accessed via either a hash like accessor or
getter method with consistent naming:

    URI.default_port "http"  # => 80
    URI.default_port["http"] # => 80
@lachlan
Copy link
Contributor Author

lachlan commented Nov 4, 2017

Please let me know what you think of this potential API for getting and setting default ports via the URI class. I thought making the API somewhat hash-like made sense, but am happy to take suggestions:

URI.default_port["http"] # => 80
URI.default_port["unknown"] # => nil
URI.default_port["unknown"] = 1234
URI.default_port["unknown"] # => 1234

I also added a method allowing for default ports to be queried using a normal method call rather than hash-like accessor:

URI.default_port "http" # => 80

I'm not sure if having two ways to query for a default port is good or bad?

Does anyone see a need to iterate through the registered default ports? I haven't implemented Enumerable or Iterable, so this is not currently possible.

Also, is there something else I need to do to make this implementation thread-safe?

variable for the internal hash of default ports, and to be more hash
like for unregistering a default port via the new delete method
# default_ports["http"] # => 80
# default_ports["ponzi"] # => nil
# ```
def [](scheme : String?) : Int32?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why allow a nil scheme here? It should be String only imo

it "returns default port for well known schemes" do
URI.default_port["http"].should eq(80)
URI.default_port("http").should eq(80)
URI.default_port["https"].should eq(443)
Copy link
Contributor

@bew bew Nov 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels wrong and confusing to have default_port() and default_port[] that does the same thing, there should be only one way. (Don't know which one)

# default_port["http"] # => 80
# default_port["ponzi"] # => nil
# ```
def [](scheme : String?) : Int32?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why allow scheme to be nil here?

# ```
def [](scheme : String?) : Int32?
return nil if scheme.nil?
@default_port[normalize(scheme)]?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels weird to forward a #[] method to a #[]? method, I think it should be consistent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a separate #[]?(scheme : String) method.

@asterite
Copy link
Member

asterite commented Nov 4, 2017

It looks good. But it's too complex. I prefer two methods:

URI.default_port(scheme)
URI.set_default_port(scheme, host : String?)

In the second overload when nil is given it erases the scheme's default port.

It just looks weird that you can do p URI.default_port. Is that the default port of an URI? It also involves more types, more (hidden) code, etc.

src/uri.cr Outdated
# returns `nil`.
# Registers the default port for the given scheme.
#
# If port is nil, the existing default port for the scheme, if any,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the doc formatting should be If *port* is `nil`, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@straight-shoota
Copy link
Member

While you're at it, you could add some more defaults, like the ones described here: https://gist.github.com/mahmoud/2fe281a8daaff26cfe9c15d2c5bf5c8b

@RX14
Copy link
Contributor

RX14 commented Nov 4, 2017

Why not just expose the raw Hash? It's simple and it works. Perhaps it'd be harder to find out how to add new schemes.

@asterite
Copy link
Member

asterite commented Nov 4, 2017

@RX14 I agree.

We can have URI.default_ports return this Hash. Then one can use the usual Hash methods with it: [], []=, []?, keys, has_key?, delete, etc.

I worried about exposing this Hash to the user, but I don't know what could be wrong about that.

@lachlan
Copy link
Contributor Author

lachlan commented Nov 4, 2017

@RX14 and @asterite I wasn't sure about exposing the Hash directly to the user either, but that is by far the simplest solution. Are there any future thread/concurrency related issues with doing this?

If we go with URI.default_ports returning the Hash, then we should remove both the URI.default_port(scheme : String) : Int32? and URI.set_default_port(scheme : String, port : Int32?) methods as these can be accomplished via the Hash, do you agree?

@lachlan
Copy link
Contributor Author

lachlan commented Nov 4, 2017

@straight-shoota I've added a bunch more default ports based on that gist you linked to. Thanks!

@asterite
Copy link
Member

asterite commented Nov 4, 2017

Well, yes, that's the main issue of exposing the Hash to the user: two concurrent threads modifying it will probably lead to a crash or unexpected behavior. That's why the two methods approach was safer. But we could later change it to something that acts like a Hash but it thread-safe.

@RX14
Copy link
Contributor

RX14 commented Nov 5, 2017

I honestly think that Array and Hash should be thread-safe by default (with a mode to turn it off). Perhaps i'll have to compromise to non-safe by default. But thats en entirely different issue.

@RX14
Copy link
Contributor

RX14 commented Nov 5, 2017

OK, yeah case insensitivity is a good reason. Lets go back to @asterite's original idea (with the two methods) but we just call downcase before putting schemes into the hash.

src/uri.cr Outdated

# Returns `true` if this URI's *port* is the default port for its *scheme*.
private def default_port?
port == @@default_ports[scheme.try &.downcase]?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to call &.downcase since it's already handled by DefaultPortHash#normalize through DefaultPortHash#[]?.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Sija Fixed! Thanks for picking that unnecessary downcase up.

@RX14
Copy link
Contributor

RX14 commented Nov 5, 2017

We want simplicity, we want a minimal diff. If we have to subclass hash, or add any new classes then this is not a mergeable PR. Either we do it with vanilla hash or we do it how @asterite suggested by only adding 2 methods.

@lachlan
Copy link
Contributor Author

lachlan commented Nov 5, 2017

Hmm, somehow my Hash subclass is causing unrelated specs to fail after overriding find_entry(key):

# A subclass of Hash(String, Int32) with case-insensitive keys.
private class DefaultPortHash < Hash(String, Int32)
  def initialize(seeds : Hash(String, Int32))
    super nil, initial_capacity: seeds.size
    merge! seeds
  end

  def []=(key : String, value : Int32)
    super normalize(key), value
  end

  def delete(key, &block)
    super normalize(key), &block
  end

  protected def find_entry(key)
    super normalize(key)
  end

  private def normalize(key : String?) : String?
    key.try &.downcase
  end
end

Causes the following error when running the tests:

PATH=/usr/local/opt/llvm/bin:$PATH CRYSTAL_CACHE_DIR=/tmp/crystal /bin/bash -c 'make std_spec clean'
Using /usr/local/opt/llvm/bin/llvm-config [version=5.0.0]
c++ -c  -o src/llvm/ext/llvm_ext.o src/llvm/ext/llvm_ext.cc `/usr/local/opt/llvm/bin/llvm-config --cxxflags`
cc -fPIC    -c -o src/ext/sigfault.o src/ext/sigfault.c
ar -rcs src/ext/libcrystal.a src/ext/sigfault.o
./bin/crystal build  -o .build/std_spec spec/std_spec.cr
Error in spec/std_spec.cr:2: while requiring "./std/**"

require "./std/**"
^

in spec/std/json/mapping_spec.cr:278: instantiating '(Array(JSON::Type) | Bool | Float64 | Hash(String, JSON::Type) | Int64 | String | Nil)#should(Spec::EqualExpectation(Array(Bool | Float64 | Hash(String, Int32) | Int32 | String | Nil)))'

    json.any.raw.should eq([{"x" => 1}, 2, "hey", true, false, 1.5, nil])
                 ^~~~~~

in src/spec/expectations.cr:334: instantiating 'Spec::EqualExpectation(Array(Bool | Float64 | Hash(String, Int32) | Int32 | String | Nil))#match(Array(JSON::Type))'

      unless expectation.match self
                         ^~~~~

in src/spec/expectations.cr:8: instantiating 'Array(JSON::Type)#==(Array(Bool | Float64 | Hash(String, Int32) | Int32 | String | Nil))'

      actual_value == @expected_value
                   ^

in src/array.cr:162: instantiating 'equals?(Array(Bool | Float64 | Hash(String, Int32) | Int32 | String | Nil))'

    equals?(other) { |x, y| x == y }
    ^~~~~~~

in src/indexable.cr:217: instantiating 'each_with_index()'

    each_with_index do |item, i|
    ^~~~~~~~~~~~~~~

in src/enumerable.cr:377: instantiating 'each()'

    each do |elem|
    ^~~~

in src/indexable.cr:148: instantiating 'each_index()'

    each_index do |i|
    ^~~~~~~~~~

in src/indexable.cr:148: instantiating 'each_index()'

    each_index do |i|
    ^~~~~~~~~~

in src/enumerable.cr:377: instantiating 'each()'

    each do |elem|
    ^~~~

in src/indexable.cr:217: instantiating 'each_with_index()'

    each_with_index do |item, i|
    ^~~~~~~~~~~~~~~

in src/array.cr:162: instantiating 'equals?(Array(Bool | Float64 | Hash(String, Int32) | Int32 | String | Nil))'

    equals?(other) { |x, y| x == y }
    ^~~~~~~

in src/array.cr:162: instantiating '(Array(JSON::Type) | Bool | Float64 | Hash(String, JSON::Type) | Int64 | String | Nil)#==((Bool | Float64 | Hash(String, Int32) | Int32 | String | Nil))'

    equals?(other) { |x, y| x == y }
                              ^

in src/hash.cr:704: instantiating 'each()'

    each do |key, value|
    ^~~~

in src/hash.cr:704: instantiating 'each()'

    each do |key, value|
    ^~~~

in src/hash.cr:705: protected method 'find_entry' called for URI::DefaultPortHash

      entry = other.find_entry(key)
                    ^~~~~~~~~~

make: *** [.build/std_spec] Error 1
on_os osx PATH=/usr/local/opt/llvm/bin:$PATH CRYSTAL_CACHE_DIR=/tmp/crystal /bin/bash -c 'make std_spec clean' exited with 2

bin/ci build returned exit code 1

Is this something I've done, or a compiler bug?

@asterite
Copy link
Member

asterite commented Nov 5, 2017

Please, just two methods as I suggested

@lachlan
Copy link
Contributor Author

lachlan commented Nov 5, 2017

@RX14 I can't see a way to use a vanilla Hash and also treat the keys case insensitively. So instead, as you can see, I've been experimenting with a Hash subclass, but I agree that way is too complicated and error prone.

@RX14 and @asterite if you both agree, then I'll revert this change back to having two public class methods on URI, one to query the default port for a scheme:

def default_port(scheme : String) : Int32?

And the other to set the default port for a scheme, where a nil port deletes the entry for scheme:

def set_default_port(scheme : String, port : Int32?)

Agreed?

@asterite
Copy link
Member

asterite commented Nov 5, 2017

Sounds good 👍

src/uri.cr Outdated
#
# ```
# URI.default_port "http" # => 80
# URI.default_ports "ponzi" # => nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: default_ports -> default_port

src/uri.cr Outdated
# scheme, if any, will be unregistered.
#
# ```
# URI.set_default_port "ponzi" = 9999
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: = -> ,

# URI.default_port "http" # => 80
# URI.default_port "ponzi" # => nil
# ```
def self.default_port(scheme : String) : Int32?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the URI.default_port method can return nil, should it be suffixed with a question mark (URI.default_port?) as per other methods such as Hash#first_key? that can return nil?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is only used when there is also a variant of the method which does not return nil (usually raises instead). This makes no sense here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks very much for the explanation!

src/uri.cr Outdated
# Returns `true` if this URI's *port* is the default port for
# its *scheme*.
private def default_port?
scheme && port && port == URI.default_port(scheme.not_nil!)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about?

if (scheme = @scheme) && (port = @port)
  port == URI.default_port(scheme)
else
  false
end

It's a bit longer, but thread safe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shorter (but not necessarily better):

(scheme = @scheme) && (port = @port) && port == URI.default_port(scheme)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, I've updated as per @straight-shoota's suggestion.

src/uri.cr Outdated
# ```
# URI.set_default_port "ponzi", 9999
# ```
def self.set_default_port(scheme : String, port : Int32?)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URI.set_default_port is (unintentionally) returning either the previous default port when unregistering, or the new default port when registering.

Should I change it to always return nil, or maybe to return the previous default port value that was replaced or unregistered if any?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just restrict it to : Nil in the declaration and the last line of the method will be ignored.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@RX14 RX14 added this to the Next milestone Jan 2, 2018
@RX14 RX14 merged commit c40c292 into crystal-lang:master Jan 2, 2018
lukeasrodgers pushed a commit to lukeasrodgers/crystal that referenced this pull request Jan 7, 2018
…tps (crystal-lang#5233)

* Change URI#to_s to handle default ports for more schemes than http/https

* Change DEFAULT_PORTS to be a constant rather than class variable

* Rename method to default_port? and move nil check back to #to_s

* Remove unnecessary return statement from default_port? method

* Add URI.default_ports method, which allows for default ports to
be queried, registered, and unregistered, as follows.

Query the default port for a given scheme:

    URI.default_ports["http"]  #=> 80
    URI.default_ports["ponzi"] #=> nil

Register a default port for a given scheme:

    URI.default_ports["ponzi"] = 9999
    URI.default_ports["ponzi"] #=> 9999

Unregister a default port for a given scheme:

    URI.default_ports["ftp"] #=> 21
    URI.default_ports["ftp"] = nil
    URI.default_ports["ftp"] #=> nil

* Fix URI::DefaultPorts#[]= to downcase scheme on delete

* Add URI.default_port method for returning a scheme's the default port

    URI.default_port "http"  # => 80
    URI.default_port "ponzi" # => nil

* Rename URI.default_ports method to URI.default_port, allowing
default ports to be accessed via either a hash like accessor or
getter method with consistent naming:

    URI.default_port "http"  # => 80
    URI.default_port["http"] # => 80

* Fix method comments to use back ticks consistently

* Change DefaultPort class to use instance variable rather than class
variable for the internal hash of default ports, and to be more hash
like for unregistering a default port via the new delete method

* Simplify to use two methods: URI.default_port and URI.set_default_port

* Fix the method documentation formatting

* Add more well-known default ports sourced from IANA via Mahmoud's
[scheme_port_map.json][1]

[1]: https://gist.github.com/mahmoud/2fe281a8daaff26cfe9c15d2c5bf5c8b

* Fix #default_port? to correctly handle nil scheme

* Change URI default port API to return a mutable subclass of Hash
with case-insensitive keys

* Change DefaultPortHash seeding to use merge! method

* Fix default_port? to not call scheme.downcase unnecessarily

* Add spec for URI with nil scheme and non-nil port

* Change DefaultPortHash seeds to be an initialize parameter

* Fix DefaultPortHash case insensitive keys for methods other than []= and fetch

* Revert to using two methods: URI.default_port and URI.set_default_port

* Fix typo in URI.default_port comment

* Fix typo in URI.set_default_port comment

* Fix URI.set_default_port comment formatting for scheme parameter

* Fix URI#default_port? to be thread safe

* Change URI.set_default_port to always return nil
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants