More strongly-typed HTTP and HTTPS URIs #162
Comments
Introducing a submodule sounds fine, and we can move it out to a toplevel Absolute_uri in the future. I'm not sure that
The deprecation of userinfo also seems important for RFC-compliance (https://httpwg.org/specs/rfc9110.html#rfc.section.4.2.4). A rough version of the submodule approach could be:
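For readers following along, here is a minimal, stdlib-only sketch of what such a submodule could look like. It is an illustration rather than the code proposed in this thread: the record `uri` is a hypothetical stand-in for `Uri.t`, and the names `of_uri`, `to_uri`, `scheme`, and `host`, as well as the error messages, are assumptions.

```ocaml
(* Hypothetical stand-in for [Uri.t]; the real type is abstract and richer.
   The scheme is assumed to be already lowercased. *)
type uri = {
  scheme : string option;
  userinfo : string option;
  host : string option;
  path : string;
}

module Absolute_http : sig
  type t
  val of_uri : uri -> (t, [ `Msg of string ]) result
  val to_uri : t -> uri
  val scheme : t -> [ `Http | `Https ]
  val host : t -> string
end = struct
  (* The host is a plain [string]: its presence is guaranteed by [of_uri]. *)
  type t = { t_scheme : [ `Http | `Https ]; t_host : string; t_path : string }

  let of_uri (u : uri) =
    match (u.scheme, u.userinfo, u.host) with
    | _, _, (None | Some "") ->
        Error (`Msg "http(s) URIs require a non-empty host")
    | _, Some _, _ -> Error (`Msg "userinfo is deprecated in http(s) URIs")
    | Some "http", None, Some h ->
        Ok { t_scheme = `Http; t_host = h; t_path = u.path }
    | Some "https", None, Some h ->
        Ok { t_scheme = `Https; t_host = h; t_path = u.path }
    | _ -> Error (`Msg "scheme must be http or https")

  let to_uri t =
    {
      scheme = Some (match t.t_scheme with `Http -> "http" | `Https -> "https");
      userinfo = None;
      host = Some t.t_host;
      path = t.t_path;
    }

  let scheme t = t.t_scheme
  let host t = t.t_host
end
```

The point of the abstract `t` is that callers holding one can call `host` without handling an option.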
I see that Absolute_uri is a term from RFC9110, so we could just call this Uri.Absolute to indicate the mandatory presence of a non-empty path.
I think it might be worth separating the errors out for a deprecated user_info (which clients could validly choose to allow to pass through, after stripping it out) from the missing host error.
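One way to picture that separation is a sketch with distinct error cases, so a caller can treat a deprecated userinfo differently from a missing host. The record `uri` is again a hypothetical stand-in for `Uri.t`, and the variant names and `validate` are assumptions made for illustration.

```ocaml
(* Hypothetical stand-in for [Uri.t]; scheme assumed lowercased. *)
type uri = {
  scheme : string option;
  userinfo : string option;
  host : string option;
}

(* Distinct cases let callers react to each problem separately. *)
type error =
  [ `Unsupported_scheme of string option
  | `Missing_host
  | `Deprecated_userinfo of string ]

let validate (u : uri) : (uri, error) result =
  match u.scheme with
  | Some ("http" | "https") -> (
      match (u.host, u.userinfo) with
      | (None | Some ""), _ -> Error `Missing_host
      | Some _, Some info -> Error (`Deprecated_userinfo info)
      | Some _, None -> Ok u)
  | other -> Error (`Unsupported_scheme other)
```

The host check runs first here, so a URI that is missing a host is always reported as such, even when it also carries userinfo.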
Because the specificity seems important, I want to call out that an Absolute URI implies the presence of a hier-part, not necessarily an authority section (which is where a host identifier can exist):
I don't think this would satisfy the requirements in RFC9110, so it's probably not specific enough to describe this as a [Uri.Absolute.t]:
I'll also offer mild pushback on the userinfo section, based on the language of the RFC:
I suppose there's an argument to be made that the client-side SHOULD isn't strong enough to forbid this, but it seems to me that erroring when attempting to parse a Uri.t as an RFC9110-compliant HTTP(S) URI is aligned with the spirit of the RFC, if not the exact language.
An error doesn't mean that the connection should be aborted; it means that the URI shouldn't be propagated. From that RFC text, it seems valid to me for an HTTP proxy to spot a user-info field, remove it (perhaps authenticating out of band), and make a downstream request without it. We should make sure our URI interfaces support that. It might be that it happens within
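A small sketch of the proxy scenario described above: the userinfo is split off for out-of-band handling, and only the stripped URI is passed on. The record `uri` is a hypothetical stand-in for `Uri.t`, and `split_userinfo` is an illustrative name, not an existing function.

```ocaml
(* Hypothetical stand-in for [Uri.t]. *)
type uri = {
  scheme : string option;
  userinfo : string option;
  host : string option;
  path : string;
}

(* Return the credentials (if any) separately from a URI that no longer
   carries them; only the second component should be propagated. *)
let split_userinfo (u : uri) : string option * uri =
  (u.userinfo, { u with userinfo = None })
```

A proxy could authenticate with the first component however it sees fit and build the downstream request from the second.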
I've cleaned up the proposed changes above in #164
CHANGES:
* Add `Uri.Absolute_http`, an RFC9110-compliance specialization of a `Uri.t`. (mirage/ocaml-uri#164 mirage/ocaml-uri#162 @torinnd).
* Add a `uri-bench` package for the benchmarking dependencies in this repository (mirage/ocaml-uri#166 @tmcgilchrist).
I find myself reaching for a more strongly typed URI when the scheme is one of "http" or "https". As per https://httpwg.org/specs/rfc9110.html#rfc.section.4.2 the following semantics need to be enforced for absolute URIs with one of these two schemes:
One way this could be addressed is by leveraging a GADT like:
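The original snippet is not preserved here, so the following is only a guess at the general shape such a GADT could take, not the author's proposal. All type and constructor names are hypothetical. The index on `uri` records whether the value has been checked as an HTTP(S) URI, which lets `host` return a plain `string` in that case.

```ocaml
(* Hypothetical type-level tags used as GADT indices. *)
type http = Http_kind
type generic = Generic_kind

type _ uri =
  | Http : { secure : bool; host : string; path : string } -> http uri
  | Generic : {
      scheme : string option;
      host : string option;
      path : string;
    }
      -> generic uri

(* Total: the [http] index rules out the [Generic] constructor. *)
let host : http uri -> string = function Http { host; _ } -> host

(* Narrow a generic URI when it meets the http(s) constraints.
   The scheme is assumed to be already lowercased. *)
let narrow : generic uri -> http uri option = function
  | Generic { scheme = Some "http"; host = Some host; path } when host <> "" ->
      Some (Http { secure = false; host; path })
  | Generic { scheme = Some "https"; host = Some host; path } when host <> "" ->
      Some (Http { secure = true; host; path })
  | Generic _ -> None
```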
This would be a non-trivial API change for URI, of course. A smaller change could be to introduce a submodule into the MLI for an [Absolute_http.t] with appropriate conversion functions:
... and a subset of the [Uri] MLI adapted to the constraints enforced by the scheme (e.g. [make] would require [host], omit [userinfo], and accept a more constrained set of [scheme]s).
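To make the [make] constraints concrete, here is a stdlib-only sketch of a constructor with a mandatory [host], no [userinfo] argument, and a closed set of schemes. The record layout, the default path, the exception on an empty host, and [to_string] are all assumptions chosen for illustration, not the library's API.

```ocaml
(* Hypothetical representation; there is deliberately no userinfo field. *)
type t = {
  scheme : [ `Http | `Https ];
  host : string;
  port : int option;
  path : string;
}

(* [host] is a required labelled argument, so it cannot be forgotten. *)
let make ~scheme ~host ?port ?(path = "/") () =
  if host = "" then invalid_arg "make: empty host"
  else { scheme; host; port; path }

let to_string t =
  let scheme = match t.scheme with `Http -> "http" | `Https -> "https" in
  let port =
    match t.port with None -> "" | Some p -> ":" ^ string_of_int p
  in
  scheme ^ "://" ^ t.host ^ port ^ t.path
```

For example, `make ~scheme:`Https ~host:"example.com" ()` type-checks, while omitting `~host` leaves a function still waiting for its argument rather than a URI value.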
I have a fairly narrow understanding of other use cases for URI (for example, I see there is an old Issue #158 for Websocket support, which feels related), but imagine there are additional considerations besides what I've raised here. The big win for me personally would be a calling pattern where I could be confident that a [host] is present.