
(sorry, fell asleep mid conversation)

There are some fundamental differences between ip and authority.

An IP is a giant number encoded as bytes.

An authority is a composition of other types.

If we think of them as data containers, direct syntax can make sense for either.

If we want them to interact with each other or the environment, we may need to make some design decisions.

At this point, using ips and authorities as containers is sufficient for axon.

And the simple syntax feels so clean.

For scheme-specific stuff, we have plenty of room in the URI API that needs filling out

For example, the scheme part of a URI is just a string. URI users can look up scheme-specific info from a “database” keyed by the scheme name.
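
A minimal sketch of that kind of scheme “database”, assuming a plain immutable hash keyed by scheme name. The names `scheme-info` and `scheme-ref` and the property keys are made up for illustration and are not part of net2.

```racket
#lang racket/base

;; Hypothetical scheme "database": an immutable hash keyed by scheme name.
;; The property keys below are illustrative, not a net2 API.
(define scheme-info
  (hash "http"   (hash 'hierarchical? #t 'default-port 80)
        "https"  (hash 'hierarchical? #t 'default-port 443)
        "mailto" (hash 'hierarchical? #f 'default-port #f)
        "tel"    (hash 'hierarchical? #f 'default-port #f)))

;; Look up one property of a scheme; returns #f when the scheme
;; or the property is unknown.
(define (scheme-ref scheme key)
  (define info (hash-ref scheme-info scheme #f))
  (and info (hash-ref info key #f)))
```

With this, `(scheme-ref "http" 'default-port)` gives 80, and an unregistered scheme just yields `#f`.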

#uri"http://whatever.com:9876/abc?def#ghi"

The http:// tells us we’re doing a hierarchical parse, with authority = host + optional port. With the way RFCs are written to cascade, that’s a huge chunk of use cases right there, so we might want to extract a generic “hierarchical host+port” authority parser.
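
A rough sketch of what a generic host+port splitter could look like. `split-host+port` is a hypothetical helper, not something in net2, and it deliberately ignores userinfo, bracketed IPv6 literals, and the empty authority.

```racket
#lang racket/base

;; Split "host" or "host:port" into two values:
;; the host as a string and the port as a number (or #f when absent).
;; Hypothetical helper; rejects an empty host such as ":80".
(define (split-host+port str)
  (define m (regexp-match #px"^([^:]+)(?:[:]([0-9]+))?$" str))
  (unless m
    (error 'split-host+port "bad authority: ~e" str))
  (values (cadr m)
          (and (caddr m) (string->number (caddr m)))))
```

For the URI above, `(split-host+port "whatever.com:9876")` returns the host string and the port number as two values. Turning the host string into an ip or dns value would be a separate step.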

Poking around, it’s a lot harder to see patterns among non-hierarchical URIs.

mailto:dedbox@gmail.com

tel:8885551212

Things will get interesting if we want to reject mailto:8885551212 or tel:dedbox@gmail.com as syntax errors.

In either case, we’re looking at serious meta-programming.

My favorite kind, actually.
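
For comparison, here is a run-time approximation of that check, with no meta-programming involved. The `validators` table and `valid-uri-string?` are hypothetical, and the patterns are deliberately crude rather than RFC-accurate.

```racket
#lang racket/base

;; Hypothetical per-scheme validators for the scheme-specific part.
;; The patterns are rough approximations, not the real grammars.
(define validators
  (hash "tel"    (lambda (s) (regexp-match? #px"^[+]?[0-9]+$" s))
        "mailto" (lambda (s) (regexp-match? #px"^[^@]+@[^@]+$" s))))

;; Split "scheme:rest" and apply the validator registered for the scheme.
;; Unknown schemes are rejected in this sketch.
(define (valid-uri-string? str)
  (define m (regexp-match #px"^([a-z][a-z0-9+.-]*):(.*)$" str))
  (and m
       (let ([check (hash-ref validators (cadr m) #f)])
         (and check (check (caddr m))))))
```

Doing the same thing at read time, so that a bad literal is an actual syntax error, is where the meta-programming would come in.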

I think the clean feeling of my current syntax comes from the fact that URIs and hostnames really are just strings to me, so I don’t care what net2 does with them. At least, not until I do, at which point I only want certain things, like hostname or port.

So net2/data gives me a convenient way to munge URI data without resorting to string processing.

Concretely, #authority"192.168.1.2:34" is equivalent to (authority #ip4"192.168.1.2" 34)
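
Without reader support, the same equivalence can be sketched with ordinary functions. The structs below are hypothetical stand-ins for the net2 ones (transparent so that `equal?` compares fields), and this `string->authority` only handles an IPv4 host with a mandatory port.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical stand-ins for the net2 structs.
(struct ip4 (octets) #:transparent)
(struct authority (host port) #:transparent)

;; Parse dotted-quad text into an ip4 holding 4 bytes.
(define (string->ip4 str)
  (define parts (map string->number (string-split str ".")))
  (unless (and (= (length parts) 4)
               (andmap (lambda (n) (and (exact-integer? n) (<= 0 n 255)))
                       parts))
    (error 'string->ip4 "bad IPv4 address: ~e" str))
  (ip4 (apply bytes parts)))

;; Parse "a.b.c.d:port" into an authority (IPv4 hosts only in this sketch).
(define (string->authority str)
  (define m (regexp-match #px"^([0-9.]+):([0-9]+)$" str))
  (unless m
    (error 'string->authority "bad authority: ~e" str))
  (authority (string->ip4 (cadr m)) (string->number (caddr m))))

;; The parsed form and the explicitly constructed form compare equal.
(equal? (string->authority "192.168.1.2:34")
        (authority (string->ip4 "192.168.1.2") 34))
```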

I can see the use for that. I’m much less sure about how to get it to work right though so I’d rather not commit to doing it in v1

also do you think both ip address versions ought to be handled with the same tag?

@dedbox Wait. Isn’t the “tag” itself doing the exact same thing as a URI scheme?
#uri"http://whatever.com:9876/abc?def#ghi"
=> #http://whatever.com:9876/abc?def#ghi
#dns"example.com"
=> #dns:example.com

There’s an RFC for representing DNS records as URIs so this isn’t totally without precedent: https://tools.ietf.org/html/rfc4501

It also looks remarkably similar to Clojure’s Extensible Data Notation, see https://github.com/edn-format/edn and https://clojure.org/reference/reader#tagged_literals

(thanks to @samth for triggering this thought in a github issue about #hash() literals)


heh, I read the DNS scheme spec out of curiosity. It wasn’t quite what I expected.

According to RFC 4501:

> A DNS URI designates a DNS resource record set, referenced by domain name, class, type, and, optionally, the authority.

> dnsurl = "dns:" [ "//" dnsauthority "/" ] dnsname ["?" dnsquery]

Kinda confusing that dns:example.com and dns://example.com/... mean such different things

re: edn tagged elements, this is more or less what I had in mind

I imagine net2 would come with some basics (e.g., URIs and their parts), maybe with an extensibility mechanism.

Then protocol implementers can provide their own tagged elements.

@notjack Back to the present, what are you willing to accept for now?

~I am comfortable with hairy syntax-ey stuff.~ I can code up anything we might want to try.

I think only providing string->ip and string->dns is best for v1, with explicit use of an authority constructor required for adding port info. A GitHub issue for exploratory design of literal syntaxes would be good though.

> also do you think both ip address versions ought to be handled with the same tag?

Almost missed that.

Having two distinct structs makes the library simpler by eliminating some structural checks. It also adds complexity by requiring similar checks elsewhere.

It would be about as easy (as what I have now) to use a single ip struct and provide ip4? and ip6? predicates.

Except then every IP would occupy at least 16 bytes.

So maybe keeping them separate gives us a bigger win (in memory) than unifying them (in cycles).

For the API, unifying them simplifies the syntax (one less constructor) but complicates the semantics (more run-time “type” checking, which might not matter down the road in TR).

I think keeping them separate is the safer approach.
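
A sketch of the separate-structs option, assuming each address is stored as a byte string. The definitions are hypothetical, not the actual net2 ones; the point is that each constructor checks only its own length, so an ip4 stays at 4 bytes and an ip6 at 16.

```racket
#lang racket/base

;; Hypothetical separate structs; each guard enforces that type's length.
(struct ip4 (octets)
  #:guard (lambda (octets name)
            (unless (and (bytes? octets) (= (bytes-length octets) 4))
              (error name "expected 4 bytes, got: ~e" octets))
            octets))

(struct ip6 (octets)
  #:guard (lambda (octets name)
            (unless (and (bytes? octets) (= (bytes-length octets) 16))
              (error name "expected 16 bytes, got: ~e" octets))
            octets))

;; A combined predicate is still cheap to provide on top.
(define (ip? v) (or (ip4? v) (ip6? v)))
```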

I agree

There’s a notion of generic “ip literals” in the URI 3xxx spec that might be worth using in the API names

string->ip4, string->ip6, string->ip-literal

Cool. That’ll drop right in.

oh, wait.

An IP literal is just an IPv6address or IPvFuture wrapped in square brackets.

already doing that

Maybe you meant host, which is an ip or reg-name literal.

also doing that, actually

Hmm. Maybe just string->ip which does either v4 or literal

The authority constructor needs to do string->host, or equivalently string->ip-or-reg-name.

which I guess in practice means string->ip-or-dns

I think the authority constructor should require non-string (already parsed) arguments

because schemes may define arbitrary kinds of authorities

Ok, then whatever uses the authority constructor will need to do that.

I guess my point is that it’s easy and I’m already doing it, so we can call it what we want and put it wherever.

a string->authority function maybe?

yes

I’m updating the code now, to try it out

I just undo the constructor trickery and rename the old constructor string->foo

The foo->string functions are already there

:thumbsup:

thanks for filing the literal syntax issue btw

I just found https://tools.ietf.org/html/rfc4367 which I think is very relevant to the dns struct definition

then I’ll read it

looks like I need host->string in axon.

because I don’t always know or care what kind of host I’m working with.

No big deal. I can just extract the code from authority->string

But replacing ~a with foo->string is not always straightforward.

Now I have to choose which printer to use.

It looks like this right now:

(define (host->string host)
  (cond [(ip4? host) (ip4->string host)]
        [(ip6? host) (ip6->string host)]
        [(dns? host) (dns->string host)]))

maybe a net-host? predicate matching ip4?, ip6?, and dns? values?

this also makes me think about ditching reg-name as a super-struct and having a host super-struct that ip, dns, virtual addresses, and whatever else all inherit from

not extensible

sorry, was still talking about foo->string

A host super-struct is not far from the reg-name setup we have now, and it has other benefits

Like keeping ip4 and ip6 types distinct or unified as needed.

yes, and a generic parser would require some sort of generic (string-reg-name "some-string") for parsing authorities of uris with unknown schemes. Making host the abstract super-struct would free reg-name from being a supertype of any kind: it would be a normal host subtype like any other, except it holds an arbitrary string for some unknown non-ip host

Just rename the current reg-name to host and make everything else a sub-struct.
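
Roughly what that hierarchy could look like, as a sketch. The field names are assumptions, ip6 is left out to keep it short, and reg-name here plays the catch-all role for hosts that could not be classified.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical hierarchy: host is an abstract super-struct with no fields.
(struct host ())
(struct ip4 host (octets))      ; 4 bytes
(struct dns host (name))        ; a domain name
(struct reg-name host (text))   ; catch-all for unrecognized hosts

;; One printer that dispatches on the sub-struct, so callers do not
;; need to know which kind of host they are holding.
(define (host->string h)
  (cond [(ip4? h)
         (string-join (map number->string (bytes->list (ip4-octets h))) ".")]
        [(dns? h) (dns-name h)]
        [(reg-name? h) (reg-name-text h)]
        [else (raise-argument-error 'host->string "host?" h)]))
```

Here `host?` does the job of the proposed net-host? predicate for free. The `cond` still needs a new clause for every new sub-struct, so this alone does not fix the extensibility complaint.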

yup

but when a generic string->uri form is added it needs a sub-struct of host to use for authorities it doesn’t recognize

it would just stick the unparsed string it didn’t recognize into that

Ok. I’m using a null-authority instance with ~all~ both fields #f right now. It could be an instance of the “unknown” host sub-type on the empty string instead, and not have that extra #f dangling.

hmm, I think there is no such thing as an authority with a null host

as in, the authority if present always has a host and optionally has a port, but within a URI the entire authority may be absent

uri-authority : Uri -> Maybe Authority
authority-host : Authority -> Host
authority-port : Authority -> Maybe Port

because what would a URI with no host but a present port mean?

http::80/foo ?

But the host part could be anything.

file:///...

I think it can only be anything except the empty string - the entire authority can be empty, but that’s different from the authority being nonempty and the host being empty

The authority part is empty. In one of the many RFCs I’ve read recently, it was pointed out.

file:/// is empty authority - empty host in nonempty authority would be something like file://:80/ I think

Ohh, ok

Well, I really care about empty authority, and the host struct design is a better model than what we have now.

in that case I think places where you do accept an authority you could also accept #f to represent empty authority, instead of authority with two #f fields

it should be impossible for a net2 client to construct an authority struct where the host is empty and the port is present

that’s a useful invariant
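
One way that invariant could be enforced, as a sketch: a struct guard on the constructor, with `#f` standing for an absent authority at the URI level. These definitions are hypothetical, and the host is a plain string here only to keep the example short; in the design discussed above it would be an already-parsed host value.

```racket
#lang racket/base

;; Hypothetical authority struct whose guard rejects an empty host,
;; so "empty host + present port" cannot be constructed.
(struct authority (host port)
  #:transparent
  #:guard (lambda (host port name)
            (unless (and (string? host) (positive? (string-length host)))
              (error name "host must be a nonempty string, got: ~e" host))
            (unless (or (not port)
                        (and (exact-integer? port) (<= 0 port 65535)))
              (error name "port must be #f or in 0-65535, got: ~e" port))
            (values host port)))

;; uri-authority is either an authority or #f (the Maybe Authority above).
(struct uri (scheme authority path) #:transparent)

(define with-authority (uri "http" (authority "whatever.com" 9876) "/abc"))
(define no-authority   (uri "file" #f "/"))  ; file:/// has no authority
```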

Ok that makes sense.