dedbox
2017-11-29 17:41:31

@notjack


dedbox
2017-11-29 17:41:34

> for byte-level serialization protocols not parsing user-written source code


dedbox
2017-11-29 17:42:22

Good catch. I’m not yet actively considering that difference.


dedbox
2017-11-29 17:45:35

Maybe I don’t understand the codecs paragraph in the roadmap, then.


dedbox
2017-11-29 17:45:52

Can you give an example of structured data, in that context?


dedbox
2017-11-29 18:18:46

Wait, you’re talking about Ethernet frames and IP packets as structured data.


notjack
2017-11-29 18:23:07

@dedbox yes - the main difference is that usually these things specify lengths up front and give you other sorts of structure to make parsing not require unbounded time/space


notjack
2017-11-29 18:23:42

and make it easy to throw away unneeded stuff quickly (very little backtracking in the grammars)


dedbox
2017-11-29 18:26:07

Hrm. So I’m pretty sure the OS TCP stack does packet framing for you, doesn’t it?


dedbox
2017-11-29 18:26:48

Are there situations where writing a precise number of bytes is faster?


notjack
2017-11-29 18:30:48

TCP does, yes. But HTTP does framing too. 1.0/1.1 sorta do “line framing” with headers, Content-Length frames the body, and chunked encodings do more typical here’s-a-stream-of-chunks-with-separators framing. HTTP/2 does a lot more stuff like that in order to do things like header compression and stream multiplexing. And a lot of other protocols on top of TCP add their own sorts of framing stuff.


dedbox
2017-11-29 18:35:49

Can we define “frame” or “framing” precisely? Is a frame just a unit of meaning in a byte sequence, or something more specific?


notjack
2017-11-29 18:38:13

ah


notjack
2017-11-29 18:40:05

by “framing” I mean doing this: “I want to send these bytes, but to do that I need to also send some info about how to send them, so I’ll wrap blocks of bytes with a header of some sort and the header will contain the extra information”


notjack
2017-11-29 18:44:17

Ethernet frames wrap data with Ethernet-specific control info (I think a checksum and MAC addressing stuff), IP packets wrap data with IP-specific control info like routing and addressing stuff, TCP wraps data with TCP stuff like session and segment info, HTTP wraps messages with headers and chunking / encoding info, SOAP wraps messages in a SOAP envelope, etc


notjack
2017-11-29 18:45:11

the most common use I’ve noticed is trying to say how much stuff you’re sending so the other side knows how to tell different messages apart and when to stop listening


dedbox
2017-11-29 18:50:41

Ok. So, would we call a length-prefixed string a frame?


notjack
2017-11-29 18:55:06

I think so, yes
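
A minimal Python sketch of a length-prefixed string as a frame. The 4-byte big-endian header and the names `frame` and `unframe` are assumptions for illustration; nothing in the chat fixes a header width.

```python
import struct
from typing import Tuple

# Assumed header: 4-byte big-endian unsigned payload length.
HEADER = struct.Struct(">I")

def frame(payload: bytes) -> bytes:
    """Prefix the payload with its length."""
    return HEADER.pack(len(payload)) + payload

def unframe(buf: bytes) -> Tuple[bytes, bytes]:
    """Split one frame off the front of buf; return (payload, rest)."""
    if len(buf) < HEADER.size:
        raise ValueError("incomplete header")
    (length,) = HEADER.unpack_from(buf)
    end = HEADER.size + length
    if len(buf) < end:
        raise ValueError("incomplete payload")
    return buf[HEADER.size:end], buf[end:]
```

Because the length arrives first, the reader knows exactly how many bytes belong to this message and where the next one starts, without scanning the payload.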


dedbox
2017-11-29 18:58:51

And, one more. Would we say the headers of an HTTP message are framed by the end of the start line and an empty line?


notjack
2017-11-29 19:02:04

That one I’m less sure about, but I think so. The HTTP spec does specify that it’s reasonable for implementations to place limits on the length of a single header line so it’s sort of a size limit. There’s no length prefix in that case though, only an upper bound.


dedbox
2017-11-29 19:12:46

Ok thanks. I’m on the road now and will follow up after lunch time.


dedbox
2017-11-29 22:37:06

If we say a frame is a byte array of (efficiently) computable length, and framing is the act of assembling bytes into frames, then we can make strong performance guarantees on framing codecs.


dedbox
2017-11-29 22:41:08

Then it’s a little easier to think about subclasses of codecs, like length-prefixed-frame, bounded-length-frame, bytes-delimited-frame, etc.


dedbox
2017-11-29 22:43:08

Then an HTTP header line codec can compose (bytes-delimited-frame #"\r\n") with (bounded-length-frame MAX-HEADER-LENGTH)


dedbox
2017-11-29 22:44:50

and I’d be able to derive performance characteristics for the whole thing based on the given framers.
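
A rough Python sketch of that composition, modelling a framer as a function from bytes to bytes. The `compose` helper, the value of `MAX_HEADER_LENGTH`, and the choice to apply the bound before the delimiter is appended are all assumptions.

```python
from typing import Callable

Framer = Callable[[bytes], bytes]

MAX_HEADER_LENGTH = 8192  # assumed limit; the chat leaves the value open

def bytes_delimited_frame(delim: bytes) -> Framer:
    """Framer that terminates the payload with delim."""
    def frame(payload: bytes) -> bytes:
        if delim in payload:
            raise ValueError("payload contains the delimiter")
        return payload + delim
    return frame

def bounded_length_frame(limit: int) -> Framer:
    """Framer that passes the payload through only if it fits the bound."""
    def frame(payload: bytes) -> bytes:
        if len(payload) > limit:
            raise ValueError("payload exceeds the length bound")
        return payload
    return frame

def compose(*framers: Framer) -> Framer:
    """Apply the given framers left to right."""
    def frame(payload: bytes) -> bytes:
        for f in framers:
            payload = f(payload)
        return payload
    return frame

http_header_line_frame = compose(
    bounded_length_frame(MAX_HEADER_LENGTH),
    bytes_delimited_frame(b"\r\n"),
)
```

Each framer here does work linear in the payload length, so the composed framer is linear too, and the bound caps that work at a constant.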


dedbox
2017-11-29 22:52:03

So then we can also say the header is a frame. We could frame it by mapping pair->http-header-line-frame across a headers alist and joining the results. Call that http-header-frame.
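
Sketched in Python, assuming headers arrive as a list of (name, value) string pairs and that the header block ends with an empty line, as in HTTP/1.1. The names `pair_to_http_header_line_frame` and `http_header_frame` are Python spellings of the names above.

```python
from typing import Iterable, Tuple

CRLF = b"\r\n"

def pair_to_http_header_line_frame(pair: Tuple[str, str]) -> bytes:
    """Frame one (name, value) pair as a CRLF-terminated header line."""
    name, value = pair
    return f"{name}: {value}".encode("ascii") + CRLF

def http_header_frame(headers: Iterable[Tuple[str, str]]) -> bytes:
    """Map the line framer across the pairs and join the results.

    The trailing empty line is what delimits the whole header block.
    """
    return b"".join(pair_to_http_header_line_frame(p) for p in headers) + CRLF
```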


dedbox
2017-11-29 22:53:22

Maybe I’m conflating framing and codecs


dedbox
2017-11-29 22:54:26

In that case, we’d need a less general definition of framing.


dedbox
2017-11-29 23:00:16

I don’t know, it feels kinda right. It seems to work in the opposite direction, too.


dedbox
2017-11-29 23:01:14

“Unframing” a length-prefixed frame is easy.


dedbox
2017-11-29 23:03:29

A bounded-length frame is a little trickier. It would need a sub-unframer to work for frames of length less than the bound.


dedbox
2017-11-29 23:03:50

Bytes-delimited frames are also easy.
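
One way the bounded case might look in Python, with the bound wrapping a sub-unframer. The assumption here is that the limit counts bytes on the wire, delimiter included, so the sub-unframer never sees more than `limit` bytes.

```python
from typing import Callable, Tuple

# An unframer takes a buffer and returns (payload, rest of buffer).
Unframer = Callable[[bytes], Tuple[bytes, bytes]]

def bytes_delimited_unframe(delim: bytes) -> Unframer:
    """Unframer that reads up to the first occurrence of delim."""
    def unframe(buf: bytes) -> Tuple[bytes, bytes]:
        payload, sep, rest = buf.partition(delim)
        if not sep:
            raise ValueError("delimiter not found")
        return payload, rest
    return unframe

def bounded_length_unframe(limit: int, sub: Unframer) -> Unframer:
    """Unframer that lets sub look at no more than limit bytes."""
    def unframe(buf: bytes) -> Tuple[bytes, bytes]:
        window, tail = buf[:limit], buf[limit:]
        # sub raises if no complete frame fits inside the window.
        payload, rest = sub(window)
        return payload, rest + tail
    return unframe

unframe_line = bounded_length_unframe(16, bytes_delimited_unframe(b"\r\n"))
```

With a 16-byte bound, `unframe_line(b"Host: a\r\nAccept: b\r\n")` yields `b"Host: a"` and leaves `b"Accept: b\r\n"` for the next call, while a line with no delimiter in its first 16 bytes is rejected without scanning further.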


dedbox
2017-11-29 23:06:25

So given composable and invertible codecs, we might be able to express arbitrarily complex codecs in a declarative style, which means it will be easy to ascribe and check types.


dedbox
2017-11-29 23:12:56

I could also reason about codec composition as if it were function composition.
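
A small Python sketch of that idea: a codec as an encode/decode pair, where composing two codecs composes the encoders in one order and the decoders in the reverse order. The `Codec` class and its `then` method are hypothetical names for illustration.

```python
from typing import Any, Callable

class Codec:
    """An invertible pair of functions: decode(encode(x)) == x."""

    def __init__(self, encode: Callable[[Any], Any], decode: Callable[[Any], Any]):
        self.encode = encode
        self.decode = decode

    def then(self, outer: "Codec") -> "Codec":
        """Encode with self and then outer; decode in the reverse order."""
        return Codec(
            encode=lambda x: outer.encode(self.encode(x)),
            decode=lambda y: self.decode(outer.decode(y)),
        )

def _strip_crlf(data: bytes) -> bytes:
    if not data.endswith(b"\r\n"):
        raise ValueError("missing CRLF terminator")
    return data[:-2]

utf8 = Codec(lambda s: s.encode("utf-8"), lambda b: b.decode("utf-8"))
crlf = Codec(lambda b: b + b"\r\n", _strip_crlf)

# str -> bytes -> CRLF-terminated bytes, and back again.
text_line = utf8.then(crlf)
```

Since `then` is ordinary function composition on both halves, it is associative, and a composite stays invertible as long as each part is.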