Racket Slack Archive

pocmatos

2019-9-25 07:39:21

Currently, according to https://docs.racket-lang.org/reference/characters.html and my tests, we do not support surrogate pairs in Unicode. Why is this? Has nobody implemented support yet, does the underlying Unicode library we use not support them, or was it a deliberate decision in Racket’s design?

pocmatos

2019-9-25 07:40:19

So I cannot, for example, create a string in Racket with U+1D400 MATHEMATICAL BOLD CAPITAL A

pocmatos

2019-9-25 07:40:55

which uses the surrogate pair #\uD835 #\uDC00

pocmatos

2019-9-25 07:41:15

the reader chokes at #\ud835

pocmatos

2019-9-25 07:41:41

𝐀 - it seems that slack supports them. :slightly_smiling_face:

lexi.lambda

2019-9-25 08:03:39

@pocmatos Surrogate pairs are fundamentally a part of the UTF-16 encoding scheme: the code points reserved for them are not valid Unicode scalar values, so they never denote characters on their own. Racket does not use UTF-16, so there is no reason to use surrogate pairs; use the actual code point the surrogate pair would encode instead, in this case #\U1D400.
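
As an illustration only (written in Python rather than Racket, and not part of the original conversation), the arithmetic that relates a code point above U+FFFF to its UTF-16 surrogate pair can be sketched as follows. The helper names `to_surrogate_pair` and `from_surrogate_pair` are hypothetical.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits go into the low surrogate
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a high/low surrogate pair into the code point it encodes."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid high/low surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1D400)
print(f"U+{high:04X} U+{low:04X}")                    # U+D835 U+DC00
print(f"U+{from_surrogate_pair(high, low):X}")        # U+1D400
```

The pair only has meaning inside a UTF-16 byte stream; in a string of code points, the single value U+1D400 is the character.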

lexi.lambda

2019-9-25 08:05:31

Racket is correct to reject #\uD835 because U+D835 is not a valid character.

lexi.lambda

2019-9-25 08:11:28

If you need to read or write UTF-16 encoded strings, use bytes-open-converter: https://docs.racket-lang.org/reference/bytestrings.html#%28def._%28%28quote._~23~25kernel%29._bytes-open-converter%29%29 But a Racket character, in the char? sense, always represents exactly one Unicode code point, and a surrogate is only half of the pair that encodes one.

pocmatos

2019-9-25 08:15:49

Ah, of course, Racket uses UTF-8 - my bad. :slightly_smiling_face: Trying to fix a JSC Unicode bug tempted me to try it in Racket, but I forgot the encodings were different.

lexi.lambda

2019-9-25 08:20:45

Racket actually uses UCS-4/UTF-32 internally, but that’s an implementation detail; it could use UTF-16 and still preserve the current interface (though there would be no reason to do so).

pocmatos

2019-9-25 08:22:15

Which libraries do we use to support these?

lexi.lambda

2019-9-25 08:25:23

We don’t; Racket ships its own support for Unicode, generated from the official Unicode data files.

lexi.lambda

2019-9-25 08:26:51

That also means Racket’s support for operations on strings is fairly limited. There is no way to do string normalization or to calculate the number of discrete, renderable glyphs in a string, for example. You only get code points.

pocmatos

2019-9-25 08:27:58

Oh! :slightly_smiling_face: Interesting - certainly something that could be improved if people wanted. But generating stuff directly from the Unicode data files is cool.

lexi.lambda

2019-9-25 08:30:52

I think ICU bindings are usually shipped as separate packages in most ecosystems, mostly because there’s a lot of API complexity there that most code doesn’t need, anyway. But I don’t think any such bindings currently exist for Racket.

pocmatos

2019-9-25 08:49:15

Thanks.

samth

2019-9-25 11:27:32

@lexi.lambda Racket does have normalization procedures, see string-normalize-nfc et al

spdegabrielle

2019-9-25 14:31:54

@samth I think there is

https://github.com/eu90h/racket-github-api

I’ve collected the other OAuth efforts:

https://github.com/racket/racket/wiki/Web-Development#auth-tools

JWT is in there too because it is used in OpenID Connect

@soegaard2

samth

2019-9-25 14:41:41

That library is for using the GitHub API; I’m talking about using GitHub OAuth login as a login service for another site

spdegabrielle

2019-9-25 14:51:28

I see, you mean this: https://developer.github.com/apps/building-oauth-apps/authorizing-oauth-apps/#web-application-flow

lexi.lambda

2019-9-25 20:06:15

@samth Thanks, I didn’t know about those (or perhaps forgot about them). I think the broader point is still relevant, though: no sophisticated locale handling, no collation, no support for accessing grapheme clusters, etc.
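
To make the distinction concrete, here is a small Python sketch (illustrative only, not from the original conversation). Python's `unicodedata.normalize` stands in for the role that `string-normalize-nfc` plays in Racket. It shows that normalization changes the number of code points, and that a code point count is still not a count of user-perceived characters.

```python
import unicodedata

decomposed = "e\u0301"                                # 'e' + U+0301 COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)   # single code point U+00E9

print(len(decomposed))            # 2 code points
print(len(composed))              # 1 code point
print(decomposed == composed)     # False: same rendered text, different code points

# Two regional indicator symbols that are usually rendered as one flag.
# NFC leaves them alone, so counting code points still yields 2.
flag = "\U0001F1EF\U0001F1F5"
print(len(unicodedata.normalize("NFC", flag)))   # 2
```

Counting the flag as one unit requires grapheme cluster segmentation, which is the kind of facility an ICU-style library provides.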

samth

2019-9-25 20:17:28

Yes, I agree

soegaard2

2019-9-25 20:30:39

What would be the best way to integrate “Log in with GitHub”?

Option 1: Every user creates a standard account with name + password. Once logged in, the user can link a GitHub account (by signing in with GitHub). We now have both a racket-stories username and a GitHub username. If the user logs out, they can log in later with GitHub directly.

Option 2: A user can create an account by logging in with GitHub. The GitHub username will also become their racket-stories username. Problem: What if another non-GitHub user already has the username in question?

Option 3: ?

ruyvalle

2019-9-25 21:24:36

Option 3: A user can create an account by logging in with GitHub. Their password will stay the same by default and they will have to pick a new username. The new username can be the same as the GitHub username and will only be given to them if it’s not already taken.

samdphillips

2019-9-25 21:32:38

There is a good reason many services just use email addresses as usernames: uniqueness.

lexi.lambda

2019-9-25 21:57:16

The best way to handle user accounts is to separate three distinct concepts that are often muddled: user id, authentication, and display name. Use a synthetic, randomly generated, immutable id for user identity, and let users pick whatever string they want for their display name and change it whenever they like. Associate one or more authentication methods with each user (an email + password combination, an OAuth provider like Google or GitHub, or something else) and let authenticated users alter those at will.
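
One way this separation might look, as a minimal in-memory Python sketch (not from the original conversation). `AuthMethod`, `User`, `UserStore`, and all field names are hypothetical, and a real site would persist this in a database and never store raw credentials.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AuthMethod:
    provider: str   # e.g. "password", "github", "google"
    subject: str    # identifier at that provider (email, provider account id, ...)


@dataclass
class User:
    display_name: str                                   # free-form, changeable
    auth_methods: list = field(default_factory=list)    # one or more AuthMethod
    # Synthetic random id; treated as immutable by convention after creation.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class UserStore:
    def __init__(self):
        self._by_id = {}
        self._by_auth = {}   # AuthMethod -> user id

    def create(self, display_name, auth):
        if auth in self._by_auth:
            raise ValueError("authentication method already linked to a user")
        user = User(display_name=display_name, auth_methods=[auth])
        self._by_id[user.id] = user
        self._by_auth[auth] = user.id
        return user

    def link(self, user_id, auth):
        """Attach another authentication method to an existing user."""
        if auth in self._by_auth:
            raise ValueError("authentication method already linked to a user")
        self._by_id[user_id].auth_methods.append(auth)
        self._by_auth[auth] = user_id

    def find_by_auth(self, auth):
        user_id = self._by_auth.get(auth)
        return self._by_id.get(user_id)


store = UserStore()
alice = store.create("alice", AuthMethod("password", "alice@example.com"))
store.link(alice.id, AuthMethod("github", "12345"))
alice.display_name = "Alice L."   # renaming does not affect identity or login
print(store.find_by_auth(AuthMethod("github", "12345")).display_name)
```

Because display names are not identifiers in this sketch, two users may share one, which sidesteps the username collision raised under Option 2.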