Racket Slack Archive

hectometrocuadrado

2020-12-28 09:36:12

Using FFI, how can i convert a _cpointer value (It is a char* value) to a Racket object like a vector, string or list?

laurent.orseau

2020-12-28 09:44:08

There are converters such as make-cvector*: https://docs.racket-lang.org/foreign/foreign_cvector.html?q=Cvector#%28def._%28%28lib._ffi%2Funsafe%2Fcvector..rkt%29._make-cvector%2A%29%29

laurent.orseau

2020-12-28 09:45:31

Then cvector->list brings you in the racket world if needed

hectometrocuadrado

2020-12-28 09:46:06

Thank you!

hectometrocuadrado

2020-12-28 09:46:13

And another question

hectometrocuadrado

2020-12-28 09:47:24

If I don't know the length of the char*, do I need to write a function that searches for the NUL terminator?

hectometrocuadrado

2020-12-28 09:47:42

Or is there an easier way?

laurent.orseau

2020-12-28 09:51:03

You need to know the length if you want to use it as a cvector indeed

laurent.orseau

2020-12-28 09:51:47

Often this length is stored somewhere in the C code though
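
A minimal sketch of both cases discussed above, assuming the ffi/unsafe and ffi/unsafe/cvector libraries. The pointer p is a stand-in allocated here for illustration (a real one would come from the C library), and c-strlen is a hypothetical helper name, not part of the FFI.

```racket
#lang racket/base
(require ffi/unsafe ffi/unsafe/cvector)

;; Stand-in for a char* returned from C: 6 raw bytes holding "hello\0".
(define p (malloc 6 'raw))
(memcpy p #"hello\0" 6)

;; Known length: view the memory as a cvector, then copy it into a list.
(define cv (make-cvector* p _byte 5))
(define as-list (cvector->list cv)) ; list of byte values

;; Unknown length: scan for the terminating NUL byte.
(define (c-strlen ptr)
  (let loop ([i 0])
    (if (zero? (ptr-ref ptr _byte i))
        i
        (loop (add1 i)))))

;; Copy the bytes before the NUL into Racket-managed memory, then decode.
(define n (c-strlen p))
(define as-bytes (make-bytes n))
(memcpy as-bytes p n)
(define as-string (bytes->string/utf-8 as-bytes))

;; The raw allocation is not garbage-collected, so release it explicitly.
(free p)
```

Copying into a Racket byte string before freeing keeps the result valid after the C memory goes away.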

anything

2020-12-28 15:12:10

At my job, whenever we release a new website, someone in the team compares the new and the old website and creates a mapping of URLs from the old to the new so that someone else can create redirections from the old URLs to the new URLs. Assuming I have a list of all relevant old URLs and another list of all new URLs, I want to write a program to create this mapping. Python offers libraries such as https://spacy.io/usage/vectors-similarity and I believe this is what I must do: find the similarity between texts and choose the best one. I shouldn’t compare HTML because the new website will be completely different. But perhaps I can read the visible text on the pages (perhaps apply some transformations) and then see which ones are more similar. I’m not sure how best to do this. My question is: what is available out there in Racket for this, and do you have any strategy recommendations to suggest? Thanks!

soegaard2

2020-12-28 15:24:42

@anything Maybe you can use the Levenshtein distance? https://docs.racket-lang.org/levenshtein/index.html

badkins

2020-12-28 15:25:45

You’ll probably want to compare the URLs themselves also. For example, maybe a simple page, such as the login page, has very different text/layout (between old & new sites), but the URLs are the same, e.g. https://example.com/login

anything

2020-12-28 15:34:12

The Levenshtein distance is polynomial in the input size, so that’s good! I’m going to give it a try. Even if it’s slow, that’s fine because we’d run this just once per website release. But the result should be good, otherwise it’s better to do it by hand. @badkins, that’s a very good point. I guess I’m going to have to experiment with that. Maybe add scores, or choose the maximum between URL and page text or something like that.
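
A rough sketch of the idea, assuming only the URL paths are compared. The edit distance here is a hand-written dynamic-programming version rather than the API of the levenshtein package linked above, and the names edit-distance and suggest-mapping are made up for illustration.

```racket
#lang racket/base
(require racket/list)

;; Classic dynamic-programming edit distance, keeping one row at a time.
(define (edit-distance a b)
  (define n (string-length b))
  (for/fold ([prev (build-vector (add1 n) values)]
             #:result (vector-ref prev n))
            ([ca (in-string a)]
             [i (in-naturals 1)])
    ;; Column 0 of the new row is i: delete the first i characters of a.
    (define cur (make-vector (add1 n) i))
    (for ([cb (in-string b)]
          [j (in-naturals 1)])
      (vector-set! cur j
                   (min (add1 (vector-ref prev j))        ; deletion
                        (add1 (vector-ref cur (sub1 j)))  ; insertion
                        (+ (vector-ref prev (sub1 j))     ; substitution
                           (if (char=? ca cb) 0 1)))))
    cur))

;; For each old path, suggest the new path with the smallest distance.
(define (suggest-mapping old-paths new-paths)
  (for/list ([old (in-list old-paths)])
    (cons old
          (argmin (lambda (new) (edit-distance old new)) new-paths))))
```

For example, (suggest-mapping '("/about-us" "/contact") '("/contact-us" "/about")) pairs "/about-us" with "/about" and "/contact" with "/contact-us". A human would still review the suggestions, and the distance itself could be reported as a score.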

soegaard2

2020-12-28 15:34:24

Do we have a render html to text-thingie somewhere?

badkins

2020-12-28 15:36:48

Solving this generally is probably harder than solving it for your organization in particular where the types of changes to URLs and pages are constrained somewhat, so maybe adding some heuristics that consider the type of changes expected may help.

badkins

2020-12-28 15:37:54

Phase 1 may be to identify the URLs and pages that are (nearly) identical, and remove them from the mix.

badkins

2020-12-28 15:38:24

And that could support a manual approach if you can’t fully automate it - at least the humans wouldn’t have to look at everything.

anything

2020-12-28 15:41:15

@soegaard2, the first thing that came to mind was to just remove all tags with a regular expression and consider the resulting string as the text, removing extra spaces, say. @badkins, URLs will never be identical I think — considering only the path part. They might share one or two words, though. But, yes, there is no ambition here to fully automate it. The idea is just to help the human that will have the final say. I’ll build just the suggested mapping. Because the human in charge knows both websites, just by looking at the URLs, s/he will know whether that’s right or wrong.
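
A minimal sketch of that tag-stripping idea, assuming the page is already available as a string. The name html->rough-text is hypothetical, and the regular expression is deliberately naive: it does not handle script or style contents, comments, or character entities.

```racket
#lang racket/base
(require racket/string)

;; Replace every tag with a space, then collapse and trim whitespace.
(define (html->rough-text html)
  (define no-tags (regexp-replace* #px"<[^>]*>" html " "))
  (string-normalize-spaces no-tags))
```

Replacing tags with a space rather than the empty string keeps words from adjacent elements from running together.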

badkins

2020-12-28 15:43:00

@anything by identical, I was referring to comparing the URL & page on the old & new sites i.e. pages where nothing has changed

badkins

2020-12-28 15:43:41

another way to put it - identify all the URLs that don’t need a redirect :)

anything

2020-12-28 15:46:11

This will never happen. All pages are always different (in my situation). There are pages we won’t redirect — some users will just get 404, file not found. So if my program is able to compute a score of best similarity and this score is too low, we might ignore that mapping. But there will be no page that would share the same exact HTTP path such that it needs no redirection. In 100% of our cases, the whole thing is different. (Or perhaps I haven’t understood what you’re saying at all. :slightly_smiling_face: )

soegaard2

2020-12-28 15:47:12

Found this! (But the code is old…) https://github.com/soegaard/little-helper/blob/master/html-to-txt.rkt

anything

2020-12-28 15:48:03

You’ve done it! Let me try it! Thanks great! Thank you so much!

badkins

2020-12-28 15:49:09

@anything now I’m curious :) Do you have a link?

anything

2020-12-28 15:49:59

Link to a website example? I probably wouldn’t have one that would include an old website right now, but let me ask my team. We might have one.

badkins

2020-12-28 15:51:00

I assumed you were referring to a large website that has a fraction of pages modified when the new one is released, so I’m trying to envision a site where everything changes when released!

anything

2020-12-28 15:56:04

Oh, I see what you’re saying. But, no, what we do is a radically new website in which everything is changed. And, by the way, these are never big websites. Everything is small. We take clients with a, say, wordpress website and build it from scratch using totally different back and frontend solutions, with new content (more or less), new images, new everything. Clients ask that we forward people from old-important URLs to new URLs that take the same function as that old one. (Just so we don’t give them a file not found error if they try. But unimportant pages are fine to just blow a file not found error.)

soegaard2

2020-12-28 15:57:24

I think, there is a good chance you could find an existing tool. But it is less fun :slightly_smiling_face:

anything

2020-12-28 15:58:38

Now I’ll search deeper. :slightly_smiling_face: I could use one… At least while I work on a Racket solution. At least I can decrease our release-time. The person that does this reports that it can take one or two days to get this done.

anything

2020-12-28 15:59:30

@badkins Here’s an example. Old: http://www.stranahanfoundation.org/ New: http://asoft200183.accrisoft.com/stranahan

You can assume that the new website would run under the old’s domain. (So the HTTP path will not begin with, say, /stranahan.)

anything

2020-12-28 16:03:00

Oops. There are some cases where the URLs are exactly the same, actually. Here’s the first: https://www.stranahanfoundation.org/about-us/history-purpose/ http://asoft200183.accrisoft.com/stranahan/about-us/history-purpose/

anything

2020-12-28 16:04:39

In this example, it turns out there are probably many such pages. I had never seen this website. I’ve seen ones in which I didn’t find a single page. What I am just learning now is that if the page is static, then we would actually try to keep the URL exactly equal. But sometimes we turn static websites into dynamic ones and there are no such static pages and the URL ends up completely different.

massung

2020-12-28 16:06:31

I know I’m later to the party on URL redirect mapping, but - based on the URLs you’re mapping - I imagine the easiest way to do this would be to map route subpaths instead of the full URLs. For example, using a tree-like structure…

old: foo.com/get/user/<name>
new: foo.com/lookup/user/by/name/<name>
'("/get" ("/lookup" ("/user" ("/user/by/name" *))))

old: foo.com
new: bar.org/place
'("/" ("/place"))

Now, for any request you just break up the path (split by /) and then use assoc calls recursively until you reach the end or a * and have the full path remapped.

Of course, you may need to handle query parameters and anchors special-cased. But, this way you can be very specific about what’s being mapped. This structure can even be stored in a DB pretty easily.

Per the example you just gave, do the domain stuff at the DNS level, don’t try and do that on your site.

Edit: unless you want the old domain subpath to route to a completely different site, in which case the above will still work with a little extra effort.
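
One way the recursive assoc lookup could be sketched. The tree shape below is an assumption chosen for illustration and is not exactly the structure written above: each node is (old-segment new-prefix child ...), and the symbol * among the children means the remaining segments are appended unchanged. The names routes, remap-segments and remap are hypothetical.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical route tree: /get/user/<name> -> /lookup/user/by/name/<name>
(define routes
  '(("get" "/lookup"
     ("user" "/user/by/name" *))))

;; Returns the remapped path as a string, or #f when no route matches.
(define (remap-segments segs nodes)
  (cond
    ;; Wildcard: copy whatever is left of the old path.
    [(memq '* nodes)
     (string-append* (for/list ([s (in-list segs)]) (string-append "/" s)))]
    ;; Old path ended exactly at this node.
    [(null? segs) ""]
    [else
     (define node (assoc (car segs) (filter pair? nodes)))
     (and node
          (let ([tail (remap-segments (cdr segs) (cddr node))])
            (and tail (string-append (cadr node) tail))))]))

(define (remap path)
  (remap-segments (string-split path "/") routes))
```

With this table, (remap "/get/user/alice") gives "/lookup/user/by/name/alice", and a path with no matching route gives #f. Query parameters and anchors would still need separate handling, as noted above.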

anything

2020-12-28 16:06:48

It turns out this example is not too good, because if I just compare URLs here, with say the longest common subsequence algorithm, I’ll probably get most pages correctly mapped. I’m going to ask them for a different example.

soegaard2

2020-12-28 16:08:08

Found this - haven’t tried it. https://copyleaks.com/compare-two-websites

massung

2020-12-28 16:10:19

I highly recommend against trying to do something “fuzzy” to avoid direct mapping. It will fail, and then a customer/user is going to complain, and fixing that edge case is going to cause nothing but headaches down the road. You’ll be lucky if fixing one of those doesn’t break 3 other cases. :wink:

anything

2020-12-28 16:12:24

@massung, I’m still trying to understand your approach above, but yes I’m actually trying not to do anything at all. With @soegaard2 suggestions right now, if some tool can do half the job here, we’ll consider it done. We’re still in the phase of understanding the problem. We’re still building examples.

anything

2020-12-28 16:14:09

My team is trying to build me an example, but they just remarked that I might have to use my hosts file to see the old website. So that wouldn’t let you guys easily see old and new websites. We are still trying to understand the problem.

massung

2020-12-28 16:15:18

BTW, I went through something similar very recently. http://type2diabetesgenetics.org needed to be redirected to http://t2d.hugeamp.org and the paths to the various pages changed completely. But the old URLs are referenced in published papers and still needed to work. Example:

Old: http://type2diabetesgenetics.org/gene/geneInfo/SLC30A8
New: http://t2d.hugeamp.org/region?gene=SLC30A8

We did the domain mapping at the DNS level, and the site has a redirects table in a DB where we use a template-like string to do the mapping. Example:

"/gene/geneInfo/{gene}" -> "/region?gene={gene}"

massung

2020-12-28 16:17:00

There weren’t tons of them (maybe 100?), so we don’t actually query the table. The web server just does a SELECT * to load the entire thing into memory at startup and then on every request does a quick check to see if it should be redirected.
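
A small sketch of how such template strings could be matched against an in-memory table. This is not the actual server code described above; the segment-by-segment matching and the names placeholder?, match-template, fill-template and redirect are assumptions made for illustration.

```racket
#lang racket/base
(require racket/string)

;; A segment such as "{gene}" is a placeholder.
(define (placeholder? seg)
  (regexp-match? #px"^\\{[^}]+\\}$" seg))

;; Returns an association list of placeholder -> value, or #f on mismatch.
(define (match-template template path)
  (define t-segs (string-split template "/"))
  (define p-segs (string-split path "/"))
  (and (= (length t-segs) (length p-segs))
       (for/fold ([binds '()])
                 ([t (in-list t-segs)]
                  [p (in-list p-segs)])
         #:break (not binds)
         (cond [(placeholder? t) (cons (cons t p) binds)]
               [(string=? t p) binds]
               [else #f]))))

;; Substitute each captured value into the target template.
(define (fill-template template binds)
  (for/fold ([out template]) ([b (in-list binds)])
    (string-replace out (car b) (cdr b))))

;; table is a list of (old-template . new-template) pairs.
(define (redirect table path)
  (for/or ([rule (in-list table)])
    (define binds (match-template (car rule) path))
    (and binds (fill-template (cdr rule) binds))))
```

For instance, with the table '(("/gene/geneInfo/{gene}" . "/region?gene={gene}")), the path "/gene/geneInfo/SLC30A8" maps to "/region?gene=SLC30A8", and an unknown path yields #f so the request can proceed normally.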

anything

2020-12-28 16:17:49

Lol. They’re asking me for $989 for comparing that stranahan website. :-)

massung

2020-12-28 16:19:42

Anyway, I hope at least some of the ideas may help.

anything

2020-12-28 16:20:26

@massung, the mapping will end up being added to Apache’s .htaccess. I have no possibility of extending the webserver’s intelligence. I also can’t do anything at the DNS level because our DNS servers are these standard services such as GoDaddy’s and similar ones.

massung

2020-12-28 16:21:25

> I also can’t do anything at the DNS level because our DNS servers are these standard services such as GoDaddy’s and similar ones.

All of those services offer DNS mapping changes (adding A and CNAME records) or just doing simple redirects.

massung

2020-12-28 16:21:51

https://www.godaddy.com/help/add-a-cname-record-19236

massung

2020-12-28 16:22:31

If you don’t own the domain or have access, I get that, though.

massung

2020-12-28 16:23:41

I don’t have experience w/ Apache .htaccess stuff, though. Sorry I won’t be much help there. :disappointed:

soegaard2

2020-12-28 16:26:37

Yikes! Maybe you have found a side business here?

anything

2020-12-28 16:26:53

Thanks! I can see GoDaddy would do a CNAME that seems to forward the user to a certain page, but I don’t know how that would solve the problem. The problem seems harder that way. The Apache solution (as far as we know how to use it) would basically just forward A to B with no rewriting. I know Apache can do much more, but it’s not like we have a specific rule to give to it. So we will use just the most basic form of redirections.

anything

2020-12-28 16:28:04

Hey… Not a bad idea! :slightly_smiling_face: I like it.

massung

2020-12-28 16:28:55

Yeah, a CNAME record wouldn’t. It was just an example of how you do have access to muck with the DNS and have domains rerouted before even getting to the server.

Per your example, though, it looks like you still want the server up and running, just some (all?) pages end up re-routing to other known locations instead. So DNS likely wouldn’t do much for you.

NGINX is what I would be using for much of that kind of work. But it’s a pain at times. The .htaccess might be much easier to work with.

anything

2020-12-28 16:33:00

I couldn’t use nginx either: all our websites run on Accrisoft Freedom, which is Apache with PHP on top. The new IP address will point to their server and we can use .htaccess to configure some stuff with it, but not much more than that. I couldn’t use mod_rewrite say. (That’s Apache’s module for URL rewriting.) If I could run nginx, I could run my own Racket web server, say, and then apply whatever rules I’d like to, but I wouldn’t even know which rules to apply anyway. This is a problem we go through with most of our websites. That stranahan example happens to be a website we haven’t released yet and so it’s the only example we have. (Sadly, it’s not a good example.)

soegaard2

2020-12-28 16:34:54

https://github.com/TeamHG-Memex/page-compare

anything

2020-12-28 16:35:10

Just to close this chapter and thank you all for the insights. The second example they gave me doesn’t work. This is the new website: https://www.prestanproducts.com/ But for the older one they told me to point the hostname to IP 67.222.59.54, and when I did I saw the older website was taken down. So that’s a no-example.

anything

2020-12-28 16:39:57

That looks like the best thing out there right now. Thanks very much! I’ll try it right now.

ben.knoble

2020-12-28 17:11:51

Just about to start the Macros chapter in the Racket Guide. Wrote this after much experimentation and would love some feedback. It provides a for/enumerate form that actually emulates any for form (including all the / variants) except for* while providing an anaphoric i that tracks the index in the sequence (like Python’s enumerate). I don’t know that it serves any practical purpose (one can always add the [i (in-naturals)] clauses manually), but it was an interesting experience. I found myself needing to write a “low level” macro to get around hygiene. Is that the only way to do anaphoric macros? I’m not even sure if with-syntax or similar would work here. Also, match with quasiquote was harder than I anticipated; needing to unquote the underscore was surprising.

(require (for-syntax racket/match racket/syntax))

(define-syntax (for/enumerate stx)
  (match (syntax->datum stx)
    [`(,_ (,clause ...) ,body ..1)
     (datum->syntax stx `(for ([i (in-naturals)] ,@clause) ,@body))]
    ; handle fold first because ,for-type matches anything
    [`(,_ /fold (,acc ..1) (,clause ...) ,body ..1)
     (datum->syntax stx `(for/fold (,@acc) ([i (in-naturals)] ,@clause) ,@body))]
    [`(,_ ,for-type (,clause ...) ,body ..1)
     (when (eq? for-type '*)
       (raise-syntax-error #f "type cannot be * because of (in-naturals)" stx #'*))
     (let ([for-name (format-id stx "for~a" for-type)])
       (datum->syntax stx `(,for-name ([i (in-naturals)] ,@clause) ,@body)))]))

ben.knoble

2020-12-28 17:12:32

Some samples based on the iterations chapter:

(for/enumerate ([book '("Guide" "Reference" "Notes")]
                #:when (not (equal? book "Notes"))
                [chapter '("Intro" "Details" "Conclusion" "Index")]
                #:when (not (equal? chapter "Index")))
  (printf "~a Chapter ~a. ~a\n" book (add1 i) chapter))

(for/enumerate /list ([holiday '("Christmas" "New Year's Eve")])
  (format "~a. ~a" (add1 i) holiday))

(for/enumerate /and ([num '(1 2 3)])
  (= i (sub1 num)))

(for/enumerate /fold ([len 0]) ([chapter '("Intro" "Conclusion")])
  (+ len i (string-length chapter)))

anything

2020-12-28 17:46:57

For the record, the tool https://github.com/TeamHG-Memex/page-compare seems to work at least in the one easy example I have currently available, the stranahan website. The tool does not scrape your old and new website to generate URLs. You need to specify the URLs yourself. You have to find a way to save all the pages (of both websites between which you want to create the URL matching) in a certain directory, then ask it to pairwise compare all HTML pages in that directory. The tool then builds a JSON file containing answers such as

{
  "path1": "stranahan/stra-new-1.html",
  "path2": "stranahan/stra-old-1.html",
  "similarity": 36.84210526315789
}

So to finish the job, we’d read this JSON file and make decisions based on the similarity number. So my problem now is actually generating the URLs. The hard part is done. The URL generation can be a human-aided job using a crawler, which I can build with other tools here. Thanks @soegaard2 for having found the nice tool!
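
A sketch of that last step using Racket's json library. It assumes the tool's output is a JSON array of objects shaped like the one quoted above; the sample string, the threshold values and the name confident-pairs are illustrative only.

```racket
#lang racket/base
(require json)

;; Assumed shape of the results file: an array of comparison objects.
(define sample
  (string-append
   "[{\"path1\": \"stranahan/stra-new-1.html\","
   "  \"path2\": \"stranahan/stra-old-1.html\","
   "  \"similarity\": 36.84210526315789}]"))

;; Keep only (old . new) pairs whose similarity reaches the threshold.
(define (confident-pairs results threshold)
  (for/list ([r (in-list results)]
             #:when (>= (hash-ref r 'similarity) threshold))
    (cons (hash-ref r 'path2) (hash-ref r 'path1))))

(define results (string->jsexpr sample))
```

Pairs below the threshold can be left out of the suggested mapping, which matches the earlier plan of letting low-scoring pages fall through to a 404.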

soegaard2

2020-12-28 17:50:31

@ben.knoble I like your example! If the goal is to write a low level macro without syntax-case or syntax-parse then this is quite nice.

For a one-to-one comparison, I converted your example to syntax-parse. They are very similar.

(require (for-syntax racket/base racket/syntax syntax/parse))

(define-syntax (for/enumerate stx)
  (with-syntax ([i (datum->syntax stx 'i)])
    (syntax-parse stx
      #:literals (*)
      #:datum-literals (/fold)
      [(_ (clause ...) body ...+)
       (syntax/loc stx
         (for ([i (in-naturals)] clause ...) body ...))]
      ; handle fold first because for-type matches anything
      [(_ /fold (acc ...+) (clause ...) body ...+)
       (syntax/loc stx
         (for/fold (acc ...) ([i (in-naturals)] clause ...) body ...))]
      [(_ * (clause ...) body ...+)
       (raise-syntax-error #f "type cannot be * because of (in-naturals)" stx #'*)]
      [(_ for-type (clause ...) body ...+)
       (with-syntax ([for-name (format-id stx "for~a" #'for-type)])
         (syntax/loc stx
           (for-name ([i (in-naturals)] clause ...) body ...)))])))

anything

2020-12-28 17:51:17

Just a newbie looking at this — I would prefer letting the user specify the name i instead of assuming it is i automatically. Because when we look at the samples, we don’t know where i comes from. I would think it’s a variable defined elsewhere.

ben.knoble

2020-12-28 17:51:28

Hmm, that’s nice to see @soegaard2, thanks!

ben.knoble

2020-12-28 17:53:07

@anything it’s called an “anaphoric” macro—you can actually use anaphoric macros to add OO systems on top of lisps (imagine defining a method on a class and having access to a magic this variable—same concept). Point is, the goal was to introduce a magic variable :slightly_smiling_face:

anything

2020-12-28 17:54:13

Lol. Got ya! I didn’t/don’t know what an anaphoric macro is! Thanks for introducing it to me!

soegaard2

2020-12-28 17:55:30

It’s often tricky to get these magic variables to work outside the module in which they are defined.

soegaard2

2020-12-28 17:56:12

To test it: provide for/enumerate and then use it from another module.
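
A minimal sketch of that cross-module test using submodules. To keep it short it uses a reduced, hypothetical variant of the macro (only the list-building case, named for/enumerate-list here) written with syntax-case; the module names enum and user are also made up.

```racket
#lang racket/base

;; Module that defines and provides the macro.
(module enum racket/base
  (require (for-syntax racket/base))
  (provide for/enumerate-list)
  (define-syntax (for/enumerate-list stx)
    (syntax-case stx ()
      [(_ (clause ...) body ...)
       ;; Give i the lexical context of the use site so the body can see it.
       (with-syntax ([i (datum->syntax stx 'i)])
         #'(for/list ([i (in-naturals)] clause ...) body ...))])))

;; A separate module that only sees what enum provides.
(module user racket/base
  (require (submod ".." enum))
  (provide result)
  (define result
    (for/enumerate-list ([x '(a b c)])
      (cons i x))))

(require 'user)
```

Here result should be '((0 . a) (1 . b) (2 . c)), which shows the magic i surviving the module boundary for a direct use. It does not cover the harder case of another macro expanding into a use of i.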

soegaard2

2020-12-28 17:56:45

Another tricky thing: Have another macro that expands into a use of a magic variable.

ben.knoble

2020-12-28 17:57:27

hmmm. good point. I tried require-ing it at a repl after adding (provide for/enumerate) to the module, and at least the for/enumerate /and example worked. As for using it in another macro, tbd

soegaard2

2020-12-28 17:58:50

That’s another thing - working from the repl isn’t always the same as working from another module. Often it is, but once in a while…

samdphillips

2020-12-28 18:05:18

Racket Users Video Meetup :telephone: :tv: Saturday 2021/01/09 :: 8pm CET / 7pm UK / 11am Pacific https://gather.town/app/wH1EDG3McffLjrs0/racket-users

Agenda
- What have you been working on?
- Paper for discussion:
> Macros for Domain-Specific Languages by MICHAEL BALLANTYNE, ALEXIS KING & MATTHIAS FELLEISEN
> https://2020.splashcon.org/details/splash-2020-oopsla/105/Macros-for-Domain-Specific-Languages

Kind regards, Sam Phillips & Stephen De Gabrielle

soegaard2

2020-12-28 18:40:13

@samdphillips Put it on http://racket-stories.com !

anything

2020-12-28 18:42:08

Why are compiled files’ extension named ".zo"?

soegaard2

2020-12-28 18:43:21

That’s a good question. It’s short for “zodiac”. I think Shriram Krishnamurthi (https://cs.brown.edu/~sk/) coined the term.

soegaard2

2020-12-28 18:44:56

“Many thanks to both Matthew Flatt and Robby Findler for MrEd, and to Shriram Krishnamurthi for Zodiac, his source-correlating macro-expander.” From https://courses.cs.washington.edu/courses/cse341/99su/scheme/drscheme-docs/mrspidey/node4.htm

soegaard2

2020-12-28 20:21:17

I have forgotten how to do this: I have a struct (struct foo (bar)) in a module A with a smart constructor make-foo, also in A, so I export the struct like this:

(provide (except-out (struct-out foo) foo)
         (rename-out [make-foo foo]))

That is, I am exporting the smart constructor under the name foo.

But now, in module B, I’d like to use match:

(match some-value
  [(foo bar) bar])

But … now foo isn’t bound to the struct-type-info thingie that match needs.

How can I fix it in module A?

sorawee

2020-12-28 20:23:50

There’re options in struct that start with #:omit-...

sorawee

2020-12-28 20:23:59

I think you want one of those

sorawee

2020-12-28 20:25:16

Oh, actually

sorawee

2020-12-28 20:27:46

#lang racket

(module foo racket
  (provide (struct-out test) make-test)
  (struct test (foo bar) #:constructor-name internal-make-test)
  (define (make-test #:foo foo #:bar bar)
    (internal-make-test foo bar)))

(require 'foo)

(match (make-test #:foo 1 #:bar 2)
  [(test a b) (+ a b)])

soegaard2

2020-12-28 20:32:14

Thanks! I knew there was a proper (TM) way of doing it.

mflatt

2020-12-28 20:32:23

I think that’s a coincidence. It’s “.zo” for the “z” in MzScheme (because “.mo” was taken?) and “o” for its usual “object file” meaning.

soegaard2

2020-12-28 20:33:49

So chosen to be almost like the file ending .so ?

jaz

2020-12-28 20:35:27

This exports the smart constructor as make-test, right? If you want it to be test, you have to do some more gymnastics.

jaz

2020-12-28 20:37:18

An example (somewhat abridged from the original):

#lang racket/base

;; struct definition
(struct Date (y m d))

;; Smart constructor
(define make-date
  (let ()
    ;; The internal definition allows the constructor to
    ;; be called `date` in error messages.
    (define (date y [m 1] [d 1])
      (Date y m d))
    date))

(define-module-boundary-contract date make-date
  (->i ([year exact-integer?])
       ([month (integer-in 1 12)]
        [day (year month) (day-of-month/c year month)])
       [d date?]))

(define-match-expander $date
  (syntax-rules ()
    [(_ y m d) (Date y m d)])
  (make-variable-like-transformer #'date))

(provide (rename-out [$date date]))

soegaard2

2020-12-28 20:38:04

I used #:extra-name paragraph: and exported paragraph:.

soegaard2

2020-12-28 20:38:16

In my case it led to the smallest set of changes.

mflatt

2020-12-28 20:38:23

Yes — probably more inspired by Chez Scheme’s “.so” than Unix shared-object “.so”, but I forget

soegaard2

2020-12-28 20:39:45

Hmm - nice alternative to use a match expander.

jaz

2020-12-28 20:40:15

The full routine there wouldn’t be necessary if I weren’t also attaching a contract to the constructor.

samdphillips

2020-12-28 20:46:11

Has anyone done any bit-bashing with a Raspberry Pi and Racket? I have a DHT11 Humidity/Temperature sensor that I am looking at playing with. The reference code I have found on the internet uses C (or equivalent) to repeatedly poll the pin state and decode that. Could I do this in Racket or should I just drop down to C and connect it to Racket via ffi?

badkins

2020-12-28 20:58:40

What is the purpose of #:constructor-name in the original example? The following works w/o it:

#lang racket

(module foo racket
  (provide (struct-out test) make-test)
  (struct test (foo bar))
  (define (make-test #:foo foo #:bar bar)
    (test foo bar)))

(require 'foo)

(match (make-test #:foo 1 #:bar 2)
  [(test a b) (+ a b)])

soegaard2

2020-12-28 20:59:43

I like to use simply (test ...) outside to construct the structs.

badkins

2020-12-28 21:00:10

Yes, I understand, I was just questioning why @sorawee used #:constructor-name

soegaard2

2020-12-28 21:01:42

Oh! Yeah - the example doesn’t match what I asked, but #:omit-something could be used.

soegaard2

2020-12-28 21:04:55

Sorry - just forgot to provide test:

#lang racket

(module foo racket
  (provide (except-out (struct-out test) test)
           (rename-out [make-test test])
           test:)
  (struct test (foo bar) #:transparent #:extra-name test:)
  (define (make-test #:foo foo #:bar bar)
    (test foo bar)))

(require 'foo)

(test #:foo 1 #:bar 2)

(match (test #:foo 1 #:bar 2)
  [(test: a b) (+ a b)])

badkins

2020-12-28 21:09:08

Seems like (except-out (struct-out test) test) should provide test:

badkins

2020-12-28 21:09:54

I’ve come to peace with just using my smart constructor names to make the structs.

soegaard2

2020-12-28 21:11:36

Normally I am too - in this case I need it to look “pretty” :wink:

kellysmith12.21

2020-12-29 06:10:36

Is there a way to require that, if a struct implements a generic interface, then it must implement a minimal set of methods?