Every kind of web dev :slightly_smiling_face:
I have an XML document that encodes a newline with a numeric character reference like &#10;. It seems to me that read-xml completely discards it. Is there an alternative to read-xml that reads these sequences correctly?
Maybe the sxml libraries?
:disappointed: I’ve already written my code expecting xexpr. sxml would work, but I need to convert it to the format…
Ok this is slightly gross and maybe overkill, but you can try re-encoding the stream before handing it off to read-xml
https://gist.github.com/samdphillips/48a87ada3587a32654fab1d607155e99
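For anyone who can't open the gist: I don't know exactly what it does, so the following is only a guess at the general idea, not a copy of it. The sketch escapes the ampersand of &#10; and &#13; before parsing, so read-xml sees ordinary text instead of a character reference. The names protect-newline-refs and read-xml/protected are made up for illustration.

```racket
#lang racket/base
(require racket/port xml)

;; Hypothetical helper (not from the gist): turn "&#10;" / "&#13;" into
;; "&amp;#10;" / "&amp;#13;" so the parser treats them as plain text.
;; In the replacement string, "\\&" is a literal ampersand and "\\1" is
;; the captured number.
(define (protect-newline-refs str)
  (regexp-replace* #rx"&#(10|13);" str "\\&amp;#\\1;"))

;; Hypothetical wrapper: slurp the port, re-encode, then parse.
;; It rewrites the whole input blindly, CDATA sections included.
(define (read-xml/protected in)
  (read-xml (open-input-string (protect-newline-refs (port->string in)))))
```

After parsing, the text should contain the literal characters &#10;, which your own code would then have to turn back into a newline. Because the rewrite does not look at document structure, it also alters CDATA content, where &#10; was already literal text.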
Also, it looks like they are in the xexpr as (entity _ _ 10) (or 13) without re-encoding.
Oh wow, thanks so much! Can you clarify what you mean when you said “they are in the xexpr without … without re-encoding”? I really don’t see anything.
It should not just skip the entity.
I’m actually drafting a PR to fix this right now
In the test I did I got entity structs for those.
> (call-with-input-string "<fake>&#10;</fake>" read-xml)
(document
(prolog '() #f '())
(element (location 1 0 1) (location 1 18 19) 'fake '() (list (entity (location 1 6 7) (location 1 11 12) 10)))
'())
(entity (location _ _ _) (location _ _ _) 10) is on the third line of output.
Interesting. I see that too. Could you try the program in https://github.com/racket/racket/issues/2885 ?
Oh I bet it’s because they are in an attribute.
Yeah it is dropped.
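A quick way to check the attribute case on your own install, as a minimal sketch: the element name fake and attribute name a are just placeholders, and the result depends on your Racket version, so no expected output is shown.

```racket
#lang racket/base
(require racket/port xml)

;; Parse a document whose attribute value contains a numeric
;; character reference for a newline.
(define doc
  (call-with-input-string "<fake a=\"x&#10;y\"/>" read-xml))

;; Pull out the value of the first (and only) attribute.
(define attr-val
  (attribute-value (car (element-attributes (document-element doc)))))

;; Inspect what survived: write shows any newline as \n.
(write attr-val)
(newline)
```

If the reference is dropped, the x and y will be adjacent in the printed string.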
The contract on the value field of attribute would need to be changed.
The re-encoding trick only works outside of CDATA blocks though FYI.