
Every kind of web dev :slightly_smiling_face:

I have an XML document that encodes a newline with a numeric character reference such as &#10;. It seems to me that read-xml completely discards it. Is there an alternative to read-xml that reads these sequences correctly?

Maybe the sxml libraries?

:disappointed: I’ve already written my code expecting xexpr. sxml would work, but I need to convert it to the format…

Ok, this is slightly gross and maybe overkill, but you can try re-encoding the stream before handing it off to read-xml:
https://gist.github.com/samdphillips/48a87ada3587a32654fab1d607155e99
Also, it looks like they are in the xexpr as (entity _ _ 10) (or 13) without re-encoding.

Oh wow, thanks so much! Can you clarify what you mean when you said “they are in the xexpr without … without re-encoding”? I really don’t see anything.


It should not just skip the entity.

I’m actually drafting a PR to fix this right now

In the test I did, I got entity structs for those.
> (call-with-input-string "<fake>&#10;</fake>" read-xml)
(document
 (prolog '() #f '())
 (element (location 1 0 1) (location 1 18 19) 'fake '() (list (entity (location 1 6 7) (location 1 11 12) 10)))
 '())

(entity (location _ _ _) (location _ _ _) 10) on the third line of output.
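
For anyone following along with xexprs: a rough sketch of where that entity ends up after conversion. As far as I understand the xml library, xml->xexpr turns a numeric entity into a bare integer (10 here) rather than the string "\n", which may be why it is easy to miss. The helper expand-char-refs below is hypothetical (not part of the xml library); it just rewrites those integers into one-character strings and leaves attribute values untouched.

```racket
#lang racket
(require xml)

;; Parse the same input as in the REPL test above.
(define doc (call-with-input-string "<fake>&#10;</fake>" read-xml))

;; Convert the root element to an xexpr. The numeric entity is
;; expected to show up as the integer 10 in the element body.
(define xe (xml->xexpr (document-element doc)))

;; Hypothetical helper: replace integer character references in
;; element bodies with the corresponding one-character string.
(define (expand-char-refs x)
  (match x
    ;; A numeric entity, e.g. 10 becomes "\n".
    [(? exact-nonnegative-integer? n) (string (integer->char n))]
    ;; Element with an explicit attribute list: (tag ((name "value") ...) body ...)
    [(list (? symbol? tag)
           (and attrs (list (list (? symbol?) (? string?)) ...))
           body ...)
     (list* tag attrs (map expand-char-refs body))]
    ;; Element without an attribute list: (tag body ...)
    [(list (? symbol? tag) body ...)
     (cons tag (map expand-char-refs body))]
    ;; Strings, symbols (named entities), cdata, comments, etc. pass through.
    [_ x]))

(define expanded (expand-char-refs xe))
```

This only covers entities in element content. It does nothing for attribute values, since an xexpr attribute value is a plain string and has no slot for an entity.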

Interesting. I see that too. Could you try the program in https://github.com/racket/racket/issues/2885 ?

Oh I bet it’s because they are in an attribute.

Yeah it is dropped.

The contract on the value field of attribute would need to be changed.

The re-encoding trick only works outside of CDATA blocks though FYI.
