anything
2021-4-22 22:29:44

@alexharsanyi I’m giving your nice data-frame package a try. Thanks for writing and kindly providing it to everyone.

(*) CSV reading and RFC 4180

I think the procedure df-read/csv does not satisfy RFC 4180. It seems to expect that every CSV record is on a line by itself; that is, it doesn’t allow a quoted newline. I got around that by using the csv-reading package.

(*) Looking up in data frames, the need for sorting a series

I’m thinking I didn’t understand how to use df-lookup. It demands a sorted series, right? But the library didn’t provide any sorting mechanism for the data frame. If I sort a series using sort, how will the data frame know that I’ve changed rows out of order? The way I think of a data frame is just like a table — the fact that the data structure stores the data in columns just seems to me a matter of implementation. If I sort a column, I should sort all others too. I could get the data out of the data-frame, sort it properly and then place it back, but that tells me I probably didn’t get something.


alexharsanyi
2021-4-22 22:30:04

@alexharsanyi has joined the channel


anything
2021-4-22 22:31:35

Wow. I clicked the button to “invite” the person, but didn’t know this would force-join the person right away. Sorry about this violent move! Didn’t mean to!


anything
2021-4-22 23:28:10

The bot actually says it just invited the person. :slightly_smiling_face:


alexharsanyi
2021-4-23 02:34:00

It is possible that df-read/csv does not fully conform to RFC 4180 and to many other weird CSV variants out there — so far it has worked fine for the datasets I need, and I add features as I need them.

With regard to df-lookup: the data series are expected to be already sorted, and the user just needs to tell the data frame object that a series is sorted using df-set-sorted!. There are no facilities for sorting data, because the data sets I had to deal with were already sorted. For looking up rows based on unsorted series, you can use df-select, or df-select* with a #:filter to select just the rows you are looking for (this will search the series in linear time).
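
Since the package has no sorting facility, one way to keep every column aligned is to sort the raw columns together before they go into the data frame. Below is a minimal sketch in plain Racket, assuming the columns are available as equal-length vectors. The helper name sort-columns-by is hypothetical and not part of the data-frame package, and no data-frame calls are made here.

```racket
#lang racket/base

;; Hypothetical helper, NOT part of the data-frame package.
;; columns    : list of equal-length vectors, one vector per series
;; key-index  : position in `columns` of the series to sort by
;; less-than? : ordering predicate for the key series
(define (sort-columns-by columns key-index less-than?)
  (define key (list-ref columns key-index))
  ;; Row indices, reordered so that the key column comes out sorted.
  (define order
    (sort (for/list ([i (in-range (vector-length key))]) i)
          less-than?
          #:key (lambda (i) (vector-ref key i))))
  ;; Apply the same row permutation to every column, so rows stay intact.
  (for/list ([col (in-list columns)])
    (for/vector #:length (vector-length col) ([i (in-list order)])
      (vector-ref col i))))

;; Example data (made up for illustration).
(define ids    (vector 3 1 2))
(define names  (vector "c" "a" "b"))
(define sorted (sort-columns-by (list ids names) 0 <))
```

The sorted vectors could then be loaded as series, and the key series marked as sorted with df-set-sorted! so that df-lookup can use it.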