zlacker

1. rwmj+(OP)[view] [source] 2017-09-21 14:15:58
I guess these get quite slow (there's no indexing) once you have a serious number of records? That in itself isn't a problem as long as you understand the scope of the project. I do wonder, however, why they didn't use (a well-defined subset of) CSV as the format.
replies(3): >>jdemle+G >>Comodo+yg >>zbuf+X53
2. jdemle+G[view] [source] 2017-09-21 14:20:04
>>rwmj+(OP)
CSV is neither human-readable nor -writable.

And I don't think the performance issue exists. Computers are fast nowadays. Parsing recfiles is straightforward. Also you could easily archive historic/old/probably irrelevant records.
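To make "straightforward" concrete, here is a minimal Python sketch of a parser for a simplified subset of the format. It assumes blank-line-separated records, `Name: value` fields, `#` comment lines, and `+` continuation lines, and ignores everything else (record descriptors, types, and so on). The function name `parse_recfile` is made up for illustration and is not part of recutils.

```python
def parse_recfile(text):
    """Parse a simplified subset of the recfile format.

    Returns a list of records; each record is a list of (field, value)
    pairs, so a field that appears more than once is preserved.
    """
    records, current = [], []
    for line in text.splitlines():
        if line.startswith("#"):          # comment line, skip it
            continue
        if not line.strip():              # blank line ends the current record
            if current:
                records.append(current)
                current = []
            continue
        if line.startswith("+") and current:
            # Continuation line: append to the previous field's value.
            name, value = current[-1]
            extra = line[1:]
            if extra.startswith(" "):
                extra = extra[1:]
            current[-1] = (name, value + "\n" + extra)
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a field line: {line!r}")
        current.append((name.strip(), value.strip()))
    if current:                           # last record has no trailing blank line
        records.append(current)
    return records
```

This is a single linear pass, so the cost grows with file size, but there is no real parsing state beyond "the record I'm currently building".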

replies(1): >>rwmj+d1
3. rwmj+d1[view] [source] [discussion] 2017-09-21 14:24:01
>>jdemle+G
This is why I was very careful to say "well-defined subset". I wrote a full CSV library[1], so I'm well aware of how deceptively difficult CSV is to deal with. However, with a well-defined subset (and perhaps a separator other than ","), it should be editable by hand for at least simple changes.

[1] https://github.com/Chris00/ocaml-csv
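As a rough illustration of what such a subset could look like, here is a Python sketch. The rules are my assumptions for the example, not anything proposed in the thread: tab as the separator, no quoting at all, a header row, and every row required to have the same number of fields as the header. `read_simple_tsv` is a hypothetical helper name.

```python
import csv
import io

def read_simple_tsv(text, delimiter="\t"):
    """Read a restricted, unquoted, delimiter-separated table.

    Quote characters get no special treatment (csv.QUOTE_NONE), and any
    row whose field count differs from the header is rejected.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quoting=csv.QUOTE_NONE)
    rows = list(reader)
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    for lineno, row in enumerate(body, start=2):
        if len(row) != len(header):
            raise ValueError(
                f"line {lineno}: expected {len(header)} fields, got {len(row)}")
    return [dict(zip(header, row)) for row in body]
```

With quoting and embedded newlines ruled out, one line is always one record, which is what keeps the file safe to edit in a plain text editor.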

4. Comodo+yg[view] [source] 2017-09-21 15:49:25
>>rwmj+(OP)
No built-in indexing, but no one forbids you from indexing text files if you need it.
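A minimal sketch of what that could look like in Python: scan the file once, remember the byte offset at which each record starts, and seek straight to it later. It assumes blank-line-separated records and a unique key field. The names `build_offset_index` and `read_record_at` are invented for this example, and the index lives entirely outside recutils.

```python
import io

def build_offset_index(f, key_field):
    """Map each value of key_field to the byte offset where its record starts.

    f must be a binary file object; records are separated by blank lines.
    """
    index = {}
    record_start = None
    prefix = key_field.encode() + b":"
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:                      # end of file
            break
        if not line.strip():              # blank line: next line starts a new record
            record_start = None
            continue
        if record_start is None:
            record_start = pos
        if line.startswith(prefix):
            index[line[len(prefix):].strip().decode()] = record_start
    return index

def read_record_at(f, offset):
    """Return the raw text of the record that starts at the given offset."""
    f.seek(offset)
    lines = []
    for line in iter(f.readline, b""):
        if not line.strip():              # blank line marks the end of the record
            break
        lines.append(line.decode())
    return "".join(lines)
```

The index has to be rebuilt (or at least invalidated) whenever the file is edited, since any change shifts the offsets of everything after it.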
replies(1): >>neulan+Xg
5. neulan+Xg[view] [source] [discussion] 2017-09-21 15:51:36
>>Comodo+yg
But any indexing system you build yourself won't be used by the rec* tools. For example, `recsel` will not be any faster on large files.

Not sure if they have indexing on the roadmap, but it does make sense to me for people that have adopted it and are starting to get bigger databases.

Of course, you could argue that when the files get too big, it's time to switch to a different solution.

It seems that's kind of a natural tension in projects. Do you grow the scope to accommodate existing users with growing use cases? Or do you draw a line in the sand and have people move on to a different solution?

6. zbuf+X53[view] [source] 2017-09-22 20:29:26
>>rwmj+(OP)
CSV locks you into the same fields for every data entry, which makes it a little less flexible. Plus, one of the appeals for me is the readability of the raw data; CSV gets long and thin very quickly. Granted, each will best suit a certain type of data.