Op 18-5-2017 om 12:45 schreef Jaap Eldering:
On 18-05-17 00:36, Jaap Eldering wrote:
On 16-05-17 19:00, Jeroen Bransen wrote:
Op 12-5-2017 om 14:27 schreef Jaap Eldering:
On 12-05-17 12:13, Jaap Eldering wrote:
On 08-05-17 12:28, Jeroen Bransen wrote:
[...]
I tried to fix the Haskell code, see attached diff, but that didn't work (at least for me).
Hmm... I guess I was fuzzing around too much. I tested this patch again today, and it yields the expected result: the Haskell code makes a dot match newlines as POSIX extended regex prescribes.
Running through the tests, it did uncover a later test that does fail (in a minor way):
./checktestdata-haskell tests/testprog29.err2 tests/testdata29.in Testdata OK
but the minimum number of digits (-1) is out of range:
FLOATP(10,100,-1,3) NEWLINE
I'll commit changes to fix both. Jeroen, can you verify that my changes are correct?
Well, it's Haskell code and it compiles, so it must be right... ;-) But yes, I thought this was more complicated, but now that I look at your changes this is quite straightforward. I'll try to add some examples soon.
Jeroen
Jaap
I'm also open to suggestions to change to a different regex standard.
I am confused. What standard are you using/implementing now? The docs don't seem to mention one, and I think that looking at the code previously I concluded it was POSIX-extended. I am fine with changing to a different one, but I'd suggest either PCRE, POSIX-basic or POSIX-extended. These three are all implemented in Boost, which I assume you can easily use as the code already depends on Boost. Anyway, it would be good to pick one and make it explicit in the docs.
So I was aiming for POSIX extended, but was inadvertently using C++ regex which defaults to ECMAscript.
I'd prefer to stick to the C++ STL regex library, but that has the disadvantage I mentioned above. So that leaves me wondering what the best choice is.
Jaap
[1] http://stackoverflow.com/questions/399078/what-special-characters-must-be-es... [2] https://wiki.haskell.org/Regex_Posix
Op 13-12-2016 om 19:29 schreef Jaap Eldering: > On 13-12-16 12:27, Jeroen Bransen wrote: >> Hi Jaap, >> >> Op 11-12-2016 om 11:42 schreef Jaap Eldering: >>> Hi Jeroen, >>> >>> On 10-12-16 13:18, Jeroen Bransen wrote: >>>> Hi all, >>>> >>>> Checktestdata is a nice DSL and has proved helpful in many situation, >>>> but from the moment I used it I was missing features and wondering why >>>> it wasn't implemented as an EDSL (Embedded Domain Specific Language). >>>> Now, many years later, I've finally found some time to create it. I've >>>> implemented a little library in Haskell, and as a sneak preview I've >>>> taken the interception problem from NCPC 2016 as referenced to in [1] >>>> and attached it. The first question is: where shall I put the sources to >>>> the library? Somewhere in the checktestdata repository (and if so, >>>> where), or maybe create a checktestdata2 repository? >>> I'd say just put them in the checktestdata repository, simply in the >>> root. We can always move stuff later. >> Done, in a haskell_edsl subdirectory to avoid cluttering of the current >> code a bit. >> >>> Btw, this reminds me of code that Gerben Stavenga once wrote: he also >>> wrote a reimplementation of checktestdata in Haskell, and it was >>> similarly short, >> Would his implementation still be available somewhere? I'd be curious to >> see it. > I think I still have a copy lying around, but on my desktop that's in > storage in the Netherlands... I'll send Gerben an email, but I doubt he > still has it. > >>> but also a preliminary version lacking good error >>> handling/reporting. >> The version I've commited now does have proper error reporting, similar >> to the original >> checktestdata tool. Of course it may need some polishing here and there, >> but the basis >> is there and suffices for most cases I think. >>> I think this can be useful, especially for specific problems where >>> either speed or extensibility of the syntax is needed. On the other >>> hand, I think that the requirement of having GHC around is quite heavy, >>> while a C++ compiler with boost is more standard, even on Windows I'd think? >> What is your definition of standard? In number of people that have them >> installed: honestly no clue, could be that boost is far more popular. I >> have only installed boost in order to build checktestdata, and this has >> always taken me several hours. The long list of instructions [2] make it >> clear that getting boost to work on Windows is far from trivial, while >> Haskell platform has a native binary installer for all popular platforms >> and works pretty much out of the box. Yes, it is a large package, but >> disk space is not so much an issue nowadays, and as a Haskell programmer >> (ouch, I start to sound like a Haskell evangelist) I think it's worth >> the extra bytes on disk, as I really believe this can improve the >> checktestdata tool. > Ok, I wasn't aware that Boost was so tricky to install on Windows, on > Debian it's just 'apt-get install libboost-dev' ;-) > >> Btw, my main point is that I think we should create an EDSL. Because of >> my background and because of its features I've chosen Haskell, but we >> could have look at other languages. Python crossed my mind but I'm >> afraid the syntax will be ugly and it won't be such a nice embedding as >> I've created now. > I like the idea of a EDSL and I think that Haskell is quite good to use > in terms of syntax. I'm just wondering whether that benefit outweighs > the disadvantage that now to run it, you actually have to compile your > specific testcase description EDSL code instead of parsing the current > script with a precompiled binary. > >>> Do you think it would be easy to automatically convert the original >>> checktestdata syntax scripts into this EDSL to then compile them? That >>> way, we could support both syntaxes and users would be able to choose >>> which method to use. >> I guess this is pretty as hard as parsing the old syntax directly. The >> main challenge is the variable binding handling, your suggestion solves >> that for simple variables, but for array-indices we still need to take >> extra care. I'd also see this as a form of backwards compatibility, and >> not as a choice for the user, as this will never fully use the >> capabilities of the EDSL, and then one could as well just use the old >> version. The whole point is that you can do more with this EDSL (and of >> course speed improvement and easy maintainability are nice side-effects >> of that). I'll think about this a bit more. > My point is that I'd rather not dump the current syntax specification, > so therefore I think backward compatibility is an important feature. > Secondly, I think the current syntax is I think easier to start using > than your EDSL, because then you need some knowledge of the underlying > language. > > But it might be best to try to keep both around for a bit to compare > their advantages. > > Jaap > > >> [2] http://www.boost.org/doc/libs/1_62_0/more/getting_started/windows.html >> >>>> Let me give a little more information on why I think an EDSL is a good >>>> idea here, and why I choose for Haskell (other than my background): >>>> >>>> + With an EDSL, you get so much for free from the underlying language, >>>> such as (typed) variables, all sorts of abstraction mechanisms, compiler >>>> optimizations, etc. For example, some testdata script may consist of >>>> some repeated parts, which in an EDSL can simply be given a name to get >>>> rid of the duplication. >>>> + Compiler optimizations, which of course also exist for C++ in which >>>> checktestdata is built now, don't only optimize the checktestdata >>>> runtime, but also the script itself since it is just a program and not >>>> some custom script anymore. Especially with a functional language such >>>> as Haskell this means definitions of primitives can be inlined and whole >>>> parts of the script may get fused to a very efficient version. >>>> + Maintainability: my library is now +- 200 LOC, including docs and >>>> empty lines, and contains quite a lot of the checktestdata language >>>> already. Because Haskell has many well optimized parsing libraries, we >>>> get a lot for free, and only have to create a nice generalization for >>>> this particular use case. >>>> + Efficiency: the current checktestdata is just too slow, as mentioned >>>> in [1]. As a reference, on my machine the checktestdata script for >>>> interception 09.max_ans.in takes 35.5s, the python script takes 2.2s, >>>> and the attached version only 2s. And I haven't done anything to >>>> optimize specifically for this case, I've just translated the ctd script >>>> to the syntax of my EDSL. >>>> + With this, the script can do more than just checking the syntax. I.e. >>>> when the problem specifies that it contains a connected graph, one can >>>> check the syntax, but also the connectedness of the graph, while still >>>> separating concerns, i.e. the syntax of the input can be separated from >>>> the semantic checks, but the semantic checks don't need to come in a >>>> separate program which also has to read the input again, like we should >>>> now. All can be combined naturally. >>>> + Create your own combinators: the primitives can be easily combined >>>> into more complex combinators which you can then reuse. I.e. think of a >>>> 'point2d' combinator that reads two space separated integers. >>>> >>>> - Haskell Platform needs to be available, this is easy on all major OS, >>>> but is a quite large dependency. Then again, GCC and Boost are large as >>>> well, and are less Windows-friendly. >>>> - Slighly different syntax: although Haskell is quite liberal in its >>>> syntax for EDSL's, users have to get used to particularities of the >>>> language/ >>>> - Data generation is not easily possible, if not infeasible. There >>>> should be some way of embedding the language to make this possible, and >>>> this is actually what I've been working on for the past few years when >>>> thinking about this problem, but since we mostly use it for checking >>>> data I went with a monadic embedding which is very easy for data checking. >>>> - Backwards compatibility: parsing existing scripts and running it with >>>> the new backend is not going to be easy. I don't expect show stoppers, >>>> but it definitely won't be as efficient as writing scripts within the >>>> language because of variable bindings. I need to work on this a bit to >>>> be sure though. >>>> >>>> That's it, of course my libary is not done yet (no good error messages, >>>> no regex yet), but it's a proof of concept that is already usable for >>>> simpler problems. I hope to convince you that this is the way to go, and >>>> to get some ideas to create the POC into a working tool. Also, >>>> suggestions on 'interesting use cases' are welcome, the interception is >>>> a nice one but I am sure there are many more problems for which the >>>> standard checktestdata just not works. >>>> >>>> Yours truly, >>>> Jeroen >>>> >>>> [1] https://github.com/DOMjudge/checktestdata/issues/3 >>>> >>>> >>>> >>>> _______________________________________________ >>>> DOMjudge-devel mailing list >>>> DOMjudge-devel@domjudge.org >>>> https://www.domjudge.org/mailman/listinfo/domjudge-devel >>>> >>> _______________________________________________ >>> DOMjudge-devel mailing list >>> DOMjudge-devel@domjudge.org >>> https://www.domjudge.org/mailman/listinfo/domjudge-devel >> _______________________________________________ >> DOMjudge-devel mailing list >> DOMjudge-devel@domjudge.org >> https://www.domjudge.org/mailman/listinfo/domjudge-devel >> > _______________________________________________ > DOMjudge-devel mailing list > DOMjudge-devel@domjudge.org > https://www.domjudge.org/mailman/listinfo/domjudge-devel _______________________________________________ DOMjudge-devel mailing list DOMjudge-devel@domjudge.org https://www.domjudge.org/mailman/listinfo/domjudge-devel
DOMjudge-devel mailing list DOMjudge-devel@domjudge.org https://www.domjudge.org/mailman/listinfo/domjudge-devel
DOMjudge-devel mailing list DOMjudge-devel@domjudge.org https://www.domjudge.org/mailman/listinfo/domjudge-devel
DOMjudge-devel mailing list DOMjudge-devel@domjudge.org https://www.domjudge.org/mailman/listinfo/domjudge-devel
DOMjudge-devel mailing list DOMjudge-devel@domjudge.org https://www.domjudge.org/mailman/listinfo/domjudge-devel
DOMjudge-devel mailing list DOMjudge-devel@domjudge.org https://www.domjudge.org/mailman/listinfo/domjudge-devel