Arne Köhn [Mon, 15 Sep 2014 16:51:36 +0000 (18:51 +0200)]
cda2conll: convert features, update cda_parse
also add GPLv3
Arne Köhn [Wed, 11 Dec 2013 13:50:59 +0000 (14:50 +0100)]
can use CONLLs, TurboParser stuff, i18n, cleanup
The scripts now automatically switch between cda and conll format based
on the file ending.
TurboParser output (unused nodes are attached to a special "unused" node
instead of NIL) can be evaluated with the -u switch.
Logging is done with the logging framework.
Predictability computation has been factored out; support for English is
available.
Argument parsing is done with argparse.
Prefixes can be padded with virtual nodes to train TurboParser
Prefixes can be created with only padded nodes (useful for TP
input). Could be done with much less (CPU-)work but the infrastructure
was already in place.
folding and implicit are no longer supported since they don't map well
to other languages.
Arne Köhn [Wed, 13 Nov 2013 14:34:35 +0000 (15:34 +0100)]
adjust everything to turboparser needs, cleanup
* cdgevaluator_timecourse.py: -u assumes that an additional node serves
as attachment facility for unused virtual nodes
* convert-cda2conll.py: -d allows conversion of whole directories
* create_prefixes_virtual_unified.py: -p pads the output to a minimum of
2 VNouns and 1 VVerb; -s strips all virtual nodes except the 2+1
mentioned before.
* tree.py (word2VN): tags for virtual nodes are coarsened to NVIRT and
VVIRT for N* and V* PoS tags
All files: cleanup according to flake8 (i.e. thanks to Jorgen's elpy-mode!)
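
The tag coarsening described for tree.py (word2VN) can be illustrated roughly as follows. This is a minimal sketch, not the actual code from tree.py: the helper name `coarsen_virtual_tag` is hypothetical, and leaving tags other than N* and V* unchanged is an assumption, since the commit message does not say what happens to them.

```python
def coarsen_virtual_tag(pos_tag):
    """Map a PoS tag to the coarse tag used for virtual nodes.

    Hypothetical helper for illustration only; the real word2VN in
    tree.py may be structured differently.
    """
    if pos_tag.startswith("N"):
        return "NVIRT"  # all N* tags collapse to NVIRT
    if pos_tag.startswith("V"):
        return "VVIRT"  # all V* tags collapse to VVIRT
    # Assumption: any other tag is passed through unchanged.
    return pos_tag
```

With this sketch, tags such as NN and NE would both end up as NVIRT, so a parser trained on the output sees one noun-like and one verb-like virtual tag instead of the full tag set.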
Arne Köhn [Tue, 12 Nov 2013 10:16:16 +0000 (11:16 +0100)]
Remove right-window argument
didn't work with the new evaluation scheme (the right window was exactly
what we wanted to get rid of)
Arne Köhn [Tue, 12 Nov 2013 10:14:46 +0000 (11:14 +0100)]
Ability to read conll files
files ending with conll will be read as conll-x files
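
Choosing the reader by file ending might look roughly like the sketch below. The function names `detect_format` and `parse_conllx_sentence` are hypothetical and not taken from the repository; the ten column names follow the CoNLL-X shared task format. Treating every file that does not end in conll as cda is an assumption.

```python
# Column names of the CoNLL-X format, in order.
CONLLX_FIELDS = (
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
)


def detect_format(filename):
    """Return "conll" for files ending with conll, otherwise "cda".

    Assumption: anything that is not a conll file is a cda file.
    """
    return "conll" if filename.endswith("conll") else "cda"


def parse_conllx_sentence(lines):
    """Turn the tab-separated lines of one sentence into a list of dicts."""
    tokens = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # blank lines separate sentences; skip them here
        columns = line.split("\t")
        tokens.append(dict(zip(CONLLX_FIELDS, columns)))
    return tokens
```

A caller would first check `detect_format(path)` and only hand the file's lines to `parse_conllx_sentence` when the result is "conll".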
Arne Köhn [Mon, 11 Nov 2013 10:35:32 +0000 (11:35 +0100)]
Cleanup: docstrings, unused code, fix argparse
Arne Köhn [Fri, 8 Nov 2013 13:05:18 +0000 (14:05 +0100)]
Move to python3
Arne Köhn [Fri, 26 Jul 2013 14:20:04 +0000 (16:20 +0200)]
add cdgevaluator_timecourse as used for the Dependency Theory book
Note: This program behaves differently from the one in parseeval:
it has been reworked to not only evaluate against whole sentences but also
against a gold standard of predictive partial dependency analyses.
Arne Köhn [Fri, 21 Jun 2013 14:50:17 +0000 (16:50 +0200)]
add getsentence.py
Niels Beuck [Tue, 18 Jun 2013 16:36:13 +0000 (18:36 +0200)]
added missing module tree, containing several helper functions for working on a dependency tree
Niels Beuck [Mon, 17 Jun 2013 18:11:57 +0000 (20:11 +0200)]
added two scripts used in the corpus study
Arne Koehn [Wed, 26 Oct 2011 14:33:21 +0000 (16:33 +0200)]
update submodule cda_eval
Niels Beuck [Wed, 26 Oct 2011 14:30:30 +0000 (16:30 +0200)]
added new script for converting .cda files to .conll files
Niels Beuck [Wed, 26 Oct 2011 14:20:01 +0000 (16:20 +0200)]
assign valencies to virtual verbs
Arne Koehn [Wed, 1 Dec 2010 13:59:47 +0000 (14:59 +0100)]
new script: tagger2conll.py - takes tagged sentences, gives conll-blind
Arne Koehn [Tue, 26 Oct 2010 18:15:09 +0000 (20:15 +0200)]
makecdgscript: sort the sentences to parse by id and increment
Arne Koehn [Mon, 25 Oct 2010 16:35:42 +0000 (18:35 +0200)]
fixes: don't use the file ending in cdgscript, don't die if some attributes are missing.
Arne Koehn [Mon, 25 Oct 2010 15:17:53 +0000 (17:17 +0200)]
new tool: incrementinformation, some fixes (maybe)
Niels Beuck [Tue, 5 Oct 2010 14:56:51 +0000 (16:56 +0200)]
added a variant of the create prefixes script that can generate virtual replacements for certain words outside of the known prefix.
Arne Koehn [Thu, 29 Jul 2010 12:16:55 +0000 (14:16 +0200)]
Add README, add gnuplot-compatible output for countconstraints, fix comment
Arne Koehn [Thu, 1 Jul 2010 15:42:40 +0000 (17:42 +0200)]
set a correct id for newly created sentences
Arne Koehn [Tue, 29 Jun 2010 13:28:40 +0000 (15:28 +0200)]
add countconstraints.py and makecdgscript.sh
makecdgscript creates a cdg script that parses cda files and
prints information such as violated constraints.
countconstraints creates statistics about violated constraints,
given cdg output.
Arne Koehn [Tue, 29 Jun 2010 13:21:50 +0000 (15:21 +0200)]
make create_prefixes really work.
Arne Koehn [Tue, 29 Jun 2010 11:46:22 +0000 (13:46 +0200)]
writing prefixes works