9 years ago cda2conll: convert features, update cda_parse [master]
Arne Köhn [Mon, 15 Sep 2014 16:51:36 +0000 (18:51 +0200)]
cda2conll: convert features, update cda_parse

also add GPLv3

9 years ago can use CONLLs, TurboParser stuff, i18n, cleanup
Arne Köhn [Wed, 11 Dec 2013 13:50:59 +0000 (14:50 +0100)]
can use CONLLs, TurboParser stuff, i18n, cleanup

The scripts now automatically switch between cda and conll format based
on the file ending.
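
A minimal sketch of such a switch, assuming the decision depends only on
the file ending. The function name detect_format and the error raised for
unknown endings are hypothetical and not taken from the repository:

```python
def detect_format(filename):
    """Guess the treebank format from the file ending (hypothetical helper)."""
    name = filename.lower()
    if name.endswith("conll"):
        return "conll"
    if name.endswith("cda"):
        return "cda"
    # Assumption: unknown endings are rejected rather than silently guessed.
    raise ValueError("unrecognised file ending: {!r}".format(filename))
```

A caller could then pick the matching reader, for example by looking the
returned string up in a dictionary of reader functions.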

TurboParser output (unused nodes are attached to a special "unused" node
instead of NIL) can be evaluated with the -u switch.

Logging is done with the logging framework.

Predictability computation has been factored out; support for English is

Argument parsing is done with argparse.

Prefixes can be padded with virtual nodes to train TurboParser

Prefixes can be created with only padded nodes (useful for TP
input). Could be done with much less (CPU-)work but the infrastructure
was already in place.

folding and implicit are no longer supported since they don't map well
to other languages.

9 years ago adjust everything to turboparser needs, cleanup
Arne Köhn [Wed, 13 Nov 2013 14:34:35 +0000 (15:34 +0100)]
adjust everything to turboparser needs, cleanup

* cdgevaluator_timecourse.py: -u assumes that an additional node serves
  as the attachment point for unused virtual nodes
* convert-cda2conll.py: -d allows conversion of whole directories
* create_prefixes_virtual_unified.py: -p pads the output to a minimum of
  2 VNouns and 1 VVerb; -s strips all virtual nodes except the 2+1
  mentioned before.
* tree.py (word2VN): tags for virtual nodes are coarsened to NVIRT and
  VVIRT for N* and V* PoS tags
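
The tag coarsening described for tree.py can be sketched as follows. This
is an illustration only: the function name coarsen_virtual_tag is
hypothetical, and leaving all other tags unchanged is an assumption, since
the commit message only mentions N* and V* tags.

```python
def coarsen_virtual_tag(pos_tag):
    """Map the PoS tag of a virtual node to a coarse class (illustrative)."""
    if pos_tag.startswith("N"):
        return "NVIRT"  # N* tags collapse to one virtual noun tag
    if pos_tag.startswith("V"):
        return "VVIRT"  # V* tags collapse to one virtual verb tag
    # Assumption: tags outside N* and V* are passed through untouched.
    return pos_tag
```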

All files: cleanup according to flake8 (i.e. thanks to Jorgen's elpy-mode!)

9 years ago Remove right-window argument
Arne Köhn [Tue, 12 Nov 2013 10:16:16 +0000 (11:16 +0100)]
Remove right-window argument

didn't work with the new evaluation scheme (the right window was exactly
what we wanted to get rid of)

9 years ago Ability to read conll files
Arne Köhn [Tue, 12 Nov 2013 10:14:46 +0000 (11:14 +0100)]
Ability to read conll files

Files ending in conll are read as CoNLL-X files.
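
For orientation, a CoNLL-X file holds one token per line with ten
tab-separated columns, and sentences are separated by blank lines. The
reader below is a minimal sketch of that format, not the code from this
commit; the names Token and read_conllx are hypothetical.

```python
from collections import namedtuple

# The ten columns defined by the CoNLL-X shared task format.
CONLLX_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                 "feats", "head", "deprel", "phead", "pdeprel"]
Token = namedtuple("Token", CONLLX_FIELDS)


def read_conllx(lines):
    """Group an iterable of CoNLL-X lines into a list of sentences."""
    sentences, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLX_FIELDS):
            raise ValueError("expected 10 columns, got %d" % len(columns))
        current.append(Token(*columns))
    if current:
        sentences.append(current)
    return sentences
```

Passing an open file object works because iterating over a file yields its
lines; all columns are kept as strings, so head indices still need an
int() conversion before use.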

9 years ago Cleanup: docstrings, unused code, fix argparse
Arne Köhn [Mon, 11 Nov 2013 10:35:32 +0000 (11:35 +0100)]
Cleanup: docstrings, unused code, fix argparse

9 years ago Move to python3
Arne Köhn [Fri, 8 Nov 2013 13:05:18 +0000 (14:05 +0100)]
Move to python3

10 years ago add cdgevaluator_timecourse as used for the Dependency Theory book
Arne Köhn [Fri, 26 Jul 2013 14:20:04 +0000 (16:20 +0200)]
add cdgevaluator_timecourse as used for the Dependency Theory book

Note: This program behaves differently from the one in parseeval:
it has been reworked to not only evaluate against whole sentences but also
against a gold standard of predictive partial dependency analyses.

10 years ago add getsentence.py
Arne Köhn [Fri, 21 Jun 2013 14:50:17 +0000 (16:50 +0200)]
add getsentence.py

10 years ago added missing module tree, containing several helper functions for working on a...
Niels Beuck [Tue, 18 Jun 2013 16:36:13 +0000 (18:36 +0200)]
added missing module tree, containing several helper functions for working on a dependency tree

10 years ago added two scripts used in the corpus study
Niels Beuck [Mon, 17 Jun 2013 18:11:57 +0000 (20:11 +0200)]
added two scripts used in the corpus study

11 years ago update submodule cda_eval
Arne Koehn [Wed, 26 Oct 2011 14:33:21 +0000 (16:33 +0200)]
update submodule cda_eval

11 years ago added new script for converting .cda files to .conll files
Niels Beuck [Wed, 26 Oct 2011 14:30:30 +0000 (16:30 +0200)]
added new script for converting .cda files to .conll files

11 years ago assign valencies to virtual verbs
Niels Beuck [Wed, 26 Oct 2011 14:20:01 +0000 (16:20 +0200)]
assign valencies to virtual verbs

12 years ago new script: tagger2conll.py - takes tagged sentences, gives conll-blind
Arne Koehn [Wed, 1 Dec 2010 13:59:47 +0000 (14:59 +0100)]
new script: tagger2conll.py - takes tagged sentences, gives conll-blind

12 years ago makecdgscript: sort the sentences to parse by id and increment
Arne Koehn [Tue, 26 Oct 2010 18:15:09 +0000 (20:15 +0200)]
makecdgscript: sort the sentences to parse by id and increment

12 years ago fixes: don't use the file ending in cdgscript, don't die if some attributes are missing.
Arne Koehn [Mon, 25 Oct 2010 16:35:42 +0000 (18:35 +0200)]
fixes: don't use the file ending in cdgscript, don't die if some attributes are missing.

12 years ago new tool: incrementinformation, some fixes (maybe)
Arne Koehn [Mon, 25 Oct 2010 15:17:53 +0000 (17:17 +0200)]
new tool: incrementinformation, some fixes (maybe)

12 years ago added a variant of the create prefixes script that is able to generate virtual repla...
Niels Beuck [Tue, 5 Oct 2010 14:56:51 +0000 (16:56 +0200)]
added a variant of the create prefixes script that is able to generate virtual replacements for certain words outside of the known prefix.

13 years ago Add README, add gnuplot-compatible output for countconstraints, fix comment
Arne Koehn [Thu, 29 Jul 2010 12:16:55 +0000 (14:16 +0200)]
Add README, add gnuplot-compatible output for countconstraints, fix comment

13 years ago set a correct id for newly created sentences
Arne Koehn [Thu, 1 Jul 2010 15:42:40 +0000 (17:42 +0200)]
set a correct id for newly created sentences

13 years ago add countconstraints.py and makecdgscript.sh
Arne Koehn [Tue, 29 Jun 2010 13:28:40 +0000 (15:28 +0200)]
add countconstraints.py and makecdgscript.sh

makecdgscript creates a cdg script that parses cda files and
prints information such as violated constraints.

countconstraints creates statistics about violated constraints,
given cdg output.

13 years ago make create_prefixes really work.
Arne Koehn [Tue, 29 Jun 2010 13:21:50 +0000 (15:21 +0200)]
make create_prefixes really work.

13 years ago writing prefixes works
Arne Koehn [Tue, 29 Jun 2010 11:46:22 +0000 (13:46 +0200)]
writing prefixes works