Commit 4cc3e13

fix rst syntax (work in progress)
1 parent 30f308f commit 4cc3e13

11 files changed

+153
-135
lines changed

udapi/block/transform/proj.py

Lines changed: 8 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -6,13 +6,14 @@
66
http://www.maltparser.org/optiondesc.html#pproj-marking_strategy
77
88
TODO: implement also path and head+path strategies.
9+
910
TODO: Sometimes it would be better (intuitively)
10-
to lower the gap-node (if its whole subtree is in the gap
11-
and if this does not cause more non-projectivities)
12-
rather than to lift several nodes whose parent-edge crosses this gap.
13-
We would need another label value (usually the lowering is of depth 1),
14-
but the advantage is that reconstruction of lowered edges
15-
during deprojectivization is simple and needs no heuristics.
11+
to lower the gap-node (if its whole subtree is in the gap
12+
and if this does not cause more non-projectivities)
13+
rather than to lift several nodes whose parent-edge crosses this gap.
14+
We would need another label value (usually the lowering is of depth 1),
15+
but the advantage is that reconstruction of lowered edges
16+
during deprojectivization is simple and needs no heuristics.
1617
"""
1718
from udapi.core.block import Block
1819

@@ -59,4 +60,4 @@ def mark(self, node, label):
5960
elif self.label == 'deprel':
6061
node.deprel = '%s:%s+%s' % (node.udeprel, node.sdeprel, label)
6162
else:
62-
raise(ValueError('Unknown parameter label=%s' % self.label))
63+
raise ValueError('Unknown parameter label=%s' % self.label)
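
The last change in this file only drops the redundant parentheses around the raised exception, so the behaviour should be unchanged. A minimal sketch, assuming nothing about udapi itself (`old_style`, `new_style` and `message_of` are hypothetical helpers written for this illustration), shows that both spellings raise the same `ValueError`:

```python
def old_style(label):
    # Pre-commit spelling: the parentheses just wrap the expression,
    # so this is still an ordinary raise statement.
    raise(ValueError('Unknown parameter label=%s' % label))


def new_style(label):
    # Post-commit spelling: the conventional form.
    raise ValueError('Unknown parameter label=%s' % label)


def message_of(func, label):
    """Call func(label) and return the message of the ValueError it raises."""
    try:
        func(label)
    except ValueError as err:
        return str(err)
    return None


if __name__ == '__main__':
    print(message_of(old_style, 'foo'))
    print(message_of(new_style, 'foo'))
```

The change is therefore stylistic: the parenthesized form can be misread as a function call, which is presumably why it was cleaned up along with the docstrings.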

udapi/block/tutorial/adpositions.py

Lines changed: 7 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,15 @@
11
"""tutorial.Adpositions block template.
22
3-
Example usage:
4-
for a in */sample.conllu; do
3+
Example usage::
4+
5+
for a in */sample.conllu; do
56
printf '%50s ' $a;
67
udapy tutorial.Adpositions < $a;
7-
done | tee results.txt
8+
done | tee results.txt
89
9-
# What are the English postpositions?
10-
cat UD_English/sample.conllu | udapy -TM util.Mark \
11-
node='node.upos == "ADP" and node.parent.precedes(node)' | less -R
10+
# What are the English postpositions?
11+
cat UD_English/sample.conllu | udapy -TM util.Mark \
12+
node='node.upos == "ADP" and node.parent.precedes(node)' | less -R
1213
"""
1314
from udapi.core.block import Block
1415

udapi/block/ud/addmwt.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -57,6 +57,7 @@ def multiword_analysis(self, node):
5757
"""Return a dict with MWT info or None if `node` does not represent a multiword token.
5858
5959
An example return value is::
60+
6061
{
6162
'form': 'aby bych',
6263
'lemma': 'aby být',

udapi/block/ud/convert1to2.py

Lines changed: 8 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -275,13 +275,16 @@ def fix_remnants_in_tree(self, root):
275275
Remnant's parent is always the correlate (same-role) node.
276276
Usually, correlate's parent is the head of the whole ellipsis subtree,
277277
i.e. the first conjunct. However, sometimes remnants are deeper, e.g.
278-
'Over 300 Iraqis are reported dead and 500 wounded.' with edges:
279-
nsubjpass(reported, Iraqis)
280-
nummod(Iraqis, 300)
281-
remnant(300, 500)
278+
'Over 300 Iraqis are reported dead and 500 wounded.' with edges::
279+
280+
nsubjpass(reported, Iraqis)
281+
nummod(Iraqis, 300)
282+
remnant(300, 500)
283+
282284
Let's expect all remnants in one tree are part of the same ellipsis structure.
285+
283286
TODO: theoretically, there may be more ellipsis structures with remnants in one tree,
284-
but I have no idea how to distinguish them from the deeper-remnants cases.
287+
but I have no idea how to distinguish them from the deeper-remnants cases.
285288
"""
286289
remnants = [n for n in root.descendants if n.deprel == 'remnant']
287290
if not remnants:

udapi/block/ud/el/addmwt.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -3,7 +3,7 @@
33
Notice that this should be used only for converting existing conllu files.
44
Ideally a tokenizer should have already split the MWTs.
55
Also notice that this block does not deal with the relatively rare
6-
PRON(Person=2)+'*+PRON(Person=3, i.e. "σ'το" and "στο") MWTs.
6+
``PRON(Person=2)+'*+PRON(Person=3, i.e. "σ'το" and "στο")`` MWTs.
77
"""
88
import udapi.block.ud.addmwt
99

udapi/block/ud/ro/setspaceafter.py

Lines changed: 13 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,8 @@
11
"""Block ud.ro.SetSpaceAfter for heuristic setting of SpaceAfter=No in Romanian.
22
3-
Usage:
4-
udapy -s ud.ro.SetSpaceAfter < in.conllu > fixed.conllu
3+
Usage::
4+
5+
udapy -s ud.ro.SetSpaceAfter < in.conllu > fixed.conllu
56
67
Author: Martin Popel
78
"""
@@ -13,13 +14,16 @@ class SetSpaceAfter(udapi.block.ud.setspaceafter.SetSpaceAfter):
1314
"""Block for heuristic setting of the SpaceAfter=No MISC attribute in Romanian.
1415
1516
Romanian uses many contractions, e.g.
16-
raw | meaning | tokenized | lemmatized
17-
-------|---------|-----------|-----------
18-
n-ar | nu ar | n- ar | nu avea
19-
să-i | să îi | să -i | să el
20-
într-o | în o | într- o | întru un
21-
nu-i | nu îi | nu -i | nu el
22-
nu-i | nu e | nu -i | nu fi
17+
18+
======= ======= ========= ==========
19+
raw meaning tokenized lemmatized
20+
======= ======= ========= ==========
21+
n-ar nu ar n- ar nu avea
22+
să-i să îi să -i să el
23+
într-o în o într- o întru un
24+
nu-i nu îi nu -i nu el
25+
nu-i nu e nu -i nu fi
26+
======= ======= ========= ==========
2327
2428
Detokenization is quite simple: no space after word-final hyphen and before word-initial hyphen.
2529
There are just two exceptions, I have found:
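
The detokenization rule stated in the docstring can be sketched as follows. This is an illustrative, standalone function (`detokenize` is a hypothetical name, not the block's actual implementation, which sets `SpaceAfter=No` on nodes), and it deliberately ignores the two exceptions mentioned above, since they are not shown in this hunk:

```python
def detokenize(tokens):
    """Join tokens with spaces, except around word-final/word-initial hyphens.

    No space is inserted after a token ending with '-' or
    before a token starting with '-'.
    """
    pieces = []
    for i, tok in enumerate(tokens):
        pieces.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            # Suppress the space when the hyphen glues the two tokens together.
            if not (tok.endswith('-') or nxt.startswith('-')):
                pieces.append(' ')
    return ''.join(pieces)


if __name__ == '__main__':
    # Tokenized forms taken from the table in the docstring.
    print(detokenize(['n-', 'ar']))
    print(detokenize(['să', '-i']))
    print(detokenize(['nu', 'ar']))
```

Applied to the tokenized column of the table, this reproduces the raw column, for example `n- ar` becomes `n-ar` and `să -i` becomes `să-i`.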

udapi/block/write/html.py

Lines changed: 13 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -5,17 +5,19 @@
55
class Html(BaseWriter):
66
"""A writer for HTML+JavaScript+SVG visualization of dependency trees.
77
8-
Usage:
9-
# from the command line
10-
udapy write.Html < file.conllu > file.html
11-
firefox file.html
12-
13-
# for offline use, we need to download first three JavaScript libraries
14-
wget https://code.jquery.com/jquery-2.1.4.min.js
15-
wget https://cdn.rawgit.com/eligrey/FileSaver.js/master/FileSaver.min.js
16-
wget https://cdn.rawgit.com/ufal/js-treex-view/gh-pages/js-treex-view.js
17-
udapy write.Html path_to_js=. < file.conllu > file.html
18-
firefox file.html
8+
.. code-block:: bash
9+
10+
# from the command line
11+
udapy write.Html < file.conllu > file.html
12+
firefox file.html
13+
14+
For offline use, we need to download first three JavaScript libraries::
15+
16+
wget https://code.jquery.com/jquery-2.1.4.min.js
17+
wget https://cdn.rawgit.com/eligrey/FileSaver.js/master/FileSaver.min.js
18+
wget https://cdn.rawgit.com/ufal/js-treex-view/gh-pages/js-treex-view.js
19+
udapy write.Html path_to_js=. < file.conllu > file.html
20+
firefox file.html
1921
2022
This writer produces an html file with drawings of the dependency trees
2123
in the document (there are buttons for selecting which bundle will be shown).

udapi/block/write/sdparse.py

Lines changed: 15 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -8,33 +8,35 @@ class Sdparse(BaseWriter):
88
"""A writer of files in the Stanford dependencies format, suitable for Brat visualization.
99
1010
Usage:
11-
udapy write.Sdparse print_upos=0 < in.conllu
11+
``udapy write.Sdparse print_upos=0 < in.conllu``
1212
1313
Example output::
1414
15-
~~~ sdparse
16-
Corriere Sport da pagina 23 a pagina 26
17-
name(Corriere, Sport)
18-
case(pagina-4, da)
19-
nmod(Corriere, pagina-4)
20-
nummod(pagina-4, 23)
21-
case(pagina-7, a)
22-
nmod(Corriere, pagina-7)
23-
nummod(pagina-7, 26)
24-
~~~
15+
~~~ sdparse
16+
Corriere Sport da pagina 23 a pagina 26
17+
name(Corriere, Sport)
18+
case(pagina-4, da)
19+
nmod(Corriere, pagina-4)
20+
nummod(pagina-4, 23)
21+
case(pagina-7, a)
22+
nmod(Corriere, pagina-7)
23+
nummod(pagina-7, 26)
24+
~~~
2525
2626
To visualize it, use embedded Brat, e.g. go to
27-
http://universaldependencies.org/visualization.html#editing
27+
http://universaldependencies.org/visualization.html#editing.
2828
Click the edit button and paste the output of this writer excluding the `~~~` marks.
2929
3030
Notes:
31-
Original Stanford dependencies format (http://nlp.stanford.edu/software/dependencies_manual.pdf)
31+
The original `Stanford dependencies format
32+
<http://nlp.stanford.edu/software/dependencies_manual.pdf>`_
3233
allows explicit specification of the root dependency, e.g. `root(ROOT-0, makes-8)`.
3334
However, this is not allowed by Brat, so this writer does not print it.
3435
3536
UD v2.0 allows tokens with spaces, but I am not aware of any Brat support.
3637
3738
Alternatives:
39+
3840
* `write.Conllu` Brat recently supports also the CoNLL-U input
3941
* `write.TextModeTrees` may be more readable/useful in some usecases
4042
* `write.Html` dtto, press "Save as SVG" button, convert to pdf

udapi/block/write/textmodetrees.py

Lines changed: 47 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -20,64 +20,67 @@
2020
class TextModeTrees(BaseWriter):
2121
"""An ASCII pretty printer of dependency trees.
2222
23-
SYNOPSIS
24-
# from command line (visualize CoNLL-U files)
25-
udapy write.TextModeTrees color=1 < file.conllu | less -R
23+
.. code-block:: bash
2624
27-
# is scenario (examples of other parameters)
28-
write.TextModeTrees indent=1 print_sent_id=1 print_sentence=1
29-
write.TextModeTrees zones=en,cs attributes=form,lemma,upos minimize_cross=0
25+
# from the command line (visualize CoNLL-U files)
26+
udapy write.TextModeTrees color=1 < file.conllu | less -R
27+
28+
In scenario (examples of other parameters)::
29+
30+
write.TextModeTrees indent=1 print_sent_id=1 print_sentence=1
31+
write.TextModeTrees zones=en,cs attributes=form,lemma,upos minimize_cross=0
3032
31-
DESCRIPTION
3233
This block prints dependency trees in plain-text format.
33-
For example the following CoNLL-U file (with tabs instead of spaces)
34-
35-
1 I I PRON PRP Number=Sing|Person=1 2 nsubj _ _
36-
2 saw see VERB VBD Tense=Past 0 root _ _
37-
3 a a DET DT Definite=Ind 4 det _ _
38-
4 dog dog NOUN NN Number=Sing 2 dobj _ _
39-
5 today today NOUN NN Number=Sing 2 nmod:tmod _ SpaceAfter=No
40-
6 , , PUNCT , _ 2 punct _ _
41-
7 which which DET WDT PronType=Rel 10 nsubj _ _
42-
8 was be VERB VBD Person=3|Tense=Past 10 cop _ _
43-
9 a a DET DT Definite=Ind 10 det _ _
44-
10 boxer boxer NOUN NN Number=Sing 4 acl:relcl _ SpaceAfter=No
45-
11 . . PUNCT . _ 2 punct _ _
46-
47-
will be printed (with the default parameters) as
48-
─┮
49-
│ ╭─╼ I PRON nsubj
50-
╰─┾ saw VERB root
51-
│ ╭─╼ a DET det
52-
├────────────────────────┾ dog NOUN dobj
53-
├─╼ today NOUN nmod:tmod │
54-
├─╼ , PUNCT punct │
55-
│ │ ╭─╼ which DET nsubj
56-
│ │ ├─╼ was VERB cop
57-
│ │ ├─╼ a DET det
58-
│ ╰─┶ boxer NOUN acl:relcl
59-
╰─╼ . PUNCT punct
34+
For example the following CoNLL-U file (with tabs instead of spaces)::
35+
36+
1 I I PRON PRP Number=Sing|Person=1 2 nsubj _ _
37+
2 saw see VERB VBD Tense=Past 0 root _ _
38+
3 a a DET DT Definite=Ind 4 det _ _
39+
4 dog dog NOUN NN Number=Sing 2 dobj _ _
40+
5 today today NOUN NN Number=Sing 2 nmod:tmod _ SpaceAfter=No
41+
6 , , PUNCT , _ 2 punct _ _
42+
7 which which DET WDT PronType=Rel 10 nsubj _ _
43+
8 was be VERB VBD Person=3|Tense=Past 10 cop _ _
44+
9 a a DET DT Definite=Ind 10 det _ _
45+
10 boxer boxer NOUN NN Number=Sing 4 acl:relcl _ SpaceAfter=No
46+
11 . . PUNCT . _ 2 punct _ _
47+
48+
will be printed (with the default parameters) as::
49+
50+
─┮
51+
│ ╭─╼ I PRON nsubj
52+
╰─┾ saw VERB root
53+
│ ╭─╼ a DET det
54+
├────────────────────────┾ dog NOUN dobj
55+
├─╼ today NOUN nmod:tmod │
56+
├─╼ , PUNCT punct │
57+
│ │ ╭─╼ which DET nsubj
58+
│ │ ├─╼ was VERB cop
59+
│ │ ├─╼ a DET det
60+
│ ╰─┶ boxer NOUN acl:relcl
61+
╰─╼ . PUNCT punct
6062
6163
Some non-projective trees cannot be printed without crossing edges.
62-
TextModeTrees uses a special "bridge" symbol ─╪─ to mark this:
63-
─┮
64-
│ ╭─╼ 1
65-
├─╪───┮ 2
66-
╰─┶ 3 │
67-
╰─╼ 4
68-
69-
By default parameter `color=auto`, so if the output is printed to the console
64+
TextModeTrees uses a special "bridge" symbol ─╪─ to mark this::
65+
66+
─┮
67+
│ ╭─╼ 1
68+
├─╪───┮ 2
69+
╰─┶ 3 │
70+
╰─╼ 4
71+
72+
By default parameter ``color=auto``, so if the output is printed to the console
7073
(not file or pipe), each node attribute is printed in different color.
7174
If a given node's MISC contains any of `ToDo`, `Bug` or `Mark` attributes
7275
(or any other specified in the parameter `mark`), the node will be highlighted
7376
(by reversing the background and foreground colors).
7477
7578
This block's method `process_tree` can be called on any node (not only root),
76-
which is useful for printing subtrees using `node.print_subtree()`,
79+
which is useful for printing subtrees using ``node.print_subtree()``,
7780
which is internally implemented using this block.
7881
7982
SEE ALSO
80-
`write.TextModeTreesHtml`
83+
:py:class:`.TextModeTreesHtml`
8184
"""
8285

8386
def __init__(self, print_sent_id=True, print_text=True, add_empty_line=True, indent=1,

udapi/block/write/tikz.py

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -7,19 +7,20 @@
77
class Tikz(BaseWriter):
88
r"""A writer of files in the LaTeX with tikz-dependency format.
99
10-
Usage:
11-
udapy write.Tikz < my.conllu > my.tex
12-
pdflatex my.tex
13-
xdg-open my.pdf
10+
Usage::
11+
12+
udapy write.Tikz < my.conllu > my.tex
13+
pdflatex my.tex
14+
xdg-open my.pdf
1415
1516
Long sentences may result in too large pictures.
1617
You can tune the width (in addition to changing fontsize or using minipage and rescaling) with
17-
``\begin{deptext}[column sep=0.2cm]``
18+
``\begin{deptext}[column sep=0.2cm]``
1819
or individually for each word:
19-
``My \&[.5cm] dog \& etc.``
20+
``My \&[.5cm] dog \& etc.``
2021
By default, the height of the horizontal segment of a dependency edge is proportional
2122
to the distance between the linked words. You can tune the height with:
22-
``\depedge[edge unit distance=1.5ex]{9}{1}{deprel}``
23+
``\depedge[edge unit distance=1.5ex]{9}{1}{deprel}``
2324
2425
See `tikz-dependency documentation
2526
<http://mirrors.ctan.org/graphics/pgf/contrib/tikz-dependency/tikz-dependency-doc.pdf>`_
