Commit 8f10929

clean code and add comments.
1 parent 431f46f commit 8f10929

11 files changed: 736 additions and 192 deletions
Lines changed: 0 additions & 1 deletion

@@ -1,3 +1,2 @@
-data
 *.txt
 *.pyc
Lines changed: 55 additions & 1 deletion

@@ -1 +1,55 @@
-TBD
+# Globally Normalized Reader
+
+This model implements the work in the following paper:
+
+Jonathan Raiman and John Miller. Globally Normalized Reader. Empirical Methods in Natural Language Processing (EMNLP), 2017.
+
+If you use the dataset/code in your research, please cite the above paper:
+
+```text
+@inproceedings{raiman2015gnr,
+    author={Raiman, Jonathan and Miller, John},
+    booktitle={Empirical Methods in Natural Language Processing (EMNLP)},
+    title={Globally Normalized Reader},
+    year={2017},
+}
+```
+
+You can also visit https://github.com/baidu-research/GloballyNormalizedReader for more information.
+
+# Installation
+
+1. Please use the [Docker image](http://doc.paddlepaddle.org/develop/doc/getstarted/build_and_install/docker_install_en.html) to install the latest PaddlePaddle by running:
+```bash
+docker pull paddledev/paddle
+```
+2. Download all the necessary data by running:
+```bash
+cd data && ./download.sh
+```
+3. Featurize the data by running:
+```bash
+python featurize.py --datadir data --outdir featurized
+```
+
+# Training a Model
+
+- Configure the model by modifying `config.py` if needed, and then run:
+
+```bash
+python train.py 2>&1 | tee train.log
+```
+
+# Inference with a Trained Model
+
+- Run inference with a trained model:
+```bash
+python infer.py \
+    --model_path models/pass_00000.tar.gz \
+    --data_dir data/featurized/ \
+    --batch_size 2 \
+    --use_gpu 0 \
+    --trainer_count 1 \
+    2>&1 | tee infer.log
+```
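The training step in the README above says to adjust `config.py` before running `train.py`, but that file is not part of this diff. The following is only a hypothetical sketch of the kind of settings such a config might gather, reusing names that do appear elsewhere in this commit (`hidden_dim`, `depth`, `drop_rate`, `batch_size`, `use_gpu`, `trainer_count`); the real `config.py` may look entirely different.

```python
# Hypothetical sketch of a config.py -- NOT the file shipped with this commit.
# Every value below is an assumption for illustration only.


class ModelConfig(object):
    """Illustrative model hyper-parameters."""
    vocab_size = None   # to be filled in from the featurized data
    emb_dim = 128       # embedding size (assumed)
    hidden_dim = 128    # LSTM hidden-state size, see basic_modules.py
    depth = 2           # number of stacked bi-directional LSTM layers
    drop_rate = 0.1     # dropout applied to LSTM outputs


class TrainConfig(object):
    """Illustrative training settings."""
    batch_size = 2      # matches the --batch_size value in the README example
    use_gpu = False     # matches --use_gpu 0
    trainer_count = 1   # matches --trainer_count 1
    num_passes = 10     # assumed number of training passes
```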

globally_normalized_reader/basic_modules.py

Lines changed: 68 additions & 22 deletions
@@ -1,17 +1,50 @@
 #!/usr/bin/env python
 #coding=utf-8
+
 import collections
 
 import paddle.v2 as paddle
 from paddle.v2.layer import parse_network
 
 __all__ = [
     "stacked_bidirectional_lstm",
+    "stacked_bidirectional_lstm_by_nested_seq",
     "lstm_by_nested_sequence",
 ]
 
 
-def stacked_bidirectional_lstm(inputs, size, depth, drop_rate=0., prefix=""):
+def stacked_bidirectional_lstm(inputs,
+                               hidden_dim,
+                               depth,
+                               drop_rate=0.,
+                               prefix=""):
+    """The stacked bi-directional LSTM.
+
+    In PaddlePaddle, recurrent layers have two different implementations:
+    1. A recurrent layer implemented by recurrent_group: any intermediate
+       state a recurrent unit computes during one time step, such as hidden
+       states, the input-to-hidden mapping, memory cells and so on, is
+       accessible.
+    2. A recurrent layer as a whole: only the outputs of the recurrent layer
+       are accessible.
+
+    The second type (the recurrent layer as a whole) is more computationally
+    efficient, because recurrent_group is made up of many basic layers
+    (including additions, element-wise multiplications, matrix
+    multiplications and so on).
+
+    This function uses the second type to implement the stacked
+    bi-directional LSTM.
+
+    Arguments:
+        - inputs:     The input layer(s) to the bi-directional LSTM.
+        - hidden_dim: The dimension of the LSTM hidden state.
+        - depth:      The depth of the stacked bi-directional LSTM.
+        - drop_rate:  The dropout rate applied to the LSTM output states.
+        - prefix:     A string added to the name of each layer created in
+                      this function. Each layer in a network must have a
+                      unique name, so the prefix allows this function to be
+                      called multiple times.
+    """
+
     if not isinstance(inputs, collections.Sequence):
         inputs = [inputs]
 
@@ -20,7 +53,7 @@ def stacked_bidirectional_lstm(inputs, size, depth, drop_rate=0., prefix=""):
         for i in range(depth):
             input_proj = paddle.layer.mixed(
                 name="%s_in_proj_%0d_%s__" % (prefix, i, dirt),
-                size=size * 4,
+                size=hidden_dim * 4,
                 bias_attr=paddle.attr.Param(initial_std=0.),
                 input=[paddle.layer.full_matrix_projection(lstm)] if i else [
                     paddle.layer.full_matrix_projection(in_layer)
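The two hunks above define and use `stacked_bidirectional_lstm`. For reference, here is a minimal usage sketch; it is not part of this commit, the vocabulary size, embedding size, and layer names are illustrative assumptions, and the import assumes the script sits next to `globally_normalized_reader/basic_modules.py`.

```python
# Hypothetical smoke test, not part of this commit: encode a word sequence
# with the stacked bi-directional LSTM defined in basic_modules.py.
import paddle.v2 as paddle
from paddle.v2.layer import parse_network

from basic_modules import stacked_bidirectional_lstm

vocab_size = 1024  # illustrative value
emb_dim = 128      # illustrative value

# A plain integer word-id sequence, mapped to embeddings.
words = paddle.layer.data(
    name="word",
    type=paddle.data_type.integer_value_sequence(vocab_size))
embedding = paddle.layer.embedding(input=words, size=emb_dim)

# Stack two bi-directional LSTM layers over the embeddings.
encoded = stacked_bidirectional_lstm(
    inputs=embedding, hidden_dim=128, depth=2, drop_rate=0.1, prefix="__enc")

# Print the parsed network configuration as a quick sanity check.
print(parse_network(encoded))
```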
@@ -45,8 +78,8 @@
 
 
 def lstm_by_nested_sequence(input_layer, hidden_dim, name="", reverse=False):
-    '''
-    This is a LSTM implemented by a nested recurrent_group.
+    """This is a LSTM implemented by a nested recurrent_group.
+
     A paragraph is a natural nested sequence:
     1. each paragraph is a sequence of sentences.
     2. each sentence is a sequence of words.
@@ -60,7 +93,14 @@ def lstm_by_nested_sequence(input_layer, hidden_dim, name="", reverse=False):
     5. Consequently, this function is just equivalent to concatenating all
        sentences in a paragraph into one (long) sentence, and using one LSTM
        to encode this new long sentence.
-    '''
+
+    Arguments:
+        - input_layer: The input layer to the bi-directional LSTM.
+        - hidden_dim:  The dimension of the LSTM hidden state.
+        - name:        The name of the bi-directional LSTM.
+        - reverse:     A boolean indicating whether to process the input
+                       sequence in reverse order.
+    """
 
     def lstm_outer_step(lstm_group_input, hidden_dim, reverse, name=''):
         outer_memory = paddle.layer.memory(
@@ -71,9 +111,8 @@ def lstm_inner_step(input_layer, hidden_dim, reverse, name):
                 name="__inner_state_%s__" % name,
                 size=hidden_dim,
                 boot_layer=outer_memory)
-            input_proj = paddle.layer.fc(size=hidden_dim * 4,
-                                         bias_attr=False,
-                                         input=input_layer)
+            input_proj = paddle.layer.fc(
+                size=hidden_dim * 4, bias_attr=False, input=input_layer)
             return paddle.networks.lstmemory_unit(
                 input=input_proj,
                 name="__inner_state_%s__" % name,
@@ -111,7 +150,27 @@ def lstm_inner_step(input_layer, hidden_dim, reverse, name):
         reverse=reverse)
 
 
-def stacked_bi_lstm_by_nested_seq(input_layer, depth, hidden_dim, prefix=""):
+def stacked_bidirectional_lstm_by_nested_seq(input_layer,
+                                             depth,
+                                             hidden_dim,
+                                             prefix=""):
+    """The stacked bi-directional LSTM to process a nested sequence.
+
+    The module defined in this function is exactly equivalent to the one
+    defined in stacked_bidirectional_lstm; the only difference is that the
+    bi-directional LSTM defined here is implemented with recurrent_group in
+    PaddlePaddle and receives a nested sequence as its input.
+
+    Arguments:
+        - input_layer: The input layer to the bi-directional LSTM.
+        - hidden_dim:  The dimension of the LSTM hidden state.
+        - depth:       The depth of the stacked bi-directional LSTM.
+        - prefix:      A string added to the name of each layer created in
+                       this function. Each layer in a network must have a
+                       unique name, so the prefix allows this function to be
+                       called multiple times.
+    """
+
     lstm_final_outs = []
     for dirt in ["fwd", "bwd"]:
         for i in range(depth):
@@ -122,16 +181,3 @@ def stacked_bi_lstm_by_nested_seq(input_layer, depth, hidden_dim, prefix=""):
                 reverse=(dirt == "bwd"))
             lstm_final_outs.append(lstm_out)
     return paddle.layer.concat(input=lstm_final_outs)
-
-
-if __name__ == "__main__":
-    vocab_size = 1024
-    emb_dim = 128
-    embedding = paddle.layer.embedding(
-        input=paddle.layer.data(
-            name="word",
-            type=paddle.data_type.integer_value_sub_sequence(vocab_size)),
-        size=emb_dim)
-    print(parse_network(
-        stacked_bi_lstm_by_nested_seq(
-            input_layer=embedding, depth=3, hidden_dim=128, prefix="__lstm")))
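The last hunk above deletes the module-level smoke test. A roughly equivalent sketch that calls the renamed `stacked_bidirectional_lstm_by_nested_seq` would look like the following; it is not part of the commit and reuses the illustrative sizes from the deleted block, with the import path assumed to be the module's own directory.

```python
# Sketch of the removed smoke test, updated for the renamed function.
# It builds a nested (paragraph -> sentence -> word) integer sequence,
# embeds the word ids, and prints the parsed network configuration.
import paddle.v2 as paddle
from paddle.v2.layer import parse_network

from basic_modules import stacked_bidirectional_lstm_by_nested_seq

vocab_size = 1024  # illustrative value, as in the deleted block
emb_dim = 128      # illustrative value, as in the deleted block

embedding = paddle.layer.embedding(
    input=paddle.layer.data(
        name="word",
        type=paddle.data_type.integer_value_sub_sequence(vocab_size)),
    size=emb_dim)

print(parse_network(
    stacked_bidirectional_lstm_by_nested_seq(
        input_layer=embedding, depth=3, hidden_dim=128, prefix="__lstm")))
```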
