Here we show how to create your own model in T2T.
T2TModel has three typical usages:
- Estimator: The method `make_estimator_model_fn` builds a `model_fn` for the tf.Estimator workflow of training, evaluation, and prediction. It performs the method `call`, which performs the core computation, followed by `estimator_spec_train`, `estimator_spec_eval`, or `estimator_spec_predict`, depending on the tf.Estimator mode.
- Layer: The method `call` enables `T2TModel` to be used as a callable by itself. It calls the following methods:
  - `bottom`, which transforms features according to `problem_hparams`' input and target `Modality`s;
  - `body`, which takes features and performs the core model computation to return output and any auxiliary loss terms;
  - `top`, which takes features and the body output, and transforms them according to `problem_hparams`' input and target `Modality`s to return the final logits;
  - `loss`, which takes the logits, forms any missing training loss, and sums all loss terms.
- Inference: The method `infer` enables `T2TModel` to make sequence predictions by itself.
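The order in which `call` chains `bottom`, `body`, `top`, and `loss` can be sketched in plain Python. This is only an illustrative stand-in: `ToyModel`, its arithmetic, and the list-based "tensors" are assumptions made for the example, not the T2T implementation.

```python
class ToyModel:
    """Stand-in for the call pipeline described above; not the T2T implementation."""

    def bottom(self, features):
        # Cast raw values to float, standing in for a modality transform.
        return {key: [float(v) for v in values] for key, values in features.items()}

    def body(self, features):
        # Core computation: here, simply double every input value.
        output = [2.0 * v for v in features["inputs"]]
        aux_losses = {"extra": 0.0}  # auxiliary loss terms, none in this toy
        return output, aux_losses

    def top(self, body_output, features):
        # Map the body output to "logits"; the identity in this toy.
        return body_output

    def loss(self, logits, features):
        # Mean squared error against the targets.
        targets = features["targets"]
        return sum((l - t) ** 2 for l, t in zip(logits, targets)) / len(targets)

    def __call__(self, features):
        transformed = self.bottom(features)
        body_output, losses = self.body(transformed)
        logits = self.top(body_output, transformed)
        losses["training"] = self.loss(logits, transformed)
        # Sum all loss terms, as the loss step is described to do.
        return logits, sum(losses.values())


model = ToyModel()
logits, total_loss = model({"inputs": [1, 2, 3], "targets": [2, 4, 7]})
print(logits, total_loss)
```

When you write your own model you normally override only `body`; the sketch just makes the surrounding steps visible.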
- Create a class that extends `T2TModel`. This example creates a copy of an existing basic fully-connected network:

      from tensor2tensor.utils import t2t_model

      class MyFC(t2t_model.T2TModel):
        pass
- Implement the `body` method:

      import tensorflow as tf  # TF 1.x API, as used by the calls below

      from tensor2tensor.layers import common_layers
      from tensor2tensor.utils import t2t_model

      class MyFC(t2t_model.T2TModel):

        def body(self, features):
          hparams = self.hparams
          x = features["inputs"]
          shape = common_layers.shape_list(x)
          # Flatten the input; in T2T all inputs are 4-D tensors.
          x = tf.reshape(x, [-1, shape[1] * shape[2] * shape[3]])
          for i in range(hparams.num_hidden_layers):  # Create the layers.
            x = tf.layers.dense(x, hparams.hidden_size, name="layer_%d" % i)
            x = tf.nn.dropout(x, keep_prob=1.0 - hparams.dropout)
            x = tf.nn.relu(x)
          # Expand back to 4-D for T2T.
          return tf.expand_dims(tf.expand_dims(x, axis=1), axis=1)
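To follow how shapes change through this `body`, here is a small shape-only walk-through in plain Python. The helper `fc_body_shapes` and the example input shape `[32, 28, 28, 1]` are hypothetical and chosen only for illustration; no TensorFlow is involved.

```python
from functools import reduce
from operator import mul


def fc_body_shapes(input_shape, hidden_size, num_hidden_layers):
    """List the tensor shapes produced at each step of the fully-connected body."""
    batch, *rest = input_shape
    if len(rest) != 3:
        raise ValueError("expected a 4-D input shape [batch, d1, d2, d3]")
    flat = reduce(mul, rest, 1)  # d1 * d2 * d3
    shapes = [("inputs", list(input_shape)), ("flattened", [batch, flat])]
    for i in range(num_hidden_layers):
        # Each dense layer maps the last dimension to hidden_size.
        shapes.append(("layer_%d" % i, [batch, hidden_size]))
    # With zero hidden layers the flattened tensor is expanded unchanged.
    last_dim = hidden_size if num_hidden_layers > 0 else flat
    shapes.append(("output", [batch, 1, 1, last_dim]))
    return shapes


shapes = fc_body_shapes([32, 28, 28, 1], hidden_size=64, num_hidden_layers=2)
for name, shape in shapes:
    print(name, shape)
```

The two `expand_dims` calls at the end of `body` are what restore the 4-D layout in the final step.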
Method Signature:
- Args:
  - features: dict of str to Tensor, where each Tensor has shape `[batch_size, ..., hidden_size]`. It typically contains keys `inputs` and `targets`.
- Returns one of:
  - output: Tensor of pre-logit activations with shape `[batch_size, ..., hidden_size]`.
  - losses: Either a single loss as a scalar, a list, a Tensor (to be averaged), or a dictionary of losses. If losses is a dictionary with the key `"training"`, `losses["training"]` is considered the final training loss and output is considered logits; `self.top` and `self.loss` will be skipped.
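The return conventions above can be made concrete with a small sketch. It assumes `body` returns either `output` alone or an `(output, losses)` pair, uses plain floats in place of Tensors, and sums a list of losses into one term; the helper `summarize_body_return` and the `"extra"` key are hypothetical names for this example, not the library's own code.

```python
def summarize_body_return(body_return):
    """Return (output, losses_dict, skip_top_and_loss) for a body() return value."""
    if isinstance(body_return, tuple):
        output, losses = body_return
    else:
        output, losses = body_return, None

    if losses is None:
        losses = {}
    elif isinstance(losses, dict):
        losses = dict(losses)  # copy so the caller's dict is not modified
    elif isinstance(losses, (list, tuple)):
        # Assumption of this sketch: a list of losses is summed into one term.
        losses = {"extra": sum(losses)}
    else:
        losses = {"extra": losses}  # single scalar loss

    # A "training" key means output is already logits and the loss is final.
    skip_top_and_loss = "training" in losses
    return output, losses, skip_top_and_loss


print(summarize_body_return("activations"))
print(summarize_body_return(("activations", 0.5)))
print(summarize_body_return(("logits", {"training": 1.25})))
```

Returning a dictionary with `"training"` is the route to take when your model computes its own logits and loss inside `body`.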
- Register your model:

      from tensor2tensor.utils import registry

      @registry.register_model
      class MyFC(t2t_model.T2TModel):
        # ...
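For intuition about what a registration decorator does, here is a minimal, self-contained sketch. The names `_MODELS`, `camel_to_snake`, `register_model`, and `lookup_model` are hypothetical stand-ins written for this example; the real `registry` module may differ in details such as edge cases of the name conversion.

```python
import re

_MODELS = {}  # hypothetical module-level registry: snake_case name -> class

_FIRST_CAP = re.compile(r"(.)([A-Z][a-z0-9]+)")
_ALL_CAP = re.compile(r"([a-z0-9])([A-Z])")


def camel_to_snake(name):
    """Convert a CamelCase class name to snake_case, e.g. MyFC -> my_fc."""
    partial = _FIRST_CAP.sub(r"\1_\2", name)
    return _ALL_CAP.sub(r"\1_\2", partial).lower()


def register_model(cls):
    """Class decorator: store cls under the snake_case form of its name."""
    name = camel_to_snake(cls.__name__)
    if name in _MODELS:
        raise LookupError("Model %s already registered." % name)
    _MODELS[name] = cls
    return cls


def lookup_model(name):
    """Return the class registered under name."""
    if name not in _MODELS:
        raise LookupError("Model %s never registered." % name)
    return _MODELS[name]


@register_model
class MyFC:
    pass


print(sorted(_MODELS))
```

Because registration happens as a side effect of the decorator, the module defining the class has to be imported before the model can be found by name.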
- Use it with the T2T tools like any other model:

  Keep in mind that names are translated from CamelCase to snake_case (`MyFC` -> `my_fc`) and that you need to point T2T to the directory containing your model with the `--t2t_usr_dir` flag. For example, if you want to train a model on gcloud with 1 GPU worker on the IMDB sentiment task, you can run your model by executing the following command from your model class directory.

      t2t-trainer \
        --model=my_fc \
        --t2t_usr_dir=. \
        --cloud_mlengine --worker_gpu=1 \
        --generate_data \
        --data_dir='gs://data' \
        --output_dir='gs://out' \
        --problem=sentiment_imdb \
        --hparams_set=basic_fc_small \
        --train_steps=10000 \
        --eval_steps=10
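If you launch training from a script rather than a shell, the same command can be assembled as an argument list. The sketch below only builds and prints the command; the helper `build_trainer_command` is a hypothetical name, and the flags and values are the ones from the example above.

```python
import shlex


def build_trainer_command(model, problem, hparams_set, data_dir, output_dir,
                          usr_dir=".", train_steps=10000, eval_steps=10):
    """Assemble the t2t-trainer invocation as a list of arguments."""
    return [
        "t2t-trainer",
        "--model=%s" % model,
        "--t2t_usr_dir=%s" % usr_dir,
        "--cloud_mlengine",
        "--worker_gpu=1",
        "--generate_data",
        "--data_dir=%s" % data_dir,
        "--output_dir=%s" % output_dir,
        "--problem=%s" % problem,
        "--hparams_set=%s" % hparams_set,
        "--train_steps=%d" % train_steps,
        "--eval_steps=%d" % eval_steps,
    ]


command = build_trainer_command(
    model="my_fc",
    problem="sentiment_imdb",
    hparams_set="basic_fc_small",
    data_dir="gs://data",
    output_dir="gs://out",
)
# Print a shell-safe form; pass `command` to subprocess.run to actually launch it.
print(" ".join(shlex.quote(arg) for arg in command))
```

Passing the list directly to `subprocess.run(command, check=True)` avoids shell quoting issues, at the cost of requiring `t2t-trainer` to be on the `PATH`.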