examples.gsf package

Submodules

examples.gsf.mltr30k module

examples.gsf.mltr30k.main(args: argparse.Namespace)

examples.gsf.mltr30k.make_command_line_options()

examples.gsf.mltr30k.prepare_data(args: argparse.Namespace)

examples.gsf.mltr30k.setup_model(args: argparse.Namespace)

examples.gsf.mltr30k_boost module

examples.gsf.mltr30k_boost.main(args: argparse.Namespace)

Split the training dataset T and the validation dataset V into two parts each:

T_{1} and V_{1}: training and validation of the “weak” learners.
T_{2} and V_{2}: training and validation of an ensemble built from the weak learners.

Assign uniform weights to every query/instance in T_{1}.

For computational efficiency, assign the same hyper-parameters to all the weak learners.

For i = 1, …, N (until a stopping criterion is met):

Construct a weak neural network NN_{i} that uses a random (but reproducible) subset F of the available features.
Train NN_{i} on T_{1} until its performance on V_{1} plateaus.
Calculate the error of NN_{i} on T_{1}. The exact definition of this error is TBD, but it should be based on NDCG.
Use the error to update the weight of each instance.

Construct an ensemble model. To train it, each weak learner makes a prediction for each data point (i.e., each document in a query), and these predictions are used as additional sequential input features to the ensemble model, alongside the full set of features for the data.
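The initialisation and feature-selection steps above can be sketched in plain Python. This is only an illustration, not the module's implementation: the helper names `uniform_weights` and `feature_subset`, the `base_seed` parameter, and the feature counts are all hypothetical, and the sketch assumes that "reproducible" means seeding a private random generator per weak learner.

```python
import random


def uniform_weights(n_queries):
    """Give every query in T_1 the same initial weight (weights sum to 1)."""
    return [1.0 / n_queries] * n_queries


def feature_subset(n_features, subset_size, base_seed, learner_index):
    """Pick a random but reproducible subset F of feature indices.

    Seeding a private generator with (base_seed + learner_index) means that
    learner i always receives the same subset on every run.
    (Hypothetical helper; the seeding scheme is an assumption.)
    """
    rng = random.Random(base_seed + learner_index)
    return sorted(rng.sample(range(n_features), subset_size))


# Hypothetical sizes: 4 queries, 20 available features, 8 features per learner.
weights = uniform_weights(4)
subsets = [feature_subset(20, 8, base_seed=42, learner_index=i) for i in range(3)]
```

Using a separate `random.Random` instance per learner, rather than the global generator, keeps each subset independent of whatever other random calls happen during training.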

Parameters: args –
Returns

examples.gsf.mltr30k_boost.make_callbacks(args, model_num: Optional[int] = None)

examples.gsf.mltr30k_boost.prepare_data(args: argparse.Namespace)

examples.gsf.mltr30k_boost.prepare_strong_data(new_data, args, is_training: bool)

Create a new dataset that merges the original data with the predictions from the weak models.

Parameters

orig_data – A dataset created using the standard pipeline.
pred_data – A list of prediction datasets. Each prediction dataset should contain data points with a single 1D tensor with predicted scores for each document. Note: the number of documents in each data point may be different.
args –
is_training –

Returns
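As a rough illustration of the merge described for this function, the sketch below appends each weak model's score to the feature vector of the corresponding document. It is not the module's code: it uses nested Python lists instead of TensorFlow datasets, and the helper name `merge_predictions` and the sample values are hypothetical. It assumes each prediction list holds one score per document, with the number of documents varying per query.

```python
def merge_predictions(orig_features, pred_scores):
    """Append each weak model's score to every document's feature vector.

    orig_features: list of queries, each a list of per-document feature lists.
    pred_scores:   one entry per weak model; each entry is a list of queries,
                   each holding a 1D list with one score per document.
    (Hypothetical helper working on plain lists, not tf.data datasets.)
    """
    merged = []
    for q, docs in enumerate(orig_features):
        # Every weak model must score exactly the documents of this query.
        for scores in pred_scores:
            if len(scores[q]) != len(docs):
                raise ValueError(f"query {q}: score/document count mismatch")
        merged.append([
            list(doc) + [scores[q][d] for scores in pred_scores]
            for d, doc in enumerate(docs)
        ])
    return merged


# Two queries with 2 and 3 documents, 2 features each; two weak models.
orig = [[[0.1, 0.2], [0.3, 0.4]],
        [[0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]]
preds = [[[1.0, 2.0], [3.0, 4.0, 5.0]],
         [[10.0, 20.0], [30.0, 40.0, 50.0]]]
strong = merge_predictions(orig, preds)
```

Each document in `strong` ends up with its original features followed by one extra value per weak model, so the ensemble model sees both the raw data and the weak learners' opinions.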

examples.gsf.mltr30k_boost.prepare_test_data(data: tensorflow.python.data.ops.dataset_ops.DatasetV2, features, args)

examples.gsf.mltr30k_boost.prepare_weak_data(train_data, valid_data, features, weights, args)

examples.gsf.mltr30k_boost.setup_model(args: argparse.Namespace, n_model_features: int = 0)

examples.gsf.mltr30k_boost.update_weights(weak_model, data: tensorflow.python.data.ops.dataset_ops.DatasetV2, weights: tensorflow.python.framework.ops.Tensor, ndcg_interval: float = 1)
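The documentation does not say how the weights are updated, only that the error should be based on NDCG. One possible scheme, given purely as an assumption, treats `1 - NDCG` as the per-query error and scales each weight by `exp(error)` before renormalising, in the spirit of AdaBoost. The helpers `dcg`, `ndcg`, and `reweight` below are hypothetical, work on plain Python lists rather than tensors, and do not model the `ndcg_interval` parameter.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance labels given in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances))


def ndcg(labels, scores):
    """NDCG of one query: DCG of the predicted ranking over the ideal DCG."""
    ranked = [lab for _, lab in sorted(zip(scores, labels), key=lambda p: -p[0])]
    ideal = dcg(sorted(labels, reverse=True))
    # Assumption: a query with no relevant documents counts as error-free.
    return dcg(ranked) / ideal if ideal > 0 else 1.0


def reweight(weights, ndcgs):
    """Scale each query weight by exp(1 - NDCG), then renormalise to sum to 1."""
    scaled = [w * math.exp(1.0 - n) for w, n in zip(weights, ndcgs)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Two queries: the first is ranked perfectly, the second in reverse order.
labels = [[2, 1, 0], [2, 1, 0]]
scores = [[0.9, 0.5, 0.1], [0.1, 0.5, 0.9]]
ndcgs = [ndcg(l, s) for l, s in zip(labels, scores)]
new_weights = reweight([0.5, 0.5], ndcgs)
```

Under this scheme the badly ranked query gains weight relative to the well ranked one, so the next weak learner pays more attention to it.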

Module contents
