examples.gsf package

Submodules

examples.gsf.mltr30k module

examples.gsf.mltr30k.main(args: argparse.Namespace)
examples.gsf.mltr30k.make_command_line_options()
examples.gsf.mltr30k.prepare_data(args: argparse.Namespace)
examples.gsf.mltr30k.setup_model(args: argparse.Namespace)

examples.gsf.mltr30k_boost module

examples.gsf.mltr30k_boost.main(args: argparse.Namespace)
Split the training dataset T and the validation dataset V into two parts each:
  1. T_{1} and V_{1}: training and validation of the “weak” learners.

  2. T_{2} and V_{2}: training and validation of an ensemble built from the weak learners.
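
The docstring does not say how the split is made. As a rough illustration only, a reproducible split by query id could look like the sketch below; the helper `split_in_two`, the 50/50 fraction, and the seed are all assumptions and are not taken from the module.

```python
import random


def split_in_two(query_ids, fraction, seed):
    """Shuffle query ids reproducibly and split them into two disjoint parts."""
    ids = sorted(query_ids)              # fixed starting order, independent of input order
    random.Random(seed).shuffle(ids)     # local RNG, so the global random state is untouched
    cut = int(len(ids) * fraction)
    return ids[:cut], ids[cut:]


# Hypothetical training set of 10 queries, split evenly into two parts.
train_queries = list(range(10))
t1, t2 = split_in_two(train_queries, fraction=0.5, seed=0)
```

Splitting by query rather than by document keeps all documents of a query in the same part, which ranking metrics such as NDCG require.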

Assign uniform weights to every query/instance in T_{1}.

For computational efficiency assign the same hyper-parameters to all the weak learners.

For i = 1…N (until the stopping criterion is met):
  • Construct a weak neural network NN_{i} that uses a random (but reproducible) subset F of the available features.

  • Train NN_{i} on T_{1} until performance on V_{1} plateaus.

  • Calculate the error on T_{1}. The exact definition is TBD, but it should be based on NDCG.

  • Use the error to update the weight of each instance.
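
The per-iteration steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the module's implementation: the helper names (`feature_subset`, `ndcg`, `reweight`), the choice of 1 - NDCG as the per-query error, and the exponential re-weighting rule are all hypothetical, since the docstring leaves the error definition TBD.

```python
import math
import random


def feature_subset(n_features, k, seed, model_num):
    """Pick k feature indices; the same (seed, model_num) always gives the same subset."""
    rng = random.Random(seed * 1_000_003 + model_num)
    return sorted(rng.sample(range(n_features), k))


def dcg(labels):
    """Discounted cumulative gain of relevance labels given in ranked order."""
    return sum((2 ** rel - 1) / math.log2(pos + 2) for pos, rel in enumerate(labels))


def ndcg(scores, labels):
    """NDCG of one query: rank documents by score and compare with the ideal ranking."""
    ranked = [rel for _, rel in sorted(zip(scores, labels), key=lambda p: -p[0])]
    ideal = dcg(sorted(labels, reverse=True))
    return dcg(ranked) / ideal if ideal > 0 else 1.0


def reweight(weights, errors):
    """Assumed update: scale each weight by exp(error), then renormalise to sum to 1."""
    raw = [w * math.exp(e) for w, e in zip(weights, errors)]
    total = sum(raw)
    return [r / total for r in raw]


# Toy data: two queries, each a (scores, labels) pair from a hypothetical weak model.
queries = [([0.9, 0.1, 0.4], [2, 0, 1]),   # ranked perfectly
           ([0.2, 0.8, 0.5], [2, 0, 1])]   # ranked in reverse
weights = [1.0 / len(queries)] * len(queries)                        # uniform initial weights
errors = [1.0 - ndcg(scores, labels) for scores, labels in queries]  # per-query error in [0, 1]
weights = reweight(weights, errors)                                  # badly ranked query gains weight
subset = feature_subset(n_features=10, k=4, seed=42, model_num=0)
```

Deriving the feature subset from a seed and the model number means a weak learner can be rebuilt later (for example at test time) without storing its feature list.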

Construct an ensemble model. To train it, each weak learner makes a prediction for every data point (i.e., each document in a query); these predictions are fed to the ensemble model as additional sequential input features, alongside the full set of features for the data.
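
One way to picture the ensemble input is as a per-document concatenation of the original features with one score per weak learner. The sketch below uses plain Python lists and a hypothetical helper, `append_predictions`; the actual pipeline operates on TensorFlow datasets, so this only illustrates the shape of the data.

```python
def append_predictions(query_features, weak_predictions):
    """Append each weak learner's score to the feature vector of every document.

    query_features: list of per-document feature lists for one query.
    weak_predictions: one list of per-document scores per weak learner.
    """
    n_docs = len(query_features)
    for scores in weak_predictions:
        if len(scores) != n_docs:
            raise ValueError("each weak learner must score every document in the query")
    return [
        list(doc) + [scores[i] for scores in weak_predictions]
        for i, doc in enumerate(query_features)
    ]


# One query with 3 documents and 2 original features, scored by 2 weak learners.
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
predictions = [[0.9, 0.1, 0.4],   # weak learner 1
               [0.7, 0.2, 0.5]]   # weak learner 2
stacked = append_predictions(features, predictions)
```

With N weak learners, every document gains N extra input features, which is presumably what the `n_model_features` argument of `setup_model` accounts for.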

Parameters

args

Returns

examples.gsf.mltr30k_boost.make_callbacks(args, model_num: Optional[int] = None)
examples.gsf.mltr30k_boost.prepare_data(args: argparse.Namespace)
examples.gsf.mltr30k_boost.prepare_strong_data(new_data, args, is_training: bool)

Create a new dataset that merges the original data with the predictions from the weak models.

Parameters
  • orig_data – A dataset created using the standard pipeline.

  • pred_data – A list of prediction datasets. Each data point in a prediction dataset should hold a single 1D tensor of predicted scores, one per document. Note: the number of documents may differ between data points.

  • args

  • is_training

Returns

examples.gsf.mltr30k_boost.prepare_test_data(data: tensorflow.python.data.ops.dataset_ops.DatasetV2, features, args)
examples.gsf.mltr30k_boost.prepare_weak_data(train_data, valid_data, features, weights, args)
examples.gsf.mltr30k_boost.setup_model(args: argparse.Namespace, n_model_features: int = 0)
examples.gsf.mltr30k_boost.update_weights(weak_model, data: tensorflow.python.data.ops.dataset_ops.DatasetV2, weights: tensorflow.python.framework.ops.Tensor, ndcg_interval: float = 1)

Module contents
