ReferentialGym.agents package

Submodules

ReferentialGym.agents.agent module

ReferentialGym.agents.agent.vae_loss_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
ReferentialGym.agents.agent.maxl1_loss_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
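Both hook functions above share the same signature; a custom hook with that signature can be attached to an agent via register_hook (a minimal sketch — the logged key below is hypothetical):

    def my_logging_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs):
        # Record a hypothetical scalar under this agent's identifier (agent_id is a constructor argument).
        logs_dict[f"{agent.agent_id}/my_metric"] = 0.0

    # agent is assumed to be an instantiated Agent subclass:
    # agent.register_hook(my_logging_hook)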
class ReferentialGym.agents.agent.Agent(agent_id='l0', obs_shape=[1, 1, 1, 32, 32], vocab_size=100, max_sentence_length=10, logger=None, kwargs=None, role=None)

Bases: ReferentialGym.modules.module.Module

get_input_stream_keys()
get_input_stream_ids()
clone(clone_id='a0')
save(path)
_tidyup()
_log(log_dict, batch_size)
register_hook(hook)
forward(sentences, experiences, multi_round=False, graphtype='straight_through_gumbel_softmax', tau0=0.2)
Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • multi_round – Boolean defining whether to utter a sentence back or not.

  • graphtype – String defining the type of symbols used in the output sentence: - ‘categorical’: one-hot-encoded symbols. - ‘gumbel_softmax’: continuous relaxation of a categorical distribution. - ‘straight_through_gumbel_softmax’: improved continuous relaxation… - ‘obverter’: obverter training scheme…

  • tau0 – Float, temperature with which to apply gumbel-softmax estimator.
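A hedged usage sketch of the call above, using the default obs_shape, vocab_size and max_sentence_length from the constructor (the output structure is not documented here, so the call itself is only indicated in a comment):

    import torch

    batch_size = 4
    vocab_size, max_sentence_length = 100, 10
    obs_shape = [1, 1, 1, 32, 32]

    sentences = torch.zeros(batch_size, max_sentence_length, vocab_size)   # padded (one-hot) symbols
    experiences = torch.zeros(batch_size, *obs_shape)                      # shuffled stimuli

    # agent is assumed to be an instantiated Agent subclass:
    # agent.forward(sentences, experiences, multi_round=False,
    #               graphtype='straight_through_gumbel_softmax', tau0=0.2)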

compute(input_streams_dict: Dict[str, object]) → Dict[str, object]

Compute the losses and return them along with the produced outputs.

Parameters

input_streams_dict

Dict that should contain, at least, the following keys and values:

  • ‘sentences_logits’: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits over symbols.

  • ‘sentences_widx’: Tensor of shape (batch_size, max_sentence_length, 1) containing the padded sequence of symbols’ indices.

  • ‘sentences_one_hot’: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • ‘experiences’: Tensor of shape (batch_size, *self.obs_shape).

  • ‘exp_latents’: Tensor of shape (batch_size, nbr_latent_dimensions).

  • ‘multi_round’: Boolean defining whether to utter a sentence back or not.

  • ‘graphtype’: String defining the type of symbols used in the output sentence:

  • ‘categorical’: one-hot-encoded symbols.

  • ‘gumbel_softmax’: continuous relaxation of a categorical distribution.

  • ‘straight_through_gumbel_softmax’: improved continuous relaxation…

  • ‘obverter’: obverter training scheme…

  • ‘tau0’: Float, temperature with which to apply gumbel-softmax estimator.

  • ‘sample’: Dict that contains the speaker and listener experiences as well as the target index.

  • ‘config’: Dict of hyperparameters to the referential game.

  • ‘mode’: String that defines what mode we are in, e.g. ‘train’ or ‘test’. Those keywords are expected.

  • ‘it’: Integer specifying the iteration number of the current function call.
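A minimal sketch of an input_streams_dict carrying the keys listed above (every tensor, the latent dimensionality, and the empty ‘sample’/‘config’ dicts are hypothetical placeholders):

    import torch

    batch_size, max_sentence_length, vocab_size = 4, 10, 100
    obs_shape = [1, 1, 1, 32, 32]

    input_streams_dict = {
        'sentences_logits':  torch.zeros(batch_size, max_sentence_length, vocab_size),
        'sentences_widx':    torch.zeros(batch_size, max_sentence_length, 1, dtype=torch.long),
        'sentences_one_hot': torch.zeros(batch_size, max_sentence_length, vocab_size),
        'experiences':       torch.zeros(batch_size, *obs_shape),
        'exp_latents':       torch.zeros(batch_size, 8),        # hypothetical nbr_latent_dimensions = 8
        'multi_round':       False,
        'graphtype':         'straight_through_gumbel_softmax',
        'tau0':              0.2,
        'sample':            {},    # speaker/listener experiences and target index
        'config':            {},    # referential-game hyperparameters
        'mode':              'train',
        'it':                0,
    }

    # outputs = agent.compute(input_streams_dict)   # agent: an instantiated Agent subclass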

ReferentialGym.agents.attention_lstm_cnn_listener module

class ReferentialGym.agents.attention_lstm_cnn_listener.AttentionLSTMCNNListener(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='l0', logger=None)

Bases: ReferentialGym.agents.discriminative_listener.DiscriminativeListener

reset()
_tidyup()
_compute_tau(tau0, h)

    invtau = 1.0 / (self.tau_fc(h).squeeze() + tau0)
    return invtau
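A numeric sketch of the inverse-temperature formula above (the hidden size, the stand-in tau_fc layer, and the inputs are hypothetical):

    import torch
    import torch.nn as nn

    tau0 = 0.2
    tau_fc = nn.Linear(64, 1, bias=False)          # hypothetical stand-in for self.tau_fc
    h = torch.randn(8, 64)                         # hypothetical batch of hidden states
    invtau = 1.0 / (tau_fc(h).squeeze() + tau0)    # shape: (8,)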

_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the stimuli so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols. NOTE: max_sentence_length may be different from self.max_sentence_length, since padding is done per batch and only depends on the maximal sentence length within that batch.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

ReferentialGym.agents.caption_speaker module

class ReferentialGym.agents.caption_speaker.CaptionSpeaker(kwargs, obs_shape, vocab_size=100, max_sentence_length=10)

Bases: ReferentialGym.agents.speaker.Speaker

reset()
_compute_tau(tau0, h)
_sense(stimuli, sentences=None)

Infers features from the stimuli that have been provided.

Parameters
  • stimuli – Tensor of shape (batch_size, *self.obs_shape). stimuli[:, 0] is assumed to be the target stimulus, while the others are distractors, if any.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, *(self.obs_shape[:2]), feature_dim).

_utter(features, sentences=None)

Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

ReferentialGym.agents.categorical_obverter_agent module

class ReferentialGym.agents.categorical_obverter_agent.CategoricalObverterAgent(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='o0', logger=None)

Bases: ReferentialGym.agents.listener.Listener

reset()
_tidyup()
_compute_tau(tau0, h)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim); in descriptive mode, the (nbr_distractors+1) factor depends on the role of the agent.

_compute_sentence(target_idx, symbol_encoder, symbol_processing, symbol_decoder, decision_decoder, init_rnn_states=None, vocab_size=10, max_sentence_length=14, nbr_distractors_po=1, operation=<built-in method max of type object>, vocab_stop_idx=0, use_obverter_threshold_to_stop_message_generation=False, use_stop_word=False)

Compute sentences using the obverter approach, adapted to referential game variants following the descriptive approach described in the work of [Choi et al., 2018](http://arxiv.org/abs/1804.02341).

In descriptive mode, nbr_distractors_po=1 and target_idx=torch.zeros((batch_size,1)), thus the algorithm behaves exactly like in Choi et al. (2018). Otherwise, only the likelihood of the target experience being chosen by the decision module is considered, and the algorithm aims at maximizing/minimizing (following :param operation:) this likelihood over the sentence’s next word.

Parameters
  • features_embedding – Tensor of (temporal) features embedding of shape (batch_size, *self.obs_shape).

  • target_idx – Tensor of indices of the target experiences of shape (batch_size, 1).

  • symbol_encoder – torch.nn.Module used to embed vocabulary indices into vocabulary embeddings.

  • symbol_processing – torch.nn.Module used to generate the sentences.

  • symbol_decoder – torch.nn.Module used to decode the embeddings generated by the :param symbol_processing: module.

  • decision_decoder – torch.nn.Module used to output the decision over the experiences.

  • init_rnn_states – None or Tuple of Tensors to initialize the symbol_processing’s rnn states.

  • vocab_size – int, size of the vocabulary.

  • max_sentence_length – int, maximal length of each generated sentence.

  • nbr_distractors_po – int, number of distractors and target, i.e. nbr_distractors+1.

  • operation – Function, expect torch.max or torch.min.

  • vocab_stop_idx – int, index of the STOP symbol in the vocabulary.

  • use_obverter_threshold_to_stop_message_generation – boolean, or float, that specifies whether to stop the message generation when the decision module’s output probability is above a given threshold (or below it if the operation is torch.min). If it is a float, then it is the value of the threshold.

  • use_stop_word – boolean that specifies whether to use one of the words in the vocabulary as a pre-defined STOP token, thus effectively ending the symbol generation for the current sentence.

Returns

  • sentences_widx: List[Tensor] of length batch_size with shapes (1, sentence_length[b], 1), where b is the batch index.

    It represents the indices of the chosen words.

  • sentences_logits: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the logits of words over the decision module’s potential to choose the target experience as output.

  • sentences_one_hots: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the sentences as one-hot-encoded word vectors.
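A hedged sketch of the greedy, symbol-by-symbol obverter decoding described above: at each step, every candidate next word is scored by the decision module, the word that maximises (or minimises, per operation) the target likelihood is kept, and generation stops on the STOP symbol, on the optional probability threshold, or at max_sentence_length. The score_candidates callable below is a hypothetical stand-in for the encoder/processing/decoder modules documented above:

    import torch

    def greedy_obverter_decode(score_candidates,          # hypothetical: maps a prefix (LongTensor) to a target probability
                               vocab_size=10,
                               max_sentence_length=14,
                               vocab_stop_idx=0,
                               operation=torch.max,
                               use_obverter_threshold_to_stop_message_generation=False):
        # Greedily grow one sentence, symbol by symbol (sketch for a single batch element).
        sentence = []
        for _ in range(max_sentence_length):
            # Score every candidate continuation of the current prefix with the decision module.
            scores = torch.stack([
                score_candidates(torch.tensor(sentence + [w], dtype=torch.long))
                for w in range(vocab_size)
            ])
            best_score, best_word = operation(scores, dim=0)
            sentence.append(int(best_word))
            if sentence[-1] == vocab_stop_idx:
                break  # STOP symbol emitted.
            threshold = use_obverter_threshold_to_stop_message_generation
            if threshold and operation is torch.max and best_score.item() >= float(threshold):
                break  # Decision module is confident enough about the target.
            if threshold and operation is torch.min and best_score.item() <= float(threshold):
                break
        return torch.tensor(sentence, dtype=torch.long).view(1, -1, 1)  # (1, sentence_length, 1)

    # Toy usage with a random scorer (purely illustrative):
    # widx = greedy_obverter_decode(lambda prefix: torch.rand(()), vocab_size=10)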

ReferentialGym.agents.differentiable_obverter_agent module

ReferentialGym.agents.differentiable_obverter_agent.sentence_length_logging_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
class ReferentialGym.agents.differentiable_obverter_agent.DifferentiableObverterAgent(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='o0', logger=None, use_sentences_one_hot_vectors=True, differentiable=True)

Bases: ReferentialGym.agents.discriminative_listener.DiscriminativeListener

symbol_processing = None

    if self.use_sentences_one_hot_vectors:
        #self.symbol_encoder = nn.Linear(self.vocab_size, self.kwargs['symbol_processing_nbr_hidden_units'], bias=False)
        self.symbol_encoder = nn.Linear(self.vocab_size, self.kwargs['symbol_embedding_size'], bias=False)
    else:
        #self.symbol_encoder = nn.Embedding(self.vocab_size+2, self.kwargs['symbol_processing_nbr_hidden_units'], padding_idx=self.vocab_size)
        self.symbol_encoder = nn.Embedding(self.vocab_size+2, self.kwargs['symbol_embedding_size'], padding_idx=self.vocab_size)

    self.symbol_decoder = nn.ModuleList()
    self.symbol_decoder.append(nn.Linear(self.kwargs['symbol_processing_nbr_hidden_units'], self.vocab_size))
    if self.kwargs['dropout_prob']:
        self.symbol_decoder.append(nn.Dropout(p=self.kwargs['dropout_prob']))

    self.tau_fc = layer_init(nn.Linear(self.kwargs['temporal_encoder_nbr_hidden_units'], 1, bias=False))

    self.not_target_logits_per_token = nn.Parameter(torch.ones((1, self.kwargs['max_sentence_length'])))
    self.register_parameter(name='not_target_logits_per_token', param=self.not_target_logits_per_token)

reset()
_tidyup()
_compute_tau(tau0, h)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim); in descriptive mode, the (nbr_distractors+1) factor depends on the role of the agent.

_compute_sentence(target_idx, symbol_encoder, symbol_processing, symbol_decoder, init_rnn_states=None, allowed_vocab_size=10, vocab_size=10, max_sentence_length=14, nbr_distractors_po=1, operation=<built-in method max of type object>, vocab_stop_idx=0, use_obverter_threshold_to_stop_message_generation=False, use_stop_word=False, _compute_tau=None, not_target_logits_per_token=None, use_sentences_one_hot_vectors=False, logger=None)

Compute sentences using the obverter approach, adapted to referential game variants following the descriptive approach described in the work of [Choi et al., 2018](http://arxiv.org/abs/1804.02341).

In descriptive mode, nbr_distractors_po=1 and target_idx=torch.zeros((batch_size,1)), thus the algorithm behaves exactly like in Choi et al. (2018). Otherwise, only the likelihood of the target experience being chosen by the decision module is considered, and the algorithm aims at maximizing/minimizing (following :param operation:) this likelihood over the sentence’s next word.

Parameters
  • features_embedding – Tensor of (temporal) features embedding of shape (batch_size, *self.obs_shape).

  • target_idx – Tensor of indices of the target experiences of shape (batch_size, 1).

  • symbol_encoder – torch.nn.Module used to embed vocabulary indices into vocabulary embeddings.

  • symbol_processing – torch.nn.Module used to generate the sentences.

  • symbol_decoder – torch.nn.Module used to decode the embeddings generated by the :param symbol_processing: module.

  • init_rnn_states – None or Tuple of Tensors to initialize the symbol_processing’s rnn states.

  • vocab_size – int, size of the vocabulary.

  • max_sentence_length – int, maximal length of each generated sentence.

  • nbr_distractors_po – int, number of distractors and target, i.e. nbr_distractors+1.

  • operation – Function, expect torch.max or torch.min.

  • vocab_stop_idx – int, index of the STOP symbol in the vocabulary.

  • use_obverter_threshold_to_stop_message_generation – boolean, or float, that specifies whether to stop the message generation when the decision module’s output probability is above a given threshold (or below it if the operation is torch.min). If it is a float, then it is the value of the threshold.

  • use_stop_word – boolean that specifies whether to use one of the words in the vocabulary as a pre-defined STOP token, thus effectively ending the symbol generation for the current sentence.

Returns

  • sentences_widx: List[Tensor] of length batch_size with shapes (1, sentence_length[b], 1), where b is the batch index.

    It represents the indices of the chosen words.

  • sentences_logits: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the logits of words over the decision module’s potential to choose the target experience as output.

  • sentences_one_hots: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the sentences as one-hot-encoded word vectors.

ReferentialGym.agents.differentiable_relational_obverter module

class ReferentialGym.agents.differentiable_relational_obverter.DifferentiableRelationalObverterAgent(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='o0', logger=None)

Bases: ReferentialGym.agents.discriminative_listener.DiscriminativeListener

reset()
_tidyup()
_compute_tau(tau0, emb=None)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_makeVisualXYTSfeatures(features)
_makeSymbolicXYTSfeatures(sentences)
_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • features – Tensor of shape (batch_size, nbr_distractors_po, nbr_stimulus, mm_ponderer_depth_dim=thought_space_depth_dim+5, ..nbr_visual_entity..).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim); in descriptive mode, the (nbr_distractors+1) factor depends on the role of the agent.

_compute_sentence(features, target_idx, _reason=None, allowed_vocab_size=10, vocab_size=10, max_sentence_length=14, operation=<built-in method max of type object>, vocab_stop_idx=0, use_obverter_threshold_to_stop_message_generation=False, use_stop_word=False, _compute_tau=None, not_target_logits_per_token=None, logger=None)

Compute sentences using the obverter approach, adapted to referential game variants following the descriptive approach described in the work of [Choi et al., 2018](http://arxiv.org/abs/1804.02341).

In descriptive mode, nbr_distractors_po=1 and target_idx=torch.zeros((batch_size,1)), thus the algorithm behaves exactly like in Choi et al. (2018). Otherwise, only the likelihood of the target experience being chosen by the decision module is considered, and the algorithm aims at maximizing/minimizing (following :param operation:) this likelihood over the sentence’s next word.

Parameters
  • features – Tensor of (temporal) features embedding of shape (batch_size, *self.obs_shape).

  • target_idx – Tensor of indices of the target experiences of shape (batch_size, 1).

  • _reason – Function used to reason about the visual and textual entities.

  • vocab_size – int, size of the vocabulary.

  • max_sentence_length – int, maximal length of each generated sentence.

  • operation – Function, expect torch.max or torch.min.

  • vocab_stop_idx – int, index of the STOP symbol in the vocabulary.

  • use_obverter_threshold_to_stop_message_generation – boolean, or float, that specifies whether to stop the message generation when the decision module’s output probability is above a given threshold (or below it if the operation is torch.min). If it is a float, then it is the value of the threshold.

  • use_stop_word – boolean that specifies whether to use one of the words in the vocabulary as a pre-defined STOP token, thus effectively ending the symbol generation for the current sentence.

Returns

  • sentences_widx: List[Tensor] of length batch_size with shapes (1, sentence_length[b], 1), where b is the batch index.

    It represents the indices of the chosen words.

  • sentences_logits: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the logits of words over the decision module’s potential to choose the target experience as output.

  • sentences_one_hots: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the sentences as one-hot-encoded word vectors.

ReferentialGym.agents.discriminative_listener module

ReferentialGym.agents.discriminative_listener.havrylov_hinge_learning_signal(decision_logits, target_decision_idx, sampled_decision_idx=None, multi_round=False)
ReferentialGym.agents.discriminative_listener.discriminative_st_gs_referential_game_loss(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
ReferentialGym.agents.discriminative_listener.penalize_multi_round_binary_reward_fn(sampled_decision_idx, target_decision_idx, decision_logits=None, multi_round=False)

Computes the reward and done boolean of the current timestep. The episode ends when the decision is correct (or when the maximum number of rounds is reached, but that case is handled outside of this function).
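A minimal sketch of the kind of per-round reward described above (the exact reward values used by the library are not specified here; the +1/-1 shaping below is an assumption):

    import torch

    def sketch_multi_round_binary_reward(sampled_decision_idx, target_decision_idx,
                                         decision_logits=None, multi_round=False):
        # sampled_decision_idx / target_decision_idx: LongTensors of shape (batch_size, 1) (assumption).
        done = (sampled_decision_idx == target_decision_idx)
        # Hypothetical shaping: +1 for a correct decision, -1 penalty for every extra round.
        reward = torch.where(done,
                             torch.ones_like(done, dtype=torch.float),
                             -torch.ones_like(done, dtype=torch.float))
        return reward, done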

class ReferentialGym.agents.discriminative_listener.ExperienceBuffer(capacity, keys=None, circular_keys={'succ_s': 's'}, circular_offsets={'succ_s': 1})

Bases: object

add_key(key)
add(data)
pop()

Outputs a data dict of the latest ‘complete’ experience.

reset()
cat(keys, indices=None)
sample(batch_size, keys=None)
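A hedged usage sketch of the buffer documented above (the keys and the dict-per-step data layout are assumptions; ‘succ_s’ is resolved circularly from ‘s’ with an offset of 1 per the defaults):

    from ReferentialGym.agents.discriminative_listener import ExperienceBuffer

    buffer = ExperienceBuffer(capacity=1024, keys=['s', 'a', 'r', 'succ_s'])
    for step in range(8):
        buffer.add({'s': step, 'a': 0, 'r': 0.0})   # assumed data layout: one dict per timestep

    latest = buffer.pop()                            # latest 'complete' experience
    batch = buffer.sample(batch_size=4, keys=['s', 'r'])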
ReferentialGym.agents.discriminative_listener.compute_reinforce_losses(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
ReferentialGym.agents.discriminative_listener.discriminative_reinforce_referential_game_loss(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
class ReferentialGym.agents.discriminative_listener.DiscriminativeListener(obs_shape, vocab_size=100, max_sentence_length=10, agent_id='l0', logger=None, kwargs=None)

Bases: ReferentialGym.agents.listener.Listener

_compute_tau(tau0)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, *(self.obs_shape[:2]), feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).
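The three protected methods above (_sense, _reason, _utter) form the listener’s processing pipeline; a hedged sketch of how they could be chained (the tuple unpacking mirrors the documented return bullets and is an assumption about the actual return types):

    def listener_round_sketch(listener, sentences, experiences, multi_round=False):
        # listener: an instantiated DiscriminativeListener subclass (assumption).
        features = listener._sense(experiences, sentences=sentences)
        # Documented returns: decision logits and temporal features.
        decision_logits, temporal_features = listener._reason(sentences, features)
        utterance = None
        if multi_round:
            # Documented returns: word indices, logits, one-hot sentences, temporal features.
            utterance = listener._utter(features, sentences)
        return decision_logits, utterance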

forward(sentences, experiences, multi_round=False, graphtype='straight_through_gumbel_softmax', tau0=0.2)
Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • multi_round – Boolean defining whether to utter a sentence back or not.

  • graphtype – String defining the type of symbols used in the output sentence: - ‘categorical’: one-hot-encoded symbols. - ‘gumbel_softmax’: continuous relaxation of a categorical distribution. - ‘straight_through_gumbel_softmax’: improved continuous relaxation… - ‘obverter’: obverter training scheme…

  • tau0 – Float, temperature with which to apply gumbel-softmax estimator.

ReferentialGym.agents.eos_priored_lstm_cnn_speaker module

ReferentialGym.agents.eos_priored_lstm_cnn_speaker.eos_priored_loss_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
class ReferentialGym.agents.eos_priored_lstm_cnn_speaker.EoSPrioredLSTMCNNSpeaker(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='s0', logger=None)

Bases: ReferentialGym.agents.speaker.Speaker

reset()
_tidyup()
_compute_tau(tau0, h)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_utter(features, sentences=None)

TODO: update this description… Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

ReferentialGym.agents.generative_listener module

ReferentialGym.agents.generative_listener.havrylov_hinge_learning_signal(decision_logits, target_decision_idx, sampled_decision_idx=None, multi_round=False)
ReferentialGym.agents.generative_listener.generative_st_gs_referential_game_loss(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
class ReferentialGym.agents.generative_listener.GenerativeListener(obs_shape, vocab_size=100, max_sentence_length=10, agent_id='l0', logger=None, kwargs=None)

Bases: ReferentialGym.agents.listener.Listener

forward(sentences, experiences, multi_round=False, graphtype='straight_through_gumbel_softmax', tau0=0.2)
Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • multi_round – Boolean defining whether to utter a sentence back or not.

  • graphtype – String defining the type of symbols used in the output sentence: - ‘categorical’: one-hot-encoded symbols. - ‘gumbel_softmax’: continuous relaxation of a categorical distribution. - ‘straight_through_gumbel_softmax’: improved continuous relaxation… - ‘obverter’: obverter training scheme…

  • tau0 – Float, temperature with which to apply gumbel-softmax estimator.

ReferentialGym.agents.listener module

class ReferentialGym.agents.listener.Listener(obs_shape, vocab_size=100, max_sentence_length=10, agent_id='l0', logger=None, kwargs=None)

Bases: ReferentialGym.agents.agent.Agent

reset()
_reset_rnn_states()
_compute_tau(tau0)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, *(self.obs_shape[:2]), feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

ReferentialGym.agents.lstm_cnn_listener module

class ReferentialGym.agents.lstm_cnn_listener.LSTMCNNListener(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='l0', logger=None)

Bases: ReferentialGym.agents.discriminative_listener.DiscriminativeListener

symbol_processing = None

    self.symbol_processing_learnable_initial_state = nn.Parameter(
        torch.zeros(1, 1, self.kwargs['symbol_processing_nbr_hidden_units'])
    )

tau_fc = None

    self.not_target_logits_per_token = nn.Parameter(torch.ones((1, self.kwargs['max_sentence_length'], 1)))

reset()
_tidyup()
_compute_tau(tau0, h)

    invtau = 1.0 / (self.tau_fc(h).squeeze() + tau0)
    return invtau

_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the stimuli so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols. NOTE: max_sentence_length may be different from self.max_sentence_length, since padding is done per batch and only depends on the maximal sentence length within that batch.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

ReferentialGym.agents.lstm_cnn_speaker module

class ReferentialGym.agents.lstm_cnn_speaker.LSTMCNNSpeaker(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='s0', logger=None)

Bases: ReferentialGym.agents.speaker.Speaker

reset()
_tidyup()
_compute_tau(tau0, h)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_utter(features, sentences=None)

TODO: update this description… Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

ReferentialGym.agents.multi_head_lstm_cnn_speaker module

class ReferentialGym.agents.multi_head_lstm_cnn_speaker.MultiHeadLSTMCNNSpeaker(kwargs, multi_head_config, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='s0', logger=None)

Bases: ReferentialGym.agents.speaker.Speaker

reset()
_tidyup()
_compute_tau(tau0, h)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_utter(features, sentences=None)

TODO: update this description… Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

ReferentialGym.agents.obverter_agent module

class ReferentialGym.agents.obverter_agent.ObverterAgent(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='o0', logger=None)

Bases: ReferentialGym.agents.discriminative_listener.DiscriminativeListener

reset()
_tidyup()
_compute_tau(tau0)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim); in descriptive mode, the (nbr_distractors+1) factor depends on the role of the agent.

_compute_sentence(target_idx, symbol_encoder, symbol_processing, symbol_decoder, init_rnn_states=None, vocab_size=10, max_sentence_length=14, nbr_distractors_po=1, operation=<built-in method max of type object>, vocab_stop_idx=0, use_obverter_threshold_to_stop_message_generation=False, use_stop_word=False)

Compute sentences using the obverter approach, adapted to referential game variants following the descriptive approach described in the work of [Choi et al., 2018](http://arxiv.org/abs/1804.02341).

In descriptive mode, nbr_distractors_po=1 and target_idx=torch.zeros((batch_size,1)), thus the algorithm behaves exactly like in Choi et al. (2018). Otherwise, only the likelihood of the target experience being chosen by the decision module is considered, and the algorithm aims at maximizing/minimizing (following :param operation:) this likelihood over the sentence’s next word.

Parameters
  • features_embedding – Tensor of (temporal) features embedding of shape (batch_size, *self.obs_shape).

  • target_idx – Tensor of indices of the target experiences of shape (batch_size, 1).

  • symbol_encoder – torch.nn.Module used to embed vocabulary indices into vocabulary embeddings.

  • symbol_processing – torch.nn.Module used to generate the sentences.

  • symbol_decoder – torch.nn.Module used to decode the embeddings generated by the :param symbol_processing: module.

  • init_rnn_states – None or Tuple of Tensors to initialize the symbol_processing’s rnn states.

  • vocab_size – int, size of the vocabulary.

  • max_sentence_length – int, maximal length of each generated sentence.

  • nbr_distractors_po – int, number of distractors and target, i.e. nbr_distractors+1.

  • operation – Function, expect torch.max or torch.min.

  • vocab_stop_idx – int, index of the STOP symbol in the vocabulary.

  • use_obverter_threshold_to_stop_message_generation – boolean, or float, that specifies whether to stop the message generation when the decision module’s output probability is above a given threshold (or below it if the operation is torch.min). If it is a float, then it is the value of the threshold.

  • use_stop_word – boolean that specifies whether to use one of the words in the vocabulary as a pre-defined STOP token, thus effectively ending the symbol generation for the current sentence.

Returns

  • sentences_widx: List[Tensor] of length batch_size with shapes (1, sentence_length[b], 1), where b is the batch index.

    It represents the indices of the chosen words.

  • sentences_logits: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the logits of words over the decision module’s potential to choose the target experience as output.

  • sentences_one_hots: List[Tensor] of length batch_size with shapes (1, sentence_length[b], vocab_size), where b is the batch index.

    It represents the sentences as one-hot-encoded word vectors.

ReferentialGym.agents.speaker module

ReferentialGym.agents.speaker.sentence_length_logging_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
ReferentialGym.agents.speaker.entropy_logging_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
ReferentialGym.agents.speaker.entropy_regularization_loss_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
ReferentialGym.agents.speaker.mdl_principle_loss_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
ReferentialGym.agents.speaker.oov_loss_hook(agent, losses_dict, input_streams_dict, outputs_dict, logs_dict, **kwargs)
class ReferentialGym.agents.speaker.Speaker(obs_shape, vocab_size=100, max_sentence_length=10, agent_id='s0', logger=None, kwargs=None)

Bases: ReferentialGym.agents.agent.Agent

reset()
_reset_rnn_states()
_compute_tau(tau0)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). experiences[:, 0] is assumed to be the target experience, while the others are distractors, if any.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, *(self.obs_shape[:2]), feature_dim).

_utter(features, sentences=None)

Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

forward(experiences, sentences=None, multi_round=False, graphtype='straight_through_gumbel_softmax', tau0=0.2)
Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

  • experiences – Tensor of shape (batch_size, *self.obs_shape). experiences[:,0] is assumed to be the target experience, while the others are distractors, if any.

  • multi_round – Boolean defining whether to utter a sentence back or not.

  • graphtype – String defining the type of symbols used in the output sentence: - ‘categorical’: one-hot-encoded symbols. - ‘gumbel_softmax’: continuous relaxation of a categorical distribution. - ‘straight_through_gumbel_softmax’: improved continuous relaxation… - ‘obverter’: obverter training scheme…

  • tau0 – Float, temperature with which to apply gumbel-softmax estimator.
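A hedged sketch of one communication round built from the two forward signatures documented in this package (object construction, output unpacking, and the listener-side shuffling step are assumptions and are therefore only indicated in comments):

    import torch

    batch_size = 4
    obs_shape = [1, 1, 1, 32, 32]                               # default obs_shape documented above
    speaker_experiences = torch.zeros(batch_size, *obs_shape)   # experiences[:, 0] is the target

    # speaker / listener are assumed to be instantiated Speaker / Listener subclasses:
    # speaker_outputs = speaker.forward(experiences=speaker_experiences, sentences=None,
    #                                   multi_round=False,
    #                                   graphtype='straight_through_gumbel_softmax', tau0=0.2)
    # The uttered sentences would then be fed, together with listener-side (shuffled)
    # experiences, to listener.forward(sentences=..., experiences=..., multi_round=False,
    #                                  graphtype='straight_through_gumbel_softmax', tau0=0.2)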

ReferentialGym.agents.transcoding_lstm_cnn_listener module

class ReferentialGym.agents.transcoding_lstm_cnn_listener.TranscodingLSTMCNNListener(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='l0', logger=None)

Bases: ReferentialGym.agents.discriminative_listener.DiscriminativeListener

reset()
_tidyup()
_compute_tau(tau0, h)

    invtau = 1.0 / (self.tau_fc(h).squeeze() + tau0)
    return invtau

_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the stimuli so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, feature_dim).

_reason(sentences, features)

Reasons about the features and sentences to yield the target-prediction logits.

Parameters
  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols. NOTE: max_sentence_length may be different from self.max_sentence_length, since padding is done per batch and only depends on the maximal sentence length within that batch.

  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

Returns

  • decision_logits: Tensor of shape (batch_size, self.obs_shape[1]) containing the target-prediction logits.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

_utter(features, sentences)

Reasons about the features and the listened sentences to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

ReferentialGym.agents.transcoding_lstm_cnn_speaker module

class ReferentialGym.agents.transcoding_lstm_cnn_speaker.TranscodingLSTMCNNSpeaker(kwargs, obs_shape, vocab_size=100, max_sentence_length=10, agent_id='s0', logger=None)

Bases: ReferentialGym.agents.speaker.Speaker

reset()
_tidyup()
_compute_tau(tau0, h)
_sense(experiences, sentences=None)

Infers features from the experiences that have been provided.

Parameters
  • experiences – Tensor of shape (batch_size, *self.obs_shape). Make sure to shuffle the experiences so that the order does not give away the target.

  • sentences – None or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

features: Tensor of shape (batch_size, -1, nbr_stimulus, feature_dim).

_utter(features, sentences=None)

TODO: update this description… Reasons about the features and the listened sentences, if multi_round, to yield the sentences to utter back.

Parameters
  • features – Tensor of shape (batch_size, *self.obs_shape[:2], feature_dim).

  • sentences – None, or Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of (potentially one-hot-encoded) symbols.

Returns

  • word indices: Tensor of shape (batch_size, max_sentence_length, 1) of type long containing the indices of the words that make up the sentences.

  • logits: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of logits.

  • sentences: Tensor of shape (batch_size, max_sentence_length, vocab_size) containing the padded sequence of one-hot-encoded symbols.

  • temporal features: Tensor of shape (batch_size, (nbr_distractors+1)*temporal_feature_dim).

Module contents