layer_attention
Creates an attention layer
Description
Dot-product attention layer, a.k.a. Luong-style attention.
Usage
layer_attention(
  inputs,
  use_scale = FALSE,
  causal = FALSE,
  batch_size = NULL,
  dtype = NULL,
  name = NULL,
  trainable = NULL,
  weights = NULL
)

Arguments
| Arguments | Description |
|---|---|
| inputs | A list of inputs: the first should be the query tensor, the second the value tensor. |
| use_scale | If TRUE, will create a scalar variable to scale the attention scores. |
| causal | Boolean. Set to TRUE for decoder self-attention. Adds a mask such that position i cannot attend to positions j > i, preventing the flow of information from the future to the past. |
| batch_size | Fixed batch size for the layer. |
| dtype | The data type expected by the input, as a string (e.g. "float32", "float64", "int32"). |
| name | An optional name string for the layer. It should be unique within a model (do not reuse the same name twice); it will be autogenerated if not provided. |
| trainable | Whether the layer weights will be updated during training. |
| weights | Initial weights for the layer. |
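Examples

A minimal sketch of wiring the layer into a functional model; the input shapes, the pooling step, and the dense head are illustrative assumptions, not part of this reference.

library(keras)

# Query and value sequences; shapes here are arbitrary examples
query <- layer_input(shape = c(8, 16))   # 8 query timesteps, 16 features
value <- layer_input(shape = c(4, 16))   # 4 value timesteps, 16 features

# Luong-style dot-product attention over the (query, value) pair,
# with a learned scalar scaling of the attention scores
attended <- layer_attention(list(query, value), use_scale = TRUE)

# Reduce the attended sequence and map it to a single output
output <- attended %>%
  layer_global_average_pooling_1d() %>%
  layer_dense(units = 1)

model <- keras_model(inputs = list(query, value), outputs = output)

Setting causal = TRUE instead would restrict each query position to attend only to positions at or before it, as in decoder self-attention.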
See Also
Other core layers: layer_activation(), layer_activity_regularization(), layer_dense_features(), layer_dense(), layer_dropout(), layer_flatten(), layer_input(), layer_lambda(), layer_masking(), layer_permute(), layer_repeat_vector(), layer_reshape()