
MultiScaleDeformableAttention

class mmcv.ops.MultiScaleDeformableAttention(embed_dims: int = 256, num_heads: int = 8, num_levels: int = 4, num_points: int = 4, im2col_step: int = 64, dropout: float = 0.1, batch_first: bool = False, norm_cfg: Optional[dict] = None, init_cfg: Optional[mmengine.config.config.ConfigDict] = None, value_proj_ratio: float = 1.0)[source]

An attention module used in Deformable DETR.

See "Deformable DETR: Deformable Transformers for End-to-End Object Detection".

Parameters
  • embed_dims (int) – The embedding dimension of Attention. Default: 256.

  • num_heads (int) – Parallel attention heads. Default: 8.

  • num_levels (int) – The number of feature maps (levels) used in attention. Default: 4.

  • num_points (int) – The number of sampling points for each query in each head. Default: 4.

  • im2col_step (int) – The step used in image_to_column. Default: 64.

  • dropout (float) – Dropout probability applied to the attention output before it is added to identity. Default: 0.1.

  • batch_first (bool) – If True, key, query and value have shape (batch, n, embed_dim); otherwise they have shape (n, batch, embed_dim). Default: False.

  • norm_cfg (dict) – Config dict for normalization layer. Default: None.

  • init_cfg (mmengine.ConfigDict) – The config for initialization. Default: None.

  • value_proj_ratio (float) – The expansion ratio of value_proj. Default: 1.0.
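
As a rough illustration of how the constructor arguments interact, the sketch below computes a few sizes implied by them. The helper name `derived_sizes` is hypothetical. The specific widths are assumptions based on the Deformable DETR formulation rather than anything this page states: one 2D offset and one attention weight per head, level and point, and a value projection width of `int(embed_dims * value_proj_ratio)`.

```python
def derived_sizes(embed_dims=256, num_heads=8, num_levels=4,
                  num_points=4, value_proj_ratio=1.0):
    """Hypothetical helper: sizes implied by the constructor arguments."""
    # Assumption: the embedding is split evenly across the attention heads.
    if embed_dims % num_heads != 0:
        raise ValueError(
            f"embed_dims ({embed_dims}) must be divisible by "
            f"num_heads ({num_heads})")

    # Each query samples num_points locations on every level, in every head.
    samples_per_query = num_heads * num_levels * num_points

    return {
        "dim_per_head": embed_dims // num_heads,
        # Assumption: value_proj maps embed_dims to this width.
        "value_proj_size": int(embed_dims * value_proj_ratio),
        "samples_per_query": samples_per_query,
        # Assumption: one (x, y) offset per sampled location.
        "offset_outputs": samples_per_query * 2,
        # Assumption: one scalar weight per sampled location.
        "attention_weight_outputs": samples_per_query,
    }


sizes = derived_sizes()  # uses the documented defaults
```

With the default arguments, each query attends to 8 * 4 * 4 = 128 sampled locations, regardless of how large the feature maps are.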

forward(query: torch.Tensor, key: Optional[torch.Tensor] = None, value: Optional[torch.Tensor] = None, identity: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, key_padding_mask: Optional[torch.Tensor] = None, reference_points: Optional[torch.Tensor] = None, spatial_shapes: Optional[torch.Tensor] = None, level_start_index: Optional[torch.Tensor] = None, **kwargs) → torch.Tensor[source]

Forward function of MultiScaleDeformableAttention.

Parameters
  • query (torch.Tensor) – Query of Transformer with shape (num_query, bs, embed_dims).

  • key (torch.Tensor) – The key tensor with shape (num_key, bs, embed_dims).

  • value (torch.Tensor) – The value tensor with shape (num_key, bs, embed_dims).

  • identity (torch.Tensor) – The tensor used for the residual addition, with the same shape as query. Default: None. If None, query will be used.

  • query_pos (torch.Tensor) – The positional encoding for query. Default: None.

  • key_padding_mask (torch.Tensor) – ByteTensor mask over key positions, with shape (bs, num_key).

  • reference_points (torch.Tensor) – The normalized reference points with shape (bs, num_query, num_levels, 2). All elements are in the range [0, 1], with top-left at (0, 0) and bottom-right at (1, 1), including the padding area. Alternatively, the shape can be (bs, num_query, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.

  • spatial_shapes (torch.Tensor) – Spatial shapes of the feature maps at the different levels, with shape (num_levels, 2). The last dimension represents (h, w).

  • level_start_index (torch.Tensor) – The start index of each level. A tensor of shape (num_levels,) that can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].
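
The relationship between spatial_shapes and level_start_index described above can be sketched in plain Python. The helper names below are hypothetical, and the example feature-map sizes are made up for illustration. The sketch also assumes that key and value are the per-level feature maps flattened and concatenated in level order, so that num_key equals the sum of h*w over all levels.

```python
def level_start_index(spatial_shapes):
    """Hypothetical helper: start offset of each level in the flattened keys.

    spatial_shapes is a sequence of (h, w) pairs, one per feature level.
    Returns [0, h_0*w_0, h_0*w_0 + h_1*w_1, ...] with one entry per level.
    """
    starts = []
    offset = 0
    for h, w in spatial_shapes:
        starts.append(offset)  # this level begins where the previous ended
        offset += h * w        # advance by the number of positions in it
    return starts


def total_num_key(spatial_shapes):
    """Hypothetical helper: number of key positions across all levels."""
    return sum(h * w for h, w in spatial_shapes)


# Example values only: four levels, each half the resolution of the previous.
shapes = [(64, 64), (32, 32), (16, 16), (8, 8)]
starts = level_start_index(shapes)
num_key = total_num_key(shapes)
```

In actual use, both sequences would be passed to forward as integer tensors, for example built with `torch.tensor(shapes)` and `torch.tensor(starts)`.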

Returns

The attention output, with shape (num_query, bs, embed_dims).

Return type

torch.Tensor
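
To make the tensor layouts easier to check before calling forward, the following sketch lists the shapes documented above for a given batch_first setting. The helper name `expected_shapes` is hypothetical. The shapes for batch_first=True, in particular that the output follows the same layout as query, are an assumption and are not stated on this page.

```python
def expected_shapes(num_query, num_key, bs, embed_dims=256,
                    batch_first=False):
    """Hypothetical helper: tensor shapes for a forward call."""
    if batch_first:
        # Assumption: with batch_first=True the batch dimension leads,
        # and the output uses the same layout as query.
        return {
            "query": (bs, num_query, embed_dims),
            "value": (bs, num_key, embed_dims),
            "output": (bs, num_query, embed_dims),
        }
    # Documented default layout: sequence dimension first.
    return {
        "query": (num_query, bs, embed_dims),
        "value": (num_key, bs, embed_dims),
        "output": (num_query, bs, embed_dims),
    }


# Example values only: 300 queries, 5440 flattened key positions, batch of 2.
shapes_default = expected_shapes(num_query=300, num_key=5440, bs=2)
shapes_batch_first = expected_shapes(num_query=300, num_key=5440, bs=2,
                                     batch_first=True)
```

The shapes of reference_points, spatial_shapes and level_start_index are documented with a leading bs or num_levels dimension and do not depend on batch_first.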

init_weights() → None[source]

Default initialization for the parameters of the module.
