fileio

class mmcv.fileio.BaseStorageBackend

Abstract class of storage backends.

All backends need to implement two APIs: get() and get_text(). get() reads the file as a byte stream, and get_text() reads the file as text.
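As a rough illustration of this contract, the sketch below defines a stand-in abstract base (not mmcv's own class) and a hypothetical in-memory backend that implements both methods. The names StorageBackendSketch and InMemoryBackend, and the dict-based storage, are assumptions made for the example.

```python
from abc import ABCMeta, abstractmethod


class StorageBackendSketch(metaclass=ABCMeta):
    """Stand-in for the abstract backend interface (not mmcv's class)."""

    @abstractmethod
    def get(self, filepath):
        """Return the file content as bytes."""

    @abstractmethod
    def get_text(self, filepath, encoding='utf-8'):
        """Return the file content as str."""


class InMemoryBackend(StorageBackendSketch):
    """Hypothetical backend that serves files from a dict of bytes."""

    def __init__(self, files):
        self._files = dict(files)

    def get(self, filepath):
        # Byte-stream access: return the stored bytes unchanged.
        return self._files[str(filepath)]

    def get_text(self, filepath, encoding='utf-8'):
        # Text access: decode the same bytes with the given encoding.
        return self.get(filepath).decode(encoding)


backend = InMemoryBackend({'a.txt': b'hello'})
raw = backend.get('a.txt')
text = backend.get_text('a.txt')
```

A real backend would subclass mmcv's BaseStorageBackend instead of the stand-in base used here.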

class mmcv.fileio.FileClient(backend=None, prefix=None, **kwargs)

A general file client to access files in different backends.

The client loads a file or text from its path in a specified backend and returns it as binary or text content. There are two ways to choose a backend: by the name of the backend or by the prefix of the path. Both can be used to choose a storage backend, but backend has the higher priority: if both are set, the storage backend is chosen by the backend argument. If both are None, the disk backend is chosen. Other backend accessors can also be registered with a given name, prefixes, and backend class. In addition, the singleton pattern is used to avoid repeated object creation: if the arguments are the same, the same object is returned.

Parameters
  • backend (str, optional) – The storage backend type. Options are “disk”, “ceph”, “memcached”, “lmdb”, “http” and “petrel”. Default: None.

  • prefix (str, optional) – The prefix of the registered storage backend. Options are “s3”, “http”, “https”. Default: None.

Examples

>>> # only set backend
>>> file_client = FileClient(backend='petrel')
>>> # only set prefix
>>> file_client = FileClient(prefix='s3')
>>> # set both backend and prefix but use backend to choose client
>>> file_client = FileClient(backend='petrel', prefix='s3')
>>> # if the arguments are the same, the same object is returned
>>> file_client1 = FileClient(backend='petrel')
>>> file_client1 is file_client
True
client

The backend object.

Type

BaseStorageBackend
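The selection and caching rules described for FileClient can be sketched in plain Python. This is only an illustration under stated assumptions, not mmcv's implementation: ClientSketch, its backend_name attribute, and the prefix-to-backend mapping are hypothetical stand-ins.

```python
class ClientSketch:
    """Hypothetical client that caches one instance per argument set."""

    _instances = {}
    # Assumed mapping from path prefix to backend name, for illustration only.
    _prefix_to_backend = {'s3': 'petrel', 'http': 'http', 'https': 'http'}

    def __new__(cls, backend=None, prefix=None):
        key = (backend, prefix)
        if key in cls._instances:
            # Same arguments: hand back the cached object.
            return cls._instances[key]
        if backend is not None:
            chosen = backend  # backend has priority over prefix
        elif prefix is not None:
            chosen = cls._prefix_to_backend[prefix]
        else:
            chosen = 'disk'  # both None: fall back to disk
        instance = super().__new__(cls)
        instance.backend_name = chosen
        cls._instances[key] = instance
        return instance


first = ClientSketch(backend='petrel')
second = ClientSketch(backend='petrel')
default_client = ClientSketch()
by_prefix = ClientSketch(prefix='s3')
both_set = ClientSketch(backend='disk', prefix='s3')
```

Here first and second are the same object, and both_set uses the disk backend because the backend argument wins over the prefix.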

exists(filepath: Union[str, pathlib.Path]) → bool

Check whether a file path exists.

Parameters

filepath (str or Path) – Path to check for existence.

Returns

True if filepath exists, False otherwise.

Return type

bool

get(filepath: Union[str, pathlib.Path]) → Union[bytes, memoryview]

Read data from a given filepath with ‘rb’ mode.

Note

get may return either bytes or a memoryview. The advantage of a memoryview is that it avoids copying; to convert it to bytes, call .tobytes().

Parameters

filepath (str or Path) – Path to read data from.

Returns

The expected bytes object or a memory view of the bytes object.

Return type

bytes | memoryview
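The difference between the two return types can be seen with the standard library alone. In this small sketch a bytearray stands in for a buffer returned by a backend; the variable names are illustrative.

```python
data = bytearray(b'abcdef')  # stands in for a buffer returned by a backend
view = memoryview(data)      # wraps the buffer without copying it
head = view[:3]              # slicing a memoryview does not copy either
copied = head.tobytes()      # explicit copy into an immutable bytes object

data[0] = ord('z')           # mutate the underlying buffer
still_shared = head.tobytes()
```

Because head shares memory with data, still_shared reflects the mutation, while copied keeps the bytes as they were when .tobytes() was called.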

get_local_path(filepath: Union[str, pathlib.Path]) → Iterable[str]

Download data from filepath and write the data to local path.

get_local_path is decorated by contextlib.contextmanager(). It can be called in a with statement, and the temporary path is released when the with block exits.

Note

If filepath is a local path, it is returned as is.

Warning

get_local_path is an experimental interface that may change in the future.

Parameters

filepath (str or Path) – Path of the data to be read.

Examples

>>> file_client = FileClient(prefix='s3')
>>> with file_client.get_local_path('s3://bucket/abc.jpg') as path:
...     pass  # do something with the local copy at path

Yields

Iterable[str] – Only one path is yielded.
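The download-then-release behaviour can be sketched with contextlib and tempfile. This is a minimal illustration, assuming the remote content is already available as bytes; local_copy is a hypothetical helper, not part of mmcv.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def local_copy(content: bytes):
    """Write content to a temporary file, yield its path, then delete it."""
    f = tempfile.NamedTemporaryFile(delete=False)
    try:
        f.write(content)
        f.close()
        yield f.name  # only one path is yielded
    finally:
        os.remove(f.name)  # release the temporary path on exit


with local_copy(b'fake image bytes') as path:
    existed_inside = os.path.exists(path)
    with open(path, 'rb') as fh:
        read_back = fh.read()
exists_after = os.path.exists(path)
```

Inside the with block the temporary file exists and can be read; once the block exits, the file has been removed.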

get_text(filepath: Union[str, pathlib.Path], encoding='utf-8') → str

Read data from a given filepath with ‘r’ mode.

Parameters
  • filepath (str or Path) – Path to read data from.

  • encoding (str) – The encoding used to open the filepath. Default: ‘utf-8’.

Returns

The text read from filepath.

Return type

str

classmethod infer_client(file_client_args: Optional[dict] = None, uri: Optional[Union[str, pathlib.Path]] = None) → mmcv.fileio.file_client.FileClient

Infer a suitable file client based on the URI and arguments.

Parameters
  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. Default: None.

  • uri (str | Path, optional) – URI to be parsed, which contains the file prefix. Default: None.

Examples

>>> uri = 's3://path/of/your/file'
>>> file_client = FileClient.infer_client(uri=uri)
>>> file_client_args = {'backend': 'petrel'}
>>> file_client = FileClient.infer_client(file_client_args)

Returns

The instantiated FileClient object.

Return type

FileClient

isdir(filepath: Union[str, pathlib.Path]) → bool

Check whether a file path is a directory.

Parameters

filepath (str or Path) – Path to check.

Returns

True if filepath points to a directory, False otherwise.

Return type

bool

isfile(filepath: Union[str, pathlib.Path]) → bool

Check whether a file path is a file.

Parameters

filepath (str or Path) – Path to check.

Returns

True if filepath points to a file, False otherwise.

Return type

bool

join_path(filepath: Union[str, pathlib.Path], *filepaths: Union[str, pathlib.Path]) → str

Concatenate all file paths.

Join one or more filepath components intelligently. The return value is the concatenation of filepath and any members of *filepaths.

Parameters

filepath (str or Path) – Path to be concatenated.

Returns

The result of concatenation.

Return type

str

list_dir_or_file(dir_path: Union[str, pathlib.Path], list_dir: bool = True, list_file: bool = True, suffix: Optional[Union[str, Tuple[str]]] = None, recursive: bool = False) → Iterator[str]

Scan a directory to find the directories or files of interest, in arbitrary order.

Note

list_dir_or_file() returns paths relative to dir_path.

Parameters
  • dir_path (str | Path) – Path of the directory.

  • list_dir (bool) – List the directories. Default: True.

  • list_file (bool) – List the path of files. Default: True.

  • suffix (str or tuple[str], optional) – File suffix that we are interested in. Default: None.

  • recursive (bool) – If set to True, recursively scan the directory. Default: False.

Yields

Iterable[str] – A path relative to dir_path.
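For a local directory, the scanning behaviour can be approximated with os.scandir. The sketch below is a simplified stand-in, not mmcv's implementation; list_dir_or_file_sketch is a hypothetical name, and details such as the handling of hidden files are not modelled.

```python
import os
import tempfile


def list_dir_or_file_sketch(dir_path, list_dir=True, list_file=True,
                            suffix=None, recursive=False):
    """Yield paths relative to dir_path, in arbitrary order."""

    def _scan(current):
        for entry in os.scandir(current):
            rel = os.path.relpath(entry.path, dir_path)
            if entry.is_file():
                # str.endswith accepts a single suffix or a tuple of suffixes.
                if list_file and (suffix is None or rel.endswith(suffix)):
                    yield rel
            elif entry.is_dir():
                if list_dir:
                    yield rel
                if recursive:
                    yield from _scan(entry.path)

    return _scan(dir_path)


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for name in ('a.jpg', 'b.txt', os.path.join('sub', 'c.jpg')):
        open(os.path.join(root, name), 'w').close()
    found = sorted(list_dir_or_file_sketch(
        root, list_dir=False, suffix='.jpg', recursive=True))
```

The result is sorted only to make it deterministic; the generator itself makes no ordering promise.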

static parse_uri_prefix(uri: Union[str, pathlib.Path]) → Optional[str]

Parse the prefix of a URI.

Parameters

uri (str | Path) – URI to be parsed, which contains the file prefix.

Examples

>>> FileClient.parse_uri_prefix('s3://path/of/your/file')
's3'

Returns

The prefix of uri if it contains ‘://’, otherwise None.

Return type

str | None
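The rule stated above (a prefix exists only when the URI contains ‘://’) can be sketched as follows. parse_prefix_sketch is a hypothetical helper that handles plain strings only; it is not the library's own code and may differ from it in edge cases.

```python
from typing import Optional


def parse_prefix_sketch(uri: str) -> Optional[str]:
    """Return the text before '://' in uri, or None if there is no '://'."""
    if '://' not in uri:
        return None
    prefix, _ = uri.split('://', 1)
    return prefix


s3_prefix = parse_prefix_sketch('s3://path/of/your/file')
local_prefix = parse_prefix_sketch('/path/of/your/file')
```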

put(obj: bytes, filepath: Union[str, pathlib.Path]) → None

Write data to a given filepath with ‘wb’ mode.

Note

put should create the parent directory of filepath if it does not exist.

Parameters
  • obj (bytes) – Data to be written.

  • filepath (str or Path) – Path to write data.

put_text(obj: str, filepath: Union[str, pathlib.Path]) → None

Write data to a given filepath with ‘w’ mode.

Note

put_text should create the parent directory of filepath if it does not exist.

Parameters
  • obj (str) – Data to be written.

  • filepath (str or Path) – Path to write data.

  • encoding (str, optional) – The encoding format used to open the filepath. Default: ‘utf-8’.

classmethod register_backend(name, backend=None, force=False, prefixes=None)

Register a backend to FileClient.

This method can be used as a normal class method or a decorator.

class NewBackend(BaseStorageBackend):

    def get(self, filepath):
        return filepath

    def get_text(self, filepath):
        return filepath

FileClient.register_backend('new', NewBackend)

or

@FileClient.register_backend('new')
class NewBackend(BaseStorageBackend):

    def get(self, filepath):
        return filepath

    def get_text(self, filepath):
        return filepath

Parameters
  • name (str) – The name of the registered backend.

  • backend (class, optional) – The backend class to be registered, which must be a subclass of BaseStorageBackend. When this method is used as a decorator, backend is None. Defaults to None.

  • force (bool, optional) – Whether to override the backend if the name has already been registered. Defaults to False.

  • prefixes (str or list[str] or tuple[str], optional) – The prefixes of the registered storage backend. Default: None. New in version 1.3.15.

remove(filepath: Union[str, pathlib.Path]) → None

Remove a file.

Parameters

filepath (str or Path) – Path to be removed.

mmcv.fileio.dict_from_file(filename, key_type=<class 'str'>, encoding='utf-8', file_client_args=None)

Load a text file and parse the content as a dict.

Each line of the text file consists of two or more columns separated by whitespace or tabs. The first column is parsed as the dict key, and the remaining columns are parsed as the dict value.

Note

In v1.3.16 and later, dict_from_file supports loading a text file stored in different backends and parsing its content as a dict.

Parameters
  • filename (str) – Filename.

  • key_type (type) – Type of the dict keys. str is used by default; if another type is specified, the keys are converted to it.

  • encoding (str) – Encoding used to open the file. Default utf-8.

  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None.

Examples

>>> dict_from_file('/path/of/your/file')  # disk
{'key1': 'value1', 'key2': 'value2'}
>>> dict_from_file('s3://path/of/your/file')  # ceph or petrel
{'key1': 'value1', 'key2': 'value2'}

Returns

The parsed contents.

Return type

dict
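The column rule can be sketched on an in-memory string. This is an illustration only: parse_dict_text is a hypothetical helper, and two of its choices are assumptions rather than documented behaviour, namely that blank lines are skipped and that a value spanning several columns is kept as a list.

```python
def parse_dict_text(text, key_type=str):
    """Parse 'key value [value ...]' lines into a dict."""
    mapping = {}
    for line in text.splitlines():
        items = line.split()  # split on any run of spaces or tabs
        if not items:
            continue  # assumption: ignore blank lines
        if len(items) < 2:
            raise ValueError(f'expected at least two columns: {line!r}')
        key = key_type(items[0])
        # Assumption: one value column -> str, several -> list of str.
        mapping[key] = items[1] if len(items) == 2 else items[1:]
    return mapping


parsed = parse_dict_text('1 cat\n2\tdog bird\n', key_type=int)
```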

mmcv.fileio.dump(obj, file=None, file_format=None, file_client_args=None, **kwargs)

Dump data to json/yaml/pickle strings or files.

This method provides a unified API for dumping data as strings or to files, and also supports custom arguments for each file format.

Note

In v1.3.16 and later, dump supports dumping data as strings or to files saved in different backends.

Parameters
  • obj (any) – The python object to be dumped.

  • file (str or Path or file-like object, optional) – If not specified, then the object is dumped to a str, otherwise to a file specified by the filename or file-like object.

  • file_format (str, optional) – Same as load().

  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None.

Examples

>>> dump('hello world', '/path/of/your/file')  # disk
>>> dump('hello world', 's3://path/of/your/file')  # ceph or petrel

Returns

True for success, False otherwise.

Return type

bool

mmcv.fileio.list_from_file(filename, prefix='', offset=0, max_num=0, encoding='utf-8', file_client_args=None)

Load a text file and parse the content as a list of strings.

Note

In v1.3.16 and later, list_from_file supports loading a text file stored in different backends and parsing its content as a list of strings.

Parameters
  • filename (str) – Filename.

  • prefix (str) – The prefix to be inserted at the beginning of each item.

  • offset (int) – The line offset, i.e. the number of leading lines to skip.

  • max_num (int) – The maximum number of lines to be read; zero and negative values mean no limit.

  • encoding (str) – Encoding used to open the file. Default utf-8.

  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None.

Examples

>>> list_from_file('/path/of/your/file')  # disk
['hello', 'world']
>>> list_from_file('s3://path/of/your/file')  # ceph or petrel
['hello', 'world']

Returns

A list of strings.

Return type

list[str]
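How prefix, offset and max_num interact can be shown on an in-memory string. parse_list_text is a hypothetical helper written for this illustration; it assumes that offset counts skipped leading lines and is not the library's own code.

```python
def parse_list_text(text, prefix='', offset=0, max_num=0):
    """Split text into lines, skip offset lines, keep at most max_num."""
    lines = text.splitlines()[offset:]
    if max_num > 0:  # zero and negative values mean no limit
        lines = lines[:max_num]
    return [prefix + line for line in lines]


everything = parse_list_text('a.jpg\nb.jpg\nc.jpg\n')
windowed = parse_list_text('a.jpg\nb.jpg\nc.jpg\n',
                           prefix='imgs/', offset=1, max_num=1)
```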

mmcv.fileio.load(file, file_format=None, file_client_args=None, **kwargs)

Load data from json/yaml/pickle files.

This method provides a unified API for loading data from serialized files.

Note

In v1.3.16 and later, load supports loading data from serialized files stored in different backends.

Parameters
  • file (str or Path or file-like object) – Filename or a file-like object.

  • file_format (str, optional) – If not specified, the file format will be inferred from the file extension, otherwise use the specified one. Currently supported formats include “json”, “yaml/yml” and “pickle/pkl”.

  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None.

Examples

>>> load('/path/of/your/file')  # file is stored on disk
>>> load('https://path/of/your/file')  # file is stored on the Internet
>>> load('s3://path/of/your/file')  # file is stored in petrel

Returns

The content from the file.

image

mmcv.image.adjust_brightness(img, factor=1.0)

Adjust image brightness.

This function controls the brightness of an image. An enhancement factor of 0.0 gives a black image. A factor of 1.0 gives the original image. This function blends the source image and the degenerated black image:

\[output = img * factor + degenerated * (1 - factor)\]

Parameters
  • img (ndarray) – Image to be brightened.

  • factor (float) – A value that controls the enhancement. A factor of 1.0 returns the original image; lower factors mean less color (brightness, contrast, etc.), and higher values mean more. Default: 1.

Returns

The brightened image.

Return type

ndarray
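The blend formula can be checked on a single pixel value. In this sketch blend_pixel is a hypothetical helper, the degenerated pixel is 0 (black) as in the brightness case, and clipping the result to the 8-bit range [0, 255] is an assumption made for uint8 images.

```python
def blend_pixel(pixel, degenerated, factor):
    """Apply output = img * factor + degenerated * (1 - factor) to one value."""
    value = pixel * factor + degenerated * (1 - factor)
    # Assumption: results are clipped to the valid 8-bit range.
    return int(max(0, min(255, value)))


black = blend_pixel(200, 0, 0.0)      # factor 0.0 gives the black image
unchanged = blend_pixel(200, 0, 1.0)  # factor 1.0 gives the original value
darker = blend_pixel(200, 0, 0.5)
brighter = blend_pixel(200, 0, 1.5)   # 300 before clipping
```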

mmcv.image.adjust_color(img, alpha=1, beta=None, gamma=0)

It blends the source image and its gray image:

\[output = img * alpha + gray\_img * beta + gamma\]

Parameters
  • img (ndarray) – The input source image.

  • alpha (int | float) – Weight for the source image. Default 1.

  • beta (int | float) – Weight for the converted gray image. If None, it’s assigned the value (1 - alpha).

  • gamma (int | float) – Scalar added to each sum. Same as cv2.addWeighted(). Default 0.

Returns

Colored image which has the same size and dtype as the input.

Return type

ndarray

mmcv.image.adjust_contrast(img, factor=1.0)

Adjust image contrast.

This function controls the contrast of an image. An enhancement factor of 0.0 gives a solid grey image. A factor of 1.0 gives the original image. It blends the source image and the degenerated mean image:

\[output = img * factor + degenerated * (1 - factor)\]

Parameters
  • img (ndarray) – Image whose contrast is to be adjusted. BGR order.

  • factor (float) – Same as mmcv.adjust_brightness().

Returns

The contrast-adjusted image.

Return type

ndarray

mmcv.image.adjust_lighting(img, eigval, eigvec, alphastd=0.1, to_rgb=True)

AlexNet-style PCA jitter.

This data augmentation was proposed in ImageNet Classification with Deep Convolutional Neural Networks.

Parameters
  • img (ndarray) – Image whose lighting is to be adjusted. BGR order.

  • eigval (ndarray) – The eigenvalues of the covariance matrix of pixel values.

  • eigvec (ndarray) – The eigenvectors of the covariance matrix of pixel values.

  • alphastd (float) – The standard deviation of the distribution of alpha. Defaults to 0.1.

  • to_rgb (bool) – Whether to convert img to RGB.

Returns

The adjusted image.

Return type

ndarray

mmcv.image.adjust_sharpness(img, factor=1.0, kernel=None)

Adjust image sharpness.

This function controls the sharpness of an image. An enhancement factor of 0.0 gives a blurred image. A factor of 1.0 gives the original image. And a factor of 2.0 gives a sharpened image. It blends the source image and the degenerated mean image:

\[output = img * factor + degenerated * (1 - factor)\]

Parameters
  • img (ndarray) – Image to be sharpened. BGR order.

  • factor (float) – Same as mmcv.adjust_brightness().

  • kernel (np.ndarray, optional) – Filter kernel to be applied on the img to obtain the degenerated img. Defaults to None.

Note

No sanity check is enforced on a user-supplied kernel. With an inappropriate kernel, adjust_sharpness may not sharpen the image as its name suggests and will instead perform whatever transform the kernel determines.

Returns

The sharpened image.

Return type

ndarray

mmcv.image.auto_contrast(img, cutoff=0)

Auto adjust image contrast.

This function maximizes (normalizes) image contrast by first removing cutoff percent of the lightest and darkest pixels from the histogram and then remapping the image so that the darkest pixel becomes black (0) and the lightest becomes white (255).

Parameters
  • img (ndarray) – Image whose contrast is to be adjusted. BGR order.

  • cutoff (int | float | tuple) – The cutoff percent of the lightest and darkest pixels to be removed. If given as a tuple, it must be (low, high). Otherwise, the single value is used for both. Defaults to 0.

Returns

The contrast-adjusted image.

Return type

ndarray

mmcv.image.bgr2gray(img, keepdim=False)

Convert a BGR image to grayscale image.

Parameters
  • img (ndarray) – The input image.

  • keepdim (bool) – If False (the default), return the grayscale image with 2 dims; otherwise return it with 3 dims.

Returns

The converted grayscale image.

Return type

ndarray

mmcv.image.bgr2hls(img)

Convert a BGR image to HLS image.

Parameters

img (ndarray or str) – The input image.

Returns

The converted HLS image.

Return type

ndarray

mmcv.image.bgr2hsv(img)

Convert a BGR image to HSV image.

Parameters

img (ndarray or str) – The input image.

Returns

The converted HSV image.

Return type

ndarray

mmcv.image.bgr2rgb(img)

Convert a BGR image to RGB image.

Parameters

img (ndarray or str) – The input image.

Returns

The converted RGB image.

Return type

ndarray

mmcv.image.bgr2ycbcr(img, y_only=False)

Convert a BGR image to YCbCr image.

The bgr version of rgb2ycbcr. It implements the ITU-R BT.601 conversion for standard-definition television. See more details in https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.

It differs from a similar function in cv2.cvtColor: BGR <-> YCrCb. In OpenCV, it implements a JPEG conversion. See more details in https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.

Parameters
  • img (ndarray) – The input image. It accepts: 1. np.uint8 type with range [0, 255]; 2. np.float32 type with range [0, 1].

  • y_only (bool) – Whether to only return Y channel. Default: False.

Returns

The converted YCbCr image. The output image has the same type and range as the input image.

Return type

ndarray

mmcv.image.clahe(img, clip_limit=40.0, tile_grid_size=(8, 8))

Use CLAHE method to process the image.

See ZUIDERVELD,K. Contrast Limited Adaptive Histogram Equalization[J]. Graphics Gems, 1994:474-485. for more information.

Parameters
  • img (ndarray) – Image to be processed.

  • clip_limit (float) – Threshold for contrast limiting. Default: 40.0.

  • tile_grid_size (tuple[int]) – Size of grid for histogram equalization. Input image will be divided into equally sized rectangular tiles. It defines the number of tiles in row and column. Default: (8, 8).

Returns

The processed image.

Return type

ndarray

mmcv.image.cutout(img, shape, pad_val=0)

Randomly cut out a rectangle from the original img.

Parameters
  • img (ndarray) – Image to be cut out.

  • shape (int | tuple[int]) – Expected cutout shape (h, w). If given as an int, the value is used for both h and w.

  • pad_val (int | float | tuple[int | float]) – Values to be filled in the cut area. Defaults to 0.

Returns

The cutout image.

Return type

ndarray

mmcv.image.gray2bgr(img)

Convert a grayscale image to a BGR image.

Parameters

img (ndarray) – The input image.

Returns

The converted BGR image.

Return type

ndarray

mmcv.image.gray2rgb(img)

Convert a grayscale image to an RGB image.

Parameters

img (ndarray) – The input image.

Returns

The converted RGB image.

Return type

ndarray

mmcv.image.hls2bgr(img)

Convert an HLS image to BGR image.

Parameters

img (ndarray or str) – The input image.

Returns

The converted BGR image.

Return type

ndarray

mmcv.image.hsv2bgr(img)

Convert an HSV image to BGR image.

Parameters

img (ndarray or str) – The input image.

Returns

The converted BGR image.

Return type

ndarray

mmcv.image.imconvert(img, src, dst)

Convert an image from the src colorspace to dst colorspace.

Parameters
  • img (ndarray) – The input image.

  • src (str) – The source colorspace, e.g., ‘rgb’, ‘hsv’.

  • dst (str) – The destination colorspace, e.g., ‘rgb’, ‘hsv’.

Returns

The converted image.

Return type

ndarray

mmcv.image.imcrop(img, bboxes, scale=1.0, pad_fill=None)

Crop image patches.

3 steps: scale the bboxes -> clip bboxes -> crop and pad.

Parameters
  • img (ndarray) – Image to be cropped.

  • bboxes (ndarray) – Shape (k, 4) or (4, ), location of cropped bboxes.

  • scale (float, optional) – Scale ratio of bboxes, the default value 1.0 means no padding.

  • pad_fill (Number | list[Number]) – Value to be filled for padding. Default: None, which means no padding.

Returns

The cropped image patches.

Return type

list[ndarray] | ndarray

mmcv.image.imequalize(img)

Equalize the image histogram.

This function applies a non-linear mapping to the input image, in order to create a uniform distribution of grayscale values in the output image.

Parameters

img (ndarray) – Image to be equalized.

Returns

The equalized image.

Return type

ndarray

mmcv.image.imflip(img, direction='horizontal')

Flip an image horizontally, vertically, or diagonally.

Parameters
  • img (ndarray) – Image to be flipped.

  • direction (str) – The flip direction, one of “horizontal”, “vertical” or “diagonal”.

Returns

The flipped image.

Return type

ndarray

mmcv.image.imflip_(img, direction='horizontal')

Flip an image in place horizontally, vertically, or diagonally.

Parameters
  • img (ndarray) – Image to be flipped.

  • direction (str) – The flip direction, one of “horizontal”, “vertical” or “diagonal”.

Returns

The flipped image (in place).

Return type

ndarray

mmcv.image.imfrombytes(content, flag='color', channel_order='bgr', backend=None)

Read an image from bytes.

Parameters
  • content (bytes) – Image bytes obtained from files or other streams.

  • flag (str) – Same as imread().

  • backend (str | None) – The image decoding backend type. Options are cv2, pillow, turbojpeg, tifffile, None. If backend is None, the global imread_backend specified by mmcv.use_backend() will be used. Default: None.

Returns

Loaded image array.

Return type

ndarray

Examples

>>> img_path = '/path/to/img.jpg'
>>> with open(img_path, 'rb') as f:
...     img_buff = f.read()
>>> img = mmcv.imfrombytes(img_buff)
>>> img = mmcv.imfrombytes(img_buff, flag='color', channel_order='rgb')
>>> img = mmcv.imfrombytes(img_buff, backend='pillow')
>>> img = mmcv.imfrombytes(img_buff, backend='cv2')

mmcv.image.iminvert(img)

Invert (negate) an image.

Parameters

img (ndarray) – Image to be inverted.

Returns

The inverted image.

Return type

ndarray

mmcv.image.imnormalize(img, mean, std, to_rgb=True)

Normalize an image with mean and std.

Parameters
  • img (ndarray) – Image to be normalized.

  • mean (ndarray) – The mean to be used for normalization.

  • std (ndarray) – The std to be used for normalization.

  • to_rgb (bool) – Whether to convert to RGB.

Returns

The normalized image.

Return type

ndarray

mmcv.image.imnormalize_(img, mean, std, to_rgb=True)

Normalize an image in place with mean and std.

Parameters
  • img (ndarray) – Image to be normalized.

  • mean (ndarray) – The mean to be used for normalization.

  • std (ndarray) – The std to be used for normalization.

  • to_rgb (bool) – Whether to convert to RGB.

Returns

The normalized image.

Return type

ndarray

mmcv.image.impad(img, *, shape=None, padding=None, pad_val=0, padding_mode='constant')

Pad the given image to a certain shape or pad on all sides with specified padding mode and padding value.

Parameters
  • img (ndarray) – Image to be padded.

  • shape (tuple[int]) – Expected padding shape (h, w). Default: None.

  • padding (int or tuple[int]) – Padding on each border. If a single int is provided, it is used to pad all borders. If a tuple of length 2 is provided, it is the padding on left/right and top/bottom respectively. If a tuple of length 4 is provided, it is the padding for the left, top, right and bottom borders respectively. Default: None. Note that shape and padding cannot both be set.

  • pad_val (Number | Sequence[Number]) – Values to be filled in padding areas when padding_mode is ‘constant’. Default: 0.

  • padding_mode (str) –

    Type of padding. Should be: constant, edge, reflect or symmetric. Default: constant.

    • constant: pads with a constant value, this value is specified with pad_val.

    • edge: pads with the last value at the edge of the image.

    • reflect: pads with reflection of image without repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in reflect mode will result in [3, 2, 1, 2, 3, 4, 3, 2].

    • symmetric: pads with reflection of image repeating the last value on the edge. For example, padding [1, 2, 3, 4] with 2 elements on both sides in symmetric mode will result in [2, 1, 1, 2, 3, 4, 4, 3].

Returns

The padded image.

Return type

ndarray
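The four padding modes are easiest to compare on a one-dimensional list. The sketch below reproduces the two worked examples given for reflect and symmetric; pad_1d is a hypothetical pure-Python helper, not the function mmcv calls, and it assumes pad is smaller than the list length.

```python
def pad_1d(values, pad, mode='constant', pad_val=0):
    """Pad a list with pad elements on both sides using the given mode."""
    n = len(values)
    if mode == 'constant':
        left = [pad_val] * pad
        right = [pad_val] * pad
    elif mode == 'edge':
        left = [values[0]] * pad
        right = [values[-1]] * pad
    elif mode == 'reflect':
        # Mirror around the edge element without repeating it.
        left = [values[i] for i in range(pad, 0, -1)]
        right = [values[n - 1 - i] for i in range(1, pad + 1)]
    elif mode == 'symmetric':
        # Mirror including the edge element itself.
        left = [values[i] for i in range(pad - 1, -1, -1)]
        right = [values[n - 1 - i] for i in range(pad)]
    else:
        raise ValueError(f'unsupported padding mode: {mode}')
    return left + list(values) + right


reflected = pad_1d([1, 2, 3, 4], 2, mode='reflect')
mirrored = pad_1d([1, 2, 3, 4], 2, mode='symmetric')
edged = pad_1d([1, 2, 3, 4], 2, mode='edge')
filled = pad_1d([1, 2, 3, 4], 2, mode='constant', pad_val=0)
```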

mmcv.image.impad_to_multiple(img, divisor, pad_val=0)

Pad an image so that each edge is a multiple of a given number.

Parameters
  • img (ndarray) – Image to be padded.

  • divisor (int) – Each edge of the padded image will be a multiple of divisor.

  • pad_val (Number | Sequence[Number]) – Same as impad().

Returns

The padded image.

Return type

ndarray
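The target shape is each edge rounded up to the next multiple of divisor. A minimal sketch of that arithmetic follows; padded_shape is a hypothetical helper that computes only the shape and does not pad any pixels.

```python
import math


def padded_shape(h, w, divisor):
    """Round h and w up to the nearest multiple of divisor."""
    return (math.ceil(h / divisor) * divisor,
            math.ceil(w / divisor) * divisor)


shape_32 = padded_shape(375, 500, 32)
already_aligned = padded_shape(64, 64, 32)
```

An image of height 375 and width 500 padded to a multiple of 32 becomes 384 by 512, while a 64 by 64 image is left at its original size.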

mmcv.image.imread(img_or_path, flag='color', channel_order='bgr', backend=None, file_client_args=None)

Read an image.

Note

The file_client_args parameter was added in v1.4.1.

Parameters
  • img_or_path (ndarray or str or Path) – Either a numpy array or str or pathlib.Path. If it is a numpy array (loaded image), then it will be returned as is.

  • flag (str) – Flags specifying the color type of a loaded image, candidates are color, grayscale, unchanged, color_ignore_orientation and grayscale_ignore_orientation. By default, cv2 and pillow backend would rotate the image according to its EXIF info unless called with unchanged or *_ignore_orientation flags. turbojpeg and tifffile backend always ignore image’s EXIF info regardless of the flag. The turbojpeg backend only supports color and grayscale.

  • channel_order (str) – Order of channel, candidates are bgr and rgb.

  • backend (str | None) – The image decoding backend type. Options are cv2, pillow, turbojpeg, tifffile, None. If backend is None, the global imread_backend specified by mmcv.use_backend() will be used. Default: None.

  • file_client_args (dict | None) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None.

Returns

Loaded image array.

Return type

ndarray

Examples

>>> import mmcv
>>> img_path = '/path/to/img.jpg'
>>> img = mmcv.imread(img_path)
>>> img = mmcv.imread(img_path, flag='color', channel_order='rgb',
...     backend='cv2')
>>> img = mmcv.imread(img_path, flag='color', channel_order='bgr',
...     backend='pillow')
>>> s3_img_path = 's3://bucket/img.jpg'
>>> # infer the file backend by the prefix s3
>>> img = mmcv.imread(s3_img_path)
>>> # manually set the file backend petrel
>>> img = mmcv.imread(s3_img_path, file_client_args={
...     'backend': 'petrel'})
>>> http_img_path = 'http://path/to/img.jpg'
>>> img = mmcv.imread(http_img_path)
>>> img = mmcv.imread(http_img_path, file_client_args={
...     'backend': 'http'})

mmcv.image.imrescale(img, scale, return_scale=False, interpolation='bilinear', backend=None)

Resize image while keeping the aspect ratio.

Parameters
  • img (ndarray) – The input image.

  • scale (float | tuple[int]) – The scaling factor or maximum size. If it is a float, the image is rescaled by this factor; if it is a tuple of 2 integers, the image is rescaled to be as large as possible within that size.

  • return_scale (bool) – Whether to return the scaling factor besides the rescaled image.

  • interpolation (str) – Same as imresize().

  • backend (str | None) – Same as imresize().

Returns

The rescaled image.

Return type

ndarray

mmcv.image.imresize(img, size, return_scale=False, interpolation='bilinear', out=None, backend=None)

Resize image to a given size.

Parameters
  • img (ndarray) – The input image.

  • size (tuple[int]) – Target size (w, h).

  • return_scale (bool) – Whether to return w_scale and h_scale.

  • interpolation (str) – Interpolation method, accepted values are “nearest”, “bilinear”, “bicubic”, “area”, “lanczos” for ‘cv2’ backend, “nearest”, “bilinear” for ‘pillow’ backend.

  • out (ndarray) – The output destination.

  • backend (str | None) – The image resize backend type. Options are cv2, pillow, None. If backend is None, the global imread_backend specified by mmcv.use_backend() will be used. Default: None.

Returns

(resized_img, w_scale, h_scale) or resized_img.

Return type

tuple | ndarray

mmcv.image.imresize_like(img, dst_img, return_scale=False, interpolation='bilinear', backend=None)

Resize image to the same size of a given image.

Parameters
  • img (ndarray) – The input image.

  • dst_img (ndarray) – The target image.

  • return_scale (bool) – Whether to return w_scale and h_scale.

  • interpolation (str) – Same as imresize().

  • backend (str | None) – Same as imresize().

Returns

(resized_img, w_scale, h_scale) or resized_img.

Return type

tuple | ndarray

mmcv.image.imresize_to_multiple(img, divisor, size=None, scale_factor=None, keep_ratio=False, return_scale=False, interpolation='bilinear', out=None, backend=None)

Resize an image according to a given size or scale factor, then round the resized or rescaled image size up to the nearest value that is divisible by the divisor.

Parameters
  • img (ndarray) – The input image.

  • divisor (int | tuple) – Resized image size will be a multiple of divisor. If divisor is a tuple, divisor should be (w_divisor, h_divisor).

  • size (None | int | tuple[int]) – Target size (w, h). Default: None.

  • scale_factor (None | float | tuple[float]) – Multiplier for spatial size. Should match input size if it is a tuple and the 2D style is (w_scale_factor, h_scale_factor). Default: None.

  • keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image. Default: False.

  • return_scale (bool) – Whether to return w_scale and h_scale.

  • interpolation (str) – Interpolation method, accepted values are “nearest”, “bilinear”, “bicubic”, “area”, “lanczos” for ‘cv2’ backend, “nearest”, “bilinear” for ‘pillow’ backend.

  • out (ndarray) – The output destination.

  • backend (str | None) – The image resize backend type. Options are cv2, pillow, None. If backend is None, the global imread_backend specified by mmcv.use_backend() will be used. Default: None.

Returns

(resized_img, w_scale, h_scale) or resized_img.

Return type

tuple | ndarray

mmcv.image.imrotate(img, angle, center=None, scale=1.0, border_value=0, interpolation='bilinear', auto_bound=False)

Rotate an image.

Parameters
  • img (ndarray) – Image to be rotated.

  • angle (float) – Rotation angle in degrees; positive values mean clockwise rotation.

  • center (tuple[float], optional) – Center point (w, h) of the rotation in the source image. If not specified, the center of the image is used.

  • scale (float) – Isotropic scale factor.

  • border_value (int) – Border value.

  • interpolation (str) – Same as imresize().

  • auto_bound (bool) – Whether to adjust the image size to cover the whole rotated image.

Returns

The rotated image.

Return type

ndarray

mmcv.image.imshear(img, magnitude, direction='horizontal', border_value=0, interpolation='bilinear')

Shear an image.

Parameters
  • img (ndarray) – Image to be sheared with format (h, w) or (h, w, c).

  • magnitude (int | float) – The magnitude used for shear.

  • direction (str) – The shear direction, either “horizontal” or “vertical”.

  • border_value (int | tuple[int]) – Value used in case of a constant border.

  • interpolation (str) – Same as imresize().

Returns

The sheared image.

Return type

ndarray

mmcv.image.imtranslate(img, offset, direction='horizontal', border_value=0, interpolation='bilinear')

Translate an image.

Parameters
  • img (ndarray) – Image to be translated with format (h, w) or (h, w, c).

  • offset (int | float) – The offset used for translation.

  • direction (str) – The translation direction, either “horizontal” or “vertical”.

  • border_value (int | tuple[int]) – Value used in case of a constant border.

  • interpolation (str) – Same as imresize().

Returns

The translated image.

Return type

ndarray

mmcv.image.imwrite(img, file_path, params=None, auto_mkdir=None, file_client_args=None)

Write image to file.

Note

The file_client_args parameter was added in v1.4.1.

Warning

The parameter auto_mkdir will be deprecated in the future, and every file client will create directories automatically.

Parameters
  • img (ndarray) – Image array to be written.

  • file_path (str) – Image file path.

  • params (None or list) – Same as opencv imwrite() interface.

  • auto_mkdir (bool) – If the parent folder of file_path does not exist, whether to create it automatically. It will be deprecated.

  • file_client_args (dict | None) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None.

返回

Successful or not.

返回类型

bool

实际案例

>>> # write to hard disk client
>>> ret = mmcv.imwrite(img, '/path/to/img.jpg')
>>> # infer the file backend by the prefix s3
>>> ret = mmcv.imwrite(img, 's3://bucket/img.jpg')
>>> # manually set the file backend petrel
>>> ret = mmcv.imwrite(img, 's3://bucket/img.jpg', file_client_args={
...     'backend': 'petrel'})
mmcv.image.lut_transform(img, lut_table)[源代码]

Transform array by look-up table.

The function lut_transform fills the output array with values from the look-up table. Indices of the entries are taken from the input array.

参数
  • img (ndarray) – Image to be transformed.

  • lut_table (ndarray) – look-up table of 256 elements; in case of multi-channel input array, the table should either have a single channel (in this case the same table is used for all channels) or the same number of channels as in the input array.

返回

The transformed image.

返回类型

ndarray

mmcv.image.posterize(img, bits)[源代码]

Posterize an image (reduce the number of bits for each color channel).

参数
  • img (ndarray) – Image to be posterized.

  • bits (int) – Number of bits (1 to 8) to use for posterizing.

返回

The posterized image.

返回类型

ndarray
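
The bit reduction described above can be illustrated on plain integers. This is a minimal sketch that assumes 8-bit pixel values and assumes posterizing keeps the most significant bits; `posterize_value` is a hypothetical helper, not part of mmcv.

```python
def posterize_value(value, bits):
    """Keep only the `bits` most significant bits of an 8-bit value."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be between 1 and 8")
    shift = 8 - bits
    # Dropping the low bits and shifting back zeroes them out.
    return (value >> shift) << shift


row = [0, 37, 128, 200, 255]
posterized = [posterize_value(v, 2) for v in row]
print(posterized)  # [0, 0, 128, 192, 192]
```

With bits=8 the shift is zero, so every value is returned unchanged.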

mmcv.image.rescale_size(old_size, scale, return_scale=False)[源代码]

Calculate the new size to be rescaled to.

参数
  • old_size (tuple[int]) – The old size (w, h) of image.

  • scale (float | tuple[int]) – The scaling factor or maximum size. If it is a float number, then the image will be rescaled by this factor, else if it is a tuple of 2 integers, then the image will be rescaled as large as possible within the scale.

  • return_scale (bool) – Whether to return the scaling factor besides the rescaled image size.

返回

The new rescaled image size.

返回类型

tuple[int]
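
The two modes of the scale argument can be sketched in plain Python. This is an illustration under stated assumptions, not the library code: `rescale_size_sketch` is a hypothetical name, and the round-to-nearest rule is an assumption.

```python
def rescale_size_sketch(old_size, scale):
    """Return ((new_w, new_h), factor) for a float factor or a size bound."""
    w, h = old_size
    if isinstance(scale, (int, float)):
        if scale <= 0:
            raise ValueError(f"Invalid scale {scale}, must be positive.")
        factor = float(scale)
    elif isinstance(scale, tuple):
        # Largest factor that keeps the long edge within max(scale)
        # and the short edge within min(scale).
        factor = min(max(scale) / max(w, h), min(scale) / min(w, h))
    else:
        raise TypeError("scale must be a number or a tuple of two ints")
    # Assumed rounding: nearest integer pixel size.
    new_size = (int(w * factor + 0.5), int(h * factor + 0.5))
    return new_size, factor


print(rescale_size_sketch((1000, 500), 0.5)[0])         # (500, 250)
print(rescale_size_sketch((1000, 500), (400, 400))[0])  # (400, 200)
```

In the tuple case the aspect ratio is preserved, so only one of the two bounds is normally reached.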

mmcv.image.rgb2bgr(img)

Convert an RGB image to a BGR image.

参数

img (ndarray or str) – The input image.

返回

The converted BGR image.

返回类型

ndarray

mmcv.image.rgb2gray(img, keepdim=False)[源代码]

Convert an RGB image to a grayscale image.

参数
  • img (ndarray) – The input image.

  • keepdim (bool) – If False (by default), then return the grayscale image with 2 dims, otherwise 3 dims.

返回

The converted grayscale image.

返回类型

ndarray

mmcv.image.rgb2ycbcr(img, y_only=False)[源代码]

Convert an RGB image to a YCbCr image.

This function produces the same results as Matlab’s rgb2ycbcr function. It implements the ITU-R BT.601 conversion for standard-definition television. See more details in https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.

It differs from the similar cv2.cvtColor conversion RGB <-> YCrCb: the OpenCV conversion implements the JPEG conversion. See more details in https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.

参数
  • img (ndarray) – The input image. It accepts: 1. np.uint8 type with range [0, 255]; 2. np.float32 type with range [0, 1].

  • y_only (bool) – Whether to only return Y channel. Default: False.

返回

The converted YCbCr image. The output image has the same type and range as input image.

返回类型

ndarray
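
For a single pixel, the BT.601 conversion mentioned above can be written out as follows. This sketch uses the commonly published studio-range coefficients for RGB in [0, 1] and always returns values on a 0 to 255 scale. It does not reproduce how the function matches the output type and range to its input, and `rgb_to_ycbcr_bt601` is a hypothetical name.

```python
def rgb_to_ycbcr_bt601(r, g, b):
    """Convert one RGB pixel in [0, 1] to studio-range YCbCr (0-255 scale)."""
    y = 16.0 + 65.481 * r + 128.553 * g + 24.966 * b
    cb = 128.0 - 37.797 * r - 74.203 * g + 112.0 * b
    cr = 128.0 + 112.0 * r - 93.786 * g - 18.214 * b
    return y, cb, cr


# Black maps to the bottom of the studio range, not to zero.
print(rgb_to_ycbcr_bt601(0.0, 0.0, 0.0))  # (16.0, 128.0, 128.0)

# White maps to roughly Y = 235 with neutral chroma (128, 128).
print([round(c, 3) for c in rgb_to_ycbcr_bt601(1.0, 1.0, 1.0)])
```

The Y range of 16 to 235 is what distinguishes this conversion from the full-range JPEG variant.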

mmcv.image.solarize(img, thr=128)[源代码]

Solarize an image (invert all pixel values above a threshold).

参数
  • img (ndarray) – Image to be solarized.

  • thr (int) – Threshold for solarizing (0 - 255).

返回

The solarized image.

返回类型

ndarray
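
Solarizing can be illustrated on plain 8-bit integers. The sketch below assumes that values at or above the threshold are inverted and lower values are kept; the handling of a value exactly equal to thr is an assumption, and `solarize_value` is a hypothetical helper.

```python
def solarize_value(value, thr=128):
    """Invert an 8-bit value if it reaches the threshold, else keep it."""
    return value if value < thr else 255 - value


row = [0, 100, 128, 200, 255]
solarized = [solarize_value(v) for v in row]
print(solarized)  # [0, 100, 127, 55, 0]
```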

mmcv.image.tensor2imgs(tensor, mean=None, std=None, to_rgb=True)[源代码]

Convert tensor to 3-channel images or 1-channel gray images.

参数
  • tensor (torch.Tensor) – Tensor that contains multiple images, with shape (N, C, H, W). C can be either 3 or 1.

  • mean (tuple[float], optional) – Mean of images. If None, (0, 0, 0) will be used for tensor with 3-channel, while (0, ) for tensor with 1-channel. Defaults to None.

  • std (tuple[float], optional) – Standard deviation of images. If None, (1, 1, 1) will be used for tensor with 3-channel, while (1, ) for tensor with 1-channel. Defaults to None.

  • to_rgb (bool, optional) – Whether the tensor was converted to RGB format in the first place. If so, convert it back to BGR. For the tensor with 1 channel, it must be False. Defaults to True.

返回

A list that contains multiple images.

返回类型

list[np.ndarray]

mmcv.image.use_backend(backend)[源代码]

Select a backend for image decoding.

参数
  • backend (str) – The image decoding backend type. Options are cv2, pillow, turbojpeg (see https://github.com/lilohuang/PyTurboJPEG) and tifffile. turbojpeg is faster but it only supports the .jpeg file format.

mmcv.image.ycbcr2bgr(img)[源代码]

Convert a YCbCr image to a BGR image.

The BGR version of ycbcr2rgb. It implements the ITU-R BT.601 conversion for standard-definition television. See more details in https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.

It differs from the similar cv2.cvtColor conversion YCrCb <-> BGR: the OpenCV conversion implements the JPEG conversion. See more details in https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.

参数

img (ndarray) – The input image. It accepts: 1. np.uint8 type with range [0, 255]; 2. np.float32 type with range [0, 1].

返回

The converted BGR image. The output image has the same type and range as input image.

返回类型

ndarray

mmcv.image.ycbcr2rgb(img)[源代码]

Convert a YCbCr image to an RGB image.

This function produces the same results as Matlab’s ycbcr2rgb function. It implements the ITU-R BT.601 conversion for standard-definition television. See more details in https://en.wikipedia.org/wiki/YCbCr#ITU-R_BT.601_conversion.

It differs from the similar cv2.cvtColor conversion YCrCb <-> RGB: the OpenCV conversion implements the JPEG conversion. See more details in https://en.wikipedia.org/wiki/YCbCr#JPEG_conversion.

参数

img (ndarray) – The input image. It accepts: 1. np.uint8 type with range [0, 255]; 2. np.float32 type with range [0, 1].

返回

The converted RGB image. The output image has the same type and range as input image.

返回类型

ndarray

video

class mmcv.video.VideoReader(filename, cache_capacity=10)[源代码]

Video class with similar usage to a list object.

This video wrapper class provides convenient APIs to access frames. OpenCV’s VideoCapture class has an issue where jumping to a certain frame may be inaccurate; this class works around it by checking the position after each jump. A cache is used when decoding videos, so a frame that is visited a second time does not need to be decoded again if it is still stored in the cache.

Examples

>>> import mmcv
>>> v = mmcv.VideoReader('sample.mp4')
>>> len(v)  # get the total frame number with `len()`
120
>>> for img in v:  # v is iterable
...     mmcv.imshow(img)
>>> v[5]  # get the 6th frame
current_frame()[源代码]

Get the current frame (frame that is just visited).

返回

If the video is fresh, return None, otherwise return the frame.

返回类型

ndarray or None

cvt2frames(frame_dir, file_start=0, filename_tmpl='{:06d}.jpg', start=0, max_num=0, show_progress=True)[源代码]

Convert a video to frame images.

参数
  • frame_dir (str) – Output directory to store all the frame images.

  • file_start (int) – Filenames will start from the specified number.

  • filename_tmpl (str) – Filename template with the index as the placeholder.

  • start (int) – The starting frame index.

  • max_num (int) – Maximum number of frames to be written.

  • show_progress (bool) – Whether to show a progress bar.

property fourcc

“Four character code” of the video.

Type

str

property fps

FPS of the video.

Type

float

property frame_cnt

Total frames of the video.

Type

int

get_frame(frame_id)[源代码]

Get frame by index.

参数

frame_id (int) – Index of the expected frame, 0-based.

返回

Return the frame if successful, otherwise None.

返回类型

ndarray or None

property height

Height of video frames.

Type

int

property opened

Indicate whether the video is opened.

Type

bool

property position

Current cursor position, indicating frame decoded.

Type

int

read()[源代码]

Read the next frame.

If the next frame has been decoded before and is in the cache, then return it directly; otherwise decode, cache and return it.

返回

Return the frame if successful, otherwise None.

返回类型

ndarray or None

property resolution

Video resolution (width, height).

Type

tuple

property vcap

The raw VideoCapture object.

Type

cv2.VideoCapture

property width

Width of video frames.

Type

int

mmcv.video.concat_video(video_list, out_file, vcodec=None, acodec=None, log_level='info', print_cmd=False)[源代码]

Concatenate multiple videos into a single one.

参数
  • video_list (list) – A list of video filenames

  • out_file (str) – Output video filename

  • vcodec (None or str) – Output video codec, None for unchanged

  • acodec (None or str) – Output audio codec, None for unchanged

  • log_level (str) – Logging level of ffmpeg.

  • print_cmd (bool) – Whether to print the final ffmpeg command.

mmcv.video.convert_video(in_file, out_file, print_cmd=False, pre_options='', **kwargs)[源代码]

Convert a video with ffmpeg.

This provides a general api to ffmpeg, the executed command is:

`ffmpeg -y <pre_options> -i <in_file> <options> <out_file>`

Options(kwargs) are mapped to ffmpeg commands with the following rules:

  • key=val: “-key val”

  • key=True: “-key”

  • key=False: “”

参数
  • in_file (str) – Input video filename.

  • out_file (str) – Output video filename.

  • pre_options (str) – Options that appear before “-i <in_file>”.

  • print_cmd (bool) – Whether to print the final ffmpeg command.
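
The mapping rules above can be sketched as a small command builder. This is an illustration only: `build_ffmpeg_cmd` is a hypothetical helper that returns the command string without running ffmpeg, and it covers only the three rules listed here.

```python
def build_ffmpeg_cmd(in_file, out_file, pre_options="", **kwargs):
    """Assemble `ffmpeg -y <pre_options> -i <in_file> <options> <out_file>`."""
    options = []
    for key, val in kwargs.items():
        if isinstance(val, bool):
            if val:
                options.append(f"-{key}")    # key=True  -> "-key"
            # key=False contributes nothing
        else:
            options.append(f"-{key} {val}")  # key=val   -> "-key val"
    parts = ["ffmpeg -y", pre_options, f"-i {in_file}",
             " ".join(options), out_file]
    # Skip empty pieces so the command has no doubled spaces.
    return " ".join(p for p in parts if p)


cmd = build_ffmpeg_cmd("in.mp4", "out.mp4", vcodec="libx264", an=True, sn=False)
print(cmd)  # ffmpeg -y -i in.mp4 -vcodec libx264 -an out.mp4
```

Booleans are checked before the generic case so that `an=True` becomes a bare `-an` flag instead of `-an True`.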

mmcv.video.cut_video(in_file, out_file, start=None, end=None, vcodec=None, acodec=None, log_level='info', print_cmd=False)[源代码]

Cut a clip from a video.

参数
  • in_file (str) – Input video filename.

  • out_file (str) – Output video filename.

  • start (None or float) – Start time (in seconds).

  • end (None or float) – End time (in seconds).

  • vcodec (None or str) – Output video codec, None for unchanged.

  • acodec (None or str) – Output audio codec, None for unchanged.

  • log_level (str) – Logging level of ffmpeg.

  • print_cmd (bool) – Whether to print the final ffmpeg command.

mmcv.video.dequantize_flow(dx, dy, max_val=0.02, denorm=True)[源代码]

Recover from quantized flow.

参数
  • dx (ndarray) – Quantized dx.

  • dy (ndarray) – Quantized dy.

  • max_val (float) – Maximum value used when quantizing.

  • denorm (bool) – Whether to multiply flow values with width/height.

返回

Dequantized flow.

返回类型

ndarray

mmcv.video.flow_from_bytes(content)[源代码]

Read dense optical flow from bytes.

Note

This function can load optical flow from the FlyingChairs, FlyingThings3D, Sintel and FlyingChairsOcc datasets, but cannot load data from ChairsSDHom.

参数

content (bytes) – Optical flow bytes got from files or other streams.

返回

Loaded optical flow with the shape (H, W, 2).

返回类型

ndarray
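
As an illustration of reading dense flow from bytes, the sketch below parses the Middlebury .flo layout: a 4-byte `PIEH` tag, then int32 width and height, then interleaved little-endian float32 (dx, dy) values in row-major order. That this layout is what the function expects is an assumption, and `parse_flo_bytes` is a hypothetical helper that returns nested lists instead of an ndarray.

```python
import struct


def parse_flo_bytes(content):
    """Parse .flo bytes into a height x width grid of (dx, dy) tuples."""
    if content[:4] != b"PIEH":
        raise ValueError("invalid .flo header, expected b'PIEH'")
    width, height = struct.unpack_from("<ii", content, 4)
    values = struct.unpack_from(f"<{width * height * 2}f", content, 12)
    flow = []
    for row in range(height):
        start = row * width * 2
        flow.append([(values[start + 2 * col], values[start + 2 * col + 1])
                     for col in range(width)])
    return flow


# Build a 1 x 2 flow field by hand and read it back.
payload = (b"PIEH" + struct.pack("<ii", 2, 1)
           + struct.pack("<4f", 1.0, -0.5, 0.25, 2.0))
flow = parse_flo_bytes(payload)
print(flow)  # [[(1.0, -0.5), (0.25, 2.0)]]
```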

mmcv.video.flow_warp(img, flow, filling_value=0, interpolate_mode='nearest')[源代码]

Use flow to warp img.

参数
  • img (ndarray, float or uint8) – Image to be warped.

  • flow (ndarray, float) – Optical Flow.

  • filling_value (int) – The missing pixels will be set with filling_value.

  • interpolate_mode (str) – bilinear -> Bilinear Interpolation; nearest -> Nearest Neighbor.

返回

Warped image with the same shape as img.

返回类型

ndarray

mmcv.video.flowread(flow_or_path, quantize=False, concat_axis=0, *args, **kwargs)[源代码]

Read an optical flow map.

参数
  • flow_or_path (ndarray or str) – A flow map or filepath.

  • quantize (bool) – Whether to read a quantized pair. If set to True, the remaining args will be passed to dequantize_flow().

  • concat_axis (int) – The axis that dx and dy are concatenated, can be either 0 or 1. Ignored if quantize is False.

返回

Optical flow represented as a (h, w, 2) numpy array

返回类型

ndarray

mmcv.video.flowwrite(flow, filename, quantize=False, concat_axis=0, *args, **kwargs)[源代码]

Write optical flow to file.

If the flow is not quantized, it will be saved as a .flo file losslessly, otherwise a jpeg image which is lossy but of much smaller size. (dx and dy will be concatenated horizontally into a single image if quantize is True.)

参数
  • flow (ndarray) – (h, w, 2) array of optical flow.

  • filename (str) – Output filepath.

  • quantize (bool) – Whether to quantize the flow and save it to 2 jpeg images. If set to True, remaining args will be passed to quantize_flow().

  • concat_axis (int) – The axis that dx and dy are concatenated, can be either 0 or 1. Ignored if quantize is False.

mmcv.video.frames2video(frame_dir, video_file, fps=30, fourcc='XVID', filename_tmpl='{:06d}.jpg', start=0, end=0, show_progress=True)[源代码]

Read the frame images from a directory and join them as a video.

参数
  • frame_dir (str) – The directory containing video frames.

  • video_file (str) – Output filename.

  • fps (float) – FPS of the output video.

  • fourcc (str) – Fourcc of the output video, this should be compatible with the output file type.

  • filename_tmpl (str) – Filename template with the index as the variable.

  • start (int) – Starting frame index.

  • end (int) – Ending frame index.

  • show_progress (bool) – Whether to show a progress bar.

mmcv.video.quantize_flow(flow, max_val=0.02, norm=True)[源代码]

Quantize flow to [0, 255].

After this step, the size of flow will be much smaller, and can be dumped as jpeg images.

参数
  • flow (ndarray) – (h, w, 2) array of optical flow.

  • max_val (float) – Maximum value of flow, values beyond [-max_val, max_val] will be truncated.

  • norm (bool) – Whether to divide flow values by image width/height.

返回

Quantized dx and dy.

返回类型

tuple[ndarray]

mmcv.video.resize_video(in_file, out_file, size=None, ratio=None, keep_ar=False, log_level='info', print_cmd=False)[源代码]

Resize a video.

参数
  • in_file (str) – Input video filename.

  • out_file (str) – Output video filename.

  • size (tuple) – Expected size (w, h), e.g. (320, 240) or (320, -1).

  • ratio (tuple or float) – Expected resize ratio, (2, 0.5) means (w*2, h*0.5).

  • keep_ar (bool) – Whether to keep original aspect ratio.

  • log_level (str) – Logging level of ffmpeg.

  • print_cmd (bool) – Whether to print the final ffmpeg command.

mmcv.video.sparse_flow_from_bytes(content)[源代码]

Read the optical flow in KITTI datasets from bytes.

This function is modified from the RAFT code that loads the KITTI datasets.

参数

content (bytes) – Optical flow bytes got from files or other streams.

返回

Loaded optical flow with the shape (H, W, 2) and flow valid mask with the shape (H, W).

返回类型

Tuple(ndarray, ndarray)

arraymisc

mmcv.arraymisc.dequantize(arr, min_val, max_val, levels, dtype=<class 'numpy.float64'>)[源代码]

Dequantize an array.

参数
  • arr (ndarray) – Input array.

  • min_val (scalar) – Minimum value to be clipped.

  • max_val (scalar) – Maximum value to be clipped.

  • levels (int) – Quantization levels.

  • dtype (np.type) – The type of the dequantized array.

返回

Dequantized array.

返回类型

ndarray

mmcv.arraymisc.quantize(arr, min_val, max_val, levels, dtype=<class 'numpy.int64'>)[源代码]

Quantize an array of (-inf, inf) to [0, levels-1].

参数
  • arr (ndarray) – Input array.

  • min_val (scalar) – Minimum value to be clipped.

  • max_val (scalar) – Maximum value to be clipped.

  • levels (int) – Quantization levels.

  • dtype (np.type) – The type of the quantized array.

返回

Quantized array.

返回类型

ndarray
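
The relationship between quantize and dequantize can be sketched on scalars. The formulas below are assumptions chosen to match the documented behavior: clip to [min_val, max_val], map to [0, levels-1], and recover the centre of each bin. The helper names are hypothetical.

```python
import math


def quantize_value(x, min_val, max_val, levels):
    """Map a real number to an integer bin index in [0, levels - 1]."""
    clipped = min(max(x, min_val), max_val) - min_val
    index = math.floor(levels * clipped / (max_val - min_val))
    # x == max_val would land in bin `levels`, so clamp it to the last bin.
    return min(index, levels - 1)


def dequantize_value(q, min_val, max_val, levels):
    """Map a bin index back to the centre of its bin."""
    return (q + 0.5) * (max_val - min_val) / levels + min_val


q = quantize_value(0.0, -1.0, 1.0, 4)
print(q)                                    # 2
print(dequantize_value(q, -1.0, 1.0, 4))    # 0.25
print(quantize_value(5.0, -1.0, 1.0, 4))    # 3 (clipped to max_val)
```

The round trip is lossy: 0.0 comes back as 0.25, the centre of the bin covering [0.0, 0.5).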

visualization

class mmcv.visualization.Color(value)[源代码]

An enum that defines common colors.

Contains red, green, blue, cyan, yellow, magenta, white and black.

mmcv.visualization.color_val(color)[源代码]

Convert various input to color tuples.

参数

color (Color/str/tuple/int/ndarray) – Color inputs

返回

A tuple of 3 integers indicating BGR channels.

返回类型

tuple[int]

mmcv.visualization.flow2rgb(flow, color_wheel=None, unknown_thr=1000000.0)[源代码]

Convert flow map to RGB image.

参数
  • flow (ndarray) – Array of optical flow.

  • color_wheel (ndarray or None) – Color wheel used to map flow field to RGB colorspace. Default color wheel will be used if not specified.

  • unknown_thr (float) – Values above this threshold will be marked as unknown and thus ignored.

返回

RGB image that can be visualized.

返回类型

ndarray

mmcv.visualization.flowshow(flow, win_name='', wait_time=0)[源代码]

Show optical flow.

参数
  • flow (ndarray or str) – The optical flow to be displayed.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

mmcv.visualization.imshow(img, win_name='', wait_time=0)[源代码]

Show an image.

参数
  • img (str or ndarray) – The image to be displayed.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

mmcv.visualization.imshow_bboxes(img, bboxes, colors='green', top_k=-1, thickness=1, show=True, win_name='', wait_time=0, out_file=None)[source]

Draw bboxes on an image.

参数
  • img (str or ndarray) – The image to be displayed.

  • bboxes (list or ndarray) – A list of ndarray of shape (k, 4).

  • colors (list[str or tuple or Color]) – A list of colors.

  • top_k (int) – Plot the first k bboxes only if set positive.

  • thickness (int) – Thickness of lines.

  • show (bool) – Whether to show the image.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

  • out_file (str, optional) – The filename to write the image.

返回

The image with bboxes drawn on it.

返回类型

ndarray

mmcv.visualization.imshow_det_bboxes(img, bboxes, labels, class_names=None, score_thr=0, bbox_color='green', text_color='green', thickness=1, font_scale=0.5, show=True, win_name='', wait_time=0, out_file=None)[源代码]

Draw bboxes and class labels (with scores) on an image.

参数
  • img (str or ndarray) – The image to be displayed.

  • bboxes (ndarray) – Bounding boxes (with scores), shaped (n, 4) or (n, 5).

  • labels (ndarray) – Labels of bboxes.

  • class_names (list[str]) – Name of each class.

  • score_thr (float) – Minimum score of bboxes to be shown.

  • bbox_color (str or tuple or Color) – Color of bbox lines.

  • text_color (str or tuple or Color) – Color of texts.

  • thickness (int) – Thickness of lines.

  • font_scale (float) – Font scales of texts.

  • show (bool) – Whether to show the image.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

  • out_file (str or None) – The filename to write the image.

返回

The image with bboxes drawn on it.

返回类型

ndarray

mmcv.visualization.make_color_wheel(bins=None)[源代码]

Build a color wheel.

参数

bins (list or tuple, optional) – Specify the number of bins for each color range, corresponding to six ranges: red -> yellow, yellow -> green, green -> cyan, cyan -> blue, blue -> magenta, magenta -> red. [15, 6, 4, 11, 13, 6] is used for default (see Middlebury).

返回

Color wheel of shape (total_bins, 3).

返回类型

ndarray
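
The six color ranges and their bin counts can be sketched in plain Python. This is an assumption-laden illustration: it interpolates linearly within each range, uses floats in [0, 1], and returns a list of RGB tuples instead of an ndarray. `make_color_wheel_sketch` is a hypothetical name.

```python
def make_color_wheel_sketch(bins=(15, 6, 4, 11, 13, 6)):
    """Build a list of (r, g, b) tuples, one per bin, around the hue circle."""
    ry, yg, gc, cb, bm, mr = bins
    wheel = []
    wheel += [(1.0, i / ry, 0.0) for i in range(ry)]        # red -> yellow
    wheel += [(1.0 - i / yg, 1.0, 0.0) for i in range(yg)]  # yellow -> green
    wheel += [(0.0, 1.0, i / gc) for i in range(gc)]        # green -> cyan
    wheel += [(0.0, 1.0 - i / cb, 1.0) for i in range(cb)]  # cyan -> blue
    wheel += [(i / bm, 0.0, 1.0) for i in range(bm)]        # blue -> magenta
    wheel += [(1.0, 0.0, 1.0 - i / mr) for i in range(mr)]  # magenta -> red
    return wheel


wheel = make_color_wheel_sketch()
print(len(wheel))  # 55, i.e. total_bins for the default bin counts
print(wheel[0])    # (1.0, 0.0, 0.0), pure red
```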

utils

class mmcv.utils.BuildExtension(*args, **kwargs)[源代码]

A custom setuptools build extension.

This setuptools.build_ext subclass takes care of passing the minimum required compiler flags (e.g. -std=c++14) as well as mixed C++/CUDA compilation (and support for CUDA files in general).

When using BuildExtension, it is allowed to supply a dictionary for extra_compile_args (rather than the usual list) that maps from languages (cxx or nvcc) to a list of additional compiler flags to supply to the compiler. This makes it possible to supply different flags to the C++ and CUDA compiler during mixed compilation.

use_ninja (bool): If use_ninja is True (default), then we attempt to build using the Ninja backend. Ninja greatly speeds up compilation compared to the standard setuptools.build_ext. Falls back to the standard distutils backend if Ninja is not available.

Note

By default, the Ninja backend uses #CPUS + 2 workers to build the extension. This may use up too many resources on some systems. One can control the number of workers by setting the MAX_JOBS environment variable to a non-negative number.

finalize_options()None[源代码]

Set final values for all the options that this command supports. This is always called as late as possible, ie. after any option assignments from the command-line or from other commands have been done. Thus, this is the place to code option dependencies: if ‘foo’ depends on ‘bar’, then it is safe to set ‘foo’ from ‘bar’ as long as ‘foo’ still has the same value it was assigned in ‘initialize_options()’.

This method must be implemented by all command classes.

get_ext_filename(ext_name)[源代码]

Convert the name of an extension (eg. “foo.bar”) into the name of the file from which it will be loaded (eg. “foo/bar.so”, or “foobar.pyd”).

classmethod with_options(**options)[源代码]

Returns a subclass with alternative constructor that extends any original keyword arguments to the original constructor with the given options.

mmcv.utils.CUDAExtension(name, sources, *args, **kwargs)[源代码]

Creates a setuptools.Extension for CUDA/C++.

Convenience method that creates a setuptools.Extension with the bare minimum (but often sufficient) arguments to build a CUDA/C++ extension. This includes the CUDA include path, library path and runtime library.

All arguments are forwarded to the setuptools.Extension constructor.

Example

>>> from setuptools import setup
>>> from torch.utils.cpp_extension import BuildExtension, CUDAExtension
>>> setup(
...     name='cuda_extension',
...     ext_modules=[
...         CUDAExtension(
...             name='cuda_extension',
...             sources=['extension.cpp', 'extension_kernel.cu'],
...             extra_compile_args={'cxx': ['-g'],
...                                 'nvcc': ['-O2']})
...     ],
...     cmdclass={
...         'build_ext': BuildExtension
...     })

Compute capabilities:

By default the extension will be compiled to run on all archs of the cards visible during the building process of the extension, plus PTX. If down the road a new card is installed the extension may need to be recompiled. If a visible card has a compute capability (CC) that’s newer than the newest version for which your nvcc can build fully-compiled binaries, PyTorch will make nvcc fall back to building kernels with the newest version of PTX your nvcc does support (see below for details on PTX).

You can override the default behavior using TORCH_CUDA_ARCH_LIST to explicitly specify which CCs you want the extension to support:

TORCH_CUDA_ARCH_LIST="6.1 8.6" python build_my_extension.py
TORCH_CUDA_ARCH_LIST="5.2 6.0 6.1 7.0 7.5 8.0 8.6+PTX" python build_my_extension.py

The +PTX option causes extension kernel binaries to include PTX instructions for the specified CC. PTX is an intermediate representation that allows kernels to runtime-compile for any CC >= the specified CC (for example, 8.6+PTX generates PTX that can runtime-compile for any GPU with CC >= 8.6). This improves your binary’s forward compatibility. However, relying on older PTX to provide forward compat by runtime-compiling for newer CCs can modestly reduce performance on those newer CCs. If you know exact CC(s) of the GPUs you want to target, you’re always better off specifying them individually. For example, if you want your extension to run on 8.0 and 8.6, “8.0+PTX” would work functionally because it includes PTX that can runtime-compile for 8.6, but “8.0 8.6” would be better.

Note that while it’s possible to include all supported archs, the more archs get included the slower the building process will be, as it will build a separate kernel image for each arch.

class mmcv.utils.Config(cfg_dict=None, cfg_text=None, filename=None)[源代码]

A facility for config and config files.

It supports common file formats as configs: python/json/yaml. The interface is the same as a dict object, and config values can also be accessed as attributes.

示例

>>> cfg = Config(dict(a=1, b=dict(b1=[0, 1])))
>>> cfg.a
1
>>> cfg.b
{'b1': [0, 1]}
>>> cfg.b.b1
[0, 1]
>>> cfg = Config.fromfile('tests/data/config/a.py')
>>> cfg.filename
"/home/kchen/projects/mmcv/tests/data/config/a.py"
>>> cfg.item4
'test'
>>> cfg
"Config [path: /home/kchen/projects/mmcv/tests/data/config/a.py]: "
"{'item1': [1, 2], 'item2': {'a': 0}, 'item3': True, 'item4': 'test'}"
static auto_argparser(description=None)[源代码]

Generate argparser from config file automatically (experimental)

static fromstring(cfg_str, file_format)[源代码]

Generate config from config str.

参数
  • cfg_str (str) – Config str.

  • file_format (str) – Config file format corresponding to the config str. Only py/yml/yaml/json type are supported now!

返回

Config obj.

返回类型

Config

merge_from_dict(options, allow_list_keys=True)[源代码]

Merge list into cfg_dict.

Merge the dict parsed by MultipleKVAction into this cfg.

Examples

>>> options = {'model.backbone.depth': 50,
...            'model.backbone.with_cp': True}
>>> cfg = Config(dict(model=dict(backbone=dict(type='ResNet'))))
>>> cfg.merge_from_dict(options)
>>> cfg_dict = super(Config, cfg).__getattribute__('_cfg_dict')
>>> assert cfg_dict == dict(
...     model=dict(backbone=dict(type='ResNet', depth=50, with_cp=True)))
>>> # Merge list element
>>> cfg = Config(dict(pipeline=[
...     dict(type='LoadImage'), dict(type='LoadAnnotations')]))
>>> options = dict(pipeline={'0': dict(type='SelfLoadImage')})
>>> cfg.merge_from_dict(options, allow_list_keys=True)
>>> cfg_dict = super(Config, cfg).__getattribute__('_cfg_dict')
>>> assert cfg_dict == dict(pipeline=[
...     dict(type='SelfLoadImage'), dict(type='LoadAnnotations')])

Parameters
  • options (dict) – dict of configs to merge from.

  • allow_list_keys (bool) – If True, int string keys (e.g. ‘0’, ‘1’) are allowed in options and will replace the element of the corresponding index in the config if the config is a list. Default: True.
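
The merge semantics can be sketched without mmcv in two steps: expand dotted keys into nested dicts, then merge recursively, treating integer-like string keys as list indices. The helpers below are hypothetical and omit the allow_list_keys switch and error handling.

```python
def expand_dotted_keys(options):
    """Turn {'a.b.c': 1} into {'a': {'b': {'c': 1}}}."""
    nested = {}
    for full_key, value in options.items():
        *parents, leaf = full_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def merge_into(update, base):
    """Return a copy of `base` (dict or list) with `update` merged in."""
    if isinstance(base, list):
        merged = list(base)
        for key, value in update.items():
            index = int(key)  # '0' addresses element 0 of the list
            merged[index] = _merge_value(value, merged[index])
        return merged
    merged = dict(base)
    for key, value in update.items():
        merged[key] = _merge_value(value, merged[key]) if key in merged else value
    return merged


def _merge_value(value, existing):
    # Recurse only when a dict update meets an existing container.
    if isinstance(value, dict) and isinstance(existing, (dict, list)):
        return merge_into(value, existing)
    return value


base = {"model": {"backbone": {"type": "ResNet"}},
        "pipeline": [{"type": "LoadImage"}, {"type": "LoadAnnotations"}]}
options = {"model.backbone.depth": 50,
           "pipeline": {"0": {"type": "SelfLoadImage"}}}
result = merge_into(expand_dotted_keys(options), base)
print(result["model"]["backbone"])  # {'type': 'ResNet', 'depth': 50}
print(result["pipeline"][0])        # {'type': 'SelfLoadImage'}
```

Existing keys that the options do not mention, such as type in the backbone, survive the merge.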

class mmcv.utils.ConfigDict(*args, **kwargs)[源代码]
mmcv.utils.CppExtension(name, sources, *args, **kwargs)[源代码]

Creates a setuptools.Extension for C++.

Convenience method that creates a setuptools.Extension with the bare minimum (but often sufficient) arguments to build a C++ extension.

All arguments are forwarded to the setuptools.Extension constructor.

Example

>>> from setuptools import setup
>>> from torch.utils.cpp_extension import BuildExtension, CppExtension
>>> setup(
...     name='extension',
...     ext_modules=[
...         CppExtension(
...             name='extension',
...             sources=['extension.cpp'],
...             extra_compile_args=['-g']),
...     ],
...     cmdclass={
...         'build_ext': BuildExtension
...     })
class mmcv.utils.DataLoader(dataset: torch.utils.data.dataset.Dataset[torch.utils.data.dataloader.T_co], batch_size: Optional[int] = 1, shuffle: bool = False, sampler: Optional[torch.utils.data.sampler.Sampler] = None, batch_sampler: Optional[torch.utils.data.sampler.Sampler[Sequence]] = None, num_workers: int = 0, collate_fn: Optional[Callable[[List[torch.utils.data.dataloader.T]], Any]] = None, pin_memory: bool = False, drop_last: bool = False, timeout: float = 0, worker_init_fn: Optional[Callable[[int], None]] = None, multiprocessing_context=None, generator=None, *, prefetch_factor: int = 2, persistent_workers: bool = False)[源代码]

Data loader. Combines a dataset and a sampler, and provides an iterable over the given dataset.

The DataLoader supports both map-style and iterable-style datasets with single- or multi-process loading, customizing loading order and optional automatic batching (collation) and memory pinning.

See torch.utils.data documentation page for more details.

参数
  • dataset (Dataset) – dataset from which to load the data.

  • batch_size (int, optional) – how many samples per batch to load (default: 1).

  • shuffle (bool, optional) – set to True to have the data reshuffled at every epoch (default: False).

  • sampler (Sampler or Iterable, optional) – defines the strategy to draw samples from the dataset. Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.

  • batch_sampler (Sampler or Iterable, optional) – like sampler, but returns a batch of indices at a time. Mutually exclusive with batch_size, shuffle, sampler, and drop_last.

  • num_workers (int, optional) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)

  • collate_fn (callable, optional) – merges a list of samples to form a mini-batch of Tensor(s). Used when using batched loading from a map-style dataset.

  • pin_memory (bool, optional) – If True, the data loader will copy Tensors into CUDA pinned memory before returning them. If your data elements are a custom type, or your collate_fn returns a batch that is a custom type, see the example below.

  • drop_last (bool, optional) – set to True to drop the last incomplete batch, if the dataset size is not divisible by the batch size. If False and the size of dataset is not divisible by the batch size, then the last batch will be smaller. (default: False)

  • timeout (numeric, optional) – if positive, the timeout value for collecting a batch from workers. Should always be non-negative. (default: 0)

  • worker_init_fn (callable, optional) – If not None, this will be called on each worker subprocess with the worker id (an int in [0, num_workers - 1]) as input, after seeding and before data loading. (default: None)

  • generator (torch.Generator, optional) – If not None, this RNG will be used by RandomSampler to generate random indexes and multiprocessing to generate base_seed for workers. (default: None)

  • prefetch_factor (int, optional, keyword-only arg) – Number of samples loaded in advance by each worker. 2 means there will be a total of 2 * num_workers samples prefetched across all workers. (default: 2)

  • persistent_workers (bool, optional) – If True, the data loader will not shut down the worker processes after a dataset has been consumed once. This keeps the workers’ Dataset instances alive. (default: False)

Warning

If the spawn start method is used, worker_init_fn cannot be an unpicklable object, e.g., a lambda function. See multiprocessing-best-practices for more details related to multiprocessing in PyTorch.

Warning

len(dataloader) heuristic is based on the length of the sampler used. When dataset is an IterableDataset, it instead returns an estimate based on len(dataset) / batch_size, with proper rounding depending on drop_last, regardless of multi-process loading configurations. This represents the best guess PyTorch can make because PyTorch trusts user dataset code in correctly handling multi-process loading to avoid duplicate data.

However, if sharding results in multiple workers having incomplete last batches, this estimate can still be inaccurate, because (1) an otherwise complete batch can be broken into multiple ones and (2) more than one batch worth of samples can be dropped when drop_last is set. Unfortunately, PyTorch can not detect such cases in general.

See Dataset Types for more details on these two types of datasets and how IterableDataset interacts with Multi-process data loading.

Warning

See reproducibility, and dataloader-workers-random-seed, and data-loading-randomness notes for random seed related questions.

class mmcv.utils.DictAction(option_strings, dest, nargs=None, const=None, default=None, type=None, choices=None, required=False, help=None, metavar=None)[源代码]

argparse action to split an argument into KEY=VALUE form on the first = and append to a dictionary. List options can be passed as comma-separated values, i.e. ‘KEY=V1,V2,V3’, or with explicit brackets, i.e. ‘KEY=[V1,V2,V3]’. It also supports nested brackets to build list/tuple values, e.g. ‘KEY=[(V1,V2),(V3,V4)]’.
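
A simplified version of such an action can be written with the standard argparse module. The sketch below handles only KEY=VALUE and comma-separated lists, not brackets or nesting. `KeyValueAction` and the `--cfg-options` flag name are hypothetical.

```python
import argparse


class KeyValueAction(argparse.Action):
    """Collect KEY=VALUE arguments into a dict (simplified sketch)."""

    @staticmethod
    def _parse_scalar(text):
        # Try numeric types first, then booleans, else keep the string.
        for cast in (int, float):
            try:
                return cast(text)
            except ValueError:
                pass
        if text.lower() in ("true", "false"):
            return text.lower() == "true"
        return text

    def __call__(self, parser, namespace, values, option_string=None):
        options = {}
        for item in values:
            key, raw = item.split("=", maxsplit=1)  # split on the first '='
            parts = [self._parse_scalar(p) for p in raw.split(",")]
            options[key] = parts if len(parts) > 1 else parts[0]
        setattr(namespace, self.dest, options)


parser = argparse.ArgumentParser()
parser.add_argument("--cfg-options", nargs="+", action=KeyValueAction)
args = parser.parse_args(
    ["--cfg-options", "model.depth=50", "data.scales=0.5,1.0", "name=a=b"])
print(args.cfg_options)
# {'model.depth': 50, 'data.scales': [0.5, 1.0], 'name': 'a=b'}
```

Splitting on the first = only is what lets a value such as a=b contain further equals signs.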

mmcv.utils.PoolDataLoader

alias of torch.utils.data.dataloader.DataLoader

class mmcv.utils.ProgressBar(task_num=0, bar_width=50, start=True, file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[源代码]

A progress bar which can print the progress.

class mmcv.utils.Registry(name, build_func=None, parent=None, scope=None)[源代码]

A registry to map strings to classes.

Registered object could be built from registry.

Example

>>> MODELS = Registry('models')
>>> @MODELS.register_module()
... class ResNet:
...     pass
>>> resnet = MODELS.build(dict(type='ResNet'))

Please refer to https://mmcv.readthedocs.io/en/latest/understand_mmcv/registry.html for advanced usage.

Parameters
  • name (str) – Registry name.

  • build_func (func, optional) – Build function to construct an instance from the Registry. build_from_cfg() is used if neither parent nor build_func is specified. If parent is specified and build_func is not given, build_func will be inherited from parent. Default: None.

  • parent (Registry, optional) – Parent registry. The class registered in children registry could be built from parent. Default: None.

  • scope (str, optional) – The scope of registry. It is the key to search for children registry. If not specified, scope will be the name of the package where class is defined, e.g. mmdet, mmcls, mmseg. Default: None.
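
To make the string-to-class mapping concrete, here is a minimal pure-Python sketch. MiniRegistry is a hypothetical, heavily simplified stand-in (no parent, scope, or build_func handling) and is not mmcv's implementation; the depth argument of ResNet is also invented for the illustration.

```python
class MiniRegistry:
    """Hypothetical, simplified registry that maps names to classes."""

    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, name=None, force=False):
        def decorator(cls):
            key = name or cls.__name__
            # Refuse to overwrite an existing entry unless force=True.
            if key in self._module_dict and not force:
                raise KeyError(f"{key} is already registered in {self.name}")
            self._module_dict[key] = cls
            return cls
        return decorator

    def get(self, key):
        return self._module_dict.get(key)

    def build(self, cfg):
        args = dict(cfg)              # copy so the caller's dict is untouched
        cls = self.get(args.pop("type"))
        return cls(**args)


MODELS = MiniRegistry("models")


@MODELS.register_module()
class ResNet:
    def __init__(self, depth=50):
        self.depth = depth


resnet = MODELS.build(dict(type="ResNet", depth=18))
print(type(resnet).__name__, resnet.depth)
```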

get(key)[source]

Get the registry record.

Parameters

key (str) – The class name in string format.

Returns

The corresponding class.

Return type

class

static infer_scope()[source]

Infer the scope of registry.

The name of the package where registry is defined will be returned.

Example

>>> # in mmdet/models/backbone/resnet.py
>>> MODELS = Registry('models')
>>> @MODELS.register_module()
... class ResNet:
...     pass

The scope of ResNet will be mmdet.

Returns

The inferred scope name.

Return type

str

register_module(name=None, force=False, module=None)[source]

Register a module.

A record will be added to self._module_dict, whose key is the class name or the specified name, and value is the class itself. It can be used as a decorator or a normal function.

Example

>>> backbones = Registry('backbone')
>>> @backbones.register_module()
... class ResNet:
...     pass
>>> backbones = Registry('backbone')
>>> @backbones.register_module(name='mnet')
... class MobileNet:
...     pass
>>> backbones = Registry('backbone')
>>> class ResNet:
...     pass
>>> backbones.register_module(module=ResNet)

Parameters
  • name (str | None) – The module name to be registered. If not specified, the class name will be used.

  • force (bool, optional) – Whether to override an existing class with the same name. Default: False.

  • module (type) – Module class to be registered.

static split_scope_key(key)[source]

Split scope and key.

The first scope will be split from key.

Examples

>>> Registry.split_scope_key('mmdet.ResNet')
('mmdet', 'ResNet')
>>> Registry.split_scope_key('ResNet')
(None, 'ResNet')

Returns

The former element is the first scope of the key, which can be None. The latter is the remaining key.

Return type

tuple[str | None, str]
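
The splitting rule can be sketched as a standalone function. This is an illustrative reimplementation written for this page, not the library source; it assumes that only the first dot separates the scope from the rest of the key.

```python
def split_scope_key(key):
    """Sketch: split 'scope.name' at the first dot."""
    scope, sep, rest = key.partition(".")
    if sep:
        # A dot was found: everything before it is the first scope.
        return scope, rest
    # No dot: there is no scope, and the whole string is the key.
    return None, key


print(split_scope_key("mmdet.ResNet"))
print(split_scope_key("ResNet"))
print(split_scope_key("mmdet.models.ResNet"))
```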

class mmcv.utils.SyncBatchNorm(num_features: int, eps: float = 1e-05, momentum: float = 0.1, affine: bool = True, track_running_stats: bool = True, process_group: Optional[Any] = None, device=None, dtype=None)[source]

class mmcv.utils.Timer(start=True, print_tmpl=None)[source]

A flexible Timer class.

Examples

>>> import time
>>> import mmcv
>>> with mmcv.Timer():
...     # simulate a code block that will run for 1s
...     time.sleep(1)
1.000
>>> with mmcv.Timer(print_tmpl='it takes {:.1f} seconds'):
...     # simulate a code block that will run for 1s
...     time.sleep(1)
it takes 1.0 seconds
>>> timer = mmcv.Timer()
>>> time.sleep(0.5)
>>> print(timer.since_start())
0.500
>>> time.sleep(0.5)
>>> print(timer.since_last_check())
0.500
>>> print(timer.since_start())
1.000

property is_running

Indicates whether the timer is running.

Type

bool

since_last_check()[source]

Time since the last check.

Both since_start() and since_last_check() count as checking operations.

Returns

Time in seconds.

Return type

float

since_start()[source]

Total time since the timer was started.

Returns

Time in seconds.

Return type

float

start()[source]

Start the timer.

exception mmcv.utils.TimerError(message)[source]

mmcv.utils.assert_attrs_equal(obj: Any, expected_attrs: Dict[str, Any]) → bool[source]

Check whether the attributes of a class object are correct.

Parameters
  • obj (object) – Class object to be checked.

  • expected_attrs (Dict[str, Any]) – Dict of the expected attrs.

Returns

Whether the attributes of the class object are correct.

Return type

bool

mmcv.utils.assert_dict_contains_subset(dict_obj: Dict[Any, Any], expected_subset: Dict[Any, Any]) → bool[source]

Check if the dict_obj contains the expected_subset.

Parameters
  • dict_obj (Dict[Any, Any]) – Dict object to be checked.

  • expected_subset (Dict[Any, Any]) – Subset expected to be contained in dict_obj.

Returns

Whether the dict_obj contains the expected_subset.

Return type

bool

mmcv.utils.assert_dict_has_keys(obj: Dict[str, Any], expected_keys: List[str]) → bool[source]

Check if the obj has all the expected_keys.

Parameters
  • obj (Dict[str, Any]) – Object to be checked.

  • expected_keys (List[str]) – Keys expected to be contained in the keys of the obj.

Returns

Whether the obj has the expected keys.

Return type

bool

mmcv.utils.assert_is_norm_layer(module) → bool[source]

Check if the module is a norm layer.

Parameters

module (nn.Module) – The module to be checked.

Returns

Whether the module is a norm layer.

Return type

bool

mmcv.utils.assert_keys_equal(result_keys: List[str], target_keys: List[str]) → bool[source]

Check if target_keys is equal to result_keys.

Parameters
  • result_keys (List[str]) – Result keys to be checked.

  • target_keys (List[str]) – Target keys to be checked.

Returns

Whether target_keys is equal to result_keys.

Return type

bool

mmcv.utils.assert_params_all_zeros(module) → bool[source]

Check if the parameters of the module are all zeros.

Parameters

module (nn.Module) – The module to be checked.

Returns

Whether the parameters of the module are all zeros.

Return type

bool

mmcv.utils.build_from_cfg(cfg, registry, default_args=None)[source]

Build a module from config dict.

Parameters
  • cfg (dict) – Config dict. It should at least contain the key “type”.

  • registry (Registry) – The registry to search the type from.

  • default_args (dict, optional) – Default initialization arguments.

Returns

The constructed object.

Return type

object
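
A rough sketch of the build step follows, with a plain dict standing in for a Registry. The Conv class and the LAYERS mapping are hypothetical, and the rule that values in cfg take precedence over default_args is an assumption of this sketch rather than something stated above.

```python
def build_from_cfg(cfg, registry, default_args=None):
    """Sketch: look up cfg['type'] in a name -> class mapping and call it."""
    if "type" not in cfg:
        raise KeyError('cfg must contain the key "type"')
    args = dict(cfg)  # copy so the caller's dict is not modified
    if default_args is not None:
        for name, value in default_args.items():
            # Assumed rule: values already present in cfg win over defaults.
            args.setdefault(name, value)
    obj_cls = registry[args.pop("type")]
    return obj_cls(**args)


class Conv:  # hypothetical class used only for this illustration
    def __init__(self, channels, kernel_size=3):
        self.channels = channels
        self.kernel_size = kernel_size


LAYERS = {"Conv": Conv}  # a plain dict stands in for a Registry
layer = build_from_cfg(dict(type="Conv", channels=8), LAYERS,
                       default_args=dict(kernel_size=5, channels=1))
print(layer.channels, layer.kernel_size)
```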

mmcv.utils.check_prerequisites(prerequisites, checker, msg_tmpl='Prerequisites "{}" are required in method "{}" but not found, please install them first.')[source]

A decorator factory to check if prerequisites are satisfied.

Parameters
  • prerequisites (str or list[str]) – Prerequisites to be checked.

  • checker (callable) – The checker method that returns True if a prerequisite is met, False otherwise.

  • msg_tmpl (str) – The message template with two variables.

Returns

A specific decorator.

Return type

decorator
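
The decorator-factory pattern can be sketched with the standard library alone. This is a simplified illustration, not mmcv's code: the package_available checker, the dump and needs_missing functions, the package name no_such_package_xyz, and the choice to raise RuntimeError with the formatted message are all assumptions of the sketch.

```python
import functools
import importlib.util


def check_prerequisites(
        prerequisites, checker,
        msg_tmpl='Prerequisites "{}" are required in method "{}" but not '
                 'found, please install them first.'):
    """Sketch of a decorator factory that checks prerequisites at call time."""
    if isinstance(prerequisites, str):
        prerequisites = [prerequisites]

    def wrap(func):
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            missing = [p for p in prerequisites if not checker(p)]
            if missing:
                raise RuntimeError(
                    msg_tmpl.format(", ".join(missing), func.__name__))
            return func(*args, **kwargs)
        return wrapped
    return wrap


def package_available(name):
    # Hypothetical checker: True if the top-level package can be found.
    return importlib.util.find_spec(name) is not None


@check_prerequisites("json", package_available)
def dump(obj):
    import json
    return json.dumps(obj)


@check_prerequisites("no_such_package_xyz", package_available)
def needs_missing():
    return "unreachable"


print(dump([1, 2]))
try:
    needs_missing()
except RuntimeError as err:
    error_message = str(err)
    print(error_message)
```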

mmcv.utils.check_python_script(cmd)[source]

Run the python cmd script with __main__. Unlike os.system, this function executes code in the current process, so that it can be tracked by coverage tools. Currently it supports two forms:

  • ./tests/data/scripts/hello.py zz

  • python tests/data/scripts/hello.py zz

mmcv.utils.check_time(timer_id)[source]

Add check points in a single line.

This method is suitable for running a task on a list of items. A timer will be registered when the method is called for the first time.

Examples

>>> import time
>>> import mmcv
>>> for i in range(1, 6):
...     # simulate a code block
...     time.sleep(i)
...     mmcv.check_time('task1')
2.000
3.000
4.000
5.000

Parameters

timer_id (str) – Timer identifier.

mmcv.utils.collect_env()[source]

Collect information about the running environment.

Returns

The environment information. The following fields are contained.

  • sys.platform: The variable of sys.platform.

  • Python: Python version.

  • CUDA available: Bool, indicating if CUDA is available.

  • GPU devices: Device type of each GPU.

  • CUDA_HOME (optional): The env var CUDA_HOME.

  • NVCC (optional): NVCC version.

  • GCC: GCC version, “n/a” if GCC is not installed.

  • PyTorch: PyTorch version.

  • PyTorch compiling details: The output of torch.__config__.show().

  • TorchVision (optional): TorchVision version.

  • OpenCV: OpenCV version.

  • MMCV: MMCV version.

  • MMCV Compiler: The GCC version for compiling MMCV ops.

  • MMCV CUDA Compiler: The CUDA version for compiling MMCV ops.

Return type

dict

mmcv.utils.concat_list(in_list)[source]

Concatenate a list of lists into a single list.

Parameters

in_list (list) – The list of lists to be merged.

Returns

The concatenated flat list.

Return type

list
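
As a quick illustration of the flattening behaviour described above, the following standalone sketch uses itertools.chain; it is written for this page and only flattens one level of nesting.

```python
import itertools


def concat_list(in_list):
    """Sketch: flatten one level of nesting."""
    return list(itertools.chain(*in_list))


print(concat_list([[1, 2], [3], []]))
```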

mmcv.utils.deprecated_api_warning(name_dict, cls_name=None)[source]

A decorator to check whether some arguments are deprecated and, if so, replace the deprecated src_arg_name with dst_arg_name.

Parameters

name_dict (dict) – key (str): Deprecated argument names. val (str): Expected argument names.

Returns

New function.

Return type

func
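
The renaming idea can be sketched as follows. rename_deprecated_kwargs, resize, and the img_scale/scale argument names are hypothetical; this simplified version only handles keyword arguments and is not the mmcv implementation.

```python
import functools
import warnings


def rename_deprecated_kwargs(name_dict):
    """Sketch: map deprecated keyword names to their replacements."""

    def wrapper(func):
        @functools.wraps(func)
        def new_func(*args, **kwargs):
            for old, new in name_dict.items():
                if old in kwargs:
                    warnings.warn(
                        f'"{old}" is deprecated in {func.__name__}, '
                        f'please use "{new}" instead', DeprecationWarning)
                    # Forward the value under the new name.
                    kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return new_func
    return wrapper


@rename_deprecated_kwargs({"img_scale": "scale"})
def resize(scale=1.0):
    return scale


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = resize(img_scale=2.0)  # the old name still works
print(result)
```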

mmcv.utils.digit_version(version_str: str, length: int = 4)[source]

Convert a version string into a tuple of integers.

This method is usually used for comparing two versions. For pre-release versions: alpha < beta < rc.

Parameters
  • version_str (str) – The version string.

  • length (int) – The maximum number of version levels. Default: 4.

Returns

The version info in digits (integers).

Return type

tuple[int]

mmcv.utils.get_git_hash(fallback='unknown', digits=None)[source]

Get the git hash of the current repo.

Parameters
  • fallback (str, optional) – The fallback string when git hash is unavailable. Defaults to ‘unknown’.

  • digits (int, optional) – Number of hash digits to keep. Defaults to None, meaning all digits are kept.

Returns

Git commit hash.

Return type

str

mmcv.utils.get_logger(name, log_file=None, log_level=20, file_mode='w')[source]

Initialize and get a logger by name.

If the logger has not been initialized, this method will initialize the logger by adding one or two handlers; otherwise the initialized logger will be directly returned. During initialization, a StreamHandler will always be added. If log_file is specified and the process rank is 0, a FileHandler will also be added.

Parameters
  • name (str) – Logger name.

  • log_file (str | None) – The log filename. If specified, a FileHandler will be added to the logger.

  • log_level (int) – The logger level. Note that only the process of rank 0 is affected; other processes will set the level to “Error” and thus be silent most of the time.

  • file_mode (str) – The file mode used in opening log file. Defaults to ‘w’.

Returns

The expected logger.

Return type

logging.Logger

mmcv.utils.has_method(obj: object, method: str) → bool[source]

Check whether the object has a method.

Parameters
  • method (str) – The method name to check.

  • obj (object) – The object to check.

Returns

True if the object has the method else False.

Return type

bool

mmcv.utils.import_modules_from_strings(imports, allow_failed_imports=False)[source]

Import modules from the given list of strings.

Parameters
  • imports (list | str | None) – The given module names to be imported.

  • allow_failed_imports (bool) – If True, the failed imports will return None. Otherwise, an ImportError is raised. Default: False.

Returns

The imported modules.

Return type

list[module] | module | None

Examples

>>> osp, sys = import_modules_from_strings(
...     ['os.path', 'sys'])
>>> import os.path as osp_
>>> import sys as sys_
>>> assert osp == osp_
>>> assert sys == sys_

mmcv.utils.is_list_of(seq, expected_type)[source]

Check whether it is a list of some type.

A partial method of is_seq_of().

mmcv.utils.is_method_overridden(method, base_class, derived_class)[source]

Check if a method of base class is overridden in derived class.

Parameters
  • method (str) – The method name to check.

  • base_class (type) – The base class.

  • derived_class (type | Any) – The derived class or an instance of it.

mmcv.utils.is_seq_of(seq, expected_type, seq_type=None)[source]

Check whether it is a sequence of some type.

Parameters
  • seq (Sequence) – The sequence to be checked.

  • expected_type (type) – Expected type of sequence items.

  • seq_type (type, optional) – Expected sequence type.

Returns

Whether the sequence is valid.

Return type

bool
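
One way to read the check is sketched below. It assumes that a seq_type of None means any collections.abc.Sequence is accepted; that default is an assumption of this sketch, not something the description above states.

```python
from collections.abc import Sequence


def is_seq_of(seq, expected_type, seq_type=None):
    """Sketch: check the container type, then every item's type."""
    exp_seq_type = Sequence if seq_type is None else seq_type
    if not isinstance(seq, exp_seq_type):
        return False
    return all(isinstance(item, expected_type) for item in seq)


print(is_seq_of([1, 2, 3], int))
print(is_seq_of((1, 2), int, seq_type=list))
print(is_seq_of([1, "a"], int))
```

Under the same assumption, is_list_of and is_tuple_of would correspond to fixing seq_type to list and tuple respectively.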

mmcv.utils.is_str(x)[source]

Whether the input is a string instance.

Note: This method is deprecated since python 2 is no longer supported.

mmcv.utils.is_tuple_of(seq, expected_type)[source]

Check whether it is a tuple of some type.

A partial method of is_seq_of().

mmcv.utils.iter_cast(inputs, dst_type, return_type=None)[source]

Cast elements of an iterable object into some type.

Parameters
  • inputs (Iterable) – The input object.

  • dst_type (type) – Destination type.

  • return_type (type, optional) – If specified, the output object will be converted to this type; otherwise an iterator is returned.

Returns

The converted object.

Return type

iterator or specified type
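
The casting behaviour can be sketched with the built-in map. This is an illustrative standalone version written for this page, not the library source.

```python
def iter_cast(inputs, dst_type, return_type=None):
    """Sketch: lazily cast each element, optionally collecting the result."""
    out_iterable = map(dst_type, inputs)
    if return_type is None:
        return out_iterable           # a lazy iterator
    return return_type(out_iterable)  # e.g. list or tuple


print(iter_cast(["1", "2"], int, return_type=list))
print(iter_cast([1.5, 2.5], int, return_type=tuple))
```

In this reading, list_cast and tuple_cast amount to fixing return_type to list and tuple.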

mmcv.utils.list_cast(inputs, dst_type)[source]

Cast elements of an iterable object into a list of some type.

A partial method of iter_cast().

mmcv.utils.load_url(url, model_dir=None, map_location=None, progress=True, check_hash=False, file_name=None)

Loads the Torch serialized object at the given URL.

If downloaded file is a zip file, it will be automatically decompressed.

If the object is already present in model_dir, it’s deserialized and returned. The default value of model_dir is <hub_dir>/checkpoints where hub_dir is the directory returned by get_dir().

Parameters
  • url (string) – URL of the object to download

  • model_dir (string, optional) – directory in which to save the object

  • map_location (optional) – a function or a dict specifying how to remap storage locations (see torch.load)

  • progress (bool, optional) – whether or not to display a progress bar to stderr. Default: True

  • check_hash (bool, optional) – If True, the filename part of the URL should follow the naming convention filename-<sha256>.ext where <sha256> is the first eight or more digits of the SHA256 hash of the contents of the file. The hash is used to ensure unique names and to verify the contents of the file. Default: False

  • file_name (string, optional) – name for the downloaded file. Filename from url will be used if not set.

Example

>>> state_dict = torch.hub.load_state_dict_from_url('https://s3.amazonaws.com/pytorch/models/resnet18-5c106cde.pth')

mmcv.utils.print_log(msg, logger=None, level=20)[source]

Print a log message.

Parameters
  • msg (str) – The message to be logged.

  • logger (logging.Logger | str | None) – The logger to be used. Some special loggers are:

    • “silent”: no message will be printed.

    • other str: the logger obtained with get_root_logger(logger).

    • None: The print() method will be used to print log messages.

  • level (int) – Logging level. Only available when logger is a Logger object or “root”.

mmcv.utils.requires_executable(prerequisites)[source]

A decorator to check if some executable files are installed.

Example

>>> @requires_executable('ffmpeg')
... def func(arg1, args):
...     print(1)
>>> func(1, 2)
1

mmcv.utils.requires_package(prerequisites)[source]

A decorator to check if some python packages are installed.

Example

>>> @requires_package('numpy')
... def func(arg1, args):
...     return numpy.zeros(1)
>>> func(1, 2)
array([0.])
>>> @requires_package(['numpy', 'non_package'])
... def func(arg1, args):
...     return numpy.zeros(1)
>>> func(1, 2)
ImportError

mmcv.utils.scandir(dir_path, suffix=None, recursive=False, case_sensitive=True)[source]

Scan a directory to find the files of interest.

Parameters
  • dir_path (str | Path) – Path of the directory.

  • suffix (str | tuple(str), optional) – File suffix that we are interested in. Default: None.

  • recursive (bool, optional) – If set to True, recursively scan the directory. Default: False.

  • case_sensitive (bool, optional) – If set to False, ignore the case of suffix. Default: True.

Returns

A generator yielding the relative paths of all matching files.

mmcv.utils.slice_list(in_list, lens)[source]

Slice a list into several sublists according to a list of given lengths.

Parameters
  • in_list (list) – The list to be sliced.

  • lens (int or list) – The expected length of each output list.

Returns

A list of sliced lists.

Return type

list
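
A standalone sketch of the slicing follows. It assumes that an integer lens means equal-sized chunks and that the lengths must add up to len(in_list); both are assumptions of this illustration.

```python
def slice_list(in_list, lens):
    """Sketch: cut in_list into consecutive chunks of the given lengths."""
    if isinstance(lens, int):
        if len(in_list) % lens != 0:
            raise ValueError("list length must be divisible by lens")
        lens = [lens] * (len(in_list) // lens)
    if sum(lens) != len(in_list):
        raise ValueError("sum of lens must equal the list length")
    out_list, idx = [], 0
    for n in lens:
        out_list.append(in_list[idx:idx + n])
        idx += n
    return out_list


print(slice_list([1, 2, 3, 4, 5], [2, 3]))
print(slice_list([1, 2, 3, 4], 2))
```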

mmcv.utils.track_iter_progress(tasks, bar_width=50, file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]

Track the progress of tasks iteration or enumeration with a progress bar.

Tasks are yielded with a simple for-loop.

Parameters
  • tasks (list or tuple[Iterable, int]) – A list of tasks or (tasks, total num).

  • bar_width (int) – Width of progress bar.

Yields

list – The task results.

mmcv.utils.track_parallel_progress(func, tasks, nproc, initializer=None, initargs=None, bar_width=50, chunksize=1, skip_first=False, keep_order=True, file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]

Track the progress of parallel task execution with a progress bar.

The built-in multiprocessing module is used for process pools and tasks are done with Pool.imap() or Pool.imap_unordered().

Parameters
  • func (callable) – The function to be applied to each task.

  • tasks (list or tuple[Iterable, int]) – A list of tasks or (tasks, total num).

  • nproc (int) – Process (worker) number.

  • initializer (None or callable) – Refer to multiprocessing.Pool for details.

  • initargs (None or tuple) – Refer to multiprocessing.Pool for details.

  • chunksize (int) – Refer to multiprocessing.Pool for details.

  • bar_width (int) – Width of progress bar.

  • skip_first (bool) – Whether to skip the first sample for each worker when estimating fps, since the initialization step may take longer.

  • keep_order (bool) – If True, Pool.imap() is used, otherwise Pool.imap_unordered() is used.

Returns

The task results.

Return type

list

mmcv.utils.track_progress(func, tasks, bar_width=50, file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>, **kwargs)[source]

Track the progress of tasks execution with a progress bar.

Tasks are done with a simple for-loop.

Parameters
  • func (callable) – The function to be applied to each task.

  • tasks (list or tuple[Iterable, int]) – A list of tasks or (tasks, total num).

  • bar_width (int) – Width of progress bar.

Returns

The task results.

Return type

list

mmcv.utils.tuple_cast(inputs, dst_type)[source]

Cast elements of an iterable object into a tuple of some type.

A partial method of iter_cast().

cnn

class mmcv.cnn.AlexNet(num_classes=-1)[source]

AlexNet backbone.

Parameters

num_classes (int) – number of classes for classification.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.Caffe2XavierInit(**kwargs)[source]

class mmcv.cnn.ConstantInit(val, **kwargs)[source]

Initialize module parameters with constant values.

Parameters
  • val (int | float) – the value to fill the weights in the module with

  • bias (int | float) – the value to fill the bias. Defaults to 0.

  • bias_prob (float, optional) – the probability for bias initialization. Defaults to None.

  • layer (str | list[str], optional) – the layer will be initialized. Defaults to None.

class mmcv.cnn.ContextBlock(in_channels, ratio, pooling_type='att', fusion_types=('channel_add',))[source]

ContextBlock module in GCNet.

See ‘GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond’ (https://arxiv.org/abs/1904.11492) for details.

Parameters
  • in_channels (int) – Channels of the input feature map.

  • ratio (float) – Ratio of channels of transform bottleneck

  • pooling_type (str) – Pooling method for context modeling. Options are ‘att’ and ‘avg’, stand for attention pooling and average pooling respectively. Default: ‘att’.

  • fusion_types (Sequence[str]) – Fusion method for feature fusion. Options are ‘channel_add’ and ‘channel_mul’, which stand for channel-wise addition and multiplication respectively. Default: (‘channel_add’,)

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.Conv2d(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int]] = 1, padding: Union[str, int, Tuple[int, int]] = 0, dilation: Union[int, Tuple[int, int]] = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', device=None, dtype=None)[source]

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.Conv3d(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int, int]], stride: Union[int, Tuple[int, int, int]] = 1, padding: Union[str, int, Tuple[int, int, int]] = 0, dilation: Union[int, Tuple[int, int, int]] = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros', device=None, dtype=None)[source]

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.ConvAWS2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)[source]

AWS (Adaptive Weight Standardization)

This is a variant of Weight Standardization (https://arxiv.org/pdf/1903.10520.pdf). It is used in DetectoRS to avoid NaN (https://arxiv.org/pdf/2006.02334.pdf).

Parameters
  • in_channels (int) – Number of channels in the input image

  • out_channels (int) – Number of channels produced by the convolution

  • kernel_size (int or tuple) – Size of the conv kernel

  • stride (int or tuple, optional) – Stride of the convolution. Default: 1

  • padding (int or tuple, optional) – Zero-padding added to both sides of the input. Default: 0

  • dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1

  • groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1

  • bias (bool, optional) – If set True, adds a learnable bias to the output. Default: True

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.ConvModule(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias='auto', conv_cfg=None, norm_cfg=None, act_cfg={'type': 'ReLU'}, inplace=True, with_spectral_norm=False, padding_mode='zeros', order=('conv', 'norm', 'act'))[source]

A conv block that bundles conv/norm/activation layers.

This block simplifies the usage of convolution layers, which are commonly used with a norm layer (e.g., BatchNorm) and activation layer (e.g., ReLU). It is based upon three build methods: build_conv_layer(), build_norm_layer() and build_activation_layer().

In addition, this module provides some extra features:

  1. The bias of the conv layer is set automatically.

  2. Spectral norm is supported.

  3. More padding modes are supported. Before PyTorch 1.5, nn.Conv2d only supported zero and circular padding, so a “reflect” padding mode is added.

Parameters
  • in_channels (int) – Number of channels in the input feature map. Same as that in nn._ConvNd.

  • out_channels (int) – Number of channels produced by the convolution. Same as that in nn._ConvNd.

  • kernel_size (int | tuple[int]) – Size of the convolving kernel. Same as that in nn._ConvNd.

  • stride (int | tuple[int]) – Stride of the convolution. Same as that in nn._ConvNd.

  • padding (int | tuple[int]) – Zero-padding added to both sides of the input. Same as that in nn._ConvNd.

  • dilation (int | tuple[int]) – Spacing between kernel elements. Same as that in nn._ConvNd.

  • groups (int) – Number of blocked connections from input channels to output channels. Same as that in nn._ConvNd.

  • bias (bool | str) – If specified as auto, it will be decided by the norm_cfg. Bias will be set as True if norm_cfg is None, otherwise False. Default: “auto”.

  • conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.

  • norm_cfg (dict) – Config dict for normalization layer. Default: None.

  • act_cfg (dict) – Config dict for activation layer. Default: dict(type=’ReLU’).

  • inplace (bool) – Whether to use inplace mode for activation. Default: True.

  • with_spectral_norm (bool) – Whether use spectral norm in conv module. Default: False.

  • padding_mode (str) – If the padding_mode has not been supported by current Conv2d in PyTorch, we will use our own padding layer instead. Currently, we support [‘zeros’, ‘circular’] with official implementation and [‘reflect’] with our own implementation. Default: ‘zeros’.

  • order (tuple[str]) – The order of conv/norm/activation layers. It is a sequence of “conv”, “norm” and “act”. Common examples are (“conv”, “norm”, “act”) and (“act”, “conv”, “norm”). Default: (‘conv’, ‘norm’, ‘act’).

forward(x, activate=True, norm=True)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.ConvTranspose2d(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int]] = 1, padding: Union[int, Tuple[int, int]] = 0, output_padding: Union[int, Tuple[int, int]] = 0, groups: int = 1, bias: bool = True, dilation: int = 1, padding_mode: str = 'zeros', device=None, dtype=None)[source]

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.ConvTranspose3d(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int, int]], stride: Union[int, Tuple[int, int, int]] = 1, padding: Union[int, Tuple[int, int, int]] = 0, output_padding: Union[int, Tuple[int, int, int]] = 0, groups: int = 1, bias: bool = True, dilation: Union[int, Tuple[int, int, int]] = 1, padding_mode: str = 'zeros', device=None, dtype=None)[source]

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.ConvWS2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, eps=1e-05)[source]

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.DepthwiseSeparableConvModule(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, norm_cfg=None, act_cfg={'type': 'ReLU'}, dw_norm_cfg='default', dw_act_cfg='default', pw_norm_cfg='default', pw_act_cfg='default', **kwargs)[source]

Depthwise separable convolution module.

See https://arxiv.org/pdf/1704.04861.pdf for details.

This module can replace a ConvModule, with the conv block replaced by two conv blocks: a depthwise conv block and a pointwise conv block. The depthwise conv block contains depthwise-conv/norm/activation layers. The pointwise conv block contains pointwise-conv/norm/activation layers. Note that there will be norm/activation layers in the depthwise conv block if norm_cfg and act_cfg are specified.

Parameters
  • in_channels (int) – Number of channels in the input feature map. Same as that in nn._ConvNd.

  • out_channels (int) – Number of channels produced by the convolution. Same as that in nn._ConvNd.

  • kernel_size (int | tuple[int]) – Size of the convolving kernel. Same as that in nn._ConvNd.

  • stride (int | tuple[int]) – Stride of the convolution. Same as that in nn._ConvNd. Default: 1.

  • padding (int | tuple[int]) – Zero-padding added to both sides of the input. Same as that in nn._ConvNd. Default: 0.

  • dilation (int | tuple[int]) – Spacing between kernel elements. Same as that in nn._ConvNd. Default: 1.

  • norm_cfg (dict) – Default norm config for both depthwise ConvModule and pointwise ConvModule. Default: None.

  • act_cfg (dict) – Default activation config for both depthwise ConvModule and pointwise ConvModule. Default: dict(type=’ReLU’).

  • dw_norm_cfg (dict) – Norm config of depthwise ConvModule. If it is ‘default’, it will be the same as norm_cfg. Default: ‘default’.

  • dw_act_cfg (dict) – Activation config of depthwise ConvModule. If it is ‘default’, it will be the same as act_cfg. Default: ‘default’.

  • pw_norm_cfg (dict) – Norm config of pointwise ConvModule. If it is ‘default’, it will be the same as norm_cfg. Default: ‘default’.

  • pw_act_cfg (dict) – Activation config of pointwise ConvModule. If it is ‘default’, it will be the same as act_cfg. Default: ‘default’.

  • kwargs (optional) – Other shared arguments for depthwise and pointwise ConvModule. See ConvModule for ref.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.GeneralizedAttention(in_channels, spatial_range=-1, num_heads=9, position_embedding_dim=-1, position_magnitude=1, kv_stride=2, q_stride=1, attention_type='1111')[source]

GeneralizedAttention module.

See ‘An Empirical Study of Spatial Attention Mechanisms in Deep Networks’ (https://arxiv.org/abs/1711.07971) for details.

Parameters
  • in_channels (int) – Channels of the input feature map.

  • spatial_range (int) – The spatial range. -1 indicates no spatial range constraint. Default: -1.

  • num_heads (int) – The head number of empirical_attention module. Default: 9.

  • position_embedding_dim (int) – The position embedding dimension. Default: -1.

  • position_magnitude (int) – A multiplier acting on coord difference. Default: 1.

  • kv_stride (int) – The feature stride acting on key/value feature map. Default: 2.

  • q_stride (int) – The feature stride acting on query feature map. Default: 1.

  • attention_type (str) –

    A binary indicator string for indicating which items in generalized empirical_attention module are used. Default: ‘1111’.

    • ‘1000’ indicates ‘query and key content’ (appr - appr) item,

    • ‘0100’ indicates ‘query content and relative position’ (appr - position) item,

    • ‘0010’ indicates ‘key content only’ (bias - appr) item,

    • ‘0001’ indicates ‘relative position only’ (bias - position) item.

forward(x_input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.HSigmoid(bias=3.0, divisor=6.0, min_value=0.0, max_value=1.0)[源代码]

Hard Sigmoid Module. Apply the hard sigmoid function: Hsigmoid(x) = min(max((x + bias) / divisor, min_value), max_value) Default: Hsigmoid(x) = min(max((x + 3) / 6, 0), 1)

注解

In MMCV v1.4.4, the default values of the arguments were changed to align with the official PyTorch implementation.

参数
  • bias (float) – Bias of the input feature map. Default: 3.0.

  • divisor (float) – Divisor of the input feature map. Default: 6.0.

  • min_value (float) – Lower bound value. Default: 0.0.

  • max_value (float) – Upper bound value. Default: 1.0.

返回

The output tensor.

返回类型

Tensor
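As a quick check of the formula, the following scalar sketch evaluates the hard sigmoid with the documented defaults. The function `hsigmoid` is an illustrative stand-in operating on Python floats; the actual module works on tensors.

```python
def hsigmoid(x: float, bias: float = 3.0, divisor: float = 6.0,
             min_value: float = 0.0, max_value: float = 1.0) -> float:
    """Scalar form of min(max((x + bias) / divisor, min_value), max_value)."""
    return min(max((x + bias) / divisor, min_value), max_value)


# Inputs at or below -3 clamp to 0, inputs at or above 3 clamp to 1.
for x in (-4.0, 0.0, 1.5, 4.0):
    print(x, hsigmoid(x))
```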

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.HSwish(inplace=False)[源代码]

Hard Swish Module.

This module applies the hard swish function:

\[Hswish(x) = x * ReLU6(x + 3) / 6\]
参数

inplace (bool) – can optionally do the operation in-place. Default: False.

返回

The output tensor.

返回类型

Tensor
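The hard swish formula can be evaluated the same way on plain floats. In this sketch `relu6` and `hswish` are illustrative helper names, not mmcv functions; the module itself operates on tensors.

```python
def relu6(x: float) -> float:
    """ReLU6 clamps its input to the interval [0, 6]."""
    return min(max(x, 0.0), 6.0)


def hswish(x: float) -> float:
    """Scalar form of x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


# For x <= -3 the output is 0; for x >= 3 the output equals x.
for x in (-3.0, 0.0, 1.0, 3.0):
    print(x, hswish(x))
```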

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.KaimingInit(a=0, mode='fan_out', nonlinearity='relu', distribution='normal', **kwargs)[源代码]

Initialize module parameters with the values according to the method described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015).

参数
  • a (int | float) – the negative slope of the rectifier used after this layer (only used with 'leaky_relu'). Defaults to 0.

  • mode (str) – either 'fan_in' or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backwards pass. Defaults to 'fan_out'.

  • nonlinearity (str) – the non-linear function (nn.functional name), recommended to use only with 'relu' or 'leaky_relu'. Defaults to 'relu'.

  • bias (int | float) – the value to fill the bias. Defaults to 0.

  • bias_prob (float, optional) – the probability for bias initialization. Defaults to None.

  • distribution (str) – either 'normal' or 'uniform'. Defaults to 'normal'.

  • layer (str | list[str], optional) – the layer(s) to be initialized. Defaults to None.
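For orientation, the scale that He initialization prescribes can be computed by hand. The sketch below follows the formulas used by PyTorch's `torch.nn.init` (gain divided by the square root of the fan); `kaiming_std` and `kaiming_uniform_bound` are hypothetical helper names, and the sketch only covers the two recommended nonlinearities.

```python
import math


def kaiming_std(fan: int, a: float = 0.0, nonlinearity: str = "relu") -> float:
    """Standard deviation gain / sqrt(fan) for ReLU-family layers."""
    if nonlinearity == "relu":
        gain = math.sqrt(2.0)
    elif nonlinearity == "leaky_relu":
        gain = math.sqrt(2.0 / (1.0 + a ** 2))
    else:
        raise ValueError("this sketch only covers 'relu' and 'leaky_relu'")
    return gain / math.sqrt(fan)


def kaiming_uniform_bound(fan: int, a: float = 0.0,
                          nonlinearity: str = "relu") -> float:
    """With distribution='uniform', samples are drawn from U(-bound, bound)."""
    return math.sqrt(3.0) * kaiming_std(fan, a, nonlinearity)


# A 3x3 convolution with 16 input channels and 32 output channels.
fan_in = 16 * 3 * 3   # used when mode='fan_in'
fan_out = 32 * 3 * 3  # used when mode='fan_out'
print(kaiming_std(fan_out), kaiming_uniform_bound(fan_out))
```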

class mmcv.cnn.Linear(in_features: int, out_features: int, bias: bool = True, device=None, dtype=None)[源代码]
forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.MaxPool2d(kernel_size: Union[int, Tuple[int, ...]], stride: Optional[Union[int, Tuple[int, ...]]] = None, padding: Union[int, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, return_indices: bool = False, ceil_mode: bool = False)[源代码]
forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.MaxPool3d(kernel_size: Union[int, Tuple[int, ...]], stride: Optional[Union[int, Tuple[int, ...]]] = None, padding: Union[int, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, return_indices: bool = False, ceil_mode: bool = False)[源代码]
forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.NonLocal1d(in_channels, sub_sample=False, conv_cfg={'type': 'Conv1d'}, **kwargs)[源代码]

1D Non-local module.

参数
  • in_channels (int) – Same as NonLocalND.

  • sub_sample (bool) – Whether to apply max pooling after pairwise function (Note that the sub_sample is applied on spatial only). Default: False.

  • conv_cfg (None | dict) – Same as NonLocalND. Default: dict(type='Conv1d').

class mmcv.cnn.NonLocal2d(in_channels, sub_sample=False, conv_cfg={'type': 'Conv2d'}, **kwargs)[源代码]

2D Non-local module.

参数
  • in_channels (int) – Same as NonLocalND.

  • sub_sample (bool) – Whether to apply max pooling after pairwise function (Note that the sub_sample is applied on spatial only). Default: False.

  • conv_cfg (None | dict) – Same as NonLocalND. Default: dict(type='Conv2d').

class mmcv.cnn.NonLocal3d(in_channels, sub_sample=False, conv_cfg={'type': 'Conv3d'}, **kwargs)[源代码]

3D Non-local module.

参数
  • in_channels (int) – Same as NonLocalND.

  • sub_sample (bool) – Whether to apply max pooling after pairwise function (Note that the sub_sample is applied on spatial only). Default: False.

  • conv_cfg (None | dict) – Same as NonLocalND. Default: dict(type='Conv3d').

class mmcv.cnn.NormalInit(mean=0, std=1, **kwargs)[源代码]

Initialize module parameters with the values drawn from the normal distribution \(\mathcal{N}(\text{mean}, \text{std}^2)\).

参数
  • mean (int | float) – the mean of the normal distribution. Defaults to 0.

  • std (int | float) – the standard deviation of the normal distribution. Defaults to 1.

  • bias (int | float) – the value to fill the bias. Defaults to 0.

  • bias_prob (float, optional) – the probability for bias initialization. Defaults to None.

  • layer (str | list[str], optional) – the layer(s) to be initialized. Defaults to None.

class mmcv.cnn.PretrainedInit(checkpoint, prefix=None, map_location=None)[源代码]

Initialize module by loading a pretrained model.

参数
  • checkpoint (str) – the checkpoint file of the pretrained model to be loaded.

  • prefix (str, optional) – the prefix of a sub-module in the pretrained model. It is used to load only part of the pretrained model for initialization. For example, to load only the backbone of a detector model, set prefix='backbone.'. Defaults to None.

  • map_location (str) – map tensors into proper locations.

class mmcv.cnn.ResNet(depth, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3), style='pytorch', frozen_stages=-1, bn_eval=True, bn_frozen=False, with_cp=False)[source]

ResNet backbone.

参数
  • depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.

  • num_stages (int) – Resnet stages, normally 4.

  • strides (Sequence[int]) – Strides of the first block of each stage.

  • dilations (Sequence[int]) – Dilation of each stage.

  • out_indices (Sequence[int]) – Output from which stages.

  • style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.

  • frozen_stages (int) – Stages to be frozen (all param fixed). -1 means not freezing any parameters.

  • bn_eval (bool) – Whether to set BN layers as eval mode, namely, freeze running stats (mean and var).

  • bn_frozen (bool) – Whether to freeze weight and bias of BN layers.

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

train(mode=True)[源代码]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

参数

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

返回

self

返回类型

Module

class mmcv.cnn.Scale(scale=1.0)[源代码]

A learnable scale parameter.

This layer scales the input by a learnable factor. It multiplies a learnable scale parameter of shape (1,) with input of any shape.

参数

scale (float) – Initial value of scale factor. Default: 1.0

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.Swish[源代码]

Swish Module.

This module applies the swish function:

\[Swish(x) = x * Sigmoid(x)\]
返回

The output tensor.

返回类型

Tensor

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.cnn.TruncNormalInit(mean: float = 0, std: float = 1, a: float = -2, b: float = 2, **kwargs)[source]

Initialize module parameters with values drawn from the normal distribution \(\mathcal{N}(\text{mean}, \text{std}^2)\), truncated so that all values lie within \([a, b]\).

参数
  • mean (float) – the mean of the normal distribution. Defaults to 0.

  • std (float) – the standard deviation of the normal distribution. Defaults to 1.

  • a (float) – The minimum cutoff value. Defaults to -2.

  • b (float) – The maximum cutoff value. Defaults to 2.

  • bias (float) – the value to fill the bias. Defaults to 0.

  • bias_prob (float, optional) – the probability for bias initialization. Defaults to None.

  • layer (str | list[str], optional) – the layer(s) to be initialized. Defaults to None.

class mmcv.cnn.UniformInit(a=0, b=1, **kwargs)[源代码]

Initialize module parameters with values drawn from the uniform distribution \(\mathcal{U}(a, b)\).

参数
  • a (int | float) – the lower bound of the uniform distribution. Defaults to 0.

  • b (int | float) – the upper bound of the uniform distribution. Defaults to 1.

  • bias (int | float) – the value to fill the bias. Defaults to 0.

  • bias_prob (float, optional) – the probability for bias initialization. Defaults to None.

  • layer (str | list[str], optional) – the layer(s) to be initialized. Defaults to None.

class mmcv.cnn.VGG(depth, with_bn=False, num_classes=-1, num_stages=5, dilations=(1, 1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), frozen_stages=-1, bn_eval=True, bn_frozen=False, ceil_mode=False, with_last_pool=True)[source]

VGG backbone.

参数
  • depth (int) – Depth of vgg, from {11, 13, 16, 19}.

  • with_bn (bool) – Use BatchNorm or not.

  • num_classes (int) – number of classes for classification.

  • num_stages (int) – VGG stages, normally 5.

  • dilations (Sequence[int]) – Dilation of each stage.

  • out_indices (Sequence[int]) – Output from which stages.

  • frozen_stages (int) – Stages to be frozen (all param fixed). -1 means not freezing any parameters.

  • bn_eval (bool) – Whether to set BN layers as eval mode, namely, freeze running stats (mean and var).

  • bn_frozen (bool) – Whether to freeze weight and bias of BN layers.

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

train(mode=True)[源代码]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

参数

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

返回

self

返回类型

Module

class mmcv.cnn.XavierInit(gain=1, distribution='normal', **kwargs)[源代码]

Initialize module parameters with values according to the method described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010).

参数
  • gain (int | float) – an optional scaling factor. Defaults to 1.

  • bias (int | float) – the value to fill the bias. Defaults to 0.

  • bias_prob (float, optional) – the probability for bias initialization. Defaults to None.

  • distribution (str) – either 'normal' or 'uniform'. Defaults to 'normal'.

  • layer (str | list[str], optional) – the layer(s) to be initialized. Defaults to None.

mmcv.cnn.bias_init_with_prob(prior_prob)[源代码]

Initialize the conv/fc bias value according to a given probability value.
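A minimal sketch of the idea, assuming the usual inverse-sigmoid convention: the bias is chosen so that a sigmoid applied to it returns the prior probability. The names `bias_from_prior_prob` and `sigmoid` are hypothetical and written for this example only; the formula is an assumption about the intended behaviour, not a copy of the library code.

```python
import math


def bias_from_prior_prob(prior_prob: float) -> float:
    """Bias b such that sigmoid(b) == prior_prob (inverse sigmoid / logit)."""
    return -math.log((1.0 - prior_prob) / prior_prob)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


bias = bias_from_prior_prob(0.01)
# A classification head initialized with this bias starts out predicting
# roughly prior_prob for every class.
print(bias, sigmoid(bias))
```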

mmcv.cnn.build_activation_layer(cfg)[源代码]

Build activation layer.

参数

cfg (dict) –

The activation layer config, which should contain:

  • type (str): Layer type.

  • layer args: Args needed to instantiate an activation layer.

返回

Created activation layer.

返回类型

nn.Module

mmcv.cnn.build_conv_layer(cfg, *args, **kwargs)[源代码]

Build convolution layer.

参数
  • cfg (None or dict) –

    The conv layer config, which should contain:

    • type (str): Layer type.

    • layer args: Args needed to instantiate a conv layer.

  • args (argument list) – Arguments passed to the __init__ method of the corresponding conv layer.

  • kwargs (keyword arguments) – Keyword arguments passed to the __init__ method of the corresponding conv layer.

返回

Created conv layer.

返回类型

nn.Module

mmcv.cnn.build_model_from_cfg(cfg, registry, default_args=None)[源代码]

Build a PyTorch model from config dict(s). Different from build_from_cfg, if cfg is a list, a nn.Sequential will be built.

参数
  • cfg (dict, list[dict]) – The config of modules, which is either a config dict or a list of config dicts. If cfg is a list, the built modules will be wrapped with nn.Sequential.

  • registry (Registry) – A registry the module belongs to.

  • default_args (dict, optional) – Default arguments to build the module. Defaults to None.

返回

A built nn module.

返回类型

nn.Module

mmcv.cnn.build_norm_layer(cfg, num_features, postfix='')[源代码]

Build normalization layer.

参数
  • cfg (dict) –

    The norm layer config, which should contain:

    • type (str): Layer type.

    • layer args: Args needed to instantiate a norm layer.

    • requires_grad (bool, optional): Whether the parameters of the layer require gradients. Set to False to stop gradient updates.

  • num_features (int) – Number of input channels.

  • postfix (int | str) – The postfix to be appended into norm abbreviation to create named layer.

返回

The first element is the layer name consisting of abbreviation and postfix, e.g., bn1, gn. The second element is the created norm layer.

返回类型

tuple[str, nn.Module]

mmcv.cnn.build_padding_layer(cfg, *args, **kwargs)[源代码]

Build padding layer.

参数

cfg (None or dict) –

The padding layer config, which should contain:

  • type (str): Layer type.

  • layer args: Args needed to instantiate a padding layer.

返回

Created padding layer.

返回类型

nn.Module

mmcv.cnn.build_plugin_layer(cfg, postfix='', **kwargs)[源代码]

Build plugin layer.

参数
  • cfg (None or dict) –

    cfg should contain:

    • type (str): identify plugin layer type.

    • layer args: args needed to instantiate a plugin layer.

  • postfix (int, str) – appended to the plugin abbreviation to create the layer name. Default: ''.

返回

The first one is the concatenation of abbreviation and postfix. The second is the created plugin layer.

返回类型

tuple[str, nn.Module]

mmcv.cnn.build_upsample_layer(cfg, *args, **kwargs)[源代码]

Build upsample layer.

参数
  • cfg (dict) –

    The upsample layer config, which should contain:

    • type (str): Layer type.

    • scale_factor (int): Upsample ratio, which is not applicable to deconv.

    • layer args: Args needed to instantiate an upsample layer.

  • args (argument list) – Arguments passed to the __init__ method of the corresponding upsample layer.

  • kwargs (keyword arguments) – Keyword arguments passed to the __init__ method of the corresponding upsample layer.

返回

Created upsample layer.

返回类型

nn.Module

mmcv.cnn.fuse_conv_bn(module)[源代码]

Recursively fuse conv and bn in a module.

During inference, batch norm layers no longer update their statistics and only apply the stored mean and variance along channels. This makes it possible to fuse each batch norm layer into the preceding conv layer, which saves computation and simplifies the network structure.

参数

module (nn.Module) – Module to be fused.

返回

Fused module.

返回类型

nn.Module
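The arithmetic behind the fusion can be sketched for a single output channel using plain Python floats. This is an illustration of the standard conv/BN folding identity under the assumption of an affine batch norm in eval mode; `fold_bn_into_conv`, `conv`, and `bn_eval` are hypothetical helpers, not mmcv functions.

```python
import math


def fold_bn_into_conv(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold one channel's BN statistics into that channel's conv weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    fused_weight = [w * scale for w in weight]
    fused_bias = (bias - mean) * scale + beta
    return fused_weight, fused_bias


def conv(x, weight, bias):
    """A single output value of a convolution: dot product plus bias."""
    return sum(w * xi for w, xi in zip(weight, x)) + bias


def bn_eval(y, mean, var, gamma, beta, eps=1e-5):
    """Batch norm in eval mode, using stored running statistics."""
    return (y - mean) / math.sqrt(var + eps) * gamma + beta


x = [0.5, -1.0, 2.0]
w, b = [0.2, -0.4, 0.1], 0.3
stats = dict(mean=0.1, var=0.8, gamma=1.5, beta=-0.2)

reference = bn_eval(conv(x, w, b), **stats)  # conv followed by BN
fw, fb = fold_bn_into_conv(w, b, **stats)
fused = conv(x, fw, fb)                      # one fused conv
print(abs(reference - fused) < 1e-9)
```

The fused convolution reproduces the conv-then-BN output up to floating-point error, which is why the separate batch norm layer can be dropped at inference time.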

mmcv.cnn.get_model_complexity_info(model, input_shape, print_per_layer_stat=True, as_strings=True, input_constructor=None, flush=False, ost=sys.stdout)[source]

Get complexity information of a model.

This method can calculate FLOPs and parameter counts of a model with corresponding input shape. It can also print complexity information for each layer in a model.

Supported layers are listed as below:
  • Convolutions: nn.Conv1d, nn.Conv2d, nn.Conv3d.

  • Activations: nn.ReLU, nn.PReLU, nn.ELU, nn.LeakyReLU, nn.ReLU6.

  • Poolings: nn.MaxPool1d, nn.MaxPool2d, nn.MaxPool3d, nn.AvgPool1d, nn.AvgPool2d, nn.AvgPool3d, nn.AdaptiveMaxPool1d, nn.AdaptiveMaxPool2d, nn.AdaptiveMaxPool3d, nn.AdaptiveAvgPool1d, nn.AdaptiveAvgPool2d, nn.AdaptiveAvgPool3d.

  • Normalizations: nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d, nn.GroupNorm, nn.InstanceNorm1d, nn.InstanceNorm2d, nn.InstanceNorm3d, nn.LayerNorm.

  • Linear: nn.Linear.

  • Deconvolution: nn.ConvTranspose2d.

  • Upsample: nn.Upsample.

参数
  • model (nn.Module) – The model for complexity calculation.

  • input_shape (tuple) – Input shape used for calculation.

  • print_per_layer_stat (bool) – Whether to print complexity information for each layer in a model. Default: True.

  • as_strings (bool) – Output FLOPs and params counts in a string form. Default: True.

  • input_constructor (None | callable) – If specified, a callable that generates the input. Otherwise, a random tensor with the given input shape is generated to calculate FLOPs. Default: None.

  • flush (bool) – same as that in print(). Default: False.

  • ost (stream) – same as file param in print(). Default: sys.stdout.

返回

If as_strings is set to True, FLOPs and parameter counts are returned as strings. Otherwise, they are returned as float numbers.

返回类型

tuple[float | str]

mmcv.cnn.initialize(module, init_cfg)[源代码]

Initialize a module.

参数
  • module (torch.nn.Module) – the module will be initialized.

  • init_cfg (dict | list[dict]) – initialization configuration dict to define initializer. OpenMMLab has implemented 6 initializers including Constant, Xavier, Normal, Uniform, Kaiming, and Pretrained.

Example

>>> import torch.nn as nn
>>> from mmcv.cnn import ResNet, initialize
>>> module = nn.Linear(2, 3, bias=True)
>>> init_cfg = dict(type='Constant', layer='Linear', val=1, bias=2)
>>> initialize(module, init_cfg)
>>> module = nn.Sequential(nn.Conv1d(3, 1, 3), nn.Linear(1, 2))
>>> # define key ``'layer'`` to initialize layers with different
>>> # configurations
>>> init_cfg = [dict(type='Constant', layer='Conv1d', val=1),
...             dict(type='Constant', layer='Linear', val=2)]
>>> initialize(module, init_cfg)
>>> # define key ``'override'`` to initialize a specific part of
>>> # the module
>>> class FooNet(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.feat = nn.Conv2d(3, 16, 3)
...         self.reg = nn.Conv2d(16, 10, 3)
...         self.cls = nn.Conv2d(16, 5, 3)
>>> model = FooNet()
>>> init_cfg = dict(type='Constant', val=1, bias=2, layer='Conv2d',
...     override=dict(type='Constant', name='reg', val=3, bias=4))
>>> initialize(model, init_cfg)
>>> model = ResNet(depth=50)
>>> # Initialize weights with the pretrained model.
>>> init_cfg = dict(type='Pretrained',
...                 checkpoint='torchvision://resnet50')
>>> initialize(model, init_cfg)
>>> # Initialize weights of a sub-module with the specific part of
>>> # a pretrained model by using "prefix".
>>> url = 'http://download.openmmlab.com/mmdetection/v2.0/retinanet/'\
...     'retinanet_r50_fpn_1x_coco/'\
...     'retinanet_r50_fpn_1x_coco_20200130-c2398f9e.pth'
>>> init_cfg = dict(type='Pretrained',
...                 checkpoint=url, prefix='backbone.')

mmcv.cnn.is_norm(layer, exclude=None)[源代码]

Check if a layer is a normalization layer.

参数
  • layer (nn.Module) – The layer to be checked.

  • exclude (type | tuple[type]) – Types to be excluded.

返回

Whether the layer is a norm layer.

返回类型

bool

runner

class mmcv.runner.BaseModule(init_cfg=None)[源代码]

Base module for all modules in OpenMMLab.

BaseModule is a wrapper of torch.nn.Module with additional functionality of parameter initialization. Compared with torch.nn.Module, BaseModule mainly adds three attributes.

  • init_cfg: the config to control the initialization.

  • init_weights: The function of parameter initialization and recording initialization information.

  • _params_init_info: Used to track the parameter initialization information. This attribute only exists during executing the init_weights.

参数

init_cfg (dict, optional) – Initialization config dict.

init_weights()[源代码]

Initialize the weights.

class mmcv.runner.BaseRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, max_iters=None, max_epochs=None)[源代码]

The base class of Runner, a training helper for PyTorch.

All subclasses should implement the following APIs:

  • run()

  • train()

  • val()

  • save_checkpoint()

参数
  • model (torch.nn.Module) – The model to be run.

  • batch_processor (callable) – A callable method that processes a data batch. The interface of this method should be batch_processor(model, data, train_mode) -> dict

  • optimizer (dict or torch.optim.Optimizer) – It can be either an optimizer (in most cases) or a dict of optimizers (in models that require more than one optimizer, e.g., GAN).

  • work_dir (str, optional) – The working directory to save checkpoints and logs. Defaults to None.

  • logger (logging.Logger) – Logger used during training. Defaults to None. (The default value is just for backward compatibility)

  • meta (dict | None) – A dict that records important information such as environment info and seed, which will be logged in the logger hook. Defaults to None.

  • max_epochs (int, optional) – Total training epochs.

  • max_iters (int, optional) – Total training iterations.

call_hook(fn_name)[源代码]

Call all hooks.

参数

fn_name (str) – The function name in each hook to be called, such as “before_train_epoch”.

current_lr()[源代码]

Get current learning rates.

返回

Current learning rates of all param groups. If the runner has a dict of optimizers, this method will return a dict.

返回类型

list[float] | dict[str, list[float]]

current_momentum()[源代码]

Get current momentums.

返回

Current momentums of all param groups. If the runner has a dict of optimizers, this method will return a dict.

返回类型

list[float] | dict[str, list[float]]

property epoch

Current epoch.

Type

int

property hooks

A list of registered hooks.

Type

list[Hook]

property inner_iter

Iteration in an epoch.

Type

int

property iter

Current iteration.

Type

int

property max_epochs

Maximum training epochs.

Type

int

property max_iters

Maximum training iterations.

Type

int

property model_name

Name of the model, usually the module class name.

Type

str

property rank

Rank of current process. (distributed training)

Type

int

register_hook(hook, priority='NORMAL')[源代码]

Register a hook into the hook list.

The hook will be inserted into a priority queue, with the specified priority (See Priority for details of priorities). For hooks with the same priority, they will be triggered in the same order as they are registered.

参数
  • hook (Hook) – The hook to be registered.

  • priority (int or str or Priority) – Hook priority. Lower value means higher priority.
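The ordering rule can be sketched with a plain list, assuming hooks are kept sorted by ascending priority value and that a new hook is placed after all existing hooks of equal priority. `insert_by_priority` and the hook names are hypothetical; this is an illustration of the described behaviour, not the runner's code.

```python
def insert_by_priority(hooks, name, priority):
    """Insert (name, priority) so the list stays sorted; ties keep registration order."""
    # Scan from the end: place the new hook right after the last hook whose
    # priority value is less than or equal to the new one.
    for i in range(len(hooks) - 1, -1, -1):
        if priority >= hooks[i][1]:
            hooks.insert(i + 1, (name, priority))
            return
    hooks.insert(0, (name, priority))


hooks = []
insert_by_priority(hooks, "logger", 90)      # VERY_LOW
insert_by_priority(hooks, "custom_a", 50)    # NORMAL
insert_by_priority(hooks, "lr_updater", 10)  # VERY_HIGH
insert_by_priority(hooks, "custom_b", 50)    # NORMAL, registered after custom_a
print([name for name, _ in hooks])
```

Hooks are triggered from the front of the list, so `lr_updater` runs first and `custom_b` runs after `custom_a` because it was registered later with the same priority.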

register_hook_from_cfg(hook_cfg)[源代码]

Register a hook from its cfg.

参数

hook_cfg (dict) – Hook config. It should have at least keys ‘type’ and ‘priority’ indicating its type and priority.

注解

The specific hook class to register should not use ‘type’ and ‘priority’ arguments during initialization.

register_training_hooks(lr_config, optimizer_config=None, checkpoint_config=None, log_config=None, momentum_config=None, timer_config={'type': 'IterTimerHook'}, custom_hooks_config=None)[源代码]

Register default and custom hooks for training.

Default and custom hooks include:

Hooks                  Priority
LrUpdaterHook          VERY_HIGH (10)
MomentumUpdaterHook    HIGH (30)
OptimizerStepperHook   ABOVE_NORMAL (40)
CheckpointSaverHook    NORMAL (50)
IterTimerHook          LOW (70)
LoggerHook(s)          VERY_LOW (90)
CustomHook(s)          defaults to NORMAL (50)

If custom hooks have the same priority as default hooks, the custom hooks will be triggered after the default hooks.

property world_size

Number of processes participating in the job. (distributed training)

Type

int

class mmcv.runner.CheckpointHook(interval=-1, by_epoch=True, save_optimizer=True, out_dir=None, max_keep_ckpts=-1, save_last=True, sync_buffer=False, file_client_args=None, **kwargs)[source]

Save checkpoints periodically.

参数
  • interval (int) – The saving period. If by_epoch=True, interval indicates epochs, otherwise it indicates iterations. Default: -1, which means “never”.

  • by_epoch (bool) – Saving checkpoints by epoch or by iteration. Default: True.

  • save_optimizer (bool) – Whether to save optimizer state_dict in the checkpoint. It is usually used for resuming experiments. Default: True.

  • out_dir (str, optional) – The root directory to save checkpoints. If not specified, runner.work_dir will be used by default. If specified, the out_dir will be the concatenation of out_dir and the last level directory of runner.work_dir. Changed in version 1.3.16.

  • max_keep_ckpts (int, optional) – The maximum checkpoints to keep. In some cases we want only the latest few checkpoints and would like to delete old ones to save the disk space. Default: -1, which means unlimited.

  • save_last (bool, optional) – Whether to force the last checkpoint to be saved regardless of interval. Default: True.

  • sync_buffer (bool, optional) – Whether to synchronize buffers in different gpus. Default: False.

  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None. New in version 1.3.16.

Warning

Before v1.3.16, the out_dir argument indicates the path where the checkpoint is stored. However, since v1.3.16, out_dir indicates the root directory and the final path to save checkpoint is the concatenation of out_dir and the last level directory of runner.work_dir. Suppose the value of out_dir is “/path/of/A” and the value of runner.work_dir is “/path/of/B”, then the final path will be “/path/of/A/B”.
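The path rule described in the warning can be reproduced with a few lines of standard-library code. `resolve_checkpoint_dir` is a hypothetical helper written for this illustration, and POSIX-style paths are assumed.

```python
import posixpath


def resolve_checkpoint_dir(out_dir, work_dir):
    """Return the directory checkpoints end up in under the v1.3.16+ rule."""
    if out_dir is None:
        # No out_dir given: checkpoints go to the runner's work_dir.
        return work_dir
    last_level = posixpath.basename(work_dir.rstrip("/"))
    return posixpath.join(out_dir, last_level)


print(resolve_checkpoint_dir("/path/of/A", "/path/of/B"))  # /path/of/A/B
print(resolve_checkpoint_dir(None, "/path/of/B"))          # /path/of/B
```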

class mmcv.runner.CheckpointLoader[源代码]

A general checkpoint loader to manage all schemes.

classmethod load_checkpoint(filename, map_location=None, logger=None)[源代码]

Load a checkpoint through a URL scheme path.

参数
  • filename (str) – checkpoint file name with given prefix

  • map_location (str, optional) – Same as torch.load(). Default: None

  • logger (logging.Logger, optional) – The logger for message. Default: None

返回

The loaded checkpoint.

返回类型

dict or OrderedDict

classmethod register_scheme(prefixes, loader=None, force=False)[源代码]

Register a loader to CheckpointLoader.

This method can be used as a normal class method or a decorator.

参数
  • prefixes (str or list[str] or tuple[str]) – The prefix of the registered loader.

  • loader (function, optional) – The loader function to be registered. When this method is used as a decorator, loader is None. Defaults to None.

  • force (bool, optional) – Whether to override the loader if the prefix has already been registered. Defaults to False.
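The "normal class method or decorator" pattern can be illustrated with a toy registry. `TinyLoaderRegistry`, the prefixes, and both loader functions are hypothetical and exist only for this sketch; it does not reproduce CheckpointLoader's internals.

```python
class TinyLoaderRegistry:
    """Hypothetical stand-in that mimics the dual-use registration pattern."""

    _schemes = {}

    @classmethod
    def register_scheme(cls, prefixes, loader=None, force=False):
        if isinstance(prefixes, str):
            prefixes = [prefixes]

        def _register(fn):
            for prefix in prefixes:
                if prefix in cls._schemes and not force:
                    raise KeyError(f"{prefix!r} is already registered")
                cls._schemes[prefix] = fn
            return fn

        if loader is not None:
            # Used as a normal class method: register immediately.
            _register(loader)
            return None
        # Used as a decorator: loader is None, so return the registering wrapper.
        return _register


def load_from_memory(filename, map_location=None):
    return {"source": "memory", "filename": filename}


# 1) Normal method call with an explicit loader.
TinyLoaderRegistry.register_scheme("mem://", loader=load_from_memory)


# 2) Decorator form with several prefixes.
@TinyLoaderRegistry.register_scheme(prefixes=["demo://", "toy://"])
def load_from_demo(filename, map_location=None):
    return {"source": "demo", "filename": filename}


print(sorted(TinyLoaderRegistry._schemes))
```

In this sketch, registering an existing prefix again raises an error unless `force=True` is passed, mirroring the role of the `force` argument described above.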

class mmcv.runner.CosineAnnealingLrUpdaterHook(min_lr=None, min_lr_ratio=None, **kwargs)[源代码]
class mmcv.runner.CosineRestartLrUpdaterHook(periods, restart_weights=[1], min_lr=None, min_lr_ratio=None, **kwargs)[源代码]

Cosine annealing with restarts learning rate scheme.

参数
  • periods (list[int]) – Periods for each cosine annealing cycle.

  • restart_weights (list[float], optional) – Restart weights at each restart iteration. Default: [1].

  • min_lr (float, optional) – The minimum lr. Default: None.

  • min_lr_ratio (float, optional) – The ratio of minimum lr to the base lr. Either min_lr or min_lr_ratio should be specified. Default: None.
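To make the roles of periods and restart_weights concrete, here is a rough sketch of a cosine schedule with restarts. The function `cosine_restart_lr` is hypothetical, and the exact way the restart weight scales the curve is an assumption made for illustration rather than a statement about the hook's implementation.

```python
import math


def cosine_restart_lr(progress, base_lr, periods, restart_weights, min_lr):
    """Learning rate at `progress` (epoch or iteration) for a restart schedule."""
    start = 0
    for period, weight in zip(periods, restart_weights):
        if progress < start + period:
            alpha = (progress - start) / period         # position inside the cycle
            cos_out = math.cos(math.pi * alpha) + 1.0   # falls from 2 to 0
            return min_lr + 0.5 * weight * (base_lr - min_lr) * cos_out
        start += period
    raise ValueError("progress lies beyond the last period")


periods, weights = [10, 20], [1.0, 0.5]
for step in (0, 5, 10, 20):
    print(step, cosine_restart_lr(step, 0.1, periods, weights, 0.0))
```

In this sketch the first cycle starts at the base learning rate, while the second cycle restarts at half of it because its restart weight is 0.5.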

class mmcv.runner.CyclicLrUpdaterHook(by_epoch=False, target_ratio=(10, 0.0001), cyclic_times=1, step_ratio_up=0.4, anneal_strategy='cos', gamma=1, **kwargs)[源代码]

Cyclic LR Scheduler.

Implement the cyclical learning rate policy (CLR) described in https://arxiv.org/pdf/1506.01186.pdf

Different from the original paper, we use cosine annealing rather than triangular policy inside a cycle. This improves the performance in the 3D detection area.

参数
  • by_epoch (bool, optional) – Whether to update LR by epoch.

  • target_ratio (tuple[float], optional) – Relative ratio of the highest LR and the lowest LR to the initial LR.

  • cyclic_times (int, optional) – Number of cycles during training

  • step_ratio_up (float, optional) – The ratio of the increasing process of LR in the total cycle.

  • anneal_strategy (str, optional) – {‘cos’, ‘linear’} Specifies the annealing strategy: ‘cos’ for cosine annealing, ‘linear’ for linear annealing. Default: ‘cos’.

  • gamma (float, optional) – Cycle decay ratio. Default: 1. It takes values in the range (0, 1]. The difference between the maximum learning rate and the minimum learning rate decreases periodically when it is less than 1. New in version 1.4.4.

class mmcv.runner.CyclicMomentumUpdaterHook(by_epoch=False, target_ratio=(0.8947368421052632, 1), cyclic_times=1, step_ratio_up=0.4, anneal_strategy='cos', gamma=1, **kwargs)[源代码]

Cyclic momentum Scheduler.

Implement the cyclical momentum scheduler policy described in https://arxiv.org/pdf/1708.07120.pdf

This momentum scheduler is usually used together with the CyclicLRUpdater to improve the performance in the 3D detection area.

参数
  • target_ratio (tuple[float]) – Relative ratio of the lowest momentum and the highest momentum to the initial momentum.

  • cyclic_times (int) – Number of cycles during training

  • step_ratio_up (float) – The ratio of the increasing process of momentum in the total cycle.

  • by_epoch (bool) – Whether to update momentum by epoch.

  • anneal_strategy (str, optional) – {‘cos’, ‘linear’} Specifies the annealing strategy: ‘cos’ for cosine annealing, ‘linear’ for linear annealing. Default: ‘cos’.

  • gamma (float, optional) – Cycle decay ratio. Default: 1. It takes values in the range (0, 1]. The difference between the maximum momentum and the minimum momentum decreases periodically when it is less than 1. New in version 1.4.4.

class mmcv.runner.DefaultOptimizerConstructor(optimizer_cfg, paramwise_cfg=None)[源代码]

Default constructor for optimizers.

By default, all parameters share the same optimizer settings, and the argument paramwise_cfg can be used to specify parameter-wise settings. It is a dict and may contain the following fields:

  • custom_keys (dict): Parameter-wise settings specified by keys. If one of the keys in custom_keys is a substring of a parameter's name, the settings of that parameter are taken from custom_keys[key], and other settings such as bias_lr_mult are ignored. When several keys match, the longest one is used. If multiple matched keys have the same length, the key that comes first in alphabetical order is chosen. custom_keys[key] should be a dict and may contain the fields lr_mult and decay_mult. See Example 2 below.

  • bias_lr_mult (float): It will be multiplied to the learning rate for all bias parameters (except for those in normalization layers and offset layers of DCN).

  • bias_decay_mult (float): It will be multiplied to the weight decay for all bias parameters (except for those in normalization layers, depthwise conv layers, offset layers of DCN).

  • norm_decay_mult (float): It will be multiplied to the weight decay for all weight and bias parameters of normalization layers.

  • dwconv_decay_mult (float): It will be multiplied to the weight decay for all weight and bias parameters of depthwise conv layers.

  • dcn_offset_lr_mult (float): It will be multiplied to the learning rate for parameters of offset layer in the deformable convs of a model.

  • bypass_duplicate (bool): If true, the duplicate parameters would not be added into optimizer. Default: False.

Note

1. If the option dcn_offset_lr_mult is used, the constructor will override the effect of bias_lr_mult in the bias of offset layer. So be careful when using both bias_lr_mult and dcn_offset_lr_mult. If you wish to apply both of them to the offset layer in deformable convs, set dcn_offset_lr_mult to the original dcn_offset_lr_mult * bias_lr_mult.

2. If the option dcn_offset_lr_mult is used, the constructor will apply it to all the DCN layers in the model. So be careful when the model contains multiple DCN layers in places other than backbone.

Parameters
  • model (nn.Module) – The model with parameters to be optimized.

  • optimizer_cfg (dict) –

    The config dict of the optimizer. Positional fields are

    • type: class name of the optimizer.

    Optional fields are

    • any arguments of the corresponding optimizer type, e.g., lr, weight_decay, momentum, etc.

  • paramwise_cfg (dict, optional) – Parameter-wise options.

Example 1:
>>> import torch
>>> from mmcv.runner import DefaultOptimizerConstructor
>>> model = torch.nn.modules.Conv1d(1, 1, 1)
>>> optimizer_cfg = dict(type='SGD', lr=0.01, momentum=0.9,
...                      weight_decay=0.0001)
>>> paramwise_cfg = dict(norm_decay_mult=0.)
>>> optim_builder = DefaultOptimizerConstructor(
...     optimizer_cfg, paramwise_cfg)
>>> optimizer = optim_builder(model)
Example 2:
>>> # assume the model has attributes model.backbone and model.cls_head
>>> optimizer_cfg = dict(type='SGD', lr=0.01, weight_decay=0.95)
>>> paramwise_cfg = dict(custom_keys={
...     '.backbone': dict(lr_mult=0.1, decay_mult=0.9)})
>>> optim_builder = DefaultOptimizerConstructor(
...     optimizer_cfg, paramwise_cfg)
>>> optimizer = optim_builder(model)
>>> # Then `lr` and `weight_decay` for model.backbone are
>>> # (0.01 * 0.1, 0.95 * 0.9), and `lr` and `weight_decay` for
>>> # model.cls_head are (0.01, 0.95).
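
The custom_keys matching rule described above (longest matching key wins, ties broken alphabetically) can be sketched in plain Python. This is a minimal illustration, not the library's implementation; the function names and the parameter names such as 'backbone.conv1.weight' are hypothetical.

```python
def match_custom_key(param_name, custom_keys):
    """Return the key that applies to `param_name`, or None if none matches."""
    # Sort alphabetically first, then stably by descending length, so the
    # longest key wins and equal-length keys keep alphabetical order.
    ordered = sorted(sorted(custom_keys), key=len, reverse=True)
    for key in ordered:
        if key in param_name:
            return key
    return None


def effective_settings(param_name, base_lr, base_wd, custom_keys):
    """Compute (lr, weight_decay) for one parameter under custom_keys."""
    key = match_custom_key(param_name, custom_keys)
    if key is None:
        return base_lr, base_wd
    cfg = custom_keys[key]
    return (base_lr * cfg.get('lr_mult', 1.0),
            base_wd * cfg.get('decay_mult', 1.0))


# Hypothetical parameter names and keys, chosen only for illustration.
custom_keys = {'backbone': dict(lr_mult=0.1, decay_mult=0.9)}
backbone = effective_settings('backbone.conv1.weight', 0.01, 0.95, custom_keys)
head = effective_settings('cls_head.fc.weight', 0.01, 0.95, custom_keys)
```

Here `backbone` is approximately (0.001, 0.855), while `head` keeps the base values because no key matches.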
add_params(params, module, prefix='', is_dcn_module=None)[source]

Add all parameters of module to the params list.

The parameters of the given module will be added to the list of param groups, with specific rules defined by paramwise_cfg.

Parameters
  • params (list[dict]) – A list of param groups; it will be modified in place.

  • module (nn.Module) – The module to be added.

  • prefix (str) – The prefix of the module

  • is_dcn_module (int|float|None) – If the current module is a submodule of DCN, is_dcn_module will be passed to control conv_offset layer’s learning rate. Defaults to None.

class mmcv.runner.DefaultRunnerConstructor(runner_cfg, default_args=None)[source]

Default constructor for runners.

Customize an existing Runner such as EpochBasedRunner through a RunnerConstructor. For example, we can inject new properties and functions into the Runner.

Example

>>> from mmcv.runner import RUNNER_BUILDERS, RUNNERS, build_runner
>>> # Define a new RunnerConstructor
>>> @RUNNER_BUILDERS.register_module()
... class MyRunnerConstructor:
...     def __init__(self, runner_cfg, default_args=None):
...         if not isinstance(runner_cfg, dict):
...             raise TypeError('runner_cfg should be a dict, '
...                             f'but got {type(runner_cfg)}')
...         self.runner_cfg = runner_cfg
...         self.default_args = default_args
...
...     def __call__(self):
...         runner = RUNNERS.build(self.runner_cfg,
...                                default_args=self.default_args)
...         # Add new properties for existing runner
...         runner.my_name = 'my_runner'
...         runner.my_function = lambda self: print(self.my_name)
...         ...
...         # Return the customized runner to the caller
...         return runner
>>> # build your runner
>>> runner_cfg = dict(type='EpochBasedRunner', max_epochs=40,
...                   constructor='MyRunnerConstructor')
>>> runner = build_runner(runner_cfg)
class mmcv.runner.DistEvalHook(dataloader, start=None, interval=1, by_epoch=True, save_best=None, rule=None, test_fn=None, greater_keys=None, less_keys=None, broadcast_bn_buffer=True, tmpdir=None, gpu_collect=False, out_dir=None, file_client_args=None, **eval_kwargs)[source]

Distributed evaluation hook.

This hook regularly performs evaluation at a given interval when running in a distributed environment.

Parameters
  • dataloader (DataLoader) – A PyTorch dataloader, whose dataset has implemented evaluate function.

  • start (int | None, optional) – Evaluation starting epoch. It enables evaluation before the training starts if start <= the resuming epoch. If None, whether to evaluate is merely decided by interval. Default: None.

  • interval (int) – Evaluation interval. Default: 1.

  • by_epoch (bool) – Whether to perform evaluation by epoch or by iteration. If set to True, evaluation is performed by epoch; otherwise, by iteration. Default: True.

  • save_best (str, optional) – If a metric is specified, the best checkpoint according to that metric is tracked during evaluation. The best score and the best checkpoint path are saved in runner.meta['hook_msgs'] and are loaded again when resuming from a checkpoint. Options are the evaluation metrics on the test dataset, e.g., bbox_mAP and segm_mAP for bbox detection and instance segmentation, or AR@100 for proposal recall. If save_best is auto, the first key of the returned OrderedDict result will be used. Default: None.

  • rule (str | None, optional) – Comparison rule for the best score. If set to None, a reasonable rule is inferred: keys such as ‘acc’ and ‘top’ are compared with the ‘greater’ rule, and keys containing ‘loss’ are compared with the ‘less’ rule. Options are ‘greater’, ‘less’, None. Default: None.

  • test_fn (callable, optional) – Tests a model with samples from a dataloader in a multi-GPU manner and returns the test results. If None, the default test function mmcv.engine.multi_gpu_test will be used. Default: None.

  • tmpdir (str | None) – Temporary directory to save the results of all processes. Default: None.

  • gpu_collect (bool) – Whether to use gpu or cpu to collect results. Default: False.

  • broadcast_bn_buffer (bool) – Whether to broadcast the buffers (running_mean and running_var) of rank 0 to the other ranks before evaluation. Default: True.

  • out_dir (str, optional) – The root directory to save checkpoints. If not specified, runner.work_dir will be used by default. If specified, the out_dir will be the concatenation of out_dir and the last level directory of runner.work_dir.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None.

  • **eval_kwargs – Evaluation arguments fed into the evaluate function of the dataset.

class mmcv.runner.DistSamplerSeedHook[source]

Data-loading sampler for distributed training.

In distributed training, it is only useful in conjunction with EpochBasedRunner; IterBasedRunner achieves the same purpose with IterLoader.

class mmcv.runner.DvcliveLoggerHook(model_file=None, interval=10, ignore_last=True, reset_flag=False, by_epoch=True, **kwargs)[source]

Class to log metrics with dvclive.

It requires dvclive to be installed.

Parameters
  • model_file (str) – If not None, the model will be saved to {model_file} after each epoch. Default: None.

  • interval (int) – Logging interval (every k iterations). Default 10.

  • ignore_last (bool) – Ignore the log of last iterations in each epoch if less than interval. Default: True.

  • reset_flag (bool) – Whether to clear the output buffer after logging. Default: False.

  • by_epoch (bool) – Whether EpochBasedRunner is used. Default: True.

  • kwargs – Arguments for instantiating Live.

class mmcv.runner.EMAHook(momentum=0.0002, interval=1, warm_up=100, resume_from=None)[source]

Exponential Moving Average Hook.

Use an exponential moving average on all parameters of the model during training. Every parameter has an EMA backup, which is updated by the formula below. EMAHook takes priority over EvalHook and CheckpointSaverHook.

\[\text{Xema}_{t+1} = (1 - \text{momentum}) \times \text{Xema}_{t} + \text{momentum} \times X_t\]

Parameters
  • momentum (float) – The momentum used for updating ema parameter. Defaults to 0.0002.

  • interval (int) – Update ema parameter every interval iteration. Defaults to 1.

  • warm_up (int) – During first warm_up steps, we may use smaller momentum to update ema parameters more slowly. Defaults to 100.

  • resume_from (str) – The checkpoint path. Defaults to None.
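
The update formula above can be tried on plain floats. This is a minimal sketch of the documented formula only; it ignores the warm_up and interval handling, and the function name is an assumption.

```python
def ema_update(ema_value, value, momentum=0.0002):
    """One EMA step: Xema_{t+1} = (1 - momentum) * Xema_t + momentum * X_t."""
    return (1 - momentum) * ema_value + momentum * value


# Track a constant parameter value of 1.0, starting from an EMA of 0.0.
# A large momentum is used here only to make the effect visible quickly.
ema = 0.0
history = []
for _ in range(3):
    ema = ema_update(ema, 1.0, momentum=0.5)
    history.append(ema)
```

A small momentum such as the default 0.0002 makes the EMA copy follow the live parameters slowly, which smooths out iteration-to-iteration noise.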

after_train_epoch(runner)[source]

Load parameter values from the EMA backup into the model before the EvalHook runs.

after_train_iter(runner)[source]

Update the EMA parameters every self.interval iterations.

before_run(runner)[source]

Register the EMA parameters as named buffers of the model, so that the model can be resumed together with its EMA parameters more easily.

before_train_epoch(runner)[source]

Recover the model’s parameters from the EMA backup after the previous epoch’s EvalHook.

class mmcv.runner.EpochBasedRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, max_iters=None, max_epochs=None)[source]

Epoch-based Runner.

This runner trains models epoch by epoch.

run(data_loaders, workflow, max_epochs=None, **kwargs)[source]

Start running.

Parameters
  • data_loaders (list[DataLoader]) – Dataloaders for training and validation.

  • workflow (list[tuple]) – A list of (phase, epochs) to specify the running order and epochs. E.g., [('train', 2), ('val', 1)] means running 2 epochs for training and 1 epoch for validation, iteratively.
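
How a workflow such as [('train', 2), ('val', 1)] unrolls can be sketched without any runner. This is an illustrative approximation that assumes only training phases count towards max_epochs; the function name is hypothetical.

```python
def iterate_workflow(workflow, max_epochs):
    """Return the sequence of phases run until `max_epochs` train epochs."""
    if not any(phase == 'train' and n > 0 for phase, n in workflow):
        raise ValueError('workflow needs at least one training epoch')
    schedule = []
    epoch = 0  # number of finished training epochs
    while epoch < max_epochs:
        for phase, epochs in workflow:
            for _ in range(epochs):
                if phase == 'train':
                    if epoch >= max_epochs:
                        break  # training budget used up
                    epoch += 1
                schedule.append(phase)
    return schedule


schedule = iterate_workflow([('train', 2), ('val', 1)], max_epochs=4)
```

For this input the phases alternate as two training epochs followed by one validation epoch, repeated twice.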

save_checkpoint(out_dir, filename_tmpl='epoch_{}.pth', save_optimizer=True, meta=None, create_symlink=True)[source]

Save the checkpoint.

Parameters
  • out_dir (str) – The directory in which checkpoints are saved.

  • filename_tmpl (str, optional) – The checkpoint filename template, which contains a placeholder for the epoch number. Defaults to ‘epoch_{}.pth’.

  • save_optimizer (bool, optional) – Whether to save the optimizer to the checkpoint. Defaults to True.

  • meta (dict, optional) – The meta information to be saved in the checkpoint. Defaults to None.

  • create_symlink (bool, optional) – Whether to create a symlink “latest.pth” to point to the latest checkpoint. Defaults to True.

class mmcv.runner.EvalHook(dataloader, start=None, interval=1, by_epoch=True, save_best=None, rule=None, test_fn=None, greater_keys=None, less_keys=None, out_dir=None, file_client_args=None, **eval_kwargs)[source]

Non-distributed evaluation hook.

This hook regularly performs evaluation at a given interval when running in a non-distributed environment.

Parameters
  • dataloader (DataLoader) – A PyTorch dataloader, whose dataset has implemented evaluate function.

  • start (int | None, optional) – Evaluation starting epoch. It enables evaluation before the training starts if start <= the resuming epoch. If None, whether to evaluate is merely decided by interval. Default: None.

  • interval (int) – Evaluation interval. Default: 1.

  • by_epoch (bool) – Whether to perform evaluation by epoch or by iteration. If set to True, evaluation is performed by epoch; otherwise, by iteration. Default: True.

  • save_best (str, optional) – If a metric is specified, the best checkpoint according to that metric is tracked during evaluation. The best score and the best checkpoint path are saved in runner.meta['hook_msgs'] and are loaded again when resuming from a checkpoint. Options are the evaluation metrics on the test dataset, e.g., bbox_mAP and segm_mAP for bbox detection and instance segmentation, or AR@100 for proposal recall. If save_best is auto, the first key of the returned OrderedDict result will be used. Default: None.

  • rule (str | None, optional) – Comparison rule for the best score. If set to None, a reasonable rule is inferred: keys such as ‘acc’ and ‘top’ are compared with the ‘greater’ rule, and keys containing ‘loss’ are compared with the ‘less’ rule. Options are ‘greater’, ‘less’, None. Default: None.

  • test_fn (callable, optional) – Tests a model with samples from a dataloader and returns the test results. If None, the default test function mmcv.engine.single_gpu_test will be used. Default: None.

  • greater_keys (List[str] | None, optional) – Metric keys that will be inferred by ‘greater’ comparison rule. If None, _default_greater_keys will be used. (default: None)

  • less_keys (List[str] | None, optional) – Metric keys that will be inferred by ‘less’ comparison rule. If None, _default_less_keys will be used. (default: None)

  • out_dir (str, optional) – The root directory to save checkpoints. If not specified, runner.work_dir will be used by default. If specified, the out_dir will be the concatenation of out_dir and the last level directory of runner.work_dir. New in version 1.3.16.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None. New in version 1.3.16.

  • **eval_kwargs – Evaluation arguments fed into the evaluate function of the dataset.
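
The rule inference described for save_best and rule can be sketched as follows. The key lists here are a hypothetical subset chosen for illustration, not the hook's actual _default_greater_keys and _default_less_keys, and the function names are assumptions.

```python
# Hypothetical subsets of the default key lists (assumption).
GREATER_KEYS = ['acc', 'top', 'mAP']
LESS_KEYS = ['loss']


def infer_rule(key_indicator, greater_keys=GREATER_KEYS, less_keys=LESS_KEYS):
    """Infer 'greater' or 'less' from substrings of the metric name."""
    if any(key in key_indicator for key in greater_keys):
        return 'greater'
    if any(key in key_indicator for key in less_keys):
        return 'less'
    raise ValueError(f'cannot infer a rule for {key_indicator!r}; '
                     'pass rule explicitly')


def is_better(rule, new_score, best_score):
    """Compare a new score against the best one under the given rule."""
    if rule == 'greater':
        return new_score > best_score
    return new_score < best_score


rule_for_map = infer_rule('bbox_mAP')
rule_for_loss = infer_rule('val_loss')
```

When a metric name matches neither list, passing rule explicitly avoids the ambiguity.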

Note

If new arguments are added for EvalHook, tools/test.py and tools/eval_metric.py may be affected.

after_train_epoch(runner)[source]

Called after every training epoch to evaluate the results.

after_train_iter(runner)[source]

Called after every training iter to evaluate the results.

before_train_epoch(runner)[source]

Evaluate the model only at the start of training by epoch.

before_train_iter(runner)[source]

Evaluate the model only at the start of training by iteration.

evaluate(runner, results)[source]

Evaluate the results.

Parameters
  • runner (mmcv.Runner) – The underlying training runner.

  • results (list) – Output results.

class mmcv.runner.ExpLrUpdaterHook(gamma, **kwargs)[source]
class mmcv.runner.FixedLrUpdaterHook(**kwargs)[source]
class mmcv.runner.FlatCosineAnnealingLrUpdaterHook(start_percent=0.75, min_lr=None, min_lr_ratio=None, **kwargs)[source]

Flat + Cosine lr schedule.

Modified from https://github.com/fastai/fastai/blob/master/fastai/callback/schedule.py#L128

Parameters
  • start_percent (float) – The fraction of the total training steps after which annealing of the learning rate starts. The value should be in the range [0, 1). Default: 0.75.

  • min_lr (float, optional) – The minimum lr. Default: None.

  • min_lr_ratio (float, optional) – The ratio of minimum lr to the base lr. Either min_lr or min_lr_ratio should be specified. Default: None.
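
The flat-then-cosine shape can be sketched as a pure function of the training progress. This is an approximation for illustration, assuming the flat phase ends at round(max_steps * start_percent); the function name and arguments other than the three documented parameters are hypothetical.

```python
import math


def flat_cosine_lr(step, max_steps, base_lr, start_percent=0.75,
                   min_lr=None, min_lr_ratio=None):
    """Keep base_lr flat, then cosine-anneal it towards the minimum lr."""
    if (min_lr is None) == (min_lr_ratio is None):
        raise ValueError('specify exactly one of min_lr and min_lr_ratio')
    if not 0 <= start_percent < 1:
        raise ValueError('start_percent must be in [0, 1)')
    target_lr = min_lr if min_lr is not None else base_lr * min_lr_ratio
    start = round(max_steps * start_percent)
    if step < start:
        return base_lr  # flat phase
    factor = (step - start) / (max_steps - start)
    cos_out = math.cos(math.pi * factor) + 1
    return target_lr + 0.5 * (base_lr - target_lr) * cos_out


lr_flat = flat_cosine_lr(10, 100, base_lr=0.1, min_lr_ratio=0.01)
lr_end = flat_cosine_lr(100, 100, base_lr=0.1, min_lr_ratio=0.01)
```

With the default start_percent of 0.75, the learning rate stays at base_lr for the first three quarters of training and only then decays.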

class mmcv.runner.Fp16OptimizerHook(grad_clip=None, coalesce=True, bucket_size_mb=-1, loss_scale=512.0, distributed=True)[source]

FP16 optimizer hook (using PyTorch’s implementation).

If you are using PyTorch >= 1.6, torch.cuda.amp is used as the backend, to take care of the optimization procedure.

Parameters

loss_scale (float | str | dict) – Scale factor configuration. If loss_scale is a float, static loss scaling is used with the specified scale. If loss_scale is a string, it must be ‘dynamic’, and dynamic loss scaling is used. It can also be a dict containing arguments of GradScaler. Defaults to 512. For PyTorch >= 1.6, mmcv uses the official implementation of GradScaler. If you pass a dict as loss_scale to create the GradScaler, refer to https://pytorch.org/docs/stable/amp.html#torch.cuda.amp.GradScaler for the parameters.

Examples

>>> loss_scale = dict(
...     init_scale=65536.0,
...     growth_factor=2.0,
...     backoff_factor=0.5,
...     growth_interval=2000
... )
>>> optimizer_hook = Fp16OptimizerHook(loss_scale=loss_scale)
after_train_iter(runner)[source]

Backward optimization steps for Mixed Precision Training. For dynamic loss scaling, please refer to https://pytorch.org/docs/stable/amp.html#torch.cuda.amp.GradScaler.

  1. Scale the loss by a scale factor.

  2. Backward the loss to obtain the gradients.

  3. Unscale the optimizer’s gradient tensors.

  4. Call optimizer.step() and update scale factor.

  5. Save loss_scaler state_dict for resume purpose.
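
The scale, backward, unscale and step sequence above can be mimicked on a one-parameter toy problem in plain Python. This is only a conceptual sketch of static loss scaling with a hand-written gradient; it does not use torch.cuda.amp, and all names are hypothetical.

```python
import math


def scaled_sgd_step(weight, x, target, lr=0.1, loss_scale=512.0):
    """One SGD step on loss = (weight * x - target) ** 2 with loss scaling.

    Returns the new weight and whether the update was applied.
    """
    error = weight * x - target
    # Steps 1-2: the gradient of (loss_scale * loss) is the scaled gradient.
    scaled_grad = loss_scale * 2 * error * x
    # Step 3: unscale the gradient before it is used by the optimizer.
    grad = scaled_grad / loss_scale
    # Step 4: skip the update if the gradient is not finite.
    if not math.isfinite(grad):
        return weight, False
    return weight - lr * grad, True


new_weight, applied = scaled_sgd_step(weight=1.0, x=2.0, target=0.0)
```

In real mixed precision training the scaling matters because small gradients would otherwise underflow in fp16; in this float64 toy the scaled and unscaled paths give the same update.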

before_run(runner)[source]

Preparing steps before Mixed Precision Training.

copy_grads_to_fp32(fp16_net, fp32_weights)[source]

Copy gradients from fp16 model to fp32 weight copy.

copy_params_to_fp16(fp16_net, fp32_weights)[source]

Copy updated params from fp32 weight copy to fp16 model.

class mmcv.runner.GradientCumulativeFp16OptimizerHook(*args, **kwargs)[source]

Fp16 optimizer hook (using PyTorch’s implementation) that implements multi-iteration gradient accumulation.

If you are using PyTorch >= 1.6, torch.cuda.amp is used as the backend, to take care of the optimization procedure.

after_train_iter(runner)[source]

Backward optimization steps for Mixed Precision Training. For dynamic loss scaling, please refer to https://pytorch.org/docs/stable/amp.html#torch.cuda.amp.GradScaler.

  1. Scale the loss by a scale factor.

  2. Backward the loss to obtain the gradients.

  3. Unscale the optimizer’s gradient tensors.

  4. Call optimizer.step() and update scale factor.

  5. Save loss_scaler state_dict for resume purpose.

class mmcv.runner.GradientCumulativeOptimizerHook(cumulative_iters=1, **kwargs)[source]

Optimizer hook that implements multi-iteration gradient accumulation.

Parameters

cumulative_iters (int, optional) – Number of iterations over which gradients are accumulated. The optimizer steps once every cumulative_iters iterations. Defaults to 1.

Examples

>>> # Use cumulative_iters to simulate a large batch size
>>> # It is helpful when the hardware cannot handle a large batch size.
>>> loader = DataLoader(data, batch_size=64)
>>> optim_hook = GradientCumulativeOptimizerHook(cumulative_iters=4)
>>> # is roughly equivalent to
>>> loader = DataLoader(data, batch_size=256)
>>> optim_hook = OptimizerHook()
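
Why accumulating over several small batches approximates one large batch can be checked numerically on a toy linear model. This sketch assumes a mean-reduced loss and that each mini-batch loss is divided by cumulative_iters before its gradient is accumulated; the data and helper names are made up for illustration.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w * x - y) ** 2) with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Toy data: 8 samples standing in for one large batch.
xs = [float(i) for i in range(8)]
ys = [3.0 * x for x in xs]
w = 1.0

# Gradient computed on the full batch in one go.
full_grad = mse_grad(w, xs, ys)

# Gradient accumulated over 4 equal mini-batches, each scaled by 1/4.
cumulative_iters = 4
batch = len(xs) // cumulative_iters
accumulated = 0.0
for i in range(cumulative_iters):
    chunk = slice(i * batch, (i + 1) * batch)
    accumulated += mse_grad(w, xs[chunk], ys[chunk]) / cumulative_iters
```

The two gradients agree for this simple model. Layers whose behaviour depends on batch statistics, such as batch normalization, are one reason the equivalence is only approximate in practice.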
class mmcv.runner.InvLrUpdaterHook(gamma, power=1.0, **kwargs)[source]
class mmcv.runner.IterBasedRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, max_iters=None, max_epochs=None)[source]

Iteration-based Runner.

This runner trains models iteration by iteration.

register_training_hooks(lr_config, optimizer_config=None, checkpoint_config=None, log_config=None, momentum_config=None, custom_hooks_config=None)[source]

Register default hooks for iter-based training.

Checkpoint hook, optimizer stepper hook and logger hooks will be set to by_epoch=False by default.

Default hooks include:

  Hooks                   Priority
  LrUpdaterHook           VERY_HIGH (10)
  MomentumUpdaterHook     HIGH (30)
  OptimizerStepperHook    ABOVE_NORMAL (40)
  CheckpointSaverHook     NORMAL (50)
  IterTimerHook           LOW (70)
  LoggerHook(s)           VERY_LOW (90)
  CustomHook(s)           defaults to NORMAL (50)

If custom hooks have the same priority as default hooks, the custom hooks are triggered after the default hooks.
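
The ordering rule (lower value runs first, and a later-registered hook with an equal priority runs after the earlier ones) can be sketched with a plain list. This is an illustrative approximation; the function name and the 'MyCustomHook' entry are hypothetical.

```python
# Priority values taken from the table above.
PRIORITY = {
    'VERY_HIGH': 10, 'HIGH': 30, 'ABOVE_NORMAL': 40,
    'NORMAL': 50, 'LOW': 70, 'VERY_LOW': 90,
}


def register_hook(hooks, name, priority='NORMAL'):
    """Insert (name, value) so that hooks stay sorted by priority value.

    Scanning from the end places a new hook after existing hooks that
    have the same priority.
    """
    value = PRIORITY[priority]
    for i in range(len(hooks) - 1, -1, -1):
        if value >= hooks[i][1]:
            hooks.insert(i + 1, (name, value))
            return
    hooks.insert(0, (name, value))


hooks = []
register_hook(hooks, 'LrUpdaterHook', 'VERY_HIGH')
register_hook(hooks, 'OptimizerStepperHook', 'ABOVE_NORMAL')
register_hook(hooks, 'CheckpointSaverHook', 'NORMAL')
register_hook(hooks, 'LoggerHook', 'VERY_LOW')
register_hook(hooks, 'MyCustomHook')  # hypothetical custom hook, NORMAL
order = [name for name, _ in hooks]
```

Here the custom hook lands directly after CheckpointSaverHook, which shares its NORMAL priority, and before the logger hook.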

resume(checkpoint, resume_optimizer=True, map_location='default')[source]

Resume model from checkpoint.

Parameters
  • checkpoint (str) – Checkpoint to resume from.

  • resume_optimizer (bool, optional) – Whether to resume the optimizer(s) if the checkpoint file includes optimizer(s). Defaults to True.

  • map_location (str, optional) – Same as torch.load(). Defaults to ‘default’.

run(data_loaders, workflow, max_iters=None, **kwargs)[source]

Start running.

Parameters
  • data_loaders (list[DataLoader]) – Dataloaders for training and validation.

  • workflow (list[tuple]) – A list of (phase, iters) to specify the running order and iterations. E.g., [('train', 10000), ('val', 1000)] means running 10000 iterations for training and 1000 iterations for validation, iteratively.

save_checkpoint(out_dir, filename_tmpl='iter_{}.pth', meta=None, save_optimizer=True, create_symlink=True)[source]

Save checkpoint to file.

Parameters
  • out_dir (str) – Directory to save checkpoint files.

  • filename_tmpl (str, optional) – Checkpoint file template. Defaults to ‘iter_{}.pth’.

  • meta (dict, optional) – Metadata to be saved in checkpoint. Defaults to None.

  • save_optimizer (bool, optional) – Whether to save the optimizer. Defaults to True.

  • create_symlink (bool, optional) – Whether to create a symlink to the latest checkpoint file. Defaults to True.

class mmcv.runner.LoggerHook(interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]

Base class for logger hooks.

Parameters
  • interval (int) – Logging interval (every k iterations). Default 10.

  • ignore_last (bool) – Ignore the log of last iterations in each epoch if less than interval. Default True.

  • reset_flag (bool) – Whether to clear the output buffer after logging. Default False.

  • by_epoch (bool) – Whether EpochBasedRunner is used. Default True.

get_iter(runner, inner_iter=False)[source]

Get the current training iteration step.

static is_scalar(val, include_np=True, include_torch=True)[source]

Tell whether the input variable is a scalar.

Parameters
  • val – Input variable.

  • include_np (bool) – Whether to treat a 0-d np.ndarray as a scalar.

  • include_torch (bool) – Whether to treat a 0-d torch.Tensor as a scalar.

Returns

True or False.

Return type

bool

class mmcv.runner.LossScaler(init_scale=4294967296, mode='dynamic', scale_factor=2.0, scale_window=1000)[source]

Class that manages loss scaling in mixed precision training. It supports both dynamic and static modes.

The implementation follows https://github.com/NVIDIA/apex/blob/master/apex/fp16_utils/loss_scaler.py. Dynamic loss scaling is enabled by supplying mode='dynamic'. It is important to understand how LossScaler operates. Loss scaling is designed to combat the problem of underflowing gradients, which is encountered late in training when training fp16 networks. Dynamic loss scaling begins by attempting a very high loss scale. Ironically, this may result in overflowing gradients. If overflowing gradients are encountered, FP16_Optimizer skips the update step for that iteration/minibatch, and LossScaler adjusts the loss scale to a lower value. If a certain number of iterations pass without overflowing gradients being detected, LossScaler increases the loss scale once more. In this way LossScaler attempts to “ride the edge” of always using the highest loss scale possible without incurring overflow.

Parameters
  • init_scale (float) – Initial loss scale value, default: 2**32.

  • scale_factor (float) – Factor used when adjusting the loss scale. Default: 2.

  • mode (str) – Loss scaling mode. ‘dynamic’ or ‘static’

  • scale_window (int) – Number of consecutive iterations without an overflow to wait before increasing the loss scale. Default: 1000.
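
The dynamic behaviour described above (back off on overflow, grow again after scale_window clean iterations) can be sketched as a small state machine. This is a simplified illustration rather than the class's actual code; the class name and the clean_iters counter are assumptions.

```python
class TinyLossScaler:
    """Simplified sketch of dynamic/static loss scale bookkeeping."""

    def __init__(self, init_scale=2.0 ** 32, mode='dynamic',
                 scale_factor=2.0, scale_window=1000):
        assert mode in ('dynamic', 'static')
        self.cur_scale = init_scale
        self.mode = mode
        self.scale_factor = scale_factor
        self.scale_window = scale_window
        self.clean_iters = 0  # consecutive iterations without overflow

    def update_scale(self, overflow):
        """Adjust the scale after one iteration."""
        if self.mode != 'dynamic':
            return  # static mode never changes the scale
        if overflow:
            # Back off, but never go below a scale of 1.
            self.cur_scale = max(self.cur_scale / self.scale_factor, 1.0)
            self.clean_iters = 0
        else:
            self.clean_iters += 1
            if self.clean_iters % self.scale_window == 0:
                self.cur_scale *= self.scale_factor


scaler = TinyLossScaler(init_scale=8.0, scale_window=2)
scaler.update_scale(overflow=True)    # scale backs off
after_overflow = scaler.cur_scale
scaler.update_scale(overflow=False)
scaler.update_scale(overflow=False)   # two clean iterations: scale grows
after_recovery = scaler.cur_scale
```

A short scale_window is used here only so the growth step is visible after two iterations.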

has_overflow(params)[source]

Check if params contain overflow.

load_state_dict(state_dict)[source]

Loads the loss_scaler state dict.

Parameters

state_dict (dict) – scaler state.

state_dict()[source]

Returns the state of the scaler as a dict.

update_scale(overflow)[source]

Update the current loss scale value when overflow happens.

class mmcv.runner.LrUpdaterHook(by_epoch=True, warmup=None, warmup_iters=0, warmup_ratio=0.1, warmup_by_epoch=False)[source]

LR Scheduler in MMCV.

Parameters
  • by_epoch (bool) – LR changes epoch by epoch

  • warmup (string) – Type of warmup used. It can be None (no warmup), ‘constant’, ‘linear’ or ‘exp’.

  • warmup_iters (int) – The number of iterations or epochs that warmup lasts.

  • warmup_ratio (float) – The LR used at the beginning of warmup equals warmup_ratio * initial_lr.

  • warmup_by_epoch (bool) – If True, warmup_iters is the number of epochs that warmup lasts; otherwise it is the number of iterations that warmup lasts.
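
The three warmup types can be sketched as functions of the current iteration. This is a hedged approximation of commonly used warmup formulas, written so that 'linear' and 'exp' start at warmup_ratio * lr and reach the regular lr at warmup_iters; the function name is hypothetical.

```python
def warmup_lr(regular_lr, cur_iter, warmup, warmup_iters, warmup_ratio=0.1):
    """Learning rate during warmup for the given warmup type."""
    if warmup == 'constant':
        # A fixed fraction of the regular lr for the whole warmup.
        return regular_lr * warmup_ratio
    if warmup == 'linear':
        # Linear ramp from warmup_ratio * lr up to lr.
        k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
        return regular_lr * (1 - k)
    if warmup == 'exp':
        # Geometric ramp from warmup_ratio * lr up to lr.
        return regular_lr * warmup_ratio ** (1 - cur_iter / warmup_iters)
    raise ValueError(f'unknown warmup type: {warmup!r}')


linear_start = warmup_lr(0.1, 0, 'linear', warmup_iters=500)
linear_end = warmup_lr(0.1, 500, 'linear', warmup_iters=500)
exp_start = warmup_lr(0.1, 0, 'exp', warmup_iters=500)
```

After warmup_iters iterations (or epochs, when warmup_by_epoch is True) the regular schedule takes over.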

class mmcv.runner.MlflowLoggerHook(exp_name=None, tags=None, log_model=True, interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]

Class to log metrics and (optionally) a trained model to MLflow.

It requires MLflow to be installed.

Parameters
  • exp_name (str, optional) – Name of the experiment to be used. Default None. If not None, set the active experiment. If experiment does not exist, an experiment with provided name will be created.

  • tags (Dict[str], optional) – Tags for the current run. Default None. If not None, set tags for the current run.

  • log_model (bool, optional) – Whether to log an MLflow artifact. Default True. If True, log runner.model as an MLflow artifact for the current run.

  • interval (int) – Logging interval (every k iterations). Default: 10.

  • ignore_last (bool) – Ignore the log of last iterations in each epoch if less than interval. Default: True.

  • reset_flag (bool) – Whether to clear the output buffer after logging. Default: False.

  • by_epoch (bool) – Whether EpochBasedRunner is used. Default: True.

class mmcv.runner.ModuleDict(modules=None, init_cfg=None)[source]

ModuleDict in openmmlab.

Parameters
  • modules (dict, optional) – A mapping (dictionary) of (string: module) or an iterable of key-value pairs of type (string, module).

  • init_cfg (dict, optional) – Initialization config dict.

class mmcv.runner.ModuleList(modules=None, init_cfg=None)[source]

ModuleList in openmmlab.

Parameters
  • modules (iterable, optional) – an iterable of modules to add.

  • init_cfg (dict, optional) – Initialization config dict.

class mmcv.runner.NeptuneLoggerHook(init_kwargs=None, interval=10, ignore_last=True, reset_flag=True, with_step=True, by_epoch=True)[source]

Class to log metrics to NeptuneAI.

It requires Neptune to be installed.

Parameters
  • init_kwargs (dict) –

    A dict containing the following initialization keys:

    • project (str): Name of a project in a form of namespace/project_name. If None, the value of NEPTUNE_PROJECT environment variable will be taken.

    • api_token (str): User’s API token. If None, the value of NEPTUNE_API_TOKEN environment variable will be taken. Note: It is strongly recommended to use NEPTUNE_API_TOKEN environment variable rather than placing your API token in plain text in your source code.

    • name (str, optional, default is ‘Untitled’): Editable name of the run. Name is displayed in the run’s Details and in Runs table as a column.

    Check https://docs.neptune.ai/api-reference/neptune#init for more init arguments.

  • interval (int) – Logging interval (every k iterations). Default: 10.

  • ignore_last (bool) – Ignore the log of last iterations in each epoch if less than interval. Default: True.

  • reset_flag (bool) – Whether to clear the output buffer after logging. Default: True.

  • with_step (bool) – If True, the step will be logged from self.get_iters. Otherwise, step will not be logged. Default: True.

  • by_epoch (bool) – Whether EpochBasedRunner is used. Default: True.

class mmcv.runner.OneCycleLrUpdaterHook(max_lr, total_steps=None, pct_start=0.3, anneal_strategy='cos', div_factor=25, final_div_factor=10000.0, three_phase=False, **kwargs)[source]

One Cycle LR Scheduler.

The 1cycle learning rate policy changes the learning rate after every batch. The policy is described in https://arxiv.org/pdf/1708.07120.pdf

Parameters
  • max_lr (float or list) – Upper learning rate boundaries in the cycle for each parameter group.

  • total_steps (int, optional) – The total number of steps in the cycle. Note that if a value is not provided here, it will be the max_iter of runner. Default: None.

  • pct_start (float) – The percentage of the cycle (in number of steps) spent increasing the learning rate. Default: 0.3

  • anneal_strategy (str) – {‘cos’, ‘linear’} Specifies the annealing strategy: ‘cos’ for cosine annealing, ‘linear’ for linear annealing. Default: ‘cos’

  • div_factor (float) – Determines the initial learning rate via initial_lr = max_lr/div_factor Default: 25

  • final_div_factor (float) – Determines the minimum learning rate via min_lr = initial_lr/final_div_factor Default: 1e4

  • three_phase (bool) – If three_phase is True, use a third phase of the schedule to annihilate the learning rate according to final_div_factor instead of modifying the second phase (the first two phases will be symmetrical about the step indicated by pct_start). Default: False
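
The relations initial_lr = max_lr/div_factor and min_lr = initial_lr/final_div_factor are easy to check by hand. The sketch below only evaluates these two formulas with the default factors; the function name and the max_lr value of 0.01 are assumptions for illustration.

```python
def one_cycle_bounds(max_lr, div_factor=25, final_div_factor=1e4):
    """Return (initial_lr, min_lr) implied by the one cycle parameters."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    return initial_lr, min_lr


# Hypothetical peak learning rate.
initial_lr, min_lr = one_cycle_bounds(max_lr=0.01)
```

With max_lr = 0.01 and the defaults, the schedule starts near 4e-4, peaks at 0.01 and ends near 4e-8.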

class mmcv.runner.OneCycleMomentumUpdaterHook(base_momentum=0.85, max_momentum=0.95, pct_start=0.3, anneal_strategy='cos', three_phase=False, **kwargs)[source]

OneCycle momentum Scheduler.

This momentum scheduler is usually used together with the OneCycleLrUpdater to improve performance.

Parameters
  • base_momentum (float or list) – Lower momentum boundaries in the cycle for each parameter group. Note that momentum is cycled inversely to learning rate; at the peak of a cycle, momentum is ‘base_momentum’ and learning rate is ‘max_lr’. Default: 0.85

  • max_momentum (float or list) – Upper momentum boundaries in the cycle for each parameter group. Functionally, it defines the cycle amplitude (max_momentum - base_momentum). Note that momentum is cycled inversely to learning rate; at the start of a cycle, momentum is ‘max_momentum’ and learning rate is ‘base_lr’ Default: 0.95

  • pct_start (float) – The percentage of the cycle (in number of steps) spent increasing the learning rate. Default: 0.3

  • anneal_strategy (str) – {‘cos’, ‘linear’} Specifies the annealing strategy: ‘cos’ for cosine annealing, ‘linear’ for linear annealing. Default: ‘cos’

  • three_phase (bool) – If three_phase is True, use a third phase of the schedule to annihilate the learning rate according to final_div_factor instead of modifying the second phase (the first two phases will be symmetrical about the step indicated by pct_start). Default: False

class mmcv.runner.OptimizerHook(grad_clip=None, detect_anomalous_params=False)[source]

A hook that contains custom operations for the optimizer.

Parameters
  • grad_clip (dict, optional) – A config dict to control the clip_grad. Default: None.

  • detect_anomalous_params (bool) –

    This option is only for debugging and slows down training. It detects anomalous parameters that are not included in the computational graph with loss as the root. There are two cases:

    • Parameters were not used during forward pass.

    • Parameters were not used to produce loss.

    Default: False.

class mmcv.runner.PaviLoggerHook(init_kwargs=None, add_graph=False, add_last_ckpt=False, interval=10, ignore_last=True, reset_flag=False, by_epoch=True, img_key='img_info')[source]

Class to visualize the model and log metrics (for internal use).

Parameters
  • init_kwargs (dict) – A dict containing the initialization keys.

  • add_graph (bool) – Whether to visualize the model. Default: False.

  • add_last_ckpt (bool) – Whether to save checkpoint after run. Default: False.

  • interval (int) – Logging interval (every k iterations). Default: 10.

  • ignore_last (bool) – Ignore the log of last iterations in each epoch if less than interval. Default: True.

  • reset_flag (bool) – Whether to clear the output buffer after logging. Default: False.

  • by_epoch (bool) – Whether EpochBasedRunner is used. Default: True.

  • img_key (string) – Get image data from Dataset. Default: ‘img_info’.

get_step(runner)[source]

Get the total training step/epoch.

class mmcv.runner.PolyLrUpdaterHook(power=1.0, min_lr=0.0, **kwargs)[source]
class mmcv.runner.Priority(value)[source]

Hook priority levels.

  Level           Value
  HIGHEST         0
  VERY_HIGH       10
  HIGH            30
  ABOVE_NORMAL    40
  NORMAL          50
  BELOW_NORMAL    60
  LOW             70
  VERY_LOW        90
  LOWEST          100

class mmcv.runner.Runner(*args, **kwargs)[source]

Deprecated name of EpochBasedRunner.

class mmcv.runner.Sequential(*args, init_cfg=None)[source]

Sequential module in openmmlab.

Parameters

init_cfg (dict, optional) – Initialization config dict.

class mmcv.runner.StepLrUpdaterHook(step, gamma=0.1, min_lr=None, **kwargs)[source]

Step LR scheduler with min_lr clipping.

Parameters
  • step (int | list[int]) – Step to decay the LR. If an int value is given, regard it as the decay interval. If a list is given, decay LR at these steps.

  • gamma (float, optional) – Decay LR ratio. Default: 0.1.

  • min_lr (float, optional) – Minimum LR value to keep. If LR after decay is lower than min_lr, it will be clipped to this value. If None is given, we don’t perform lr clipping. Default: None.
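
The decay rule described by these parameters can be sketched in plain Python. This is a minimal illustration, not the hook's actual code: the function name step_lr and the arguments base_lr and progress (the current epoch or iteration index) are hypothetical names chosen for the example.

```python
from bisect import bisect_right


def step_lr(base_lr, progress, step, gamma=0.1, min_lr=None):
    """Return the decayed LR at `progress` (hypothetical helper)."""
    if isinstance(step, int):
        # An int is treated as the decay interval.
        exp = progress // step
    else:
        # A list holds milestones; count how many have been reached.
        exp = bisect_right(sorted(step), progress)
    lr = base_lr * gamma ** exp
    if min_lr is not None:
        # Clip the LR so that it never falls below min_lr.
        lr = max(lr, min_lr)
    return lr


print(step_lr(1.0, 9, [2, 5], gamma=0.5))
print(step_lr(1.0, 10, 1, gamma=0.1, min_lr=1e-3))
```

With min_lr=None the clipping branch is skipped, which corresponds to the "no clipping" behaviour described for the default.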

class mmcv.runner.StepMomentumUpdaterHook(step, gamma=0.5, min_momentum=None, **kwargs)[源代码]

Step momentum scheduler with min value clipping.

参数
  • step (int | list[int]) – Step to decay the momentum. If an int value is given, regard it as the decay interval. If a list is given, decay momentum at these steps.

  • gamma (float, optional) – Decay momentum ratio. Default: 0.5.

  • min_momentum (float, optional) – Minimum momentum value to keep. If momentum after decay is lower than this value, it will be clipped accordingly. If None is given, we don’t perform momentum clipping. Default: None.

class mmcv.runner.SyncBuffersHook(distributed=True)[源代码]

Synchronize model buffers such as running_mean and running_var in BN at the end of each epoch.

参数

distributed (bool) – Whether distributed training is used. It is effective only for distributed training. Defaults to True.

after_epoch(runner)[源代码]

All-reduce model buffers at the end of each epoch.

class mmcv.runner.TensorboardLoggerHook(log_dir=None, interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[源代码]

Class to log metrics to Tensorboard.

参数
  • log_dir (string) – Save directory location. Default: None. If default values are used, directory location is runner.work_dir/tf_logs.

  • interval (int) – Logging interval (every k iterations). Default: 10.

  • ignore_last (bool) – Ignore the log of last iterations in each epoch if less than interval. Default: True.

  • reset_flag (bool) – Whether to clear the output buffer after logging. Default: False.

  • by_epoch (bool) – Whether EpochBasedRunner is used. Default: True.

class mmcv.runner.TextLoggerHook(by_epoch=True, interval=10, ignore_last=True, reset_flag=False, interval_exp_name=1000, out_dir=None, out_suffix=('.log.json', '.log', '.py'), keep_local=True, file_client_args=None)[源代码]

Logger hook in text.

In this logger hook, the information will be printed on the terminal and saved in a JSON file.

参数
  • by_epoch (bool, optional) – Whether EpochBasedRunner is used. Default: True.

  • interval (int, optional) – Logging interval (every k iterations). Default: 10.

  • ignore_last (bool, optional) – Ignore the log of last iterations in each epoch if less than interval. Default: True.

  • reset_flag (bool, optional) – Whether to clear the output buffer after logging. Default: False.

  • interval_exp_name (int, optional) – Logging interval for experiment name. This feature is to help users conveniently get the experiment information from screen or log file. Default: 1000.

  • out_dir (str, optional) – Logs are saved in runner.work_dir by default. If out_dir is specified, logs will be copied to a new directory which is the concatenation of out_dir and the last level directory of runner.work_dir. Default: None. New in version 1.3.16.

  • out_suffix (str or tuple[str], optional) – Those filenames ending with out_suffix will be copied to out_dir. Default: (‘.log.json’, ‘.log’, ‘.py’). New in version 1.3.16.

  • keep_local (bool, optional) – Whether to keep local log when out_dir is specified. If False, the local log will be removed. Default: True. New in version 1.3.16.

  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None. New in version 1.3.16.
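
The way out_dir and out_suffix interact can be illustrated with a small sketch. It assumes local paths and uses only os.path; the helper names resolve_out_dir and should_copy are hypothetical and are not part of mmcv.

```python
import os.path as osp


def resolve_out_dir(out_dir, work_dir):
    """Join out_dir with the last level directory of work_dir."""
    basename = osp.basename(work_dir.rstrip(osp.sep))
    return osp.join(out_dir, basename)


def should_copy(filename, out_suffix=('.log.json', '.log', '.py')):
    """Check whether a file name ends with one of the suffixes."""
    # str.endswith accepts either a single string or a tuple of strings.
    return filename.endswith(out_suffix)


work_dir = osp.join('work_dirs', 'exp1') + osp.sep
print(resolve_out_dir('backup', work_dir))
print(should_copy('20220101.log.json'), should_copy('latest.pth'))
```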

class mmcv.runner.WandbLoggerHook(init_kwargs=None, interval=10, ignore_last=True, reset_flag=False, commit=True, by_epoch=True, with_step=True, log_artifact=True, out_suffix=('.log.json', '.log', '.py'))[源代码]

Class to log metrics with wandb.

It requires wandb to be installed.

参数
  • init_kwargs (dict) – A dict containing the initialization keys. Check https://docs.wandb.ai/ref/python/init for more init arguments.

  • interval (int) – Logging interval (every k iterations). Default: 10.

  • ignore_last (bool) – Ignore the log of last iterations in each epoch if less than interval. Default: True.

  • reset_flag (bool) – Whether to clear the output buffer after logging. Default: False.

  • commit (bool) – Save the metrics dict to the wandb server and increment the step. If False, wandb.log only updates the current metrics dict with the row argument, and metrics won’t be saved until wandb.log is called with commit=True. Default: True.

  • by_epoch (bool) – Whether EpochBasedRunner is used. Default: True.

  • with_step (bool) – If True, the step will be logged from self.get_iters. Otherwise, step will not be logged. Default: True.

  • log_artifact (bool) – If True, artifacts in {work_dir} will be uploaded to wandb after training ends. Default: True. New in version 1.4.3.

  • out_suffix (str or tuple[str], optional) – Those filenames ending with out_suffix will be uploaded to wandb. Default: (‘.log.json’, ‘.log’, ‘.py’). New in version 1.4.3.

mmcv.runner.allreduce_grads(params, coalesce=True, bucket_size_mb=-1)[source]

Allreduce gradients.

Parameters
  • params (list[torch.Parameters]) – List of parameters of a model.

  • coalesce (bool, optional) – Whether to allreduce parameters as a whole. Defaults to True.

  • bucket_size_mb (int, optional) – Size of bucket, the unit is MB. Defaults to -1.

mmcv.runner.allreduce_params(params, coalesce=True, bucket_size_mb=-1)[source]

Allreduce parameters.

Parameters
  • params (list[torch.Parameters]) – List of parameters or buffers of a model.

  • coalesce (bool, optional) – Whether to allreduce parameters as a whole. Defaults to True.

  • bucket_size_mb (int, optional) – Size of bucket, the unit is MB. Defaults to -1.

mmcv.runner.auto_fp16(apply_to=None, out_fp32=False)[源代码]

Decorator to enable fp16 training automatically.

This decorator is useful when you write custom modules and want to support mixed precision training. If input arguments are fp32 tensors, they will be converted to fp16 automatically. Arguments other than fp32 tensors are ignored. If you are using PyTorch >= 1.6, torch.cuda.amp is used as the backend; otherwise, the original mmcv implementation will be adopted.

参数
  • apply_to (Iterable, optional) – The argument names to be converted. None indicates all arguments.

  • out_fp32 (bool) – Whether to convert the output back to fp32.

Example

>>> import torch.nn as nn
>>> from mmcv.runner import auto_fp16
>>> class MyModule1(nn.Module):
...
...     # Convert x and y to fp16
...     @auto_fp16()
...     def forward(self, x, y):
...         pass
>>> class MyModule2(nn.Module):
...
...     # Convert pred to fp16
...     @auto_fp16(apply_to=('pred', ))
...     def do_something(self, pred, others):
...         pass
mmcv.runner.force_fp32(apply_to=None, out_fp16=False)[源代码]

Decorator to convert input arguments to fp32 in force.

This decorator is useful when you write custom modules and want to support mixed precision training. If there are some inputs that must be processed in fp32 mode, then this decorator can handle it. If input arguments are fp16 tensors, they will be converted to fp32 automatically. Arguments other than fp16 tensors are ignored. If you are using PyTorch >= 1.6, torch.cuda.amp is used as the backend; otherwise, the original mmcv implementation will be adopted.

参数
  • apply_to (Iterable, optional) – The argument names to be converted. None indicates all arguments.

  • out_fp16 (bool) – Whether to convert the output back to fp16.

Example

>>> import torch.nn as nn
>>> from mmcv.runner import force_fp32
>>> class MyModule1(nn.Module):
...
...     # Convert x and y to fp32
...     @force_fp32()
...     def loss(self, x, y):
...         pass
>>> class MyModule2(nn.Module):
...
...     # Convert pred to fp32
...     @force_fp32(apply_to=('pred', ))
...     def post_process(self, pred, others):
...         pass
mmcv.runner.get_host_info()[源代码]

Get hostname and username.

Returns an empty string if an exception is raised, e.g. getpass.getuser() can fail inside a Docker container.

mmcv.runner.get_priority(priority)[源代码]

Get priority value.

参数

priority (int or str or Priority) – Priority.

返回

The priority value.

返回类型

int
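
The lookup from the three accepted input types to an integer can be sketched with the standard enum module. The values below are copied from the Priority table above; the error handling is an assumption made for this illustration and may differ from the library's behaviour.

```python
from enum import Enum


class Priority(Enum):
    """Hook priority levels (values taken from the table above)."""
    HIGHEST = 0
    VERY_HIGH = 10
    HIGH = 30
    ABOVE_NORMAL = 40
    NORMAL = 50
    BELOW_NORMAL = 60
    LOW = 70
    VERY_LOW = 90
    LOWEST = 100


def get_priority(priority):
    """Convert an int, a str or a Priority member to an int value."""
    if isinstance(priority, int):
        # Assumed range check: plain integers must lie in [0, 100].
        if priority < 0 or priority > 100:
            raise ValueError('priority must be between 0 and 100')
        return priority
    if isinstance(priority, Priority):
        return priority.value
    if isinstance(priority, str):
        # Names are matched case-insensitively in this sketch.
        return Priority[priority.upper()].value
    raise TypeError('priority must be an int, a str or a Priority')


print(get_priority('normal'), get_priority(Priority.LOW), get_priority(15))
```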

mmcv.runner.load_checkpoint(model, filename, map_location=None, strict=False, logger=None, revise_keys=[('^module\\.', '')])[源代码]

Load checkpoint from a file or URI.

参数
  • model (Module) – Module to load checkpoint.

  • filename (str) – Accept local filepath, URL, torchvision://xxx, open-mmlab://xxx. Please refer to docs/model_zoo.md for details.

  • map_location (str) – Same as torch.load().

  • strict (bool) – Whether to strictly enforce that the parameters of the model and the checkpoint match.

  • logger (logging.Logger or None) – The logger for error message.

  • revise_keys (list) – A list of customized keywords to modify the state_dict in checkpoint. Each item is a (pattern, replacement) pair of the regular expression operations. Default: strip the prefix ‘module.’ by [(r’^module\.’, ‘’)].

返回

The loaded checkpoint.

返回类型

dict or OrderedDict
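
How a revise_keys list rewrites checkpoint keys can be shown with re.sub on a plain dict. This is a sketch of the idea only: the helper name revise_state_dict_keys is hypothetical, and strings stand in for tensors so that the example runs without PyTorch.

```python
import re
from collections import OrderedDict


def revise_state_dict_keys(state_dict, revise_keys=((r'^module\.', ''),)):
    """Apply each (pattern, replacement) pair to every key in turn."""
    for pattern, replacement in revise_keys:
        state_dict = OrderedDict(
            (re.sub(pattern, replacement, key), value)
            for key, value in state_dict.items())
    return state_dict


ckpt = OrderedDict([('module.conv.weight', 'w'), ('head.module.bias', 'b')])
# Only a leading 'module.' is stripped because the pattern is anchored.
print(list(revise_state_dict_keys(ckpt)))
```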

mmcv.runner.load_state_dict(module, state_dict, strict=False, logger=None)[源代码]

Load state_dict to a module.

This method is modified from torch.nn.Module.load_state_dict(). Default value for strict is set to False and the message for param mismatch will be shown even if strict is False.

参数
  • module (Module) – Module that receives the state_dict.

  • state_dict (OrderedDict) – Weights.

  • strict (bool) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: False.

  • logger (logging.Logger, optional) – Logger to log the error message. If not specified, print function will be used.

mmcv.runner.obj_from_dict(info, parent=None, default_args=None)[源代码]

Initialize an object from dict.

The dict must contain the key “type”, which indicates the object type. It can be either a string or a type, such as “list” or list. The remaining fields are treated as the arguments for constructing the object.

参数
  • info (dict) – Object types and arguments.

  • parent (module) – Module which may contain the expected object classes.

  • default_args (dict, optional) – Default arguments for initializing the object.

返回

Object built from the dict.

返回类型

any type

mmcv.runner.save_checkpoint(model, filename, optimizer=None, meta=None, file_client_args=None)[源代码]

Save checkpoint to file.

The checkpoint will have 3 fields: meta, state_dict and optimizer. By default meta will contain version and time info.

参数
  • model (Module) – Module whose params are to be saved.

  • filename (str) – Checkpoint filename.

  • optimizer (Optimizer, optional) – Optimizer to be saved.

  • meta (dict, optional) – Metadata to be saved in checkpoint.

  • file_client_args (dict, optional) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Default: None. New in version 1.3.16.

mmcv.runner.set_random_seed(seed, deterministic=False, use_rank_shift=False)[源代码]

Set random seed.

参数
  • seed (int) – Seed to be used.

  • deterministic (bool) – Whether to set the deterministic option for CUDNN backend, i.e., set torch.backends.cudnn.deterministic to True and torch.backends.cudnn.benchmark to False. Default: False.

  • use_rank_shift (bool) – Whether to add the rank number to the random seed so that different ranks use different random seeds. Default: False.

mmcv.runner.weights_to_cpu(state_dict)[源代码]

Copy a model state_dict to cpu.

参数

state_dict (OrderedDict) – Model weights on GPU.

Returns

Model weights on CPU.

返回类型

OrderedDict

mmcv.runner.wrap_fp16_model(model)[源代码]

Wrap the FP32 model to FP16.

If you are using PyTorch >= 1.6, torch.cuda.amp is used as the backend, otherwise, original mmcv implementation will be adopted.

For PyTorch >= 1.6, this function will:

  1. Set the fp16 flag inside the model to True.

Otherwise:

  1. Convert the FP32 model to FP16.

  2. Keep some necessary layers in FP32, e.g., normalization layers.

  3. Set the fp16_enabled flag inside the model to True.

参数

model (nn.Module) – Model in FP32.

engine

mmcv.engine.collect_results_cpu(result_part, size, tmpdir=None)[源代码]

Collect results under cpu mode.

In cpu mode, this function saves the results from different gpus to tmpdir and collects them on the rank 0 worker.

参数
  • result_part (list) – Result list containing result parts to be collected.

  • size (int) – Size of the results, commonly equal to length of the results.

  • tmpdir (str | None) – Temporary directory in which the collected results are stored. If set to None, a random temporary directory will be created for it.

返回

The collected results.

返回类型

list
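
The role of the size argument can be illustrated with a sketch of the merge step on rank 0. It rests on two assumptions that the text above does not state: each rank handles samples rank, rank + world_size, and so on, and the per-rank lists are padded to equal length. The helper name merge_result_parts is hypothetical.

```python
def merge_result_parts(part_list, size):
    """Interleave per-rank result lists and drop padded entries."""
    ordered = []
    # zip(*part_list) yields one result from every rank per step,
    # which restores the original sample order under the
    # interleaved-sampling assumption.
    for group in zip(*part_list):
        ordered.extend(group)
    # Entries beyond `size` are padding added to even out the ranks.
    return ordered[:size]


# Two ranks, five real samples; the last entry of rank 1 is padding.
parts = [['r0', 'r2', 'r4'], ['r1', 'r3', 'pad']]
print(merge_result_parts(parts, size=5))
```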

mmcv.engine.collect_results_gpu(result_part, size)[源代码]

Collect results under gpu mode.

On gpu mode, this function will encode results to gpu tensors and use gpu communication for results collection.

参数
  • result_part (list) – Result list containing result parts to be collected.

  • size (int) – Size of the results, commonly equal to length of the results.

返回

The collected results.

返回类型

list

mmcv.engine.multi_gpu_test(model, data_loader, tmpdir=None, gpu_collect=False)[源代码]

Test model with multiple gpus.

This method tests a model with multiple gpus and collects the results under two different modes: gpu and cpu. With gpu_collect=True, it encodes results to gpu tensors and uses gpu communication for results collection. In cpu mode, it saves the results from different gpus to tmpdir and collects them on the rank 0 worker.

参数
  • model (nn.Module) – Model to be tested.

  • data_loader (torch.utils.data.DataLoader) – PyTorch data loader.

  • tmpdir (str) – Path of directory to save the temporary results from different gpus under cpu mode.

  • gpu_collect (bool) – Option to use either gpu or cpu to collect results.

返回

The prediction results.

返回类型

list

mmcv.engine.single_gpu_test(model, data_loader)[源代码]

Test model with a single gpu.

This method tests model with a single gpu and displays test progress bar.

参数
  • model (nn.Module) – Model to be tested.

  • data_loader (torch.utils.data.DataLoader) – PyTorch data loader.

返回

The prediction results.

返回类型

list

ops

class mmcv.ops.BorderAlign(pool_size)[源代码]

Border align pooling layer.

Applies border_align over the input feature based on predicted bboxes. The details were described in the paper BorderDet: Border Feature for Dense Object Detection.

For each border line (e.g. top, left, bottom or right) of each box, border_align does the following:

  1. Uniformly samples pool_size + 1 positions on this line, including the start and end points.

  2. Computes the corresponding features at these points by bilinear interpolation.

  3. Applies max pooling over all pool_size + 1 positions to compute the pooled feature.

参数

pool_size (int) – number of positions sampled over the boxes’ borders (e.g. top, bottom, left, right).

forward(input, boxes)[源代码]
参数
  • input – Features with shape [N,4C,H,W]. Channels ranged in [0,C), [C,2C), [2C,3C), [3C,4C) represent the top, left, bottom, right features respectively.

  • boxes – Boxes with shape [N,H*W,4]. Coordinate format (x1,y1,x2,y2).

返回

Pooled features with shape [N,C,H*W,4]. The order is (top,left,bottom,right) for the last dimension.

返回类型

torch.Tensor

class mmcv.ops.CARAFE(kernel_size, group_size, scale_factor)[源代码]

CARAFE: Content-Aware ReAssembly of FEatures

Please refer to CARAFE: Content-Aware ReAssembly of FEatures for more details.

参数
  • kernel_size (int) – reassemble kernel size

  • group_size (int) – reassemble group size

  • scale_factor (int) – upsample ratio

返回

upsampled feature map

forward(features, masks)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.CARAFENaive(kernel_size, group_size, scale_factor)[源代码]
forward(features, masks)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.CARAFEPack(channels, scale_factor, up_kernel=5, up_group=1, encoder_kernel=3, encoder_dilation=1, compressed_channels=64)[源代码]

A unified package of CARAFE upsampler that contains: 1) channel compressor 2) content encoder 3) CARAFE op.

Official implementation of ICCV 2019 paper CARAFE: Content-Aware ReAssembly of FEatures.

参数
  • channels (int) – input feature channels

  • scale_factor (int) – upsample ratio

  • up_kernel (int) – kernel size of CARAFE op

  • up_group (int) – group size of CARAFE op

  • encoder_kernel (int) – kernel size of content encoder

  • encoder_dilation (int) – dilation of content encoder

  • compressed_channels (int) – output channels of channels compressor

返回

upsampled feature map

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

mmcv.ops.Conv2d

alias of mmcv.ops.deprecated_wrappers.Conv2d_deprecated

mmcv.ops.ConvTranspose2d

alias of mmcv.ops.deprecated_wrappers.ConvTranspose2d_deprecated

class mmcv.ops.CornerPool(mode)[源代码]

Corner Pooling.

Corner Pooling is a new type of pooling layer that helps a convolutional network better localize corners of bounding boxes.

Please refer to CornerNet: Detecting Objects as Paired Keypoints for more details.

Code is modified from https://github.com/princeton-vl/CornerNet-Lite.

参数

mode (str) –

Pooling orientation for the pooling layer

  • ’bottom’: Bottom Pooling

  • ’left’: Left Pooling

  • ’right’: Right Pooling

  • ’top’: Top Pooling

返回

Feature map after pooling.

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.Correlation(kernel_size: int = 1, max_displacement: int = 1, stride: int = 1, padding: int = 0, dilation: int = 1, dilation_patch: int = 1)[源代码]

Correlation operator

This correlation operator works for optical flow correlation computation.

There are two batched tensors with shape \((N, C, H, W)\), and the correlation output’s shape is \((N, max\_displacement \times 2 + 1, max\_displacement \times 2 + 1, H_{out}, W_{out})\)

where

\[H_{out} = \left\lfloor\frac{H_{in} + 2 \times padding - dilation \times (kernel\_size - 1) - 1} {stride} + 1\right\rfloor\]
\[W_{out} = \left\lfloor\frac{W_{in} + 2 \times padding - dilation \times (kernel\_size - 1) - 1} {stride} + 1\right\rfloor\]

the correlation item \((N_i, dy, dx)\) is formed by taking the sliding window convolution between input1 and shifted input2,

\[Corr(N_i, dy, dx) = \sum_{c=0}^{C-1} input1(N_i, c) \star \mathcal{S}(input2(N_i, c), dy, dx)\]

where \(\star\) is the valid 2d sliding window convolution operator, and \(\mathcal{S}\) means shifting the input features (auto-complete zero marginal), and \(dx, dy\) are shifting distance, \(dx, dy \in [-max\_displacement \times dilation\_patch, max\_displacement \times dilation\_patch]\).

参数
  • kernel_size (int) – The size of sliding window i.e. local neighborhood representing the center points and involved in correlation computation. Defaults to 1.

  • max_displacement (int) – The radius for computing correlation volume, but the actual working space can be dilated by dilation_patch. Defaults to 1.

  • stride (int) – The stride of the sliding blocks in the input spatial dimensions. Defaults to 1.

  • padding (int) – Zero padding added to all four sides of the input1. Defaults to 0.

  • dilation (int) – The spacing of the local neighborhood that will be involved in correlation. Defaults to 1.

  • dilation_patch (int) – The spacing between the positions for which correlation is computed. Defaults to 1.
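
The output shape given by the formulas above can be computed directly from the constructor arguments. The following is a small sketch in plain Python; the function name correlation_output_shape is hypothetical and the code only evaluates the shape, not the correlation itself.

```python
def correlation_output_shape(n, h_in, w_in, kernel_size=1,
                             max_displacement=1, stride=1,
                             padding=0, dilation=1):
    """Evaluate the documented output shape of the correlation."""

    def out_size(size_in):
        # floor((in + 2*padding - dilation*(kernel_size-1) - 1) / stride + 1)
        numerator = size_in + 2 * padding - dilation * (kernel_size - 1) - 1
        return numerator // stride + 1

    patch = 2 * max_displacement + 1
    return (n, patch, patch, out_size(h_in), out_size(w_in))


print(correlation_output_shape(2, 32, 48))
print(correlation_output_shape(2, 32, 48, kernel_size=3,
                               max_displacement=4, stride=2, padding=1))
```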

forward(input1: torch.Tensor, input2: torch.Tensor) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.CrissCrossAttention(in_channels)[源代码]

Criss-Cross Attention Module.

注解

Before v1.3.13, we use a CUDA op. Since v1.3.13, we switch to a pure PyTorch and equivalent implementation. For more details, please refer to https://github.com/open-mmlab/mmcv/pull/1201.

Speed comparison for one forward pass

  • Input size: [2,512,97,97]

  • Device: 1 NVIDIA GeForce RTX 2080 Ti

                           PyTorch version   CUDA version   Relative speed
with torch.no_grad()       0.00554402 s      0.0299619 s    5.4x
without torch.no_grad()    0.00562803 s      0.0301349 s    5.4x

参数

in_channels (int) – Channels of the input feature map.

forward(x)[源代码]

forward function of Criss-Cross Attention.

参数

x (torch.Tensor) – Input feature with the shape of (batch_size, in_channels, height, width).

返回

Output of the layer, with the shape of (batch_size, in_channels, height, width)

返回类型

torch.Tensor

class mmcv.ops.DeformConv2d(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, ...]], stride: Union[int, Tuple[int, ...]] = 1, padding: Union[int, Tuple[int, ...]] = 0, dilation: Union[int, Tuple[int, ...]] = 1, groups: int = 1, deform_groups: int = 1, bias: bool = False, im2col_step: int = 32)[源代码]

Deformable 2D convolution.

Applies a deformable 2D convolution over an input signal composed of several input planes. DeformConv2d was described in the paper Deformable Convolutional Networks

注解

The argument im2col_step was added in version 1.3.17, which means number of samples processed by the im2col_cuda_kernel per call. It enables users to define batch_size and im2col_step more flexibly and solved issue mmcv#1440.

参数
  • in_channels (int) – Number of channels in the input image.

  • out_channels (int) – Number of channels produced by the convolution.

  • kernel_size (int, tuple) – Size of the convolving kernel.

  • stride (int, tuple) – Stride of the convolution. Default: 1.

  • padding (int or tuple) – Zero-padding added to both sides of the input. Default: 0.

  • dilation (int or tuple) – Spacing between kernel elements. Default: 1.

  • groups (int) – Number of blocked connections from input channels to output channels. Default: 1.

  • deform_groups (int) – Number of deformable group partitions. Default: 1.

  • bias (bool) – If True, adds a learnable bias to the output. Default: False.

  • im2col_step (int) – Number of samples processed by im2col_cuda_kernel per call. It will work when batch_size > im2col_step, but batch_size must be divisible by im2col_step. Default: 32. New in version 1.3.17.

forward(x: torch.Tensor, offset: torch.Tensor) → torch.Tensor[source]

Deformable Convolutional forward function.

参数
  • x (Tensor) – Input feature, shape (B, C_in, H_in, W_in)

  • offset (Tensor) –

    Offset for deformable convolution, shape (B, deform_groups*kernel_size[0]*kernel_size[1]*2, H_out, W_out), H_out, W_out are equal to the output’s.

    An offset is like [y0, x0, y1, x1, y2, x2, …, y8, x8]. The spatial arrangement is like:

    (x0, y0) (x1, y1) (x2, y2)
    (x3, y3) (x4, y4) (x5, y5)
    (x6, y6) (x7, y7) (x8, y8)
    

返回

Output of the layer.

返回类型

Tensor
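
The offset layout described above can be checked with a little arithmetic. The sketch below computes the number of offset channels from the documented shape and regroups a flat [y0, x0, y1, x1, …] vector into (x, y) pairs; both helper names are hypothetical and no tensors are involved.

```python
def offset_channels(kernel_size, deform_groups=1):
    """Channels of the offset tensor: deform_groups * kh * kw * 2."""
    if isinstance(kernel_size, int):
        kernel_size = (kernel_size, kernel_size)
    kh, kw = kernel_size
    return deform_groups * kh * kw * 2


def split_offset(offset_vector):
    """Turn [y0, x0, y1, x1, ...] into [(x0, y0), (x1, y1), ...]."""
    ys = offset_vector[0::2]
    xs = offset_vector[1::2]
    return list(zip(xs, ys))


# A 3x3 kernel with one deformable group needs 18 offset channels.
print(offset_channels(3))
print(split_offset([0.5, -1.0, 0.0, 2.0]))
```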

class mmcv.ops.DeformConv2dPack(*args, **kwargs)[源代码]

A Deformable Conv Encapsulation that acts as normal Conv layers.

The offset tensor is like [y0, x0, y1, x1, y2, x2, …, y8, x8]. The spatial arrangement is like:

(x0, y0) (x1, y1) (x2, y2)
(x3, y3) (x4, y4) (x5, y5)
(x6, y6) (x7, y7) (x8, y8)
参数
  • in_channels (int) – Same as nn.Conv2d.

  • out_channels (int) – Same as nn.Conv2d.

  • kernel_size (int or tuple[int]) – Same as nn.Conv2d.

  • stride (int or tuple[int]) – Same as nn.Conv2d.

  • padding (int or tuple[int]) – Same as nn.Conv2d.

  • dilation (int or tuple[int]) – Same as nn.Conv2d.

  • groups (int) – Same as nn.Conv2d.

  • bias (bool or str) – If specified as auto, it will be decided by the norm_cfg. Bias will be set as True if norm_cfg is None, otherwise False.

forward(x)[源代码]

Deformable Convolutional forward function.

参数
  • x (Tensor) – Input feature, shape (B, C_in, H_in, W_in)

  • offset (Tensor) –

    Offset for deformable convolution, shape (B, deform_groups*kernel_size[0]*kernel_size[1]*2, H_out, W_out), H_out, W_out are equal to the output’s.

    An offset is like [y0, x0, y1, x1, y2, x2, …, y8, x8]. The spatial arrangement is like:

    (x0, y0) (x1, y1) (x2, y2)
    (x3, y3) (x4, y4) (x5, y5)
    (x6, y6) (x7, y7) (x8, y8)
    

返回

Output of the layer.

返回类型

Tensor

class mmcv.ops.DeformRoIPool(output_size, spatial_scale=1.0, sampling_ratio=0, gamma=0.1)[源代码]
forward(input, rois, offset=None)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.DeformRoIPoolPack(output_size, output_channels, deform_fc_channels=1024, spatial_scale=1.0, sampling_ratio=0, gamma=0.1)[源代码]
forward(input, rois)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.DynamicScatter(voxel_size, point_cloud_range, average_points: bool)[源代码]

Scatters points into voxels, used in the voxel encoder with dynamic voxelization.

注解

The CPU and GPU implementations produce the same output up to small numerical differences after summation and division (e.g., 5e-7).

参数
  • voxel_size (list) – Voxel size along the three dimensions, given as [x, y, z].

  • point_cloud_range (list) – The coordinate range of points, [x_min, y_min, z_min, x_max, y_max, z_max].

  • average_points (bool) – whether to use avg pooling to scatter points into voxel.

forward(points, coors)[源代码]

Scatters points/features into voxels.

参数
  • points (torch.Tensor) – Points to be reduced into voxels.

  • coors (torch.Tensor) – Corresponding voxel coordinates (specifically multi-dim voxel index) of each point.

Returns

A tuple containing two elements. The first one is the voxel features with shape [M, C], which are reduced from the input features that share the same voxel coordinates. The second is the voxel coordinates with shape [M, ndim].

返回类型

tuple[torch.Tensor]
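
The reduction that forward performs can be sketched on plain Python lists. This is an illustration under assumptions: points sharing a voxel coordinate are averaged when average_points is True and reduced with an element-wise max otherwise, and voxels are returned in first-seen order, which may differ from the order produced by the real op. The function name scatter_points is hypothetical.

```python
from collections import OrderedDict


def scatter_points(points, coors, average_points=True):
    """Reduce points that share a voxel coordinate (sketch only)."""
    groups = OrderedDict()
    for point, coor in zip(points, coors):
        groups.setdefault(tuple(coor), []).append(point)

    voxel_feats = []
    for members in groups.values():
        # Transpose so that each column holds one feature channel.
        columns = list(zip(*members))
        if average_points:
            voxel_feats.append([sum(c) / len(c) for c in columns])
        else:
            voxel_feats.append([max(c) for c in columns])
    voxel_coors = [list(coor) for coor in groups]
    return voxel_feats, voxel_coors


points = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
coors = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]
print(scatter_points(points, coors))
```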

forward_single(points, coors)[源代码]

Scatters points into voxels.

参数
  • points (torch.Tensor) – Points to be reduced into voxels.

  • coors (torch.Tensor) – Corresponding voxel coordinates (specifically multi-dim voxel index) of each point.

Returns

A tuple containing two elements. The first one is the voxel features with shape [M, C], which are reduced from the input features that share the same voxel coordinates. The second is the voxel coordinates with shape [M, ndim].

返回类型

tuple[torch.Tensor]

class mmcv.ops.FusedBiasLeakyReLU(num_channels, negative_slope=0.2, scale=1.4142135623730951)[源代码]

Fused bias leaky ReLU.

This function was introduced in StyleGAN2: Analyzing and Improving the Image Quality of StyleGAN.

The bias term comes from the convolution operation. In addition, to keep the variance of the feature map or gradients unchanged, they also adopt a scale similar to Kaiming initialization. However, since \(\alpha^2\) is very small (0.04 for the default negative slope of 0.2), the \(1 + \alpha^2\) term can be ignored. Therefore, the final scale is just \(\sqrt{2}\). Of course, you may replace it with your own scale.

TODO: Implement the CPU version.

参数
  • num_channels (int) – The channel number of the feature map.

  • negative_slope (float, optional) – Same as nn.LeakyReLU. Defaults to 0.2.

  • scale (float, optional) – A scalar to adjust the variance of the feature map. Defaults to 2**0.5.
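
The computation can be written out as a reference sketch in plain Python: add the per-channel bias, apply a leaky ReLU, then multiply by the scale. This is an illustration of the formula on nested lists, not the fused CUDA kernel; the function name and the list-based input layout are assumptions made for the example.

```python
import math


def fused_bias_leaky_relu(values, bias, negative_slope=0.2,
                          scale=math.sqrt(2)):
    """values: one list of floats per channel; bias: one float per channel."""
    out = []
    for channel, b in zip(values, bias):
        row = []
        for v in channel:
            shifted = v + b
            # Leaky ReLU: keep positives, shrink negatives by the slope.
            if shifted >= 0:
                activated = shifted
            else:
                activated = negative_slope * shifted
            row.append(scale * activated)
        out.append(row)
    return out


print(fused_bias_leaky_relu([[1.0, -1.0]], [0.5], scale=1.0))
```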

forward(input)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.GroupAll(use_xyz: bool = True)[源代码]

Group xyz with feature.

参数

use_xyz (bool) – Whether to use xyz.

forward(xyz: torch.Tensor, new_xyz: torch.Tensor, features: Optional[torch.Tensor] = None)[源代码]
参数
  • xyz (Tensor) – (B, N, 3) xyz coordinates of the features.

  • new_xyz (Tensor) – new xyz coordinates of the features.

  • features (Tensor) – (B, C, N) features to group.

返回

(B, C + 3, 1, N) Grouped feature.

返回类型

Tensor

mmcv.ops.Linear

alias of mmcv.ops.deprecated_wrappers.Linear_deprecated

class mmcv.ops.MaskedConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)[源代码]

A MaskedConv2d which inherits the official Conv2d.

The masked forward doesn’t implement the backward function and only supports the stride parameter to be 1 currently.

forward(input, mask=None)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

mmcv.ops.MaxPool2d

alias of mmcv.ops.deprecated_wrappers.MaxPool2d_deprecated

class mmcv.ops.ModulatedDeformConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, deform_groups=1, bias=True)[源代码]
forward(x, offset, mask)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.ModulatedDeformConv2dPack(*args, **kwargs)[源代码]

A ModulatedDeformable Conv Encapsulation that acts as normal Conv layers.

参数
  • in_channels (int) – Same as nn.Conv2d.

  • out_channels (int) – Same as nn.Conv2d.

  • kernel_size (int or tuple[int]) – Same as nn.Conv2d.

  • stride (int) – Same as nn.Conv2d, while tuple is not supported.

  • padding (int) – Same as nn.Conv2d, while tuple is not supported.

  • dilation (int) – Same as nn.Conv2d, while tuple is not supported.

  • groups (int) – Same as nn.Conv2d.

  • bias (bool or str) – If specified as auto, it will be decided by the norm_cfg. Bias will be set as True if norm_cfg is None, otherwise False.

forward(x)[源代码]

Defines the computation performed at every call.

Should be overridden by all subclasses.

注解

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.ModulatedDeformRoIPoolPack(output_size, output_channels, deform_fc_channels=1024, spatial_scale=1.0, sampling_ratio=0, gamma=0.1)[source]
forward(input, rois)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.MultiScaleDeformableAttention(embed_dims=256, num_heads=8, num_levels=4, num_points=4, im2col_step=64, dropout=0.1, batch_first=False, norm_cfg=None, init_cfg=None)[source]

An attention module used in Deformable DETR.

See Deformable DETR: Deformable Transformers for End-to-End Object Detection.

Parameters
  • embed_dims (int) – The embedding dimension of Attention. Default: 256.

  • num_heads (int) – Parallel attention heads. Default: 8.

  • num_levels (int) – The number of feature map used in Attention. Default: 4.

  • num_points (int) – The number of sampling points for each query in each head. Default: 4.

  • im2col_step (int) – The step used in image_to_column. Default: 64.

  • dropout (float) – A Dropout layer on inp_identity. Default: 0.1.

  • batch_first (bool) – Whether Key, Query and Value have shape (batch, n, embed_dim) rather than (n, batch, embed_dim). Default: False.

  • norm_cfg (dict) – Config dict for normalization layer. Default: None.

  • init_cfg (obj: mmcv.ConfigDict) – The Config for initialization. Default: None.

forward(query, key=None, value=None, identity=None, query_pos=None, key_padding_mask=None, reference_points=None, spatial_shapes=None, level_start_index=None, **kwargs)[source]

Forward function of MultiScaleDeformAttention.

Parameters
  • query (torch.Tensor) – Query of Transformer with shape (num_query, bs, embed_dims).

  • key (torch.Tensor) – The key tensor with shape (num_key, bs, embed_dims).

  • value (torch.Tensor) – The value tensor with shape (num_key, bs, embed_dims).

  • identity (torch.Tensor) – The tensor used for addition, with the same shape as query. Default: None. If None, query will be used.

  • query_pos (torch.Tensor) – The positional encoding for query. Default: None.

  • key_pos (torch.Tensor) – The positional encoding for key. Default: None.

  • reference_points (torch.Tensor) – The normalized reference points with shape (bs, num_query, num_levels, 2). All elements are in the range [0, 1], with top-left at (0, 0) and bottom-right at (1, 1), including the padding area. Alternatively, shape (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.

  • key_padding_mask (torch.Tensor) – ByteTensor for query, with shape [bs, num_key].

  • spatial_shapes (torch.Tensor) – Spatial shape of features in different levels. With shape (num_levels, 2), last dimension represents (h, w).

  • level_start_index (torch.Tensor) – The start index of each level. A tensor has shape (num_levels, ) and can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].

Returns

Forwarded results with shape [num_query, bs, embed_dims].

Return type

torch.Tensor
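
The relation between spatial_shapes and level_start_index described above can be sketched in plain Python. This is only an illustration: the helper name compute_level_start_index and the example shapes are hypothetical and not part of MMCV.

```python
from itertools import accumulate


def compute_level_start_index(spatial_shapes):
    """Return the flattened start offset of each feature level.

    spatial_shapes is a list of (h, w) pairs, one per level
    (a hypothetical plain-Python stand-in for the tensor argument).
    """
    areas = [h * w for h, w in spatial_shapes]
    # Exclusive prefix sum: [0, h_0*w_0, h_0*w_0 + h_1*w_1, ...]
    return [0] + list(accumulate(areas))[:-1]


spatial_shapes = [(64, 64), (32, 32), (16, 16), (8, 8)]  # illustrative values
level_start_index = compute_level_start_index(spatial_shapes)
# num_key is the total number of flattened positions over all levels.
num_key = sum(h * w for h, w in spatial_shapes)
print(level_start_index, num_key)
```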

init_weights()[source]

Default initialization for Parameters of Module.

class mmcv.ops.PSAMask(psa_type, mask_size=None)[source]
forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.PointsSampler(num_point: List[int], fps_mod_list: List[str] = ['D-FPS'], fps_sample_range_list: List[int] = [-1])[source]

Points sampling.

Parameters
  • num_point (list[int]) – Number of sample points.

  • fps_mod_list (list[str], optional) – Type of FPS method, valid mod [‘F-FPS’, ‘D-FPS’, ‘FS’], Default: [‘D-FPS’]. F-FPS: using feature distances for FPS. D-FPS: using Euclidean distances of points for FPS. FS: using F-FPS and D-FPS simultaneously.

  • fps_sample_range_list (list[int], optional) – Range of points to apply FPS. Default: [-1].

forward(points_xyz, features)[source]
Parameters
  • points_xyz (torch.Tensor) – (B, N, 3) xyz coordinates of the points.

  • features (torch.Tensor) – (B, C, N) features of the points.

Returns

(B, npoint, sample_num) Indices of sampled points.

Return type

torch.Tensor
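
To illustrate what D-FPS means, here is a minimal pure-Python sketch of farthest point sampling on Euclidean distances for a single point cloud. It is not the MMCV kernel; the function name farthest_point_sample and the choice of starting from index 0 are assumptions made for this example.

```python
import math


def farthest_point_sample(points, num_samples):
    """Greedy farthest point sampling over a list of (x, y, z) tuples."""
    selected = [0]  # assumed convention: start from the first point
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(selected) < min(num_samples, len(points)):
        # Pick the point that is farthest from the current selection.
        farthest = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(farthest)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[farthest]))
    return selected


points = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (0, 5, 0)]
sampled = farthest_point_sample(points, 3)
print(sampled)
```

F-FPS follows the same loop but would measure distances between feature vectors instead of xyz coordinates.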

class mmcv.ops.QueryAndGroup(max_radius, sample_num, min_radius=0, use_xyz=True, return_grouped_xyz=False, normalize_xyz=False, uniform_sample=False, return_unique_cnt=False, return_grouped_idx=False)[source]

Groups points with a ball query of the given radius.

Parameters
  • max_radius (float) – The maximum radius of the balls. If None is given, we will use kNN sampling instead of ball query.

  • sample_num (int) – Maximum number of features to gather in the ball.

  • min_radius (float, optional) – The minimum radius of the balls. Default: 0.

  • use_xyz (bool, optional) – Whether to use xyz. Default: True.

  • return_grouped_xyz (bool, optional) – Whether to return grouped xyz. Default: False.

  • normalize_xyz (bool, optional) – Whether to normalize xyz. Default: False.

  • uniform_sample (bool, optional) – Whether to sample uniformly. Default: False.

  • return_unique_cnt (bool, optional) – Whether to return the count of unique samples. Default: False.

  • return_grouped_idx (bool, optional) – Whether to return grouped idx. Default: False.

forward(points_xyz, center_xyz, features=None)[source]
Parameters
  • points_xyz (torch.Tensor) – (B, N, 3) xyz coordinates of the points.

  • center_xyz (torch.Tensor) – (B, npoint, 3) coordinates of the centroids.

  • features (torch.Tensor) – (B, C, N) The features of grouped points.

Returns

(B, 3 + C, npoint, sample_num) Grouped concatenated coordinates and features of points.

Return type

torch.Tensor
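
The grouping step can be pictured with a small pure-Python ball query. This is a simplified sketch under stated assumptions: the function name ball_query_indices is hypothetical, the radius test is assumed to be min_radius <= d < max_radius, and any padding of short groups up to sample_num is omitted.

```python
import math


def ball_query_indices(points_xyz, center_xyz, min_radius, max_radius, sample_num):
    """For each center, collect up to sample_num point indices inside the ball."""
    groups = []
    for center in center_xyz:
        inside = [
            i for i, p in enumerate(points_xyz)
            if min_radius <= math.dist(p, center) < max_radius
        ]
        groups.append(inside[:sample_num])  # keep at most sample_num neighbors
    return groups


points_xyz = [(0, 0, 0), (0.5, 0, 0), (2, 0, 0), (0, 0.2, 0)]
center_xyz = [(0, 0, 0), (2, 0, 0)]
groups = ball_query_indices(points_xyz, center_xyz, 0.0, 1.0, sample_num=2)
print(groups)
```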

class mmcv.ops.RiRoIAlignRotated(out_size, spatial_scale, num_samples=0, num_orientations=8, clockwise=False)[source]

Rotation-invariant RoI align pooling layer for rotated proposals.

It accepts a feature map of shape (N, C, H, W) and rois with shape (n, 6) with each roi decoded as (batch_index, center_x, center_y, w, h, angle). The angle is in radians.

The details are described in the paper ReDet: A Rotation-equivariant Detector for Aerial Object Detection.

Parameters
  • out_size (tuple) – fixed dimensional RoI output with shape (h, w).

  • spatial_scale (float) – scale the input boxes by this number

  • num_samples (int) – number of input samples to take for each output sample. 0 to take samples densely for current models.

  • num_orientations (int) – number of oriented channels.

  • clockwise (bool) – If True, the angle in each proposal follows a clockwise fashion in image space, otherwise, the angle is counterclockwise. Default: False.

forward(features, rois)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.RoIAlign(output_size, spatial_scale=1.0, sampling_ratio=0, pool_mode='avg', aligned=True, use_torchvision=False)[source]

RoI align pooling layer.

Parameters
  • output_size (tuple) – h, w

  • spatial_scale (float) – scale the input boxes by this number

  • sampling_ratio (int) – number of input samples to take for each output sample. 0 to take samples densely for current models.

  • pool_mode (str, 'avg' or 'max') – pooling mode in each bin.

  • aligned (bool) – if False, use the legacy implementation in MMDetection. If True, align the results more precisely.

  • use_torchvision (bool) – whether to use roi_align from torchvision.

Note

The implementation of RoIAlign when aligned=True is modified from https://github.com/facebookresearch/detectron2/

The meaning of aligned=True:

Given a continuous coordinate c, its two neighboring pixel indices (in our pixel model) are computed by floor(c - 0.5) and ceil(c - 0.5). For example, c=1.3 has pixel neighbors with discrete indices [0] and [1] (which are sampled from the underlying signal at continuous coordinates 0.5 and 1.5). But the original roi_align (aligned=False) does not subtract the 0.5 when computing neighboring pixel indices and therefore it uses pixels with a slightly incorrect alignment (relative to our pixel model) when performing bilinear interpolation.

With aligned=True, we first appropriately scale the ROI and then shift it by -0.5 prior to calling roi_align. This produces the correct neighbors.

This change does not affect the model’s performance if ROIAlign is used together with conv layers.
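
The neighbor computation in the note can be checked with a few lines of Python. The helper name neighbor_pixels is hypothetical; it only reproduces the floor/ceil arithmetic described above, not the pooling itself.

```python
import math


def neighbor_pixels(c, aligned=True):
    """Return the two pixel indices used to interpolate at coordinate c."""
    # aligned=True subtracts 0.5 so that pixel i is centered at i + 0.5;
    # the legacy behavior (aligned=False) skips this shift.
    shift = 0.5 if aligned else 0.0
    return math.floor(c - shift), math.ceil(c - shift)


aligned_neighbors = neighbor_pixels(1.3, aligned=True)
legacy_neighbors = neighbor_pixels(1.3, aligned=False)
print(aligned_neighbors)  # (0, 1)
print(legacy_neighbors)   # (1, 2)
```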

forward(input, rois)[source]
Parameters
  • input – NCHW images

  • rois – Bx5 boxes. First column is the index into N. The other 4 columns are xyxy.

class mmcv.ops.RoIAlignRotated(out_size, spatial_scale, sample_num=0, aligned=True, clockwise=False)[source]

RoI align pooling layer for rotated proposals.

It accepts a feature map of shape (N, C, H, W) and rois with shape (n, 6) with each roi decoded as (batch_index, center_x, center_y, w, h, angle). The angle is in radians.

Parameters
  • out_size (tuple) – h, w

  • spatial_scale (float) – scale the input boxes by this number

  • sample_num (int) – number of input samples to take for each output sample. 0 to take samples densely for current models.

  • aligned (bool) – if False, use the legacy implementation in MMDetection. If True, align the results more precisely. Default: True.

  • clockwise (bool) – If True, the angle in each proposal follows a clockwise fashion in image space, otherwise, the angle is counterclockwise. Default: False.

Note

The implementation of RoIAlign when aligned=True is modified from https://github.com/facebookresearch/detectron2/

The meaning of aligned=True:

Given a continuous coordinate c, its two neighboring pixel indices (in our pixel model) are computed by floor(c - 0.5) and ceil(c - 0.5). For example, c=1.3 has pixel neighbors with discrete indices [0] and [1] (which are sampled from the underlying signal at continuous coordinates 0.5 and 1.5). But the original roi_align (aligned=False) does not subtract the 0.5 when computing neighboring pixel indices and therefore it uses pixels with a slightly incorrect alignment (relative to our pixel model) when performing bilinear interpolation.

With aligned=True, we first appropriately scale the ROI and then shift it by -0.5 prior to calling roi_align. This produces the correct neighbors.

This change does not affect the model’s performance if ROIAlign is used together with conv layers.

forward(features, rois)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.RoIAwarePool3d(out_size, max_pts_per_voxel=128, mode='max')[source]

Encode the geometry-specific features of each 3D proposal.

Please refer to PartA2 for more details.

Parameters
  • out_size (int or tuple) – The size of output features. n or [n1, n2, n3].

  • max_pts_per_voxel (int, optional) – The maximum number of points per voxel. Default: 128.

  • mode (str, optional) – Pooling method of RoIAware, ‘max’ or ‘avg’. Default: ‘max’.

forward(rois, pts, pts_feature)[source]
Parameters
  • rois (torch.Tensor) – [N, 7], in LiDAR coordinate, (x, y, z) is the bottom center of rois.

  • pts (torch.Tensor) – [npoints, 3], coordinates of input points.

  • pts_feature (torch.Tensor) – [npoints, C], features of input points.

Returns

Pooled features whose shape is [N, out_x, out_y, out_z, C].

Return type

torch.Tensor

class mmcv.ops.RoIPointPool3d(num_sampled_points=512)[source]

Encode the geometry-specific features of each 3D proposal.

Please refer to the PartA2 paper for more details.

Parameters

num_sampled_points (int, optional) – Number of samples in each roi. Default: 512.

forward(points, point_features, boxes3d)[source]
Parameters
  • points (torch.Tensor) – Input points whose shape is (B, N, C).

  • point_features (torch.Tensor) – Features of input points whose shape is (B, N, C).

  • boxes3d (torch.Tensor) – Input bounding boxes whose shape is (B, M, 7).

Returns

A tuple containing two elements. The first one is the pooled features whose shape is (B, M, 512, 3 + C). The second is an empty flag whose shape is (B, M).

Return type

tuple[torch.Tensor]

class mmcv.ops.RoIPool(output_size, spatial_scale=1.0)[source]
forward(input, rois)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.SAConv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, use_deform=False)[source]

SAC (Switchable Atrous Convolution)

This is an implementation of DetectoRS: Detecting Objects with Recursive Feature Pyramid and Switchable Atrous Convolution.

Parameters
  • in_channels (int) – Number of channels in the input image

  • out_channels (int) – Number of channels produced by the convolution

  • kernel_size (int or tuple) – Size of the convolving kernel

  • stride (int or tuple, optional) – Stride of the convolution. Default: 1

  • padding (int or tuple, optional) – Zero-padding added to both sides of the input. Default: 0

  • padding_mode (string, optional) – 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'

  • dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1

  • groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1

  • bias (bool, optional) – If True, adds a learnable bias to the output. Default: True

  • use_deform – If True, replace convolution with deformable convolution. Default: False.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.SigmoidFocalLoss(gamma, alpha, weight=None, reduction='mean')[source]
forward(input, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.SimpleRoIAlign(output_size, spatial_scale, aligned=True)[source]
forward(features, rois)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.SoftmaxFocalLoss(gamma, alpha, weight=None, reduction='mean')[source]
forward(input, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.SyncBatchNorm(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, group=None, stats_mode='default')[source]

Synchronized Batch Normalization.

Parameters
  • num_features (int) – number of features/channels in the input tensor

  • eps (float, optional) – a value added to the denominator for numerical stability. Defaults to 1e-5.

  • momentum (float, optional) – the value used for the running_mean and running_var computation. Defaults to 0.1.

  • affine (bool, optional) – whether to use learnable affine parameters. Defaults to True.

  • track_running_stats (bool, optional) – whether to track the running mean and variance during training. When set to False, this module does not track such statistics, and initializes statistics buffers running_mean and running_var as None. When these buffers are None, this module always uses batch statistics in both training and eval modes. Defaults to True.

  • group (int, optional) – synchronization of stats happens within each process group individually. By default, stats are synchronized across the whole world. Defaults to None.

  • stats_mode (str, optional) – The statistical mode. Available options include 'default' and 'N'. Defaults to 'default'. When stats_mode=='default', it computes the overall statistics using those from each worker with equal weight, i.e., the statistics are synchronized and simply divided by group. This mode will produce inaccurate statistics when empty tensors occur. When stats_mode=='N', it computes the overall statistics using the total number of batches in each worker, ignoring the number of groups, i.e., the statistics are synchronized and then divided by the total batch N. This mode is beneficial when empty tensors occur during training, as it averages the total mean by the real number of batches.
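
The difference between the two stats_mode options can be illustrated with a toy mean computation. This is a simplified sketch, not the MMCV implementation: the function name synced_mean and the per-worker inputs are hypothetical, and only the mean (not the variance) is modeled.

```python
def synced_mean(worker_means, worker_batch_sizes, stats_mode="default"):
    """Combine per-worker means into one synchronized mean."""
    if stats_mode == "default":
        # Every worker has equal weight, even one that saw an empty tensor.
        return sum(worker_means) / len(worker_means)
    if stats_mode == "N":
        # Weight each worker by its real batch size and divide by the total N.
        total = sum(worker_batch_sizes)
        weighted = sum(m * n for m, n in zip(worker_means, worker_batch_sizes))
        return weighted / total
    raise ValueError(f"unknown stats_mode: {stats_mode}")


# Worker 0 has 4 samples with mean 2.0; worker 1 received an empty tensor.
means, sizes = [2.0, 0.0], [4, 0]
default_mean = synced_mean(means, sizes, "default")
n_mean = synced_mean(means, sizes, "N")
print(default_mean, n_mean)
```

In this toy case the 'default' mode halves the mean because the empty worker still counts, while the 'N' mode recovers the mean of the samples that actually exist.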

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class mmcv.ops.TINShift[source]

Temporal Interlace Shift.

Temporal Interlace Shift is a differentiable temporal-wise frame shifting operation proposed in “Temporal Interlacing Network”.

Please refer to Temporal Interlacing Network for more details.

Code is modified from https://github.com/mit-han-lab/temporal-shift-module

forward(input, shift)[source]

Perform temporal interlace shift.

Parameters
  • input (torch.Tensor) – Feature map with shape [N, num_segments, C, H * W].

  • shift (torch.Tensor) – Shift tensor with shape [N, num_segments].

Returns

Feature map after temporal interlace shift.

class mmcv.ops.Voxelization(voxel_size, point_cloud_range, max_num_points, max_voxels=20000)[source]

Convert KITTI points (N, >=3) to voxels.

Please refer to Point-Voxel CNN for Efficient 3D Deep Learning for more details.

Parameters
  • voxel_size (tuple or float) – The size of voxel with the shape of [3].

  • point_cloud_range (tuple or float) – The coordinate range of voxel with the shape of [6].

  • max_num_points (int) – maximum number of points contained in a voxel. If max_num_points=-1, dynamic_voxelize is used.

  • max_voxels (int, optional) – maximum number of voxels this function creates. For SECOND, 20000 is a good choice. Users should shuffle points before calling this function because max_voxels may drop points. Default: 20000.
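
How these parameters interact can be sketched in pure Python. This is a toy illustration, not the MMCV kernel: the function name hard_voxelize is hypothetical, and the voxel index is assumed to be floor((coordinate - range_min) / voxel_size).

```python
import math


def hard_voxelize(points, voxel_size, point_cloud_range, max_num_points, max_voxels):
    """Group points into voxels, honoring the per-voxel and total caps."""
    lows, highs = point_cloud_range[:3], point_cloud_range[3:]
    voxels = {}
    for p in points:
        xyz = p[:3]
        # Skip points outside [min, max) on any axis.
        if not all(lo <= v < hi for v, lo, hi in zip(xyz, lows, highs)):
            continue
        coord = tuple(
            math.floor((v - lo) / s) for v, lo, s in zip(xyz, lows, voxel_size)
        )
        if coord not in voxels:
            if len(voxels) >= max_voxels:
                continue  # voxel budget exhausted: this point is dropped
            voxels[coord] = []
        if len(voxels[coord]) < max_num_points:
            voxels[coord].append(p)
    return voxels


points = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.5, 0.1, 0.1), (9.0, 9.0, 9.0)]
voxels = hard_voxelize(points, (1.0, 1.0, 1.0), (0, 0, 0, 2, 2, 2),
                       max_num_points=1, max_voxels=10)
print(voxels)
```

Because points are visited in order, a small max_voxels drops whatever arrives late, which is why shuffling beforehand is recommended.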

forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

mmcv.ops.batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False)[source]

Performs non-maximum suppression in a batched fashion.

Modified from torchvision/ops/boxes.py#L39. In order to perform NMS independently per class, we add an offset to all the boxes. The offset is dependent only on the class idx, and is large enough so that boxes from different classes do not overlap.

Note

In v1.4.1 and later, batched_nms supports skipping the NMS and returns sorted raw results when nms_cfg is None.

Parameters
  • boxes (torch.Tensor) – boxes in shape (N, 4).

  • scores (torch.Tensor) – scores in shape (N, ).

  • idxs (torch.Tensor) – each index value corresponds to a bbox cluster, and NMS will not be applied between elements of different idxs, shape (N, ).

  • nms_cfg (dict | None) –

    Supports skipping the NMS when nms_cfg is None; otherwise it should specify the NMS type and other parameters such as iou_thr. Possible keys include the following.

    • iou_thr (float): IoU threshold used for NMS.

    • split_thr (float): threshold number of boxes. In some cases the number of boxes is large (e.g., 200k). To avoid OOM during training, the users could set split_thr to a small value. If the number of boxes is greater than the threshold, it will perform NMS on each group of boxes separately and sequentially. Defaults to 10000.

  • class_agnostic (bool) – if true, nms is class agnostic, i.e. IoU thresholding happens over all boxes, regardless of the predicted class.

Returns

Kept dets and indices.

  • boxes (Tensor): Bboxes with scores after NMS, with shape (num_bboxes, 5). The last dimension is arranged as (x1, y1, x2, y2, score).

  • keep (Tensor): The indices of remaining boxes in input boxes.

Return type

tuple
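
The per-class offset trick mentioned in the description can be sketched as follows. This is only an illustration in plain Python: the helper name offset_boxes_by_class is hypothetical, and the offset idx * (max_coordinate + 1) follows the torchvision approach the function is said to be modified from.

```python
def offset_boxes_by_class(boxes, idxs):
    """Shift each box by a class-dependent offset so classes never overlap."""
    max_coordinate = max(max(box) for box in boxes)
    shifted = []
    for (x1, y1, x2, y2), idx in zip(boxes, idxs):
        # Boxes of class idx are moved into their own disjoint region.
        offset = idx * (max_coordinate + 1)
        shifted.append((x1 + offset, y1 + offset, x2 + offset, y2 + offset))
    return shifted


boxes = [(0, 0, 10, 10), (0, 0, 10, 10)]  # identical boxes ...
idxs = [0, 1]                             # ... but different classes
shifted = offset_boxes_by_class(boxes, idxs)
print(shifted)
```

After the shift, a single class-agnostic NMS pass cannot suppress a box of one class with a box of another, since their IoU is zero.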

mmcv.ops.bbox_overlaps(bboxes1, bboxes2, mode='iou', aligned=False, offset=0)[source]

Calculate overlap between two sets of bboxes.

If aligned is False, then calculate the ious between each bbox of bboxes1 and bboxes2, otherwise the ious between each aligned pair of bboxes1 and bboxes2.

Parameters
  • bboxes1 (torch.Tensor) – shape (m, 4) in <x1, y1, x2, y2> format or empty.

  • bboxes2 (torch.Tensor) – shape (n, 4) in <x1, y1, x2, y2> format or empty. If aligned is True, then m and n must be equal.

  • mode (str) – “iou” (intersection over union) or “iof” (intersection over foreground).

Returns

Return the ious between boxes. If aligned is False, the shape of ious is (m, n) else (m, 1).

Return type

torch.Tensor

Example

>>> bboxes1 = torch.FloatTensor([
>>>     [0, 0, 10, 10],
>>>     [10, 10, 20, 20],
>>>     [32, 32, 38, 42],
>>> ])
>>> bboxes2 = torch.FloatTensor([
>>>     [0, 0, 10, 20],
>>>     [0, 10, 10, 19],
>>>     [10, 10, 20, 20],
>>> ])
>>> bbox_overlaps(bboxes1, bboxes2)
tensor([[0.5000, 0.0000, 0.0000],
        [0.0000, 0.0000, 1.0000],
        [0.0000, 0.0000, 0.0000]])

Example

>>> empty = torch.FloatTensor([])
>>> nonempty = torch.FloatTensor([
>>>     [0, 0, 10, 9],
>>> ])
>>> assert tuple(bbox_overlaps(empty, nonempty).shape) == (0, 1)
>>> assert tuple(bbox_overlaps(nonempty, empty).shape) == (1, 0)
>>> assert tuple(bbox_overlaps(empty, empty).shape) == (0, 0)
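
For readers without a compiled MMCV at hand, the first example above can be reproduced with a pure-Python reference. This is a sketch for mode='iou', aligned=False and offset=0 only; the function name pairwise_iou is hypothetical and is not the MMCV operator.

```python
def pairwise_iou(bboxes1, bboxes2):
    """Return an m x n list of IoUs for boxes in (x1, y1, x2, y2) format."""
    def area(b):
        return max(b[2] - b[0], 0.0) * max(b[3] - b[1], 0.0)

    ious = []
    for a in bboxes1:
        row = []
        for b in bboxes2:
            # Width and height of the intersection rectangle (0 if disjoint).
            w = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
            h = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
            inter = w * h
            union = area(a) + area(b) - inter
            row.append(inter / union if union > 0 else 0.0)
        ious.append(row)
    return ious


bboxes1 = [[0, 0, 10, 10], [10, 10, 20, 20], [32, 32, 38, 42]]
bboxes2 = [[0, 0, 10, 20], [0, 10, 10, 19], [10, 10, 20, 20]]
ious = pairwise_iou(bboxes1, bboxes2)
print(ious)
```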

mmcv.ops.box_iou_rotated(bboxes1, bboxes2, mode='iou', aligned=False, clockwise=True)[source]

Return intersection-over-union (Jaccard index) of boxes.

Both sets of boxes are expected to be in (x_center, y_center, width, height, angle) format.

If aligned is False, then calculate the ious between each bbox of bboxes1 and bboxes2, otherwise the ious between each aligned pair of bboxes1 and bboxes2.

Note

The operator assumes:

  1. The positive direction along x axis is left -> right.

  2. The positive direction along y axis is top -> down.

  3. The w border is in parallel with x axis when angle = 0.

However, there are 2 opposite definitions of the positive angular direction, clockwise (CW) and counter-clockwise (CCW). MMCV supports both definitions and uses CW by default.

Please set clockwise=False if you are using the CCW definition.

The coordinate system when clockwise is True (default)

0-------------------> x (0 rad)
|  A-------------B
|  |             |
|  |     box     h
|  |   angle=0   |
|  D------w------C
v
y (pi/2 rad)

In this coordinate system the rotation matrix is

\[\begin{split}\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}\end{split}\]

The coordinates of the corner point A can be calculated as:

\[\begin{split}P_A= \begin{pmatrix} x_A \\ y_A\end{pmatrix} = \begin{pmatrix} x_{center} \\ y_{center}\end{pmatrix} + \begin{pmatrix}\cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha\end{pmatrix} \begin{pmatrix} -0.5w \\ -0.5h\end{pmatrix} \\ = \begin{pmatrix} x_{center}-0.5w\cos\alpha+0.5h\sin\alpha \\ y_{center}-0.5w\sin\alpha-0.5h\cos\alpha\end{pmatrix}\end{split}\]

The coordinate system when clockwise is False

0-------------------> x (0 rad)
|  A-------------B
|  |             |
|  |     box     h
|  |   angle=0   |
|  D------w------C
v
y (-pi/2 rad)

In this coordinate system the rotation matrix is

\[\begin{split}\begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix}\end{split}\]

The coordinates of the corner point A can be calculated as:

\[\begin{split}P_A= \begin{pmatrix} x_A \\ y_A\end{pmatrix} = \begin{pmatrix} x_{center} \\ y_{center}\end{pmatrix} + \begin{pmatrix}\cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha\end{pmatrix} \begin{pmatrix} -0.5w \\ -0.5h\end{pmatrix} \\ = \begin{pmatrix} x_{center}-0.5w\cos\alpha-0.5h\sin\alpha \\ y_{center}+0.5w\sin\alpha-0.5h\cos\alpha\end{pmatrix}\end{split}\]
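
The two corner formulas above can be evaluated directly. The helper name corner_a is hypothetical; it only transcribes the expanded expressions for the clockwise and counter-clockwise conventions.

```python
import math


def corner_a(x_center, y_center, w, h, angle, clockwise=True):
    """Coordinates of corner A for a box rotated by angle (in radians)."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    if clockwise:
        x = x_center - 0.5 * w * cos_a + 0.5 * h * sin_a
        y = y_center - 0.5 * w * sin_a - 0.5 * h * cos_a
    else:
        x = x_center - 0.5 * w * cos_a - 0.5 * h * sin_a
        y = y_center + 0.5 * w * sin_a - 0.5 * h * cos_a
    return x, y


# With angle = 0 both conventions give the unrotated top-left corner.
unrotated = corner_a(10, 10, 4, 2, 0.0)
cw = corner_a(10, 10, 4, 2, math.pi / 2, clockwise=True)
ccw = corner_a(10, 10, 4, 2, math.pi / 2, clockwise=False)
print(unrotated, cw, ccw)
```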

Parameters
  • bboxes1 (torch.Tensor) – rotated bboxes 1. It has shape (N, 5), indicating (x, y, w, h, theta) for each row. Note that theta is in radians.

  • bboxes2 (torch.Tensor) – rotated bboxes 2. It has shape (M, 5), indicating (x, y, w, h, theta) for each row. Note that theta is in radians.

  • mode (str) – “iou” (intersection over union) or “iof” (intersection over foreground).

  • clockwise (bool) – flag indicating whether the positive angular orientation is clockwise. Default: True. New in version 1.4.3.

Returns

Return the ious between boxes. If aligned is False, the shape of ious is (N, M) else (N,).

Return type

torch.Tensor

mmcv.ops.boxes_iou_bev(boxes_a, boxes_b)[source]

Calculate boxes IoU in the Bird’s Eye View.

Parameters
  • boxes_a (torch.Tensor) – Input boxes a with shape (M, 5).

  • boxes_b (torch.Tensor) – Input boxes b with shape (N, 5).

Returns

IoU result with shape (M, N).

Return type

torch.Tensor

mmcv.ops.contour_expand(kernel_mask, internal_kernel_label, min_kernel_area, kernel_num)[source]

Expand kernel contours so that foreground pixels are assigned into instances.

Parameters
  • kernel_mask (np.array or torch.Tensor) – The instance kernel mask with size hxw.

  • internal_kernel_label (np.array or torch.Tensor) – The instance internal kernel label with size hxw.

  • min_kernel_area (int) – The minimum kernel area.

  • kernel_num (int) – The instance kernel number.

Returns

The instance index map with size hxw.

Return type

list

mmcv.ops.convex_giou(pointsets, polygons)[source]

Return generalized intersection-over-union (Jaccard index) between point sets and polygons.

Parameters
  • pointsets (torch.Tensor) – It has shape (N, 18), indicating (x1, y1, x2, y2, …, x9, y9) for each row.

  • polygons (torch.Tensor) – It has shape (N, 8), indicating (x1, y1, x2, y2, x3, y3, x4, y4) for each row.

Returns

The first element is the gious between point sets and polygons with the shape (N,). The second element is the gradient of point sets with the shape (N, 18).

Return type

tuple[torch.Tensor, torch.Tensor]

mmcv.ops.convex_iou(pointsets, polygons)[source]

Return intersection-over-union (Jaccard index) between point sets and polygons.

Parameters
  • pointsets (torch.Tensor) – It has shape (N, 18), indicating (x1, y1, x2, y2, …, x9, y9) for each row.

  • polygons (torch.Tensor) – It has shape (K, 8), indicating (x1, y1, x2, y2, x3, y3, x4, y4) for each row.

Returns

Return the ious between point sets and polygons with the shape (N, K).

Return type

torch.Tensor

mmcv.ops.fused_bias_leakyrelu(input, bias, negative_slope=0.2, scale=1.4142135623730951)[source]

Fused bias leaky ReLU function.

This function is introduced in StyleGAN2: Analyzing and Improving the Image Quality of StyleGAN.

The bias term comes from the convolution operation. In addition, to keep the variance of the feature map or gradients unchanged, the authors also adopt a scale similar to Kaiming initialization. However, since \({\alpha}^2\) is small, the factor \(1+{\alpha}^2\) is close to 1 and can be ignored. Therefore, the final scale is just \(\sqrt{2}\). You may replace it with your own scale.

Parameters
  • input (torch.Tensor) – Input feature map.

  • bias (nn.Parameter) – The bias from convolution operation.

  • negative_slope (float, optional) – Same as nn.LeakyReLU. Defaults to 0.2.

  • scale (float, optional) – A scalar to adjust the variance of the feature map. Defaults to 2**0.5.

Returns

Feature map after non-linear activation.

Return type

torch.Tensor
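
As a rough reference for what the fused operator computes, the sketch below applies bias, leaky ReLU and scale to a list of scalars, and compares the default scale with the Kaiming gain for leaky ReLU. The function name bias_leaky_relu_ref is hypothetical, and a single scalar bias is assumed instead of a per-channel parameter.

```python
import math


def bias_leaky_relu_ref(values, bias, negative_slope=0.2, scale=math.sqrt(2)):
    """Unfused reference: scale * leaky_relu(value + bias)."""
    out = []
    for v in values:
        v = v + bias
        out.append(scale * (v if v >= 0 else negative_slope * v))
    return out


activated = bias_leaky_relu_ref([1.0, -1.0], bias=0.0)
# Kaiming gain for leaky ReLU with slope 0.2, next to the simplified sqrt(2).
kaiming_gain = math.sqrt(2.0 / (1.0 + 0.2 ** 2))
print(activated, kaiming_gain, math.sqrt(2))
```

The two gains differ by less than 0.03, which is the approximation the description refers to.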

mmcv.ops.min_area_polygons(pointsets)[source]

Find the smallest polygons that surround all points in the point sets.

Parameters

pointsets (Tensor) – point sets with shape (N, 18).

Returns

Return the smallest polygons with shape (N, 8).

Return type

torch.Tensor

mmcv.ops.nms(boxes, scores, iou_threshold, offset=0, score_threshold=0, max_num=-1)[source]

Dispatch to either CPU or GPU NMS implementations.

The input can be either a torch tensor or a numpy array. GPU NMS will be used if the input is a GPU tensor, otherwise CPU NMS will be used. The returned type will always be the same as the inputs.

Parameters
  • boxes (torch.Tensor or np.ndarray) – boxes in shape (N, 4).

  • scores (torch.Tensor or np.ndarray) – scores in shape (N, ).

  • iou_threshold (float) – IoU threshold for NMS.

  • offset (int, 0 or 1) – boxes’ width or height is (x2 - x1 + offset).

  • score_threshold (float) – score threshold for NMS.

  • max_num (int) – maximum number of boxes after NMS.

Returns

Kept dets (boxes and scores) and indices, which always have the same data type as the input.

Return type

tuple

Example

>>> boxes = np.array([[49.1, 32.4, 51.0, 35.9],
>>>                   [49.3, 32.9, 51.0, 35.3],
>>>                   [49.2, 31.8, 51.0, 35.4],
>>>                   [35.1, 11.5, 39.1, 15.7],
>>>                   [35.6, 11.8, 39.3, 14.2],
>>>                   [35.3, 11.5, 39.9, 14.5],
>>>                   [35.2, 11.7, 39.7, 15.7]], dtype=np.float32)
>>> scores = np.array([0.9, 0.9, 0.5, 0.5, 0.5, 0.4, 0.3], dtype=np.float32)
>>> iou_threshold = 0.6
>>> dets, inds = nms(boxes, scores, iou_threshold)
>>> assert len(inds) == len(dets) == 3
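
To see why three boxes survive in the example above, here is a pure-Python greedy NMS run on the same data. It is a minimal sketch, not the MMCV dispatch: the function name greedy_nms is hypothetical, score_threshold and max_num are not modeled, and a box is assumed to be suppressed when its IoU with a kept box is strictly greater than iou_threshold.

```python
def greedy_nms(boxes, scores, iou_threshold, offset=0):
    """Return indices of kept boxes, highest score first."""
    def area(b):
        return (b[2] - b[0] + offset) * (b[3] - b[1] + offset)

    def iou(a, b):
        w = max(min(a[2], b[2]) - max(a[0], b[0]) + offset, 0.0)
        h = max(min(a[3], b[3]) - max(a[1], b[1]) + offset, 0.0)
        inter = w * h
        return inter / (area(a) + area(b) - inter)

    # Stable sort by descending score; ties keep their input order.
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [[49.1, 32.4, 51.0, 35.9], [49.3, 32.9, 51.0, 35.3],
         [49.2, 31.8, 51.0, 35.4], [35.1, 11.5, 39.1, 15.7],
         [35.6, 11.8, 39.3, 14.2], [35.3, 11.5, 39.9, 14.5],
         [35.2, 11.7, 39.7, 15.7]]
scores = [0.9, 0.9, 0.5, 0.5, 0.5, 0.4, 0.3]
keep = greedy_nms(boxes, scores, iou_threshold=0.6)
print(keep)
```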

mmcv.ops.nms_bev(boxes, scores, thresh, pre_max_size=None, post_max_size=None)[source]

NMS function GPU implementation (for BEV boxes). The overlap of two boxes for IoU calculation is defined as the exact overlapping area of the two boxes. In this function, one can also set pre_max_size and post_max_size.

Parameters
  • boxes (torch.Tensor) – Input boxes with the shape of [N, 5] ([x1, y1, x2, y2, ry]).

  • scores (torch.Tensor) – Scores of boxes with the shape of [N].

  • thresh (float) – Overlap threshold of NMS.

  • pre_max_size (int, optional) – Max size of boxes before NMS. Default: None.

  • post_max_size (int, optional) – Max size of boxes after NMS. Default: None.

Returns

Indexes after NMS.

Return type

torch.Tensor

mmcv.ops.nms_match(dets, iou_threshold)[source]

Match dets into different groups by NMS.

NMS match is similar to NMS, but when a bbox is suppressed, NMS match records the index of the suppressed bbox and forms a group with the index of the kept bbox. In each group, indices are sorted in score order.

Parameters
  • dets (torch.Tensor | np.ndarray) – Det boxes with scores, shape (N, 5).

  • iou_threshold (float) – IoU thresh for NMS.

Returns

The outer list corresponds to the different matched groups; each inner Tensor holds the indices of one group in score order.

Return type

list[torch.Tensor | np.ndarray]
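
The grouping behavior can be sketched in pure Python. This is an illustration under assumptions, not the MMCV implementation: the function name greedy_nms_match is hypothetical, dets rows are (x1, y1, x2, y2, score), and a box joins a group when its IoU with the kept box is strictly greater than iou_threshold.

```python
def greedy_nms_match(dets, iou_threshold):
    """Return groups of indices: each kept box followed by what it suppressed."""
    def iou(a, b):
        w = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
        h = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
        inter = w * h
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter)

    # Stable sort by descending score (last column).
    order = sorted(range(len(dets)), key=lambda i: -dets[i][4])
    suppressed, groups = set(), []
    for pos, i in enumerate(order):
        if i in suppressed:
            continue
        group = [i]
        for j in order[pos + 1:]:
            if j not in suppressed and iou(dets[i], dets[j]) > iou_threshold:
                suppressed.add(j)
                group.append(j)
        groups.append(group)
    return groups


dets = [[49.1, 32.4, 51.0, 35.9, 0.9], [49.3, 32.9, 51.0, 35.3, 0.9],
        [49.2, 31.8, 51.0, 35.4, 0.5], [35.1, 11.5, 39.1, 15.7, 0.5],
        [35.6, 11.8, 39.3, 14.2, 0.5], [35.3, 11.5, 39.9, 14.5, 0.4],
        [35.2, 11.7, 39.7, 15.7, 0.3]]
groups = greedy_nms_match(dets, iou_threshold=0.6)
print(groups)
```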

mmcv.ops.nms_normal_bev(boxes, scores, thresh)[source]

Normal NMS function GPU implementation (for BEV boxes). The overlap of two boxes for IoU calculation is defined as the exact overlapping area of the two boxes WITH their yaw angle set to 0.

Parameters
  • boxes (torch.Tensor) – Input boxes with shape (N, 5).

  • scores (torch.Tensor) – Scores of predicted boxes with shape (N).

  • thresh (float) – Overlap threshold of NMS.

Returns

Remaining indices with scores in descending order.

Return type

torch.Tensor

mmcv.ops.nms_rotated(dets, scores, iou_threshold, labels=None, clockwise=True)[源代码]

Performs non-maximum suppression (NMS) on the rotated boxes according to their intersection-over-union (IoU).

Rotated NMS iteratively removes lower scoring rotated boxes which have an IoU greater than iou_threshold with another (higher scoring) rotated box.

Parameters
  • dets (Tensor) – Rotated boxes in shape (N, 5). They are expected to be in (x_ctr, y_ctr, width, height, angle_radian) format.

  • scores (Tensor) – Scores in shape (N,).

  • iou_threshold (float) – IoU threshold for NMS.

  • labels (Tensor) – Labels of the boxes, in shape (N,). Default: None.

  • clockwise (bool) – Flag indicating whether the positive angular orientation is clockwise. Default: True. New in version 1.4.3.

Returns

Kept dets (boxes and scores) and indices, which always have the same data type as the input.

Return type

tuple

mmcv.ops.pixel_group(score, mask, embedding, kernel_label, kernel_contour, kernel_region_num, distance_threshold)[source]

Group pixels into text instances, an operation widely used in text detection methods.

Parameters
  • score (np.array or torch.Tensor) – The foreground score with size hxw.

  • mask (np.array or torch.Tensor) – The foreground mask with size hxw.

  • embedding (np.array or torch.Tensor) – The embedding with size hxwxc to distinguish instances.

  • kernel_label (np.array or torch.Tensor) – The instance kernel index with size hxw.

  • kernel_contour (np.array or torch.Tensor) – The kernel contour with size hxw.

  • kernel_region_num (int) – The instance kernel region number.

  • distance_threshold (float) – The embedding distance threshold between kernel and pixel in one instance.

Returns

A list of instance coordinates and attributes. Each element consists of, in order, the averaged confidence, the number of pixels, and the coordinates (x_i, y_i) of all pixels.

Return type

list[list[float]]

mmcv.ops.point_sample(input, points, align_corners=False, **kwargs)[source]

A wrapper around grid_sample() that supports 3D point-coordinate tensors. Unlike torch.nn.functional.grid_sample(), it assumes the coordinates in points lie inside the [0, 1] x [0, 1] square.

Parameters
  • input (torch.Tensor) – Feature map, shape (N, C, H, W).

  • points (torch.Tensor) – Image-based point coordinates, normalized to the range [0, 1] x [0, 1], shape (N, P, 2) or (N, Hgrid, Wgrid, 2).

  • align_corners (bool, optional) – Whether to align corners, as in grid_sample(). Default: False.

Returns

Features of the points sampled from input, shape (N, C, P) or (N, C, Hgrid, Wgrid).

Return type

torch.Tensor
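
The difference in coordinate convention can be made concrete with a small pure-Python sketch. It assumes the wrapper rescales points from [0, 1] to the [-1, 1] range that grid_sample() expects, and it reimplements bilinear sampling with align_corners=False and zero padding for a single-channel map; the function names are hypothetical and the sketch does not call mmcv or torch.

```python
import math


def to_grid_sample_coords(point):
    """Map an (x, y) point from [0, 1] x [0, 1] to [-1, 1] x [-1, 1]."""
    x, y = point
    return (2.0 * x - 1.0, 2.0 * y - 1.0)


def bilinear_sample(feature, point):
    """Sample an H x W map (list of rows) at a normalized (x, y) point."""
    h, w = len(feature), len(feature[0])
    gx, gy = to_grid_sample_coords(point)
    # align_corners=False: -1 and 1 refer to the outer edges of the corner pixels.
    px = ((gx + 1.0) * w - 1.0) / 2.0
    py = ((gy + 1.0) * h - 1.0) / 2.0
    x0, y0 = math.floor(px), math.floor(py)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= xx < w and 0 <= yy < h:  # out-of-range neighbours count as zero
                weight = (1.0 - abs(px - xx)) * (1.0 - abs(py - yy))
                value += weight * feature[yy][xx]
    return value


feature = [[0.0, 1.0],
           [2.0, 3.0]]
center = bilinear_sample(feature, (0.5, 0.5))       # 1.5, the mean of all four pixels
top_left = bilinear_sample(feature, (0.25, 0.25))   # 0.0, centre of pixel (0, 0)
top_right = bilinear_sample(feature, (0.75, 0.25))  # 1.0, centre of pixel (0, 1)
```

Note that the first coordinate indexes the width (x) and the second the height (y), following the grid_sample() convention.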

mmcv.ops.points_in_boxes_all(points, boxes)[source]

Find all boxes that contain each point (CUDA).

Parameters
  • points (torch.Tensor) – [B, M, 3], [x, y, z] in LiDAR/DEPTH coordinate.

  • boxes (torch.Tensor) – [B, T, 7], num_valid_boxes <= T, [x, y, z, x_size, y_size, z_size, rz], (x, y, z) is the bottom center.

Returns

The box indices of points, with shape (B, M, T). The default (background) value is 0.

Return type

torch.Tensor

mmcv.ops.points_in_boxes_cpu(points, boxes)[source]

Find all boxes that contain each point (CPU). This is the CPU version of points_in_boxes_all().

Parameters
  • points (torch.Tensor) – [B, M, 3], [x, y, z] in LiDAR/DEPTH coordinate.

  • boxes (torch.Tensor) – [B, T, 7], num_valid_boxes <= T, [x, y, z, x_size, y_size, z_size, rz], (x, y, z) is the bottom center.

Returns

The box indices of points, with shape (B, M, T). The default (background) value is 0.

Return type

torch.Tensor

mmcv.ops.points_in_boxes_part(points, boxes)[source]

Find the box that contains each point (CUDA).

Parameters
  • points (torch.Tensor) – [B, M, 3], [x, y, z] in LiDAR/DEPTH coordinate.

  • boxes (torch.Tensor) – [B, T, 7], num_valid_boxes <= T, [x, y, z, x_size, y_size, z_size, rz] in LiDAR/DEPTH coordinate, (x, y, z) is the bottom center.

Returns

The box index of each point, with shape (B, M). The default (background) value is -1.

Return type

torch.Tensor
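
The difference between the (B, M, T) output of points_in_boxes_all() / points_in_boxes_cpu() and the (B, M) output of points_in_boxes_part() can be sketched for a single batch element in pure Python. The helper names are hypothetical, and several details are assumptions rather than documented behaviour: rz is taken as a counterclockwise rotation in the x-y plane, boundary points are handled as written in the code, and the "part" variant is assumed to report the first matching box.

```python
import math


def point_in_box3d(point, box):
    """Test one [x, y, z] point against one [x, y, z, x_size, y_size, z_size, rz] box."""
    x, y, z = point
    cx, cy, cz, dx, dy, dz, rz = box
    # (cx, cy, cz) is the bottom center, so the box spans [cz, cz + dz] in z.
    if not (cz <= z <= cz + dz):
        return False
    # Rotate the horizontal offset by -rz to express it in the box's local frame.
    sx, sy = x - cx, y - cy
    cos_r, sin_r = math.cos(-rz), math.sin(-rz)
    lx = sx * cos_r - sy * sin_r
    ly = sx * sin_r + sy * cos_r
    return abs(lx) < dx / 2 and abs(ly) < dy / 2


def in_boxes_all(points, boxes):
    """One row per point, one 0/1 entry per box: shape (M, T), background 0."""
    return [[int(point_in_box3d(p, b)) for b in boxes] for p in points]


def in_boxes_part(points, boxes):
    """One box index per point: shape (M,), background -1."""
    result = []
    for p in points:
        index = -1
        for t, b in enumerate(boxes):
            if point_in_box3d(p, b):
                index = t
                break
        result.append(index)
    return result


boxes = [(0, 0, 0, 4, 2, 2, 0.0), (0, 0, 0, 4, 2, 2, math.pi / 2)]
points = [(1.5, 0.0, 1.0), (0.0, 1.5, 1.0), (0.0, 0.0, 3.0)]
all_result = in_boxes_all(points, boxes)    # [[1, 0], [0, 1], [0, 0]]
part_result = in_boxes_part(points, boxes)  # [0, 1, -1]
```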

mmcv.ops.points_in_polygons(points, polygons)[source]

Judge whether points are inside polygons. This is used in the ATSS assignment for rotated boxes.

Note that when a point lies exactly on the polygon boundary, the judgment may be inaccurate, but the effect on assignment is limited.

Parameters
  • points (torch.Tensor) – It has shape (B, 2), indicating (x, y). B is the number of predicted points.

  • polygons (torch.Tensor) – It has shape (M, 8), indicating (x1, y1, x2, y2, x3, y3, x4, y4). M is the number of ground truth polygons.

Returns

The result with shape (B, M), where 1 indicates that the point is inside the polygon and 0 indicates that it is outside.

Return type

torch.Tensor
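
One common way to perform such a test for a quadrilateral is to check that the point lies on the same side of all four edges. The sketch below uses that approach in pure Python; it is not necessarily the algorithm the mmcv kernel uses, the function names are hypothetical, and it assumes convex polygons with vertices listed in order. Points exactly on the boundary are reported as outside here, which is an arbitrary choice.

```python
def point_in_quad(point, polygon):
    """Return 1 if (x, y) is strictly inside the convex quad (x1, y1, ..., x4, y4)."""
    px, py = point
    xs, ys = polygon[0::2], polygon[1::2]
    crosses = []
    for i in range(4):
        x1, y1 = xs[i], ys[i]
        x2, y2 = xs[(i + 1) % 4], ys[(i + 1) % 4]
        # Sign of the cross product tells which side of edge i the point is on.
        crosses.append((x2 - x1) * (py - y1) - (y2 - y1) * (px - x1))
    same_side = all(c > 0 for c in crosses) or all(c < 0 for c in crosses)
    return int(same_side)


def points_in_quads(points, polygons):
    """Result with shape (B, M): one row per point, one column per polygon."""
    return [[point_in_quad(p, q) for q in polygons] for p in points]


polygons = [(0, 0, 4, 0, 4, 4, 0, 4),   # axis-aligned square
            (5, 0, 7, 2, 5, 4, 3, 2)]   # square rotated by 45 degrees
points = [(2, 2), (5, 2)]
inside = points_in_quads(points, polygons)  # [[1, 0], [0, 1]]
```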

mmcv.ops.rel_roi_point_to_rel_img_point(rois, rel_roi_points, img, spatial_scale=1.0)[source]

Convert RoI-based relative point coordinates to image-based relative point coordinates.

Parameters
  • rois (torch.Tensor) – RoIs or BBoxes, shape (N, 4) or (N, 5).

  • rel_roi_points (torch.Tensor) – Point coordinates inside the RoI, relative to the RoI location, range (0, 1), shape (N, P, 2).

  • img (tuple or torch.Tensor) – (height, width) of image or feature map.

  • spatial_scale (float, optional) – Scale points by this factor. Default: 1.

Returns

Image-based relative point coordinates for sampling, shape (N, P, 2).

Return type

torch.Tensor
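
The conversion can be sketched for a single RoI in pure Python. The sketch assumes that a point is first mapped to absolute image coordinates via the RoI corners and then divided by the image width and height, that spatial_scale multiplies the result, and that a 5-value RoI carries a leading batch index; the function name is hypothetical and none of this is taken from the mmcv source.

```python
def rel_roi_to_rel_img(roi, rel_points, img_hw, spatial_scale=1.0):
    """Map (x, y) points relative to one RoI to points relative to the image."""
    x1, y1, x2, y2 = roi[-4:]  # ignore a leading batch index if 5 values are given
    height, width = img_hw
    converted = []
    for rel_x, rel_y in rel_points:
        # Relative-to-RoI -> absolute image coordinates.
        abs_x = x1 + rel_x * (x2 - x1)
        abs_y = y1 + rel_y * (y2 - y1)
        # Absolute -> relative to the image (or feature map) size.
        converted.append((abs_x / width * spatial_scale,
                          abs_y / height * spatial_scale))
    return converted


roi = (0, 10.0, 20.0, 30.0, 60.0)  # (batch_index, x1, y1, x2, y2)
rel_points = [(0.5, 0.5), (0.0, 0.0)]
img_points = rel_roi_to_rel_img(roi, rel_points, (80, 40))
# [(0.5, 0.5), (0.25, 0.25)]
```

The output is in the [0, 1] convention that point_sample() expects, which is why the two functions are typically used together.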

mmcv.ops.soft_nms(boxes, scores, iou_threshold=0.3, sigma=0.5, min_score=0.001, method='linear', offset=0)[source]

Dispatch to the CPU-only Soft NMS implementation.

The input can be either a torch tensor or a numpy array. The returned type will always be the same as that of the inputs.

Parameters
  • boxes (torch.Tensor or np.ndarray) – boxes in shape (N, 4).

  • scores (torch.Tensor or np.ndarray) – scores in shape (N, ).

  • iou_threshold (float) – IoU threshold for NMS.

  • sigma (float) – Hyperparameter for the gaussian method.

  • min_score (float) – Score filter threshold.

  • method (str) – Either ‘linear’ or ‘gaussian’.

  • offset (int, 0 or 1) – Boxes’ width or height is (x2 - x1 + offset).

Returns

Kept dets (boxes and scores) and indices, which always have the same data type as the input.

Return type

tuple

Example

>>> import numpy as np
>>> from mmcv.ops import soft_nms
>>> boxes = np.array([[4., 3., 5., 3.],
...                   [4., 3., 5., 4.],
...                   [3., 1., 3., 1.],
...                   [3., 1., 3., 1.],
...                   [3., 1., 3., 1.],
...                   [3., 1., 3., 1.]], dtype=np.float32)
>>> scores = np.array([0.9, 0.9, 0.5, 0.5, 0.4, 0.0], dtype=np.float32)
>>> iou_threshold = 0.6
>>> dets, inds = soft_nms(boxes, scores, iou_threshold, sigma=0.5)
>>> assert len(inds) == len(dets) == 5
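
To show how the linear and gaussian methods differ, the sketch below computes the factor by which Soft NMS typically decays the score of a box that overlaps a higher-scoring one. It follows the usual Soft NMS formulation rather than the mmcv source: the function name is hypothetical, and whether the threshold comparison is strict is an assumption.

```python
import math


def soft_nms_weight(iou, iou_threshold=0.3, sigma=0.5, method='linear'):
    """Factor multiplied into the score of an overlapping box."""
    if method == 'linear':
        # Scores decay linearly with IoU, but only above the threshold.
        return 1.0 - iou if iou > iou_threshold else 1.0
    if method == 'gaussian':
        # Scores decay smoothly for every overlap; sigma controls the width.
        return math.exp(-(iou * iou) / sigma)
    raise ValueError("method must be 'linear' or 'gaussian'")


linear_high = soft_nms_weight(0.5)                  # 0.5
linear_low = soft_nms_weight(0.2)                   # 1.0
gaussian_high = soft_nms_weight(0.5, method='gaussian')
```

After rescoring, boxes whose score falls below min_score are the ones dropped, so unlike plain NMS an overlapping box is down-weighted rather than removed outright.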
mmcv.ops.upfirdn2d(input, kernel, up=1, down=1, pad=(0, 0))[source]

UpFIRDn for 2D features.

UpFIRDn is short for upsample, apply FIR filter and downsample. More details can be found in: https://www.mathworks.com/help/signal/ref/upfirdn.html

Parameters
  • input (torch.Tensor) – Tensor with shape of (n, c, h, w).

  • kernel (torch.Tensor) – Filter kernel.

  • up (int | tuple[int], optional) – Upsampling factor. If given a single number, this factor is used for both height and width. Defaults to 1.

  • down (int | tuple[int], optional) – Downsampling factor. If given a single number, this factor is used for both height and width. Defaults to 1.

  • pad (tuple[int], optional) – Padding for tensors, (x_pad, y_pad) or (x_pad_0, x_pad_1, y_pad_0, y_pad_1). Defaults to (0, 0).

Returns

Tensor after UpFIRDn.

Return type

torch.Tensor
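
The three stages named by UpFIRDn can be illustrated with a one-dimensional pure-Python sketch. It is a simplified analogue, not the 2D mmcv operator: the function name is hypothetical, the padding is assumed to be non-negative and applied after upsampling, and the filter is applied as a true convolution over the valid region.

```python
def upfirdn1d(signal, kernel, up=1, down=1, pad=(0, 0)):
    """Upsample by zero insertion, apply an FIR filter, then downsample."""
    # 1. Upsample: insert up - 1 zeros after every sample.
    upsampled = []
    for value in signal:
        upsampled.append(value)
        upsampled.extend([0.0] * (up - 1))
    # 2. Zero-pad both ends.
    padded = [0.0] * pad[0] + upsampled + [0.0] * pad[1]
    # 3. FIR filter: convolution equals correlation with the flipped kernel.
    flipped = kernel[::-1]
    k = len(flipped)
    filtered = [sum(flipped[j] * padded[i + j] for j in range(k))
                for i in range(len(padded) - k + 1)]
    # 4. Downsample: keep every down-th sample.
    return filtered[::down]


signal = [1.0, 2.0, 3.0]
doubled = upfirdn1d(signal, [1.0, 1.0], up=2, pad=(1, 0))
# [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
restored = upfirdn1d(signal, [1.0, 1.0], up=2, down=2, pad=(1, 0))
# [1.0, 2.0, 3.0]
```

With a box kernel of length 2 and up=2, the result is nearest-neighbour upsampling; smoother kernels give interpolated results.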
