Several questions about the RandAugment implementation #3359

@flintning

PaddleClas and PaddlePaddle versions: PaddleClas release/2.6.0 and PaddlePaddle 2.6.0
Versions of other products involved: Pillow 11.1.0
Training environment:
a. Operating system: Linux64
b. Python version: Python 3.10.12
c. CUDA/cuDNN version: CUDA 10.2 / cuDNN 7.6.5, etc.

The RandAugment implementation in https://github.com/PaddlePaddle/PaddleClas/blob/release/2.6/ppcls/data/preprocess/ops/randaugment.py differs from the original implementation at https://github.com/heartInsert/Randaugment/blob/master/Rand_Augment.py.

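  1. Randomness
    Original implementation: when RandAugment is constructed, np.linspace defines a ranges array of length 10 for each transform. Before the operations are applied, it randomly selects Numbers operations (by default, half the length of the transform list) and randomly generates Numbers indices into ranges, then zips them into a list of (op, ranges[idx]) pairs. Each operation therefore gets a random magnitude. The operations are then applied one by one, and at that point most of them randomly choose the sign of the magnitude to decide the direction.
    PaddleClas implementation: when RandAugment is constructed, level_map defines only a single fixed level for each transform. At call time only the num_layers operations are chosen at random, and each operation's magnitude is its fixed level, so the value has no randomness. Some operations (rotate excluded) still randomly choose the sign of the magnitude.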

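My question: was the randomness of the magnitude value removed from RandAugment as a result of actually testing the training results? AutoAugment keeps the approach of the original implementation.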
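The difference described in point 1 can be illustrated with a minimal sketch. The op table `OP_MAX` and the helper names `sample_fixed` and `sample_random` are hypothetical and the values are illustrative assumptions, not copied from either repository; only the sampling pattern is meant to match the description above.

```python
import numpy as np

# Illustrative upper bounds per op (assumed values for this sketch).
OP_MAX = {"shearX": 0.3, "rotate": 30.0, "color": 0.9}
MAX_LEVEL = 10


def sample_fixed(num_layers, magnitude, rng):
    """PaddleClas-style: random ops, each with one fixed magnitude."""
    ops = rng.choice(list(OP_MAX), size=num_layers)
    return [(str(op), OP_MAX[str(op)] * magnitude / MAX_LEVEL) for op in ops]


def sample_random(num_ops, rng):
    """Original-style: random ops, each paired with a random index into its range."""
    ranges = {name: np.linspace(0, top, MAX_LEVEL) for name, top in OP_MAX.items()}
    ops = rng.choice(list(OP_MAX), size=num_ops)
    idxs = rng.integers(0, MAX_LEVEL, size=num_ops)
    return [(str(op), float(ranges[str(op)][i])) for op, i in zip(ops, idxs)]


rng = np.random.default_rng(0)
print(sample_fixed(2, 5, rng))   # the magnitude for a given op never changes
print(sample_random(2, rng))     # the magnitude for a given op varies per call
```

With `magnitude=5`, the fixed variant always yields 15.0 for rotate, while the sampled variant draws one of the 10 values in its range on every call.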

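  2. rotate
    Both implementations use rotate_with_fill (https://stackoverflow.com/questions/5252170/) to perform the rotation. There are several issues here:
  • The magnitude has no random rotation direction, so the image can only be rotated counter-clockwise.
  • The fill uses the fixed value (128,)*4, which is not appropriate for grayscale images; it should follow the fillcolor passed in when RandAugment is constructed.
  • Since Pillow 5.2.0, Image.rotate accepts a fillcolor argument, so it can be called directly. The original workaround is no longer needed, and the fillcolor issue is solved as well.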
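As a minimal sketch of the last two points, the snippet below calls Image.rotate with fillcolor on a grayscale and an RGB image. The helpers `fill_for_mode` and `random_rotate` are hypothetical names introduced for illustration, and deriving the fill value from the band count is an assumption of this sketch, not part of either implementation.

```python
import random

from PIL import Image


def fill_for_mode(img, level=128):
    """Return a fill value matching the band count (hypothetical helper)."""
    bands = len(img.getbands())
    return level if bands == 1 else (level,) * bands


def random_rotate(img, max_angle):
    """Rotate by +max_angle or -max_angle degrees, filling the exposed corners."""
    angle = max_angle * random.choice([-1, 1])
    # Resampling is left at Pillow's default (nearest neighbour).
    return img.rotate(angle, fillcolor=fill_for_mode(img))


gray = Image.new("L", (32, 32), 255)
rgb = Image.new("RGB", (32, 32), (255, 255, 255))
gray_out = random_rotate(gray, 45)
rgb_out = random_rotate(rgb, 45)
print(gray_out.getpixel((0, 0)), rgb_out.getpixel((0, 0)))
```

Passing an int for single-band images and a tuple for multi-band images keeps the fill value consistent with the image mode in both cases.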

Below is the modified randaugment.py. It works in my tests with both grayscale and color images. The magnitude randomness issue is left unchanged.

import random

import numpy as np
from PIL import Image, ImageEnhance, ImageOps


class RandAugment(object):
    def __init__(self, num_layers=2, magnitude=5, fillcolor=(128, 128, 128)):
        self.num_layers = num_layers
        self.magnitude = magnitude
        self.max_level = 10

        abso_level = self.magnitude / self.max_level
        self.level_map = {
            "shearX": 0.3 * abso_level,
            "shearY": 0.3 * abso_level,
            "translateX": 150.0 / 331 * abso_level,
            "translateY": 150.0 / 331 * abso_level,
            "rotate": 30 * abso_level,
            "color": 0.9 * abso_level,
            "posterize": int(4.0 * abso_level),
            "solarize": 256.0 * abso_level,
            "contrast": 0.9 * abso_level,
            "sharpness": 0.9 * abso_level,
            "brightness": 0.9 * abso_level,
            "autocontrast": 0,
            "equalize": 0,
            "invert": 0
        }

        rnd_ch_op = random.choice

        self.func = {
            "shearX": lambda img, magnitude: img.transform(
                img.size,
                Image.AFFINE,
                (1, magnitude * rnd_ch_op([-1, 1]), 0, 0, 1, 0),
                Image.BICUBIC,
                fillcolor=fillcolor),
            "shearY": lambda img, magnitude: img.transform(
                img.size,
                Image.AFFINE,
                (1, 0, 0, magnitude * rnd_ch_op([-1, 1]), 1, 0),
                Image.BICUBIC,
                fillcolor=fillcolor),
            "translateX": lambda img, magnitude: img.transform(
                img.size,
                Image.AFFINE,
                (1, 0, magnitude * img.size[0] * rnd_ch_op([-1, 1]), 0, 1, 0),
                fillcolor=fillcolor),
            "translateY": lambda img, magnitude: img.transform(
                img.size,
                Image.AFFINE,
                (1, 0, 0, 0, 1, magnitude * img.size[1] * rnd_ch_op([-1, 1])),
                fillcolor=fillcolor),
            "rotate": lambda img, magnitude: img.rotate(
                magnitude * rnd_ch_op([-1, 1]),
                resample=Image.NEAREST,
                fillcolor=fillcolor),
            "color": lambda img, magnitude: ImageEnhance.Color(img).enhance(
                1 + magnitude * rnd_ch_op([-1, 1])),
            "posterize": lambda img, magnitude:
                ImageOps.posterize(img, magnitude),
            "solarize": lambda img, magnitude:
                ImageOps.solarize(img, magnitude),
            "contrast": lambda img, magnitude:
                ImageEnhance.Contrast(img).enhance(
                    1 + magnitude * rnd_ch_op([-1, 1])),
            "sharpness": lambda img, magnitude:
                ImageEnhance.Sharpness(img).enhance(
                    1 + magnitude * rnd_ch_op([-1, 1])),
            "brightness": lambda img, magnitude:
                ImageEnhance.Brightness(img).enhance(
                    1 + magnitude * rnd_ch_op([-1, 1])),
            "autocontrast": lambda img, _:
                ImageOps.autocontrast(img),
            "equalize": lambda img, _: ImageOps.equalize(img),
            "invert": lambda img, _: ImageOps.invert(img)
        }

    def __call__(self, img):
        # Apply num_layers randomly chosen ops, each at its fixed level.
        available_op_names = list(self.level_map.keys())
        for _ in range(self.num_layers):
            op_name = np.random.choice(available_op_names)
            img = self.func[op_name](img, self.level_map[op_name])
        return img
