Building Your Own Object Detection and Recognition Model with TensorFlow: Data Augmentation (Part 2)
The previous post described how to install the TensorFlow Object Detection API and the problems encountered along the way. See: https://blog.csdn.net/weixin_41644725/article/details/83007901
Next, we augment the image data. Although the pipeline .config file (covered later) already exposes built-in data augmentation options, this post shows how to implement the augmentation by hand; if you do not need that, simply skip it.
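For reference, the built-in augmentation is enabled through data_augmentation_options blocks in the train_config section of the pipeline .config file. The snippet below is only a sketch modeled on the sample configs shipped with the Object Detection API; random_horizontal_flip and ssd_random_crop are two of the built-in options, and which options make sense depends on your model.

train_config: {
  # other training settings omitted
  # each data_augmentation_options block enables one built-in preprocessing step
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}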
1. Generate .xml annotation files with the labelImg tool.
The tool's interface is shown in the figure below. For installing labelImg on Windows or Linux, refer to the guides available online; the installation steps are not covered here. In the toolbar, "Open Dir" opens the folder containing all of the images, "Change Save Dir" sets the folder where the generated .xml files are stored, and "Save" saves the .xml file for the current image.
The format of a generated .xml file is shown in the figure below:
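The generated annotation follows the Pascal VOC layout. The hypothetical example below (file name, image size, class name, and coordinates are placeholders) shows the fields that the conversion script in the next step reads: filename, the width and height under size, and, for each object, its name and bndbox.

<annotation>
    <folder>images</folder>
    <filename>023.jpg</filename>
    <size>
        <width>800</width>
        <height>600</height>
        <depth>3</depth>
    </size>
    <object>
        <name>cat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>120</xmin>
            <ymin>80</ymin>
            <xmax>420</xmax>
            <ymax>560</ymax>
        </bndbox>
    </object>
</annotation>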
2. Convert the .xml files to a .csv file
(1) The code that converts the .xml files into a single .csv file is as follows:
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df

def main():
    xml_path = './xml'  # folder that holds the .xml files
    xml_df = xml_to_csv(xml_path)
    xml_df.to_csv('./csv/class.csv', index=None)  # write the .csv file to this path
    print('Successfully converted xml to csv.')

main()
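With an annotation like the hypothetical one above, the resulting class.csv contains one row per labeled object, for example (values are placeholders):

filename,width,height,class,xmin,ymin,xmax,ymax
023.jpg,800,600,cat,120,80,420,560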
(2) Get the bounding boxes (and class) of an image from the .csv file. The code is as follows:
import os
import cv2
import pandas as pd
import matplotlib.pyplot as plt

def get_bbox(image_name, csv_path):
    full_labels = pd.read_csv(csv_path)
    selected_value = full_labels[full_labels.filename == image_name]
    images_bbox = []
    img_class = ''
    for index, row in selected_value.iterrows():
        list_bbox = []
        list_bbox.append(row['xmin'])
        list_bbox.append(row['ymin'])
        list_bbox.append(row['xmax'])
        list_bbox.append(row['ymax'])
        list_bbox.append(image_name)
        img_class = row['class']
        images_bbox.append(list_bbox)
    return images_bbox, img_class

img_path = '023.jpg'
csv_path = './csv/class.csv'
img = cv2.imread(img_path)
b, g, r = cv2.split(img)
img = cv2.merge([r, g, b])  # OpenCV loads BGR; convert to RGB for matplotlib
image = cv2.GaussianBlur(img, (3, 3), 0)
coords, img_class = get_bbox(img_path, csv_path)
coords = [coord[:4] for coord in coords]  # keep only [x_min, y_min, x_max, y_max]
for i in range(len(coords)):
    bbox = coords[i]
    x_min = bbox[0]
    y_min = bbox[1]
    x_max = bbox[2]
    y_max = bbox[3]
    cv2.rectangle(image, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
plt.subplot(111), plt.imshow(image), plt.title('original', fontsize='medium')
plt.show()
The output is shown below:
3. Image data augmentation
(1) Adjust the image brightness
The code is as follows (this and the following snippets reuse the get_bbox helper and csv_path defined above):
import os
import cv2
import math
import random
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from skimage import exposure  # random, numpy, math and skimage.exposure are used by the augmentation functions below

'''adjust brightness'''
def changeLight(img, bboxes):
    flag = random.uniform(1.5, 2)  # flag > 1 darkens the image, flag < 1 brightens it
    img = exposure.adjust_gamma(img, flag)
    # write and re-read to get a plain uint8 image, then delete the temporary file
    cv2.imwrite('./1.jpg', img)
    img = cv2.imread('./1.jpg')
    os.remove('./1.jpg')
    for i in range(len(bboxes)):
        bbox = bboxes[i]
        x_min = bbox[0]
        y_min = bbox[1]
        x_max = bbox[2]
        y_max = bbox[3]
        cv2.rectangle(img, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
    return img

img_path = '023.jpg'
img = cv2.imread(img_path)
b, g, r = cv2.split(img)
img = cv2.merge([r, g, b])
img = cv2.GaussianBlur(img, (3, 3), 0)
image = cv2.GaussianBlur(img, (3, 3), 0)
coords, img_class = get_bbox(img_path, csv_path)
coords = [coord[:4] for coord in coords]
for i in range(len(coords)):
    bbox = coords[i]
    x_min = bbox[0]
    y_min = bbox[1]
    x_max = bbox[2]
    y_max = bbox[3]
    cv2.rectangle(image, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
'''adjust brightness'''
change_light_img = changeLight(img=img, bboxes=coords)
plt.subplot(121), plt.imshow(image), plt.title('original', fontsize='medium')
plt.subplot(122), plt.imshow(change_light_img), plt.title('change light', fontsize='medium')
plt.show()
The output is shown below:
(2) Cutout
The code is as follows:
'''cutout'''
def cutout(img, bboxes, length=100, n_holes=1, threshold=0.5):
    '''
    Original version: https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
    Randomly mask out one or more patches from an image.
    Args:
        img : a 3D numpy array, (h, w, c)
        bboxes : list of bounding-box coordinates
        n_holes (int): Number of patches to cut out of each image.
        length (int): The length (in pixels) of each square patch.
    '''
    def cal_iou(boxA, boxB):
        '''
        boxA and boxB are two boxes; returns their IoU.
        boxB is the ground-truth bounding box.
        '''
        # determine the (x, y)-coordinates of the intersection rectangle
        xA = max(boxA[0], boxB[0])
        yA = max(boxA[1], boxB[1])
        xB = min(boxA[2], boxB[2])
        yB = min(boxA[3], boxB[3])
        if xB <= xA or yB <= yA:
            return 0.0
        # compute the area of the intersection rectangle
        interArea = (xB - xA + 1) * (yB - yA + 1)
        # compute the area of both the prediction and ground-truth rectangles
        boxAArea = (boxA[2] - boxA[0] + 1) * (boxA[3] - boxA[1] + 1)
        boxBArea = (boxB[2] - boxB[0] + 1) * (boxB[3] - boxB[1] + 1)
        # compute the intersection over union by taking the intersection
        # area and dividing it by the sum of prediction + ground-truth
        # areas - the intersection area
        iou = interArea / float(boxAArea + boxBArea - interArea)
        # iou = interArea / float(boxBArea)
        # return the intersection over union value
        return iou

    # get h and w
    if img.ndim == 3:
        h, w, c = img.shape
    else:
        _, h, w, c = img.shape
    mask = np.ones((h, w, c), np.float32)
    for n in range(n_holes):
        chongdie = True  # whether the cut-out region overlaps a box too much
        while chongdie:
            y = np.random.randint(h)
            x = np.random.randint(w)
            # np.clip(a, a_min, a_max) limits array values to the range [a_min, a_max]
            y1 = np.clip(y - length // 2, 0, h)
            y2 = np.clip(y + length // 2, 0, h)
            x1 = np.clip(x - length // 2, 0, w)
            x2 = np.clip(x + length // 2, 0, w)
            chongdie = False
            for box in bboxes:
                if cal_iou([x1, y1, x2, y2], box) > threshold:
                    chongdie = True
                    break
        mask[y1: y2, x1: x2, :] = 0.
    # mask = np.expand_dims(mask, axis=0)
    img = img * mask
    for i in range(len(bboxes)):
        bbox = bboxes[i]
        x_min = bbox[0]
        y_min = bbox[1]
        x_max = bbox[2]
        y_max = bbox[3]
        cv2.rectangle(img, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
    # write and re-read to convert back to uint8, then delete the temporary file
    cv2.imwrite('./1.jpg', img)
    img = cv2.imread('./1.jpg')
    os.remove('./1.jpg')
    return img

img_path = '023.jpg'
img = cv2.imread(img_path)
b, g, r = cv2.split(img)
img = cv2.merge([r, g, b])
img = cv2.GaussianBlur(img, (3, 3), 0)
image = cv2.GaussianBlur(img, (3, 3), 0)
coords, img_class = get_bbox(img_path, csv_path)
coords = [coord[:4] for coord in coords]
for i in range(len(coords)):
    bbox = coords[i]
    x_min = bbox[0]
    y_min = bbox[1]
    x_max = bbox[2]
    y_max = bbox[3]
    cv2.rectangle(image, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
'''cutout'''
cut_out_img = cutout(img=img, bboxes=coords)
plt.subplot(121), plt.imshow(image), plt.title('original', fontsize='medium')
plt.subplot(122), plt.imshow(cut_out_img), plt.title('cutout', fontsize='medium')
plt.show()
The output is shown below:
(3) Rotation
The code is as follows:
'''rotation'''
def rotate_img_bbox(img, bboxes, angle=5, scale=1.):
    '''
    Reference: https://blog.csdn.net/u014540717/article/details/53301195crop_rate
    Input:
        img: image array, (h, w, c)
        bboxes: all bounding boxes in the image, a list whose elements are [x_min, y_min, x_max, y_max]; make sure they are numeric
        angle: rotation angle
        scale: default 1
    Output:
        rot_img: rotated image array
        rot_bboxes: list of rotated bounding-box coordinates
    '''
    # ---------------------- rotate the image ----------------------
    w = img.shape[1]
    h = img.shape[0]
    # degrees to radians
    rangle = np.deg2rad(angle)  # angle in radians
    # now calculate new image width and height
    nw = (abs(np.sin(rangle) * h) + abs(np.cos(rangle) * w)) * scale
    nh = (abs(np.cos(rangle) * h) + abs(np.sin(rangle) * w)) * scale
    # ask OpenCV for the rotation matrix
    rot_mat = cv2.getRotationMatrix2D((nw * 0.5, nh * 0.5), angle, scale)
    # calculate the move from the old center to the new center combined
    # with the rotation
    rot_move = np.dot(rot_mat, np.array([(nw - w) * 0.5, (nh - h) * 0.5, 0]))
    # the move only affects the translation, so update the translation
    # part of the transform
    rot_mat[0, 2] += rot_move[0]
    rot_mat[1, 2] += rot_move[1]
    # affine transform
    rot_img = cv2.warpAffine(img, rot_mat, (int(math.ceil(nw)), int(math.ceil(nh))), flags=cv2.INTER_LANCZOS4)
    # ---------------------- correct the bbox coordinates ----------------------
    # rot_mat is the final rotation matrix
    # take the midpoints of the four edges of each original bbox and map them into the rotated coordinate system
    rot_bboxes = list()
    for bbox in bboxes:
        xmin = bbox[0]
        ymin = bbox[1]
        xmax = bbox[2]
        ymax = bbox[3]
        point1 = np.dot(rot_mat, np.array([(xmin + xmax) / 2, ymin, 1]))
        point2 = np.dot(rot_mat, np.array([xmax, (ymin + ymax) / 2, 1]))
        point3 = np.dot(rot_mat, np.array([(xmin + xmax) / 2, ymax, 1]))
        point4 = np.dot(rot_mat, np.array([xmin, (ymin + ymax) / 2, 1]))
        # stack the points into one array
        concat = np.vstack((point1, point2, point3, point4))
        # change the array type
        concat = concat.astype(np.int32)
        # get the rotated coordinates
        rx, ry, rw, rh = cv2.boundingRect(concat)
        rx_min = rx
        ry_min = ry
        rx_max = rx + rw
        ry_max = ry + rh
        # append to the list
        rot_bboxes.append([rx_min, ry_min, rx_max, ry_max])
    for i in range(len(rot_bboxes)):
        bbox = rot_bboxes[i]
        x_min = bbox[0]
        y_min = bbox[1]
        x_max = bbox[2]
        y_max = bbox[3]
        cv2.rectangle(rot_img, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
    cv2.imwrite('./1.jpg', rot_img)
    rot_img = cv2.imread('./1.jpg')
    os.remove('./1.jpg')
    return rot_img

img_path = '023.jpg'
img = cv2.imread(img_path)
b, g, r = cv2.split(img)
img = cv2.merge([r, g, b])
img = cv2.GaussianBlur(img, (3, 3), 0)
image = cv2.GaussianBlur(img, (3, 3), 0)
coords, img_class = get_bbox(img_path, csv_path)
coords = [coord[:4] for coord in coords]
for i in range(len(coords)):
    bbox = coords[i]
    x_min = bbox[0]
    y_min = bbox[1]
    x_max = bbox[2]
    y_max = bbox[3]
    cv2.rectangle(image, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
'''rotation'''
rotate_img = rotate_img_bbox(img=img, bboxes=coords)
plt.subplot(121), plt.imshow(image), plt.title('original', fontsize='medium')
plt.subplot(122), plt.imshow(rotate_img), plt.title('rotate', fontsize='medium')
plt.show()
The output is shown below:
(4) Cropping
The code is as follows:
'''crop'''
def crop_img_bboxes(img, bboxes):
    '''
    The cropped image must still contain all of the boxes.
    Input:
        img: image array
        bboxes: all bounding boxes in the image, a list whose elements are [x_min, y_min, x_max, y_max]; make sure they are numeric
    Output:
        crop_img: cropped image array
        crop_bboxes: list of bounding-box coordinates after cropping
    '''
    # ---------------------- crop the image ----------------------
    w = img.shape[1]
    h = img.shape[0]
    x_min = w  # smallest box that contains all of the object boxes
    x_max = 0
    y_min = h
    y_max = 0
    for bbox in bboxes:
        x_min = min(x_min, bbox[0])
        y_min = min(y_min, bbox[1])
        x_max = max(x_max, bbox[2])
        y_max = max(y_max, bbox[3])
    d_to_left = x_min  # distance from that smallest box to the left edge
    d_to_right = w - x_max  # distance from that smallest box to the right edge
    d_to_top = y_min  # distance from that smallest box to the top edge
    d_to_bottom = h - y_max  # distance from that smallest box to the bottom edge
    # randomly expand the smallest box
    crop_x_min = int(x_min - random.uniform(0, d_to_left))
    crop_y_min = int(y_min - random.uniform(0, d_to_top))
    crop_x_max = int(x_max + random.uniform(0, d_to_right))
    crop_y_max = int(y_max + random.uniform(0, d_to_bottom))
    # randomly expand the smallest box (alternative that keeps the crop from being too small)
    # crop_x_min = int(x_min - random.uniform(d_to_left//2, d_to_left))
    # crop_y_min = int(y_min - random.uniform(d_to_top//2, d_to_top))
    # crop_x_max = int(x_max + random.uniform(d_to_right//2, d_to_right))
    # crop_y_max = int(y_max + random.uniform(d_to_bottom//2, d_to_bottom))
    # make sure we do not go out of bounds
    crop_x_min = max(0, crop_x_min)
    crop_y_min = max(0, crop_y_min)
    crop_x_max = min(w, crop_x_max)
    crop_y_max = min(h, crop_y_max)
    crop_img = img[crop_y_min:crop_y_max, crop_x_min:crop_x_max]
    # ---------------------- crop the bounding boxes ----------------------
    # compute the bounding-box coordinates after cropping
    crop_bboxes = list()
    for bbox in bboxes:
        crop_bboxes.append([bbox[0] - crop_x_min, bbox[1] - crop_y_min, bbox[2] - crop_x_min, bbox[3] - crop_y_min])
    for i in range(len(crop_bboxes)):
        bbox = crop_bboxes[i]
        x_min = bbox[0]
        y_min = bbox[1]
        x_max = bbox[2]
        y_max = bbox[3]
        cv2.rectangle(crop_img, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
    cv2.imwrite('./1.jpg', crop_img)
    crop_img = cv2.imread('./1.jpg')
    os.remove('./1.jpg')
    return crop_img

img_path = '023.jpg'
img = cv2.imread(img_path)
b, g, r = cv2.split(img)
img = cv2.merge([r, g, b])
img = cv2.GaussianBlur(img, (3, 3), 0)
image = cv2.GaussianBlur(img, (3, 3), 0)
coords, img_class = get_bbox(img_path, csv_path)
coords = [coord[:4] for coord in coords]
for i in range(len(coords)):
    bbox = coords[i]
    x_min = bbox[0]
    y_min = bbox[1]
    x_max = bbox[2]
    y_max = bbox[3]
    cv2.rectangle(image, (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 255, 0), 3)
'''crop'''
crop_img = crop_img_bboxes(img=img, bboxes=coords)
plt.subplot(121), plt.imshow(image), plt.title('original', fontsize='medium')
plt.subplot(122), plt.imshow(crop_img), plt.title('crop', fontsize='medium')
plt.show()
The output is shown below: