[YOLOv5] Splitting an annotated dataset (with complete, runnable Python code)

Published 2023-04-03 19:15 · Category: 《随便一记》


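Problem description

I am preparing to train my own model with yolov5. I re-annotated a downloaded open-source dataset to fit my own requirements, and now need to split it.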

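Problem analysis

Splitting a dataset comes down to two main steps: first shuffle it, then divide it into a training set, a validation set, and a test set in a fixed ratio.
The ratio I use here is 7:1:2.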

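Steps

1. Shuffle the dataset

The dataset consists of images and annotation files, so the two kinds of files have to be bound together before they are shuffled.
After reading both directories, bind the two lists with the zip function: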

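    each_class_image = []
    each_class_label = []
    # os.listdir returns names in arbitrary order, so sort both listings;
    # zip then pairs each image with its own label, provided every image
    # has exactly one annotation file with the same base name.
    for image in sorted(os.listdir(file_path)):
        each_class_image.append(image)
    for label in sorted(os.listdir(xml_path)):
        each_class_label.append(label)
    data = list(zip(each_class_image, each_class_label))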
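
Pairing by list position only works when both directory listings line up. As a minimal sketch of a more defensive alternative, the files can be paired by base name instead, assuming an image and its annotation differ only in extension (for example 0001.jpg and 0001.txt). `pair_by_stem` is a hypothetical helper and is not part of the original script:

```python
import os


def pair_by_stem(image_names, label_names):
    """Pair image and label file names that share the same base name.

    Hypothetical helper: assumes an image and its annotation differ only
    in extension. Images without a matching label are skipped.
    """
    labels_by_stem = {os.path.splitext(name)[0]: name for name in label_names}
    pairs = []
    for image in sorted(image_names):
        stem = os.path.splitext(image)[0]
        if stem in labels_by_stem:
            pairs.append((image, labels_by_stem[stem]))
    return pairs


# Example with in-memory names; in the script these lists would come from
# os.listdir(file_path) and os.listdir(xml_path).
pairs = pair_by_stem(["b.jpg", "a.jpg", "c.jpg"], ["a.txt", "b.txt"])
print(pairs)  # [('a.jpg', 'a.txt'), ('b.jpg', 'b.txt')]
```

The resulting list of pairs can be shuffled and unzipped exactly like `data` above.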

Then shuffle the pairs and separate them back into two lists:

    random.shuffle(data)
    each_class_image, each_class_label = zip(*data)

2. Slice the two lists according to the chosen ratio

Use three lists each for the image names and the annotation file names:

    train_images = each_class_image[0:int(train_rate * total)]
    val_images = each_class_image[int(train_rate * total):int((train_rate + val_rate) * total)]
    test_images = each_class_image[int((train_rate + val_rate) * total):]

    train_labels = each_class_label[0:int(train_rate * total)]
    val_labels = each_class_label[int(train_rate * total):int((train_rate + val_rate) * total)]
    test_labels = each_class_label[int((train_rate + val_rate) * total):]
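
One detail worth checking in these slices is floating-point error: in Python, 0.7 + 0.1 evaluates to 0.7999999999999999, so `int((0.7 + 0.1) * 10)` truncates to 7 rather than 8 and the validation slice of a 10-item dataset comes out empty. The sketch below rounds the boundaries instead of truncating them. `split_bounds` is a hypothetical helper, and rounding is one possible choice here, not the original author's method:

```python
def split_bounds(total, train_rate, val_rate):
    """Return (train_end, val_end) slice indices for `total` items.

    Hypothetical helper: rounds instead of truncating, so a tiny
    floating-point error in train_rate + val_rate cannot move a
    boundary down by one.
    """
    train_end = round(train_rate * total)
    val_end = round((train_rate + val_rate) * total)
    return train_end, val_end


items = list(range(10))
train_end, val_end = split_bounds(len(items), 0.7, 0.1)
train = items[:train_end]
val = items[train_end:val_end]
test = items[val_end:]
print(len(train), len(val), len(test))  # 7 1 2
```

The test set still takes whatever remains after the first two slices, as in the original code.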

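3. Create the folders locally and save each split

The code below creates the train, val, and test folders, each with an images and a labels subfolder, and copies every file into its folder. Once it has run, the split dataset is saved.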

    for image in train_images:
        # print(image)
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'train' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in train_labels:
        # print(label)
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'train' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)

    for image in val_images:
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'val' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in val_labels:
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'val' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)

    for image in test_images:
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'test' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in test_labels:
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'test' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)
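
The six loops above differ only in the source folder, the split name, and the subfolder name, so they could be folded into one function. The following is a sketch under that assumption. `copy_split` is a hypothetical helper, and the demo runs on a temporary directory with illustrative file names instead of the real dataset paths:

```python
import os
import shutil
import tempfile


def copy_split(names, src_dir, dst_root, split, kind):
    """Copy the files in `names` from src_dir into dst_root/split/kind."""
    dst_dir = os.path.join(dst_root, split, kind)
    os.makedirs(dst_dir, exist_ok=True)  # no error if the folder already exists
    for name in names:
        shutil.copy(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return dst_dir


# Self-contained demo on a throwaway directory (names are illustrative).
root = tempfile.mkdtemp()
src = os.path.join(root, "drone_images")
os.makedirs(src)
for name in ("a.jpg", "b.jpg"):
    open(os.path.join(src, name), "w").close()

out = copy_split(["a.jpg", "b.jpg"], src, os.path.join(root, "droneData"), "train", "images")
copied = sorted(os.listdir(out))
print(copied)  # ['a.jpg', 'b.jpg']
```

In the script, the same function would be called once per list, for example with `train_images`, `file_path`, `new_file_path`, 'train', and 'images'.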

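Results

Just run the single Python file directly.
When it finishes, check the output folder on disk.
The images and annotation files are shuffled, and each image still corresponds to its own annotation file.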
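
The one-to-one correspondence can also be checked programmatically instead of by eye. Here is a minimal sketch, assuming an image and its annotation share a base name. `unmatched_stems` is a hypothetical helper, and the demo builds a small temporary folder in place of the real output directory:

```python
import os
import tempfile


def unmatched_stems(images_dir, labels_dir):
    """Return (images without a label, labels without an image), by base name."""
    image_stems = {os.path.splitext(n)[0] for n in os.listdir(images_dir)}
    label_stems = {os.path.splitext(n)[0] for n in os.listdir(labels_dir)}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Demo: two images but only one annotation file.
root = tempfile.mkdtemp()
images_dir = os.path.join(root, "train", "images")
labels_dir = os.path.join(root, "train", "labels")
os.makedirs(images_dir)
os.makedirs(labels_dir)
for name in ("a.jpg", "b.jpg"):
    open(os.path.join(images_dir, name), "w").close()
open(os.path.join(labels_dir, "a.txt"), "w").close()

missing_labels, missing_images = unmatched_stems(images_dir, labels_dir)
print(missing_labels, missing_images)  # ['b'] []
```

Two empty lists for each of train, val, and test would indicate that every image has a matching annotation file.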

Complete code

import os
import shutil
import random

random.seed(0)  # fixed seed so the shuffle is reproducible


def split_data(file_path, xml_path, new_file_path, train_rate, val_rate, test_rate):
    # test_rate is not used directly: the test set takes whatever remains
    # after the train and val slices.
    each_class_image = []
    each_class_label = []
    # os.listdir returns names in arbitrary order, so sort both listings;
    # zip then pairs each image with its own label, provided every image
    # has exactly one annotation file with the same base name.
    for image in sorted(os.listdir(file_path)):
        each_class_image.append(image)
    for label in sorted(os.listdir(xml_path)):
        each_class_label.append(label)
    data = list(zip(each_class_image, each_class_label))
    total = len(each_class_image)

    # Shuffle the (image, label) pairs together, then separate them again.
    random.shuffle(data)
    each_class_image, each_class_label = zip(*data)

    # Slice both sequences at the same indices so the pairs stay aligned.
    train_images = each_class_image[0:int(train_rate * total)]
    val_images = each_class_image[int(train_rate * total):int((train_rate + val_rate) * total)]
    test_images = each_class_image[int((train_rate + val_rate) * total):]
    train_labels = each_class_label[0:int(train_rate * total)]
    val_labels = each_class_label[int(train_rate * total):int((train_rate + val_rate) * total)]
    test_labels = each_class_label[int((train_rate + val_rate) * total):]

    # Copy every file into <new_file_path>/<split>/<images|labels>.
    for image in train_images:
        print(image)
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'train' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in train_labels:
        print(label)
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'train' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)

    for image in val_images:
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'val' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in val_labels:
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'val' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)

    for image in test_images:
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'test' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in test_labels:
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'test' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)


if __name__ == '__main__':
    file_path = "D:/Files/dataSet/drone_images"
    xml_path = 'D:/Files/dataSet/drone_labels'
    new_file_path = "D:/Files/dataSet/droneData"
    split_data(file_path, xml_path, new_file_path, train_rate=0.7, val_rate=0.1, test_rate=0.2)

Link to this article: http://m.zhangshiyu.com/post/58248.html
