MS COCO Dataset Annotations Explained

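Annotation format of the COCO dataset
About the COCO dataset: the full name is Common Objects in COntext. It is a dataset provided by a Microsoft team for image recognition. The images in MS COCO are split into training, validation, and test sets. COCO collected its images by searching Flickr for 80 object categories and a variety of scene types, and relied on Amazon Mechanical Turk (AMT) for the annotation work.
Annotation types in the COCO dataset: object instances, object keypoints, and image captions.
Storage: the annotations are stored as JSON files. Each annotation type has a training split and a validation split, so the 3 annotation types give 6 JSON files in total.
Basic JSON structure
Each JSON file contains four parts: info, licenses, images, and annotations. The format of annotation (the image annotation itself) differs for each annotation type.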
{
"info": info,
"licenses":[license],
"images":[image],
"annotations":[annotation],
}
info{
"year":int,
"version": str,
"description": str,
"contributor": str,
"url": str,
"date_created": datetime,
}
license{
"id":int,
"name": str,
"url": str,
}
image{
"id":int,
"width":int,
"height":int,
"file_name": str,
"license":int,
"flickr_url": str,
"coco_url": str,
"date_captured": datetime,
}
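As a minimal sketch of working with this structure, the Python below parses a small JSON string shaped like the layout above and indexes the images and licenses by id. Every sample value (file names, sizes, ids) and the helper name index_by_id are invented for illustration and are not taken from the real dataset.

```python
import json

# Hypothetical miniature file; all values are invented for illustration.
sample_text = json.dumps({
    "info": {"year": 2017, "version": "1.0", "description": "toy example"},
    "licenses": [{"id": 1, "name": "toy license", "url": ""}],
    "images": [
        {"id": 10, "width": 640, "height": 480, "file_name": "a.jpg", "license": 1},
        {"id": 11, "width": 320, "height": 240, "file_name": "b.jpg", "license": 1},
    ],
    "annotations": [],
})


def index_by_id(records):
    """Map each record's "id" to the record itself."""
    return {record["id"]: record for record in records}


dataset = json.loads(sample_text)
images_by_id = index_by_id(dataset["images"])
licenses_by_id = index_by_id(dataset["licenses"])

# The four top-level parts described above.
top_level_keys = sorted(dataset.keys())
print(top_level_keys)
print(images_by_id[11]["file_name"])
```

With a real file, json.load on the opened annotation file would take the place of the in-memory string.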
Annotation format for object instances
1. Overall JSON file format:
{
"info": info,
"licenses":[license],
"images":[image],
"annotations":[annotation],
"categories":[category]
}
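2. annotation format:
The annotations field is an array of annotation instances. Each annotation has a set of fields, such as the category id and the segmentation mask of the object. The segmentation format depends on whether the instance is a single object (iscrowd=0, stored as polygons) or a group of objects (iscrowd=1, stored as RLE), as shown below: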
annotation{
"id":int,
"image_id":int,
"category_id":int,
"segmentation": RLE or [polygon],
"area":float,
"bbox":[x,y,width,height],
"iscrowd":0 or 1,
}
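A single object (iscrowd=0) may need several polygons, for example when the object is partly occluded in the image. When iscrowd=1 (the annotation covers a group of objects, such as a crowd of people), the segmentation uses the RLE format.
In addition, every object (whether iscrowd=0 or iscrowd=1) has a rectangular bounding box, bbox. It is given as the array [x,y,width,height], where x and y are the coordinates of the top-left corner of the box, so the first element is the horizontal coordinate of the top-left corner.
area is the area of the encoded mask.
Finally, the category_id field of the annotation stores the id of the category the object belongs to. The name of that category and the name of its supercategory are stored in the categories field.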
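The RLE case can be sketched in pure Python. This assumes the uncompressed RLE layout: a dict with "size": [height, width] and a "counts" list of run lengths that alternates between runs of 0s and 1s, starts with a run of 0s, and walks the mask column by column. The tiny mask and the function name decode_uncompressed_rle are invented for illustration; real files are usually decoded with pycocotools instead.

```python
def decode_uncompressed_rle(rle):
    """Expand an uncompressed RLE dict into a list of rows of 0/1 values."""
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with 0
    for run_length in rle["counts"]:
        flat.extend([value] * run_length)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not cover the whole mask")
    # Pixels are stored column by column, so (row, col) sits at col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


# Hypothetical 2x3 mask: one 0, two 1s, three 0s in column-major order.
toy_rle = {"size": [2, 3], "counts": [1, 2, 3]}
mask = decode_uncompressed_rle(toy_rle)

# Counting the 1s gives the mask area in pixels.
mask_area = sum(sum(row) for row in mask)
print(mask, mask_area)
```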
Below is one annotation taken from instances_val2017.json:
{
"segmentation":[[510.66,423.01,511.72,420.03,510.45,416.0,510.34,413.02,510.77,410.26,
510.77,407.5,510.34,405.16,511.51,402.83,511.41,400.49,510.24,398.16,509.39,
397.31,504.61,399.22,502.17,399.64,500.89,401.66,500.47,402.08,499.09,401.87,
495.79,401.98,490.59,401.77,488.79,401.77,485.39,398.58,483.9,397.31,481.56,
396.35,478.48,395.93,476.68,396.03,475.4,396.77,473.92,398.79,473.28,399.96,
473.49,401.87,474.56,403.47,473.07,405.59,473.39,407.71,476.68,409.41,479.23,
409.73,481.56,410.69,480.4,411.85,481.35,414.93,479.86,418.65,477.32,420.03,
476.04,422.58,479.02,422.58,480.29,423.01,483.79,419.93,486.66,416.21,490.06,
415.57,492.18,416.85,491.65,420.24,492.82,422.9,493.56,424.39,496.43,424.6,
498.02,423.01,498.13,421.31,497.07,420.03,497.07,415.15,496.33,414.51,501.1,
411.96,502.06,411.32,503.02,415.04,503.33,418.12,501.1,420.24,498.98,421.63,
500.47,424.39,505.03,423.32,506.2,421.31,507.69,419.5,506.31,423.32,510.03,
423.01,510.45,423.01]],
"area":702.1057499999998,
"iscrowd":0,
"image_id":289343,
"bbox":[473.07,395.93,38.65,28.67],
"category_id":18,
"id":1768
},
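To make the polygon and bbox fields concrete, here is a small sketch. It assumes each polygon is a flat list [x1,y1,x2,y2,...], derives a bounding box from it, computes its area with the shoelace formula, and converts a [x,y,width,height] box to corner form. The rectangle toy_polygon and all function names are hypothetical; only the bbox passed to xywh_to_xyxy comes from the example annotation above.

```python
def polygon_points(flat):
    """Turn [x1, y1, x2, y2, ...] into [(x1, y1), (x2, y2), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


def polygon_bbox(flat):
    """Smallest [x, y, width, height] box that contains the polygon."""
    xs, ys = flat[0::2], flat[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


def polygon_area(flat):
    """Area of a simple polygon via the shoelace formula."""
    pts = polygon_points(flat)
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def xywh_to_xyxy(bbox):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


# Hypothetical 4x3 rectangle, invented for illustration.
toy_polygon = [10.0, 20.0, 14.0, 20.0, 14.0, 23.0, 10.0, 23.0]
toy_bbox = polygon_bbox(toy_polygon)
toy_area = polygon_area(toy_polygon)

# The bbox of the example annotation above.
example_corners = xywh_to_xyxy([473.07, 395.93, 38.65, 28.67])
print(toy_bbox, toy_area, example_corners)
```

For the example annotation, the far corner comes out at roughly (511.72, 424.6), and both values also appear among the polygon coordinates in the example.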
3. categories field
Two category instances taken from instances_val2017.json are shown below:
{
"supercategory":"person",
"id":1,
"name":"person"
},
{
"supercategory":"vehicle",
"id":2,
"name":"bicycle"
},
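A common use of the categories field is to translate category_id values into names. The sketch below builds that lookup from the two category entries above. The three annotations are hypothetical and carry only the fields needed here.

```python
from collections import Counter

# The two category entries shown above.
categories = [
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
]

# Hypothetical annotations, invented for illustration.
annotations = [
    {"id": 100, "category_id": 2},
    {"id": 101, "category_id": 1},
    {"id": 102, "category_id": 2},
]

# Lookup tables keyed by category id.
name_by_id = {c["id"]: c["name"] for c in categories}
supercategory_by_id = {c["id"]: c["supercategory"] for c in categories}

# Count how many annotations fall into each named category.
counts = Counter(name_by_id[a["category_id"]] for a in annotations)
print(counts)
```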
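Annotation format for Object Keypoint
1. annotations field
It contains all the fields of the annotation structure in Object Instance, plus 2 additional fields.
The new keypoints field is an array of length 3*k, where k is the total number of keypoints defined for the category. Each keypoint takes 3 consecutive elements: the first and second are the x and y coordinates, and the third is a visibility flag v. v=0 means the keypoint is not labeled (in which case x=y=v=0), v=1 means the keypoint is labeled but not visible (occluded), and v=2 means the keypoint is labeled and visible.
num_keypoints is the number of labeled keypoints (v>0) on the object. Small objects may have no labeled keypoints at all.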
{
"segmentation":[[125.12,539.69,140.94,522.43,100.67,496.54,84.85,469.21,73.35,450.52,104.99,342.65,168.27,290.88,179.78,288,189.84,286.56,191.28,260.67,202.79,240.54,221.48,237.66,248.81,243.42,257.44,256.36,253.12,262.11,253.12,275.06,299.15,233.35,329.35,207.46,355.24,206.02,363.87,206.02,365.3,210.34,373.93,221.84,363.87,226.16,363.87,237.66,350.92,237.66,332.22,234.79,314.97,249.17,271.82,313.89,253.12,326.83,227.24,352.72,214.29,357.03,212.85,372.85,208.54,395.87,228.67,414.56,245.93,421.75,266.07,424.63,276.13,437.57,266.07,450.52,284.76,464.9,286.2,479.28,291.96,489.35,310.65,512.36,284.76,549.75,244.49,522.43,215.73,546.88,199.91,558.38,204.22,565.57,189.84,568.45,184.09,575.64,172.58,578.52,145.26,567.01,117.93,551.19,133.75,532.49]],
"num_keypoints":10,
"area":47803.27955,
"iscrowd":0,
"keypoints":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,142,309,1,177,320,2,191,398,2,237,317,2,233,426,2,306,233,2,92,452,2,123,468,2,0,0,0,251,469,2,0,0,0,162,551,2],
"image_id":425226,"bbox":[73.35,206.02,300.58,372.5],"category_id":1,
"id":183126
},
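The relationship between keypoints and num_keypoints can be checked with a few lines of Python. The sketch below splits the flat array from the example above into (x, y, v) triples and counts the labeled ones. The helper name split_keypoints is made up for this illustration.

```python
# keypoints and num_keypoints from the example annotation above.
keypoints = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             142, 309, 1, 177, 320, 2, 191, 398, 2, 237, 317, 2,
             233, 426, 2, 306, 233, 2, 92, 452, 2, 123, 468, 2,
             0, 0, 0, 251, 469, 2, 0, 0, 0, 162, 551, 2]
num_keypoints = 10


def split_keypoints(flat):
    """Group a flat [x, y, v, x, y, v, ...] list into (x, y, v) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


triples = split_keypoints(keypoints)
labeled = [t for t in triples if t[2] > 0]    # v = 1 or v = 2
visible = [t for t in triples if t[2] == 2]   # labeled and visible

print(len(triples), len(labeled), len(visible))
```

Here the array yields 17 triples, 10 of which have v>0, which agrees with the num_keypoints value stored in the annotation.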
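2. categories field
Compared with the category in Object Instance, each category structure has 2 additional fields: keypoints is an array of length k holding the name of each keypoint, and skeleton defines the connectivity between keypoints (for example, a person's left wrist and left elbow are connected, but the left wrist and right wrist are not). At present, COCO keypoints are annotated only for the person category.
One category instance taken from person_keypoints_val2017.json is shown below: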
{
"supercategory":"person",
"id":1,
"name":"person",
"keypoints":["nose","left_eye","right_eye","left_ear","right_ear","left_shoulder","right_shoulder","left_elbow","right_elbow","left_wrist","right_wrist","left_hip", "right_hip","left_knee","right_knee","left_ankle","right_ankle"],
"skeleton":[[16,14],[14,12],[17,15],[15,13],[12,13],[6,12],[7,13],[6,7],[6,8],[7,9],[8,10],[9,11],[2,3],[1,2],[1,3],[2,4],[3,5],[4,6],[5,7]]
}
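The skeleton pairs are easier to read once they are mapped to keypoint names. The sketch below does that for the category above, on the assumption that skeleton indices are 1-based positions in the keypoints list. The function name skeleton_edges is hypothetical.

```python
# keypoints and skeleton from the person category above.
keypoint_names = ["nose", "left_eye", "right_eye", "left_ear", "right_ear",
                  "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
                  "left_wrist", "right_wrist", "left_hip", "right_hip",
                  "left_knee", "right_knee", "left_ankle", "right_ankle"]
skeleton = [[16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13],
            [6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3],
            [2, 4], [3, 5], [4, 6], [5, 7]]


def skeleton_edges(names, pairs):
    """Map 1-based index pairs to pairs of keypoint names (assumed convention)."""
    return [(names[a - 1], names[b - 1]) for a, b in pairs]


edges = skeleton_edges(keypoint_names, skeleton)
for start, end in edges:
    print(start, "-", end)
```

Under that assumption the first pair [16,14] reads as left_ankle to left_knee, and [8,10] links left_elbow to left_wrist, which matches the wrist and elbow example in the text.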
Download links for the COCO dataset images and annotations:
Images
Annotations

Published: 2024-09-22 12:40:02
