Hi guys, I am working in Colab and would like to learn how to register my custom dataset in detectron2.
https://rosenfelder.ai/Instance_Image_Segmentation_for_Window_and_Building_Detection_with_detectron2/#prepare-the-data
Inputs
I used via.html to create the annotations and saved them in two JSON files (train and val).
Each image has street and hole labels.
The images are located in Colab in train and val folders; inside each folder there are the images and a JSON file called
I found a function in a beginner's tutorial that converts them into a format that is usable by detectron2.
As output, I expect that function to return the image annotations in a format that is usable by detectron2.
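To make the expected output concrete, here is a minimal sketch of how one VIA polygon region maps to a detectron2-style annotation. The sample region, its coordinates, and the `region_to_annotation` helper are made up for illustration; the dictionary keys and the class ids mirror the function in this post. It uses an exact class lookup instead of the substring check in that function, and a plain string in place of `BoxMode.XYXY_ABS` so that it runs without detectron2 installed.

```python
# Hypothetical VIA-style region (coordinates invented for illustration).
region = {
    "shape_attributes": {
        "all_points_x": [10, 50, 50, 10],
        "all_points_y": [20, 20, 60, 60],
    },
    "region_attributes": {"class": "hole"},
}

# Class-to-id mapping taken from the function in this post.
CATEGORY_IDS = {"hole": 0, "street": 1}


def region_to_annotation(region):
    """Convert one VIA polygon region into a detectron2-style annotation dict."""
    px = region["shape_attributes"]["all_points_x"]
    py = region["shape_attributes"]["all_points_y"]
    # Flatten [(x0, y0), (x1, y1), ...] into [x0, y0, x1, y1, ...],
    # adding 0.5 to every coordinate as the function in this post does.
    poly = [coord + 0.5 for point in zip(px, py) for coord in point]
    return {
        # Absolute [x_min, y_min, x_max, y_max] box around the polygon.
        "bbox": [min(px), min(py), max(px), max(py)],
        # Placeholder: with detectron2 installed this would be BoxMode.XYXY_ABS.
        "bbox_mode": "XYXY_ABS",
        "segmentation": [poly],
        "category_id": CATEGORY_IDS[region["region_attributes"]["class"]],
        "iscrowd": 0,
    }


annotation = region_to_annotation(region)
print(annotation["bbox"])          # [10, 20, 50, 60]
print(annotation["segmentation"])  # [[10.5, 20.5, 50.5, 20.5, 50.5, 60.5, 10.5, 60.5]]
```

Each image record then carries `file_name`, `image_id`, `height`, `width`, and a list of such annotations under `annotations`.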
Code implemented
import json
import os

import cv2
import numpy as np
from detectron2.structures import BoxMode


def get_street_dicts(img_dir):
    """This function loads the JSON file created with the annotator and converts it to
    the detectron2 metadata specifications.
    """
    # load the JSON file
    json_file = os.path.join(img_dir, "via_region_data.json")
    with open(json_file) as f:
        imgs_anns = json.load(f)

    dataset_dicts = []
    # loop through the entries in the JSON file
    for idx, v in enumerate(imgs_anns.values()):
        record = {}
        # add file_name, image_id, height and width information to the records
        filename = os.path.join(img_dir, v["filename"])
        height, width = cv2.imread(filename).shape[:2]

        record["file_name"] = filename
        record["image_id"] = idx
        record["height"] = height
        record["width"] = width

        annos = v["regions"]
        objs = []
        # one image can have multiple annotations, therefore this loop is needed
        for annotation in annos:
            # reformat the polygon information to fit the specifications
            anno = annotation["shape_attributes"]
            px = anno["all_points_x"]
            py = anno["all_points_y"]
            poly = [(x + 0.5, y + 0.5) for x, y in zip(px, py)]
            poly = [p for x in poly for p in x]

            region_attributes = annotation["region_attributes"]["class"]
            # specify the category_id to match with the class.
            if "street" in region_attributes:
                category_id = 1
            elif "hole" in region_attributes:
                category_id = 0
            else:
                # skip regions with an unknown class instead of reusing a stale id
                continue

            obj = {
                "bbox": [np.min(px), np.min(py), np.max(px), np.max(py)],
                "bbox_mode": BoxMode.XYXY_ABS,
                "segmentation": [poly],
                "category_id": category_id,
                "iscrowd": 0,
            }
            objs.append(obj)
        record["annotations"] = objs
        dataset_dicts.append(record)
    return dataset_dicts
from detectron2.data import DatasetCatalog, MetadataCatalog

for d in ["train", "val"]:
    DatasetCatalog.register("streets_" + d, lambda d=d: get_street_dicts("/content/potholes/" + d))
street_metadata = MetadataCatalog.get("streets_train")
dataset_dicts = get_street_dicts("/content/potholes/train")
AssertionError
AssertionError Traceback (most recent call last)
in ()
59 from detectron2.data import DatasetCatalog, MetadataCatalog
60 for d in ["train", "val"]:
---> 61 DatasetCatalog.register("streets_" + d,lambda d=d: get_street_dicts("/content/potholes/", d))
62 street_metadata = MetadataCatalog.get("streets_train")
63 dataset_dicts = get_street_dicts("/content/potholes/train")
/content/detectron2_repo/detectron2/data/catalog.py in register(name, func)
38 assert callable(func), "You must register a function with DatasetCatalog.register!"
39 assert name not in DatasetCatalog._REGISTERED, "Dataset '{}' is already registered!".format(
---> 40 name
41 )
42 DatasetCatalog._REGISTERED[name] = func
AssertionError: Dataset 'streets_train' is already registered!
Thanks for checking it.
As the error says, the dataset is already registered. Registering the same name again is expected to cause this error.
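One way to avoid the assertion when a notebook cell is run more than once is to check the name before registering. The sketch below is only an illustration: `StandInCatalog` is a simplified stand-in that mimics the assertions shown in the traceback (it is not detectron2's class), and `register_once` is a hypothetical helper. It assumes the catalog object offers `register(name, func)` and `list()`.

```python
class StandInCatalog:
    """Tiny stand-in that mimics the assertions shown in the traceback.

    It is NOT detectron2's DatasetCatalog; it only exists so that this
    sketch runs without detectron2 installed.
    """

    def __init__(self):
        self._registered = {}

    def register(self, name, func):
        assert callable(func), "You must register a function!"
        assert name not in self._registered, f"Dataset '{name}' is already registered!"
        self._registered[name] = func

    def list(self):
        return sorted(self._registered)


def register_once(catalog, name, loader):
    """Register `loader` under `name` only if the name is still free."""
    if name in catalog.list():
        return False  # already registered, e.g. the cell was run before
    catalog.register(name, loader)
    return True


catalog = StandInCatalog()
results = []
for run in range(2):  # simulate running the same notebook cell twice
    for split in ["train", "val"]:
        # The empty-list loader is a placeholder for get_street_dicts(...).
        added = register_once(catalog, "streets_" + split, lambda split=split: [])
        results.append((run, split, added))

print(results)
```

In the notebook, `catalog` would be `DatasetCatalog` from `detectron2.data` and the loader would be the `lambda` that calls `get_street_dicts`. The first pass registers both splits and the second pass skips them instead of raising.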
How do you unregister a dataset? I'm creating a huge number of datasets because I can't unregister them.
Even better: how do I attach a new create_dataset_dicts function to an already registered dataset?
@RishiMalhotra920
I am facing the same issue, and no documentation covers it. Did you manage to find out how to unregister a dataset? Thanks.
@ghazni123 @RishiMalhotra920 I was having the same issue; after reading the code I found there is a clear method that you can use to _unregister_:
DatasetCatalog.clear()
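Note that `clear()` drops every registered dataset, not just one. To swap the loader for a single name, a rough sketch follows. `StandInCatalog` is a simplified stand-in (not the real detectron2 class) so the example runs anywhere, and `replace_loader` is a hypothetical helper. It assumes the catalog has a `remove(name)` method; newer detectron2 releases appear to provide one, but the version in the traceback above may not, so check your installed version.

```python
class StandInCatalog:
    """Simplified stand-in for DatasetCatalog (not the real class)."""

    def __init__(self):
        self._registered = {}

    def register(self, name, func):
        assert name not in self._registered, f"Dataset '{name}' is already registered!"
        self._registered[name] = func

    def get(self, name):
        # Call the registered loader and return its result.
        return self._registered[name]()

    def list(self):
        return sorted(self._registered)

    def remove(self, name):
        del self._registered[name]

    def clear(self):
        self._registered.clear()


def replace_loader(catalog, name, loader):
    """Swap the loader registered under `name`, leaving other datasets alone."""
    if name in catalog.list():
        catalog.remove(name)
    catalog.register(name, loader)


catalog = StandInCatalog()
catalog.register("streets_train", lambda: ["old"])
catalog.register("streets_val", lambda: [])

replace_loader(catalog, "streets_train", lambda: ["new"])
names_after_replace = catalog.list()
train_after_replace = catalog.get("streets_train")
print(names_after_replace)   # ['streets_train', 'streets_val']
print(train_after_replace)   # ['new']

catalog.clear()              # drops every dataset, not only streets_train
names_after_clear = catalog.list()
print(names_after_clear)     # []
```

If `remove` is not available, calling `DatasetCatalog.clear()` and then registering all splits again achieves the same result at the cost of re-registering everything.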