Carla: 2D bounding boxes information, for agents available in the scene

Created on 8 Aug 2018 · 19 comments · Source: carla-simulator/carla

Hello,
I would like to get the 2D bounding box coordinates for the agents available in the front view of my scene, not the information of all non-player agents in the simulation.

Is there a way to do so ?

Labels: duplicate, python support

All 19 comments

Hi @mhusseinsh , you can try to use depth map to filter out all occluded boxes:
#314 comment

@wzhAptiv but I believe even the info provided by the measurements is not the 4 corner coordinates, right?

@mhusseinsh you can convert the 8 vertices 3D bounding box coordinate into camera space, and perform a projection to the 2D location. #314 provides the method to do that.
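The world-to-image projection described above can be sketched as follows. This is a minimal sketch with NumPy; the function and argument names are my own, not CARLA API:

```python
import numpy as np

def project_to_image(world_point, world_to_camera, K):
    """Project a 3D world-space point into 2D pixel coordinates.

    world_point:     (3,) point in world coordinates.
    world_to_camera: (4, 4) matrix mapping world space to camera space
                     (the inverse of the camera's world transform, i.e. inv(Rt)).
    K:               (3, 3) camera intrinsic matrix.
    Returns (u, v, depth); the point lies in front of the camera
    only when depth > 0.
    """
    p = np.append(world_point, 1.0)   # homogeneous coordinates
    cam = world_to_camera @ p         # world -> camera space
    uvw = K @ cam[:3]                 # camera space -> image plane
    return uvw[0] / uvw[2], uvw[1] / uvw[2], uvw[2]
```

With the 400x300, 90-degree-FOV camera used later in this thread, K has focal length 400 / (2 * tan(45°)) = 200 and principal point (200, 150).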

@wzhAptiv the method described there targets version 0.8.1; I am working with 0.8.4.

Do you have any hints, or a snippet of client example code, for extracting just the 2D BB coordinates of agents in the scene?

@mhusseinsh once you see how to plot/project a point in world coordinates onto your camera, following #314 comment, you can use the agent measurement non-player-agents-info to calculate the bounding box in world coordinates, and finally project it onto your camera.

Hi @mhusseinsh.

As @wzhAptiv says, this is an already answered question. I'm closing the issue.
Thanks for your contribution!

@wzhAptiv I followed the instructions in #314 and also the client example, and I still can't get it to work.

Here is what I have written:

import random
import colorsys

def point_in_canvas(pos):
    """Return True if the (row, col) point lies inside the 300x400 canvas."""
    return (pos[0] >= 0) and (pos[0] < 300) and (pos[1] >= 0) and (pos[1] < 400)

def draw_rect(array, pos, size, color=(255, 0, 255)):
    """Draw a filled size x size rectangle centered at pos on the image array."""
    point_0 = (pos[0] - size / 2, pos[1] - size / 2)
    point_1 = (pos[0] + size / 2, pos[1] + size / 2)
    if point_in_canvas(point_0) and point_in_canvas(point_1):
        for i in range(size):
            for j in range(size):
                array[int(point_0[0] + i), int(point_0[1] + j)] = color

def rand_color(seed):
    """Return a deterministic pseudo-random color based on a seed."""
    random.seed(seed)
    col = colorsys.hls_to_rgb(random.random(), random.uniform(.2, .8), 1.0)
    return (int(col[0] * 255), int(col[1] * 255), int(col[2] * 255))

import math
import numpy as np
from numpy.linalg import inv

for frame in range(0, frames_per_episode):

    # Read the data produced by the server this frame.
    measurements, sensor_data = client.read_data()

    # Print some of the measurements.
    print_measurements(measurements)
    if frame > 19:
        # Save the images to disk if requested.
        if args.save_images_to_disk:
            for name, measurement in sensor_data.items():
                filename = args.out_filename_format.format(episode, name, episode, frame - 20)
                measurement.save_to_disk(filename)
                # Bounding boxes
                camera_to_car_transform = camera0.get_unreal_transform()
                # (Intrinsic) (3, 3) K matrix
                K = np.identity(3)
                K[0, 2] = 400 / 2.0
                K[1, 2] = 300 / 2.0
                K[0, 0] = K[1, 1] = 400 / (2.0 * math.tan(90.0 * math.pi / 360.0))

                # (Camera) local 3D to world 3D
                # Get the transform from the player protobuf transformation
                world_transform = Transform(
                    measurements.player_measurements.transform
                )
                # Compute the final transformation matrix
                Rt = world_transform * camera_to_car_transform
                # Get the (4, 4) NumPy matrix itself
                Rt = Rt.matrix
                main_image = sensor_data.get('CameraRGB', None)

                if main_image is not None:
                    array = image_converter.to_rgb_array(main_image)
                    array.setflags(write=1)

                for agent in measurements.non_player_agents:
                    if agent.HasField('vehicle'):  # if you want only cars
                        pos = agent.vehicle.transform.location
                        pos_vector = np.array([[pos.x], [pos.y], [pos.z], [1.0]])

                        transformed_3d_pos = np.dot(inv(Rt), pos_vector)
                        pos2d = np.dot(K, transformed_3d_pos[:3])

                        # Normalize the point
                        pos2d = np.array([
                            pos2d[0] / pos2d[2],
                            pos2d[1] / pos2d[2],
                            pos2d[2]])

                        # Now pos2d holds the [x, y, d] pixel values (d is the depth).
                        # Positive depth means the point is in front of the camera.
                        if pos2d[2] > 0:
                            x_2d = 400 - pos2d[0]
                            y_2d = 300 - pos2d[1]
                            # Draw something on the image
                            draw_rect(array, (y_2d, x_2d), 10, rand_color(agent.id))

Any help with my problem would be really appreciated. I have tried a lot and still failed to get the bounding box information.

Hi @mhusseinsh, sorry, I don't have time to dig into your code, so I will first assume you can draw a rectangle on the window given the rectangle's location.

According to your code, you are drawing only a 10x10 rectangle at the projected center point of each vehicle.
If this cannot be seen, you should first check whether your plotting code has a bug in it.

After that, I think your question is how to get the bounding box information. In the page non-player-agents-info, you have vehicle.bounding_box.transform and vehicle.bounding_box.extent, with which you can calculate the vertex coordinates of the vehicle's 3D bounding box in the world coordinate system.

Then you can use your center-point projection code to get the 8 projected vertices of each vehicle's 3D bounding box in your image. The 2D bounding box size and location can be decided based on these 8 projected points.

Please be aware that there's no straightforward way to get the 4 projected bounding box corners on the image directly from any measurement CARLA provides. You have to calculate them from the information in the measurements, but all the required information is already given.
Also, there's quite a lot of coordinate transformation and projection involved, so please make sure you are doing it correctly.
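The steps above could be sketched as follows. This is a minimal sketch, not CARLA API: it assumes you have precomputed a 4x4 box-to-world matrix from the vehicle and bounding-box transforms, and that the corners are then projected with the center-point code to get 2D points:

```python
import numpy as np

def bbox_vertices_world(bbox_to_world, extent):
    """Return the 8 corners of a 3D bounding box in world coordinates.

    bbox_to_world: (4, 4) matrix combining the vehicle transform with
                   vehicle.bounding_box.transform (assumed precomputed).
    extent:        (ex, ey, ez) half-sizes, as in vehicle.bounding_box.extent.
    """
    ex, ey, ez = extent
    corners = np.array([[sx * ex, sy * ey, sz * ez, 1.0]
                        for sx in (-1, 1)
                        for sy in (-1, 1)
                        for sz in (-1, 1)])        # (8, 4) in box space
    return (bbox_to_world @ corners.T).T[:, :3]    # (8, 3) in world space

def bbox_2d_from_points(points_2d):
    """Axis-aligned 2D box (xmin, ymin, xmax, ymax) enclosing the projected corners."""
    pts = np.asarray(points_2d)
    return pts[:, 0].min(), pts[:, 1].min(), pts[:, 0].max(), pts[:, 1].max()
```

Taking the min/max over the 8 projected corners is what turns the 3D box into the 4 values of a 2D box; any corner behind the camera (non-positive depth) should be discarded first.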

@wzhAptiv Actually I don't need to do all of this, and I don't even want to draw the boxes on the images.
All I care about is the 4 corner points of the 2D bounding boxes of the non-player agents in the scene.

Since I am not very experienced with extrinsic and intrinsic matrices and projections, I don't know how to do the calculations or how to get these 4 coordinates (which seem very easy, but I can't achieve it). That's why I am looking into older issues and trying to copy-paste what was done there, but again, no progress.

@mhusseinsh I think you have to do all of that to get what you want. Also, learning how to do it matters, because you will eventually have to do the filtering yourself. If you know nothing, and don't want to learn anything, about coordinate system conversion and projection, you won't be able to do the filtering at all. And I think that filtering is the objective of this issue, as you first mentioned.

As far as I can see, the current CARLA project has no official support for the filtering part, so you have to do it on your own. #425 can be somewhat helpful, but it ultimately points back to #314.

Hey @mhusseinsh @wzhAptiv.

Sadly, we have no time to give that kind of support right now. We are currently a small team of developers, so we are focusing our work on improving (and fixing) CARLA. We try to provide everything you would otherwise struggle to find on the Internet, but the rest is up to you and your research :)
Luckily for our community, there are more experienced users with good advice (like @wzhAptiv) who are helping a lot; try to listen to them.

We provide well-documented Python code that you can read and analyze to understand how these transformations are made, as well as a few useful functions; don't be afraid to search inside the CARLA module.

I recommend these excellent blog posts, which I hope help you as they helped me in the past:

Best of luck!

@wzhAptiv how exactly can I use vehicle.bounding_box.transform and vehicle.bounding_box.extent to get the eight vertex coordinates in the world coordinate system? I use vehicle.transform.location plus vehicle.bounding_box.extent to perform the following calculation:

pos = vehicle.transform.location
bbox3d = vehicle.bounding_box.extent
bounding_box_3d = np.array([
    [[pos.x + bbox3d.x], [pos.y - bbox3d.y], [pos.z], [1.0]],
    [[pos.x + bbox3d.x], [pos.y + bbox3d.y], [pos.z], [1.0]],
    [[pos.x - bbox3d.x], [pos.y + bbox3d.y], [pos.z], [1.0]],
    [[pos.x - bbox3d.x], [pos.y - bbox3d.y], [pos.z], [1.0]],
    [[pos.x + bbox3d.x], [pos.y - bbox3d.y], [pos.z + 2 * bbox3d.z], [1.0]],
    [[pos.x + bbox3d.x], [pos.y + bbox3d.y], [pos.z + 2 * bbox3d.z], [1.0]],
    [[pos.x - bbox3d.x], [pos.y + bbox3d.y], [pos.z + 2 * bbox3d.z], [1.0]],
    [[pos.x - bbox3d.x], [pos.y - bbox3d.y], [pos.z + 2 * bbox3d.z], [1.0]],
])

But it doesn't seem to give me correct coordinates.

Hi @danxuhk, I think you are not considering the rotation of the bounding box. Please take that into account in the calculation, namely vehicle.bounding_box.transform.rotation. Also, don't forget to check vehicle.bounding_box.transform.location, even though the offset relative to the vehicle location may be all zeros.
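A sketch of building that rotation from vehicle.bounding_box.transform.rotation: the yaw/pitch/roll composition order below is an assumption about CARLA 0.8.x's Unreal-style rotations, so verify it against the client's Transform code before relying on it:

```python
import math
import numpy as np

def rotation_matrix(pitch, yaw, roll):
    """Rotation matrix from pitch/yaw/roll in degrees.

    Assumed Unreal-style composition (yaw about Z, then pitch about Y,
    then roll about X); check the order and axis signs against the
    CARLA client's own Transform implementation.
    """
    cp, sp = math.cos(math.radians(pitch)), math.sin(math.radians(pitch))
    cy, sy = math.cos(math.radians(yaw)), math.sin(math.radians(yaw))
    cr, sr = math.cos(math.radians(roll)), math.sin(math.radians(roll))
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])   # yaw
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])   # pitch
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])   # roll
    return Rz @ Ry @ Rx
```

Each corner offset (±ex, ±ey, ±ez) would then be rotated by this matrix before being added to the box center, instead of adding the raw extents as in the snippet above.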

Hi @wzhAptiv, do you mean there is a relative offset between the center of the player (i.e. vehicle.transform.location) and the center of the bounding box?

@danxuhk Possibly. I forget whether the center point of the player is on the chassis or at the geometric center. In the former case, there can be a relative offset.

@wzhAptiv so do I need to apply a matrix transformation to the bounding box extents or to vehicle.transform.location? And vehicle.bounding_box.transform is relative to which one?

@wzhAptiv I tried both, but they don't work for me. Could you give me a simple piece of code to get the world coordinates of one of the eight vertices from vehicle.bounding_box.transform and vehicle.bounding_box? Thanks.

Solved it after restarting the CARLA server...

@danxuhk What worked for you at the end?
