3D Point Cloud

Point Cloud is a type of workspace output often used for LIDAR data. Each point has three dimensions (x, y, and z coordinates), as well as an optional intensity value. Much like a video, point cloud data is captured over time, and associates scrub through the asset while annotating objects. The Point Cloud output is often used for autonomous driving use cases.

Project input data

'cloud' files - Required: .pcd files

These are the point cloud files generated by a LIDAR sensor. You may provide one or more files of point cloud data; multiple files are displayed as a series of frames on a single task (a “video” of 3D data).
If the files are sequential, that is, should be viewed in a certain order, ensure they have an alphanumeric naming convention that is easily sortable to the correct order in the folder (e.g. 001.pcd, 002.pcd, 003.pcd, 004.pcd, etc.).
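One way to confirm that a set of frame names sorts into the intended order is to compare a plain alphanumeric sort with a numeric sort. The following is a minimal sketch; the helper names `frame_number` and `sorts_in_frame_order` are hypothetical, and it assumes each file name contains a single frame number.

```python
import re

def frame_number(name: str) -> int:
    """Return the first run of digits in a file name as an integer."""
    match = re.search(r"\d+", name)
    if match is None:
        raise ValueError(f"no frame number found in {name!r}")
    return int(match.group())

def sorts_in_frame_order(names: list[str]) -> bool:
    """True if a plain alphanumeric sort matches the numeric frame order."""
    return sorted(names) == sorted(names, key=frame_number)

# Zero-padded names sort correctly; unpadded names do not.
padded = ["001.pcd", "002.pcd", "010.pcd", "100.pcd"]
unpadded = ["1.pcd", "2.pcd", "10.pcd", "100.pcd"]
padded_ok = sorts_in_frame_order(padded)
unpadded_ok = sorts_in_frame_order(unpadded)
```

Zero-padding every frame number to the same width is usually enough to make the alphanumeric order match the capture order.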

'ground' files - Optional: -ground.pcd files
- The point cloud of ground points (the “floor” in the 3D data), if it has already been filtered and extracted.
- If you have not filtered the ground points in the 3D data, the Sama Platform will automatically estimate and filter the ground while ingesting the data. The Sama Platform allows agents to show/hide the floor, which can aid in precise annotation.
- Ground files must contain “-ground” in the file name. It is recommended that each -ground file's name matches that of its 3D file (e.g., 001.pcd, 001-ground.pcd, etc.) so they are properly ordered. They can be placed in the same folder as the cloud files, or in separate cloud and ground folders (e.g., cloud/001.pcd, cloud/002.pcd, and ground/001-ground.pcd, ground/002-ground.pcd).

'video/image' files - Optional: .mp4 files or .jpeg/.png files

2D reference videos or image frames captured by multiple cameras that correspond to the 3D point cloud data. They are typically for display and reference only.
If providing image frames and the files are sequential, that is, should be viewed in a certain order, ensure they have an alphanumeric naming convention that is easily sortable to the correct order in the folder (e.g. 001.jpeg, 002.jpeg, 010.jpeg, 100.jpeg, etc.).
The frame count of the 2D reference videos/images should match that of the cloud files, and each frame should be synced to the nearest point cloud timestamp.
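If the camera records at a higher rate than the LIDAR sensor, one image per point cloud frame has to be selected before upload. The sketch below picks the image closest in time to each cloud frame. The timestamps and the function names are hypothetical; this page does not specify how timestamps are stored, so adapt the inputs to your own data.

```python
from bisect import bisect_left

def nearest_index(sorted_times: list[float], target: float) -> int:
    """Index of the value in sorted_times closest to target."""
    if not sorted_times:
        raise ValueError("sorted_times must not be empty")
    pos = bisect_left(sorted_times, target)
    if pos == 0:
        return 0
    if pos == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[pos - 1], sorted_times[pos]
    # Prefer the earlier image on an exact tie.
    return pos - 1 if target - before <= after - target else pos

def match_images_to_clouds(cloud_times, image_times):
    """For each cloud timestamp, pick the index of the nearest image timestamp."""
    return [nearest_index(image_times, t) for t in cloud_times]

# Hypothetical timestamps in seconds: LIDAR at 10 Hz, camera at 30 Hz.
cloud_times = [0.00, 0.10, 0.20]
image_times = [0.00, 0.033, 0.066, 0.100, 0.133, 0.166, 0.200]
selected = match_images_to_clouds(cloud_times, image_times)
```

The result has exactly one image index per cloud frame, which keeps the two frame counts equal.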

Coordinate System

The Sama platform uses the right-handed Cartesian coordinate system, and the positive z direction must represent “up” in the physical world.

Sama uses the roll-pitch-yaw XYZ Euler angle convention where roll is around the x-axis, pitch is around the y-axis, and yaw is around the z-axis.

Fixed world

Fixed world refers to a 3D environment that is anchored to the world, rather than a 3D environment that is relative to a moving object like the ego vehicle. Annotating in a fixed-world environment can be much faster and result in better quality, especially with static objects, as they mostly stay in place as the scene progresses.

Whether the point cloud coordinate system origin is a fixed point on the ego vehicle or a fixed point in world coordinates, any pre-annotations, camera calibration, or odometry data must be given in that same coordinate system. Annotations will be delivered in the same coordinate system as the point cloud.

Point cloud data is typically delivered in a local coordinate system that uses the LIDAR sensor as the frame of reference, but fixed-world 3D environments require the point cloud data to be in the world coordinate system. When this setting is enabled, the platform will pre-process your point cloud data into the world coordinate system (you will also need to provide sensor location metadata in a separate input).

Sensor location metadata

The sensor location metadata describes the LIDAR sensor’s position and the direction it is pointing, and is required to enable fixed world. This is also known as the LIDAR extrinsic matrix. The metadata is given for each frame and contains the following values:

Position [x,y,z] - the position of the sensor relative to a world frame.
Rotation [rotation_x,rotation_y,rotation_z,rotation_w] - also known as the heading/orientation, which represents the orientation of the sensor relative to a world frame. Values must be expressed as a quaternion.

There are sensor location metadata files for each point cloud frame, and the total number of sensor location metadata files must match the number of point cloud frames/files. The accepted formats are CSV (.csv or .txt) or JSON (.json), and these files must be zipped together into a .zip file (unless the assets are on the sama-client-assets S3 bucket, in which case they can be in a folder).
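To illustrate how a per-frame position and quaternion relate sensor coordinates to world coordinates, the sketch below rotates a point by the quaternion and then adds the position. It assumes the metadata describes the sensor-to-world transform (p_world = R(q) * p_sensor + position); that convention and the function names are assumptions for illustration, not a description of the platform's internal processing.

```python
import math

def rotate(q, v):
    """Rotate vector v by unit quaternion q given as (x, y, z, w)."""
    qx, qy, qz, qw = q
    vx, vy, vz = v
    # t = 2 * cross(q_vec, v)
    tx = 2.0 * (qy * vz - qz * vy)
    ty = 2.0 * (qz * vx - qx * vz)
    tz = 2.0 * (qx * vy - qy * vx)
    # v' = v + w * t + cross(q_vec, t)
    return (
        vx + qw * tx + (qy * tz - qz * ty),
        vy + qw * ty + (qz * tx - qx * tz),
        vz + qw * tz + (qx * ty - qy * tx),
    )

def sensor_to_world(point, position, rotation):
    """Map a sensor-frame point to the world frame (assumed convention)."""
    norm = math.sqrt(sum(c * c for c in rotation))
    if norm == 0.0:
        raise ValueError("rotation quaternion must be non-zero")
    unit = tuple(c / norm for c in rotation)
    rotated = rotate(unit, point)
    return tuple(r + p for r, p in zip(rotated, position))

# Hypothetical pose: sensor at (10, 5, 2), turned 90 degrees about z.
position = (10.0, 5.0, 2.0)
rotation = (0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4))
world_point = sensor_to_world((1.0, 0.0, 0.0), position, rotation)
```

With this pose, a point one metre ahead of the sensor lands one metre along the world y-axis from the sensor position.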

Sensor calibration metadata files

‘sensor calibration metadata’ files - Optional: .csv or .json files

Sensor calibration metadata enables 3D to 2D projections on the reference video(s)/image(s). The metadata includes a video camera’s:

Intrinsic calibration values:

Focal Length in pixels, relative to the video file [f_x,f_y]
Principal Point in pixels, relative to the video file [c_x,c_y]
Distortion [k1,k2,k3,k4,p1,p2] (optional - upcoming support)

Extrinsic calibration values, expressed in the coordinate system of the corresponding point cloud file:

Position [x,y,z]
Rotation [rotation_x,rotation_y,rotation_z,rotation_w]

These are the rotation values required to orient a camera pointing along the positive x-axis of the lidar coordinate system (i.e. a front-facing camera) so that it will point in the direction of the camera in question. For example:

[0,0,0,1] for a perfectly aligned front-facing camera.
[0,0,-0.707,0.707] for a perfectly aligned right-facing camera.
[0,0,0.707,0.707] for a perfectly aligned left-facing camera.
[0,0,1,0] for a perfectly aligned rear-facing camera.

These are NOT rotation values of the camera coordinate system to the lidar coordinate system.
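The four example quaternions above are all pure rotations about the z-axis, so they can be reproduced from a yaw angle alone. The following sketch uses the standard formula for a rotation about z, (0, 0, sin(yaw/2), cos(yaw/2)); the function name `yaw_to_quaternion` is hypothetical. A real camera is rarely perfectly aligned, so its quaternion will generally include roll and pitch components as well.

```python
import math

def yaw_to_quaternion(yaw_degrees: float):
    """Quaternion (x, y, z, w) for a rotation of yaw_degrees about the z-axis."""
    half = math.radians(yaw_degrees) / 2.0
    return (0.0, 0.0, math.sin(half), math.cos(half))

# Right-handed, z-up, x-forward: positive yaw turns toward +y (left).
facings = {"front": 0.0, "left": 90.0, "right": -90.0, "rear": 180.0}
quaternions = {name: yaw_to_quaternion(deg) for name, deg in facings.items()}

# Round for display so the values can be compared with the list above.
rounded = {
    name: tuple(round(c, 3) for c in q) for name, q in quaternions.items()
}
```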

The metadata can be a single CSV/JSON file that is applied to every frame, or you can provide a separate file for each frame. For example:

x, y, z, rotation_x, rotation_y, rotation_z, rotation_w, k1, k2, k3, k4, p1, p2, f_x, f_y, c_x, c_y
1.53867, -0.02493, 2.11534, 0, 0, 0, 1, 0.03016, -0.27366, 0.00109, -0.00192, 0.01, 0.001, 3255.72, 3255.72, 1487.64, 1014.38

Note: distortion, focal length, and principal point metadata are needed only in the file for the first frame, as they remain static throughout the video.
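As a rough illustration of how the calibration columns fit together, the sketch below parses the example row above and arranges the focal length and principal point into the standard 3x3 pinhole intrinsic matrix. The function names are hypothetical, and the sketch does not apply distortion or reproduce how the platform performs its projections.

```python
import csv
import io

CALIBRATION_CSV = """\
x, y, z, rotation_x, rotation_y, rotation_z, rotation_w, k1, k2, k3, k4, p1, p2, f_x, f_y, c_x, c_y
1.53867, -0.02493, 2.11534, 0, 0, 0, 1, 0.03016, -0.27366, 0.00109, -0.00192, 0.01, 0.001, 3255.72, 3255.72, 1487.64, 1014.38
"""

def parse_calibration(text: str) -> dict[str, float]:
    """Parse a one-row calibration CSV into a {column: value} mapping."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    row = next(reader)
    return {key.strip(): float(value) for key, value in row.items()}

def intrinsic_matrix(cal: dict[str, float]) -> list[list[float]]:
    """Standard pinhole intrinsic matrix built from f_x, f_y, c_x, c_y."""
    return [
        [cal["f_x"], 0.0, cal["c_x"]],
        [0.0, cal["f_y"], cal["c_y"]],
        [0.0, 0.0, 1.0],
    ]

cal = parse_calibration(CALIBRATION_CSV)
K = intrinsic_matrix(cal)
extrinsic_position = (cal["x"], cal["y"], cal["z"])
```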

‘sensor location metadata’ files - Optional: .csv, .txt, or .json files

Sensor location metadata enables 3D environments that are fixed to the world. It lets annotators work in world coordinates so that static objects stay in place regardless of the motion of the ego vehicle. The metadata includes the LIDAR sensor’s:
- Position [x,y,z]
- Rotation [rotation_x, rotation_y, rotation_z, rotation_w]

The metadata is provided in a separate file for each frame, in CSV (.csv or .txt) or JSON (.json) format. For example:

CSV:
x, y, z, rotation_x, rotation_y, rotation_z, rotation_w
1.43567, -0.04453, 2.51554, 0.01515, 0.01535, -0.34024, 0.94454

JSON:
{ "x": 1.43567, "y": -0.04453, "z": 2.51554, "rotation_x": 0.01515, "rotation_y": 0.01535, "rotation_z": -0.34024, "rotation_w": 0.94454 }

Link sensor location metadata files to point cloud files by using the same base names. For example, 001.pcd: 001.json or 001.csv or 001.txt; 002.pcd: 002.json or 002.csv or 002.txt.
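Before uploading, it can help to check that every point cloud frame has a metadata file with the same base name. The following is a minimal sketch of such a check; the function name `unmatched_frames` is hypothetical, and ground files (names ending in "-ground") are skipped on the assumption that they do not need their own location metadata.

```python
from pathlib import PurePath

METADATA_SUFFIXES = {".json", ".csv", ".txt"}

def unmatched_frames(cloud_files, metadata_files):
    """Return cloud file names with no metadata file sharing their base name."""
    metadata_stems = {
        PurePath(name).stem
        for name in metadata_files
        if PurePath(name).suffix.lower() in METADATA_SUFFIXES
    }
    return [
        name
        for name in cloud_files
        # Assumption: "-ground" files are paired with a cloud file, not metadata.
        if not PurePath(name).stem.endswith("-ground")
        and PurePath(name).stem not in metadata_stems
    ]

cloud_files = ["001.pcd", "001-ground.pcd", "002.pcd", "003.pcd"]
metadata_files = ["001.json", "002.csv"]
missing = unmatched_frames(cloud_files, metadata_files)
```

An empty result means the number of metadata files covers every point cloud frame.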

Supported formats

Sama supports the following point cloud data formats from LIDAR and other 3D sensors, as supported by PDAL:

.bpf
.csd
.ept
.e57
.gdal
.geowave
.i3s
.ilvis2
.las
.matlab
.mbio
.mrsid
.nitf
.npy
.pcd (in text or binary uncompressed; binary compressed is not supported)
.ply
.pts
.qfit
.txt
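Because compressed binary .pcd files are not supported, it can be useful to check how a .pcd file is encoded before uploading it. A PCD file starts with a plain-text header whose DATA line is one of ascii, binary, or binary_compressed. The sketch below reads that line; the function name and the tiny sample file are hypothetical.

```python
import os
import tempfile

def pcd_data_encoding(path: str) -> str:
    """Return the DATA value from a PCD header (ascii, binary, binary_compressed)."""
    with open(path, "rb") as handle:
        for raw_line in handle:
            line = raw_line.decode("ascii", errors="replace").strip()
            if line.upper().startswith("DATA"):
                parts = line.split()
                if len(parts) < 2:
                    raise ValueError(f"malformed DATA line in {path}")
                return parts[1].lower()
    raise ValueError(f"no DATA line found in {path}")

# Demonstration with a tiny hand-written ASCII PCD file.
SAMPLE_PCD = (
    "VERSION .7\nFIELDS x y z\nSIZE 4 4 4\nTYPE F F F\nCOUNT 1 1 1\n"
    "WIDTH 1\nHEIGHT 1\nVIEWPOINT 0 0 0 1 0 0 0\nPOINTS 1\nDATA ascii\n"
    "0.0 0.0 0.0\n"
)
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.pcd")
    with open(sample_path, "w", encoding="ascii") as handle:
        handle.write(SAMPLE_PCD)
    encoding = pcd_data_encoding(sample_path)

# Text and uncompressed binary are the two encodings listed as supported.
supported = encoding in {"ascii", "binary"}
```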

📘
Additional types
If you require any other format, reach out to our team for further assistance.

Each data point must have (x,y,z) coordinates.

Point intensity values are also optionally supported (normalized to a 0-1 scale), encoded in an intensity dimension. Intensity values can also be normalized via project settings. Please share example files with your project manager early in the scoping process.
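If your sensor reports intensity on another scale, the values can be rescaled to 0-1 before upload. The sketch below uses min-max scaling, which is only one common choice; this page does not state which normalization the project setting applies, so treat the approach and the function name as assumptions.

```python
def normalize_intensity(values):
    """Min-max scale intensity values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All points share one intensity; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]

raw = [0, 64, 128, 255]  # hypothetical 8-bit intensities
scaled = normalize_intensity(raw)
```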

The following dropdown demonstrates the file structure needed using zip files. Folders are also supported in place of zips, and follow the same file structure, but must be uploaded to Sama's secure AWS S3 bucket.

🚧
Sensor fusion support
Please see additional notes below on folder support. It’s required to keep the assets in a correct and matching sequence for sensor fusion to be properly supported.

Input data folder structure

Important: when creating zip files, ensure that you are zipping just the files, and there is not a parent folder.

Not acceptable: Cloud.zip contains parent_folder1/frame001.pcd, parent_folder1/frame002.pcd, parent_folder1/frame003.pcd
Acceptable: Cloud.zip contains frame001.pcd, frame002.pcd, frame003.pcd
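Zip tools often include the enclosing folder by default. The sketch below uses Python's standard `zipfile` module and sets `arcname` so that each file is stored at the root of the archive. The folder and file names are placeholders taken from the example above, and `zip_flat` is a hypothetical helper.

```python
import os
import tempfile
import zipfile

def zip_flat(source_dir: str, zip_path: str) -> list[str]:
    """Zip the files in source_dir at the archive root, with no parent folder."""
    names = sorted(
        name
        for name in os.listdir(source_dir)
        if os.path.isfile(os.path.join(source_dir, name))
    )
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in names:
            # arcname drops the directory part so entries sit at the zip root.
            archive.write(os.path.join(source_dir, name), arcname=name)
    return names

# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    frames_dir = os.path.join(tmp, "parent_folder1")
    os.makedirs(frames_dir)
    for name in ("frame001.pcd", "frame002.pcd", "frame003.pcd"):
        open(os.path.join(frames_dir, name), "w").close()
    zip_path = os.path.join(tmp, "Cloud.zip")
    zip_flat(frames_dir, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        entries = archive.namelist()
```

Listing the archive afterwards, as in the last lines, is a quick way to confirm that no entry contains a folder prefix.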

You have a sequence that is three frames long consisting of a point cloud. The following is the file structure needed:

Cloud.zip: frame001.pcd, frame002.pcd, frame003.pcd

You have a sequence that is three frames long consisting of a point cloud and a ground point cloud provided by the client. The following is the file structure needed:

Cloud.zip: frame001.pcd, frame001-ground.pcd, frame002.pcd, frame002-ground.pcd, frame003.pcd, frame003-ground.pcd (Note that the ground point cloud naming matches the point cloud.)
Alternative Cloud.zip: cloud/frame001.pcd, cloud/frame002.pcd, cloud/frame003.pcd, ground/frame001-ground.pcd, ground/frame002-ground.pcd, ground/frame003-ground.pcd

You have a sequence that is three frames long consisting of a point cloud and a ground point cloud provided by the client, and you want 2 reference videos/images shown. The following is the file structure needed:

Cloud.zip: frame001.pcd, frame001-ground.pcd, frame002.pcd, frame002-ground.pcd, frame003.pcd, frame003-ground.pcd
Camera1.zip: frame001.jpeg, frame002.jpeg, frame003.jpeg. (Alternatively, Camera1.mp4 video file instead of a zip of images.)
Camera2.zip: camera2-frame001.jpeg, camera2-frame002.jpeg, camera2-frame003.jpeg (Alternatively, Camera2.mp4 video file instead of a zip of images.)

You have a sequence that is three frames long consisting of a point cloud and a ground point cloud provided by the client, and you want 2 reference videos/images shown, 3D to 2D projections, and a fixed-world environment. The following is the file structure needed:

Cloud.zip: frame001.pcd, frame001-ground.pcd, frame002.pcd, frame002-ground.pcd, frame003.pcd, frame003-ground.pcd
Camera1.zip: frame001.jpeg, frame002.jpeg, frame003.jpeg. (Alternatively, Camera1.mp4 video file instead of a zip of images.)
Camera2.zip: camera2-frame001.jpeg, camera2-frame002.jpeg, camera2-frame003.jpeg. (Alternatively, Camera2.mp4 video file instead of a zip of images.)
Camera1-sensor-calibration.zip: frame001.csv, frame002.csv, frame003.csv.
Camera2-sensor-calibration.zip: frame001.csv, frame002.csv, frame003.csv.
Cloud-sensor-location.zip: frame001.csv, frame002.csv, frame003.csv.

Testing the compatibility of a point cloud file

To test whether your point cloud file is compatible with the Sama platform:

Install PDAL from https://pdal.io/quickstart.html.
Run "pdal translate <input-file> lidar.pcd", replacing <input-file> with the path to your point cloud file.

This will attempt to do a vanilla translation from the input format to .pcd. If an error occurs, the file will not load in the Sama platform.
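The same check can be scripted when there are many files to test. `pdal translate` takes an input path followed by an output path; the sketch below builds that command and runs it with `subprocess`. The file name `scan.las` and the function names are hypothetical, and the sketch assumes that PDAL exits with a non-zero status when the translation fails.

```python
import shutil
import subprocess

def build_translate_command(input_path: str, output_path: str = "lidar.pcd") -> list[str]:
    """Command line for a plain PDAL translation of input_path to output_path."""
    return ["pdal", "translate", input_path, output_path]

def is_translatable(input_path: str, output_path: str = "lidar.pcd") -> bool:
    """Run pdal translate and report whether it exited without error."""
    if shutil.which("pdal") is None:
        raise RuntimeError("pdal executable not found on PATH")
    result = subprocess.run(
        build_translate_command(input_path, output_path),
        capture_output=True,
        text=True,
    )
    # Assumption: a non-zero exit code means the translation failed.
    return result.returncode == 0

# Building the command does not require PDAL to be installed.
command = build_translate_command("scan.las")
```

Calling `is_translatable("scan.las")` on a machine with PDAL installed would run the translation and return True or False accordingly.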

Point Cloud tools

Cuboid (3D)

The (x,y,z) vertices of each 3D cuboid are always exported in the following pattern: one full face and then the other full face. In this diagram, the side bounded by 3-1-2-0 is one face of the cuboid and the side bounded by 7-5-6-4 is the other face of the cuboid. These faces may be “front” and “back,” “left” and “right,” or “top” and “bottom,” depending on the cuboid’s rotation in the data.

The cuboid vertex export order can be used to define a face of interest. The face of interest is defined by the first four vertices of the cuboid listed in the export file. A face of interest, or side of the cuboid, can be used to designate the front of a car or the direction of movement of an object bounded by a cuboid. Therefore, there is always a default face of interest.

Figure: Cuboid vertex export pattern, showing the first 4 vertices exported as the “front” and the last 4 vertices exported as the “back.”

As an illustrative example, consider the Point Cloud workspace below, where a single vehicle has been delimited by a cuboid.

The JSON for this object would be as follows:

"Point Cloud Scene": [
  {
    "shapes": [
      {
        "tags": {
          "Parent": "None",
          "Answers": "TYPE_VEHICLE",
          "Occlusion": ""
        },
        "type": "cuboid",
        "index": 3,
        "key_locations": [
          {
            "tags": {
              "Occlusion": "Visible"
            },
            "points": [
              [
                -2.64196,
                3.93875,
                -1.82775
              ],
              [
                -2.64196,
                3.93875,
                -0.40086
              ],
              [
                -2.64196,
                5.85652,
                -1.82775
              ],
              [
                -2.64196,
                5.85652,
                -0.40086
              ],
              [
                1.83147,
                3.93875,
                -1.82775
              ],
              [
                1.83147,
                3.93875,
                -0.40086
              ],
              [
                1.83147,
                5.85652,
                -1.82775
              ],
              [
                1.83147,
                5.85652,
                -0.40086
              ]
            ],
            "visibility": 1,
            "frame_number": 0,
            "position_center": [
              -0.40525,
              4.89764,
              -1.1143
            ],
            "direction": {
              "roll": 0.0,
              "pitch": 0.0,
              "yaw": 3.141592653589793
            },
            "dimensions": {
              "length": 4.47343,
              "width": 1.91777,
              "height": 1.42689
            }
          }
        ],
        ...
 }

The first 4 points represent the face of interest.
The position_center (x,y,z) is the center point of the cuboid.
Dimensions are the length (along the x-axis), width (along the y-axis), and height (along the z-axis) of the cuboid.
Direction is the roll, pitch, and yaw of the cuboid; applied in that respective order and in radians. Using direction, it is easy to determine any faces of interest (such as the front of a cuboid that may represent the front of a vehicle). For specific faces of interest that depend on annotation instructions, use points as above.
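The relationship between points, position_center, and the face of interest can be checked with a few lines of code. The sketch below uses the eight vertices from the example JSON above: it averages all eight to recover the center, averages the first four to find the face of interest, and derives a horizontal heading from the center toward that face. The variable and function names are illustrative only.

```python
import math

# The eight exported vertices from the example (first four = face of interest).
points = [
    [-2.64196, 3.93875, -1.82775],
    [-2.64196, 3.93875, -0.40086],
    [-2.64196, 5.85652, -1.82775],
    [-2.64196, 5.85652, -0.40086],
    [1.83147, 3.93875, -1.82775],
    [1.83147, 3.93875, -0.40086],
    [1.83147, 5.85652, -1.82775],
    [1.83147, 5.85652, -0.40086],
]

def centroid(vertices):
    """Average of a list of (x, y, z) vertices."""
    count = len(vertices)
    return [sum(v[axis] for v in vertices) / count for axis in range(3)]

center = centroid(points)           # should agree with position_center
face_center = centroid(points[:4])  # center of the face of interest
back_center = centroid(points[4:])  # center of the opposite face

# Horizontal direction from the cuboid center toward the face of interest.
# Floating-point rounding may yield +pi or -pi; both describe the same heading.
heading = math.atan2(face_center[1] - center[1], face_center[0] - center[0])

# Distance between the two exported faces (the cuboid length in this example).
face_separation = math.dist(face_center, back_center)
```

For this example the heading agrees in magnitude with the exported yaw, and the separation between the two faces agrees with the exported length.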
