
Data Code



Log Data to #🪄🐝



Pull Data from Woven Planet and then Log to WANDB for Reuse across Different Models

Setup

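A minimal setup sketch, assuming the only packages needed here are l5kit and wandb (versions are not pinned):

```python
# Install the packages used below (uncomment in a notebook).
# !pip install l5kit wandb

import os

import numpy as np
import wandb
from l5kit.configs import load_config_data
from l5kit.data import ChunkedDataset, LocalDataManager
```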

Download Data

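The prediction dataset itself is distributed through the Woven Planet (Lyft Level 5) website and has to be downloaded after registering; the sketch below only assumes it has been unpacked locally (the path is a placeholder):

```python
# Point l5kit at the local copy of the dataset (path is a placeholder).
os.environ["L5KIT_DATA_FOLDER"] = "/path/to/l5kit_data"

dm = LocalDataManager(None)
# Resolve the sample scenes relative to L5KIT_DATA_FOLDER.
sample_zarr_path = dm.require("scenes/sample.zarr")
```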

Log Data to WANDB

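A sketch of starting a run and creating a dataset artifact; the project and artifact names are placeholders:

```python
# Start a W&B run and create an artifact that will hold the raw dataset files.
run = wandb.init(project="l5kit-prediction", job_type="upload-data")
artifact = wandb.Artifact(name="woven-planet-sample", type="dataset")
```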

Enrich Run Information and Artifact Information

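Runs and artifacts can carry extra context that makes them easier to find and reuse later. A sketch, with example values for the fields:

```python
# Attach searchable context to the run and the artifact (values are examples).
run.config.update({"dataset": "sample.zarr", "source": "Woven Planet Level 5 Prediction"})
run.notes = "Raw sample scenes pulled from the Woven Planet dataset"

artifact.description = "Sample .zarr scenes from the Woven Planet prediction dataset"
artifact.metadata = {"format": "zarr", "split": "sample"}
```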

Log Data to WANDB

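Finally, the downloaded files are added to the artifact and logged, so later runs can pull them back down with use_artifact. A sketch:

```python
# Add the dataset directory to the artifact, upload it, and close the run.
artifact.add_dir(sample_zarr_path, name="scenes/sample.zarr")
run.log_artifact(artifact)
run.finish()
```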

Visualize the Logged Data Back in #🪄🐝



Visualisation Examples

This notebook shows some of the visualisation utility of our toolkit.

The core packages for visualisation are:

rasterization

contains classes for getting visual data as multi-channel tensors and turning them into interpretable RGB images. Every class has at least a rasterize method to get the tensor and a to_rgb method to convert it into an image. A few examples are:

  • BoxRasterizer: this object renders agents (e.g. vehicles or pedestrians) as oriented 2D boxes
  • SatelliteRasterizer: this object renders an oriented crop from a satellite map

visualization

contains utilities to draw additional information (e.g. trajectories) onto RGB images. These utilities are commonly used after a to_rgb call to add other information to the final visualisation. One example is:

  • draw_trajectory: this function draws 2D trajectories from coordinates and yaws offset on an image

Setup

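A sketch of the imports this walkthrough relies on (module paths follow recent l5kit releases):

```python
import os

import matplotlib.pyplot as plt
import numpy as np

from l5kit.configs import load_config_data
from l5kit.data import ChunkedDataset, LocalDataManager
from l5kit.dataset import AgentDataset, EgoDataset
from l5kit.geometry import transform_points
from l5kit.rasterization import build_rasterizer
from l5kit.visualization import TARGET_POINTS_COLOR, draw_trajectory
```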

First, let's configure where our data lives!

The data is expected to live in a folder that can be configured using the L5KIT_DATA_FOLDER env variable. Your data folder is expected to contain subfolders for the aerial and semantic maps as well as the scenes (.zarr files). In this example, the env variable is set to the local data folder; make sure the path points to the correct location for you.

We built our code to work with a human-readable yaml config. This config file holds a lot of useful information; however, here we will only focus on the parts concerning loading and visualization.

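A sketch of the configuration step; the data folder path and the yaml file name are placeholders for your local setup:

```python
# Tell l5kit where the data lives and load the human-readable config.
os.environ["L5KIT_DATA_FOLDER"] = "/path/to/l5kit_data"
dm = LocalDataManager(None)
cfg = load_config_data("./visualisation_config.yaml")
print(cfg)
```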

We can look into our current configuration for interesting fields: when loaded in Python, the yaml file is converted into a Python dict.

raster_params contains all the information related to the transformation of the 3D world onto an image plane:

  • raster_size: the image plane size
  • pixel_size: how many meters correspond to a pixel
  • ego_center: our raster is centered around an agent, we can move the agent in the image plane with this param
  • map_type: the rasterizer to be employed. We currently support a satellite-based and a semantic-based one. We will look at the differences further down in this script
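Since the loaded config is a plain dict, these fields can be inspected directly; a sketch:

```python
# Inspect the rasterisation parameters of the loaded config.
for key, value in cfg["raster_params"].items():
    print(f"{key}: {value}")
```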

Load the data

The same config file is also used to load the data. Every split in the data has its own section, and multiple datasets can be used (as a whole or sliced). In this short example we will only use the first dataset from the sample set. You can change this by configuring the 'train_data_loader' variable in the config.

You may also have noticed that we're building a LocalDataManager object. This will resolve relative paths from the config using the L5KIT_DATA_FOLDER env variable we have just set.

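A sketch of opening the first dataset referenced by the config (the config key follows the layout used in the l5kit sample configs):

```python
# Resolve the dataset path from the config and open the zarr dataset.
dataset_path = dm.require(cfg["train_data_loader"]["key"])
zarr_dataset = ChunkedDataset(dataset_path)
zarr_dataset.open()
print(zarr_dataset)
```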

Working with the raw data

.zarr files support most of the traditional numpy array operations. In the following cell we iterate over the frames to get a scatter plot of the AV locations:

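A sketch of that loop, assuming the frames table exposes an ego_translation field as in the public dataset schema:

```python
# Iterate over the frames and collect the ego (AV) centroid for each one.
frames = zarr_dataset.frames
coords = np.zeros((len(frames), 2))
for idx in range(len(frames)):
    coords[idx] = frames[idx]["ego_translation"][:2]

plt.scatter(coords[:, 0], coords[:, 1], marker=".")
plt.title("AV locations")
plt.xlabel("x [m]")
plt.ylabel("y [m]")
plt.show()
```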

Another easy thing to try is to get an idea of the agent type distribution.

We can take all the agents' label_probabilities and get the argmax for each row. Because .zarr files map to numpy arrays, we can use all the traditional numpy operations and functions.

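A sketch of counting the predicted label for every agent; PERCEPTION_LABELS is the label list shipped with l5kit:

```python
from collections import Counter

from l5kit.data import PERCEPTION_LABELS

# Load the agents table into memory and take the argmax of the label probabilities.
agents = zarr_dataset.agents[:]
labels_indexes = np.argmax(agents["label_probabilities"], axis=1)

for label_idx, count in Counter(labels_indexes).most_common():
    print(f"{PERCEPTION_LABELS[label_idx]}: {count}")
```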

Working with data abstraction

Even though it's absolutely fine to work with the raw data, we also provide classes that abstract data access to offer an easier way to generate inputs and targets.

Core Objects

Along with the rasterizer, our toolkit contains other classes you may want to use while you build your solution. The dataset package, for example, already implements PyTorch ready datasets, so you can hit the ground running and start coding immediately.

Dataset package

We will use two classes from the dataset package for this example. Both of them can be iterated and return multi-channel images from the rasterizer along with future trajectories offsets and other information.

  • EgoDataset: this dataset iterates over the AV annotations
  • AgentDataset: this dataset iterates over other agents' annotations

Both support multi-threading (through the PyTorch DataLoader) out of the box.

What if I want to visualise the Autonomous Vehicle (AV)?

Let's get a sample from the dataset and use our rasterizer to get an RGB image we can plot.

If we want to plot the ground truth trajectory, we can convert the dataset's target_position (displacements in meters in agent coordinates) into pixel coordinates in the image space, and call our utility function draw_trajectory (note that you can use this function for the predicted trajectories, as well).

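A sketch of that flow; the exact sample keys and the draw_trajectory signature vary slightly across l5kit versions, so treat this as a recent-version layout:

```python
# Build the rasterizer from the config and wrap the zarr dataset in an EgoDataset.
rast = build_rasterizer(cfg, dm)
dataset = EgoDataset(cfg, zarr_dataset, rast)
data = dataset[50]  # an arbitrary sample index

# Convert the multi-channel tensor into an RGB image.
im = dataset.rasterizer.to_rgb(data["image"].transpose(1, 2, 0))

# Project the ground-truth displacements from agent coordinates into pixel space.
target_positions_pixels = transform_points(data["target_positions"], data["raster_from_agent"])
draw_trajectory(im, target_positions_pixels, TARGET_POINTS_COLOR)

plt.imshow(im)
plt.show()
```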

What if I want to change the rasterizer?

We can do so easily by building a new rasterizer and new dataset for it. In this example, we change the value to py_satellite which renders boxes on an aerial image.

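A sketch of swapping in the satellite rasterizer:

```python
# Change the map type and rebuild the rasterizer and dataset.
cfg["raster_params"]["map_type"] = "py_satellite"
rast = build_rasterizer(cfg, dm)
dataset = EgoDataset(cfg, zarr_dataset, rast)
data = dataset[50]

im = dataset.rasterizer.to_rgb(data["image"].transpose(1, 2, 0))
plt.imshow(im)
plt.show()
```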

What if I want to visualise an agent?

Glad you asked! We can just replace the EgoDataset with an AgentDataset. Now we're iterating over agents and not the AV anymore, and the first one happens to be the pace car (you will see this one around a lot in the dataset).

Semantic

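A sketch of the semantic view of the first agent; the image is kept around as im_semantic so it can be logged to a table later:

```python
# Iterate over agents instead of the AV by swapping in an AgentDataset.
cfg["raster_params"]["map_type"] = "py_semantic"
rast = build_rasterizer(cfg, dm)
dataset = AgentDataset(cfg, zarr_dataset, rast)
data = dataset[0]  # the first agent is the pace car

im_semantic = dataset.rasterizer.to_rgb(data["image"].transpose(1, 2, 0))
plt.imshow(im_semantic)
plt.show()
```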

Satellite

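The satellite view only differs in the map type; a sketch:

```python
# Same agent, rendered on the aerial map.
cfg["raster_params"]["map_type"] = "py_satellite"
dataset = AgentDataset(cfg, zarr_dataset, build_rasterizer(cfg, dm))
im_satellite = dataset.rasterizer.to_rgb(dataset[0]["image"].transpose(1, 2, 0))
plt.imshow(im_satellite)
plt.show()
```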

Join into Table

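A sketch of joining the two renderings into a W&B Table so they can be browsed side by side in the UI; the run, table, and column names are placeholders, and im_semantic/im_satellite come from the sketches above:

```python
# Log the semantic and satellite renderings side by side.
run = wandb.init(project="l5kit-prediction", job_type="visualisation")
table = wandb.Table(columns=["agent_index", "semantic", "satellite"])
table.add_data(0, wandb.Image(im_semantic), wandb.Image(im_satellite))
run.log({"agent_views": table})
run.finish()
```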

System Origin and Orientation

~At this point you may have noticed that we vertically flip the image before plotting it.~

Vertical flipping is not required anymore as it's already performed inside the rasteriser.

Further, all our rotations are counter-clockwise for positive values of the angle.

What does an entire scene look like?

It's easy to visualise an individual scene using our toolkit. Both EgoDataset and AgentDataset provide 2 methods for getting interesting indices:

  • get_frame_indices returns the indices for a given frame. For the EgoDataset this matches a single observation, while more than one index could be available for the AgentDataset, as that given frame may contain more than one valid agent
  • get_scene_indices returns indices for a given scene. For both datasets, these might return more than one index

In this example, we visualise a scene from the ego's point of view:

Semantic

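A sketch of rendering an entire scene frame by frame on the semantic map (the scene index is arbitrary):

```python
# Render every frame of one scene from the ego's point of view.
cfg["raster_params"]["map_type"] = "py_semantic"
dataset = EgoDataset(cfg, zarr_dataset, build_rasterizer(cfg, dm))

scene_idx = 0
frames_rgb = []
for idx in dataset.get_scene_indices(scene_idx):
    data = dataset[idx]
    frames_rgb.append(dataset.rasterizer.to_rgb(data["image"].transpose(1, 2, 0)))

# Show every 20th frame of the scene as a quick strip.
subset = frames_rgb[::20]
fig, axes = plt.subplots(1, len(subset), figsize=(20, 4))
for ax, frame in zip(np.atleast_1d(axes), subset):
    ax.imshow(frame)
    ax.axis("off")
plt.show()
```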

Satellite

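The satellite version of the same scene again only changes the map type; a sketch showing a single frame:

```python
# First frame of the same scene on the aerial map.
cfg["raster_params"]["map_type"] = "py_satellite"
dataset = EgoDataset(cfg, zarr_dataset, build_rasterizer(cfg, dm))
data = dataset[dataset.get_scene_indices(scene_idx)[0]]
plt.imshow(dataset.rasterizer.to_rgb(data["image"].transpose(1, 2, 0)))
plt.show()
```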

Introducing a new visualizer

Starting from l5kit v1.3.0, you can use an interactive visualiser (based on Bokeh) to inspect the scene.

The visualization can be built starting from individual scenes and allows for a closer inspection over ego, agents and trajectories.

PRO TIP: try hovering over an agent to show information about it.

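A sketch of the interactive visualiser, following the module layout introduced around l5kit v1.3.0:

```python
from bokeh.io import output_notebook, show

from l5kit.data import MapAPI
from l5kit.visualization.visualizer.visualizer import visualize
from l5kit.visualization.visualizer.zarr_utils import zarr_to_visualizer_scene

output_notebook()
map_api = MapAPI.from_cfg(dm, cfg)

# Build an interactive Bokeh figure for the first couple of scenes.
for scene_idx in range(2):
    scene_dataset = zarr_dataset.get_scene_dataset(scene_idx)
    vis_in = zarr_to_visualizer_scene(scene_dataset, map_api)
    show(visualize(scene_idx, vis_in))
```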