arxiv.org/pdf/2109.05441.pdf

Top Highlights

often project the point clouds to 2D space and then process them via 2D convolution

A natural remedy is to utilize the 3D voxelization and 3D convolution network

SemanticKITTI, nuScenes and A2D2

we directly apply 3D voxelization [12], [13] and 3D convolution networks to outdoor LiDAR point clouds, only to find a very limited performance gain (as shown in Fig. 1b).

However, previous 3D voxelization methods treat the point cloud as uniform and split it into uniform cubes, neglecting the varying-density property of outdoor point clouds. Consequently, the attempt to apply 3D partition to outdoor point clouds is met with fundamental difficulty

3D cylindrical partition and asymmetrical 3D convolution networks, which maintain the 3D geometric information and handle these issues from partition and networks,

cylindrical partition resorts to cylinder coordinates to divide the point cloud dynamically according to the distance (regions that are far away from the origin have much sparser points, thus requiring a larger cell), which produces a more balanced point distribution (
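The partition described in this highlight can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the radial and height ranges, and the grid size are all assumptions chosen for the example. It maps a Cartesian point to a (rho, theta, z) cell index and shows that cells in outer rings cover a larger ground area, which is what lets distant, sparse regions collect a comparable number of points.

```python
import math

def cylindrical_cell(point, rho_max=50.0, z_min=-3.0, z_max=3.0,
                     grid=(480, 360, 32)):
    """Map a Cartesian LiDAR point (x, y, z) to a (rho, theta, z) cell index.

    The ranges and grid size are illustrative assumptions, not values
    taken from the highlighted text.
    """
    x, y, z = point
    rho = math.hypot(x, y)      # horizontal distance from the sensor axis
    theta = math.atan2(y, x)    # azimuth in [-pi, pi]
    n_rho, n_theta, n_z = grid

    def to_bin(value, lo, hi, n):
        # Clamp to the valid range, then scale to an integer bin in [0, n - 1].
        frac = (min(max(value, lo), hi) - lo) / (hi - lo)
        return min(int(frac * n), n - 1)

    return (to_bin(rho, 0.0, rho_max, n_rho),
            to_bin(theta, -math.pi, math.pi, n_theta),
            to_bin(z, z_min, z_max, n_z))

def cell_footprint_area(rho_index, rho_max=50.0, grid=(480, 360, 32)):
    """Ground-plane area of one cell in the given radial ring."""
    n_rho, n_theta, _ = grid
    d_rho = rho_max / n_rho
    d_theta = 2 * math.pi / n_theta
    r_inner = rho_index * d_rho
    r_outer = r_inner + d_rho
    # Area of an annular sector: 0.5 * d_theta * (r_outer^2 - r_inner^2)
    return 0.5 * d_theta * (r_outer ** 2 - r_inner ** 2)
```

With equal steps in rho and theta, the footprint of a cell grows with its ring index, so `cell_footprint_area(400)` is larger than `cell_footprint_area(10)`, whereas every cell of a uniform cubic grid has the same size regardless of distance.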

For semantic segmentation, we evaluate the proposed method on several large-scale outdoor datasets, including SemanticKITTI [14], nuScenes [15] and A2D2 [16].

(1) We reposition the focus of outdoor LiDAR segmentation from 2D projection to 3D structure, and further investigate the inherent properties (difficulties) of outdoor point clouds. (2) We introduce a new framework to explore the 3D geometric pattern and tackle the difficulties caused by sparsity and varying density, through cylindrical partition and asymmetrical 3D convolution networks. (3) The proposed method achieves the state of the art on LiDAR-based semantic segmentation, LiDAR panoptic segmentation and LiDAR point cloud 3D detection, which also demonstrates its strong generalization capability

PointNet,

SqueezeSeg [10], Darknet [14], SqueezeSegv2 [36], and RangeNet++

PolarNet [6

However, these 3D-to-2D projection methods inevitably lose and alter the 3D topology and fail to model the geometric information. Moreover, in most outdoor scenes, a LiDAR device is often used to produce the point cloud data, whose inherent properties, i.e., sparsity and varying density, are often neglected.
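To make the loss of 3D topology concrete, here is a rough sketch of a spherical (range-image) projection, one common kind of 3D-to-2D projection. The function name, image size, and vertical field of view are assumptions for a 64-beam sensor and are not taken from the highlighted text. Two points on the same viewing ray land in the same pixel even though they are many meters apart in 3D.

```python
import math

def range_image_pixel(point, width=2048, height=64,
                      fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) to a (row, col) pixel of a spherical range image.

    Image size and vertical field of view are assumed values; the point
    must not be the origin, since its direction would be undefined.
    """
    x, y, z = point
    depth = math.sqrt(x * x + y * y + z * z)
    yaw = math.atan2(y, x)          # horizontal angle
    pitch = math.asin(z / depth)    # vertical angle
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    col = 0.5 * (1.0 - yaw / math.pi) * width
    row = (fov_up - pitch) / (fov_up - fov_down) * height
    # Clamp to the image bounds after truncating to integer pixel indices.
    col = min(max(int(col), 0), width - 1)
    row = min(max(int(row), 0), height - 1)
    return row, col

near = (5.0, 1.0, -0.5)
far = tuple(4.0 * c for c in near)  # same viewing ray, four times farther away
same_pixel = range_image_pixel(near) == range_image_pixel(far)
```

Only the direction of a point decides its pixel, so depth ordering along a ray has to be carried as a channel value or discarded; neighbouring pixels can likewise hold points that are far apart in space.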

OccuSeg [37], SSCN [12] and SEGCloud [38]

DeepLab [43], [44] and PSP [45].

given a point cloud, the task is to assign the semantic label to each point.

we first employ the cylindrical partition to generate a more balanced point distribution (more robust to varying density), then apply the asymmetrical 3D convolution networks to strengthen the horizontal and vertical weights, thus better matching the object point distribution in driving scenes and enhancing the robustness to sparsity
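The highlight does not give kernel sizes, so the following is only a sketch under assumptions: it takes 3×1×3 and 1×3×3 as example asymmetrical kernel shapes and compares them with a standard 3×3×3 kernel. All names are hypothetical. The point is that two stacked asymmetrical kernels still reach every offset of the cubic kernel while using fewer weights.

```python
from itertools import product

def kernel_offsets(shape):
    """Return the set of (d0, d1, d2) offsets covered by a centered kernel."""
    ranges = [range(-(k // 2), k // 2 + 1) for k in shape]
    return set(product(*ranges))

full = kernel_offsets((3, 3, 3))       # standard cubic kernel
branch_a = kernel_offsets((3, 1, 3))   # assumed asymmetrical kernel shape
branch_b = kernel_offsets((1, 3, 3))   # assumed asymmetrical kernel shape

# Applying one kernel after the other lets a response depend on every
# offset that is the sum of one offset from each kernel.
stacked = {tuple(a + b for a, b in zip(p, q))
           for p in branch_a for q in branch_b}

weights_full = len(full)                       # weights per channel pair
weights_asym = len(branch_a) + len(branch_b)   # weights of the stacked pair
covers_cube = full <= stacked                  # True if the cube is covered
```

Under these assumed shapes the stacked pair uses 18 weights per channel pair instead of 27, and each kernel concentrates its weights on one plane through the center rather than spreading them over the whole cube.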

The LiDAR point cloud is first divided by the cylindrical partition, and the features extracted by the MLP are then reassigned based on this partition. Asymmetrical 3D convolution networks are then used to generate the voxel-wise outputs. For segmentation tasks, a point-wise module is introduced to alleviate the interference of lossy cell-label encoding, thus refining the outputs
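Why cell-label encoding is lossy can be shown with a small sketch. It assumes each cell receives a single label by majority vote over the points inside it, which is one common choice and not necessarily the paper's exact rule. The function name and the toy input are made up for illustration.

```python
from collections import Counter, defaultdict

def cell_label_loss(cell_ids, point_labels):
    """Majority-vote one label per cell and report points that disagree.

    cell_ids[i] is the (hashable) cell index of point i and
    point_labels[i] is its ground-truth class.
    """
    votes = defaultdict(Counter)
    for cell, label in zip(cell_ids, point_labels):
        votes[cell][label] += 1
    # Keep only the most frequent label in each cell.
    cell_label = {cell: counts.most_common(1)[0][0]
                  for cell, counts in votes.items()}
    # Points whose own label differs from their cell's label are lost.
    mismatched = [i for i, (cell, label)
                  in enumerate(zip(cell_ids, point_labels))
                  if cell_label[cell] != label]
    return cell_label, mismatched

# Toy input (made up for illustration): five points in two cells.
cells = [(0, 0, 0), (0, 0, 0), (0, 0, 0), (1, 2, 0), (1, 2, 0)]
labels = ["road", "road", "car", "car", "car"]
cell_label, mismatched = cell_label_loss(cells, labels)
```

In the toy input, the third point is a "car" point inside a cell voted "road", so a purely voxel-wise prediction cannot label it correctly; a point-wise refinement step addresses exactly this kind of point.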

varying density, where nearby regions have much greater density than farther-away regions.

Therefore, uniform cells splitting the varying-density points would fall into an imbalanced distribution (
