[Minimal Notes] LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments

LineNet: a Zoomable CNN for Crowdsourced High Definition Maps Modeling in Urban Environments

Abstract:

We proposed a convolutional neural network with a novel prediction layer and a zoom module, called LineNet. It is designed for state-of-the-art lane detection in an unordered crowdsourced image dataset.

And we introduced TTLane, a dataset for efficient lane detection in urban road modeling applications.

Combining LineNet and TTLane, we proposed a pipeline to model HD maps with crowdsourced data for the first time. And the maps can be constructed precisely even with inaccurate crowdsourced data.

I. INTRODUCTION

HD map modeling pipeline:

1. Collect data in a crowdsourced way, for example from dashcams;

2. Detect and label lane markings with LineNet;

3. Model HD maps using structure from motion (SfM) with the detection results.

To achieve high precision, a novel Line Prediction (LP) layer is included to locate lane markings directly instead of calculating them from a segmentation.

In addition to the LP layer, the Zoom Module is added to recognize occlusion segments and gaps inside dashed lanes.

With this module, the field of view (FoV) can be enlarged to arbitrary size without changing the network architecture.

IV. LINENET

A. Line Prediction Layer

The Line Prediction (LP) layer is designed for accurate lane positioning and classification, inspired by the three-branch prediction procedure of Zhu et al. [32]. There are six branches in our LP layer: line mask, line type, line position, line direction, line confidence, and line distance, as shown in Fig. 2.

line mask: a stroke we draw with a fixed width (32 pixels in our experiments).
line type: indicates one of six lane marking types: WS (white solid), WD (white dash), RB (road boundaries), YS (yellow solid), YD (yellow dash), and others.
line position: predicts the vector from an anchor point to the closest point on the line. Supervising on the line position can produce much more accurate results than using a mask.
line direction: predicts the orientation of the lane.
line confidence: predicts the confidence ratio, i.e., whether the network can see the lane clearly enough. It is defined as 0 if two lines are closer than 46 pixels and 1 otherwise. When we zoom into two adjacent lines, they gradually separate, which changes the confidence ratio.
line distance: predicts the distance from the anchor point to the closest point on the line.
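
As a rough illustration of what the position, direction, distance, mask, and confidence branches are supervised on, the sketch below computes targets for a single anchor point against one annotated lane given as a polyline. This is only a sketch under assumptions: the function names, the polyline representation, and the idea that the 32-pixel mask stroke is centered on the line (half-width 16) are not from the paper; only the 32-pixel width and the 46-pixel confidence rule come from the notes above.

```python
import math

def closest_point_on_segment(p, a, b):
    """Return the point on segment a-b that is closest to p."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return (float(ax), float(ay))
    # Projection parameter, clamped so the result stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def lp_targets(anchor, polyline, mask_width=32.0):
    """Hypothetical per-anchor targets for one lane polyline."""
    best = None
    for a, b in zip(polyline, polyline[1:]):
        q = closest_point_on_segment(anchor, a, b)
        vec = (q[0] - anchor[0], q[1] - anchor[1])
        dist = math.hypot(vec[0], vec[1])
        if best is None or dist < best["distance"]:
            best = {
                "position": vec,                                   # anchor -> closest point
                "distance": dist,                                  # length of that vector
                "direction": math.atan2(b[1] - a[1], b[0] - a[0]), # segment orientation
            }
    # Assumption: the mask stroke is centered on the line.
    best["mask"] = best["distance"] <= mask_width / 2.0
    return best

def confidence_target(gap_px, min_gap_px=46.0):
    """0 if two lines are closer than 46 pixels, 1 otherwise (rule from the notes)."""
    return 0.0 if gap_px < min_gap_px else 1.0
```

For example, `lp_targets((0, 0), [(3, -5), (3, 5)])` looks at a vertical lane 3 pixels to the right of the anchor, so the position vector points 3 pixels along x and the anchor falls inside the mask.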

B. Zoom Module

The thumbnail CNN provides a global context for the features that the high-resolution CNN “sees” in detail.

The two CNNs share weights to some extent through the "injection layer".

For the zoom module, we use the 66th layer (4b13) of the Dilated ResNet architecture [5] (DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs) as the injection layer.

The Zoom Module and LP layer work together to boost performance. The zoom module allows the network to zoom into areas where the LP layer is not confident enough.

In practice, the zooming procedure is used multiple times, with the zoom ratio gradually multiplied from 0.5 to 16. Fig. 5 shows the four stages of the zooming process and how the LP layer and the zoom module interact with each other through the output of the line confidence branch. Overall, when the certainty of the lane rises, the spatial location of the lane becomes more accurate.
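
The interaction between the confidence branch and the zoom module can be sketched as a coarse-to-fine loop: at each zoom ratio, regions that are already confident are accepted, and only the uncertain ones are passed to the next, larger ratio. This is a minimal sketch, not the paper's implementation. The doubling factor, the 0.5 threshold, and the `predict_confidence` callback (a stand-in for running the network on a region at a given ratio) are all assumptions; the notes only state that the ratio is multiplied from 0.5 up to 16.

```python
def zoom_ratios(start=0.5, stop=16.0, factor=2.0):
    """Zoom schedule from start to stop; the factor of 2 is an assumption."""
    ratios = []
    ratio = start
    while ratio <= stop:
        ratios.append(ratio)
        ratio *= factor
    return ratios

def coarse_to_fine(predict_confidence, regions, ratios, threshold=0.5):
    """Zoom only into regions whose line confidence is still below threshold.

    Returns a dict mapping each region to the ratio at which it became
    confident, or None if it never did.
    """
    resolved = {}
    pending = list(regions)
    for ratio in ratios:
        still_uncertain = []
        for region in pending:
            if predict_confidence(region, ratio) >= threshold:
                resolved[region] = ratio
            else:
                still_uncertain.append(region)
        pending = still_uncertain
        if not pending:
            break
    for region in pending:
        resolved[region] = None
    return resolved

# Toy stand-in for the network: each region becomes clear at a fixed ratio.
needed = {"a": 0.5, "b": 4.0, "c": 32.0}

def toy_confidence(region, ratio):
    return 1.0 if ratio >= needed[region] else 0.0

result = coarse_to_fine(toy_confidence, ["a", "b", "c"], zoom_ratios())
```

In the toy run, region "a" is accepted at the thumbnail scale, "b" only after zooming to ratio 4, and "c" stays unresolved because it would need a ratio beyond the schedule.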

C. Post Processing

The clustering algorithm DBSCAN [7] was used with our hierarchical distance (HDis).
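
To make the role of the distance function concrete, here is a small, pure-Python DBSCAN that accepts any pairwise distance. These notes do not define HDis, so the sketch plugs in plain Euclidean distance between points as a placeholder; `dbscan`, `eps`, and `min_pts` are illustrative names and values, not the paper's settings.

```python
import math

def dbscan(items, distance, eps, min_pts):
    """Minimal DBSCAN. Returns one label per item; -1 marks noise."""
    NOISE = -1
    labels = [None] * len(items)  # None means "not visited yet"

    def neighbors(i):
        # Includes i itself, as in the usual DBSCAN definition.
        return [j for j in range(len(items))
                if distance(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels

# Placeholder distance; the paper's HDis would be passed here instead.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (50, 50)]
labels = dbscan(points, math.dist, eps=1.5, min_pts=2)
```

Keeping the distance as a parameter is the point of the sketch: swapping in a line-to-line distance changes what counts as "the same lane" without touching the clustering loop.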

V. EXPERIMENTS

A. Lane Detection for HD Maps Modeling

2) LineNet for Lane Positioning:

In the lane positioning subtask, our goal is to predict all the lane markings in the image without considering their types.

3) LineNet for Comprehensive Lane Detection:

In this subtask, we need to predict both lane marking positions and their types. Here we evaluate the area and distance between lines, not masks.

Criteria a predicted line (pdLine) must meet to count as correct:

The distance between the endpoints of two lines is smaller than a certain threshold (denoted disThresh).
The area enclosed by two lines is smaller than a certain threshold (denoted areaThresh).

In our experiments, disThresh is set to 40px due to the high resolution of the images, and areaThresh is set to disThresh times the length of gtLine.
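
The two criteria can be checked with a few lines of geometry. The sketch below is one possible reading, under assumptions that are not stated in the notes: both lines are polylines ordered in the same direction, they do not cross each other (so the shoelace formula gives the enclosed area), matching endpoints are compared pairwise, and both criteria must hold. The function names are hypothetical; only disThresh = 40 px and areaThresh = disThresh × length(gtLine) come from the text.

```python
import math

def polyline_length(line):
    """Sum of segment lengths along a polyline."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))

def enclosed_area(line_a, line_b):
    """Shoelace area of the polygon formed by walking along line_a
    and returning along line_b (assumes the lines do not cross)."""
    polygon = list(line_a) + list(reversed(line_b))
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def is_match(pd_line, gt_line, dis_thresh=40.0):
    """True if pd_line satisfies both the endpoint and the area criterion."""
    area_thresh = dis_thresh * polyline_length(gt_line)
    start_gap = math.dist(pd_line[0], gt_line[0])
    end_gap = math.dist(pd_line[-1], gt_line[-1])
    return (start_gap < dis_thresh
            and end_gap < dis_thresh
            and enclosed_area(pd_line, gt_line) < area_thresh)

gt_line = [(0, 0), (0, 100)]
near_pd = [(10, 0), (10, 100)]   # 10 px away: endpoints and area both within limits
far_pd = [(50, 0), (50, 100)]    # 50 px away: endpoint gap exceeds disThresh
```

Here `is_match(near_pd, gt_line)` holds because the endpoint gaps are 10 px and the enclosed area is 1000 px², well under the 4000 px² area threshold, while `far_pd` fails on the endpoint criterion alone.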

Visualization

SCNN [30] and MLD-CRF [12] do not produce class information, so class-level results were evaluated for all methods except these two.

By contrast, LineNet was robust and performed well in both situations.