Algorithm 3: Constructing the training set for the neural planner

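Input: training sample set; task scenario dataset $X_{obs}$; set of hidden representations of the task scenarios $H = \{h_{1}, h_{2}, \ldots, h_{n}\}$; set of feasible paths in the task scenarios $P = \{P_{i} \mid P_{i} = \{L_{1}, L_{2}, \ldots, L_{m}\}\}$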

Output: position at the next time step

1. ← $\emptyset$

2. $H$ ← Encoder($X_{obs}$)

3. $P$ ← LoadPath()

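4. for i ← 1 to N do // N is the number of task scenarios

5. for j ← 1 to M do // M is the number of feasible paths in the i-th task scenario

6. Length ← $L_{j}^{P_{i}}$.Length() // $L_{j}^{P_{i}}$ is the j-th feasible path of the i-th task scenario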

7. Reverse ← False

8. if Length > 0 then