Researchers from Harbin Institute of Technology recently proposed a reinforcement learning-based approach for designing multi-impulse rendezvous trajectories under linear relative motion. The approach enables rapid generation of rendezvous trajectories through offline training and on-board deployment. Numerical optimization, deep learning, and reinforcement learning methods have all been proposed for this classical spacecraft trajectory optimization problem, but each has its own weaknesses. By combining their advantages, a policy can be pretrained in a different way.