This paper proposes a new on-ramp merging model for unmanned vehicles, based on the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm, to improve safety and traffic efficiency. The model incorporates an action mask to prevent invalid actions and noisy advantage values to encourage exploration. Experiments show promise in reducing accidents and improving responsiveness to dynamic environments.
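
The two mechanisms named above can be illustrated with a minimal sketch. It is not the model's actual implementation: the function names `masked_policy` and `noisy_advantages`, the additive Gaussian form of the noise, and the `noise_scale` parameter are all assumptions made for illustration. The sketch shows how a mask can assign zero probability to invalid actions before sampling, and how perturbing advantage estimates might be expressed.

```python
import math
import random


def masked_policy(logits, valid_mask):
    """Softmax over logits, giving invalid actions exactly zero probability.

    `valid_mask[i]` is True when action i is allowed in the current state.
    (Hypothetical helper; the abstract does not specify the masking rule.)
    """
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, ok in zip(logits, valid_mask) if ok)
    weights = [math.exp(l - top) if ok else 0.0
               for l, ok in zip(logits, valid_mask)]
    total = sum(weights)
    return [w / total for w in weights]


def noisy_advantages(advantages, noise_scale, rng):
    """Add zero-mean Gaussian noise to each advantage estimate.

    Assumption: additive noise scaled by `noise_scale`; the abstract does
    not state the exact formulation used by the model.
    """
    return [a + noise_scale * rng.gauss(0.0, 1.0) for a in advantages]


# Example: three candidate actions, the second one invalid in this state.
probs = masked_policy([1.0, 2.0, 3.0], [True, False, True])
perturbed = noisy_advantages([0.5, -0.2, 1.1], noise_scale=0.1,
                             rng=random.Random(0))
```

With the mask applied, the invalid action can never be sampled, while the remaining probabilities are renormalized over the valid actions only.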