Is it possible to use reinforcement learning to train AGVs? I am thinking of using the Advanced AGV template in a network with multiple AGVs circulating. For each AGV, I would write the token labels for the current CP (Control Point), the last CP, and the AGV's destination to a global table, and create a Parameter Table with this information. This data would serve as the observations for machine learning, and the actions would either modify the AGVs' destinations or assign each AGV to a specific destination. The goal would be to maximize the number of pallets output by the model, and I believe this approach could also help optimize routes and reduce potential deadlocks.
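To make the idea concrete, here is a minimal Python sketch of how the observations, actions, and reward described above might be encoded on the training side. It is only an illustration under assumptions: the control point names, the `AGVState` class, and all function names are hypothetical and not part of any simulation software API, and the reward (pallets output since the previous decision) is just one possible way to express the goal.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical control point names; in a real model these would come from the AGV network.
CONTROL_POINTS = ["CP1", "CP2", "CP3", "CP4"]
CP_INDEX = {name: i for i, name in enumerate(CONTROL_POINTS)}


@dataclass
class AGVState:
    """One row of the assumed global table: the token labels for a single AGV."""
    current_cp: str
    last_cp: str
    destination: str


def build_observation(agvs: List[AGVState]) -> List[int]:
    """Flatten the table into integers: three control point indices per AGV."""
    obs: List[int] = []
    for agv in agvs:
        obs.extend([
            CP_INDEX[agv.current_cp],
            CP_INDEX[agv.last_cp],
            CP_INDEX[agv.destination],
        ])
    return obs


def action_space_size(num_agvs: int) -> int:
    """Number of discrete actions: one per (AGV, destination) pair."""
    return num_agvs * len(CONTROL_POINTS)


def decode_action(action: int) -> Tuple[int, str]:
    """Map a single discrete action to (AGV index, new destination)."""
    agv_index, cp_index = divmod(action, len(CONTROL_POINTS))
    return agv_index, CONTROL_POINTS[cp_index]


def step_reward(pallets_out_now: int, pallets_out_before: int) -> int:
    """Assumed reward: pallets that left the model since the previous decision."""
    return pallets_out_now - pallets_out_before


# Example with two AGVs.
agvs = [AGVState("CP1", "CP4", "CP3"), AGVState("CP2", "CP1", "CP4")]
observation = build_observation(agvs)
chosen_agv, new_destination = decode_action(5)
```

With this encoding, a single discrete action reassigns one AGV at a time. Assigning every AGV a destination in one step would instead need one choice per AGV, which increases the size of the action space.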
I have read the available documentation on reinforcement learning