question

Vothan Salomão asked:

Is it possible to use reinforcement learning to train AGVs?

Is it possible to use reinforcement learning to train AGVs? I was thinking of using the Advanced AGV template in a network with multiple AGVs circulating, writing the tokens' labels for the current CP (Control Point), the last CP, and the AGV's destination into a Global Table, and creating a Parameter Table with this information for each AGV. This data would be used as the observations for machine learning, and the actions would involve modifying the AGVs' destinations or assigning each AGV a specific destination. The goal would be to maximize the number of pallet outputs across the model, and I believe this approach could help optimize routes and reduce potential deadlocks.
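For reference, here is a minimal sketch of how the observations and actions described above could be encoded on the Python training side with gymnasium. The counts (number of AGVs, CPs, and destinations) are illustrative assumptions, not values taken from the model.

```python
# Sketch only: observation/action spaces for a table of per-AGV state.
# NUM_AGVS, NUM_CPS and NUM_DESTINATIONS are illustrative assumptions.
import numpy as np
from gymnasium import spaces

NUM_AGVS = 4          # AGVs circulating in the network (assumption)
NUM_CPS = 30          # control points in the AGV network (assumption)
NUM_DESTINATIONS = 6  # candidate pickup/drop-off destinations (assumption)

# One row per AGV: [current CP index, last CP index, destination index]
observation_space = spaces.Box(
    low=0,
    high=max(NUM_CPS, NUM_DESTINATIONS),
    shape=(NUM_AGVS, 3),
    dtype=np.int32,
)

# One destination choice per AGV; AGVs already carrying an item would
# simply ignore the assignment on the model side.
action_space = spaces.MultiDiscrete([NUM_DESTINATIONS] * NUM_AGVS)
```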



I have read the available documentation on reinforcement learning

FlexSim 23.2.0
Tags: reinforcement learning, optimize, agvs

Kavika F ♦ commented:

Hey @Vothan Salomão, I haven't built any models like that, but from the way you've described the problem, I think it's solvable. Is there a specific aspect of your issue you're trying to solve or figure out?

Vothan Salomão replied to Kavika F ♦:

ReinforcementLearning_AGVs2.fsm




I would like to try building a model using reinforcement learning with AGVs, but I'm facing some difficulties.


First, it relates to the Process Flow and the Global Table I've built. I'm having trouble accurately capturing each AGV's current CP and Last CP in the dynamic table as the AGVs travel their routes, using the Process Flow and the CP and Last CP labels.


Another question concerns the AGVs' destinations. When the Global Table references the Destination label, it doesn't actually update with the AGV's true destination.


The idea behind my model is to train an agent that periodically makes the best AGV-to-pallet allocation decisions, updating the Destination label of AGVs that are empty or on their way to pick up an item.


The agent's observations would be the columns CP, Last CP, and Destination. The actions would periodically modify the destinations of AGVs that are not carrying items so they go pick up pallets. The reward function would aim to maximize the number of pallets transported over time.
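As a concrete illustration of that reward, here is a small Python sketch that returns the increase in transported pallets since the previous decision step. `get_pallet_output_count` is a hypothetical hook standing in for however the output statistic (e.g. a sink's input count) is read through the RL connection; it is not a FlexSim API name.

```python
# Reward sketch: pallets transported since the last decision step.
class ThroughputReward:
    def __init__(self):
        self.last_count = 0

    def __call__(self, get_pallet_output_count):
        current = get_pallet_output_count()       # read the model's output statistic
        reward = current - self.last_count        # pallets moved since the last action
        self.last_count = current
        return reward


# Example usage inside an environment's step(), where read_pallet_count
# is your own hook into the model's statistics:
#   reward = throughput_reward(read_pallet_count)
```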


How could I update my Global Table more accurately by obtaining this information from each AGV and not from the Process Flow?

Jason Lightfoot ♦♦ commented:

Hi @Vothan Salomão, was David Seo's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always comment back to reopen your question.

Vothan Salomão replied to Jason Lightfoot ♦♦:

Hello @Jason Lightfoot, I've elaborated on my questions in more detail since the initial inquiry. Thank you.


Jason Lightfoot ♦♦ replied to Vothan Salomão:

We've explained how to update the table in this post. Do you still need help with this question?


1 Answer

David Seo answered:

@Vothan Salomão

I'll take one of my client's case studies as an example of using AI:RL to find an optimized AGV route without dynamic traffic. It's not my project; the customer built it themselves.

Yes, it uses FlexSim's Python AI:RL feature.

So I think you can solve issues like yours, such as maximizing outputs through the model, using AI:RL.
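As a hedged sketch of what the Python training side can look like: assuming the FlexSim model is exposed through a gymnasium-compatible wrapper (here hypothetically named FlexSimAGVEnv, with the spaces and reward described earlier in this thread), a standard library such as stable-baselines3 can train a dispatch policy against it. None of the names below are FlexSim API names.

```python
# Minimal training-loop sketch, not FlexSim's official example.
from stable_baselines3 import PPO

from flexsim_agv_env import FlexSimAGVEnv  # hypothetical wrapper module

env = FlexSimAGVEnv()                      # launch/connect to the AGV model
model = PPO("MlpPolicy", env, verbose=1)   # MLP policy over the table-style observation
model.learn(total_timesteps=100_000)       # reward = pallet throughput per decision step
model.save("agv_dispatch_policy")

env.close()
```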


Vothan Salomão commented:

Hello @David Seo

Could you provide me with an example or guidance for my implementation? What parameters do you consider important for observations, actions, and rewards?


This is my model with the Global Table that I intend to use as an observation.

ReinforcementLearning_AGVs2.fsm
