Question

russ.123 asked · Jeanette F commented

Reinforcement Learning with 2 agents

Hello!

I am trying to complete a project on using Reinforcement Learning with FlexSim to do the following:

Four sources will be used as inputs for the four distinct products entering the production line ("Product 1", "Product 2", "Product 3" and "Product 4"). The four products are collected in Queue 1 and output to Processor 1, which has a varying setup time when switching from one product type to another and different processing times for each of the four product types (the same setup as in the FlexSim Reinforcement Learning tutorial). After Processor 1 has finished processing a product, the product proceeds to Queue 2.

From Queue 2, products are directed to four "specialized" processors: Processor 2, Processor 3, Processor 4 and Processor 5. Each processor is "specialized" at processing one item type faster than all the other processors; for example, Processor 2 processes "Product 1" faster than "Product 2", while Processor 3 processes "Product 2" faster than "Product 1". Once processing is complete, the product enters Sink 1 and the process is finished.

To optimize the system's efficiency through product scheduling and routing, reinforcement learning (RL) will be implemented at two key points:

1. Processor 1 – At this stage the agent decides which product to pull into Processor 1 from Queue 1, based on which sequence yields the shortest total elapsed time (setup time plus processing time).

2. Queue 2 – RL will be employed to optimize the routing of products to Processor 2, 3, 4 or 5. The goal of the agent here is to send each product to its "specialized" processor, or to the next-best processor if the "specialized" one is currently processing another product.
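
To make the two decision points concrete, this is roughly how I picture the two agents' spaces on the Python side. The sizes and observation choices are just my reading of the setup above; the actual spaces are defined by each model's Reinforcement Learning tool:

```python
# Rough sketch of the spaces I have in mind for the two agents.
from gym import spaces  # or gymnasium.spaces, matching flexsim_env.py

# Agent 1 (Processor 1): choose which of the 4 product types to pull next
agent1_action = spaces.Discrete(4)
# Observation: the type Processor 1 last worked on (0 = none yet),
# as in the tutorial's changeover model
agent1_observation = spaces.Discrete(5)

# Agent 2 (Queue 2): choose which of Processors 2-5 receives the item
agent2_action = spaces.Discrete(4)
# Observation: the type of the item being routed (0 = queue empty)
agent2_observation = spaces.Discrete(5)
```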


From my understanding, the scripts made available in the FlexSim Reinforcement Learning tutorial (flexsim_env.py and flexsim_training.py) are set up to train only one RL agent. I have therefore built two identical models: one with the RL agent implemented at Processor 1 only, and the other with the RL agent implemented at Queue 2 only. The scripts are able to train the agent at Processor 1 but not the agent at Queue 2, so I would like to check whether I have done something wrong here.


Additionally, after validating that both single-agent models work, I would like to combine them into one model with both agents. Is this possible?


FlexSim 23.2.0
reinforcement learning
changeover-rl1.fsm (40.4 KiB)
changeover-rl2.fsm (40.6 KiB)
flexsim-env.py (7.6 KiB)

1 Answer

Felix Möhlmann answered · Felix Möhlmann commented

In the second model the "LastItemType" parameter is updated on exit of an item from Queue 2, so the decision of where to send an item is currently based on the type of the previous item, which doesn't make much sense. The parameter should instead be updated in the On Entry trigger.

Since multiple items can be in process at the same time, the way the reward is calculated should probably also be changed. If a new item is released from Queue 2 shortly before a different item enters the sink, its reward will be very high regardless of its actual process time, because the measured elapsed time is very short. I would store the total process time in a label on the item. When the item enters the sink, add that value to an array label. The reward is then based on the oldest entry in that array, which is discarded afterwards.
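
In Python terms, the bookkeeping would look something like the sketch below. In the model itself this would live in the triggers and the RL tool's Reward Function; the names and the sign convention here are made up for illustration:

```python
# FIFO reward bookkeeping: each decision is scored against one specific
# finished item, not whichever item happened to enter the sink last.
from collections import deque

finished_times = deque()  # oldest completed item's total process time on the left

def on_sink_entry(total_process_time):
    # On Entry of Sink 1: record how long this item actually took
    # (read from the label the item accumulated on its way through)
    finished_times.append(total_process_time)

def reward_for_next_decision():
    # Use the oldest completed item's time, then discard it
    if not finished_times:
        return 0.0
    oldest = finished_times.popleft()
    return -oldest  # shorter process time -> higher reward (assumed convention)
```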

The current setup also does not satisfy your goal of learning to use the "next best" processor when the best one is not available. For that, the agent actually needs to know through the observations which processors are currently available.
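
One possible encoding (an assumption on my part, not the only option) would be to extend the Queue 2 agent's observation with one busy/idle flag per processor:

```python
from gym import spaces  # or gymnasium.spaces, matching flexsim_env.py

# Item type at the head of Queue 2 (0 = empty) plus a 0/1 availability
# flag for each of Processors 2-5
queue2_observation = spaces.MultiDiscrete([5, 2, 2, 2, 2])
```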

I haven't done this myself, but it should be possible to add two RL tools to a model, each querying actions from a different port. You would then run both agents in parallel.
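
Untested sketch of what the Python side could look like with two environments on separate ports, assuming the tutorial's FlexSimEnv constructor takes flexsimPath, modelPath, address and port; check the signature in your copy of flexsim_env.py and adjust the paths:

```python
# Two agents trained against connections on different ports. With your
# current two-model setup these are simply the two models; with two RL
# tools in one model, flexsim_env.py would need to be adapted so both
# connections talk to the same FlexSim instance.
from stable_baselines3 import PPO
from flexsim_env import FlexSimEnv

env1 = FlexSimEnv(
    flexsimPath="C:/Program Files/FlexSim 2023 Update 2/program/flexsim.exe",
    modelPath="Changeover - RL1.fsm",
    address="localhost", port=5005)
env2 = FlexSimEnv(
    flexsimPath="C:/Program Files/FlexSim 2023 Update 2/program/flexsim.exe",
    modelPath="Changeover - RL2.fsm",
    address="localhost", port=5006)

agent1 = PPO("MlpPolicy", env1, verbose=1)
agent2 = PPO("MlpPolicy", env2, verbose=1)

agent1.learn(total_timesteps=50000)
agent2.learn(total_timesteps=50000)
```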
