question

Ryan_Wei asked · Jeanette F commented

How do I do this reinforcement learning training?

I have done the tutorial. Now I'm trying to extend this example.

This is my model:

1678539112553.png

This is my SetupTime Table:

1678539154674.png

If I want to extend the model from the tutorial from one production line to three, should I change the Observation Space and Action Space in the Reinforcement Learning tool to MultiDiscrete, and set three Decision Events?

Like this:

1678539216517.png
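For context, this is roughly what that change would mean on the Python side of the interface; a minimal sketch using the gym API, with placeholder sizes rather than the actual values from my model:

```python
# Discrete vs. MultiDiscrete, as the training script would see them.
# The sizes (5 item types, 3 lines) are placeholders, not the actual
# values from the model above.
from gym.spaces import Discrete, MultiDiscrete

# One production line: each observation/action is a single value.
obs_one_line = Discrete(5)                # last item type, 0..4

# Three production lines: one value per line in every sample.
obs_three_lines = MultiDiscrete([5, 5, 5])

print(obs_three_lines.sample())           # e.g. array([2, 0, 4])
```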

This is my model file:

practice2023.3.16.fsm


FlexSim 23.0.0
reinforcement learning

Jeanette F ♦♦ commented ·

Hi @Ryan_Wei, was Felix Möhlmann's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always unaccept and comment back to reopen your question.


1 Answer

Felix Möhlmann answered · Felix Möhlmann commented

Since all three processors operate in the same way and are thus equivalent from the Reinforcement Learning agent's perspective, you shouldn't need to train a new agent if you have already trained one for a single processor.
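As a rough sketch of that reuse, assuming the training script follows the tutorial and uses stable-baselines3 (the saved-policy file name below is hypothetical), the same trained policy can be queried at every processor's decision event:

```python
# Sketch: reuse one trained policy for all three (equivalent) processor
# decision points. Assumes an SB3 policy was saved during the tutorial
# training; "tutorial_policy" is a hypothetical file name.
from stable_baselines3 import PPO

model = PPO.load("tutorial_policy")

def decide(last_item_type: int) -> int:
    # Query the shared policy with one processor's observation
    # (formatted the way the policy's observation space expects).
    action, _state = model.predict(last_item_type, deterministic=True)
    return int(action)
```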

The easiest way to use more processors would probably be to copy the Reinforcement Learning tool and adjust only a few things in each copy: which event triggers it, whose "lastitemtype" value is written to the parameter in the On Observation code, and which reward value is returned (if they differ).

1678693496275.png

1678693542809.png

If you come up with a good way to know which processor triggered the observation (possibly by keeping track of which one will finish, and thus pull, next), you could also handle this with multiple events in a single Reinforcement Learning tool.
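For that bookkeeping, something along these lines would work; a hypothetical helper in plain Python, not FlexSim API:

```python
# Hypothetical helper: given each processor's scheduled finish time,
# the one finishing soonest is the one that will pull next, and
# therefore the one the next decision event belongs to.
def next_to_finish(finish_times: dict) -> str:
    return min(finish_times, key=finish_times.get)

print(next_to_finish({"Processor1": 120.0,
                      "Processor2": 95.5,
                      "Processor3": 140.0}))   # -> Processor2
```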



Ryan_Wei commented ·

So what I should do is use three Reinforcement Learning tools, with a separate parameter for each observation and action?

Like this:

1678780011896.png

1678780045066.png

1678780059364.png

Thank you for your answer, and sorry to trouble you again.

Felix Möhlmann commented ·
If all three processors are equivalent in your model, yes.

If there is a difference (for example, if items are not distributed equally or the processors use different setup time tables), then you should add a second observation parameter that tells the RL algorithm which processor it is currently making a decision for.
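In gym terms, that second parameter makes the observation two-dimensional while the action stays as before; a sketch with assumed sizes:

```python
# Sketch of the extra observation parameter: entry 0 is the last item
# type, entry 1 identifies which processor is deciding. The sizes
# (5 types, 3 processors) are assumptions.
from gym.spaces import Discrete, MultiDiscrete

NUM_TYPES = 5
NUM_PROCESSORS = 3

observation_space = MultiDiscrete([NUM_TYPES, NUM_PROCESSORS])
action_space = Discrete(NUM_TYPES)    # the item type to pull next
```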

Ryan_Wei commented ·

But when I train the model this way, the result looks like this:

1678782474983.png

1678782503207.png

1678782530755.png

1678782559489.png

"ep_len_mean"and "ep_rew_mean" are all smaller than before training. I would like to know which part is wrong.

