question

Ryan_Wei asked · Jeanette F commented

How do I do this reinforcement learning training?

I have done the tutorial. Now I'm trying to extend this example.

This is my model:

1678539112553.png

This is my SetupTime Table:

1678539154674.png

If I want to extend the model from the tutorial from one production line to three, should I change the Observation Space and Action Space in the Reinforcement Learning tool to MultiDiscrete, and set three Decision Events?

Like this:

1678539216517.png
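For context, this is roughly what that change would mean on the Python side of the interface; a minimal sketch using the gym API, with placeholder sizes rather than the actual values from my model:

```python
# Discrete vs. MultiDiscrete, as the training script would see them.
# The sizes (5 item types, 3 lines) are placeholders, not the actual
# values from the model above.
from gym.spaces import Discrete, MultiDiscrete

# One production line: each observation/action is a single value.
obs_one_line = Discrete(5)                # last item type, 0..4

# Three production lines: one value per line in every sample.
obs_three_lines = MultiDiscrete([5, 5, 5])

print(obs_three_lines.sample())           # e.g. array([2, 0, 4])
```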

This is my model file:

practice2023.3.16.fsm


FlexSim 23.0.0
reinforcement learning

Jeanette F ♦♦ commented ·

Hi @Ryan_Wei, was Felix Möhlmann's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always unaccept and comment back to reopen your question.


1 Answer

Felix Möhlmann answered · Felix Möhlmann commented

Since all three processors operate in the same way and are thus equivalent from the Reinforcement Learning agent's perspective, you shouldn't need to train a new agent if you have already trained one for a single processor.
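As a rough sketch of that reuse, assuming the training script follows the tutorial and uses stable-baselines3 (the saved-policy file name below is hypothetical), the same trained policy can be queried at every processor's decision event:

```python
# Sketch: reuse one trained policy for all three (equivalent) processor
# decision points. Assumes an SB3 policy was saved during the tutorial
# training; "tutorial_policy" is a hypothetical file name.
from stable_baselines3 import PPO

model = PPO.load("tutorial_policy")

def decide(last_item_type: int) -> int:
    # Query the shared policy with one processor's observation
    # (formatted the way the policy's observation space expects).
    action, _state = model.predict(last_item_type, deterministic=True)
    return int(action)
```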

The easiest way to use more processors would probably be to copy the Reinforcement Learning tool and adjust only a few things in each copy: which event triggers it, whose "lastitemtype" value is written to the parameter in the On Observation code, and which reward value is returned (if they differ).

1678693496275.png

1678693542809.png

If you come up with a good way to know which processor triggered the observation (possibly by keeping track of which one will finish, and thus pull, next), you could also handle this with multiple events in a single Reinforcement Learning tool.
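For that bookkeeping, something along these lines would work; a hypothetical helper in plain Python, not FlexSim API:

```python
# Hypothetical helper: given each processor's scheduled finish time,
# the one finishing soonest is the one that will pull next, and
# therefore the one the next decision event belongs to.
def next_to_finish(finish_times: dict) -> str:
    return min(finish_times, key=finish_times.get)

print(next_to_finish({"Processor1": 120.0,
                      "Processor2": 95.5,
                      "Processor3": 140.0}))   # -> Processor2
```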



Ryan_Wei commented ·

So what I should do is use three Reinforcement Learning tools, with a separate parameter for each observation and action?

Like this:

1678780011896.png

1678780045066.png

1678780059364.png

Thank you for your answer, and sorry to trouble you again.

Felix Möhlmann commented ·
If all three processors are equivalent in your model, yes.

If there is a difference (for example, if items are not distributed equally or the processors use different setup time tables), then you should add a second observation parameter that tells the RL algorithm which processor it is currently making a decision for.
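In gym terms, that second parameter makes the observation two-dimensional while the action stays as before; a sketch with assumed sizes:

```python
# Sketch of the extra observation parameter: entry 0 is the last item
# type, entry 1 identifies which processor is deciding. The sizes
# (5 types, 3 processors) are assumptions.
from gym.spaces import Discrete, MultiDiscrete

NUM_TYPES = 5
NUM_PROCESSORS = 3

observation_space = MultiDiscrete([NUM_TYPES, NUM_PROCESSORS])
action_space = Discrete(NUM_TYPES)    # the item type to pull next
```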

Ryan_Wei commented ·

But when I train the model this way, the result looks like this:

1678782474983.png

1678782503207.png

1678782530755.png

1678782559489.png

"ep_len_mean"and "ep_rew_mean" are all smaller than before training. I would like to know which part is wrong.

