question

mark zhen asked Andrew O commented

The Problem with Reinforcement Learning

I don't understand why my model behaves this way: I replaced my production order, but the result stayed the same.

Normally the result should be different, right?

What I want to do now is take an agent I have already trained and give it different orders to see how it performs.

allcombos-22-0-1-fm.fsm

1676822699903.png

FlexSim 22.0.0
reinforcement learning
Andrew O commented ·

Hi @mark zhen, was Felix Möhlmann's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always unaccept and comment back to reopen your question.


1 Answer

Felix Möhlmann answered Felix Möhlmann commented

The source is still assigning random types to the items in its trigger. Thus, any type values you enter in the source's table don't actually influence the model since they are overridden.
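To illustrate the point, here is a minimal Python sketch (not FlexSim code) of a source whose creation trigger overwrites the type taken from its table. The function `create_items`, the flag `override_in_trigger`, and the two example orders are hypothetical names chosen for this illustration; the range 1 to 3 for the random type is also an assumption.

```python
import random

def create_items(table_types, seed, override_in_trigger=True):
    """Return the Type label each created item ends up with."""
    rng = random.Random(seed)  # one seeded stream per run
    final_types = []
    for table_type in table_types:
        item_type = table_type  # value read from the source's table
        if override_in_trigger:
            # The trigger assigns a random type, discarding the table value.
            item_type = rng.randint(1, 3)
        final_types.append(item_type)
    return final_types

order_a = [1, 2, 3, 1, 2]
order_b = [3, 3, 1, 2, 1]

# With the override active, both orders yield the same item types.
same_with_override = create_items(order_a, seed=1) == create_items(order_b, seed=1)

# Without the override, the table values reach the items unchanged.
kept_without_override = create_items(order_a, seed=1, override_in_trigger=False)
```

In this sketch the table only starts to matter once the random assignment in the trigger is removed.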


mark zhen commented ·

I don't understand. Can you explain again? I already have a trained agent, and I want to use this agent to solve problems with different orders. How can I do that?


Felix Möhlmann commented (replying to mark zhen) ·

The "Set Label" option in the creation trigger sets the "Type" label to a random value, overwriting whatever value it had previously.

1676878727855.png

So replications with the same replication number in the Experimenter will always produce the same order of types. I already explained this in your previous post.

Your model is also set to take a random action when a request is made (again resulting in the same result when the same random number seed is used). It does not use the RL agent at all currently.

1676878971697.png
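The difference between a random action and an agent's action can be sketched in plain Python (again, not FlexSim code). `run_episode`, `random_action`, and `policy_action` are hypothetical names, and `policy_action` is only a stand-in for a trained agent: the one property it shares with a real policy is that its output depends on the observation.

```python
import random

def run_episode(order, choose_action, seed):
    """Collect the action chosen for each item type in the order."""
    rng = random.Random(seed)  # same seed gives the same random stream
    return [choose_action(item_type, rng) for item_type in order]

def random_action(item_type, rng):
    # Ignores the observation entirely; only the seed matters.
    return rng.randrange(3)

def policy_action(item_type, rng):
    # Hypothetical stand-in for a trained agent: the action depends
    # on the observed item type, not on the random stream.
    return (item_type - 1) % 3

order_a = [1, 2, 3]
order_b = [3, 2, 1]

# Random actions: identical for both orders when the seed is the same.
random_a = run_episode(order_a, random_action, seed=7)
random_b = run_episode(order_b, random_action, seed=7)

# Observation-driven actions: they change when the order changes.
policy_a = run_episode(order_a, policy_action, seed=7)
policy_b = run_episode(order_b, policy_action, seed=7)
```

Under these assumptions, changing the order can only change the outcome once the action is computed from the observation rather than drawn at random.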

mark zhen commented (replying to Felix Möhlmann) ·

I figured it out, sorry about that. I still have one question about the code in my setup.

treenode curvar = assertvariable(current, "f_lastlabelval");

I want to know what value this gives me and how I can check it.

1676880018690.png
