
Sharan Nitin asked

Reinforcement learning query on "flexsim_inference.py"

  • What kind of information is loaded on the FlexSim server when we execute the command model = PPO.load("ChangeoverTimesModel")?

  • As per the example, if item type = 3 is on the machine and it is sent as an observation to the server, what kind of value is predicted by the algorithm?

  • How is the return value used in the simulation?


@Phil BoBo






FlexSim 23.0.0
reinforcement learning, artificial intelligence

1 Answer

Jordan Johnson answered

A trained reinforcement learning model is essentially a set of matrices (the weights of the policy) that transform your observations into an action. The point of training is to fill in those matrices so that they return good actions for the observations they receive. In gym/stable_baselines, the matrices are saved in a zip file after training. When you call PPO.load(RLModelFile), you are loading those matrices back in.
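To make the "matrices in a zip file" idea concrete, here is a minimal sketch using only the Python standard library. It is a toy stand-in, not the stable_baselines file format: the file name weights.json, the helper names save_policy and load_policy, and the numbers in the matrix are all made up for illustration.

```python
import io
import json
import zipfile

# Toy "policy": one row per observation, one column per action.
# The numbers are invented; real values come out of training.
weights = [
    [2.0, 0.1, 0.3],
    [0.2, 1.8, 0.1],
    [0.1, 0.4, 2.2],
]


def save_policy(matrix):
    """Write the matrix into an in-memory zip archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Hypothetical entry name, chosen only for this sketch.
        archive.writestr("weights.json", json.dumps(matrix))
    return buffer.getvalue()


def load_policy(blob):
    """Read the matrix back out of the zip archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return json.loads(archive.read("weights.json"))


# Loading returns the same numbers that were saved; nothing is re-trained.
restored = load_policy(save_policy(weights))
```

The point is that loading only restores numbers that training already produced. No simulation data or training history is needed to use them afterwards.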

The policy predicts one value for each parameter in your action space. In the basic example in the help manual, the policy generally learns to return an action that matches the previous item type, so an observation of item type 3 would typically produce an action that selects item type 3.
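As a rough illustration of how an observation turns into an action, the sketch below looks up the row of a small score matrix for the observed item type and returns the highest-scoring action. The matrix values, the three item types, and the function name predict are assumptions for this example. A real PPO policy is a neural network, and in stable_baselines3 the corresponding call is model.predict(observation).

```python
def predict(weights, item_type):
    """Return the 1-based action with the highest score for this observation."""
    scores = weights[item_type - 1]  # row for the observed item type
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index + 1


# Hypothetical "trained" scores: the largest value in each row sits on the
# diagonal, so the chosen action matches the observed item type.
trained = [
    [2.0, 0.1, 0.3],
    [0.2, 1.8, 0.1],
    [0.1, 0.4, 2.2],
]

# Item type 3 was last on the machine, so it is sent as the observation.
action = predict(trained, 3)
```

With these made-up scores, action ends up as 3, which mirrors the "match the previous type" behavior described above. An untrained matrix would give arbitrary actions.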

FlexSim's RL tool uses the return value to set all the parameters in your action space. The model should then read those parameter values so that the model "takes the action". In the example in the help manual, the model uses the action to prioritize which item should be pulled next.
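The sketch below imitates that last step in plain Python, under some assumptions: the parameter name ItemType, the queue contents, and the pull_next helper are hypothetical, and the real logic lives inside the FlexSim model rather than in Python.

```python
def pull_next(queue, preferred_type):
    """Remove and return the first item of the preferred type.

    Falls back to the oldest item when no item of that type is waiting.
    """
    for index, item_type in enumerate(queue):
        if item_type == preferred_type:
            return queue.pop(index)
    return queue.pop(0)


# Hypothetical model parameter that belongs to the action space.
parameters = {"ItemType": 1}

# Item types currently waiting upstream of the machine.
queue = [2, 5, 3, 1]

# The RL tool writes the returned action into the parameter...
parameters["ItemType"] = 3

# ...and the model reads the parameter when it decides what to pull next.
pulled = pull_next(queue, parameters["ItemType"])
```

Here the item of type 3 is pulled ahead of the older items of types 2 and 5. The action itself does nothing until the model reads the parameter.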
