sz asked Jeanette F commented

Reinforcement learning tutorial: state values and mismatch between messages

Hello,

I’ve been following this tutorial: https://docs.flexsim.com/en/22.1/ModelLogic/ReinforcementLearning/Training/Training.html and have successfully implemented and run the two Python scripts, flexsim_env.py and flexsim_training.py. However, I have trouble understanding parts of the output. I've attached a screenshot for reference.

1) In the FlexSim model, the "action" and "observation" parameters ("LastItemType" and "ItemType") are defined to have values between 1 and 5. However, in the output, the state values range from 0 to 4. Why is there this discrepancy between the expected state range in the model and the observed state values in the output?

2) At the beginning of each iteration, the "state" values from the Action and Observation messages don’t match. After a few simulation steps, the values do align, but why are they initially inconsistent?

Thank you!

1728925547594.png

Model.fsm

FlexSim 24.1.0
Tags: reinforcement learning, reinforcement training

Jeanette F ♦♦ commented

Hi @sz, was Felix Möhlmann's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always comment back to reopen your question.


1 Answer

Felix Möhlmann answered

1) I believe a discrete parameter with N possible values is always mapped to the range [0, N-1]. For example, if the possible values were 3, 6, 9 and 12, the RL agent would "see" the values 0, 1, 2, 3.

2) Not all item types will be available to pull at the start of a run. When the requested type is not available, the demo model instead pulls the first item in the queue, so the type that is actually pulled can differ from the type the action asked for.
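
To illustrate point 1), here is a minimal Python sketch of the index mapping described above. It is only an illustration of the idea, not FlexSim's actual implementation; the helper names `value_to_index` and `index_to_value` are hypothetical. The behavior matches how a `gym.spaces.Discrete(n)` space enumerates its values as 0 to n-1 by default.

```python
# Hypothetical helpers illustrating the assumed mapping between the
# values defined in the model and the 0-based indices the agent sees.

def value_to_index(value, possible_values):
    """Return the 0-based index the agent would see for a model value."""
    return possible_values.index(value)


def index_to_value(index, possible_values):
    """Return the model value that corresponds to an agent-side index."""
    return possible_values[index]


# Item types 1..5 as defined in the model parameters.
item_types = [1, 2, 3, 4, 5]
seen_by_agent = [value_to_index(v, item_types) for v in item_types]
print(seen_by_agent)  # [0, 1, 2, 3, 4]

# The example from the answer: values 3, 6, 9, 12 are seen as 0..3.
spaced = [3, 6, 9, 12]
seen_spaced = [value_to_index(v, spaced) for v in spaced]
print(seen_spaced)  # [0, 1, 2, 3]
print(index_to_value(2, spaced))  # 9
```

Under this assumption, a state value of 0 in the Python output would correspond to item type 1 in the model, and 4 to item type 5.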
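
For point 2), the fallback can be sketched in Python as follows. This is a simplified stand-in for the demo model's pull logic, not its actual code; the function `pull_item` and the queue contents are made up for illustration, and the queue is assumed to be non-empty.

```python
from collections import deque


def pull_item(queue, requested_type):
    """Pull the requested type if it is in the queue, else the first item.

    Assumes the queue is not empty.
    """
    if requested_type in queue:
        queue.remove(requested_type)  # removes the first matching item
        return requested_type
    return queue.popleft()  # fallback: take whatever is first in line


# Hypothetical queue early in a run: type 5 has not arrived yet.
queue = deque([2, 2, 4])
requested = 5
pulled = pull_item(queue, requested)
print(requested, pulled)  # 5 2
```

In this sketch the action asks for type 5, but the item that is actually pulled has type 2. Once every type is present in the queue, the pulled type matches the requested one, which is consistent with the values lining up after a few simulation steps.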
