I have a question about the reinforcement learning example

Ryan_Wei asked Sep 15, '23 Jeanette F commented Sep 21, '23

I'd like to understand what this line from the example on the official website does:

int done = (Model.time > 1000);

Does this mean that the reward value is calculated every 1000 seconds?

Software Version:

FlexSim 23.0.0

reinforcement learning

Kavika F ♦ commented · Sep 15, 2023 at 07:17 PM

Hey @Ryan_Wei, could you please reupload your image? I'm unable to see it. Thank you!

Ryan_Wei Kavika F ♦ commented · Sep 16, 2023 at 02:43 PM

I'm so sorry

Here is the image:

1694875374763.png (12.5 KiB)

Jeanette F ♦♦ commented · Sep 21, 2023 at 08:46 PM

Hi @Ryan_Wei, was Felix Möhlmann's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always comment back to reopen your question.

Answer 1 · 2023-09-18T06:00:31Z

Felix Möhlmann answered Sep 18, '23

The reward function passes an array with two elements to the reinforcement learning algorithm. The first value is the reward itself. The second value controls whether the algorithm continues the current simulation run (0) or concludes the run and starts a new one (1).

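(Model.time > 1000) evaluates to either 0 or 1, depending on the current time in the simulation. So this line does not control how often the reward is calculated; it only sets the second value. The first time the reward is sent after the simulation passes 1000 s, the current run ends and a new replication is started.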
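To make the effect of the flag concrete, here is a minimal sketch in plain Python rather than FlexScript. It is not FlexSim's API: `reward_function`, `run_episode`, the fixed reward of 1.0, and the 300 s schedule are all made-up names and values for illustration. It only mimics the behaviour described above, where each reward comes with a done flag and the run ends at the first reward sent after time 1000.

```python
def reward_function(model_time, reward_value):
    """Mimic the two-element result described above: [reward, done]."""
    # Same comparison as in the example: 0 = keep running, 1 = end this run.
    done = int(model_time > 1000)
    return [reward_value, done]


def run_episode(reward_event_times):
    """Return the times at which a reward was sent before the run ended.

    reward_event_times is a hypothetical list of simulation times at which
    the reward function happens to be triggered.
    """
    sent_at = []
    for t in reward_event_times:
        reward, done = reward_function(t, reward_value=1.0)
        sent_at.append(t)
        if done:
            # This is where the algorithm would start a new replication.
            break
    return sent_at


# Assumed schedule: the reward is triggered every 300 s of simulation time.
events = list(range(300, 3001, 300))
episode = run_episode(events)
print(episode)  # [300, 600, 900, 1200]
```

In this sketch the reward is still sent at 300, 600 and 900. The run does not stop exactly at 1000 either; it stops at 1200, the first reward event after the threshold.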
