FlexSim reinforcement learning question


mark zhen asked Oct 16 2023 at 9:25 AM Jeanette F commented Oct 25 2023 at 4:46 PM


I would like to understand how reinforcement learning terms map onto a FlexSim model.

For example, what does an episode correspond to in the model?

And what does a timestep correspond to in the model? I am asking because I want to understand why there was a spike in rewards at the beginning of my training.

Software Version:

FlexSim 23.0.0

reinforcement learning training

1697448272978.png (270.6 KiB)


Kavika F ♦ commented · Oct 18 2023 at 6:01 PM

@mark zhen, where did you get this graph? Did you plot it from Python or from FlexSim?


Joerg Vogel commented · Oct 16 2023 at 9:49 AM

Probably a division by zero or near zero.
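To illustrate how that could produce an early spike: the thread does not show the actual reward function, so the sketch below assumes a rate-style reward (finished items divided by elapsed model time). The function name and the numbers are hypothetical.

```python
def throughput_reward(items_done, elapsed_time):
    """Hypothetical rate-style reward: finished items per unit of model time."""
    return items_done / elapsed_time


# Very early in a run the denominator is tiny, so a single finished item
# already produces a large reward.
early = throughput_reward(items_done=1, elapsed_time=0.01)

# Later in the run the same kind of statistic settles to a much smaller value.
late = throughput_reward(items_done=100, elapsed_time=1000.0)

print(early > late)  # the early reward dominates the later one
```

If the reward in the model is built from a statistic like this, the first few timesteps of each episode can dominate the reward curve.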


mark zhen Joerg Vogel commented · Oct 16 2023 at 10:10 AM

So how do I solve this problem in the model?


Jason Lightfoot ♦♦ commented · Oct 17 2023 at 12:45 PM

I believe the term 'time-step' comes from the action/reward step in a Markov Decision Process. The number of time-steps corresponds to the number of action -> simulate -> observe -> reward cycles within your episode.
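That cycle can be sketched as a toy loop. This is not the FlexSim API: `ToyModelEnv` and `run_episodes` are hypothetical stand-ins, and the sketch assumes an episode is one model run from reset until the run is done.

```python
import random


class ToyModelEnv:
    """Hypothetical stand-in for a simulation model; not the FlexSim API."""

    def __init__(self, decisions_per_run=5):
        self.decisions_per_run = decisions_per_run
        self.decisions_made = 0

    def reset(self):
        # Start of an episode: the model is reset to its initial state.
        self.decisions_made = 0
        return 0  # initial observation

    def step(self, action):
        # One timestep: apply the action, simulate to the next decision
        # point, then report the observation, the reward, and a done flag.
        self.decisions_made += 1
        observation = self.decisions_made
        reward = 1.0 if action == observation % 2 else 0.0
        done = self.decisions_made >= self.decisions_per_run
        return observation, reward, done


def run_episodes(env, n_episodes, seed=0):
    """Run several episodes and count timesteps across all of them."""
    rng = random.Random(seed)
    total_timesteps = 0
    episode_returns = []
    for _ in range(n_episodes):
        env.reset()
        done = False
        episode_return = 0.0
        while not done:
            action = rng.randint(0, 1)          # action
            _, reward, done = env.step(action)  # simulate, observe, reward
            episode_return += reward
            total_timesteps += 1
        episode_returns.append(episode_return)
    return total_timesteps, episode_returns


total_timesteps, episode_returns = run_episodes(ToyModelEnv(5), n_episodes=3)
print(total_timesteps)  # 3 episodes x 5 decisions each
```

In this picture the x-axis of a training plot counts calls to `step` summed over all episodes, so a spike at the start of the plot belongs to the first decisions of the first runs.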


Jeanette F ♦♦ commented · Oct 25 2023 at 4:46 PM

Hi @mark zhen,

Were you able to solve your problem? If so, please add and accept an answer to let others know the solution. Or please respond to the previous comment so that we can continue to help you.

If we don't hear back in the next 3 business days, we'll assume you were able to solve your problem and we'll close this case in our tracker. You can always comment back at any time to reopen your question, or you can contact your local FlexSim distributor for phone or email help.


Answer 1 · 2023-10-16T11:22:07Z

Joerg Vogel answered Oct 16 2023 at 11:22 AM mark zhen commented Oct 19 2023 at 2:26 PM

A quick and dirty way would be to use a warmup time, or to transmit rewards a bit later in your model's run time.
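A minimal sketch of the "transmit rewards later" idea, again assuming a rate-style reward. `delayed_reward` and `first_reward_time` are hypothetical names, and the threshold of 100 time units is arbitrary.

```python
def delayed_reward(items_done, now, first_reward_time=100.0):
    """Return no reward until the model has run for a while.

    Once `now` has reached `first_reward_time`, the denominator can no
    longer be close to zero, so the rate cannot blow up.
    """
    if now < first_reward_time:
        return 0.0
    return items_done / now


start_of_run = delayed_reward(items_done=1, now=0.01)
later_in_run = delayed_reward(items_done=100, now=1000.0)

print(start_of_run)  # 0.0
```

Whether this removes the spike depends on how the reward in the model is actually computed, so treat it as a starting point rather than a confirmed fix.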


mark zhen commented · Oct 17 2023 at 11:05 AM

I don't quite understand what warmup means or what impact it will have on the model.


1697540691717.png (11.4 KiB)

0926.fsm (50.6 KiB)

Jason Lightfoot ♦♦ mark zhen commented · Oct 17 2023 at 12:08 PM

You can find the warmup description in the online documentation. It could somehow influence the timesteps in question if your rewards are based on model statistics. I'm not sure if it would explain the spike in your graph.


mark zhen Jason Lightfoot ♦♦ commented · Oct 17 2023 at 12:52 PM

Thanks, but I have other questions.

For example, what does each timestep correspond to in the FlexSim model?

What do an epoch and an episode each mean?

