
mark zhen asked · Jeanette F commented

FlexSim reinforcement learning question

I would like to understand some details about how a FlexSim model relates to reinforcement learning.

For example, what does an episode correspond to in the model?

And what does a timestep correspond to in the model? I am asking because I want to understand why there was a spike in rewards at the beginning of my training.


FlexSim 23.0.0
reinforcement learning, training
1697448272978.png (270.6 KiB)

1 Answer

Joerg Vogel answered · mark zhen commented

A quick and dirty way would be to use a warm-up time, or to start transmitting rewards a bit later in your model run.
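To picture the idea of ignoring early rewards, here is a minimal sketch in plain Python. It is not FlexSim code: `WarmupRewardFilter`, `warmup_time`, and the sample values are hypothetical and only illustrate suppressing rewards until a chosen model time has passed. In a real model the same logic would presumably live wherever the reward is computed.

```python
class WarmupRewardFilter:
    """Zero out rewards reported before a warm-up time has elapsed.

    Hypothetical helper, not part of FlexSim: it only illustrates the idea
    of ignoring early rewards while the model is still filling up.
    """

    def __init__(self, warmup_time):
        self.warmup_time = warmup_time

    def filter(self, model_time, raw_reward):
        # Before the warm-up time, statistics-based rewards may be
        # dominated by the empty-model start, so they are suppressed.
        if model_time < self.warmup_time:
            return 0.0
        return raw_reward


# Made-up (model_time, reward) pairs with an early spike.
samples = [(5.0, 40.0), (20.0, 12.0), (120.0, 3.0), (240.0, 3.5)]
reward_filter = WarmupRewardFilter(warmup_time=100.0)
filtered = [reward_filter.filter(t, r) for t, r in samples]
print(filtered)
```

With these made-up numbers, the two rewards reported before time 100 are replaced by zero and the later ones pass through unchanged. Whether this removes a spike in a real training curve depends on how the reward is defined.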


mark zhen commented

I don't quite understand what warm-up means or what impact it will have on the model.

1697540691717.png (11.4 KiB)
0926.fsm (50.6 KiB)
Jason Lightfoot ♦♦ commented (in reply to mark zhen)

You can find a description of warm-up time in the online documentation. It could influence the early timesteps if your rewards are based on model statistics, but I'm not sure it would explain the spike in your graph.

mark zhen commented (in reply to Jason Lightfoot ♦♦)

Thanks, but I have some other questions.

For example, what does each timestep correspond to in the FlexSim model?

What do epoch and episode each mean?
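For reference, the usual Gym-style convention can be sketched as follows. This assumes the standard reinforcement learning meaning of the terms, not anything confirmed in this thread about FlexSim specifically. `ToyModelEnv` is a hypothetical stand-in for a simulation model: an episode is one run from reset until the model reports it is done, and a timestep is one action decision inside that run. "Epoch" usually belongs to the training algorithm (for example, a pass over collected data) rather than to the model, so it does not appear here.

```python
class ToyModelEnv:
    """Hypothetical stand-in for a simulation model (not FlexSim's API)."""

    def __init__(self, decisions_per_run):
        self.decisions_per_run = decisions_per_run
        self.decision_count = 0

    def reset(self):
        # Start of an episode: the model is reset to its initial state.
        self.decision_count = 0
        return self.decision_count  # initial observation

    def step(self, action):
        # One timestep: apply the action, then advance the model until
        # the next point where a decision is needed.
        self.decision_count += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.decision_count >= self.decisions_per_run
        return self.decision_count, reward, done


def run_episode(env, policy):
    """Run one episode and return (timesteps, total_reward)."""
    observation = env.reset()
    total_reward, timesteps, done = 0.0, 0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
        timesteps += 1
    return timesteps, total_reward


env = ToyModelEnv(decisions_per_run=5)
# Three episodes with a fixed policy that always chooses action 1.
results = [run_episode(env, lambda obs: 1) for _ in range(3)]
print(results)
```

In this toy setup every episode contains exactly five timesteps. In a real model the number of timesteps per episode depends on how often a decision is requested before the run ends.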
