mark zhen asked Jeanette F commented

flexsim reward calculation and warm up issues

Attachment: 0926.fsm

There is a problem with my reward function. After I added a penalty, the reward calculation in the env file seems to go wrong.

1698652886689.png

1698652897636.png

I added a penalty to my reward function, but I don't think the reward should be negative. Even when it is less than 0.1, I still see a lot of negative values. The other problem is my warm-up calculation: the warm-up does not reset the values of my custom labels either.

FlexSim 23.0.0
reinforcement learning, warmup, reward function
1698652886689.png (693.4 KiB)
1698652897636.png (1.5 MiB)
0926.fsm (71.9 KiB)


1 Answer

Kavika F answered Kavika F commented

Hey @mark zhen, I think part of your problem is the Tardiness calculation. The first item in your model starts with a "date" of 7001.21, which is just your Model.time + 7000.

1698679955633.png

1698680252598.png

What does this date try to accomplish? Is it a target finish date? If so, please label that better, maybe "TargetFinishDate".

However, because you have such a large difference between "date" and "finish" when you do

  item.finish - item.date

you get a large negative number. I suspect that most of the initial items will have this large negative value. And because you listen to every processor's pull strategy and assign a reward based on a single sink's tardiness label (which holds a large negative number and only changes after an item finishes), you will get a lot of negative rewards, even when the upstream items may have actual tardiness.
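To make the arithmetic concrete, here is a minimal Python sketch of the difference between the signed value (finish minus date) and a tardiness that is clamped at zero. The function names and the finish times are assumptions for illustration only, not taken from your model or env file; only the 7001.21 target (Model.time + 7000) comes from the screenshot above.

```python
def signed_lateness(finish: float, target: float) -> float:
    """Signed difference, the same arithmetic as item.finish - item.date."""
    return finish - target


def tardiness(finish: float, target: float) -> float:
    """Tardiness clamped at zero: items that finish early count as 0, not negative."""
    return max(0.0, finish - target)


# The first item's "date" label: model time 1.21 plus the 7000 offset.
target = 1.21 + 7000.0

# Hypothetical finish times (assumed values, not read from the model).
early_finish = 250.0
late_finish = 7010.0

print(signed_lateness(early_finish, target))  # large negative value
print(tardiness(early_finish, target))        # clamped to zero
print(tardiness(late_finish, target))         # small positive value
```

If the sink's tardiness label stores the signed value, any reward derived from it stays strongly negative until items start finishing after their target date. Whether clamping at zero is the right fix depends on what the "date" label is meant to represent.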


1698679955633.png (9.7 KiB)
1698680252598.png (5.2 KiB)