The Problem with Reinforcement Learning


mark zhen asked Jan 28 2023 at 7:37 AM Jeanette F commented Feb 15 2023 at 5:24 PM


I'm currently having a few problems with the attached model, allcombos-22-0-1.fsm.

The first problem is that I want to add a penalty mechanism to my current reward function, but I see no negative values in my Visual Studio Code output, and the reward I set only ever appears as an integer. I don't understand why this is the case.

Also, I want to plot my training process as a graph. Is there a way to do that? Something like the picture below (the picture is from the Internet).

Software Version:

FlexSim 22.0.0

reinforcement learning python

1674891279251.png (3.5 KiB)

1674891304726.png (9.7 KiB)

1674891385651.png (220.2 KiB)

allcombos-22-0-1.fsm (335.1 KiB)


Answer 1 · 2023-01-30T08:24:43Z

Felix Möhlmann answered Jan 30 2023 at 8:24 AM Felix Möhlmann commented Feb 09 2023 at 4:02 PM

By writing "double" in front of the variable name, you are declaring a new variable within the scope of the if/else block. (If you did this in the same scope as the original "rewardA" variable, the compiler would throw an error because of the duplicate variable.) So the code currently creates a new variable and sets its value, and as soon as execution leaves the if/else block that variable vanishes.

To access the original "rewardA", use only the name, without the type in front of it.
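The same effect can be sketched in Python, purely as an analogy. Python scopes variables by function rather than by if/else block, so this is not FlexScript and not the code from the model. The names demo_scope, redeclare and use_name_only are made up for illustration; the threshold 300 and the values -100 and 1 are the ones discussed in this thread.

```python
def demo_scope(start_value):
    reward_a = start_value  # plays the role of the original "rewardA"

    def redeclare():
        # Assigning without "nonlocal" creates a NEW local reward_a.
        # It vanishes when this function returns, like the re-declared
        # variable inside the if/else block.
        reward_a = -100.0 if start_value <= 300 else 1.0
        return reward_a

    def use_name_only():
        # "nonlocal" makes the assignment target the outer variable,
        # which corresponds to using only the name.
        nonlocal reward_a
        reward_a = -100.0 if reward_a <= 300 else 1.0

    redeclare()
    after_redeclare = reward_a   # outer value is unchanged
    use_name_only()
    after_name_only = reward_a   # outer value has been updated
    return after_redeclare, after_name_only
```

Calling demo_scope(250.0) leaves the outer value untouched after the first helper and only changes it after the second one.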

You might be able to grab the data directly from the Python code, though since this wouldn't directly involve FlexSim, it might be better asked in a different programming forum.
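If the per-episode rewards are available as a list on the Python side, a smoothed learning curve is mostly a matter of a moving average. The following is a minimal sketch using only the standard library. The function name moving_average and the sample numbers are made up for illustration, and how the rewards are obtained from the training script is assumed, not shown.

```python
def moving_average(values, window):
    """Return the running mean of `values` over the last `window` entries.

    The first entries use a shorter window, so the result has the same
    length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running_sum -= values[i - window]
        count = min(i + 1, window)
        averages.append(running_sum / count)
    return averages


# Made-up episode rewards, only to show the call.
episode_rewards = [-100.0, -100.0, 1.0, 1.0, 1.0]
smoothed = moving_average(episode_rewards, window=2)
```

The smoothed list can then be passed to whatever plotting tool you prefer, with the episode index on the x-axis.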

You could also write data to an Excel file whenever a run finishes (done == 1) during the training.

(The file path would obviously be different in your case)

1675066646780.png (3.3 KiB)

1675066913317.png (17.7 KiB)


mark zhen commented · Jan 30 2023 at 3:14 PM

What I want is this: if my reward, rewardA, is less than N, the agent should be punished. How can I express that with an if statement? Also, does the part about writing to Excel go in my Visual Studio Code script? With this setup my reward should have a negative part, but I don't see it in the picture above.


Felix Möhlmann mark zhen commented · Jan 30 2023 at 7:09 PM

That is what you are currently doing: If rewardA is less than (or equal to) 300, it is set to -100, otherwise to 1.
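Written out as a small function, the rule looks like this. It is a Python sketch of the logic only, not the FlexScript from the model; the function name shaped_reward and its parameter names are made up, and the default values are the ones mentioned above.

```python
def shaped_reward(reward_a, threshold=300.0, penalty=-100.0, bonus=1.0):
    """Return the penalty if reward_a is at or below the threshold, else the bonus."""
    if reward_a <= threshold:
        return penalty
    else:
        return bonus
```

Replacing the default threshold with your own N gives the "less than N is punished" behaviour; use < instead of <= if a value of exactly N should not be punished.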

The code is placed in the reward function, after determining the value of the "done" variable but before returning the reward. It only runs if "done" is not equal to zero, that is, at the end of each replication. As an example, I chose to write "LastEntryTime" (which you use as a performance measure) to the Excel file. You can of course also sum up all rewards in a global variable over the course of the run and then write that value to Excel.
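The "sum up during the run, write once at the end" idea can be sketched as follows. This is a Python illustration that writes a CSV file instead of Excel, not the FlexScript shown in the screenshot. The class name EpisodeLogger, its method names and the column names are all made up for the example.

```python
import csv


class EpisodeLogger:
    """Accumulate rewards over a run and append one row when the run ends."""

    def __init__(self, path):
        self.path = path
        self.total = 0.0    # sum of rewards in the current run
        self.episode = 0    # number of finished runs
        # Start a fresh file with a header row.
        with open(self.path, "w", newline="") as f:
            csv.writer(f).writerow(["episode", "total_reward"])

    def record(self, reward, done):
        """Add one reward; write and reset the total if the run is done."""
        self.total += reward
        if done:
            self.episode += 1
            with open(self.path, "a", newline="") as f:
                csv.writer(f).writerow([self.episode, self.total])
            self.total = 0.0
```

You would call record(reward, done) once per decision. Rows only appear in the file when done is non-zero, so the file ends up with one line per replication.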


1675105741739.png (8.7 KiB)

mark zhen Felix Möhlmann commented · Jan 31 2023 at 5:10 PM

I want to know how your approach is implemented. Can you attach a file for my reference? Also, if I want to know the value of each variable, how should I read it? For example, I want to know the values of rewards 1, 2 and 3.


1675185014406.png (51.1 KiB)

allcombos-22-0-1.fsm (334.9 KiB)
