Customize the reinforcement learning observation space

JinhaoDu asked

My model environment is an SMT (surface-mount) placement machine that involves both the CAP and CPSP problems. It has 5 actions corresponding to 5 heuristic rules, which differs from the example provided on the FlexSim website. My observation space is a multi-dimensional vector, and the error occurs when connecting env.py. 123456789_29_autosave_2_autosave.fsm 1709191132350.png

FlexSim 23.2.2
reinforcement learning · observation space


1 Answer

Nil Ns answered

Hello,

Jordan explains how to change the observation space in the following question: custom observation space for RL - FlexSim Community. However, the Python file will need to be modified to accept and work with this data.
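For context, FlexSim's RL examples expose a Gym-style environment on the Python side, where a multi-dimensional observation is usually declared as a Box space with per-feature bounds. A minimal sketch of that idea (the helper names and bound values here are hypothetical, and plain NumPy is used so the snippet runs standalone; in a real env these arrays would be passed to `spaces.Box(low, high)`):

```python
import numpy as np

def make_observation_space_bounds(num_features, low=0.0, high=10000.0):
    """Return per-feature lower/upper bound arrays for a vector observation.

    In a Gym-style env these arrays would be passed to spaces.Box(low, high).
    """
    lows = np.full(num_features, low, dtype=np.float32)
    highs = np.full(num_features, high, dtype=np.float32)
    return lows, highs

def clip_observation(obs, lows, highs):
    """Clamp a raw observation vector into the declared bounds."""
    obs = np.asarray(obs, dtype=np.float32)
    return np.clip(obs, lows, highs)

lows, highs = make_observation_space_bounds(5)
obs = clip_observation([1.0, 20000.0, -3.0, 4.0, 5.0], lows, highs)
print(obs.tolist())  # values outside [0, 10000] are clamped
```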


Another alternative is to copy a table into a set of parameters using code like the following:

treenode toCopy = Model.find("Tools/ParameterTables/flujosCosteMinimo>variables/parameters/1");

Table costes = Table("MatrizCostes");

int num = 0;

for (int i = 1; i <= costes.numRows; i++){
   for (int j = 1; j <= costes.numCols; j++){

      num += 1;

      /*
      // Run this block the first time to create the parameters,
      // then comment it out again and re-run the script.
      if(!(i == 1 && j == 1)){
         nodeinsertafter(toCopy);
         createcopy(toCopy, toCopy.next, 0, 0, 0, 1);
      }
      */

      treenode Copied = Model.find("Tools/ParameterTables/flujosCosteMinimo>variables/parameters/"+num);


      Model.find(Copied.getPath()+"/Name").value = "flujosCosteMinimo[" + i + "][" + j + "]";

      Model.find(Copied.getPath()+"/Value").value = 4;

      Model.find(Copied.getPath()+"/Value/type").value = 2;
      Model.find(Copied.getPath()+"/Value/lowerBound").value = 0;
      Model.find(Copied.getPath()+"/Value/upperBound").value = 10000;

   }
}

Then, on each On Observation, update the parameter values with code like this:

Table costes = Table("MatrizCostes");

for (int i = 1; i <= costes.numRows; i++){
   for (int j = 1; j <= costes.numCols; j++){
      Model.parameters["flujosCosteMinimo[" + i + "][" + j + "]"].value = costes[i][j];
   }
}

Thank you very much; I hope it helps.



JinhaoDu commented:

Thanks. If the second method is used, does the Python code need to be modified?

Nil Ns commented:

No; with this second method you only need to define the Observation Space as MultiDiscrete:

1709538970225.png

This should work. However, keep in mind that the number of possible Observation Spaces grows exponentially with the number of parameters. Extremely large observation spaces can make reinforcement learning (RL) training difficult and affect the quality of results.
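The exponential growth mentioned above comes from MultiDiscrete taking one entry per parameter, each giving the number of discrete values that parameter can take. A small sketch of how the table from the answer maps onto such a space (the 2×2 table contents are hypothetical; the bounds match the [0, 10000] range used when creating the parameters, and plain NumPy stands in for the Gym API):

```python
import numpy as np

# Hypothetical cost table standing in for the MatrizCostes example above.
costes = np.array([[3, 7], [1, 9]])
upper_bound = 10000  # matches the parameter upperBound used above

# MultiDiscrete needs one entry per parameter giving the number of
# discrete values; for bounds [0, 10000] that is 10001 per table cell.
nvec = [upper_bound + 1] * costes.size
# In a Gym-style env: observation_space = spaces.MultiDiscrete(nvec)

# Each On Observation, the flattened table becomes the observation vector.
obs = costes.flatten()
print(len(nvec), obs.tolist())
```

Note that the space has 10001^4 possible observations even for this tiny table, which illustrates the warning about training difficulty.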


JinhaoDu commented:

I originally designed it according to the code logic in Figure 1, because I currently have 6 states to collect, each of a different size: for example, state 1 is a 3×3 matrix of spatial coordinates, state 2 is a 30×3 matrix, and so on. The MultiDiscrete type doesn't meet my needs, because my states are matrices, so I'm using the Array type instead.
Thank you very much for your reply.
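One common way to handle several matrix-shaped states of different sizes is to flatten each matrix and concatenate them into a single vector, which a Box-style space can then describe. A sketch under that assumption (the zero/one contents are placeholders; only the 3×3 and 30×3 shapes come from the comment above):

```python
import numpy as np

# Placeholder states with the shapes described: a 3x3 and a 30x3
# matrix of spatial coordinates.
state1 = np.zeros((3, 3), dtype=np.float32)
state2 = np.ones((30, 3), dtype=np.float32)

# Flatten and concatenate into one vector; a Gym-style Box space of
# this length (9 + 90 = 99) can then describe the combined observation.
obs = np.concatenate([state1.ravel(), state2.ravel()])
print(obs.shape)  # (99,)
```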

1709550929900.png

1709550893571.png

(more comments not shown)
JinhaoDu commented:

Is there a case study of modifying the ENV code?

