question

JinhaoDu asked:

Customize the reinforcement learning observation space

My model environment is an SMT (surface-mount technology) placement machine that involves both CAP and CPSP problems. It has 5 actions corresponding to 5 heuristic rules, which differs from the example provided on the FlexSim website: my observation space is a multi-dimensional vector, and the error occurs when connecting env.py.

123456789_29_autosave_2_autosave.fsm
1709191132350.png

FlexSim 23.2.2
reinforcement learning · observation space

1 Answer

Nil Ns answered:

Hello,

Jordan explains how to change the observation space in the following question: custom observation space for RL - FlexSim Community. However, the Python file will also need to be modified to accept and work with that data.
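For reference, the Python-side change could look roughly like the sketch below. This is only an illustration that assumes a gymnasium/gym-style environment similar to FlexSim's example env.py; the names, shape, and bounds are placeholders to adapt to your own model, not FlexSim's actual API:

import numpy as np
from gymnasium import spaces  # or "from gym import spaces", depending on what env.py imports

# 5 discrete actions, one per heuristic rule (as described in the question).
action_space = spaces.Discrete(5)

# A multi-dimensional observation instead of a single value: here a flat
# vector of 9 floats (e.g. a flattened 3x3 state). Adjust the shape and
# bounds to whatever the model actually sends.
observation_space = spaces.Box(low=0.0, high=10000.0, shape=(9,), dtype=np.float32)

def parse_observation(raw_values):
    # raw_values are the numbers received from the FlexSim model's
    # On Observation trigger, one value per observed quantity.
    return np.asarray(raw_values, dtype=np.float32).reshape(observation_space.shape)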


Another alternative could be to pass a table to a set of parameters using code like the following:

treenode toCopy = Model.find("Tools/ParameterTables/flujosCosteMinimo>variables/parameters/1");
Table costes = Table("MatrizCostes");
int num = 0;

for (int i = 1; i <= costes.numRows; i++) {
    for (int j = 1; j <= costes.numCols; j++) {
        num += 1;

        /*
        // Run this code the first time you create the parameters,
        // then comment it out again and re-run the script.
        if (!(i == 1 && j == 1)) {
            nodeinsertafter(toCopy);
            createcopy(toCopy, toCopy.next, 0, 0, 0, 1);
        }
        */

        // Name each copied parameter after its row/column and set its value and bounds.
        treenode copied = Model.find("Tools/ParameterTables/flujosCosteMinimo>variables/parameters/" + num);

        Model.find(copied.getPath() + "/Name").value = "flujosCosteMinimo[" + i + "][" + j + "]";
        Model.find(copied.getPath() + "/Value").value = 4;
        Model.find(copied.getPath() + "/Value/type").value = 2;
        Model.find(copied.getPath() + "/Value/lowerBound").value = 0;
        Model.find(copied.getPath() + "/Value/upperBound").value = 10000;
    }
}

Then, in each On Observation, the values should be updated using code like this:

Table costes = Table("MatrizCostes");

for (int i = 1; i <= costes.numRows; i++) {
    for (int j = 1; j <= costes.numCols; j++) {
        Model.parameters["flujosCosteMinimo[" + i + "][" + j + "]"].value = costes[i][j];
    }
}


Thank you very much, and I hope this helps.


JinhaoDu commented:

Thanks. If the second method is used, does the Python code need to be modified?

Nil Ns commented:

No, in this second method, you will need to define the Observation Space as MultiDiscrete:

1709538970225.png

This should work. However, keep in mind that the number of possible Observation Spaces grows exponentially with the number of parameters. Extremely large observation spaces can make reinforcement learning (RL) training difficult and affect the quality of results.
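Purely for illustration (no Python change is needed with this method), the MultiDiscrete space corresponding to the parameters created above, a 3x3 table with every value bounded between 0 and 10000, would be equivalent to something like this:

from gymnasium import spaces

# Nine integer dimensions (one per copied parameter), each taking values
# 0..10000, so each dimension has 10001 possible values.
observation_space = spaces.MultiDiscrete([10001] * 9)

sample = observation_space.sample()  # array of 9 integers in [0, 10000]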


JinhaoDu commented:

I originally designed it following the code logic in Figure 1, because I currently have 6 states to collect and each state has a different size: for example, state 1 is a 3*3 set of spatial coordinates, state 2 is a 30*3 set of spatial coordinates, and so on. The MultiDiscrete type doesn't meet my needs because my states are matrices, so I'm using the Array type instead.
Thank you very much for your reply.

1709550929900.png

1709550893571.png
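As a side note, matrix-shaped states of different sizes like these could be expressed on the Python side roughly as follows. This is only a sketch with placeholder names and bounds, not the actual model code:

import numpy as np
from gymnasium import spaces

# Sketch only: one Box per matrix-shaped state, with placeholder bounds.
observation_space = spaces.Dict({
    "state1": spaces.Box(low=-1e6, high=1e6, shape=(3, 3), dtype=np.float32),
    "state2": spaces.Box(low=-1e6, high=1e6, shape=(30, 3), dtype=np.float32),
    # ...the remaining states, each with its own shape
})

# Many RL libraries expect a flat vector, so the Dict can be flattened:
flat_space = spaces.flatten_space(observation_space)  # Box of shape (99,)
flat_obs = spaces.flatten(observation_space, observation_space.sample())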

JinhaoDu commented:

Is there a case study of modifying the ENV code?
