DQN only uses discrete action space

Question

Willie asked Feb 26 2025 at 11:34 AM Willie edited Feb 26 2025 at 12:22 PM

I am using MultiDiscrete in the action space, and the PPO algorithm is running correctly.

However, when I used DQN, it seems that DQN does not support MultiDiscrete action space.

Is it possible to convert MultiDiscrete to Discrete action space?

Thank you in advance for your advice or for any alternative method.

Software Version:

FlexSim 23.0.15

reinforcement learning dqn actionspace multidiscrete

1740568160325.png (31.5 KiB)

1740569324259.png (6.1 KiB)

smalldemo.fsm (87.4 KiB)

Answer 1 · 2025-02-26T11:58:12Z

Nil Ns answered Feb 26 2025 at 11:58 AM Nil Ns commented Feb 26 2025 at 12:18 PM

Hey Willie,

Yes, it's definitely possible to convert a MultiDiscrete action space into a Discrete one by combining OP1 and OP2 into a single value. Basically, you treat each combination of OP1 and OP2 as a unique action in a Discrete space.

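For example, if OP1 has 10 possible values and OP2 has 5 possible values, both counted from 0 (OP1 from 0 to 9, OP2 from 0 to 4), you can map each pair (OP1, OP2) to a single discrete index. If your parameters start at 1, subtract 1 before encoding and add 1 back after decoding. The mapping is: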

// MultiDiscrete to Discrete
int Op1 = 7;
int Op2 = 4;
int nOp2 = 5; // Number of possible values of Op2
return Op1 * nOp2 + Op2; // the Discrete index (39 here)

// Discrete to MultiDiscrete
int answer = 39;
int nOp2 = 5;

int OP1 = Math.floor(answer / nOp2);
int OP2 = Math.fmod(answer, nOp2);
return OP1 + "" + OP2;
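If you have more than two parameters, the same idea extends to any number of sub-actions. Here is a minimal Python sketch of that generalization, assuming every value is zero-based; the names `encode`, `decode`, and `nvec` are made up for illustration and are not part of FlexSim or any RL library.

```python
from itertools import product


def encode(values, nvec):
    """Flatten one value per sub-action into a single Discrete index.

    values[i] must lie in range(nvec[i]); both sequences have the same length.
    """
    index = 0
    for value, size in zip(values, nvec):
        if not 0 <= value < size:
            raise ValueError(f"value {value} is outside range(0, {size})")
        index = index * size + value
    return index


def decode(index, nvec):
    """Recover the per-sub-action values from a flattened Discrete index."""
    values = []
    for size in reversed(nvec):
        index, value = divmod(index, size)
        values.append(value)
    return list(reversed(values))


nvec = [10, 5]                # OP1 has 10 values, OP2 has 5
flat = encode([7, 4], nvec)   # 7 * 5 + 4 = 39
pair = decode(flat, nvec)     # [7, 4]

# Every (OP1, OP2) pair gets its own index, so the mapping is reversible.
all_indices = {encode(p, nvec) for p in product(range(10), range(5))}
```

With two sub-actions this reduces to the `Op1 * nOp2 + Op2` formula above, and `all_indices` covers exactly the indices 0 to 49.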

This makes it possible to use DQN, but to be honest, it's not the most efficient way. DQN treats each flattened action as an independent choice (50 separate actions in the example above), so it has no built-in way to tell that two indices share the same OP1 or OP2 value, which can make it harder for the model to recognize patterns. This often leads to longer training times and more complex learning.
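To get a feel for why this can be costly, it helps to count the actions. The sketch below (hypothetical helper names, not from any library) compares the number of flattened actions, which is the product of the sub-action sizes, with the number of choices when each sub-action is kept separate, which is their sum.

```python
import math


def flat_action_count(nvec):
    """Number of actions after flattening MultiDiscrete(nvec) into Discrete."""
    return math.prod(nvec)


def factored_choice_count(nvec):
    """Total number of choices when each sub-action keeps its own options."""
    return sum(nvec)


for nvec in ([10, 5], [10, 5, 4], [10, 10, 10, 10]):
    print(nvec, flat_action_count(nvec), factored_choice_count(nvec))
```

For OP1 with 10 values and OP2 with 5, that is 50 flattened actions against 15 separate choices, and the gap widens quickly as more parameters are added.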

Since PPO is already working well for you, I'd recommend sticking with it if possible. But if DQN is a must, this method will get the job done—it just might take more effort to train.

Hope this helps!

Willie commented · Feb 26 2025 at 12:06 PM

Hello @Nil Ns, thank you for your advice; this is very helpful for me.

But where should I put this code: in FlexSim, or in the training file in VS Code?

Nil Ns Willie commented · Feb 26 2025 at 12:18 PM

The best approach would be to modify the OnObservation event before passing it to the RL model. You can use the first piece of code there.

Then, after the RL model returns an action (a single Discrete value), you'd need to convert it back to MultiDiscrete format in the OnRequestAction event. That's where you can apply the second piece of code, after the original OnRequestAction code.
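As an alternative to editing the FlexSim events, the decoding could in principle be done on the Python side, in the training file. The following is only a sketch under assumptions: `FlattenedActionEnv` and `RecordingEnv` are hypothetical names, and the wrapped object is assumed to have a `step()` method that accepts a list with one zero-based value per sub-action. It is not the FlexSim environment class itself; in a Gym-style script the same logic would normally live in an action wrapper.

```python
class FlattenedActionEnv:
    """Expose a MultiDiscrete-style environment through one Discrete index.

    Hypothetical wrapper: `env` is any object whose step() accepts a list
    with one value per sub-action; `nvec` gives the size of each sub-action.
    """

    def __init__(self, env, nvec):
        self.env = env
        self.nvec = list(nvec)
        self.n_actions = 1
        for size in self.nvec:
            self.n_actions *= size

    def decode(self, index):
        """Turn a flat index back into one value per sub-action."""
        if not 0 <= index < self.n_actions:
            raise ValueError(f"index {index} is outside range(0, {self.n_actions})")
        values = []
        for size in reversed(self.nvec):
            index, value = divmod(index, size)
            values.append(value)
        return values[::-1]

    def step(self, index):
        """Decode the Discrete action, then forward it to the wrapped env."""
        return self.env.step(self.decode(index))


class RecordingEnv:
    """Stand-in environment used only to show what the wrapper forwards."""

    def __init__(self):
        self.received = []

    def step(self, action):
        self.received.append(action)
        return action


inner = RecordingEnv()
wrapped = FlattenedActionEnv(inner, nvec=[10, 5])
wrapped.step(39)  # the inner environment receives [7, 4]
```

With this arrangement the DQN model would see `n_actions` (50 here) discrete actions, while the simulation keeps receiving OP1 and OP2 as separate values.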
