Question

Arthur Ml asked · Arthur Ml edited

Reinforcement Learning Mixed Observation Space - Continuous and Integer Values

Hello everyone,

I want to use Reinforcement Learning to optimize a production schedule. My observation space contains parameters of both the "continuous" and the "integer" type. When configuring the observation space in the RL tool, it is not possible to select parameters of both types.

[Screenshot attachment: mixed-observation-space-not-possible.jpg]

Is there any way to include parameters of mixed types? I also do not want to change my integer parameters to continuous, because then information is lost: the model will no longer convert the integer parameters to one-hot-encoded vectors.
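
To illustrate what I mean, here is a minimal standalone sketch (plain NumPy, the parameter and its value are made up):

    import numpy as np

    # an integer parameter with 13 possible values (made-up example)
    n_values = 13
    value = 4

    # kept as an integer/discrete observation, an RL framework can one-hot encode it:
    # every category gets its own dimension and no artificial ordering is imposed
    one_hot = np.eye(n_values, dtype=np.float32)[value]
    print(one_hot)      # [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]

    # converted to a continuous parameter, the same value is just a scalar,
    # and the categorical structure is gone
    continuous = np.float32(value)
    print(continuous)   # 4.0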

Can anyone help me out here? Thanks in advance.

FlexSim 22.2.1
reinforcement learning

1 Answer

Arthur Ml answered · Arthur Ml edited

I found a workaround. In case somebody needs it in the future, this is what I did:

1. I put every observation parameter into a single table as continuous values.

2. In my env.py, I split the observation into continuous and discrete values. The discrete values become a NumPy array with dtype int. Afterwards, the continuous and the discrete part are put together in a dictionary. Here is the small method that does this:

    def _convert_and_normalize_observations(self, state):
        # split the flat observation into its continuous and discrete part;
        # the index lists select the parameters by their position in the table
        state_cont = np.array(state)[[0, 2, 3, 5, 6, 8, 9, 11]]
        state_discrete = np.array(state, dtype=int)[[1, 4, 7, 10]]

        # normalize the continuous part (self.state_norm holds the scaling
        # factors) and clip it to the [0, 1] range of the Box space
        state_cont = state_cont / self.state_norm
        state_cont = state_cont.clip(0, 1)
        return {'state_cont': state_cont, 'state_discrete': state_discrete}

3. The observation space must be a dictionary (gym.spaces.Dict) containing all the space types you use in the observation. In my case, this is the following definition (a small end-to-end sketch of how both parts fit together follows below):

    self.observation_space = gym.spaces.Dict({
        "state_cont": gym.spaces.Box(0, 1, shape=(8, )),
        "state_discrete": gym.spaces.MultiDiscrete((13, 13, 13, 13)),
    })
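
For reference, here is a minimal end-to-end sketch of how the two pieces fit together. Only the index lists and the space definition come from the steps above; the raw state values and the normalization constants are made up for illustration:

    import gym
    import numpy as np

    # made-up 12-entry raw state as it would come out of the observation table
    raw_state = [3.2, 5, 7.1, 0.4, 12, 9.9, 2.0, 0, 8.3, 1.1, 6, 4.7]
    state_norm = np.array([10.0] * 8, dtype=np.float32)  # assumed scaling factors

    observation_space = gym.spaces.Dict({
        "state_cont": gym.spaces.Box(0, 1, shape=(8, )),
        "state_discrete": gym.spaces.MultiDiscrete((13, 13, 13, 13)),
    })

    # same splitting and normalization as in the method from step 2
    state_cont = (np.array(raw_state)[[0, 2, 3, 5, 6, 8, 9, 11]] / state_norm).clip(0, 1)
    state_discrete = np.array(raw_state, dtype=int)[[1, 4, 7, 10]]
    obs = {"state_cont": state_cont.astype(np.float32), "state_discrete": state_discrete}

    # the dictionary returned by the env has to be a valid sample of the Dict space
    print(observation_space.contains(obs))  # should print True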

This worked for me and needed only a little coding. This way, my reinforcement learning framework (I use Ray RLlib) converts the discrete values in my observation vector to one-hot-encoded vectors, which is exactly what I wanted.
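
As a quick sanity check of that behaviour (using gym's own flattening utilities, not RLlib internals), the flattened size of the Dict space should come out to 8 + 4 × 13 = 60, confirming the four discrete entries are expanded to one-hot vectors rather than kept as four scalars:

    import gym

    space = gym.spaces.Dict({
        "state_cont": gym.spaces.Box(0, 1, shape=(8, )),
        "state_discrete": gym.spaces.MultiDiscrete((13, 13, 13, 13)),
    })

    # the Box part stays 8-dimensional, each MultiDiscrete entry becomes a
    # 13-dimensional one-hot vector: 8 + 4 * 13 = 60 flat dimensions in total
    print(gym.spaces.flatdim(space))                        # 60
    print(gym.spaces.flatten(space, space.sample()).shape)  # (60,)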

