Question

Gabri asked · Jason Lightfoot commented

Pull strategy over reinforcement learning

Hi,

I'm working on a reinforcement learning project based on the reinforcement learning tutorial.

In the model I added a second source and a second queue, so I have one source that generates 5 types of pallets and another that generates 5 types of people. I want to apply reinforcement learning to maximize the occupancy of the pallets (25 places) by pulling from the second queue people of the same type as the pallet that is being processed.

As in the reinforcement learning tutorial, I wrote an input pull strategy in the processor, because without it the provided Python code doesn't work. In the pull strategy I pull the type that has more people in the queue than the others. The problem is that I think the model doesn't learn; it only follows the pull strategy. I want the model to learn this behavior on its own, not to be told what to do. Can you help me with this?

I probably also have problems with the parameter settings in the observation space.

I've attached the model:

ChangeoverTimesRL_company.fsm

FlexSim 22.0.16
Tags: reinforcement learning · pull strategy · observation space

1 Answer

Felix Möhlmann answered · Felix Möhlmann commented

You'd just use the same logic from the tutorial model: the RL algorithm sets an 'action' parameter which controls what type is pulled.
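To make that concrete, here is a minimal sketch of the idea in Python. The actual pull strategy in FlexSim is FlexScript, and the parameter name "Action" is an assumption based on the tutorial; the point is only that the agent writes its chosen type into a model parameter and the pull strategy accepts items of that type, rather than deciding the type itself:

```python
from dataclasses import dataclass

@dataclass
class Item:
    type: int  # item type, 1-5 in this model

def pull_requirement(item: Item, model_params: dict) -> bool:
    """Accept an item only if its type matches the agent's chosen action."""
    return item.type == model_params["Action"]

# Example: the RL step wrote action 3 into the (hypothetical) parameter table
print(pull_requirement(Item(type=3), {"Action": 3}))  # True
```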

I do want to give one tip about the code you wrote. You put the for-loop that determines the most common type of people inside the while-loop that runs through all available items. This means the most common type is recomputed for every item, which is unnecessary and can cost noticeable performance when running the model as fast as possible. Instead, determine the "itemTypeValue" variable once, before the while-loop.
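Schematically, the fix looks like this (a Python sketch with placeholder names, since the actual code in your model is FlexScript):

```python
from collections import Counter

def most_common_type(people_types: list) -> int:
    """Return the type value that occurs most often in the people queue."""
    return Counter(people_types).most_common(1)[0][0]

def choose_item(available_items: list, people_types: list):
    # Hoisted out of the loop: the most common people type does not change
    # while we scan the available items, so compute it once up front...
    item_type_value = most_common_type(people_types)
    # ...instead of recomputing it inside the loop for every item.
    for item_type in available_items:
        if item_type == item_type_value:
            return item_type
    return None

print(choose_item([2, 4, 1], [1, 4, 4, 2]))  # 4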

And some notes based on my limited experience with reinforcement learning: you will get faster results by eliminating superfluous information from the observations. For example, the RL algorithm doesn't really need to know how many pallets of each type there are, just whether at least one is available to be pulled.
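As an illustration, assuming a Gym-style observation space like the one the tutorial's Python script works with (the space definitions below are illustrative, not the tutorial's exact code), five availability flags give the algorithm a far smaller state space than five counts:

```python
import numpy as np
from gymnasium import spaces  # older setups: from gym import spaces

# Per-type counts: 26 possible values per type -> 26**5 distinct observations
counts_space = spaces.Box(low=0, high=25, shape=(5,), dtype=np.int32)

# Per-type availability flags: 2 values per type -> 2**5 distinct observations
flags_space = spaces.MultiBinary(5)

def to_flags(counts):
    """Collapse per-type counts into 'at least one available' flags."""
    return (np.asarray(counts) > 0).astype(np.int8)

print(to_flags([0, 3, 0, 1, 25]))  # [0 1 0 1 1]
```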
