Phil BoBo posted · Salma F2 commented

Reinforcement Learning Using Previous Versions

FlexSim 2022 introduced a Reinforcement Learning tool that enables you to configure your model to be used as an environment for reinforcement learning algorithms.

That tool makes it easier to connect to FlexSim from a reinforcement learning algorithm, but it is not strictly necessary for this type of connectivity. The same socket communication protocols that the tool uses are generally available in FlexScript.
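The FlexScript side of that socket communication is model-specific, but the idea on the external (Python) side can be sketched with the standard library's socket module. Everything here is an assumption for illustration — the text commands (e.g. "reset"), the reply format, and the stub "model" standing in for a listening FlexSim instance are not the actual protocol used by the attached models:

```python
import socket
import threading

def start_stub_model(host="127.0.0.1", port=0, reply=b"0 0\n"):
    """Stand-in for a FlexSim model listening on a TCP socket.
    In a real setup, FlexScript code inside the model plays this role."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))   # port=0 lets the OS pick a free port
    srv.listen(1)

    def handle():
        conn, _ = srv.accept()
        conn.recv(1024)      # read the command text (e.g. "reset")
        conn.sendall(reply)  # reply with an observation string
        conn.close()
        srv.close()

    threading.Thread(target=handle).start()
    return srv.getsockname()[1]  # the port that was actually chosen

def send_command(command, port, host="127.0.0.1"):
    """Send one text command to the model and return its text reply."""
    with socket.create_connection((host, port)) as s:
        s.sendall(command.encode())
        return s.recv(1024).decode()

port = start_stub_model()
print(send_command("reset", port).strip())  # prints "0 0"
```

The same request/reply pattern generalizes to any command vocabulary you define on the FlexScript side, which is why this kind of connectivity works for purposes well beyond reinforcement learning.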

Attached (ChangeoverTimesRL_V22.0.fsm) is the FlexSim 2022 model that you build as part of the Using Reinforcement Learning documentation. That documentation walks you through building and preparing a FlexSim model for reinforcement learning, training an agent within that model environment, evaluating the performance of the trained reinforcement learning model, and using that trained model in a real production environment.


Also attached (ChangeoverTimesRL_V6.0.fsm) is a model built with FlexSim 6.0.2 from 2012 that does the exact same thing, but with custom FlexScript user commands instead of the Reinforcement Learning tool. You can use this model with the example Python scripts and FlexSim 6.0.2 in the same way that you can use the other model with those same scripts in FlexSim 2022.


I'm providing this FlexSim 6 model as an example that demonstrates how you can communicate between FlexSim and other programs. The Reinforcement Learning tool certainly makes this type of communication easier and simpler, with a nice UI for specifying RL-specific parameters, but the fundamental principles of how this works have been available in FlexSim for many years using FlexScript.

Hopefully this example can help teach and inspire those who wish to control or communicate with FlexSim from external sources for purposes other than just reinforcement learning. FlexSim is flexible, and the possibilities are endless.

reinforcement learning · artificial intelligence


Salma F2 commented:

Thank you, it is very interesting. But what I still do not understand is why we have to train a model to find the optimal solution. I was used to finding optimal solutions by coupling a simulation model with OptQuest, so I cannot see what reinforcement learning gives me beyond this classic approach of simulation-based optimization.

Jordan Johnson commented:

Reinforcement Learning and Optimization are certainly related, but they each fulfill different roles. The Optimizer assumes that you'll set the parameter values once, before running the model. Then it measures the performance of the model when the model finishes running. It repeats that process over and over, trying different parameter values, looking for the best set of values. When it's done, you'll get an answer to your original question. You then apply that answer to the system, usually by changing the design (buying more equipment, hiring more employees, etc.). This makes the Optimizer especially helpful during the planning phase.

In contrast, the purpose of RL is to produce a brain (an AI model) that can answer questions, rather than just producing a single answer. The end goal is to use that brain in the real system, to replace whatever decision-maker is already there. For example, in a warehouse, you might need to choose where to place each SKU. Or in a job shop, you might have a list of jobs to do, and need to decide which job to start next. In the real system, those decisions are being made dozens or hundreds of times a day, often by a human, following their best guess. While it is good to make data-driven decisions, it would be difficult to run the Optimizer fast enough to answer those questions.

Instead, you can use FlexSim as a playground to train an AI model. The simulation model runs with historical data, and asks the AI what to do. The AI tries different actions, eventually learning how to take good actions. Then, you'd deploy the trained AI to a real system. It would then be used operationally, and your system would see the benefit of consistently making better decisions.
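That training loop can be illustrated with a toy tabular Q-learning sketch in Python. The `env` object here stands in for the running simulation model: `reset()` returns a starting state, and `step(action)` returns the next state, a reward, and a done flag. This is a generic RL sketch under those assumed method names, not the actual scripts or protocol shipped with the attached models:

```python
import random

def train(env, episodes=200, n_actions=2, alpha=0.5, gamma=0.9, epsilon=0.2):
    """Tabular Q-learning: learn a value for each (state, action) pair
    by repeatedly running episodes against the environment."""
    q = {}  # (state, action) -> estimated value
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # epsilon-greedy: mostly exploit, sometimes explore
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = max(range(n_actions),
                             key=lambda a: q.get((state, a), 0.0))
            next_state, reward, done = env.step(action)
            best_next = 0.0 if done else max(q.get((next_state, a), 0.0)
                                             for a in range(n_actions))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
    return q

class ToyEnv:
    """One-state, one-step environment: action 1 pays off, action 0 does not."""
    def reset(self):
        return 0
    def step(self, action):
        return 0, float(action == 1), True

random.seed(0)
q = train(ToyEnv())
assert q[(0, 1)] > q.get((0, 0), 0.0)  # the agent learned to prefer action 1
```

In a FlexSim setup, the environment's `reset` and `step` would be backed by socket messages to the running model rather than an in-process Python object, but the loop's shape — observe, act, receive reward, update — is the same.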

So, to sum up, the Optimizer is a great tool that answers design questions, which come up infrequently. RL produces a brain that answers operational questions.

Salma F2 commented:

Thanks Jordan, it is very clear now.



