Capability of Trained Reinforcement Learning Brain

Steven Chen asked Dec 6, '22 Jeanette F commented Dec 12, '22

Hello,

I am wondering whether a trained Bonsai brain needs to be retrained when the observation parameters change.

In the MinSetupTime sample, I changed the number of box types from 5 to 6. Does the brain still accept these parameters, or is new training required? I tried it, but FlexSim could not connect to the brain running in Docker (connection timed out).

Another question: if the source's arrival plan changes, or the brain encounters cases it never saw during training, is it still able to choose good actions?

Software Version:

FlexSim 23.0.0

reinforcement learning bonsai

Jeanette F ♦♦ commented · Dec 12, 2022 at 04:08 PM

Hi @Steven Chen, was Jordan Johnson's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always unaccept and comment back to reopen your question.

Answer 1 · 2022-12-07T15:58:36Z

Jordan Johnson answered Dec 7, '22 Jordan Johnson edited Dec 7, '22

In the MinSetupTime example, changing the number of types is a breaking change; you'll need to train a new brain. A brain trained with 5 types cannot choose type 6, because the Inkling file does not enumerate that type in the SimAction section.
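To make the point concrete, here is a minimal Python sketch of why a fixed, enumerated action set cannot produce a value it was never given. This is not how Bonsai works internally; `TRAINED_ITEM_TYPES`, `choose_type`, and `reachable_actions` are hypothetical names, and the "one score per trained type" policy is an assumption made purely for illustration.

```python
# Hypothetical illustration only; not Bonsai's actual internals.
# Assumed: the action values enumerated when the brain was trained.
TRAINED_ITEM_TYPES = [1, 2, 3, 4, 5]


def choose_type(scores):
    """Return the trained item type with the highest score.

    The policy is assumed to produce exactly one score per trained type.
    """
    if len(scores) != len(TRAINED_ITEM_TYPES):
        raise ValueError("expected one score per trained item type")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return TRAINED_ITEM_TYPES[best_index]


def reachable_actions():
    """Return every action this policy can ever emit."""
    return set(TRAINED_ITEM_TYPES)
```

Whatever scores the policy produces, `choose_type` can only return a value from the trained list, so type 6 is unreachable until a brain is trained with it included.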

As for general advice, here's a basic summary of what I understand:

If you change the "shape" of the observation space or action space by adding or removing parameters, you'll need to train a new brain. For enumerated values like those in the MinSetupTime example, changing the set of possible values also requires a new brain.
It is unknown what a brain will do if it sees observation data that it never encountered in training. A well-trained brain can still perform very well when the new cases are close to what it saw during training, particularly if training exposed it to a lot of randomness. But the further the new system drifts from the training system, the less likely good performance is.
It is always better to expose the brain to as many realistic situations as possible during training.
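One rough way to notice when a deployed model is drifting away from the training system is to record the range of each observation during training and flag values that fall outside it. The sketch below assumes observations are plain dictionaries of numbers; the function names and the `queue_length` / `last_type` keys are hypothetical and not taken from the MinSetupTime sample. A range check is only a crude proxy: staying inside the ranges does not guarantee good actions.

```python
def training_ranges(samples):
    """Compute (min, max) per observation name from training samples.

    samples: list of dicts mapping observation name to a number.
    """
    ranges = {}
    for sample in samples:
        for name, value in sample.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def out_of_range(observation, ranges):
    """Return sorted names that are unknown or outside the training range."""
    flagged = []
    for name, value in observation.items():
        if name not in ranges:
            flagged.append(name)  # parameter never seen in training
            continue
        low, high = ranges[name]
        if not low <= value <= high:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical usage: two recorded training observations, one new observation.
seen = training_ranges([
    {"queue_length": 2, "last_type": 1},
    {"queue_length": 8, "last_type": 5},
])
suspicious = out_of_range({"queue_length": 5, "last_type": 6}, seen)
```

Here `suspicious` names the observations worth a closer look before trusting the brain's action, which is the situation described above when a sixth type or a new arrival plan appears.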
