
Noah Z asked

Experimenter not making it through all replications

I have a model that I set to run 50 replications in the experimenter. It gets through 17-20 replications on the single scenario I am running and then the remaining replications never start.

Unfortunately I can't upload the model to this public forum or send it as a private question due to its sensitive nature, but I'm wondering whether the community has any ideas about what might be going on and what might be done to mitigate it. I do realize that not having the model to look at significantly reduces the chances of this getting figured out here.

My hunch is that the replications the experimenter runs in the background aren't terminating properly: when I reset the experimenter and try to re-run the experiment, no replications begin at all. I have to restart my computer to get any replications working again, and then the same thing happens (i.e. it stops at ~17 or so replications).

Any ideas?

FlexSim 20.0.0
Tags: experimenter, flexsim 20.0.0

Jordan Johnson answered

@jason.lightfoot's answer is an excellent guide to troubleshooting the experimenter. Another possibility is that you might be experiencing a bug with using the "restore original state" checkbox, which we fixed in version 20.0.7. If you upgrade to version 20.0.9 (the latest bug fix at time of writing), do you still experience the issue?


Jason Lightfoot answered

The replications are likely crashing and therefore not terminating correctly, as you guessed. If you start with 8 cores running, then as each replication fails you have fewer cores picking up jobs. You only need 8 to fail and the experiment will stop.
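To make that arithmetic concrete, here is a minimal Python sketch of the idea. It is not how FlexSim schedules replications internally; it assumes replications are handed out in order and that a crashed replication permanently ties up the core that picked it up. The function name and the particular set of failing replication numbers are made up for illustration.

```python
def run_experiment(num_replications, num_workers, crashing):
    """Toy model: each crashing replication permanently removes one worker."""
    completed, stuck, never_started = [], [], []
    alive = num_workers
    for rep in range(1, num_replications + 1):
        if alive == 0:
            # No worker is left to pick up this job.
            never_started.append(rep)
        elif rep in crashing:
            # The worker hangs on this replication and never returns.
            stuck.append(rep)
            alive -= 1
        else:
            completed.append(rep)
    return completed, stuck, never_started


# Hypothetical example: 50 replications, 8 cores, 8 replications that crash.
completed, stuck, never_started = run_experiment(
    50, 8, crashing={3, 7, 10, 12, 15, 19, 22, 25}
)
print(len(completed), len(stuck), len(never_started))  # prints "17 8 25"
```

With these made-up numbers the experiment finishes 17 replications and the other 25 never start, which is the same shape as the symptom described in the question.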

Can you first check that the model is repeatable, so that, say, replication 1 always gets the same result? Then check that you can run that replication interactively and get the same result. If you can, we should be able to diagnose each failing replication. If not, we need to diagnose why the model isn't repeatable.

In case you've not seen the option to select the replication number to run interactively, it's on the experimenter's Advanced tab.

The next step is to interactively run the replications that fail and look for problems that cause them to crash. I'd start with the replication that crashes earliest, i.e. the one with the shortest green line on the progress indicator.

I'd also try ending all running FlexSim instances using the Task Manager to see whether that allows you to start a new experiment without rebooting your machine. If you add the 'Command line' column to the Task Manager's Details view, you may be able to see which, if any, are child processes that have got stuck.
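If you'd rather script that check than click through Task Manager, something like the following Python sketch could list the FlexSim processes with their command lines. It is a rough sketch under a few assumptions: it runs on Windows with PowerShell available, and the process name `flexsim.exe` is my assumption, so adjust it to whatever Task Manager shows on your machine. The helper names are made up for illustration.

```python
import json
import subprocess

PROCESS_NAME = "flexsim.exe"  # assumption: check the actual name in Task Manager


def parse_process_json(text):
    """Turn ConvertTo-Json output into a list of (pid, command line) tuples."""
    text = text.strip()
    if not text:
        return []  # PowerShell prints nothing when no process matches
    data = json.loads(text)
    if isinstance(data, dict):
        data = [data]  # a single match is emitted as one object, not a list
    return [(p["ProcessId"], p.get("CommandLine") or "") for p in data]


def list_flexsim_processes():
    """Ask PowerShell for every matching process and its command line."""
    command = (
        "Get-CimInstance Win32_Process "
        f"| Where-Object Name -eq '{PROCESS_NAME}' "
        "| Select-Object ProcessId, CommandLine "
        "| ConvertTo-Json"
    )
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", command],
        capture_output=True, text=True, check=True,
    )
    return parse_process_json(result.stdout)
```

Calling `list_flexsim_processes()` after a stalled experiment and printing each tuple should show which instances are still around. Comparing their command lines may help tell the main application apart from any leftover child processes.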

Please come back if you want more help with each step in this process and we'll see what we can do to guide you through it.
