pc neo asked

Experimenter with FloWorks: results not consistent

sample_opt_27Jul_InputSeq_reduceInput_try.fsm

Attached is a model designed to change the material type stored in a silo whenever the silo goes empty and the original material type is exhausted. A list of parameters indicates which material type to change to when this situation arises. On reset, the parameter values are written into a global table named "SiloAllocation".


Through manual runs, I found one set of material type changes for the silos that runs successfully, that is, the stop time is less than 16301 and FinishedFlag is 1.


To find all possible combinations of material type changes, I set up range-based experiments. However, the experimenter sometimes reports a combination that I know succeeds as a failure. In the tests below, I purposely included the combination from my manual run.


a) Setup with only 2 tasks


When I set up a range-based experiment with only 2 tasks, the experimenter result is the same as when I run the model manually. My combination is Scenario 1, and it succeeds.


The result can be seen in xx_try1_1_1.sqlite (in the attached zip file). It corresponds to the "Range Based1_1_1" job (2 total tasks).


b) Setup with over 300 tasks

Here the experimenter result fails to flag my manual combination with a FinishedFlag value of 1 (refer to scenario 316). Instead, the experimenter result shows a FinishedFlag of 0 for the combination that matches my manual run.


I then replicated this scenario from the experimenter manually, and the manual run returns FinishedFlag 1.


The result can be seen in xx_try1_1.sqlite (in the attached zip file). It corresponds to the "Range Based1_1" job (324 total tasks).
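
For anyone checking a larger job the same way, the comparison described above can be sketched in Python. This is only a minimal illustration: the function name `find_mismatches` and the two dictionaries are hypothetical, and the values would have to be copied by hand from the experimenter results and from manual runs. Only the scenario 316 entry reflects what is reported in this post; the scenario 1 entry is a made-up placeholder.

```python
def find_mismatches(experimenter_flags, manual_flags):
    """Return scenarios whose FinishedFlag differs between the two sources.

    Both arguments are hypothetical dicts mapping scenario number -> FinishedFlag.
    The result maps scenario -> (experimenter value, manual value).
    """
    mismatches = {}
    for scenario, manual_flag in manual_flags.items():
        exp_flag = experimenter_flags.get(scenario)
        if exp_flag != manual_flag:
            mismatches[scenario] = (exp_flag, manual_flag)
    return mismatches


# Illustrative values only. Scenario 316 is the case described above
# (experimenter reports 0, manual run gives 1); scenario 1 is a placeholder.
experimenter_flags = {1: 0, 316: 0}
manual_flags = {1: 0, 316: 1}
print(find_mismatches(experimenter_flags, manual_flags))
```

Listing the mismatching scenarios this way makes it easier to see whether the failures start at a particular task number.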


Question: Is there any advice on what is causing this inconsistency?

FlexSim 23.1.3
experimenter, FloWorks

Jeanette F ♦♦ commented ·

Hello @Mischa Spelt, Could you take a look at this FloWorks question?

Mischa Spelt replied to Jeanette F ♦♦ ·

Thanks for tagging me @Jeanette F . I looked into the model and see some things I cannot explain.

1. I tried running the experimenter using the "Restore original state" checkbox and still get the same result as @pc neo . Does the range-based run also respect this setting?

2. I then tried running the same scenario through the experimenter. When I run around 36 scenarios both as Experiment and as Range-Based job, I still get different results. For example, here is PC Neo's original "scenario 316" :

1691480673366.png

The column on the left is the result of the range-based run, the column on the right is the experimenter result; the latter actually matches the output I get when I manually run the scenario.

3. I tried running all 300+ scenarios through the experimenter, but that keeps locking up. At one point I saw messages about the .sqlite file being locked and a transaction starting within a transaction, but FlexSim froze and closed before I could copy the errors from the console.


Perhaps you can have a look at these issues and let me know how I can properly identify the difference between a scenario run and a range-based run.

Jordan Johnson ♦♦ replied to Mischa Spelt ·
@Mischa Spelt The setting for "restore original" is set on the Experimenter, not on the job, so yes it works with Range-Based jobs. Also, the original state is restored if a child process changes scenarios. Since there's only one replication in this case, the original state is being restored every time anyways, so the checkbox doesn't really matter here.
Julie Weller commented ·

Hi @pc neo, was Jordan Johnson's answer helpful? If so, please click the "Accept" button at the bottom of their answer. Or if you still have questions, add a comment and we'll continue the conversation.

If we haven't heard back from you within 3 business days we'll auto-accept an answer, but you can always comment back to reopen your question.

Jordan Johnson answered

This does seem very strange. I am investigating this issue.

When I run the job Range Based1_1 (the one with 324 tasks), it looks like many of the scenarios at the start throw exceptions. For me, scenarios 8, 10, 12, 26, and many more have this message in the system console:

time: 4172.165277 exception: FlexScript exception: Invalid row number: 3 in Global Table "SiloAllocation" at MODEL:/Tools/ProcessFlow/ProcessFlow/Compute SupplyRowNo and availOrRequireSpace>variables/codeNode
time: 4172.165277 exception: FlexScript exception: label availOrRequireSpace doesn't exist on token id: 23991 at MODEL:/Tools/ProcessFlow/ProcessFlow/Go where?~2>variables/decision
time: 4172.165277 exception: FlexScript exception: Invalid row number: 3 in Global Table "SiloAllocation" at MODEL:/Tools/ProcessFlow/ProcessFlow/Compute SupplyRowNo and availOrRequireSpace~2>variables/codeNode
time: 4172.165277 exception: FlexScript exception: label availOrRequireSpace doesn't exist on token id: 23991 at MODEL:/Tools/ProcessFlow/ProcessFlow/Go where?~4>variables/decision
time: 4172.165277 exception: FlexScript exception: label ChangeArrivalDate doesn't exist on token id: 23991 at MODEL:/Tools/ProcessFlow/ProcessFlow/increase TimesOfChange on silo if needed>variables/codeNode
time: 4172.165277 exception: FlexScript exception: label SupplyRowNo doesn't exist on token id: 23991 at MODEL:/Tools/ProcessFlow/ProcessFlow/Assign PumpObj,SupplyTankName,ConsumeTankName~2>labels/2/2
time: 4172.165277 exception: Exception caught in start() of activity Flow from silo2xx to silo3xx/Supplying/Set flow trigger in process flow "ProcessFlow". Continuing throw...
time: 4172.165277 exception: Exception caught in Executive::processeventinlist().

I agree that it is strange that running the model interactively gives a different result than running it in a range-based job. My current theory is that this particular exception somehow corrupts something in FlexSim, and that the corruption is not cleared even by restoring the original tree. But I suggest fixing these exceptions first. We can continue our investigation at that point.
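
As a generic illustration of how an "Invalid row number" exception can be avoided, the sketch below checks the row bounds before reading. It is written in Python rather than FlexScript, the list is only a stand-in for the "SiloAllocation" global table, and the function name `lookup_allocation` and the table contents are hypothetical. The right fix in the actual model may be different, for example making sure the lookup is never reached once the run should have ended.

```python
def lookup_allocation(table, row_number):
    """Return the 1-based row of `table`, or None if that row does not exist."""
    if row_number < 1 or row_number > len(table):
        return None
    return table[row_number - 1]


# Stand-in two-row table; the contents are hypothetical.
silo_allocation = [["A"], ["B"]]

row = lookup_allocation(silo_allocation, 3)
if row is None:
    # Skip the dependent steps instead of letting later code fail on missing data.
    print("row 3 does not exist, skipping")
```

Returning early when the row is missing also prevents the follow-on errors about labels that were never assigned, since those steps depend on the lookup having succeeded.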


Patrick Zweekhorst commented ·
Hi @pc neo ,

With all the great help from @Jordan Johnson, we were able to find the problem. The issue will be fixed in the next FloWorks release, but I can send you a dll file now if you want to test and use it before then. If so, just let me know. Our apologies that it took so long to fix the issue.

pc neo replied to Patrick Zweekhorst ·
Hi @Patrick Zweekhorst, thank you for looking into this issue. I'd like to get the dll from you and test it.
Patrick Zweekhorst replied to pc neo ·
Just as an update: the dll has been provided via email.
Mischa Spelt commented ·
Based on my quick analysis yesterday, those exceptions occur mostly after the model should actually have stopped. Did you replace the stop() command with endreplication(), as per my earlier comment? Or could the problem be that the Range Based job runner does not end the replication when those commands get called?
pc neo replied to Mischa Spelt ·

@Mischa Spelt Thank you for looking into this. I'll work on the exceptions and replace stop() with endreplication(). I'll post an update on the progress.

pc neo answered

sample_opt_27Jul_InputSeq_reduceInput_try_edit.fsm

@Mischa Spelt @Jordan Johnson I've fixed the exceptions and updated the model to call endreplication(1) when running an experiment and stop() when running manually. I then ran the range-based experiments again.

sample_opt_27Jul_InputSeq_reduceInput_try_edit1_1.zip


Below are my observations:

a) Range-based experiment with 2 total tasks

The results collected by the experimenter are consistent with the results of the manual run.

The result can be seen in xx_edit1_1_1.sqlite (in the attached zip file).


b) Range-based experiment with 36 total tasks

The results collected from task 17 onwards are all "negative", i.e. FinishedFlag is 0.


This is not correct: I know that scenario 21 should give a "positive" result, i.e. FinishedFlag should be 1, because that is the result I get when the model is run manually.

The result can be seen in xx_edit1_1_2.sqlite (in the attached zip file).


c) Range-based experiment with 136 total tasks

The observation is similar to (b) above.

The result can be seen in xx_edit1_1.sqlite (in the attached zip file).


I hope to get some advice on what is causing the issue.

Thank you.



Jordan Johnson ♦♦ commented ·

I'm investigating this strange behavior. I have updated the model to use a basic experiment job, not a range based job, and I still get the issue. Specifically, if I run the basic experiment job using only 1 core, the first replication "finishes" and all others fail. This does not match the interactive results:

1692810433786.png

However, if I run using 7 or more cores, the results are different, and the results match the interactive results:

1692810494114.png

Here is my updated model with the experiment I made:

sample-opt-27jul-inputseq-reduceinput-try-edit_2.fsm

It appears that there is some state somewhere in FlexSim that is being changed during the model run, but that is not reset by restoring the original state or by resetting the model in a child process.
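
To illustrate the kind of behavior I mean, here is a small Python analogy. It is not how FlexSim works internally: all names are hypothetical, and it assumes that with a single core every replication runs one after another in the same child process, while with enough cores each replication runs in a freshly started one.

```python
# State that lives for the whole process and is NOT touched by the per-run reset.
_hidden_state = {"runs": 0}


def run_replication():
    """Run one replication; it only "finishes" if the hidden state is still clean."""
    finished = _hidden_state["runs"] == 0
    _hidden_state["runs"] += 1  # the run changes the hidden state as a side effect
    return finished


def run_sequentially(n):
    """One worker process runs all n replications back to back."""
    _hidden_state["runs"] = 0  # simulate a freshly started worker process
    return [run_replication() for _ in range(n)]


def run_in_fresh_workers(n):
    """Each replication gets its own freshly started worker process."""
    results = []
    for _ in range(n):
        _hidden_state["runs"] = 0  # a new process starts with clean hidden state
        results.append(run_replication())
    return results


print(run_sequentially(3))      # first run finishes, later runs fail
print(run_in_fresh_workers(3))  # every run finishes
```

In this analogy the sequential case reproduces the pattern from the 1-core run (only the first replication finishes), while the fresh-worker case matches the interactive results.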

@Mischa Spelt (just keeping you in the loop)

pc neo replied to Jordan Johnson ♦♦ ·
@Jordan Johnson Thank you. I hope to get more input from @Mischa Spelt, as I am not sure what needs to be modified in the model to get it to work.