Lou Keller asked

Box Plot Calculations

A model produced the following displays for a single 24-hr box plot and two 12-hr box plots taken from the same data...

Note that the "Mean" value of the One-box plot = 26.66035, while the average of the Two-box plots = 29.669.

Questions: Why are the two means different? How are the different means calculated from existing data?

FlexSim 16.1.0
FlexSim HC 5.0.12
healthcare, box plots, mean
boxplots.jpg (79.9 KiB)

1 Answer

Matthew Gillespie answered

The average of averages is only guaranteed to match the average of the whole when every group contains the same number of observations. Otherwise the two usually differ.

For example:

5 patients of Acuity 1 arrive in the first 12 hours. Average Acuity: 1

1 patient of Acuity 5 arrives in the last 12 hours. Average Acuity: 5

Average of Averages: (1 + 5) / 2 = 3

Total Average: (5 × 1 + 1 × 5) / 6 ≈ 1.667

I'd have to look at the data to give you a more specific answer.
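
To make the acuity example concrete, here is a minimal Python sketch of the same arithmetic. It is not FlexSim code; the names `first_half`, `second_half`, and `mean` are made up for illustration. It also shows that weighting each 12-hour average by its number of observations recovers the total average.

```python
def mean(values):
    """Plain arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Acuity values from the example above (illustrative data only).
first_half = [1, 1, 1, 1, 1]   # 5 patients of Acuity 1 in the first 12 hours
second_half = [5]              # 1 patient of Acuity 5 in the last 12 hours

# Unweighted: each 12-hour period counts equally, regardless of its size.
average_of_averages = mean([mean(first_half), mean(second_half)])

# Pooled: every patient counts equally.
total_average = mean(first_half + second_half)

# Weighting each period's mean by its observation count gives the pooled mean.
weighted_average = (
    len(first_half) * mean(first_half) + len(second_half) * mean(second_half)
) / (len(first_half) + len(second_half))

print(average_of_averages)          # 3.0
print(round(total_average, 3))      # 1.667
print(round(weighted_average, 3))   # 1.667
```

The two results agree only when the groups are the same size (or happen to have the same mean), which is why a 24-hr plot and two 12-hr plots of the same data can report different means.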


Jason Lightfoot ♦♦ commented

What is this a statistic of, and is N the number of observations? If so, there could be something wrong, as the observations have quadrupled in the two-box-plot picture (they would normally still add up to 18).

Lou Keller commented

@Matthew Gillespie

Yes, I'm aware of what you wrote. Forgetting about the actual numbers themselves and using the example I described, what is the mathematical process (e.g. formula, calculation, algorithm, etc.) the program goes through to calculate the mean(s)?

Matthew Gillespie ♦♦ commented (in reply to Lou Keller)

I spent a long time yesterday looking into this and here's what I found:

Any time an object changes state, a new entry is added to the State History bundle, recording the scenario, replication, time, duration, old state, new state, etc. The dashboard then goes through each entry in the bundle and checks whether it corresponds to your sample set. If it does, the duration of the state is recorded in a new bundle. This bundle keeps track of the total utilized time, the total idle time, and the observations of each period.

The state's duration is added to either the total idle or the total utilized time, depending on how you classified the state. The percentage is then calculated and added as an observation or, if the observation already exists, its percentage is updated. If the state duration extends across multiple periods, it is broken up and the portion that falls in each period is added to that period.

When the time wraps around (you go back to a period), the total utilized and idle times are cleared and a new observation is created. This is supposed to happen whenever it's a new day or a new scenario/replication, so you should have one observation per period for every day × replication × scenario.
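
As a rough illustration of the intended process described above, here is a small Python sketch. It is not the dashboard's actual code. The entry layout `(start, duration, utilized)`, the function names, the 24-hour day, and the formula utilized / (utilized + idle) for the percentage are all assumptions made for this example.

```python
from collections import defaultdict

DAY = 24.0  # assumed day length in hours

def split_by_period(start, duration, period_length):
    """Yield (day, period_index, portion) for a duration that may span periods."""
    end = start + duration
    t = start
    while t < end:
        day = int(t // DAY)
        period = int((t % DAY) // period_length)
        period_end = day * DAY + (period + 1) * period_length
        piece_end = min(end, period_end)
        yield day, period, piece_end - t
        t = piece_end

def utilization_observations(entries, period_length):
    """Return {(day, period): percent utilized}, one observation per period per day."""
    totals = defaultdict(lambda: [0.0, 0.0])  # (day, period) -> [utilized, idle]
    for start, duration, utilized in entries:
        for day, period, portion in split_by_period(start, duration, period_length):
            totals[(day, period)][0 if utilized else 1] += portion
    return {key: 100.0 * u / (u + i) for key, (u, i) in sorted(totals.items())}

# Hypothetical state history for one object over one day:
# utilized 0-9 h, idle 9-18 h (crosses the 12 h boundary), utilized 18-24 h.
entries = [(0.0, 9.0, True), (9.0, 9.0, False), (18.0, 6.0, True)]

print(utilization_observations(entries, 12.0))  # {(0, 0): 75.0, (0, 1): 50.0}
print(utilization_observations(entries, 24.0))  # {(0, 0): 62.5}
```

In this sketch each observation is keyed by day and period, so a duration that crosses a period boundary only adds time to existing observations and never creates an extra one. Scenarios and replications are left out for brevity.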

As @jason.lightfoot noted, the number of observations is incorrect when you have multiple periods. The issue I found is that the mechanism that clears the total times and makes a new observation is a little too trigger-happy. When you have a duration that crosses multiple periods, it updates each period. If you then have another duration that also crosses multiple periods, the algorithm goes back to the first period, assumes it's a new day, and so makes a new observation.
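
A toy Python illustration of how that kind of false positive can inflate the observation count follows. This is a hypothetical reconstruction, not the real dashboard logic: the "going back to an earlier period means a new day" rule and both function names are assumptions made purely for illustration.

```python
def count_observations_naive(pieces):
    """pieces: (day, period) pairs in processing order.

    Assumed wrap-around rule: revisiting an earlier period index is treated
    as the start of a new day, so fresh observations are created.
    """
    observations = 0
    seen = set()
    last_period = None
    for _day, period in pieces:
        if last_period is not None and period < last_period:
            seen.clear()  # "new day" detected, possibly wrongly
        if period not in seen:
            seen.add(period)
            observations += 1
        last_period = period
    return observations

def count_observations_by_day(pieces):
    """One observation per distinct (day, period) pair."""
    return len(set(pieces))

# Two durations on the same day, each crossing both 12-hour periods.
pieces = [(0, 0), (0, 1), (0, 0), (0, 1)]

print(count_observations_naive(pieces))   # 4
print(count_observations_by_day(pieces))  # 2
```

Under these assumptions the naive rule doubles the count for a single day, which is the same flavor of over-counting as the inflated N in the two-box plot.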

So it looks like AJ's code is mostly working, but it's getting false positives for when a new observation should be recorded. I'm working on a fix for this.

Lou Keller commented

Thanks, Matt. I very much appreciate the extra effort.
