Box Plot Calculations

Lou Keller asked Jul 04 2016 at 12:01 AM Matthew Gillespie commented Jul 06 2016 at 6:33 PM

A model produced the following displays for a single 24-hr box plot and two 12-hr box plots taken from the same data...

Note that the "Mean" value of the One-box plot = 26.66035, while the average of the Two-box plots = 29.669.

Questions: Why are the two means different? How are the different means calculated from existing data?

Software Version: FlexSim 16.1.0

Software Version: FlexSim HC 5.0.12

healthcare box plots mean

boxplots.jpg (79.9 KiB)

Answer 1 · 2016-07-04T03:46:54Z

Matthew Gillespie answered Jul 04 2016 at 3:46 AM Matthew Gillespie commented Jul 06 2016 at 6:33 PM

The average of averages rarely produces the same result as the average of the whole.

For example:

5 patients of Acuity 1 arrive in the first 12 hours Average Acuity: 1

1 patient of Acuity 5 arrives in the last 12 hours Average Acuity: 5

Average of Averages: 3

Total Average: 1.667 (10 total acuity points / 6 patients)
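To make the arithmetic concrete, here is a minimal Python sketch of the example above. The variable names are made up for illustration and this is not FlexSim code. It also shows that weighting each period's average by its number of observations recovers the total average.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Acuity values from the example: five patients in the first 12 hours,
# one patient in the last 12 hours.
first_half = [1, 1, 1, 1, 1]
second_half = [5]

# Average of the two period averages (each period counts equally).
average_of_averages = mean([mean(first_half), mean(second_half)])

# Average over all patients pooled together.
total_average = mean(first_half + second_half)

# Weighting each period's average by its patient count gives the pooled result.
weighted_average = (
    len(first_half) * mean(first_half) + len(second_half) * mean(second_half)
) / (len(first_half) + len(second_half))

print(average_of_averages)         # 3.0
print(round(total_average, 3))     # 1.667
print(round(weighted_average, 3))  # 1.667
```

The two results agree only when every period contributes the same number of observations.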

I'd have to look at the data to give you a more specific answer.

Jason Lightfoot ♦♦ commented · Jul 06 2016 at 8:11 AM

What is this a statistic of, and is N the number of observations? If so, there could be something wrong, as the observations have quadrupled in the two-box-plot picture (they would normally still add up to 18).

Lou Keller commented · Jul 04 2016 at 4:14 AM

@Matthew Gillespie

Yes, I'm aware of what you wrote. Forgetting about the actual numbers themselves and using the example I described, what is the mathematical process (e.g. formula, calculation, algorithm, etc.) the program goes through to calculate the mean(s)?

Matthew Gillespie ♦♦ Lou Keller commented · Jul 06 2016 at 5:07 PM

I spent a long time yesterday looking into this and here's what I found:

Anytime an object changes state, a new entry is added to the State History bundle, and a bunch of stuff is recorded, like the scenario, replication, time, duration, old state, new state, etc. The dashboard then goes through each entry in the bundle and checks whether the entry corresponds to your sample set. If it does, it records the duration of the state in a new bundle. This bundle keeps track of the total utilized time, the total idle time, and the observations of each period. The state's duration is added to either the total idle or the total utilized time, depending on how you classified the state. The percentage is then calculated and added as an observation, or, if the observation already exists, its percentage is updated. If the state duration extends across multiple periods, the duration is broken up and the portion corresponding to each period is added to that period. When the time wraps around (you go back to a period), the total utilized and idle times are cleared and a new observation is created. This is supposed to happen whenever it's a new day or a new scenario/replication. So you should have one observation per period for every day x replication x scenario.
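The per-period bookkeeping described above can be sketched roughly as follows. This is an illustrative Python sketch and not FlexSim's actual code: the function names, the `(start, duration, utilized)` tuple layout, times measured in hours, and the 12-hour default period are all assumptions made for the example, and scenarios and replications are left out.

```python
def split_by_period(start, duration, period_length=12.0):
    """Break one state duration into (day, period, portion) pieces.

    Assumes times are in hours and that 24 is a multiple of period_length.
    """
    periods_per_day = int(24 // period_length)
    end = start + duration
    pieces = []
    t = start
    while t < end:
        slot = int(t // period_length)          # absolute period index
        slot_end = (slot + 1) * period_length   # where this period ends
        piece_end = min(end, slot_end)
        pieces.append((slot // periods_per_day, slot % periods_per_day, piece_end - t))
        t = piece_end
    return pieces


def utilization_observations(states, period_length=12.0):
    """Return {(day, period): percent utilized}, one observation per key.

    `states` is a list of (start, duration, utilized) tuples, where
    `utilized` is True if the state was classified as utilized.
    """
    totals = {}  # (day, period) -> [utilized_time, idle_time]
    for start, duration, utilized in states:
        for day, period, portion in split_by_period(start, duration, period_length):
            bucket = totals.setdefault((day, period), [0.0, 0.0])
            bucket[0 if utilized else 1] += portion
    return {key: 100.0 * u / (u + i) for key, (u, i) in totals.items()}


# One day: idle 0-3 h, utilized 3-18 h (crosses the 12 h boundary), idle 18-24 h.
states = [(0.0, 3.0, False), (3.0, 15.0, True), (18.0, 6.0, False)]
observations = utilization_observations(states)
print(observations)  # {(0, 0): 75.0, (0, 1): 50.0}
```

Keying the totals by (day, period) is just one way to guarantee a single observation per period per day. In this sketch, a duration that spans several periods can never start a new observation by revisiting an earlier period.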

As @jason.lightfoot noted, the number of observations is incorrect when you have multiple periods. The issue I found is that the mechanism that clears the total times and makes a new observation is a little too trigger-happy. When you have a duration that crosses multiple periods, it updates each period. If you then have another duration that also crosses multiple periods, the algorithm goes back to the first period, assumes it's a new day, and so makes a new observation.

So it looks like AJ's code is mostly working, but it's getting false positives for when a new observation should be recorded. I'm working on a fix for this.

Lou Keller commented · Jul 06 2016 at 6:32 PM

Thanks, Matt. I very much appreciate the extra effort.
