Your First SQL Query
In FlexSim, SQL queries are done using the query() command:
query(str queryStr[, ...])
Often you will use the query() command to search a single table. Take the following example global table.
Let's say you want to find all customers who have made more than 5 orders, sorted alphabetically by their name. To do that, you'd use the following command:
query("SELECT * FROM Customers /
WHERE [Total orders] > 5 /
ORDER BY Name ASC");
Some things to note:
- The SELECT statement tells the query what columns you want to pull out into your query result. By using SELECT * we are telling the query to put all the columns from the source table into the query result. Alternatively, if we were to use SELECT CustomerId, Name then that would put just the CustomerId and Name columns into the query result.
- The FROM statement defines the table that is to be queried. Often this will be just one table, but it may be multiple comma-separated tables, which effects a join operation, discussed later. Here we define FROM Customers. This causes the SQL parser to search the global table in the model named Customers.
- The WHERE statement defines a filter by which entries in the table are to be matched. The expression WHERE [Total orders] > 5 means that I only want rows whose value in the Total orders column is greater than 5. In a WHERE statement you can use math operators to define expressions, like +, -, *, / and logical operators like =, != (also <>), <, <=, >, >=, AND, and OR.
- The [ ] syntax is an SQL delimiter that allows you to define column names with spaces in them. If I had named the column Totalorders instead of Total orders, I could have just put Totalorders for the column name in the query instead of [Total orders].
- The ORDER BY statement defines how the result should be sorted. Here we sort by the Name column, which is a text column and thus will be sorted alphabetically. You can optionally put ASC or DESC to define if you want it sorted in ascending order or descending order. Default is ascending. You can also have multiple comma-separated expressions defining the sort. The additional expressions will define how to order things when there is a tie. For example, the expression ORDER BY [Total orders] DESC, Name ASC would first order descending by the number of total orders, and then for any ties (Cheryl Samson and Jacob Sorenson both have 10 total orders) it would sort alphabetically by name.
- The / escape character lets you extend a quoted string across multiple lines by using the at the end of a line within a string. Often with SQL queries, the query is too long to reasonably fit within a single-line quoted string. Also, using multiple lines with indentation can make the query more readable.
- A LIMIT statement, although not used in this example, can be added at the end of the query. This will limit the number of matches. If you only want the best 3 matches, add LIMIT 3 to the end of the query.
Getting Data Out of the Query
Now that you've done your query, there are several ways to get the results out. The simplest way is to use dumpquery():
dumpquery(node toNode, num asTable)
dumpquery() will dump the results of the last query() call to a table or bundle node. For this example we can create a global table named querydump, and then dump the result to that table. The asTable defines if we want it dumped as table data (1), as opposed to bundle data (0)
dumpquery(reftable("querydump"), 1);
Dumping the above query to the querydump table yields the following results:
Table Selection
In the example above, we used "SELECT * FROM Customers". The FROM tells the parser to look in global tables for a table named Customers and search that table. Alternatively, you can explicitly define the tables for search. This is done by using a special '$' identifier in the query and by passing additional parameters into the query() command. For example, I could have defined the exact same query as above, but instead defined the customers table explicitly as follows:
query("SELECT * FROM $1 /
WHERE [Total orders] > 5 /
ORDER BY Name ASC",
reftable("Customers"));
Notice that instead of using the table "Customers" I now define the table as $1. What this means is that the table I want searched is the table I pass in as the first additional parameter of the query command. $2 would correspond to the table passed into the second additional parameter of the command, and so on. Tables passed as additional parameters can have either regular table data or bundle data; FlexSim will automatically interpret the associated data.
By using this table specification method you are no longer bound to using global tables. For example, if the customers table happened to be on a label instead of a global table, the query is still pretty simple:
query("SELECT * FROM $1 /
WHERE [Total orders] > 5 /
ORDER BY Name ASC",
label(current, "Customers"));
FlexSim's SQL parser also allows you to simpify the SQL a bit for this single-table search scenario. The SQL standard requires a SELECT and FROM statement, but FlexSim's SQL parser isn't that picky. If you only pass one parameter as the table, it will automatically assume that you want to search the table you passed in. Thus you can leave out the SELECT and FROM statements. Leaving them out is essentially the same as using the statement SELECT * FROM $1.
query("WHERE [Total orders] > 5 ORDER BY Name ASC", label(current, "Customers"));
Additional Result Analysis Methods
While using dumpquery() is often useful when you're setting up a model or when you're testing various queries, filling out an entire table/bundle with data is non-trivial and requires memory space and CPU time for creating the table. Alternatively, if you don't want to use up the resources for creating an entire table, you can directly query the results programmatically using the following commands:
getquerymatchcount()
getquerycolcount()
getqueryvalue(num row, num/str col)
getquerymatchtablerow(num/strtable, num nthMatch)
As an example, let's say we want to do the query, then manually get the names of each the matching entry.
query("SELECT * FROM Customers /
WHERE [Total orders] > 5 /
ORDER BY Name ASC");
for (inti = 1; i <= getquerymatchcount(); i++) {
string name = getqueryvalue(i, "Name");
...
}
This performs the query and then goes through each match and gets the name by using getqueryvalue() with "Name" as the column identifier. You can also access the value by using a number as the column identifier.
query("SELECT * FROM Customers /
WHERE [Total orders] > 5 /
ORDER BY Name ASC");
for (inti = 1; i <= getquerymatchcount(); i++) {
string name = getqueryvalue(i, 2);
...
}
You can also just access the matching row numbers directly:
query("SELECT * FROM Customers /
WHERE [Total orders] > 5 /
ORDER BY Name ASC");
for (inti = 1; i <= getquerymatchcount(); i++) {
int matchRow = getquerymatchtablerow(1, i);
string name = gettablestr("Customers", matchRow, 2);
...
}
In this case we call getquerymatchtablerow(1, i). This returns the row of table 1 (the first and only table defined in the search) associated with the ith match of the query.
Joins
You can also use FlexSim's SQL parser to query relationships between multiple tables. To demonstrate this, we'll do another example using the Customers table and an additional orders Table.
In this example we want to find information associated with customers who ordered SKU 78946XU. For each order of SKU 78946XU we want to know the customer's name, the CustomerId, and the OrderId. Below is the query:
query("SELECT * FROM Customers, orders /
WHERE /
SKU = '78946XU' /
AND Customers.CustomerId = orders.CustomerId");
Things to note:
- Here we use the statement FROM Customers, orders. In SQL this is called an inner join. The SQL evaluator compares every row in the Customers table with every row in the orders table to see which row-to-row pairings match the WHERE filter.
- The filter we define is WHERE SKU = '78946XU' AND Customers.CustomerId = orders.CustomerId. We only want to match the rows in the orders table that have the SKU value '78946XU'. Secondly, for those rows that do match the SKU, we only want to match them with rows in the Customers table that correspond with the same CustomerId in the matched row of the orders table.
- Notice that for the SKU rule we just say SKU = '78946XU'. For SKU, since the orders table is the only table with an SKU column, the SQL evaluator will automatically recognize that the SKU column is associated with the orders table. We could explicitly define the table and column with orders.SKU, and sometimes that is preferable in order to make the query more readable/comprehensible. However, if you leave it out the evaluator will happily figure out the association on its own.
- The CustomerId rule, on the other hand, uses the . (dot) syntax to explicitly define table and column. This is because both the orders table and the Customers table have columns named CustomerId, and we want to explicitly compare the CustomerId column in Customers with the CustomerId column in orders. So we use the dot syntax to define table and column.
The result of the dumpquery() for this query:
For explicitly defined tables (labels for example) you'd use a query like:
query("SELECT * FROM $1 AS Customers, $2 AS orders /
WHERE /
SKU = '78946XU' /
AND Customers.CustomerId = orders.CustomerId",
label(current, "Customers"),
label(current, "orders"));
Aliases
In the previous example we use the AS construct to create an alias for our table. You can create aliases for both tables and column references. This can increase readability of the query especially if you are using the $ syntax. For table aliases you do not technically need the AS qualifier. Both of the following are valid:
SELECT * FROM $1 AS Customers, $2 AS orders
SELECT * FROM $1 Customers, $2 orders
Once an alias is defined you should use that alias instead of the table name in other references in the query.
Below are several examples of defining column aliases.
SELECT Customers.CustomerId AS ID, Customers.Name AS Name
WHERE ID > 5 AND ID < 10
Advanced Query Techniques
Often there are decision-making problems in a simulation that lend themselves well to using SQL-like constructs, such as searching, filtering and prioritizing. However, since a simulation is an inherently dynamic system, data in the simulation is often not structured in the standard way that a database is structured. FlexSim's advanced SQL querying functionality aims to bridge this gap, allowing modelers to use SQL's flexibility and expressiveness in dynamically querying the state of a model.
Let's take an example where you have flow items that queue at various locations in the model. It is quite often the case that modeling logic will try to search for a "best" flow item, and determining which flow item is the best may involve complex rules. It might assign certain eligibility criteria for each flow item. It could also weigh various things like the distance from some source to the location of the flowitem, the time that the flow item has been waiting in queue, an assigned priority for the flowitem, etc. For these types of problems, the SQL language is quite expressive and makes the problem relatively easy. So what if you could represent all of the candidate flowitems as a kind of quasi-database table, and then use SQL to search, filter, and prioritize entries in that table?
Here there are four queues, each with a set of flow items in it. Imagine each of those flow items has various labels defining data on that flowitem. A table of flow items representing all of that data might look something like the following:
Item | Location | DistFromMe | Priority | Step | ItemType | Time Waiting |
GrayBox | Queue1 | 9.85 | 3 | 5 | 6 | 5.4 |
PurpleBox | Queue2 | 8.5 | 2 | 2 | 8 | 8.1 |
TealBox | Queue2 | 8.5 | 8 | 12 | 5 | 7.2 |
OrangeBox | Queue4 | 12.5 | 4 | 1 | 4 | 1.2 |
GreenBox | Queue4 | 12.5 | 3 | 5 | 2 | 4 |
BlueBox | Queue4 | 12.5 | 6 | 6 | 3 | 22.5 |
PinkBox | Queue3 | 7.5 | 3 | 9 | 7 | 12.8 |
BlueBox | Queue3 | 7.5 | 6 | 10 | 3 | 3.4 |
BlueBox | Queue3 | 7.5 | 4 | 7 | 3 | 7.1 |
If we could represent the model structure in these quasi-table terms, we could use SQL to do complex filtering and prioritizing based on those tables.
First, let's start simple. We'll just search one queue of flow items, Queue4, and we'll only look at the Step value, which, we'll say is stored on the items' "Step" label. We want to find the flow items with a step value greater than 3. The more simplified table would look like:
Item | Step |
OrangeBox | 1 |
GreenBox | 5 |
BlueBox | 6 |
Here's the command.
query("SELECT $2 AS Item, $3 AS Step /
FROM $1 Queue /
WHERE Step > 3",
/*$1*/node("Queue4", model()),
/*$2*/$iter(1),
/*$3*/getlabel($iter(1), "Step"));
The results of dumpquery on this would be:
Here we introduce two new concepts: 1. object references as tables, and 2. individual column values being defined by the $ syntax, instead of just tables.
Object References as Tables
The query table is defined with:
FROM $1 Queue
And we bind the $1 reference with:
node("Queue4", model())
Here instead of referencing an actual table, we reference an object in the model. In doing this, we're associating each row of our "virtual table" with a sub-node of the object we reference, or in other words, a flow item in the queue. So the first row in the table is associated with the first flow item inside the queue, the second row with the second flow item, and so on.
$'s as Table Values and $iter()
The select statement looks like this:
SELECT $2 AS Item, $3 AS Step
And we bind $2 with:
$iter(1)
And $3 with:
getlabel($iter(1), "Step")
Again here's the "virtual table" we are searching, based on each flow item in Queue4 and its Step label.
Item | Step |
OrangeBox | 1 |
GreenBox | 5 |
BlueBox | 6 |
We use the $iter() command to determine the values of the table cells by traversing the content of Queue4. $iter() returns the iteration on a given table. $iter(1) is the iteration on $1. Since $1 is Queue4, $iter(1) is going to be the iteration on Queue4 associated with a given row in the table, or in other words, one of the flow item sub-nodes of Queue4.
When the evaluator needs to get the value of a certain row in the table, it sets $iter(1) to the flow item sub-node of Queue4 associated with that table row, and then re-evaluates the expression. So when it's on the GreenBox row in the table (row 2) and needs to get the Step value for that row, it sets $iter(1) to rank(Queue4, 2), and evaluates $3. $3 then essentially returns getlabel(rank(Queue4, 2), "Step"). That value is then used as the value of the table cell.
Creating a Table for All Flow Items
Now that we've done a query on a single queue, let's extend the query to search all queues. Let's assume the queues are all connected to the output ports of an upstream queue.
Now let's write the query.
query("SELECT $3 AS Item, $4 AS Step /
FROM $1x$2 AS Items /
WHERE Step > 3",
/*$1*/nrop(node("UpQueue", model())),
/*$2*/outobject(node("UpQueue", model()), $iter(1)),
/*$3*/$iter(2),
/*$4*/getlabel($iter(2), "Step"));
The results of dumpquery() on this query are as follows:
Note that we've shifted our column references down. The Item reference is now $3 and the Step label reference is $4. Also notice that we've "flattened" a two-dimensional model structure into a single virtual table by using the special table identifier $1x$2. The first dimension is the output ports of UpQueue, and the second dimension is the sub-node tree of each of the Queues connected to the output ports of UpQueue.
Numbers as Tables
For $1 we return nrop(node("UpQueue", model()). Unlike previously where we assigned $1 to a Queue object itself, now we define $1 as a straight number, namely the number of output ports of UpQueue. Subsequently, the iteration on $1 ($iter(1)) will also be a number.
"Flattening" Model Structures
When the SQL parser sees the special $1x$2 table reference, it will "build" the table by evaluating $1 and then iterating on $1, evaluating $2 for each iteration. In this example, it evaluates $1 which returns 3, the number of output ports of UpQueue. Then it iterates from 1 to 3, evaluating $2 each time. On the first iteration ($iter(1) == 1), $2 returns Queue1: outobject(UpQueue, 1). The evaluator then determines it's an object reference and adds 1 row (the number of flow items in Queue1) to the table. Then it continues to the next iteration on $1 ($iter(1) == 2). $2 returns Queue2: outobject(UpQueue, 2), and the evaluator adds 2 rows (the number of flow items in Queue2) to the table, and so on until it's built the entire table. Note that behind the scenes it's not really "building" a table. It's only storing off the total table size and what rows in the table are associated with what $ iterations. Once it's built the table, the evaluator then goes through the table and evaluates the query for each row in the table, calling $3 and $4 to figure out the associated values of each iteration.
In defining a table you can use any number of model structure "dimensions" to define a table, meaning you could have a table that is $1x$2x$3x$4, i.e. 4 model-structure dimensions, or more. You can do inner joins on these tables just like you would do with standard tables. Thus, by using these advanced querying techniques you can quickly reduce complex filtering and prioritizing problems into relatively simple SQL queries.