Subqueries
This guide covers using subqueries (nested queries) in various clauses with the fluent query syntax.
Overview
Subqueries are queries nested within other queries, allowing complex data retrieval and filtering. Velocity supports subqueries in:
- SELECT clause - Calculate values based on other tables
- FROM clause - Query from derived result sets
- WHERE clause - Filter based on subquery results
- GROUP BY clause - Group by subquery results
- HAVING clause - Filter groups based on subquery results
- INSERT/UPDATE - Use subquery results for data modification
Key components:
- Query - Can be used as a subquery
- Criterion<T> - Accepts Query for subquery comparisons
- ConditionalType - IN, EXISTS, comparison operators
Subqueries in SELECT Clause
Calculate values from other tables:
Scalar Subquery
Return a single value:
using YndigoBlue.Velocity.Model;
using YndigoBlue.Velocity.Enums;
Table orders = schema["orders"];
Table customers = schema["customers"];
// Subquery to get customer name for each order
Query customerSubquery = new Query()
.Select(customers["customer_name"])
.From(customers)
.Where(new Filter(
new Criterion<Column>(
customers["customer_id"],
ConditionalType.Equals,
orders["customer_id"]
)
));
Query mainQuery = new Query()
.Select([
orders["order_id"],
orders["order_date"]
])
.Select(customerSubquery, "customer_name")
.From(orders);
using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
foreach (Result row in results)
{
int orderId = row.GetFieldInteger("order_id");
DateTime orderDate = row.GetFieldDateTime("order_date");
string customerName = row.GetFieldString("customer_name");
Console.WriteLine($"Order #{orderId} - {customerName} ({orderDate:d})");
}
}
Generated SQL:
SELECT order_id,
order_date,
(SELECT customer_name FROM customers WHERE customers.customer_id = orders.customer_id) AS customer_name
FROM orders
Aggregate Subquery
Table products = schema["products"];
Table order_items = schema["order_items"];
// Calculate total quantity sold for each product
Expression sumQty = new Expression("quantity_sum",
new Aggregate(AggregateType.Sum, order_items["quantity"])
);
Query qtySubquery = new Query()
.Select(sumQty)
.From(order_items)
.Where(new Filter(
new Criterion<Column>(
order_items["product_id"],
ConditionalType.Equals,
products["product_id"]
)
));
Query mainQuery = new Query()
.Select([
products["product_name"],
products["price"]
])
.Select(qtySubquery, "total_sold")
.From(products);
Generated SQL:
SELECT product_name,
price,
(SELECT SUM(quantity) FROM order_items WHERE order_items.product_id = products.product_id) AS total_sold
FROM products
Subqueries in FROM Clause
Query from derived tables:
Basic FROM Subquery
Table orders = schema["orders"];
// Subquery: aggregate orders by customer
Expression totalSpent = new Expression("total_spent",
new Aggregate(AggregateType.Sum, orders["total"])
);
Expression orderCount = new Expression("order_count",
new Aggregate(AggregateType.Count, new ArithmeticOperator(ArithmeticType.All))
);
Query customerTotals = new Query()
.Select([
orders["customer_id"],
totalSpent,
orderCount
])
.From(orders)
.GroupBy(orders["customer_id"]);
// Use subquery as FROM source
Table customers = schema["customers"];
var join = new Join(customers, customerTotals, "customer_id");
Query mainQuery = new Query()
.Select([
customers["customer_name"],
customerTotals["total_spent"],
customerTotals["order_count"]
])
.From(join)
.OrderBy(customerTotals["total_spent"], OrderClauseType.Descending);
Note: Not all databases support subqueries in FROM clauses. Check your database's documentation.
Subqueries in WHERE Clause
Filter based on subquery results:
IN Subquery
Check if value exists in subquery results:
Table products = schema["products"];
Table order_items = schema["order_items"];
// Subquery: get product IDs that have been ordered
Query orderedProductIds = new Query()
.Select(order_items["product_id"])
.From(order_items)
.Distinct();
// Main query: find products that have been ordered
Filter inFilter = new Filter(
new Criterion<Query>(
products["product_id"],
ConditionalType.In,
orderedProductIds
)
);
Query mainQuery = new Query()
.Select([
products["product_name"],
products["price"]
])
.From(products)
.Where(inFilter);
Generated SQL:
SELECT product_name, price
FROM products
WHERE product_id IN (SELECT DISTINCT product_id FROM order_items)
NOT IN Subquery
Find rows not matching subquery:
Table products = schema["products"];
Table order_items = schema["order_items"];
// Find products that have never been ordered
Query orderedProductIds = new Query()
.Select(order_items["product_id"])
.From(order_items)
.Distinct();
Filter notInFilter = new Filter(
new Criterion<Query>(
products["product_id"],
ConditionalType.NotIn,
orderedProductIds
)
);
Query mainQuery = new Query()
.Select([
products["product_name"],
products["stock_quantity"]
])
.From(products)
.Where(notInFilter);
using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
Console.WriteLine("Products Never Ordered:");
foreach (Result row in results)
{
string name = row.GetFieldString("product_name");
int stock = row.GetFieldInteger("stock_quantity");
Console.WriteLine($" {name} ({stock} in stock)");
}
}
Generated SQL:
SELECT product_name, stock_quantity
FROM products
WHERE product_id NOT IN (SELECT DISTINCT product_id FROM order_items)
EXISTS Subquery
Check if subquery returns any rows:
Table customers = schema["customers"];
Table orders = schema["orders"];
// Subquery: check if customer has orders
Query customerOrders = new Query()
.Select(orders["order_id"])
.From(orders)
.Where(new Filter(
new Criterion<Column>(
orders["customer_id"],
ConditionalType.Equals,
customers["customer_id"]
)
));
// Find customers who have placed orders
Filter existsFilter = new Filter(
new Criterion<Query>(
null, // No column for EXISTS
ConditionalType.Exists,
customerOrders
)
);
Query mainQuery = new Query()
.Select([
customers["customer_name"],
customers["email"]
])
.From(customers)
.Where(existsFilter);
Generated SQL:
SELECT customer_name, email
FROM customers
WHERE EXISTS (SELECT order_id FROM orders WHERE orders.customer_id = customers.customer_id)
NOT EXISTS Subquery
// Find customers who have NOT placed orders
Filter notExistsFilter = new Filter(
new Criterion<Query>(
null,
ConditionalType.NotExists,
customerOrders
)
);
Query mainQuery = new Query()
.Select([
customers["customer_name"],
customers["email"]
])
.From(customers)
.Where(notExistsFilter);
Generated SQL:
SELECT customer_name, email
FROM customers
WHERE NOT EXISTS (SELECT order_id FROM orders WHERE orders.customer_id = customers.customer_id)
Comparison Subqueries
Compare values with subquery results:
Table products = schema["products"];
Table orders = schema["orders"];
// Find products more expensive than average order value
Expression avgTotal = new Expression("avg_total",
new Aggregate(AggregateType.Average, orders["total"])
);
Query avgQuery = new Query()
.Select(avgTotal)
.From(orders);
Filter greaterThanAvg = new Filter(
new Criterion<Query>(
products["price"],
ConditionalType.GreaterThan,
avgQuery
)
);
Query mainQuery = new Query()
.Select([
products["product_name"],
products["price"]
])
.From(products)
.Where(greaterThanAvg)
.OrderBy(products["price"], OrderClauseType.Descending);
Generated SQL:
SELECT product_name, price
FROM products
WHERE price > (SELECT AVG(total) FROM orders)
ORDER BY price DESC
Correlated Subqueries
Subqueries that reference the outer query:
Basic Correlated Subquery
Table orders = schema["orders"];
Table order_items = schema["order_items"];
// Find orders where total items > 3
Expression itemCount = new Expression("item_count",
new Aggregate(AggregateType.Count, new ArithmeticOperator(ArithmeticType.All))
);
Query countSubquery = new Query()
.Select(itemCount)
.From(order_items)
.Where(new Filter(
new Criterion<Column>(
order_items["order_id"],
ConditionalType.Equals,
orders["order_id"] // References outer query
)
));
Filter filter = new Filter(
new Criterion<Query>(
new Literal<int>(3),
ConditionalType.LessThan,
countSubquery
)
);
Query mainQuery = new Query()
.Select([
orders["order_id"],
orders["order_date"]
])
.From(orders)
.Where(filter);
Generated SQL:
SELECT order_id, order_date
FROM orders
WHERE 3 < (SELECT COUNT(*) FROM order_items WHERE order_items.order_id = orders.order_id)
Correlated EXISTS
Table products = schema["products"];
Table order_items = schema["order_items"];
// Find products ordered more than 100 units total
Expression totalQty = new Expression("total_qty",
new Aggregate(AggregateType.Sum, order_items["quantity"])
);
Query qtySubquery = new Query()
.Select(totalQty)
.From(order_items)
.Where(new Filter(
new Criterion<Column>(
order_items["product_id"],
ConditionalType.Equals,
products["product_id"]
)
))
.Having(new Filter(
new Criterion<int>(
totalQty,
ConditionalType.GreaterThan,
100
)
));
Filter existsFilter = new Filter(
new Criterion<Query>(
null,
ConditionalType.Exists,
qtySubquery
)
);
Query mainQuery = new Query()
.Select(products["product_name"])
.From(products)
.Where(existsFilter);
Subqueries in GROUP BY
Group by values from subqueries:
Table orders = schema["orders"];
Table customers = schema["customers"];
// Subquery to categorize customers by total spent
Query customerCategory = new Query()
.Select(new Expression("category", [
new Function("CASE",
new Criterion<decimal>(orders["total"], ConditionalType.GreaterThan, 1000M),
new Literal<string>("High"),
new Literal<string>("Regular")
)
]))
.From(orders);
// Group orders by customer category
// Note: This requires database-specific CASE expression support
Subqueries in HAVING Clause
Filter groups based on subquery results:
Table products = schema["products"];
Table order_items = schema["order_items"];
// Get average quantity per product
Expression avgQtyExpr = new Expression("avg_qty",
new Aggregate(AggregateType.Average, order_items["quantity"])
);
// Subquery: overall average quantity across all products
Query overallAvgQuery = new Query()
.Select(new Expression("overall_avg",
new Aggregate(AggregateType.Average, order_items["quantity"])
))
.From(order_items);
// Group by product and filter to products above overall average
Query mainQuery = new Query()
.Select([
order_items["product_id"],
avgQtyExpr
])
.From(order_items)
.GroupBy(order_items["product_id"])
.Having(new Filter(
new Criterion<Query>(
avgQtyExpr,
ConditionalType.GreaterThan,
overallAvgQuery
)
));
Generated SQL:
SELECT product_id, AVG(quantity) AS avg_qty
FROM order_items
GROUP BY product_id
HAVING AVG(quantity) > (SELECT AVG(quantity) FROM order_items)
Subqueries in INSERT
Use subquery results for inserts:
Table archived_orders = schema["archived_orders"];
Table orders = schema["orders"];
// Get old orders to archive
Query oldOrders = new Query()
.Select([
orders["order_id"],
orders["customer_id"],
orders["order_date"],
orders["total"]
])
.From(orders)
.Where(new Filter(
new Criterion<DateTime>(
orders["order_date"],
ConditionalType.LessThan,
DateTime.Now.AddYears(-2)
)
));
// Insert into archive table
using (Manager manager = new Manager(connection))
{
manager.InsertFromQuery(archived_orders, oldOrders);
}
Subqueries in UPDATE
Update based on subquery results:
Table products = schema["products"];
Table order_items = schema["order_items"];
// Subquery: calculate total quantity sold per product
Expression totalSold = new Expression("total_sold",
new Aggregate(AggregateType.Sum, order_items["quantity"])
);
Query salesSubquery = new Query()
.Select([
order_items["product_id"],
totalSold
])
.From(order_items)
.GroupBy(order_items["product_id"]);
// Update products with sales data
// This would be done through the DataCommand system
Complete Examples
Example 1: Customer Segmentation
Table customers = schema["customers"];
Table orders = schema["orders"];
// Subquery: calculate customer lifetime value
Expression lifetimeValue = new Expression("lifetime_value",
new Aggregate(AggregateType.Sum, orders["total"])
);
Query customerValueSubquery = new Query()
.Select(lifetimeValue)
.From(orders)
.Where(new Filter(
new Criterion<Column>(
orders["customer_id"],
ConditionalType.Equals,
customers["customer_id"]
)
));
// Subquery: average customer value
Query avgValueQuery = new Query()
.Select(new Expression("avg_value",
new Aggregate(AggregateType.Average, orders["total"])
))
.From(orders);
// Find high-value customers (above average)
Filter highValueFilter = new Filter(
new Criterion<Query>(
customerValueSubquery,
ConditionalType.GreaterThan,
avgValueQuery
)
);
Query mainQuery = new Query()
.Select([
customers["customer_name"],
customers["email"]
])
.Select(customerValueSubquery, "lifetime_value")
.From(customers)
.Where(highValueFilter)
.OrderBy(customerValueSubquery, OrderClauseType.Descending);
using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
Console.WriteLine("High-Value Customers (Above Average)");
Console.WriteLine("=====================================\n");
foreach (Result row in results)
{
string name = row.GetFieldString("customer_name");
string email = row.GetFieldString("email");
decimal value = row.GetFieldDecimal("lifetime_value");
Console.WriteLine($"{name} ({email})");
Console.WriteLine($" Lifetime Value: ${value:N2}");
Console.WriteLine();
}
}
Example 2: Product Performance Analysis
Table products = schema["products"];
Table order_items = schema["order_items"];
// Subquery: times ordered
Expression timesOrdered = new Expression("times_ordered",
new Aggregate(AggregateType.Count, new ArithmeticOperator(ArithmeticType.All))
);
Query orderedCountSubquery = new Query()
.Select(timesOrdered)
.From(order_items)
.Where(new Filter(
new Criterion<Column>(
order_items["product_id"],
ConditionalType.Equals,
products["product_id"]
)
));
// Subquery: total revenue
Expression lineTotal = new Expression("line_total", [
order_items["quantity"],
new ArithmeticOperator(ArithmeticType.Multiply),
order_items["unit_price"]
]);
Expression totalRevenue = new Expression("total_revenue",
new Aggregate(AggregateType.Sum, lineTotal)
);
Query revenueSubquery = new Query()
.Select(totalRevenue)
.From(order_items)
.Where(new Filter(
new Criterion<Column>(
order_items["product_id"],
ConditionalType.Equals,
products["product_id"]
)
));
// Filter: products ordered at least once
Filter orderedFilter = new Filter(
new Criterion<Query>(
new Literal<int>(0),
ConditionalType.LessThan,
orderedCountSubquery
)
);
Query mainQuery = new Query()
.Select([
products["product_name"],
products["price"]
])
.Select(orderedCountSubquery, "times_ordered")
.Select(revenueSubquery, "revenue")
.From(products)
.Where(orderedFilter)
.OrderBy(revenueSubquery, OrderClauseType.Descending);
using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
Console.WriteLine("Product Performance Report");
Console.WriteLine("==========================\n");
int rank = 1;
foreach (Result row in results.Take(20))
{
string name = row.GetFieldString("product_name");
decimal price = row.GetFieldDecimal("price");
int ordered = row.GetFieldInteger("times_ordered");
decimal revenue = row.GetFieldDecimal("revenue");
Console.WriteLine($"{rank}. {name} (${price:F2})");
Console.WriteLine($" Times Ordered: {ordered}");
Console.WriteLine($" Total Revenue: ${revenue:N2}");
Console.WriteLine();
rank++;
}
}
Example 3: Comparative Analysis
Table products = schema["products"];
Table order_items = schema["order_items"];
// Subquery: category average price
Expression categoryAvgPrice = new Expression("category_avg",
new Aggregate(AggregateType.Average, products["price"])
);
// Compare each product to its category average
Query mainQuery = new Query()
.Select([
products["product_name"],
products["category"],
products["price"]
])
.From(products);
// This would require a correlated subquery in SELECT for category average
// Then calculate difference in application code or use expressions
Best Practices
Use EXISTS instead of IN for better performance
- EXISTS stops at first match
- IN processes entire subquery
- Especially important for large subqueries
Limit subquery result sets
- Add WHERE clauses to subqueries
- Use DISTINCT when appropriate
- Prevents processing unnecessary data
Be careful with correlated subqueries
- Execute once per outer row
- Can be very slow on large datasets
- Consider joins as alternative
Index subquery columns
- Add indexes to columns in subquery WHERE
- Improves subquery performance
- Essential for correlated subqueries
Consider alternatives
- JOINs often faster than subqueries
- CTEs (Common Table Expressions) for readability
- Temporary tables for complex operations
Test with realistic data
- Subquery performance varies by size
- Test with production-scale data
- Monitor execution plans
Avoid subqueries in SELECT when possible
- Execute for every row
- Consider JOINs or window functions
- Pre-calculate when feasible
Use appropriate comparison operators
- EXISTS/NOT EXISTS for existence checks
- IN/NOT IN for value lists
- Comparison operators for scalar values
Next Steps
- Joins - Alternative to subqueries for many scenarios
- Grouping and Aggregation - Use subqueries with aggregates
- Performance - Optimize subquery performance