Table of Contents

Subqueries

This guide covers using subqueries (nested queries) in various clauses with the fluent query syntax.


Overview

Subqueries are queries nested within other queries, allowing complex data retrieval and filtering. Velocity supports subqueries in:

  • SELECT clause - Calculate values based on other tables
  • FROM clause - Query from derived result sets
  • WHERE clause - Filter based on subquery results
  • GROUP BY clause - Group by subquery results
  • HAVING clause - Filter groups based on subquery results
  • INSERT/UPDATE - Use subquery results for data modification

Key components:


Subqueries in SELECT Clause

Calculate values from other tables:

Scalar Subquery

Return a single value:

using YndigoBlue.Velocity.Model;
using YndigoBlue.Velocity.Enums;

Table orders = schema["orders"];
Table customers = schema["customers"];

// Subquery to get customer name for each order
Query customerSubquery = new Query()
    .Select(customers["customer_name"])
    .From(customers)
    .Where(new Filter(
        new Criterion<Column>(
            customers["customer_id"],
            ConditionalType.Equals,
            orders["customer_id"]
        )
    ));

Query mainQuery = new Query()
    .Select([
        orders["order_id"],
        orders["order_date"]
    ])
    .Select(customerSubquery, "customer_name")
    .From(orders);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
    foreach (Result row in results)
    {
        int orderId = row.GetFieldInteger("order_id");
        DateTime orderDate = row.GetFieldDateTime("order_date");
        string customerName = row.GetFieldString("customer_name");

        Console.WriteLine($"Order #{orderId} - {customerName} ({orderDate:d})");
    }
}

Generated SQL:

SELECT order_id,
       order_date,
       (SELECT customer_name FROM customers WHERE customers.customer_id = orders.customer_id) AS customer_name
FROM orders

Aggregate Subquery

Table products = schema["products"];
Table order_items = schema["order_items"];

// Calculate total quantity sold for each product
Expression sumQty = new Expression("quantity_sum",
    new Aggregate(AggregateType.Sum, order_items["quantity"])
);

Query qtySubquery = new Query()
    .Select(sumQty)
    .From(order_items)
    .Where(new Filter(
        new Criterion<Column>(
            order_items["product_id"],
            ConditionalType.Equals,
            products["product_id"]
        )
    ));

Query mainQuery = new Query()
    .Select([
        products["product_name"],
        products["price"]
    ])
    .Select(qtySubquery, "total_sold")
    .From(products);

Generated SQL:

SELECT product_name,
       price,
       (SELECT SUM(quantity) FROM order_items WHERE order_items.product_id = products.product_id) AS total_sold
FROM products

Subqueries in FROM Clause

Query from derived tables:

Basic FROM Subquery

Table orders = schema["orders"];

// Subquery: aggregate orders by customer
Expression totalSpent = new Expression("total_spent",
    new Aggregate(AggregateType.Sum, orders["total"])
);

Expression orderCount = new Expression("order_count",
    new Aggregate(AggregateType.Count, new ArithmeticOperator(ArithmeticType.All))
);

Query customerTotals = new Query()
    .Select([
        orders["customer_id"],
        totalSpent,
        orderCount
    ])
    .From(orders)
    .GroupBy(orders["customer_id"]);

// Use subquery as FROM source
Table customers = schema["customers"];
var join = new Join(customers, customerTotals, "customer_id");

Query mainQuery = new Query()
    .Select([
        customers["customer_name"],
        customerTotals["total_spent"],
        customerTotals["order_count"]
    ])
    .From(join)
    .OrderBy(customerTotals["total_spent"], OrderClauseType.Descending);

Note: Not all databases support subqueries in FROM clauses. Check your database's documentation.


Subqueries in WHERE Clause

Filter based on subquery results:

IN Subquery

Check if value exists in subquery results:

Table products = schema["products"];
Table order_items = schema["order_items"];

// Subquery: get product IDs that have been ordered
Query orderedProductIds = new Query()
    .Select(order_items["product_id"])
    .From(order_items)
    .Distinct();

// Main query: find products that have been ordered
Filter inFilter = new Filter(
    new Criterion<Query>(
        products["product_id"],
        ConditionalType.In,
        orderedProductIds
    )
);

Query mainQuery = new Query()
    .Select([
        products["product_name"],
        products["price"]
    ])
    .From(products)
    .Where(inFilter);

Generated SQL:

SELECT product_name, price
FROM products
WHERE product_id IN (SELECT DISTINCT product_id FROM order_items)

NOT IN Subquery

Find rows not matching subquery:

Table products = schema["products"];
Table order_items = schema["order_items"];

// Find products that have never been ordered
Query orderedProductIds = new Query()
    .Select(order_items["product_id"])
    .From(order_items)
    .Distinct();

Filter notInFilter = new Filter(
    new Criterion<Query>(
        products["product_id"],
        ConditionalType.NotIn,
        orderedProductIds
    )
);

Query mainQuery = new Query()
    .Select([
        products["product_name"],
        products["stock_quantity"]
    ])
    .From(products)
    .Where(notInFilter);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
    Console.WriteLine("Products Never Ordered:");
    foreach (Result row in results)
    {
        string name = row.GetFieldString("product_name");
        int stock = row.GetFieldInteger("stock_quantity");

        Console.WriteLine($"  {name} ({stock} in stock)");
    }
}

Generated SQL:

SELECT product_name, stock_quantity
FROM products
WHERE product_id NOT IN (SELECT DISTINCT product_id FROM order_items)

EXISTS Subquery

Check if subquery returns any rows:

Table customers = schema["customers"];
Table orders = schema["orders"];

// Subquery: check if customer has orders
Query customerOrders = new Query()
    .Select(orders["order_id"])
    .From(orders)
    .Where(new Filter(
        new Criterion<Column>(
            orders["customer_id"],
            ConditionalType.Equals,
            customers["customer_id"]
        )
    ));

// Find customers who have placed orders
Filter existsFilter = new Filter(
    new Criterion<Query>(
        null,  // No column for EXISTS
        ConditionalType.Exists,
        customerOrders
    )
);

Query mainQuery = new Query()
    .Select([
        customers["customer_name"],
        customers["email"]
    ])
    .From(customers)
    .Where(existsFilter);

Generated SQL:

SELECT customer_name, email
FROM customers
WHERE EXISTS (SELECT order_id FROM orders WHERE orders.customer_id = customers.customer_id)

NOT EXISTS Subquery

// Find customers who have NOT placed orders
Filter notExistsFilter = new Filter(
    new Criterion<Query>(
        null,
        ConditionalType.NotExists,
        customerOrders
    )
);

Query mainQuery = new Query()
    .Select([
        customers["customer_name"],
        customers["email"]
    ])
    .From(customers)
    .Where(notExistsFilter);

Generated SQL:

SELECT customer_name, email
FROM customers
WHERE NOT EXISTS (SELECT order_id FROM orders WHERE orders.customer_id = customers.customer_id)

Comparison Subqueries

Compare values with subquery results:

Table products = schema["products"];
Table orders = schema["orders"];

// Find products more expensive than average order value
Expression avgTotal = new Expression("avg_total",
    new Aggregate(AggregateType.Average, orders["total"])
);

Query avgQuery = new Query()
    .Select(avgTotal)
    .From(orders);

Filter greaterThanAvg = new Filter(
    new Criterion<Query>(
        products["price"],
        ConditionalType.GreaterThan,
        avgQuery
    )
);

Query mainQuery = new Query()
    .Select([
        products["product_name"],
        products["price"]
    ])
    .From(products)
    .Where(greaterThanAvg)
    .OrderBy(products["price"], OrderClauseType.Descending);

Generated SQL:

SELECT product_name, price
FROM products
WHERE price > (SELECT AVG(total) FROM orders)
ORDER BY price DESC

Correlated Subqueries

Subqueries that reference the outer query:

Basic Correlated Subquery

Table orders = schema["orders"];
Table order_items = schema["order_items"];

// Find orders where total items > 3
Expression itemCount = new Expression("item_count",
    new Aggregate(AggregateType.Count, new ArithmeticOperator(ArithmeticType.All))
);

Query countSubquery = new Query()
    .Select(itemCount)
    .From(order_items)
    .Where(new Filter(
        new Criterion<Column>(
            order_items["order_id"],
            ConditionalType.Equals,
            orders["order_id"]  // References outer query
        )
    ));

Filter filter = new Filter(
    new Criterion<Query>(
        new Literal<int>(3),
        ConditionalType.LessThan,
        countSubquery
    )
);

Query mainQuery = new Query()
    .Select([
        orders["order_id"],
        orders["order_date"]
    ])
    .From(orders)
    .Where(filter);

Generated SQL:

SELECT order_id, order_date
FROM orders
WHERE 3 < (SELECT COUNT(*) FROM order_items WHERE order_items.order_id = orders.order_id)

Correlated EXISTS

Table products = schema["products"];
Table order_items = schema["order_items"];

// Find products ordered more than 100 units total
Expression totalQty = new Expression("total_qty",
    new Aggregate(AggregateType.Sum, order_items["quantity"])
);

Query qtySubquery = new Query()
    .Select(totalQty)
    .From(order_items)
    .Where(new Filter(
        new Criterion<Column>(
            order_items["product_id"],
            ConditionalType.Equals,
            products["product_id"]
        )
    ))
    .Having(new Filter(
        new Criterion<int>(
            totalQty,
            ConditionalType.GreaterThan,
            100
        )
    ));

Filter existsFilter = new Filter(
    new Criterion<Query>(
        null,
        ConditionalType.Exists,
        qtySubquery
    )
);

Query mainQuery = new Query()
    .Select(products["product_name"])
    .From(products)
    .Where(existsFilter);

Subqueries in GROUP BY

Group by values from subqueries:

Table orders = schema["orders"];
Table customers = schema["customers"];

// Subquery to categorize customers by total spent
Query customerCategory = new Query()
    .Select(new Expression("category", [
        new Function("CASE",
            new Criterion<decimal>(orders["total"], ConditionalType.GreaterThan, 1000M),
            new Literal<string>("High"),
            new Literal<string>("Regular")
        )
    ]))
    .From(orders);

// Group orders by customer category
// Note: This requires database-specific CASE expression support

Subqueries in HAVING Clause

Filter groups based on subquery results:

Table products = schema["products"];
Table order_items = schema["order_items"];

// Get average quantity per product
Expression avgQtyExpr = new Expression("avg_qty",
    new Aggregate(AggregateType.Average, order_items["quantity"])
);

// Subquery: overall average quantity across all products
Query overallAvgQuery = new Query()
    .Select(new Expression("overall_avg",
        new Aggregate(AggregateType.Average, order_items["quantity"])
    ))
    .From(order_items);

// Group by product and filter to products above overall average
Query mainQuery = new Query()
    .Select([
        order_items["product_id"],
        avgQtyExpr
    ])
    .From(order_items)
    .GroupBy(order_items["product_id"])
    .Having(new Filter(
        new Criterion<Query>(
            avgQtyExpr,
            ConditionalType.GreaterThan,
            overallAvgQuery
        )
    ));

Generated SQL:

SELECT product_id, AVG(quantity) AS avg_qty
FROM order_items
GROUP BY product_id
HAVING AVG(quantity) > (SELECT AVG(quantity) FROM order_items)

Subqueries in INSERT

Use subquery results for inserts:

Table archived_orders = schema["archived_orders"];
Table orders = schema["orders"];

// Get old orders to archive
Query oldOrders = new Query()
    .Select([
        orders["order_id"],
        orders["customer_id"],
        orders["order_date"],
        orders["total"]
    ])
    .From(orders)
    .Where(new Filter(
        new Criterion<DateTime>(
            orders["order_date"],
            ConditionalType.LessThan,
            DateTime.Now.AddYears(-2)
        )
    ));

// Insert into archive table
using (Manager manager = new Manager(connection))
{
    manager.InsertFromQuery(archived_orders, oldOrders);
}

Subqueries in UPDATE

Update based on subquery results:

Table products = schema["products"];
Table order_items = schema["order_items"];

// Subquery: calculate total quantity sold per product
Expression totalSold = new Expression("total_sold",
    new Aggregate(AggregateType.Sum, order_items["quantity"])
);

Query salesSubquery = new Query()
    .Select([
        order_items["product_id"],
        totalSold
    ])
    .From(order_items)
    .GroupBy(order_items["product_id"]);

// Update products with sales data
// This would be done through the DataCommand system

Complete Examples

Example 1: Customer Segmentation

Table customers = schema["customers"];
Table orders = schema["orders"];

// Subquery: calculate customer lifetime value
Expression lifetimeValue = new Expression("lifetime_value",
    new Aggregate(AggregateType.Sum, orders["total"])
);

Query customerValueSubquery = new Query()
    .Select(lifetimeValue)
    .From(orders)
    .Where(new Filter(
        new Criterion<Column>(
            orders["customer_id"],
            ConditionalType.Equals,
            customers["customer_id"]
        )
    ));

// Subquery: average customer value
Query avgValueQuery = new Query()
    .Select(new Expression("avg_value",
        new Aggregate(AggregateType.Average, orders["total"])
    ))
    .From(orders);

// Find high-value customers (above average)
Filter highValueFilter = new Filter(
    new Criterion<Query>(
        customerValueSubquery,
        ConditionalType.GreaterThan,
        avgValueQuery
    )
);

Query mainQuery = new Query()
    .Select([
        customers["customer_name"],
        customers["email"]
    ])
    .Select(customerValueSubquery, "lifetime_value")
    .From(customers)
    .Where(highValueFilter)
    .OrderBy(customerValueSubquery, OrderClauseType.Descending);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
    Console.WriteLine("High-Value Customers (Above Average)");
    Console.WriteLine("=====================================\n");

    foreach (Result row in results)
    {
        string name = row.GetFieldString("customer_name");
        string email = row.GetFieldString("email");
        decimal value = row.GetFieldDecimal("lifetime_value");

        Console.WriteLine($"{name} ({email})");
        Console.WriteLine($"  Lifetime Value: ${value:N2}");
        Console.WriteLine();
    }
}

Example 2: Product Performance Analysis

Table products = schema["products"];
Table order_items = schema["order_items"];

// Subquery: times ordered
Expression timesOrdered = new Expression("times_ordered",
    new Aggregate(AggregateType.Count, new ArithmeticOperator(ArithmeticType.All))
);

Query orderedCountSubquery = new Query()
    .Select(timesOrdered)
    .From(order_items)
    .Where(new Filter(
        new Criterion<Column>(
            order_items["product_id"],
            ConditionalType.Equals,
            products["product_id"]
        )
    ));

// Subquery: total revenue
Expression lineTotal = new Expression("line_total", [
    order_items["quantity"],
    new ArithmeticOperator(ArithmeticType.Multiply),
    order_items["unit_price"]
]);

Expression totalRevenue = new Expression("total_revenue",
    new Aggregate(AggregateType.Sum, lineTotal)
);

Query revenueSubquery = new Query()
    .Select(totalRevenue)
    .From(order_items)
    .Where(new Filter(
        new Criterion<Column>(
            order_items["product_id"],
            ConditionalType.Equals,
            products["product_id"]
        )
    ));

// Filter: products ordered at least once
Filter orderedFilter = new Filter(
    new Criterion<Query>(
        new Literal<int>(0),
        ConditionalType.LessThan,
        orderedCountSubquery
    )
);

Query mainQuery = new Query()
    .Select([
        products["product_name"],
        products["price"]
    ])
    .Select(orderedCountSubquery, "times_ordered")
    .Select(revenueSubquery, "revenue")
    .From(products)
    .Where(orderedFilter)
    .OrderBy(revenueSubquery, OrderClauseType.Descending);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(mainQuery))
{
    Console.WriteLine("Product Performance Report");
    Console.WriteLine("==========================\n");

    int rank = 1;
    foreach (Result row in results.Take(20))
    {
        string name = row.GetFieldString("product_name");
        decimal price = row.GetFieldDecimal("price");
        int ordered = row.GetFieldInteger("times_ordered");
        decimal revenue = row.GetFieldDecimal("revenue");

        Console.WriteLine($"{rank}. {name} (${price:F2})");
        Console.WriteLine($"   Times Ordered: {ordered}");
        Console.WriteLine($"   Total Revenue: ${revenue:N2}");
        Console.WriteLine();

        rank++;
    }
}

Example 3: Comparative Analysis

Table products = schema["products"];
Table order_items = schema["order_items"];

// Subquery: category average price
Expression categoryAvgPrice = new Expression("category_avg",
    new Aggregate(AggregateType.Average, products["price"])
);

// Compare each product to its category average
Query mainQuery = new Query()
    .Select([
        products["product_name"],
        products["category"],
        products["price"]
    ])
    .From(products);

// This would require a correlated subquery in SELECT for category average
// Then calculate difference in application code or use expressions

Best Practices

  1. Use EXISTS instead of IN for better performance

    • EXISTS stops at first match
    • IN processes entire subquery
    • Especially important for large subqueries
  2. Limit subquery result sets

    • Add WHERE clauses to subqueries
    • Use DISTINCT when appropriate
    • Prevents processing unnecessary data
  3. Be careful with correlated subqueries

    • Execute once per outer row
    • Can be very slow on large datasets
    • Consider joins as alternative
  4. Index subquery columns

    • Add indexes to columns in subquery WHERE
    • Improves subquery performance
    • Essential for correlated subqueries
  5. Consider alternatives

    • JOINs often faster than subqueries
    • CTEs (Common Table Expressions) for readability
    • Temporary tables for complex operations
  6. Test with realistic data

    • Subquery performance varies by size
    • Test with production-scale data
    • Monitor execution plans
  7. Avoid subqueries in SELECT when possible

    • Execute for every row
    • Consider JOINs or window functions
    • Pre-calculate when feasible
  8. Use appropriate comparison operators

    • EXISTS/NOT EXISTS for existence checks
    • IN/NOT IN for value lists
    • Comparison operators for scalar values

Next Steps