Table of Contents

Joins

This guide covers combining data from multiple tables using JOIN operations with the fluent query syntax.


Overview

Joins combine rows from two or more tables based on related columns or predicates. Velocity provides comprehensive join support through:

  • Join - Join operation container
  • JoinType - Join types (INNER, LEFT, RIGHT, FULL, CROSS)
  • JoinOperatorType - Comparison operators for join conditions
  • JoinCondition - Complex join conditions

Join Types

JoinType enumeration provides all standard SQL join types:

JoinType SQL Equivalent Description
Inner INNER JOIN Returns only matching rows from both tables
LeftOuter LEFT JOIN Returns all rows from left table, matches from right
RightOuter RIGHT JOIN Returns all rows from right table, matches from left
FullOuter FULL OUTER JOIN Returns all rows from both tables
Cross CROSS JOIN Returns Cartesian product (all combinations)

Basic Joins

INNER JOIN

Returns only rows that have matching values in both tables:

using YndigoBlue.Velocity.Model;
using YndigoBlue.Velocity.Enums;

Table orders = schema["orders"];
Table customers = schema["customers"];

// Join on shared column name
var join = new Join(orders, customers, "customer_id");

Query query = new Query()
    .Select([
        orders["order_id"],
        orders["order_date"],
        customers["customer_name"],
        customers["email"]
    ])
    .From(join);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(query))
{
    foreach (Result row in results)
    {
        int orderId = row.GetFieldInteger("order_id");
        DateTime orderDate = row.GetFieldDateTime("order_date");
        string customerName = row.GetFieldString("customer_name");

        Console.WriteLine($"Order #{orderId} - {customerName} ({orderDate:d})");
    }
}

Generated SQL:

SELECT orders.order_id, orders.order_date, customers.customer_name, customers.email
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id

LEFT JOIN

Returns all rows from the left table, with matching rows from the right table (NULL if no match):

Table orders = schema["orders"];
Table customers = schema["customers"];

// Include all orders, even those without customer records
var join = new Join(orders, customers, "customer_id", JoinType.LeftOuter);

Query query = new Query()
    .Select([
        orders["order_id"],
        orders["total"],
        customers["customer_name"]
    ])
    .From(join);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(query))
{
    foreach (Result row in results)
    {
        int orderId = row.GetFieldInteger("order_id");
        decimal total = row.GetFieldDecimal("total");

        if (row.IsNull("customer_name"))
        {
            Console.WriteLine($"Order #{orderId} - Unknown Customer (${total:F2})");
        }
        else
        {
            string customerName = row.GetFieldString("customer_name");
            Console.WriteLine($"Order #{orderId} - {customerName} (${total:F2})");
        }
    }
}

Generated SQL:

SELECT orders.order_id, orders.total, customers.customer_name
FROM orders
LEFT JOIN customers ON orders.customer_id = customers.customer_id

RIGHT JOIN

Returns all rows from the right table, with matching rows from the left table:

Table customers = schema["customers"];
Table orders = schema["orders"];

// Include all customers, even those with no orders
var join = new Join(orders, customers, "customer_id", JoinType.RightOuter);

Expression orderCount = new Expression("order_count",
    new Aggregate(AggregateType.Count, orders["order_id"])
);

Query query = new Query()
    .Select([
        customers["customer_name"],
        orderCount
    ])
    .From(join)
    .GroupBy(customers["customer_name"]);

Generated SQL:

SELECT customers.customer_name, COUNT(orders.order_id) AS order_count
FROM orders
RIGHT JOIN customers ON orders.customer_id = customers.customer_id
GROUP BY customers.customer_name

FULL OUTER JOIN

Returns all rows from both tables, with NULLs where no match exists:

Table orders = schema["orders"];
Table customers = schema["customers"];

var join = new Join(orders, customers, "customer_id", JoinType.FullOuter);

Query query = new Query()
    .Select([
        orders["order_id"],
        customers["customer_name"]
    ])
    .From(join);

Generated SQL:

SELECT orders.order_id, customers.customer_name
FROM orders
FULL OUTER JOIN customers ON orders.customer_id = customers.customer_id

Joins with Different Column Names

When join columns have different names in each table:

Table orders = schema["orders"];
Table customers = schema["customers"];

// Join orders.customer_id with customers.id
var join = new Join(orders, "customer_id", customers, "id");

Query query = new Query()
    .Select([
        orders["order_id"],
        customers["customer_name"]
    ])
    .From(join);

Generated SQL:

SELECT orders.order_id, customers.customer_name
FROM orders
INNER JOIN customers ON orders.customer_id = customers.id

Table Aliases

Basic Aliases

Use aliases to shorten table references or disambiguate columns:

Table orders = schema["orders"];
Table customers = schema["customers"];

// Use short aliases 'o' and 'c'
var join = new Join(orders, "o", customers, "c", "customer_id");

Query query = new Query()
    .Select([
        orders["order_id"],
        customers["customer_name"]
    ])
    .From(join);

Generated SQL:

SELECT o.order_id, c.customer_name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id

Self-Join with Aliases

Join a table to itself (required aliases):

Table employees = schema["employees"];

// Find employees and their managers
var join = new Join(employees, "emp", employees, "mgr", "manager_id");

Query query = new Query()
    .Select([
        (employees["employee_name"], "employee_name"),
        (employees["employee_name"], "manager_name")  // Second reference needs alias context
    ])
    .From(join);

Generated SQL:

SELECT emp.employee_name AS employee_name, mgr.employee_name AS manager_name
FROM employees emp
INNER JOIN employees mgr ON emp.manager_id = mgr.manager_id

Multi-Table Joins

Three or More Tables

Join multiple tables in a single query:

Table orders = schema["orders"];
Table customers = schema["customers"];
Table order_items = schema["order_items"];
Table products = schema["products"];

// Join using tuple list - simplest for shared column names
var join = new Join([
    (orders, customers, "customer_id"),
    (orders, order_items, "order_id"),
    (order_items, products, "product_id")
]);

Query query = new Query()
    .Select([
        customers["customer_name"],
        orders["order_date"],
        products["product_name"],
        order_items["quantity"]
    ])
    .From(join);

Generated SQL:

SELECT customers.customer_name, orders.order_date, products.product_name, order_items.quantity
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id
INNER JOIN order_items ON orders.order_id = order_items.order_id
INNER JOIN products ON order_items.product_id = products.product_id

Multi-Table Joins with Different Column Names

var join = new Join([
    (orders, "o", "customer_id", customers, "c", "id"),
    (orders, "o", "order_id", order_items, "oi", "order_id"),
    (order_items, "oi", "product_id", products, "p", "id")
]);

Query query = new Query()
    .Select([
        customers["customer_name"],
        orders["order_date"],
        products["product_name"]
    ])
    .From(join);

Generated SQL:

SELECT c.customer_name, o.order_date, p.product_name
FROM orders o
INNER JOIN customers c ON o.customer_id = c.id
INNER JOIN order_items oi ON o.order_id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id

Non-Equi Joins

Joins using operators other than equality:

Greater Than Join

Table products = schema["products"];
Table price_tiers = schema["price_tiers"];

// Join products to their price tier based on range
var join = new Join(products, "p", "price", price_tiers, "pt", "min_price", JoinType.Inner, JoinOperatorType.GreaterThanOrEqualTo);

Query query = new Query()
    .Select([
        products["product_name"],
        products["price"],
        price_tiers["tier_name"]
    ])
    .From(join);

Generated SQL:

SELECT p.product_name, p.price, pt.tier_name
FROM products p
INNER JOIN price_tiers pt ON p.price >= pt.min_price

Range Join

Table orders = schema["orders"];
Table shipping_rates = schema["shipping_rates"];

// Join orders to shipping rates based on order total range
var join = new Join();

// This requires multiple conditions - see Multi-Condition Joins
Filter rangeFilter = new Filter([
    new Criterion<Column>(
        orders["total"],
        ConditionalType.GreaterThanOrEqualTo,
        shipping_rates["min_amount"]
    ),
    new BooleanItem(BooleanType.And),
    new Criterion<Column>(
        orders["total"],
        ConditionalType.LessThan,
        shipping_rates["max_amount"]
    )
]);

var complexJoin = new Join(orders, shipping_rates, rangeFilter);

Query query = new Query()
    .Select([
        orders["order_id"],
        orders["total"],
        shipping_rates["rate"]
    ])
    .From(complexJoin);

Not Equal Join

Table products = schema["products"];
Table product_alternatives = schema["product_alternatives"];

// Find product alternatives (different products in same category)
var join = new Join(products, "p1", "category", products, "p2", "category");

// Add filter for product_id != alternative_product_id
Filter notSameProduct = new Filter(
    new Criterion<Column>(
        products["product_id"],
        ConditionalType.NotEquals,
        products["product_id"]  // Reference to second table
    )
);

Query query = new Query()
    .Select([
        (products["product_name"], "product"),
        (products["product_name"], "alternative")  // Second reference
    ])
    .From(join)
    .Where(notSameProduct);

Multi-Condition Joins

Join on multiple columns or complex conditions:

Multiple Column Equality

Table orders = schema["orders"];
Table customers = schema["customers"];

// Join on both customer_id AND region
Filter multiCondition = new Filter([
    new Criterion<Column>(
        orders["customer_id"],
        ConditionalType.Equals,
        customers["customer_id"]
    ),
    new BooleanItem(BooleanType.And),
    new Criterion<Column>(
        orders["region"],
        ConditionalType.Equals,
        customers["region"]
    )
]);

var join = new Join(orders, customers, multiCondition);

Query query = new Query()
    .Select([
        orders["order_id"],
        customers["customer_name"],
        customers["region"]
    ])
    .From(join);

Generated SQL:

SELECT orders.order_id, customers.customer_name, customers.region
FROM orders
INNER JOIN customers ON (orders.customer_id = customers.customer_id AND orders.region = customers.region)

Complex Join Conditions

Table employees = schema["employees"];

// Self-join with multiple conditions for organizational hierarchy
Filter hierarchyCondition = new Filter([
    new Criterion<Column>(
        employees["manager_id"],
        ConditionalType.Equals,
        employees["employee_id"]
    ),
    new BooleanItem(BooleanType.And),
    new Criterion<Column>(
        employees["department"],
        ConditionalType.Equals,
        employees["department"]
    )
]);

var join = new Join(employees, "e", employees, "m", hierarchyCondition);

Query query = new Query()
    .Select([
        (employees["employee_name"], "employee"),
        (employees["employee_name"], "manager"),
        (employees["department"], "department")
    ])
    .From(join);

Function-Based Joins

Use functions as join predicates (especially for geospatial operations):

Geospatial Join

Table regions = schema["regions"];
Table locations = schema["locations"];

// Join where location intersects region
var join = new Join(
    regions,
    locations,
    new Function(FunctionType.SpatialIntersects, regions["geom"], locations["geom"])
);

Query query = new Query()
    .Select([
        (regions["name"], "region_name"),
        (locations["name"], "location_name")
    ])
    .From(join);

Generated SQL:

SELECT regions.name AS region_name, locations.name AS location_name
FROM regions
INNER JOIN locations ON ST_Intersects(regions.geom, locations.geom)

Multi-Table Function-Based Joins

Table regions = schema["regions"];
Table locations = schema["locations"];
Table routes = schema["routes"];

// Find regions with intersecting locations and routes
var join = new Join([
    (regions, locations, new Function(FunctionType.SpatialIntersects, regions["geom"], locations["geom"])),
    (regions, routes, new Function(FunctionType.SpatialIntersects, regions["geom"], routes["geom"]))
]);

Query query = new Query()
    .Select([
        (regions["name"], "region"),
        (locations["name"], "location"),
        (routes["name"], "route")
    ])
    .From(join);

Generated SQL:

SELECT regions.name AS region, locations.name AS location, routes.name AS route
FROM regions
INNER JOIN locations ON ST_Intersects(regions.geom, locations.geom)
INNER JOIN routes ON ST_Intersects(regions.geom, routes.geom)

CROSS JOIN

Create Cartesian product (all combinations):

Basic Cross Join

Table colors = schema["colors"];
Table sizes = schema["sizes"];

// Generate all color/size combinations
var join = new Join(colors, sizes, JoinType.Cross);

Query query = new Query()
    .Select([
        colors["color_name"],
        sizes["size_name"]
    ])
    .From(join);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(query))
{
    // If colors has 5 rows and sizes has 3 rows, result has 15 rows
    foreach (Result row in results)
    {
        string color = row.GetFieldString("color_name");
        string size = row.GetFieldString("size_name");

        Console.WriteLine($"{color} - {size}");
    }
}

Generated SQL:

SELECT colors.color_name, sizes.size_name
FROM colors
CROSS JOIN sizes

Output:

Red - Small
Red - Medium
Red - Large
Blue - Small
Blue - Medium
Blue - Large
...

Joins with WHERE Clauses

Filter joined results:

Filter After Join

Table orders = schema["orders"];
Table customers = schema["customers"];

var join = new Join(orders, customers, "customer_id");

// Filter for recent high-value orders
Filter filter = new Filter([
    new Criterion<DateTime>(
        orders["order_date"],
        ConditionalType.GreaterThan,
        DateTime.Now.AddMonths(-6)
    ),
    new BooleanItem(BooleanType.And),
    new Criterion<decimal>(
        orders["total"],
        ConditionalType.GreaterThan,
        1000.00M
    )
]);

Query query = new Query()
    .Select([
        orders["order_id"],
        orders["order_date"],
        orders["total"],
        customers["customer_name"]
    ])
    .From(join)
    .Where(filter)
    .OrderBy(orders["order_date"], OrderClauseType.Descending);

Generated SQL:

SELECT orders.order_id, orders.order_date, orders.total, customers.customer_name
FROM orders
INNER JOIN customers ON orders.customer_id = customers.customer_id
WHERE orders.order_date > '2024-09-01' AND orders.total > 1000.00
ORDER BY orders.order_date DESC

Joins with Subqueries

Use subqueries in joined queries:

Subquery in JOIN

Table orders = schema["orders"];
Table customers = schema["customers"];

// Subquery to get customer IDs with > 5 orders
Query customerSubquery = new Query()
    .Select(orders["customer_id"])
    .From(orders)
    .GroupBy(orders["customer_id"])
    .Having(new Filter(
        new Criterion<int>(
            new Expression("order_count", new Aggregate(AggregateType.Count, new ArithmeticOperator(ArithmeticType.All))),
            ConditionalType.GreaterThan,
            5
        )
    ));

// Join customers with high-order-volume customer IDs
var join = new Join(customers, orders, "customer_id");

Filter inSubqueryFilter = new Filter(
    new Criterion<Query>(
        customers["customer_id"],
        ConditionalType.In,
        customerSubquery
    )
);

Query query = new Query()
    .Select([
        customers["customer_name"],
        orders["order_date"],
        orders["total"]
    ])
    .From(join)
    .Where(inSubqueryFilter);

Complete Examples

Example 1: Order Details Report

Table orders = schema["orders"];
Table customers = schema["customers"];
Table order_items = schema["order_items"];
Table products = schema["products"];

// Join all related tables
var join = new Join([
    (orders, customers, "customer_id"),
    (orders, order_items, "order_id"),
    (order_items, products, "product_id")
]);

// Calculate line total
Expression lineTotal = new Expression("line_total", [
    order_items["quantity"],
    new ArithmeticOperator(ArithmeticType.Multiply),
    order_items["unit_price"]
]);

Query query = new Query()
    .Select([
        orders["order_id"],
        customers["customer_name"],
        products["product_name"],
        order_items["quantity"],
        order_items["unit_price"],
        lineTotal
    ])
    .From(join)
    .OrderBy(orders["order_id"], OrderClauseType.Ascending);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(query))
{
    int currentOrderId = -1;
    decimal orderTotal = 0;

    foreach (Result row in results)
    {
        int orderId = row.GetFieldInteger("order_id");

        // Print order header when order changes
        if (orderId != currentOrderId)
        {
            if (currentOrderId != -1)
            {
                Console.WriteLine($"    Order Total: ${orderTotal:F2}");
                Console.WriteLine();
            }

            currentOrderId = orderId;
            orderTotal = 0;

            string customerName = row.GetFieldString("customer_name");
            Console.WriteLine($"Order #{orderId} - {customerName}");
            Console.WriteLine("----------------------------------------");
        }

        // Print order item
        string product = row.GetFieldString("product_name");
        int qty = row.GetFieldInteger("quantity");
        decimal price = row.GetFieldDecimal("unit_price");
        decimal total = row.GetFieldDecimal("line_total");

        Console.WriteLine($"  {product}: {qty} × ${price:F2} = ${total:F2}");

        orderTotal += total;
    }

    // Print last order total
    if (currentOrderId != -1)
    {
        Console.WriteLine($"    Order Total: ${orderTotal:F2}");
    }
}

Example 2: Customer Lifetime Value

Table customers = schema["customers"];
Table orders = schema["orders"];

var join = new Join(customers, orders, "customer_id", JoinType.LeftOuter);

Expression orderCount = new Expression("order_count",
    new Aggregate(AggregateType.Count, orders["order_id"])
);

Expression totalSpent = new Expression("total_spent",
    new Aggregate(AggregateType.Sum, orders["total"])
);

Expression avgOrderValue = new Expression("avg_order_value",
    new Aggregate(AggregateType.Average, orders["total"])
);

Expression lastOrderDate = new Expression("last_order_date",
    new Aggregate(AggregateType.Max, orders["order_date"])
);

Query query = new Query()
    .Select([
        customers["customer_id"],
        customers["customer_name"],
        customers["email"],
        orderCount,
        totalSpent,
        avgOrderValue,
        lastOrderDate
    ])
    .From(join)
    .GroupBy([
        customers["customer_id"],
        customers["customer_name"],
        customers["email"]
    ])
    .OrderBy(totalSpent, OrderClauseType.Descending);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(query))
{
    Console.WriteLine("Customer Lifetime Value Report");
    Console.WriteLine("==============================\n");

    foreach (Result row in results)
    {
        int customerId = row.GetFieldInteger("customer_id");
        string name = row.GetFieldString("customer_name");
        string email = row.GetFieldString("email");
        int orders = row.GetFieldInteger("order_count");

        Console.WriteLine($"{name} ({email})");
        Console.WriteLine($"  Customer ID: {customerId}");
        Console.WriteLine($"  Total Orders: {orders}");

        if (orders > 0)
        {
            decimal total = row.GetFieldDecimal("total_spent");
            decimal avg = row.GetFieldDecimal("avg_order_value");
            DateTime lastOrder = row.GetFieldDateTime("last_order_date");

            Console.WriteLine($"  Lifetime Value: ${total:N2}");
            Console.WriteLine($"  Avg Order: ${avg:F2}");
            Console.WriteLine($"  Last Order: {lastOrder:d}");
        }
        else
        {
            Console.WriteLine($"  No orders yet");
        }

        Console.WriteLine();
    }
}

Example 3: Employee Hierarchy

Table employees = schema["employees"];

// Self-join to find employees and their managers
var join = new Join(employees, "e", employees, "m", "manager_id", JoinType.LeftOuter);

Query query = new Query()
    .Select([
        (employees["employee_id"], "employee_id"),
        (employees["employee_name"], "employee_name"),
        (employees["title"], "employee_title"),
        (employees["employee_name"], "manager_name"),
        (employees["title"], "manager_title")
    ])
    .From(join)
    .OrderBy(employees["employee_name"], OrderClauseType.Ascending);

using (Manager manager = new Manager(connection))
using (ResultSet results = manager.Retrieve(query))
{
    Console.WriteLine("Employee Hierarchy");
    Console.WriteLine("==================\n");

    foreach (Result row in results)
    {
        int empId = row.GetFieldInteger("employee_id");
        string empName = row.GetFieldString("employee_name");
        string empTitle = row.GetFieldString("employee_title");

        Console.Write($"{empName} ({empTitle})");

        if (!row.IsNull("manager_name"))
        {
            string mgrName = row.GetFieldString("manager_name");
            string mgrTitle = row.GetFieldString("manager_title");
            Console.Write($" → Reports to: {mgrName} ({mgrTitle})");
        }
        else
        {
            Console.Write($" → Top Level");
        }

        Console.WriteLine();
    }
}

Best Practices

  1. Choose the right join type

    • INNER for only matching records
    • LEFT/RIGHT to include all from one side
    • FULL OUTER for all from both sides
    • CROSS for Cartesian products (use sparingly)
  2. Index join columns

    • Add indexes to columns used in JOIN conditions
    • Dramatically improves query performance
    • Essential for large tables
  3. Use table aliases

    • Makes queries more readable
    • Required for self-joins
    • Helpful with many tables
  4. Filter early when possible

    • Use WHERE to reduce rows before joining
    • Can significantly improve performance
    • Less data to process in the join
  5. Be careful with CROSS JOINs

    • Creates rows × rows combinations
    • Can generate enormous result sets
    • Use WHERE to filter results
  6. Mind NULL handling in OUTER JOINs

    • Always check IsNull() for outer-joined columns
    • NULL values indicate no match
    • Can affect aggregates and calculations
  7. Avoid joining on functions

    • Function-based joins can't use indexes
    • May have poor performance on large tables
    • Consider pre-calculating values when possible
  8. Test with realistic data

    • Verify join conditions return expected results
    • Check for duplicate or missing rows
    • Validate performance with production-size data

Next Steps