CK Tile Index
CK Tile documentation structure:
- Introduction and Motivation - Why Tile Distribution Matters
- CK Tile buffer view
- Tensor Views - Multi-Dimensional Structure
- Tile Distribution - The Core API
  - Overview
  - Complete Tile Distribution System Overview
  - Coordinate System Architecture
  - What is Tile Distribution?
  - Problem Space Mapping
  - Creating a TileDistribution
  - Hierarchical Decomposition
  - Advanced Example: Matrix Multiplication Distribution
  - Work Distribution Pattern
  - Memory Access Patterns
  - Transformation Pipeline
  - Performance Comparison
  - Summary
  - Next Steps
- Coordinate Systems - The Mathematical Foundation
  - Overview
  - The Five Coordinate Spaces
  - Thread Identification
  - Logical Work Organization
  - Physical Tensor Coordinates
  - The Core Transformation: P + Y → X
  - Replication and Cooperation
  - Memory Linearization
  - Complete Pipeline Example
  - Real-World Example: Matrix Multiplication
  - Performance Implications
  - Summary
  - Next Steps
- Terminology Reference - Key Concepts and Definitions
- Tensor Adaptors - Chaining Transformations
  - Overview
  - TensorAdaptor Basics
  - Transpose Adaptor: Dimension Reordering
  - Single-Stage Adaptors: Custom Transform Chains
  - Chaining Adaptors: Building Complex Transformations
  - Transform Addition: Extending Existing Adaptors
  - Advanced Patterns
  - Common Transform Chains
  - Key Concepts Summary
  - Key C++ Patterns in Composable Kernel
  - Next Steps
- Individual Transform Operations
- Tensor Descriptors - Complete Tensor Specifications
- Tile Window - Data Access Gateway
  - Overview
  - TileWindow Architecture
  - What is a TileWindow?
  - LoadStoreTraits - The Access Pattern Engine
  - Space-Filling Curves for Memory Access
  - TileWindow Data Flow
  - Creating and Using TileWindow
  - The Load Operation Deep Dive
  - Load Operation Architecture
  - Memory Access Patterns
  - Window Movement and Updates
  - Store Operations with Vectorization
  - Complete Load-Compute-Store Pipeline
  - Performance Characteristics
  - Best Practices
  - Summary
  - Next Steps
- LoadStoreTraits - Memory Access Optimization Engine
- Space-Filling Curves - Optimal Memory Traversal
- Static Distributed Tensor
- Convolution Implementation with CK Tile
- Advanced Coordinate Movement
- Load Data Share Index Swapping
- Memory Swizzling with Morton Ordering
- Tensor Coordinates
- Sweep Tile
- Encoding Internals
- Thread Mapping - Connecting to Hardware
- CK Tile Hardware Documentation