The shard step action splits large migrations into multiple PRs by evaluating which files a codemod would modify and grouping them into shards. Each shard becomes a matrix task that creates its own PR.
How it works
Sharding workflows follow a two-node pattern:
- evaluate-shards: Runs the shard step to discover applicable files, group them, and write shard assignments to workflow state.
- apply-transforms (matrix): Iterates over the shards with a matrix strategy, applying the codemod and creating one PR per shard.
workflow.yaml
File discovery
The engine pre-filters files before passing them to any shard method. This pre-filtering is shared across both built-in and custom methods. When js-ast-grep is set on the shard step, the engine:
- Globs files matching include under target
- Dry-runs the codemod against each file
- Keeps only files where the transform produces changes
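The three pre-filtering steps above can be sketched roughly as follows. This is only an illustration of the idea, not the engine's implementation: the Candidate and DryRun types, the globToRegExp helper (which handles only `*` and `**`), and the prefilter function are all hypothetical names introduced here.

```typescript
// Hypothetical shapes for illustration; the engine's real internals are not
// described in this document.
interface Candidate {
  path: string;   // path relative to target
  source: string; // file contents
}
// A dry run returns the transformed source without writing it to disk.
type DryRun = (path: string, source: string) => string;

// Minimal glob support ("**/", "**", "*"); real glob syntax is richer.
function globToRegExp(glob: string): RegExp {
  let pattern = "";
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith("**/", i)) {
      pattern += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (glob.startsWith("**", i)) {
      pattern += ".*";
      i += 2;
    } else if (glob.charAt(i) === "*") {
      pattern += "[^/]*"; // anything within one path segment
      i += 1;
    } else {
      pattern += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp(`^${pattern}$`);
}

function prefilter(candidates: Candidate[], include: string, dryRun: DryRun): string[] {
  const matcher = globToRegExp(include);
  return candidates
    .filter((file) => matcher.test(file.path)) // step 1: glob
    .filter((file) => dryRun(file.path, file.source) !== file.source) // steps 2 and 3
    .map((file) => file.path);
}

// Usage with a toy transform that rewrites `var` to `let`.
const changed = prefilter(
  [
    { path: "src/a.ts", source: "var x = 1;" },
    { path: "src/b.ts", source: "let y = 2;" },
    { path: "src/c.js", source: "var z = 3;" },
  ],
  "**/*.ts",
  (_path, source) => source.replace(/\bvar\b/g, "let"),
);
console.log(changed); // only src/a.ts matches the glob and is changed by the transform
```

Only files that survive both filters reach the shard method, so shards never contain files the codemod would leave untouched.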
Root directory to scan for files. Defaults to the workflow run target (project root).
State key to write shard results to. Must match the state schema key referenced by from_state in the matrix node.
Glob pattern for eligible files. Used when js-ast-grep is not set.
JSSG codemod configuration for pre-filtering. When set, the engine dry-runs the codemod and only shards files where the transform produces changes. See JSSG step parameters for the full field reference.
Built-in methods
Built-in methods handle grouping and bin-packing automatically. Set method.type to choose an algorithm.
Codeowner
Groups files by their owning team from .github/CODEOWNERS (or root CODEOWNERS), then bin-packs into shards.
- name: "{team}-{index}" (e.g. "platform-team-0")
- team: the owning team
- _meta_files: files in the shard
Method parameters
directory or codeowner.
Target number of files per shard.
Minimum shard size. Trailing shards smaller than this are merged into the previous shard.
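To make the interaction between the target size and the minimum size concrete, here is a minimal sketch of directory-style grouping followed by packing. It assumes simple sequential chunking as a stand-in for the engine's bin-packing, whose exact algorithm is not specified here. The function names, the "." group for top-level files, and the per-group index in the shard name are assumptions made for illustration.

```typescript
interface Shard {
  name: string;          // "{directory}-{index}"
  directory: string;
  _meta_shard: number;   // overall shard index (assumed to be global)
  _meta_files: string[];
}

// Immediate subdirectory of a path relative to target.
// Top-level files are grouped under "." (an assumption of this sketch).
function immediateSubdirectory(path: string): string {
  const slash = path.indexOf("/");
  return slash === -1 ? "." : path.slice(0, slash);
}

// Splits files into chunks of targetSize, then merges a trailing chunk that is
// smaller than minSize into the previous chunk.
function chunk(files: string[], targetSize: number, minSize: number): string[][] {
  const chunks: string[][] = [];
  for (let i = 0; i < files.length; i += targetSize) {
    chunks.push(files.slice(i, i + targetSize));
  }
  const last = chunks[chunks.length - 1];
  const previous = chunks[chunks.length - 2];
  if (last !== undefined && previous !== undefined && last.length < minSize) {
    previous.push(...last);
    chunks.pop();
  }
  return chunks;
}

function shardByDirectory(files: string[], targetSize: number, minSize: number): Shard[] {
  const groups = new Map<string, string[]>();
  [...files].sort().forEach((file) => {
    const directory = immediateSubdirectory(file);
    const group = groups.get(directory) ?? [];
    group.push(file);
    groups.set(directory, group);
  });

  const shards: Shard[] = [];
  groups.forEach((groupFiles, directory) => {
    chunk(groupFiles, targetSize, minSize).forEach((chunkFiles, index) => {
      shards.push({
        name: `${directory}-${index}`,
        directory,
        _meta_shard: shards.length,
        _meta_files: chunkFiles,
      });
    });
  });
  return shards;
}

// Five files under components/ and one under utils/, target size 2, minimum size 2.
const example = shardByDirectory(
  ["components/a.tsx", "components/b.tsx", "components/c.tsx",
   "components/d.tsx", "components/e.tsx", "utils/x.ts"],
  2,
  2,
);
console.log(example.map((s) => `${s.name}: ${s._meta_files.length}`));
// components-0: 2, components-1: 3 (trailing single file merged), utils-0: 1
```

Note that a group with a single undersized chunk, such as utils here, has no previous shard to merge into and is kept as it is in this sketch.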
Custom shard functions
For grouping logic that built-in methods can’t express (e.g. dependency-aware clustering), point method.function to a JS/TS file that runs inside the jssg engine.
Directory
Groups files by their immediate subdirectory under target, then bin-packs into shards.
- name: "{directory}-{index}" (e.g. "components-0")
- directory: the subdirectory path
- _meta_files: files in the shard
Function signature
ShardInput
| Field | Type | Description |
|---|---|---|
| files | string[] | Relative paths of eligible files (pre-filtered by the engine) |
| targetDir | string | Absolute path to the target directory |
| previousShards | ShardResult[] | Previous shard assignments for incremental re-evaluation |
ShardResult
| Field | Type | Description |
|---|---|---|
| name | string | Shard identifier (used in PR titles, branch names) |
| _meta_shard | number | Shard index |
| _meta_files | string[] | Files in this shard |
| Any other key | unknown | Exposed as ${{ matrix.<key> }} variables |
Fields prefixed with _meta_ are excluded from the matrix hash. This means re-indexing shards won’t invalidate existing task identity.
Available APIs
Custom shard functions run in the jssg engine with full access to:
- codemod:ast-grep: Parse files, match patterns, navigate ASTs. Useful for building dependency graphs from import statements.
- codemod:workflow: Types (ShardInput, ShardResult) and state management APIs.
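As a rough illustration of what a custom shard function might look like, the sketch below keeps each test file in the same shard as its source file, a very simple form of dependency-aware clustering. The interfaces are local mirrors of the ShardInput and ShardResult tables; in a real shard function they would presumably come from codemod:workflow. The function name shardFiles, the MAX_FILES_PER_SHARD limit, the "cluster-{n}" naming, and the way the engine would load and call the function are all assumptions, and previousShards is ignored for brevity.

```typescript
// Local mirrors of the documented tables (assumed to match codemod:workflow).
interface ShardResult {
  name: string;
  _meta_shard: number;
  _meta_files: string[];
  [key: string]: unknown; // any other key becomes a matrix variable
}
interface ShardInput {
  files: string[];
  targetDir: string;
  previousShards: ShardResult[];
}

const MAX_FILES_PER_SHARD = 3; // hypothetical limit for this sketch

// "src/a.test.ts" and "src/a.spec.ts" share the key "src/a.ts" with their source.
function clusterKey(path: string): string {
  return path.replace(/\.(test|spec)(\.[^./]+)$/, "$2");
}

function shardFiles(input: ShardInput): ShardResult[] {
  // Cluster each source file with its tests.
  const clusters = new Map<string, string[]>();
  [...input.files].sort().forEach((file) => {
    const key = clusterKey(file);
    const cluster = clusters.get(key) ?? [];
    cluster.push(file);
    clusters.set(key, cluster);
  });

  // Greedily pack whole clusters into shards without splitting a cluster.
  const shards: ShardResult[] = [];
  let current: string[] = [];
  const flush = (): void => {
    if (current.length === 0) return;
    shards.push({
      name: `cluster-${shards.length}`,
      _meta_shard: shards.length,
      _meta_files: current,
    });
    current = [];
  };
  clusters.forEach((cluster) => {
    if (current.length > 0 && current.length + cluster.length > MAX_FILES_PER_SHARD) {
      flush();
    }
    current.push(...cluster);
  });
  flush();
  return shards;
}

const result = shardFiles({
  files: ["src/a.ts", "src/a.test.ts", "src/b.ts", "src/c.ts", "src/c.spec.ts"],
  targetDir: "/repo", // placeholder path
  previousShards: [],
});
console.log(result.map((s) => s._meta_files));
```

In this run, a.ts, its test, and b.ts fill the first shard, and c.ts stays together with c.spec.ts in the second. A production function would more likely derive clusters from import statements via codemod:ast-grep rather than from file names.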
Re-evaluation
When the target repo changes (files added, moved, or deleted), retry the evaluate-shards task. The engine re-evaluates shards with incremental stability:
- Existing assignments are preserved: a file already assigned to shard 1 stays in shard 1.
- New files go to new shards: they never get added to shards whose tasks are already completed or in progress.
- Empty shards are dropped: if all files in a shard were deleted, its task is marked WontDo.
Custom shard functions receive previousShards in the input so the function can implement its own incremental logic.
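One way a custom function might use previousShards to mirror the stability rules listed above is sketched here. It is an assumption-laden illustration, not the engine's algorithm: the reassign function, the "shard-{n}" naming for new shards, and the choice to omit emptied shards from the output are all hypothetical.

```typescript
interface ShardResult {
  name: string;
  _meta_shard: number;
  _meta_files: string[];
  [key: string]: unknown;
}

function reassign(
  files: string[],
  previousShards: ShardResult[],
  targetSize: number,
): ShardResult[] {
  const eligible = new Set(files);
  const assigned = new Set<string>();
  const result: ShardResult[] = [];

  // Preserve existing assignments: a file stays in the shard it was in,
  // and files that are no longer eligible are removed from that shard.
  previousShards.forEach((shard) => {
    const kept = shard._meta_files.filter((file) => eligible.has(file));
    kept.forEach((file) => assigned.add(file));
    // Drop shards that ended up empty (assumed handling for this sketch).
    if (kept.length > 0) {
      result.push({ ...shard, _meta_files: kept });
    }
  });

  // New files only ever go to new shards, numbered after the highest old index.
  const fresh = files.filter((file) => !assigned.has(file));
  let nextIndex =
    previousShards.reduce((max, shard) => Math.max(max, shard._meta_shard), -1) + 1;
  for (let i = 0; i < fresh.length; i += targetSize) {
    result.push({
      name: `shard-${nextIndex}`,
      _meta_shard: nextIndex,
      _meta_files: fresh.slice(i, i + targetSize),
    });
    nextIndex += 1;
  }
  return result;
}

// b.ts and c.ts were deleted; d.ts and e.ts are new.
const updated = reassign(
  ["a.ts", "d.ts", "e.ts"],
  [
    { name: "shard-0", _meta_shard: 0, _meta_files: ["a.ts", "b.ts"] },
    { name: "shard-1", _meta_shard: 1, _meta_files: ["c.ts"] },
  ],
  2,
);
console.log(updated.map((s) => `${s.name}: ${s._meta_files.join(", ")}`));
// shard-0 keeps a.ts, shard-1 is dropped, and d.ts and e.ts land in a new shard-2
```

Numbering new shards after the highest previous index avoids reusing the name of a dropped shard, which keeps existing PR branches and task identities unambiguous.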