Contributing#

Thank you for your interest in contributing to chronocratic-models. This guide covers the development workflow and tooling used in the project.

Development Setup#

The project uses uv for environment and dependency management.

# Clone the repository
git clone https://github.com/chronocratic/chronocratic-models.git
cd chronocratic-models

# Sync the development environment
uv sync

Running Tests#

# Run the full test suite
uv run pytest tests/

# Run tests with coverage
uv run pytest tests/ --cov=src/chronocratic/models --cov-report=xml

Linting and Formatting#

The project uses ruff for linting and formatting.

# Check for linting issues
uv run ruff check src/ tests/

# Auto-fix linting issues
uv run ruff check --fix src/ tests/

# Format code
uv run ruff format src/ tests/

Type Checking#

The project uses ty for static type checking.

# Run type checking
uv run ty check src/

Building Documentation#

# Install documentation dependencies
uv sync --extra docs

# Build the documentation
uv run sphinx-build -b html docs/ docs/_build/

Adding Changelog Fragments#

The project uses towncrier for managing changelog entries. Each PR should include a changelog fragment in the changelog.d/ directory.

# Create a fragment (e.g., for a new feature in PR #42)
echo "Added new TimeVAE model for generative time-series encoding." > changelog.d/42.added.md

Fragment types: added, changed, deprecated, removed, fixed, security.

# Verify fragments before merging
uv run towncrier check --compare-with origin/dev

See changelog.d/README.md for detailed fragment instructions.

Code Style#

  • Use snake_case for functions and variables, PascalCase for classes.

  • Write Google-style docstrings for all public functions and classes.

  • Use type hints for all function signatures and return types.

  • Prefer functional programming patterns and modular code organization.

  • Use keyword arguments for all function calls.

Parameters Consistency#

All model classes and their config dataclasses follow shared parameter conventions. These rules ensure that Model(**vars(ModelParameters(...))) works across the entire library.

Canonical Hyperparameter Names#

Use these exact names. Do not invent alternatives.

Canonical Name

Description

Do NOT Use

input_dims

Number of input features/channels

feat_dim, n_in, input_channels

hidden_dims

Hidden representation size

d_model

depth

Number of layers

num_layers

dropout_rate

Dropout probability

dropout

num_heads

Attention head count

n_heads

feedforward_dims

FFN intermediate dimension

dim_feedforward

sequence_length

Temporal dimension

max_seq_len, seq_len

conv_kernel_size

Convolution kernel size

kernel_size

weight_decay

L2 regularization

l2_reg

output_dims

Output/embedding dimension

final_out_channels

reconstruction_weight

VAE reconstruction term

reconstruction_wt

Model-specific names are acceptable only when genuinely unique to that model (e.g., latent_dim for TimeVAE, embedding_dims for Series2Vec).

Config-to-Model Contract#

Config dataclasses and model __init__ signatures must mirror each other exactly.

  • Every config field must have a matching __init__ parameter with the same name and same default value.

  • Use @dataclass(kw_only=True) on all config classes.

  • Defaults must be declared in both the config dataclass and the model __init__. This allows partial config instantiation.

  • Verify with Model(**vars(ModelParameters(...))) — it must not raise.

  • Use save_hyperparameters(ignore=["augmentation"]) in Lightning modules. Non-callable config values should not be ignored.

Example:

# Config
@dataclass(kw_only=True)
class MyModelParameters:
    input_dims: int
    hidden_dims: int = 64
    dropout_rate: float = 0.1

# Model
class MyModel(pl.LightningModule):
    def __init__(
        self,
        input_dims: int,
        hidden_dims: int = 64,
        dropout_rate: float = 0.1,
    ):
        super().__init__()
        self._input_dims = input_dims
        self._hidden_dims = hidden_dims
        self._dropout_rate = dropout_rate
        save_hyperparameters(ignore=["augmentation"])

Tuple over List for Sequence Defaults#

All list-typed hyperparameters use tuple[T, ...] instead of list[T].

Hyperparameter sequences are never mutated at runtime — only iterated or indexed. Tuples allow direct defaults without field(default_factory=...) boilerplate and enforce immutability.

Do:

kernel_sizes: tuple[int, ...] = (1, 2, 4, 8, 16, 32, 64, 128)
hidden_layer_sizes: tuple[int, ...] = (50, 100, 200)
lr_step: tuple[int, ...] | None = None  # | None only when truly optional

Don’t:

kernel_sizes: list[int] = [1, 2, 4]                   # mutable default
kernel_sizes: list[int] = field(default_factory=...)  # unnecessary boilerplate

If an internal component expects a list, convert at the boundary: list(self._kernel_sizes). Use Sequence[T] in internal type annotations to match the config layer.

Default Value Sourcing#

Source reference repository code, not papers. Papers omit implementation details; cloned repos are ground truth.

  1. Clone the original implementation’s repository.

  2. Check actual constructor defaults and CLI argument defaults.

  3. Document any deliberate divergence in .planning/audits/.

Hardcoded Constant Extraction#

Architecture-defining constants (channel widths, kernel sizes, dilation rates, projection dimensions) must be extracted to config parameters rather than hardcoded in encoder/decoder implementations.

Do:

# Config
encoder_channels: tuple[int, ...] = (128, 256, 128)
encoder_kernels: tuple[int, ...] = (7, 5, 3)

# Encoder
for channels, kernel in zip(self._encoder_channels, self._encoder_kernels):
    layer = nn.Conv1d(...)

Don’t:

# Hardcoded in encoder
nn.Conv1d(in_channels, 128, kernel_size=7),
nn.Conv1d(128, 256, kernel_size=5),
nn.Conv1d(256, 128, kernel_size=3),

Out of scope for extraction: optimizer types, gradient clipping norms, LayerNorm epsilon values, structural invariants (e.g., fixed MaxPool kernel sizes that are part of the architecture definition).

self._{name} Attribute Storage#

Store all hyperparameters as private attributes with the self._{name} prefix in model __init__.

def __init__(self, input_dims: int, hidden_dims: int = 64):
    self._input_dims = input_dims
    self._hidden_dims = hidden_dims

Public attributes are reserved for computed values (e.g., self.criterion, self.loss_fn) and submodules (self._encoder, self._decoder).

Literal vs. Enum Choices#

Pattern

When to Use

Example

Enum (StrEnum)

Closed, small set of values

MaskMode, RecurrentCellType

str with default

Broad options, open-ended

pos_encoding: str = "fixed", activation: str = "gelu"

Unconstrained numeric

Values users may override freely

dropout_rate: float = 0.01

Literal

Only in config dataclasses, not model __init__

OptimizerName = Literal["Adam", "RAdam", "AdamW"]

Never use Literal to restrict numeric values — users may legitimately override them. Keep model __init__ signatures using str or concrete Enum types; reserve Literal for config-layer type narrowing.

Cross-Model Consistency Checklist#

Before merging a model change, verify:

  • [ ] All parameter names match the canonical names table above.

  • [ ] Config dataclass fields exactly match __init__ parameter names.

  • [ ] Default values are identical in config and model signatures.

  • [ ] Model(**vars(ModelParameters(...))) instantiates without error.

  • [ ] Sequence-typed HPs use tuple[T, ...], not list[T].

  • [ ] All HPs stored as self._{name} private attributes.

  • [ ] save_hyperparameters(ignore=["augmentation"]) is called (if applicable).

  • [ ] Default values are sourced from reference repos, not guessed.

  • [ ] Architecture constants are extracted to config, not hardcoded.

  • [ ] Added/updated tests cover the config splat contract.

Tensor Shape Convention#

All model entry points in this library use (B, T, C) (batch, time, channels) as the input tensor layout. This matches PyTorch’s DataLoader output convention and the transformers ecosystem.

Encoder-Owns-the-Transpose Rule#

Conv1d-based encoders must transpose (B, T, C) to (B, C, T) as the first line of their forward() method. The model wrapper, training step, and loss functions should never transpose.

The encoder owns the transpose.

def forward(self, x: torch.Tensor) -> torch.Tensor:
    """Encode (B, T, C) input into (B, output_dims) representation."""
    x = x.transpose(1, 2)  # (B, T, C) -> (B, C, T) for Conv1d
    return self.layers(x)

Existing examples in the codebase:

  • TimeVAEEncoder.forward()transpose(1, 2) at entry

  • Series2VecNetwork._to_channels_first() — layout conversion helper

  • Dilated encoders — transpose(1, 2) in _common_forward()

  • FCNEncoder.forward()transpose(1, 2) at entry (D-01)

  • TCCEncoder.forward()transpose(1, 2) at entry (D-01)

Augmentation Axes#

Augmentation primitives (Scaling, Permutation) operate on the raw (B, T, C) data before encoding. Configure axis parameters accordingly:

  • ScalingParameters(channel_dim=-1) — scales along the channel axis (dim=2 in 3-D)

  • PermutationParameters(time_dim=1) — permutes along the time axis

Testing#

Always use asymmetric shapes (T != C) in encoder tests to catch transpose regressions. For example, torch.randn(4, 50, 3) for (B, T, C) with T=50 and C=3 will crash if the encoder drops its transpose, because Conv1d would see 50 channels instead of 3.

Device Compatibility (CPU / CUDA / MPS)#

Code in this library must work correctly on all PyTorch backends. Follow these five rules to ensure cross-device compatibility.

1. Create on input’s device, don’t transfer after#

Build auxiliary tensors on the same device as the input, instead of creating on CPU and then calling .to().

Do:

labels = torch.eye(k - 1, dtype=torch.float32, device=z1.device)
mask = torch.zeros(batch_size, seq_len, device=x.device)

Don’t:

labels = torch.eye(k - 1, dtype=torch.float32).to(z1.device)  # CPU allocation then transfer

2. Loss functions inherit device from first tensor argument#

Every loss function must derive its working device from its first tensor argument. Tensors created inside forward() or loss computation must use device=input.device.

Gold standard pattern: NTXentLoss._correlated_mask() in tstcc/losses.py:

mask = ~torch.eye(n, dtype=torch.bool, device=device)
idx = torch.arange(batch_size, device=device)

3. Host-side libraries need explicit round-trip#

Libraries like SciPy only accept host (numpy) arrays. On MPS tensors, calling .numpy() without .cpu() first raises RuntimeError. Explicitly round-trip through CPU.

from scipy.signal import lfilter
import numpy as np

def _filter_on_device(b: np.ndarray, a: np.ndarray, data: torch.Tensor) -> torch.Tensor:
    filtered = lfilter(b, a, data.cpu().numpy())
    return torch.as_tensor(filtered, dtype=torch.float32, device=data.device)

4. CUDA-only kernels fall back to CPU on MPS — that is acceptable#

If a kernel has no MPS equivalent (e.g., SoftDTW’s CUDA kernel), falling back to CPU when x.is_cuda is False is the correct behavior. Document the fallback with a comment to prevent future “fixes” that duplicate the logic.

See: Series2Vec._build_soft_dtw() — MPS tensors have is_cuda=False, so they correctly use the CPU path.

5. pin_memory=True only when no gradients flow#

pin_memory=True in DataLoader stages a CPU buffer for non-blocking H2D copies. When gradients are enabled, pinning can warn or error because tensors require grad and pinning allocates pagelocked memory.

Do:

loader = DataLoader(dataset, pin_memory=not gradient_enabled)

Lint Guard#

Run bash scripts/check_device.sh to detect bare tensor constructors (torch.eye, torch.zeros, torch.ones, torch.arange, etc.) without device= in model source files. Legitimate exceptions are annotated with # device-ok.