AtlantisV2 treats tests as part of the architecture. The tests are not just regression checks. They are how the core boundary is kept honest while legacy Atlantis behavior is extracted.
Required Loop
For core behavior, migrations, and bug fixes:
failing test or parity fixture
-> smallest implementation
-> refactor
-> verify
Documentation-only edits, mechanical formatting, generated files, and pure renames are exempt. Adapter wiring can be pragmatic, but adapter boundaries still need tests.
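A minimal sketch of the first step of the loop, written as a parity fixture. Every name here (`legacy_is_fresh`, `is_fresh`, `PARITY_CASES`) is hypothetical and stands in for whatever legacy rule a slice is replacing. The real Atlantis functions are not shown in this document.

```python
from datetime import timedelta


# Hypothetical stand-in for a legacy Atlantis rule (not the real legacy code).
def legacy_is_fresh(age_seconds: float, max_age_seconds: float) -> bool:
    return age_seconds <= max_age_seconds


# Hypothetical V2 replacement that takes timedelta inputs instead of raw floats.
def is_fresh(age: timedelta, max_age: timedelta) -> bool:
    return age <= max_age


# Parity fixture: (age_seconds, max_age_seconds), including the boundary case.
PARITY_CASES = [(0.0, 30.0), (29.9, 30.0), (30.0, 30.0), (30.1, 30.0)]


def parity_mismatches() -> list:
    """Return every fixture case where old and new behavior disagree."""
    mismatches = []
    for age_s, max_s in PARITY_CASES:
        old = legacy_is_fresh(age_s, max_s)
        new = is_fresh(timedelta(seconds=age_s), timedelta(seconds=max_s))
        if old != new:
            mismatches.append((age_s, max_s, old, new))
    return mismatches


def test_is_fresh_matches_legacy():
    # Written first; it fails until is_fresh exists and agrees with legacy.
    assert parity_mismatches() == []
```

Returning the mismatching cases, rather than asserting inside the loop, makes a failing run show every divergence at once.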
What Must Be Tested
Every core slice should test at least:
- deterministic output
- reference-time cutoff
- missing, stale, invalid, and withheld handling
- reason codes for rejects and defers
- lineage and schema version propagation
- old/new parity when replacing legacy behavior
- no venue, broker, database, or UI leakage into core objects
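Several of the checks above can be exercised against one small function. The sketch below is illustrative only: `Tick`, `Reading`, `Reason`, and `last_price` are hypothetical names, not AtlantisV2 types, and the reason codes are assumed for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class Reason(Enum):  # hypothetical reason codes
    OK = "ok"
    MISSING = "missing"
    STALE = "stale"


@dataclass(frozen=True)
class Tick:  # hypothetical input record
    observed_at: datetime
    price: Optional[float]


@dataclass(frozen=True)
class Reading:  # hypothetical output: a value plus the reason it is (un)usable
    value: Optional[float]
    reason: Reason


def last_price(ticks: list, reference_time: datetime, max_age: timedelta) -> Reading:
    # Reference-time cutoff: anything observed after reference_time is invisible.
    visible = [t for t in ticks if t.observed_at <= reference_time]
    if not visible:
        return Reading(None, Reason.MISSING)
    latest = max(visible, key=lambda t: t.observed_at)
    if latest.price is None:
        # Missing stays None with a reason code; it is never turned into 0.0.
        return Reading(None, Reason.MISSING)
    if reference_time - latest.observed_at > max_age:
        return Reading(None, Reason.STALE)
    return Reading(latest.price, Reason.OK)


T0 = datetime(2024, 1, 1, 9, 15, tzinfo=timezone.utc)
TICKS = [Tick(T0, 100.0), Tick(T0 + timedelta(seconds=60), 105.0)]
MAX_AGE = timedelta(minutes=5)


def test_future_tick_is_not_visible():
    reading = last_price(TICKS, T0 + timedelta(seconds=30), MAX_AGE)
    assert reading == Reading(100.0, Reason.OK)


def test_missing_and_stale_carry_reason_codes():
    assert last_price([], T0, MAX_AGE) == Reading(None, Reason.MISSING)
    assert last_price(TICKS, T0 + timedelta(hours=1), MAX_AGE).reason is Reason.STALE


def test_output_is_deterministic():
    at = T0 + timedelta(seconds=90)
    assert last_price(TICKS, at, MAX_AGE) == last_price(TICKS, at, MAX_AGE)
```

Passing `reference_time` explicitly, instead of reading a clock inside the function, is what lets the same test run identically in replay and live.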
Current Coverage Shape
The current suite covers:
| Test group | Protects |
|---|---|
| `test_time_model.py` | reference time, clocks, cycle/session/freshness scaffolds |
| `test_market_snapshot.py` | typed capability blocks and strict capability lookup |
| `test_snapshot_builder.py` | event dedupe, ordering, invalid payloads, freshness, future cutoff |
| `test_feature_kernel_*` | feature registry execution, lookback windows, future safety |
| `test_pattern_engine.py` | availability, staged matching, definition hashes, evidence |
| `test_scoring_engine.py` | confluence, conflict, quality penalties, mixed-underlying guard |
| `test_signal_intent.py` | thesis identity, preferred exposure, invalidation, target, evidence |
| `test_permission_decision.py` | status precedence, kill switch, duplicate suppression, reason codes |
| `test_lifecycle_*` | admission, opening, marks, closes, outcomes, idempotency |
| `test_adapter_boundary.py` | raw adapter payloads stop at EventEnvelope |
| `test_legacy_outcome_parity.py` | legacy path-vs-economic outcome semantics |
Agent Rules
Codex and Claude Code should follow the same policy:
- Read the relevant core docs and current implementation.
- Identify the legacy behavior if the slice migrates old Atlantis logic.
- Write the failing test or parity fixture first.
- Implement the smallest contract or behavior needed.
- Keep adapters and persistence out of core modules.
- Run `python -m pytest`, `ruff check .`, and `ruff format --check .`.
- Update `tasks/todo.md`.
- Add durable lessons to `tasks/lessons.md` after corrections.
Stop Conditions
Stop and re-plan if:
- core code starts importing Delta, Zerodha, OpenAlgo, FastAPI, TimescaleDB, or React
- a feature reads hidden latest data instead of reference-time-bounded input
- a missing value is collapsed into `0.0`
- a broker order leaks into signal intent, permission, lifecycle, or outcome records
- replay and live would require different object shapes
Those are architectural failures, not implementation details.
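The first stop condition can be checked mechanically. The following is one possible sketch using the standard-library `ast` module. The module names in `FORBIDDEN_ROOTS` are assumed placeholders for the adapters' import names, not a list taken from the AtlantisV2 codebase, and `forbidden_imports` is a hypothetical helper.

```python
import ast

# Assumed placeholder import names; replace with what the adapters really import.
FORBIDDEN_ROOTS = frozenset({"fastapi", "kiteconnect", "openalgo", "psycopg"})


def forbidden_imports(source: str, forbidden=FORBIDDEN_ROOTS) -> set:
    """Return the forbidden top-level packages that `source` imports absolutely."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            roots = [node.module.split(".")[0]]
        else:
            continue
        found.update(root for root in roots if root in forbidden)
    return found


def test_core_like_source_has_no_adapter_imports():
    clean = "from dataclasses import dataclass\nimport datetime\n"
    assert forbidden_imports(clean) == set()


def test_leak_is_reported():
    leaky = "import json\nfrom fastapi import APIRouter\n"
    assert forbidden_imports(leaky) == {"fastapi"}
```

In a real suite the helper would be applied to the text of each file under the core package, for example by walking it with `pathlib.Path.rglob("*.py")`; the package location is project-specific and is not assumed here.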