docs: mark god-file split complete in remediation plan

This commit is contained in:
유창욱 2026-06-20 22:36:02 +09:00
parent 6184d0f464
commit 2eb7bd3b8b

View file

@ -530,21 +530,32 @@ All HIGH/MEDIUM/LOW findings implemented, tested, and committed on branch
- Phase 8 (frontend safeUrl/apiJson/onerror) → `f8aa10f`; protocol-relative
URL fix after security review → `7317bfb`.
### Phase 9 (god-file split) — partial
Done as behavior-preserving module extractions (suite-gated), `sqlite_store.py`
5333 → ~4955 lines:
- `store_url_utils.py` (URL helpers) → `e66f9d5`
- `store_remote_fetch.py` (fetch + SSRF + opener) and `store_schema.py`
(DDL/typed-columns/constraint migration) → `da91775`
### Phase 9 (god-file split) — complete
Behavior-preserving, each step gated on the full suite (400 passed). The
5119-line `sqlite_store.py` god file is now **724 lines** (the `CopyrighterStore`
facade: `__init__`, `initialize`, seed, bootstrap, `_search_coverage`, review,
decision, providers, media, audit) composed from focused modules + mixins:
Remaining (in progress / follow-up): the `CopyrighterStore` class methods are
the bulk and need class-level decomposition (mixin modules), which first
requires extracting the shared module-level helpers (serialization/text/id +
the 670-line `_PageImageParser` + css/page extraction, currently coupled to
`_normalized_image_url`/`_unique_texts`) into shared modules to avoid circular
imports. NOT a pure no-op like the prior extractions; gate each step on the
full suite. Cross-file URL-helper dedup (Task 28) is intentionally NOT done —
the integration adapters' suffix policy diverges from the store's (`.svg`).
Helper modules (stateless): `store_url_utils` (52), `store_text` (27),
`store_constants` (113), `store_remote_fetch` (125, fetch + SSRF opener),
`store_schema` (257, DDL/migration), `store_serialization` (597, payload/domain
helpers), `store_page_scrape` (976, HTML/CSS/JSON image-URL extraction).
Class mixins (composed into `CopyrighterStore` via inheritance):
`StorePersistenceMixin` (313, connection/transaction/_put/_get),
`StoreQueueMixin` (170), `StoreSearchCandidatesMixin` (743, similarity +
candidate storage + rescore), `StoreEnrichmentMixin` (789, provider sync + face
+ rerun + auto-search), `StoreOperationsMixin` (761, knowledge lifecycle +
manual/rerun search + collection).
All modules are <800 lines except `store_page_scrape` (976) a single cohesive
HTML/CSS/JSON/srcset image-URL parser whose helpers are tightly coupled
(parser ↔ css ↔ json ↔ srcset predicates); splitting it further would fragment
one responsibility, so it is intentionally left whole. Tests that monkeypatch
`HeuristicFacePersonDetector` were updated to also patch `store_enrichment`
(where face detection now runs). Cross-file URL-helper dedup (Task 28) is
intentionally NOT done — the integration adapters' suffix policy diverges from
the store's (`.svg`).
## Deferred (explicitly out of this plan, log as follow-ups)
- Full `GovernancePolicyRegistry` role enforcement on every read/serve path (Task 2 + Task 9 cover the high-impact subset).