---
title: "feat: Add Evidence Quality And Watchlist Growth"
type: feat
status: implemented
date: 2026-05-26
origin: docs/brainstorms/2026-05-26-evidence-quality-watchlist-requirements.md
---

# feat: Add Evidence Quality And Watchlist Growth

## Summary

Add a decision-first feedback loop around evidence status, watchlist candidate generation, strong watchlist matching, and candidate management. The implementation will keep the existing SQLite JSON-payload pattern and extend the current operator console instead of introducing a new persistence layer.

---

## Problem Frame

The current console can collect many internal, Google, Naver, and face-area web evidence items, but it does not yet let operators mark which evidence actually informed a case decision. It also turns rejected cases directly into rejected-reference knowledge entries, with no intermediate candidate state, and it does not create strong watchlist signals from held cases.

---

## Requirements

- R1. Evidence items support operator status: used for judgment, irrelevant, false positive, and pending. (Origin R1-R3, F1, AE1-AE2)
- R2. Evidence status never creates a DB candidate by itself; candidate creation happens only after case decision. (Origin R2, R4, AE2)
- R3. Held and rejected case decisions automatically create persistent watchlist candidates. (Origin R5-R6, F2, AE1)
- R4. Approved decisions do not create automatic candidates. (Origin R7, AE3)
- R5. Watchlist candidates strongly affect future risk scoring, at roughly the same strength as confirmed DB image matches. (Origin R8, F3, AE4)
- R6. Watchlist candidate matches are visually distinct from confirmed DB matches. (Origin R9, AE4)
- R7. Watchlist signals never change case status automatically. (Origin R10)
- R8. Operators can promote watchlist candidates to confirmed DB entries or exclude them as false positives. (Origin R11-R13, F4, AE5-AE6)
- R9. Confirmed DB, watchlist candidates, and excluded candidates retain status, source decision, source evidence, and contribution counts.
  (Origin R14-R17)

**Origin actors:** A1 operator, A2 rights risk filter, A3 DB administrator

**Origin flows:** F1 evidence status marking, F2 decision-driven watchlist creation, F3 watchlist-based rediscovery, F4 candidate promotion and exclusion

**Origin acceptance examples:** AE1, AE2, AE3, AE4, AE5, AE6

---

## Scope Boundaries

- No automatic approval, hold, or rejection.
- No face embeddings, face similarity database, biometric template storage, or identity recognition.
- No Google Image Search, Google Lens, Naver web UI automation, or scraping.
- No applicant-facing exposure of evidence statuses, watchlist candidates, scoring rules, or internal reasons.
- No new relational migration framework; this iteration keeps the existing JSON payload tables.

### Deferred to Follow-Up Work

- Bulk watchlist cleanup, analytics dashboards, and advanced merge suggestions can follow after the core loop is proven.
- Domain-wide false-positive suppression is deferred because it can hide valid evidence from large sites.

---

## Context & Research

### Relevant Code and Patterns

- `src/rights_filter/server/sqlite_store.py` is the persistence and orchestration boundary. It stores JSON payloads in `submissions`, `evidence`, `knowledge_entries`, `collection_candidates`, `corrections`, and `audit_events`.
- `CopyrighterStore.record_decision` currently updates case decision and creates a rejected-reference knowledge entry only for rejected cases.
- `CopyrighterStore._knowledge_repository` rebuilds an in-memory repository from active `knowledge_entries` and feeds `InternalAnalyzer`.
- `src/rights_filter/analysis/internal_analyzer.py` emits fingerprint evidence for knowledge-base image similarity.
- `src/rights_filter/analysis/risk_scoring.py` already gives high weight to strong fingerprint matches and ignores non-contributing/queued evidence.
- `web/operator-gui/app.js`, `index.html`, and `styles.css` implement the current static console, evidence grouping, decision actions, candidate collection, and knowledge DB management.
- `tests/rights_filter/server/test_sqlite_store.py` is the main integration test surface for persistence behavior.
- `tests/operator_gui/test_static_workbench.py` protects the UI contract without browser runtime dependencies.

### Institutional Learnings

- No `docs/solutions/` directory exists in this workspace.
- No `STRATEGY.md` exists; the active product strategy is captured in the brainstorm requirements documents.

### External References

- No new external APIs are introduced. Existing Google/Naver/Ollama boundaries and no-scraping policy remain unchanged.

---

## Key Technical Decisions

- Use `knowledge_entries` for persistent watchlist state: watchlist candidates are persistent risk references, not transient keyword collection results, so they should not live in `collection_candidates`, which is cleared on each keyword search.
- Add status fields instead of new tables: JSON payload storage lets us add `entryStatus`, `originDecisionStatus`, `sourceSubmissionId`, `sourceEvidenceIds`, and `contributionCount` without schema migration complexity.
- Generate candidates from the local submission image when available: the decision API passes the local image store into `record_decision`, which stores a perceptual sample fingerprint for held/rejected watchlist entries. If the image store is unavailable, the candidate is still recorded but cannot participate in image similarity until a sample fingerprint is added.
- Strong watchlist scoring: watchlist similarity should use the same high-risk path as rejected-image similarity, but with separate reason text and UI group so operators can see it is not confirmed DB evidence.
- False-positive suppression scope: start with exact evidence identity, URL/image URL/title, and candidate fingerprint.
  Do not suppress an entire provider domain from one false-positive action.
- Decision-driven default evidence set: use evidence marked `used_for_judgment` when available; if none are marked, generate the watchlist candidate from the case fingerprint and top contributing evidence so held/rejected decisions still strengthen future detection.

---

## Open Questions

### Resolved During Planning

- Watchlist score strength: use the same high-confidence fingerprint match behavior as confirmed/rejected DB references, with separate UI labeling.
- UI distinction: add a dedicated watchlist/주의 후보 evidence group and badges rather than mixing it into confirmed internal DB evidence.
- False-positive propagation: suppress exact evidence/candidate patterns first, not whole domains.

### Deferred to Implementation

- Exact Korean microcopy can be adjusted while fitting existing console labels.
- Exact CSS treatment should follow the existing evidence group and chip styles after visual verification.

---

## High-Level Technical Design

> *This illustrates the intended approach and is directional guidance for review, not implementation specification.*

```mermaid
flowchart TB
  Evidence[Collected evidence] --> Mark[Operator marks evidence status]
  Mark --> Decision[Operator decides approve / hold / reject]
  Decision -->|approved| NoCandidate[No automatic candidate]
  Decision -->|held or rejected| Watchlist[Create watchlist candidate]
  Watchlist --> Analyze[Future internal analysis]
  Confirmed[Confirmed DB entries] --> Analyze
  Analyze --> Score[Risk scoring]
  Score --> UI[Separate confirmed vs watchlist evidence groups]
  Watchlist --> Promote[Promote to confirmed DB]
  Watchlist --> Exclude[Exclude as false positive]
  Exclude --> Score
```

---

## Implementation Units

```mermaid
flowchart TB
  U1[U1 Evidence status API and payload]
  U2[U2 Decision-driven watchlist creation]
  U3[U3 Watchlist matching and scoring]
  U4[U4 Candidate promotion and exclusion]
  U5[U5 Operator UI controls]
  U6[U6 Docs and verification]
  U1 --> U2
  U2 --> U3
  U3 --> U4
  U1 --> U5
  U3 --> U5
  U4 --> U5
  U5 --> U6
```

### U1. Evidence Status API And Payload

**Goal:** Let operators mark evidence as used for judgment, irrelevant, false positive, or pending without changing case decision or DB state.

**Requirements:** R1, R2

**Dependencies:** None

**Files:**
- Modify: `src/rights_filter/server/sqlite_store.py`
- Modify: `src/rights_filter/server/http_app.py`
- Modify: `web/operator-gui/app.js`
- Modify: `web/operator-gui/index.html`
- Modify: `web/operator-gui/styles.css`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
- Test: `tests/rights_filter/server/test_http_app.py`
- Test: `tests/operator_gui/test_static_workbench.py`

**Approach:**
- Add a store method that updates an existing evidence payload with an operator evidence status and optional note.
- Add an HTTP route for evidence status updates.
- Keep evidence status inside each evidence payload so existing bootstrap/review responses include it automatically.
- Treat false-positive and irrelevant evidence as non-contributing during rescore.
- Keep pending evidence visible but non-final.

**Execution note:** Test-first. Start with store-level tests proving status changes do not create candidates and do affect rescore contribution.

**Patterns to follow:**
- Existing `record_decision`, `_put`, `_evidence_by_submission`, and HTTP body parsing patterns in `src/rights_filter/server/sqlite_store.py` and `src/rights_filter/server/http_app.py`.

**Test scenarios:**
- Happy path: marking a Google evidence item as used for judgment persists in `review()` and `bootstrap()`.
- Happy path: marking evidence as irrelevant sets it non-contributing and rescore omits its points.
- Edge case: marking a missing evidence ID returns a not-found error.
- Edge case: unsupported evidence status returns a validation error.
- Integration: HTTP evidence status route updates the review payload.
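
The U1 approach and its edge cases can be sketched as a small store-level function. This is a minimal illustration, not the actual `CopyrighterStore` method: the two-column `evidence(id, payload)` layout, the function name `set_evidence_status`, the payload fields `operatorStatus`, `operatorNote`, and `contributing`, and every status string other than `used_for_judgment` are assumptions made for the example.

```python
import json
import sqlite3

# Hypothetical status values; only `used_for_judgment` is named in this plan.
ALLOWED_STATUSES = {"used_for_judgment", "irrelevant", "false_positive", "pending"}
NON_CONTRIBUTING = {"irrelevant", "false_positive"}


def set_evidence_status(conn, evidence_id, status, note=None):
    """Update one evidence payload in place; no knowledge entry is created."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unsupported evidence status: {status!r}")
    row = conn.execute(
        "SELECT payload FROM evidence WHERE id = ?", (evidence_id,)
    ).fetchone()
    if row is None:
        raise KeyError(evidence_id)
    payload = json.loads(row[0])
    payload["operatorStatus"] = status  # assumed field name
    if note is not None:
        payload["operatorNote"] = note  # assumed field name
    # Irrelevant and false-positive evidence is flagged so a rescore can skip it.
    if status in NON_CONTRIBUTING:
        payload["contributing"] = False  # assumed field name
    conn.execute(
        "UPDATE evidence SET payload = ? WHERE id = ?",
        (json.dumps(payload), evidence_id),
    )
    conn.commit()
    return payload


# Usage against an in-memory database with an assumed table layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evidence (id TEXT PRIMARY KEY, payload TEXT)")
conn.execute(
    "INSERT INTO evidence VALUES (?, ?)",
    ("ev-1", json.dumps({"provider": "google", "contributing": True})),
)
updated = set_evidence_status(conn, "ev-1", "irrelevant", note="different subject")
```

Because the status lives inside the evidence payload, any response that already serializes evidence rows would carry it without a separate lookup, which is the property the approach relies on.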

**Verification:**
- Evidence status is visible in the API payload and does not create any knowledge entry or watchlist candidate by itself.

---

### U2. Decision-Driven Watchlist Creation

**Goal:** Create persistent watchlist candidates automatically after held or rejected decisions, using case fingerprint evidence and judgment-used evidence.

**Requirements:** R2, R3, R4, R9

**Dependencies:** U1

**Files:**
- Modify: `src/rights_filter/server/sqlite_store.py`
- Test: `tests/rights_filter/server/test_sqlite_store.py`

**Approach:**
- Extend `record_decision` so `held` and `rejected` decisions create or update a watchlist entry.
- Stop treating rejected decisions as immediately confirmed DB entries; rejected decisions should create watchlist entries first, then operators can promote them.
- Populate watchlist payloads with source submission, origin decision status, source evidence IDs, sample fingerprints, memo, active/excluded state, and contribution count.
- Use the case's generated fingerprint evidence as the primary sample fingerprint source.
- Prefer evidence marked used for judgment; if none is marked, fall back to the top contributing evidence plus the case fingerprint so held/rejected decisions still strengthen future detection.
- Ensure repeated decisions update the existing source-submission watchlist entry instead of creating duplicates.

**Execution note:** Test-first around decision outcomes before changing the existing rejected-entry behavior.

**Patterns to follow:**
- Existing automatic rejected-reference creation in `record_decision`.
- Existing knowledge-entry payload shape from `register_manual_knowledge_entry` and candidate promotion methods.

**Test scenarios:**
- Happy path: held decision creates one active watchlist entry with source submission and fingerprint.
- Happy path: rejected decision creates one active watchlist entry with source evidence IDs.
- Happy path: approved decision creates no watchlist entry.
- Edge case: repeating a held/rejected decision for the same submission updates the one existing candidate instead of creating duplicates.
- Edge case: a decision with no evidence marked as used still creates an incomplete watchlist entry from the available fingerprint evidence.

**Verification:**
- Held and rejected decisions create persistent watchlist entries, approval does not, and candidate provenance is visible in `knowledgeEntries`.

---

### U3. Watchlist Matching And Scoring

**Goal:** Make watchlist candidates strongly affect future risk while remaining distinguishable from confirmed DB entries.

**Requirements:** R5, R6, R7, R9

**Dependencies:** U2

**Files:**
- Modify: `src/rights_filter/domain/records.py`
- Modify: `src/rights_filter/analysis/internal_analyzer.py`
- Modify: `src/rights_filter/analysis/risk_scoring.py`
- Modify: `src/rights_filter/server/sqlite_store.py`
- Test: `tests/rights_filter/analysis/test_internal_analyzer.py`
- Test: `tests/rights_filter/analysis/test_risk_scoring.py`
- Test: `tests/rights_filter/server/test_sqlite_store.py`

**Approach:**
- Carry knowledge entry status into internal fingerprint evidence so matches can be labeled as watchlist or confirmed.
- Keep watchlist entries active for matching unless excluded.
- Score watchlist image similarity at the same high-risk level as confirmed rejected-image similarity when similarity is high.
- Use distinct evidence reason/data for watchlist matches so UI grouping can separate them.
- Increment contribution count when a watchlist entry contributes to a rescore or analysis result.

**Execution note:** Test scoring and reason text before wiring UI labels.

**Patterns to follow:**
- `InternalAnalyzer` knowledge-base similarity loop.
- `RiskScorer` fingerprint evidence handling and non-contributing evidence checks.

**Test scenarios:**
- Happy path: image similar to a watchlist entry emits watchlist similarity evidence.
- Happy path: watchlist similarity at or above threshold produces a high-risk score.
- Happy path: matched watchlist evidence does not change `decisionStatus`.
- Edge case: excluded watchlist entry is not included in repository matching.
- Integration: contribution count increases only when watchlist evidence contributes to the case score.

**Verification:**
- Watchlist matches raise risk strongly while remaining labeled as watchlist-derived evidence.

---

### U4. Candidate Promotion And False-Positive Exclusion

**Goal:** Let operators promote watchlist candidates to confirmed DB entries or exclude them so future matching is suppressed.

**Requirements:** R8, R9

**Dependencies:** U2, U3

**Files:**
- Modify: `src/rights_filter/server/sqlite_store.py`
- Modify: `src/rights_filter/server/http_app.py`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
- Test: `tests/rights_filter/server/test_http_app.py`

**Approach:**
- Add store methods and HTTP routes for promoting a watchlist entry and excluding a watchlist entry.
- Promotion changes the entry status to confirmed while preserving source decision and evidence history.
- Exclusion changes the entry status to excluded, disables matching, and stores an exclusion reason.
- Apply false-positive evidence status to exact evidence/candidate patterns, image fingerprint, URL/image URL, and title where available.
- Add audit events for promotion and exclusion.

**Execution note:** Characterize existing manual/collection promotion behavior first, then add watchlist-specific paths.

**Patterns to follow:**
- Existing `promote_collection_candidate`, `promote_collection_candidates`, and knowledge entry active/deactivation patterns.

**Test scenarios:**
- Happy path: promoting a watchlist entry makes it confirmed and keeps sample fingerprints.
- Happy path: excluding a watchlist entry prevents future similarity evidence from that entry.
- Edge case: promoting an excluded entry either requires an explicit un-exclude step first or returns a validation error.
- Edge case: missing candidate ID returns not found.
- Integration: audit log records promote/exclude actions.

**Verification:**
- Operators can move candidates between watchlist, confirmed, and excluded states without losing provenance.

---

### U5. Operator UI Controls And Evidence Grouping

**Goal:** Make evidence status, watchlist matches, and candidate actions clear in the operator console.

**Requirements:** R1, R3, R6, R8, R9

**Dependencies:** U1, U3, U4

**Files:**
- Modify: `web/operator-gui/index.html`
- Modify: `web/operator-gui/app.js`
- Modify: `web/operator-gui/styles.css`
- Test: `tests/operator_gui/test_static_workbench.py`

**Approach:**
- Add evidence-row controls for 판단에 사용 (used for judgment), 무관 (irrelevant), 오탐 (false positive), and 보류 (pending).
- Hide or de-emphasize irrelevant and false-positive evidence by default while preserving a details view.
- Add a dedicated 주의 후보 근거 (watchlist candidate evidence) group for watchlist matches.
- Add watchlist status chips in the knowledge DB list: 주의 후보 (watchlist candidate), 확정 기준 (confirmed reference), 오탐 제외 (excluded as false positive).
- Add promote/exclude actions for watchlist rows.
- Keep controls dense and consistent with the existing operator dashboard; avoid introducing a separate landing or wizard.

**Execution note:** Follow frontend design checks after implementation: load the local 9500 page with Playwright and check for console errors and obvious layout breakage.

**Patterns to follow:**
- Existing evidence group rendering, details overflow, candidate cards, and knowledge rows in `web/operator-gui/app.js`.
- Existing compact panel and row styles in `web/operator-gui/styles.css`.

**Test scenarios:**
- Static contract: UI exposes evidence status action handlers and API paths.
- Static contract: watchlist group label and knowledge status chips are present.
- Static contract: irrelevant/false-positive evidence handling is represented in rendering functions.
- Browser check: page loads on desktop viewport without console errors after server restart.
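
The static-contract scenarios above can be checked without a browser by scanning the UI source for required markers. The sketch below is illustrative only: the helper `missing_markers`, the handler name `setEvidenceStatus`, and the `/api/evidence/` path are hypothetical stand-ins, and the inline string stands in for the real contents of `web/operator-gui/app.js`. Only the 주의 후보 근거 group label comes from this plan.

```python
def missing_markers(source, markers):
    """Return the markers that do not appear in the given source text."""
    return [marker for marker in markers if marker not in source]


# Hypothetical handler name and API path; the real identifiers may differ.
REQUIRED_MARKERS = [
    "setEvidenceStatus",
    "/api/evidence/",
    "주의 후보 근거",
]

# Stand-in for the text of web/operator-gui/app.js.
sample_app_js = """
async function setEvidenceStatus(evidenceId, status) {
  return fetch(`/api/evidence/${evidenceId}/status`, { method: "POST" });
}
const WATCHLIST_GROUP_LABEL = "주의 후보 근거";
"""

# An empty list means every required marker was found in the source.
assert missing_markers(sample_app_js, REQUIRED_MARKERS) == []
```

A substring check of this kind only proves that a label or path is present in the file, not that the control works, which is why the plan pairs it with a separate browser check.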

**Verification:**
- Operators can mark evidence status, see watchlist evidence separately, and manage watchlist entries without confusing them with confirmed DB entries.

---

### U6. Documentation And Regression Verification

**Goal:** Update operations guidance and verify the feature end to end.

**Requirements:** R1-R9

**Dependencies:** U1-U5

**Files:**
- Modify: `docs/operations/copyrighter-operation-worklist.md`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
- Test: `tests/rights_filter/server/test_http_app.py`
- Test: `tests/operator_gui/test_static_workbench.py`

**Approach:**
- Document the operator flow: mark evidence, decide case, watchlist creation, promotion, exclusion.
- State that watchlist matching is a strong signal but never an automatic case disposition.
- Run the full test suite.
- Restart the 9500 server and verify `/health`, provider state, and browser load.

**Execution note:** Preserve the active `.env` and existing local data. Do not reset the DB unless the user explicitly asks.

**Patterns to follow:**
- Existing operations doc format and local server verification pattern.

**Test scenarios:**
- Integration: full `pytest` passes.
- Browser: 9500 page loads without console errors.
- Operational: `/health` returns ok after restart.

**Verification:**
- Feature is documented, tests pass, and the local server is running with the updated code.

---

## System-Wide Impact

- **Interaction graph:** Case decisions now trigger watchlist updates; evidence status affects scoring contribution; internal analysis reads active confirmed/watchlist entries.
- **Error propagation:** Invalid evidence status, missing evidence, missing candidate, or invalid promotion/exclusion should return clear API errors without corrupting stored payloads.
- **State lifecycle risks:** Repeated held/rejected decisions must be idempotent per submission. Promotion and exclusion must not lose source decision provenance.
- **API surface parity:** Bootstrap, review, knowledge list, and evidence rows all need the new fields so the static UI stays in sync with server state.
- **Integration coverage:** Store tests must cover decision-to-watchlist-to-analysis; UI static tests must cover controls and grouping.
- **Unchanged invariants:** No automatic final case disposition, no applicant exposure, no biometric face storage, no scraping.

---

## Risks & Dependencies

| Risk | Mitigation |
|------|------------|
| Watchlist candidates over-amplify false positives | Keep watchlist visually distinct, add exclusion flow, and do not apply domain-wide suppression. |
| Rejected-entry behavior changes existing expectations | Update tests to make watchlist the automatic intermediate state and promotion the explicit confirmed state. |
| JSON payload fields drift across old records | Use default values when fields are absent and normalize in rendering/scoring paths. |
| UI becomes crowded | Use compact segmented evidence actions and keep weak/irrelevant evidence collapsed. |

---

## Documentation / Operational Notes

- Update the operations doc with the decision-first flow and the difference between 주의 후보 (watchlist candidates) and the 확정 기준 (confirmed reference) DB.
- Keep the current `.env` behavior unchanged.
- Restart the 9500 server after implementation so the operator console uses the updated route handlers and JS.

---

## Sources & References

- **Origin document:** [docs/brainstorms/2026-05-26-evidence-quality-watchlist-requirements.md](docs/brainstorms/2026-05-26-evidence-quality-watchlist-requirements.md)
- Related plan: [docs/plans/2026-05-25-002-feat-image-rights-review-enrichment-plan.md](docs/plans/2026-05-25-002-feat-image-rights-review-enrichment-plan.md)
- Related code: `src/rights_filter/server/sqlite_store.py`
- Related code: `src/rights_filter/analysis/internal_analyzer.py`
- Related code: `src/rights_filter/analysis/risk_scoring.py`
- Related UI: `web/operator-gui/app.js`