| title | type | status | date | origin |
|---|---|---|---|---|
| feat: Add Evidence Quality And Watchlist Growth | feat | implemented | 2026-05-26 | docs/brainstorms/2026-05-26-evidence-quality-watchlist-requirements.md |
feat: Add Evidence Quality And Watchlist Growth
Summary
Add a decision-first feedback loop around evidence status, watchlist candidate generation, strong watchlist matching, and candidate management. The implementation will keep the existing SQLite JSON-payload pattern and extend the current operator console instead of introducing a new persistence layer.
Problem Frame
The current console can collect many internal, Google, Naver, and face-area web evidence items, but it does not yet let operators mark which evidence actually informed a case decision. It also promotes rejected cases too bluntly and does not create strong watchlist signals from held cases.
Requirements
- R1. Evidence items support operator status: used for judgment, irrelevant, false positive, and pending. (Origin R1-R3, F1, AE1-AE2)
- R2. Evidence status never creates a DB candidate by itself; candidate creation happens only after case decision. (Origin R2, R4, AE2)
- R3. Held and rejected case decisions automatically create persistent watchlist candidates. (Origin R5-R6, F2, AE1)
- R4. Approved decisions do not create automatic candidates. (Origin R7, AE3)
- R5. Watchlist candidates strongly affect future risk scoring, at roughly the same strength as confirmed DB image matches. (Origin R8, F3, AE4)
- R6. Watchlist candidate matches are visually distinct from confirmed DB matches. (Origin R9, AE4)
- R7. Watchlist signals never change case status automatically. (Origin R10)
- R8. Operators can promote watchlist candidates to confirmed DB entries or exclude them as false positives. (Origin R11-R13, F4, AE5-AE6)
- R9. Confirmed DB, watchlist candidates, and excluded candidates retain status, source decision, source evidence, and contribution counts. (Origin R14-R17)
Origin actors: A1 operator, A2 rights risk filter, A3 DB administrator
Origin flows: F1 evidence status marking, F2 decision-driven watchlist creation, F3 watchlist-based rediscovery, F4 candidate promotion and exclusion
Origin acceptance examples: AE1, AE2, AE3, AE4, AE5, AE6
Scope Boundaries
- No automatic approval, hold, or rejection.
- No face embeddings, face similarity database, biometric template storage, or identity recognition.
- No Google Image Search, Google Lens, Naver web UI automation, or scraping.
- No applicant-facing exposure of evidence statuses, watchlist candidates, scoring rules, or internal reasons.
- No new relational migration framework; this iteration keeps the existing JSON payload tables.
Deferred to Follow-Up Work
- Bulk watchlist cleanup, analytics dashboards, and advanced merge suggestions can follow after the core loop is proven.
- Domain-wide false-positive suppression is deferred because it can hide valid evidence from large sites.
Context & Research
Relevant Code and Patterns
- `src/rights_filter/server/sqlite_store.py` is the persistence and orchestration boundary. It stores JSON payloads in `submissions`, `evidence`, `knowledge_entries`, `collection_candidates`, `corrections`, and `audit_events`.
- `CopyrighterStore.record_decision` currently updates case decision and creates a rejected-reference knowledge entry only for rejected cases.
- `CopyrighterStore._knowledge_repository` rebuilds an in-memory repository from active `knowledge_entries` and feeds `InternalAnalyzer`.
- `src/rights_filter/analysis/internal_analyzer.py` emits fingerprint evidence for knowledge-base image similarity.
- `src/rights_filter/analysis/risk_scoring.py` already gives high weight to strong fingerprint matches and ignores non-contributing/queued evidence.
- `web/operator-gui/app.js`, `index.html`, and `styles.css` implement the current static console, evidence grouping, decision actions, candidate collection, and knowledge DB management.
- `tests/rights_filter/server/test_sqlite_store.py` is the main integration test surface for persistence behavior.
- `tests/operator_gui/test_static_workbench.py` protects the UI contract without browser runtime dependencies.
Institutional Learnings
- No `docs/solutions/` directory exists in this workspace.
- No `STRATEGY.md` exists; the active product strategy is captured in the brainstorm requirements documents.
External References
- No new external APIs are introduced. Existing Google/Naver/Ollama boundaries and no-scraping policy remain unchanged.
Key Technical Decisions
- Use `knowledge_entries` for persistent watchlist state: watchlist candidates are persistent risk references, not transient keyword collection results, so they should not live in `collection_candidates`, which is cleared on each keyword search.
- Add status fields instead of new tables: JSON payload storage lets us add `entryStatus`, `originDecisionStatus`, `sourceSubmissionId`, `sourceEvidenceIds`, and `contributionCount` without schema migration complexity.
- Generate candidates from the local submission image when available: the decision API passes the local image store into `record_decision`, which stores a perceptual sample fingerprint for held/rejected watchlist entries. If the image store is unavailable, the candidate is still recorded but cannot participate in image similarity until a sample fingerprint is added.
- Strong watchlist scoring: watchlist similarity should use the same high-risk path as rejected-image similarity, but with separate reason text and UI group so operators can see it is not confirmed DB evidence.
- False-positive suppression scope: start with exact evidence identity, URL/image URL/title, and candidate fingerprint. Do not suppress an entire provider domain from one false-positive action.
- Decision-driven default evidence set: use evidence marked `used_for_judgment` when available; if none are marked, generate the watchlist candidate from the case fingerprint and top contributing evidence so held/rejected decisions still strengthen future detection.
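The "status fields instead of new tables" decision can be illustrated with a minimal sketch. It assumes a two-column `(id, payload)` table layout and assumes that records written before this change should read as confirmed entries; both are illustrative assumptions, as are the helper names `normalize_entry` and `load_entries`.

```python
import json
import sqlite3

# Hypothetical defaults for the new fields; the real defaults may differ.
WATCHLIST_DEFAULTS = {
    "entryStatus": "confirmed",  # assumption: legacy entries count as confirmed
    "originDecisionStatus": None,
    "sourceSubmissionId": None,
    "sourceEvidenceIds": [],
    "contributionCount": 0,
}


def normalize_entry(payload: dict) -> dict:
    """Return a copy of a knowledge-entry payload with missing fields defaulted."""
    normalized = {
        key: (list(value) if isinstance(value, list) else value)
        for key, value in WATCHLIST_DEFAULTS.items()
    }
    normalized.update(payload)
    return normalized


def load_entries(conn: sqlite3.Connection) -> list:
    """Read every JSON payload and normalize it on the way out."""
    rows = conn.execute(
        "SELECT payload FROM knowledge_entries ORDER BY id"
    ).fetchall()
    return [normalize_entry(json.loads(row[0])) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE knowledge_entries (id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
)
# A legacy record written before the new fields existed.
conn.execute(
    "INSERT INTO knowledge_entries VALUES (?, ?)",
    ("k1", json.dumps({"id": "k1", "label": "old entry"})),
)
# A record written with the new fields.
conn.execute(
    "INSERT INTO knowledge_entries VALUES (?, ?)",
    ("k2", json.dumps({
        "id": "k2",
        "entryStatus": "watchlist",
        "originDecisionStatus": "held",
        "sourceSubmissionId": "s1",
        "sourceEvidenceIds": ["e1"],
        "contributionCount": 2,
    })),
)
entries = load_entries(conn)
```

Normalizing on read keeps old rows untouched on disk, which is one way to avoid a schema migration while still giving scoring and rendering code a uniform shape.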
Open Questions
Resolved During Planning
- Watchlist score strength: use the same high-confidence fingerprint match behavior as confirmed/rejected DB references, with separate UI labeling.
- UI distinction: add a dedicated watchlist evidence group, labeled 주의 후보 ("watchlist candidate"), and badges rather than mixing it into confirmed internal DB evidence.
- False-positive propagation: suppress exact evidence/candidate patterns first, not whole domains.
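A rough sketch of exact-pattern suppression follows. The `SuppressionPattern` type, the evidence keys (`id`, `url`, `imageUrl`, `title`, `fingerprint`), and the rule that a single exact field match is enough to suppress are all assumptions made for illustration; the point is only that the provider domain is never part of the comparison.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SuppressionPattern:
    """Hypothetical record of what an operator marked as a false positive."""
    evidence_id: Optional[str] = None
    url: Optional[str] = None
    image_url: Optional[str] = None
    title: Optional[str] = None
    fingerprint: Optional[str] = None


def is_suppressed(evidence: dict, patterns: list) -> bool:
    """Return True only when evidence exactly matches a stored pattern field.

    Domains are deliberately not compared, so one false-positive action
    cannot hide every result from a large site.
    """
    for pattern in patterns:
        pairs = [
            (pattern.evidence_id, evidence.get("id")),
            (pattern.url, evidence.get("url")),
            (pattern.image_url, evidence.get("imageUrl")),
            (pattern.title, evidence.get("title")),
            (pattern.fingerprint, evidence.get("fingerprint")),
        ]
        # Unset pattern fields (None) never match anything.
        if any(stored is not None and stored == seen for stored, seen in pairs):
            return True
    return False
```

With a pattern for `https://example.com/a.jpg`, evidence at `https://example.com/b.jpg` on the same domain still passes through, which matches the intent of suppressing patterns rather than providers.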
Deferred to Implementation
- Exact Korean microcopy can be adjusted while fitting existing console labels.
- Exact CSS treatment should follow the existing evidence group and chip styles after visual verification.
High-Level Technical Design
This illustrates the intended approach and is directional guidance for review, not an implementation specification.
```mermaid
flowchart TB
  Evidence[Collected evidence] --> Mark[Operator marks evidence status]
  Mark --> Decision[Operator decides approve / hold / reject]
  Decision -->|approved| NoCandidate[No automatic candidate]
  Decision -->|held or rejected| Watchlist[Create watchlist candidate]
  Watchlist --> Analyze[Future internal analysis]
  Confirmed[Confirmed DB entries] --> Analyze
  Analyze --> Score[Risk scoring]
  Score --> UI[Separate confirmed vs watchlist evidence groups]
  Watchlist --> Promote[Promote to confirmed DB]
  Watchlist --> Exclude[Exclude as false positive]
  Exclude --> Score
```
Implementation Units
```mermaid
flowchart TB
  U1[U1 Evidence status API and payload]
  U2[U2 Decision-driven watchlist creation]
  U3[U3 Watchlist matching and scoring]
  U4[U4 Candidate promotion and exclusion]
  U5[U5 Operator UI controls]
  U6[U6 Docs and verification]
  U1 --> U2
  U2 --> U3
  U3 --> U4
  U1 --> U5
  U3 --> U5
  U4 --> U5
  U5 --> U6
```
U1. Evidence Status API And Payload
Goal: Let operators mark evidence as used for judgment, irrelevant, false positive, or pending without changing case decision or DB state.
Requirements: R1, R2
Dependencies: None
Files:
- Modify: `src/rights_filter/server/sqlite_store.py`
- Modify: `src/rights_filter/server/http_app.py`
- Modify: `web/operator-gui/app.js`
- Modify: `web/operator-gui/index.html`
- Modify: `web/operator-gui/styles.css`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
- Test: `tests/rights_filter/server/test_http_app.py`
- Test: `tests/operator_gui/test_static_workbench.py`
Approach:
- Add a store method that updates an existing evidence payload with an operator evidence status and optional note.
- Add an HTTP route for evidence status updates.
- Keep evidence status inside each evidence payload so existing bootstrap/review responses include it automatically.
- Treat false-positive and irrelevant evidence as non-contributing during rescore.
- Keep pending evidence visible but non-final.
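The approach above might look roughly like the following in-memory sketch. Only `used_for_judgment` is a status value named in this plan; the other status strings, the `operatorStatus`, `operatorNote`, and `points` keys, the exception classes, and the choice to let pending evidence keep contributing are assumptions for illustration.

```python
# Hypothetical status vocabulary; only "used_for_judgment" appears in the plan.
ALLOWED_STATUSES = {"used_for_judgment", "irrelevant", "false_positive", "pending"}
NON_CONTRIBUTING = {"irrelevant", "false_positive"}


class EvidenceNotFound(LookupError):
    """Raised when the evidence ID is unknown."""


class InvalidEvidenceStatus(ValueError):
    """Raised when the requested status is not supported."""


def set_evidence_status(evidence_by_id, evidence_id, status, note=None):
    """Store the operator status inside the evidence payload itself."""
    # Validate everything before touching the stored payload.
    if status not in ALLOWED_STATUSES:
        raise InvalidEvidenceStatus(f"unsupported evidence status: {status}")
    if evidence_id not in evidence_by_id:
        raise EvidenceNotFound(f"unknown evidence: {evidence_id}")
    payload = dict(evidence_by_id[evidence_id])
    payload["operatorStatus"] = status
    if note is not None:
        payload["operatorNote"] = note
    evidence_by_id[evidence_id] = payload
    return payload


def rescore(evidence_items):
    """Sum points, skipping evidence the operator marked as non-contributing."""
    return sum(
        item.get("points", 0)
        for item in evidence_items
        if item.get("operatorStatus") not in NON_CONTRIBUTING
    )
```

Because the status lives in the payload, any response that already serializes evidence rows would carry it without a separate lookup, and nothing in this path touches the knowledge entries.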
Execution note: Test-first. Start with store-level tests proving status changes do not create candidates and do affect rescore contribution.
Patterns to follow:
- Existing `record_decision`, `_put`, `_evidence_by_submission`, and HTTP body parsing patterns in `src/rights_filter/server/sqlite_store.py` and `src/rights_filter/server/http_app.py`.
Test scenarios:
- Happy path: marking a Google evidence item as used for judgment persists in `review()` and `bootstrap()`.
- Happy path: marking evidence as irrelevant sets it non-contributing and rescore omits its points.
- Edge case: marking a missing evidence ID returns a not-found error.
- Edge case: unsupported evidence status returns a validation error.
- Integration: HTTP evidence status route updates the review payload.
Verification:
- Evidence status is visible in the API payload and does not create any knowledge entry or watchlist candidate by itself.
U2. Decision-Driven Watchlist Creation
Goal: Create persistent watchlist candidates automatically after held or rejected decisions, using case fingerprint evidence and judgment-used evidence.
Requirements: R2, R3, R4, R9
Dependencies: U1
Files:
- Modify: `src/rights_filter/server/sqlite_store.py`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
Approach:
- Extend `record_decision` so `held` and `rejected` decisions create or update a watchlist entry.
- Stop treating rejected decisions as immediately confirmed DB entries; rejected decisions should create watchlist entries first, then operators can promote them.
- Populate watchlist payloads with source submission, origin decision status, source evidence IDs, sample fingerprints, memo, active/excluded state, and contribution count.
- Use the case's generated fingerprint evidence as the primary sample fingerprint source.
- Prefer evidence marked used for judgment; if none is marked, fall back to top contributing evidence plus the case fingerprint so held/rejected decisions still strengthen future detection.
- Ensure repeated decisions update the existing source-submission watchlist entry instead of creating duplicates.
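A minimal sketch of the idempotent, decision-driven upsert is shown below. It is not the actual `record_decision`; the function name `record_decision_sketch`, the `watchlist:<submission>` ID scheme, the `(id, payload)` table layout, the `operatorStatus` and `points` keys, and the top-three fallback are assumptions for illustration.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory table shaped like a JSON payload store (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE knowledge_entries (id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn


def record_decision_sketch(conn, submission_id, decision, evidence, case_fingerprint):
    """Create or update one watchlist entry per submission for held/rejected."""
    if decision not in ("held", "rejected"):
        return None  # approved decisions create no automatic candidate

    used = [e for e in evidence if e.get("operatorStatus") == "used_for_judgment"]
    if not used:
        # Fallback: top contributing evidence (top three is an arbitrary choice).
        contributing = [
            e for e in evidence
            if e.get("operatorStatus") not in ("irrelevant", "false_positive")
        ]
        used = sorted(contributing, key=lambda e: e.get("points", 0), reverse=True)[:3]

    entry_id = f"watchlist:{submission_id}"  # hypothetical deterministic ID
    row = conn.execute(
        "SELECT payload FROM knowledge_entries WHERE id = ?", (entry_id,)
    ).fetchone()
    previous = json.loads(row[0]) if row else {}

    payload = {
        "id": entry_id,
        # Keep an existing status so a repeat decision does not undo
        # a later promotion or exclusion (assumption).
        "entryStatus": previous.get("entryStatus", "watchlist"),
        "originDecisionStatus": decision,
        "sourceSubmissionId": submission_id,
        "sourceEvidenceIds": [e["id"] for e in used],
        "sampleFingerprints": [case_fingerprint] if case_fingerprint else [],
        "contributionCount": previous.get("contributionCount", 0),
    }
    conn.execute(
        "INSERT OR REPLACE INTO knowledge_entries (id, payload) VALUES (?, ?)",
        (entry_id, json.dumps(payload)),
    )
    return payload
```

Deriving the entry ID from the submission ID is one simple way to make repeated decisions update a single row rather than accumulate duplicates.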
Execution note: Test-first around decision outcomes before changing the existing rejected-entry behavior.
Patterns to follow:
- Existing automatic rejected-reference creation in `record_decision`.
- Existing knowledge-entry payload shape from `register_manual_knowledge_entry` and candidate promotion methods.
Test scenarios:
- Happy path: held decision creates one active watchlist entry with source submission and fingerprint.
- Happy path: rejected decision creates one active watchlist entry with source evidence IDs.
- Happy path: approved decision creates no watchlist entry.
- Edge case: repeating held/rejected decision for the same submission updates one candidate, not duplicates.
- Edge case: when no evidence is marked as used for judgment, an incomplete watchlist entry is still created from the available fingerprint evidence.
Verification:
- Held and rejected decisions create persistent watchlist entries, approval does not, and candidate provenance is visible in `knowledgeEntries`.
U3. Watchlist Matching And Scoring
Goal: Make watchlist candidates strongly affect future risk while remaining distinguishable from confirmed DB entries.
Requirements: R5, R6, R7, R9
Dependencies: U2
Files:
- Modify: `src/rights_filter/domain/records.py`
- Modify: `src/rights_filter/analysis/internal_analyzer.py`
- Modify: `src/rights_filter/analysis/risk_scoring.py`
- Modify: `src/rights_filter/server/sqlite_store.py`
- Test: `tests/rights_filter/analysis/test_internal_analyzer.py`
- Test: `tests/rights_filter/analysis/test_risk_scoring.py`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
Approach:
- Carry knowledge entry status into internal fingerprint evidence so matches can be labeled as watchlist or confirmed.
- Keep watchlist entries active for matching unless excluded.
- Score watchlist image similarity at the same high-risk level as confirmed rejected-image similarity when similarity is high.
- Use distinct evidence reason/data for watchlist matches so UI grouping can separate them.
- Increment contribution count when a watchlist entry contributes to a rescore or analysis result.
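The matching and scoring approach can be sketched as follows. This is not the `InternalAnalyzer` or `RiskScorer` code: the 64-bit integer fingerprints compared by Hamming distance, the `HIGH_SIMILARITY` threshold, the `HIGH_RISK_POINTS` value, and the evidence keys are all hypothetical stand-ins chosen to show the shape of the logic.

```python
HIGH_SIMILARITY = 0.90  # hypothetical threshold
HIGH_RISK_POINTS = 60   # hypothetical weight shared by both entry kinds


def hamming_similarity(a: int, b: int, bits: int = 64) -> float:
    """Fraction of matching bits between two fixed-width integer hashes."""
    differing = bin((a ^ b) & ((1 << bits) - 1)).count("1")
    return 1.0 - differing / bits


def match_entries(fingerprint: int, entries: list) -> list:
    """Emit evidence for confirmed and watchlist entries; skip excluded ones."""
    evidence = []
    for entry in entries:
        status = entry.get("entryStatus", "confirmed")
        if status == "excluded":
            continue  # excluded entries never take part in matching
        best = max(
            (hamming_similarity(fingerprint, sample)
             for sample in entry.get("sampleFingerprints", [])),
            default=0.0,
        )
        if best < HIGH_SIMILARITY:
            continue
        is_watchlist = status == "watchlist"
        evidence.append({
            "entryId": entry["id"],
            "similarity": best,
            "points": HIGH_RISK_POINTS,  # same strength for both kinds
            "group": "watchlist" if is_watchlist else "confirmed",
            "reason": (
                "similar to a watchlist candidate image"
                if is_watchlist
                else "similar to a confirmed DB image"
            ),
        })
        # Count the contribution only when the entry actually produced evidence.
        entry["contributionCount"] = entry.get("contributionCount", 0) + 1
    return evidence
```

Note that nothing here reads or writes a case decision; the function only returns evidence, which keeps the "signals never change case status" requirement easy to verify.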
Execution note: Test scoring and reason text before wiring UI labels.
Patterns to follow:
- `InternalAnalyzer` knowledge-base similarity loop.
- `RiskScorer` fingerprint evidence handling and non-contributing evidence checks.
Test scenarios:
- Happy path: image similar to a watchlist entry emits watchlist similarity evidence.
- Happy path: watchlist similarity at or above threshold produces high-risk score.
- Happy path: matched watchlist evidence does not change `decisionStatus`.
- Edge case: excluded watchlist entry is not included in repository matching.
- Integration: contribution count increases only when watchlist evidence contributes to the case score.
Verification:
- Watchlist matches raise risk strongly while remaining labeled as watchlist-derived evidence.
U4. Candidate Promotion And False-Positive Exclusion
Goal: Let operators promote watchlist candidates to confirmed DB entries or exclude them so future matching is suppressed.
Requirements: R8, R9
Dependencies: U2, U3
Files:
- Modify: `src/rights_filter/server/sqlite_store.py`
- Modify: `src/rights_filter/server/http_app.py`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
- Test: `tests/rights_filter/server/test_http_app.py`
Approach:
- Add store methods and HTTP routes for promoting a watchlist entry and excluding a watchlist entry.
- Promotion changes the entry status to confirmed while preserving source decision and evidence history.
- Exclusion changes the entry status to excluded, disables matching, and stores an exclusion reason.
- Apply false-positive evidence status to exact evidence/candidate patterns, image fingerprint, URL/image URL, and title where available.
- Add audit events for promotion and exclusion.
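The promotion and exclusion transitions might be sketched like this. The function names, the `InvalidTransition` exception, the `active` and `exclusionReason` keys, and the audit record shape are assumptions; the sketch also picks the "validation error" option for promoting an excluded entry.

```python
from datetime import datetime, timezone


class InvalidTransition(ValueError):
    """Raised when an entry cannot move to the requested state."""


def _audit(audit_log, action, entry, **extra):
    """Append a hypothetical audit event for a watchlist action."""
    audit_log.append({
        "action": action,
        "entryId": entry["id"],
        "at": datetime.now(timezone.utc).isoformat(),
        **extra,
    })


def promote_entry(entry, audit_log):
    """Mark a watchlist entry as confirmed without dropping its provenance."""
    if entry.get("entryStatus") == "excluded":
        raise InvalidTransition("excluded entries must be un-excluded first")
    # Copy every existing field so source decision and fingerprints survive.
    updated = {**entry, "entryStatus": "confirmed"}
    _audit(audit_log, "promote", updated)
    return updated


def exclude_entry(entry, reason, audit_log):
    """Mark an entry as excluded, disable matching, and keep the reason."""
    updated = {
        **entry,
        "entryStatus": "excluded",
        "active": False,
        "exclusionReason": reason,
    }
    _audit(audit_log, "exclude", updated, reason=reason)
    return updated
```

Both functions return a new payload built from the old one, so provenance fields such as `originDecisionStatus` and `sourceEvidenceIds` carry over unless a transition explicitly overwrites them.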
Execution note: Characterize existing manual/collection promotion behavior first, then add watchlist-specific paths.
Patterns to follow:
- Existing `promote_collection_candidate`, `promote_collection_candidates`, and knowledge entry active/deactivation patterns.
Test scenarios:
- Happy path: promoting a watchlist entry makes it confirmed and keeps sample fingerprints.
- Happy path: excluding a watchlist entry prevents future similarity evidence from that entry.
- Edge case: promoting an excluded entry requires an explicit un-exclude step or returns a validation error.
- Edge case: missing candidate ID returns not found.
- Integration: audit log records promote/exclude actions.
Verification:
- Operators can move candidates between watchlist, confirmed, and excluded states without losing provenance.
U5. Operator UI Controls And Evidence Grouping
Goal: Make evidence status, watchlist matches, and candidate actions clear in the operator console.
Requirements: R1, R3, R6, R8, R9
Dependencies: U1, U3, U4
Files:
- Modify: `web/operator-gui/index.html`
- Modify: `web/operator-gui/app.js`
- Modify: `web/operator-gui/styles.css`
- Test: `tests/operator_gui/test_static_workbench.py`
Approach:
- Add evidence-row controls for 판단에 사용 (used for judgment), 무관 (irrelevant), 오탐 (false positive), and 보류 (pending).
- Hide or de-emphasize irrelevant and false-positive evidence by default while preserving a details view.
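- Add a dedicated 주의 후보 근거 (watchlist candidate evidence) group for watchlist matches.
- Add watchlist status chips in the knowledge DB list: 주의 후보 (watchlist candidate), 확정 기준 (confirmed reference), 오탐 제외 (excluded as false positive).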
- Add promote/exclude actions for watchlist rows.
- Keep controls dense and consistent with the existing operator dashboard; avoid introducing a separate landing or wizard.
Execution note: Follow frontend design checks after implementation: load the local page on port 9500 with Playwright and check for console errors and obvious layout breakage.
Patterns to follow:
- Existing evidence group rendering, details overflow, candidate cards, and knowledge rows in `web/operator-gui/app.js`.
- Existing compact panel and row styles in `web/operator-gui/styles.css`.
Test scenarios:
- Static contract: UI exposes evidence status action handlers and API paths.
- Static contract: watchlist group label and knowledge status chips are present.
- Static contract: irrelevant/false-positive evidence handling is represented in rendering functions.
- Browser check: page loads on desktop viewport without console errors after server restart.
Verification:
- Operators can mark evidence status, see watchlist evidence separately, and manage watchlist entries without confusing them with confirmed DB entries.
U6. Documentation And Regression Verification
Goal: Update operations guidance and verify the feature end to end.
Requirements: R1-R9
Dependencies: U1-U5
Files:
- Modify: `docs/operations/copyrighter-operation-worklist.md`
- Test: `tests/rights_filter/server/test_sqlite_store.py`
- Test: `tests/rights_filter/server/test_http_app.py`
- Test: `tests/operator_gui/test_static_workbench.py`
Approach:
- Document the operator flow: mark evidence, decide case, watchlist creation, promotion, exclusion.
- State that watchlist matching is strong but not automatic case disposition.
- Run full test suite.
- Restart the server on port 9500 and verify `/health`, provider state, and browser load.
Execution note: Preserve the active .env and existing local data. Do not reset DB unless the user explicitly asks.
Patterns to follow:
- Existing operations doc format and local server verification pattern.
Test scenarios:
- Integration: full `pytest` passes.
- Browser: the page on port 9500 loads without console errors.
- Operational: `/health` returns ok after restart.
Verification:
- Feature is documented, tests pass, and the local server is running with the updated code.
System-Wide Impact
- Interaction graph: Case decisions now trigger watchlist updates; evidence status affects scoring contribution; internal analysis reads active confirmed/watchlist entries.
- Error propagation: Invalid evidence status, missing evidence, missing candidate, or invalid promotion/exclusion should return clear API errors without corrupting stored payloads.
- State lifecycle risks: Repeated held/rejected decisions must be idempotent per submission. Promotion and exclusion must not lose source decision provenance.
- API surface parity: Bootstrap, review, knowledge list, and evidence rows all need the new fields so the static UI stays in sync with server state.
- Integration coverage: Store tests must cover decision-to-watchlist-to-analysis; UI static tests must cover controls and grouping.
- Unchanged invariants: No automatic final case disposition, no applicant exposure, no biometric face storage, no scraping.
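The error-propagation point can be illustrated with a framework-agnostic sketch. The exception classes, the `call_store` wrapper, the 400/404 status choices, and the response body shape are assumptions and do not describe the actual `http_app.py` routes; the example operation validates before writing so a failed request leaves the stored payload untouched.

```python
class NotFoundError(LookupError):
    """Hypothetical error for a missing evidence item or candidate."""


class ValidationError(ValueError):
    """Hypothetical error for an unsupported status or invalid transition."""


def call_store(operation, *args):
    """Run a store operation and translate known failures into API errors."""
    try:
        return 200, {"ok": True, "data": operation(*args)}
    except NotFoundError as exc:
        return 404, {"ok": False, "error": "not_found", "message": str(exc)}
    except ValidationError as exc:
        return 400, {"ok": False, "error": "validation_error", "message": str(exc)}


def exclude_candidate(entries, entry_id, reason):
    """Example operation: all checks run before any payload is modified."""
    if entry_id not in entries:
        raise NotFoundError(f"unknown candidate: {entry_id}")
    if not reason.strip():
        raise ValidationError("exclusion reason is required")
    entries[entry_id] = {
        **entries[entry_id],
        "entryStatus": "excluded",
        "exclusionReason": reason,
    }
    return entries[entry_id]
```

Catching only the two known error types means an unexpected exception still surfaces loudly instead of being reported as a client error.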
Risks & Dependencies
| Risk | Mitigation |
|---|---|
| Watchlist candidates over-amplify false positives | Keep watchlist visually distinct, add exclusion flow, and do not apply domain-wide suppression. |
| Rejected-entry behavior changes existing expectations | Update tests to make watchlist the automatic intermediate state and promotion the explicit confirmed state. |
| JSON payload fields drift across old records | Use default values when fields are absent and normalize in rendering/scoring paths. |
| UI becomes crowded | Use compact segmented evidence actions and keep weak/irrelevant evidence collapsed. |
Documentation / Operational Notes
- Update the operations doc with the decision-first flow and the difference between 주의 후보 (watchlist candidate) and 확정 기준 (confirmed reference) DB.
- Keep the current `.env` behavior unchanged.
- Restart the server on port 9500 after implementation so the operator console uses the updated route handlers and JS.
Sources & References
- Origin document: docs/brainstorms/2026-05-26-evidence-quality-watchlist-requirements.md
- Related plan: docs/plans/2026-05-25-002-feat-image-rights-review-enrichment-plan.md
- Related code: `src/rights_filter/server/sqlite_store.py`
- Related code: `src/rights_filter/analysis/internal_analyzer.py`
- Related code: `src/rights_filter/analysis/risk_scoring.py`
- Related UI: `web/operator-gui/app.js`