---
title: "feat: Add Image Rights Review Enrichment"
type: feat
status: completed
date: 2026-05-25
origin: docs/brainstorms/2026-05-25-image-rights-review-enrichment-requirements.md
---

# feat: Add Image Rights Review Enrichment

## Summary

Extend the existing portable `rights_filter` core with search-enriched evidence, internal LLM-assisted query and summary boundaries, and a detailed operator review view model. Because this workspace has no real web admin app or database, the plan delivers backend contracts and tests that a target admin UI can render without re-deciding product behavior.

---

## Problem Frame

The current filter can analyze images, score risk, and expose a basic operator summary. Operators still need a single detailed review surface where internal evidence, Korean search evidence, provider evidence, LLM summaries, failures, and final manual actions are grouped coherently.

---

## Requirements

- R1. Provide a detailed operator review representation containing image reference, 0-100 score, band, top reasons, evidence groups, provider status, analysis failures, and manual decision actions. (Origin R1-R5, F2, AE7)
- R2. Add Naver search evidence as official text-query evidence only; do not upload images or automate/scrape Naver web UI. (Origin R6-R10, R20, AE1, AE2)
- R3. Add internal LLM assistance for query generation, search-result structuring, contradiction/deduplication, and operator summaries only. (Origin R11-R14, AE3)
- R4. Ensure LLM output is never a standalone scoring source or final decision authority. (Origin R12-R13, AE3)
- R5. Integrate enrichment failures and skipped providers as operator-visible reasons without reducing existing high-risk evidence. (Origin R5, R21, R23, AE4)
- R6. Keep approval, hold, and rejection as explicit operator actions; automated analysis must not change status. (Origin R15, AE5)
- R7. Preserve applicant isolation: applicants cannot see scores, search evidence, LLM summaries, provider details, or analysis failure reasons. (Origin R19, AE7)
- R8. Support rejection-derived knowledge accumulation and correction/deactivation paths that prevent bad decisions from poisoning future matching. (Origin R16-R18, F3, F4, AE5, AE6)
- R9. Keep existing Google Web Detection compliance gates and no-scraping boundaries intact. (Origin R20-R22)

**Origin actors:** A1 Applicant, A2 Operator, A3 Rights risk filter, A4 Internal LLM, A5 Naver Search API, A6 Google Cloud Vision Web Detection

**Origin flows:** F1 Search-enriched analysis, F2 Detailed review, F3 Decision-based criteria DB accumulation, F4 Correction and contamination prevention

**Origin acceptance examples:** AE1, AE2, AE3, AE4, AE5, AE6, AE7

---

## Scope Boundaries

- No real web admin frontend is built in this workspace. The deliverable is a backend review view model and integration contract that a target app can render.
- No Naver image-upload reverse search.
- No Google Image Search, Google Lens, Naver web UI automation, or scraping.
- No external LLM integration in this iteration.
- No LLM-based legal judgment, score calculation, automatic approval, or automatic rejection.
- No applicant-facing explanation, rights-evidence upload, or appeal UI.
- No dedicated brand/logo/trademark/stock-image detector; strong incidental evidence can still appear as operator evidence.
- No face recognition, celebrity identification from faces, face embeddings, or biometric template storage.

### Deferred to Follow-Up Work

- Target admin UI implementation: wire the detailed view model into the actual application once the app framework, routes, auth, and database exist.
- Full criteria database management screen: keep only the backend hooks needed for review feedback and contamination control here.
- Search-quality calibration: run pilot samples and tune query generation, result promotion, and scoring weights after the adapters exist.

---

## Context & Research

### Relevant Code and Patterns

- `src/rights_filter/domain/records.py` defines immutable-ish evidence records, analysis runs, knowledge-base entries, review statuses, data classes, and the in-memory repository.
- `src/rights_filter/jobs/batch_analyzer.py` orchestrates internal analysis, external derivative creation, Cloud Vision evidence, scoring, and run persistence.
- `src/rights_filter/integrations/cloud_vision_web_detection.py` uses a fake client boundary and maps provider results into `Evidence` records.
- `src/rights_filter/integrations/external_policy.py` provides a simple policy gate with disabled/compliance/metadata/online/quota checks.
- `src/rights_filter/admin/review_handlers.py` already separates operator-visible summaries from applicant summaries and creates automatic rejected-image entries on rejection.
- `src/rights_filter/analysis/risk_scoring.py` scores by evidence source and keeps failures from acting as exculpatory evidence.
- `tests/rights_filter/test_public_module_layout.py` protects the public module boundary; new planned modules should be added there.

### Institutional Learnings

- No `docs/solutions/` directory or prior institutional learning notes are present.
- There is no `STRATEGY.md`, `AGENTS.md`, or `CLAUDE.md` file in the workspace root; the active instructions are from the conversation context.

### External References

- Naver image search API is a REST API that accepts search terms and conditions as query-string data, not image uploads. It documents image search result fields such as title, link, thumbnail, size, and daily Search API quota. https://developers.naver.com/docs/serviceapi/search/image/image.md
- Naver Search API product page lists web, news, blog, image, encyclopedia, and other search surfaces, with a 25,000/day processing limit. https://developers.naver.com/products/service-api/search/search.md
- Naver API terms state API use is subject to provided conditions, allowed counts, client ID management, and policy compliance. https://developers.naver.com/products/terms
- Google Cloud Vision Data Usage FAQ explains the data-handling distinction that must remain part of the external enablement checklist. https://docs.cloud.google.com/vision/docs/data-usage
- Google Cloud Vision Web Detection can return web entities, matching images, visually similar images, pages with matching images, and best guess labels. https://docs.cloud.google.com/vision/docs/detecting-web

---

## Key Technical Decisions

- Backend view model first: since no admin app exists locally, implement the detailed review surface as structured presenter output rather than a web page.
- Evidence-source expansion over special-case strings: represent Naver search, LLM summaries, and enrichment skipped/failure states as first-class evidence sources so grouping, scoring, and presentation remain explicit.
- Text-only Naver adapter: mirror the existing fake-client adapter pattern, but accept only query text and provider options; image payload types must not be part of the Naver adapter interface.
- LLM as evidence organizer: use an internal LLM boundary that emits candidate queries and summaries tied to source evidence IDs or source URLs; never let LLM output feed scoring directly.
- Search-result promotion is conservative: Naver evidence contributes meaningful risk only when linked to named persons, works, characters, broadcasts, webtoons, games, official pages, or repeated matching-image sources.
- Enrichment orchestration is separate from the existing batch analyzer until the flow is proven: keep a focused enrichment job that can be called after or within batch analysis without destabilizing the internal-only path.
- Correction is provenance-driven: automatic rejection-derived entries must stay distinguishable from manual entries and deactivatable from the source decision path.

---

## Open Questions

### Resolved During Planning

- Detailed review screen form: implement a backend presenter/view model now; defer a real web admin screen because no app shell exists in this workspace.
- Naver role: use official text-query search only, with fake-client tests; do not perform image upload reverse search.
- LLM role: query generation, result structuring, and operator summary only; no score or final status authority.
- Rejection feedback default: create explicit automatic entries or candidates with provenance, and require correction/deactivation mechanics in the same iteration.

### Deferred to Implementation

- Exact Naver request tuning: final display count, sort order, and query variants should be validated with pilot samples and quota behavior.
- Internal LLM runtime: choose the actual local/internal model, prompt storage, logging policy, and deployment boundary in the target environment.
- Real admin app integration: map the review view model to framework routes, components, auth, and storage once the target app exists.
- Final score weights: tune Naver evidence contribution after sample outcomes are collected; start conservative.

---

## Output Structure

    src/
      rights_filter/
        analysis/
          evidence_enrichment.py
          llm_assistance.py
          search_result_promoter.py
        admin/
          detailed_review_presenter.py
          knowledge_base_handlers.py
          correction_handlers.py
        integrations/
          naver_search.py
          search_policy.py
        jobs/
          review_enrichment_job.py
    tests/
      rights_filter/
        analysis/
        admin/
        integrations/
        jobs/
    docs/
      operations/

The tree shows proposed additions. The implementer may adjust names to fit the target application, but should preserve the same boundaries.

---

## High-Level Technical Design

> *This illustrates the intended approach and is directional guidance for review, not implementation specification. The implementing agent should treat it as context, not code to reproduce.*

```mermaid
flowchart TB
  Run[Existing analysis run] --> Query[Internal LLM query generation]
  Query --> NaverGate{Naver policy allows?}
  NaverGate -->|yes| Naver[Naver text-query search]
  NaverGate -->|no| SearchSkipped[Search skipped evidence]
  Naver --> Promote[Search result promotion]
  Promote --> Ledger[Evidence ledger]
  SearchSkipped --> Ledger
  Run --> Ledger
  Ledger --> Summary[Internal LLM evidence summary]
  Summary --> Review[Detailed operator review view model]
  Review --> Decision[Manual approve / hold / reject]
  Decision --> Knowledge[Knowledge-base feedback]
  Knowledge --> Correction[Correction / deactivation]
```

---

## Implementation Units

```mermaid
flowchart TB
  U1[U1 Evidence model extensions]
  U2[U2 Naver search adapter]
  U3[U3 Internal LLM assistance]
  U4[U4 Enrichment orchestration]
  U5[U5 Scoring and reason promotion]
  U6[U6 Detailed review presenter]
  U7[U7 Decision feedback and correction]
  U8[U8 Governance docs and module layout]
  U1 --> U2
  U1 --> U3
  U2 --> U4
  U3 --> U4
  U4 --> U5
  U5 --> U6
  U6 --> U7
  U1 --> U8
  U7 --> U8
```

### U1. Evidence Model Extensions

**Goal:** Extend the domain records so search evidence, LLM summaries, provider skips, and source-linked summaries can be represented without stringly-typed workarounds.

**Requirements:** R1, R2, R3, R4, R5, R7

**Dependencies:** Existing domain model

**Files:**
- Modify: `src/rights_filter/domain/records.py`
- Modify: `src/rights_filter/governance/policies.py`
- Test: `tests/rights_filter/domain/test_records.py`
- Test: `tests/rights_filter/governance/test_policies.py`

**Approach:**
- Add evidence source categories for Naver search, LLM summary, search skipped, and enrichment failure while keeping existing source names stable.
- Add data-class coverage for search evidence and LLM summary output so governance policies can distinguish them from provider metadata and operator notes.
- Preserve the simple `Evidence` shape, but standardize expected `data` keys by source in tests and presenters.
- Keep the repository in-memory for this workspace; do not invent a database layer.

**Execution note:** Write tests first for the privacy and governance boundary: LLM summary records must point back to source evidence, and Naver records must not contain uploaded-image payloads.

**Patterns to follow:**
- Follow the enum/dataclass style in `src/rights_filter/domain/records.py`.
- Follow policy mapping style in `src/rights_filter/governance/policies.py`.

**Test scenarios:**
- Happy path: a Naver search evidence record stores query, result URL, image URL, thumbnail URL, title, rank, and retrieved timestamp.
- Happy path: an LLM summary evidence record references source evidence IDs or source URLs and is classified as operator-only data.
- Edge case: Naver evidence data does not accept original-image or derivative-image payload markers.
- Error path: governance validation rejects LLM summary data that claims standalone authority without source references.
- Regression: existing fingerprint, face/person, web detection, external skipped, and failure evidence remain importable and scoreable.

**Verification:**
- The domain layer can represent every enrichment requirement without exposing applicant-visible fields or storing biometric artifacts.

### U2. Naver Text-Query Search Adapter

**Goal:** Add a Naver integration boundary that performs official text-query search through a fake-client contract in tests and records provider outcomes as evidence candidates.


**Requirements:** R2, R5, R9

**Dependencies:** U1

**Files:**
- Create: `src/rights_filter/integrations/search_policy.py`
- Create: `src/rights_filter/integrations/naver_search.py`
- Test: `tests/rights_filter/integrations/test_naver_search.py`
- Test: `tests/rights_filter/integrations/test_search_policy.py`

**Approach:**
- Model a search policy with disabled state, compliance approval, daily quota, calls made, and allowed provider names.
- Implement a fake-client adapter pattern similar to `CloudVisionWebDetectionAdapter`.
- Accept only text query requests and search parameters; keep image payload types out of the public method boundary.
- Map Naver result items into evidence records with source, query, rank, result URLs, thumbnail, title/description, and provider status.
- Record skipped, quota-exhausted, and provider-error states as operator-visible evidence.

**Execution note:** Start with adapter contract tests using a fake Naver client; do not add real credentials or live outbound calls.

**Patterns to follow:**
- `src/rights_filter/integrations/cloud_vision_web_detection.py` fake client and response mapper.
- `src/rights_filter/integrations/external_policy.py` simple policy-gate style.

**Test scenarios:**
- Happy path: approved text query returns multiple Naver result evidence records with query and rank preserved.
- Edge case: empty result set creates a low-confidence "no results" evidence record instead of disappearing.
- Error path: disabled policy, quota exhaustion, or provider exception records a skipped/failure evidence item and makes no client call when policy blocks it.
- Policy: passing an image payload-shaped input to the Naver adapter boundary is rejected before any provider call.
- Integration: Naver evidence can be stored on an analysis run alongside fingerprint and Google Web Detection evidence.

**Verification:**
- No test path sends original images or derivatives to Naver, and skipped search still appears in operator review data.

### U3. Internal LLM Query and Summary Assistance

**Goal:** Add an internal LLM boundary that can generate Korean search queries and summarize evidence for operators while remaining source-linked and non-authoritative.

**Requirements:** R3, R4, R5, R7

**Dependencies:** U1

**Files:**
- Create: `src/rights_filter/analysis/llm_assistance.py`
- Create: `src/rights_filter/analysis/search_query_generation.py`
- Test: `tests/rights_filter/analysis/test_llm_assistance.py`
- Test: `tests/rights_filter/analysis/test_search_query_generation.py`

**Approach:**
- Define an internal assistant interface with deterministic fake implementation for tests.
- Generate query candidates from existing evidence, OCR/label placeholders when present, knowledge-base names/aliases, and Google Web Detection entities.
- Structure assistant summaries as evidence records that cite source evidence IDs or URLs.
- Add guardrails that mark ungrounded assistant claims as notes, not risk reasons.
- Keep all assistant outputs operator-only and exclude them from applicant summaries.

**Execution note:** Test first that LLM output without citations cannot become a score reason.

**Patterns to follow:**
- Existing analysis service classes under `src/rights_filter/analysis/`.
- Existing applicant/operator separation in `src/rights_filter/admin/review_handlers.py`.

**Test scenarios:**
- Happy path: web entity "아이유" and alias "IU" produce Korean query candidates that include person/work context without duplicates.
- Happy path: LLM summary cites Naver and Google evidence URLs and appears in the operator review model.
- Edge case: duplicate or contradictory search results are summarized as uncertainty, not collapsed into one definitive claim.
- Error path: LLM provider failure records an enrichment failure evidence item and does not block existing analysis.
- Policy: source-less LLM claims are excluded from scoring reasons and applicant summaries.

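
The citation guardrail described in U3 can be sketched as a small partition step. This is a minimal illustration, not the module's actual interface: `AssistantClaim`, `partition_claims`, and the field names are hypothetical, and it assumes a claim counts as grounded when it cites at least one evidence ID already present on the run or at least one source URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssistantClaim:
    """Hypothetical shape for one claim emitted by the internal LLM boundary."""

    text: str
    source_evidence_ids: tuple[str, ...] = ()
    source_urls: tuple[str, ...] = ()


def partition_claims(claims, known_evidence_ids):
    """Split claims into (grounded, notes).

    Assumption: a claim is grounded only if it cites an evidence ID that
    exists on the run, or at least one source URL. Everything else is kept
    as an operator note and must not become a risk reason.
    """
    grounded, notes = [], []
    for claim in claims:
        cited_ids = [i for i in claim.source_evidence_ids if i in known_evidence_ids]
        if cited_ids or claim.source_urls:
            grounded.append(claim)
        else:
            notes.append(claim)
    return grounded, notes
```

Citing an evidence ID that is not on the run is treated the same as citing nothing, so a fabricated reference cannot promote a claim. A stricter variant could also require cited URLs to match URLs already stored in search evidence.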

**Verification:**
- Internal LLM assistance reduces operator reading work while remaining auditable and non-authoritative.

### U4. Review Enrichment Orchestration

**Goal:** Orchestrate query generation, Naver search, result promotion, LLM summary creation, and evidence persistence around existing analysis runs.

**Requirements:** R1, R2, R3, R5

**Dependencies:** U1, U2, U3

**Files:**
- Create: `src/rights_filter/analysis/evidence_enrichment.py`
- Create: `src/rights_filter/jobs/review_enrichment_job.py`
- Modify: `src/rights_filter/jobs/batch_analyzer.py`
- Test: `tests/rights_filter/analysis/test_evidence_enrichment.py`
- Test: `tests/rights_filter/jobs/test_review_enrichment_job.py`
- Test: `tests/rights_filter/jobs/test_batch_analyzer.py`

**Approach:**
- Keep enrichment idempotent by analysis version and provider/query signature.
- Allow the batch analyzer to remain useful in internal-only mode; enrichment should be callable after a run is created or disabled entirely.
- Store each query and provider outcome as evidence so the detailed review surface can explain missing or skipped evidence.
- Preserve partial success: one failed query or provider should not invalidate other evidence.
- Record operational counters for generated queries, executed searches, skipped searches, provider failures, and summary failures.

**Patterns to follow:**
- `src/rights_filter/jobs/batch_analyzer.py` summary counters and idempotency check.
- Evidence append flow through `AnalysisRun.add_evidence`.

**Test scenarios:**
- Happy path: an analysis run with Google entity evidence generates Naver queries, stores Naver evidence, stores an LLM summary, and recomputes a score.
- Happy path: rerunning enrichment with the same analysis version and same query signature does not duplicate evidence.
- Edge case: Naver disabled still records search-skipped evidence and creates an LLM summary from internal/Google evidence when possible.
- Error path: corrupt or missing analysis run returns a failure summary without creating a misleading low-risk result.
- Integration: batch analysis can run without enrichment, and enrichment can run later against stored runs.

**Verification:**
- Search enrichment can be enabled, disabled, retried, and audited independently of the internal-only baseline.

### U5. Search Result Promotion, Scoring, and Reasons

**Goal:** Convert promoted search evidence into conservative score contributions and operator-readable reasons while preventing LLM output from directly affecting the score.

**Requirements:** R1, R2, R3, R4, R5

**Dependencies:** U1, U2, U3, U4

**Files:**
- Create: `src/rights_filter/analysis/search_result_promoter.py`
- Modify: `src/rights_filter/analysis/risk_scoring.py`
- Modify: `src/rights_filter/analysis/reason_builder.py`
- Test: `tests/rights_filter/analysis/test_search_result_promoter.py`
- Test: `tests/rights_filter/analysis/test_risk_scoring.py`
- Test: `tests/rights_filter/analysis/test_reason_builder.py`

**Approach:**
- Promote Naver results only when they connect the query/result to named people, works, characters, broadcasts, webtoons, games, official sources, or repeated matching-image sources.
- Treat low-confidence Naver evidence as operator context, not high-risk proof.
- Ensure LLM summary evidence is visible but contributes zero direct score points.
- Keep failures and skips visible; failures add uncertainty where appropriate but never reduce stronger high-risk evidence.
- Preserve historical score behavior for existing fingerprint, face/person, and Google evidence.

**Execution note:** Add regression tests that face/person evidence alone and LLM-only evidence do not produce high risk.

**Patterns to follow:**
- `src/rights_filter/analysis/risk_scoring.py` source-based scoring and unique reason handling.

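
The scoring constraints in U5 (zero direct weight for LLM summaries, failures that never lower a score) can be illustrated with a source-weight table. This is a rough sketch under stated assumptions: the source names, the weights, and `score_run` are hypothetical placeholders, not the values or functions in `risk_scoring.py`, and each source is counted at most once.

```python
from dataclasses import dataclass

# Hypothetical source names and weights, chosen only for illustration.
# All weights are non-negative, so no evidence item can reduce a score.
SOURCE_WEIGHTS = {
    "rejected_image_similarity": 70,
    "google_web_detection": 25,
    "naver_search_promoted": 15,
    "naver_search_context": 0,  # visible to operators, no score impact
    "llm_summary": 0,           # reading aid only
    "search_skipped": 0,
    "enrichment_failure": 0,
}


@dataclass(frozen=True)
class ScoredEvidence:
    source: str
    reason: str


def score_run(evidence):
    """Return (score, reasons) with the score capped at 100.

    Each source contributes its weight once; reasons are de-duplicated and
    only collected from sources that actually carry weight.
    """
    seen_sources = set()
    reasons = []
    total = 0
    for item in evidence:
        weight = SOURCE_WEIGHTS.get(item.source, 0)
        if weight <= 0:
            continue
        if item.source not in seen_sources:
            seen_sources.add(item.source)
            total += weight
        if item.reason not in reasons:
            reasons.append(item.reason)
    return min(total, 100), reasons
```

Because zero-weight sources are skipped before reasons are collected, an LLM summary or a provider failure stays in the evidence list for presentation but cannot appear as a score reason.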

**Test scenarios:**
- Happy path: Naver evidence linking a Korean celebrity name and repeated image sources contributes medium or high review risk with clear reasons.
- Happy path: Naver evidence for a known character/work plus Google matching-image evidence reaches high risk.
- Edge case: generic image results with no named person/work/character remain context-only and do not create high risk.
- Error path: LLM summary claiming a celebrity match without source references contributes no score reason.
- Regression: external provider failure does not lower a prior rejected-image similarity score.

**Verification:**
- Operators see why search evidence mattered, and scores remain explainable without trusting LLM prose.

### U6. Detailed Operator Review Presenter

**Goal:** Build the backend representation of the detailed review screen, grouping evidence and actions so a future web admin UI can render it directly.

**Requirements:** R1, R6, R7

**Dependencies:** U1, U5

**Files:**
- Create: `src/rights_filter/admin/detailed_review_presenter.py`
- Modify: `src/rights_filter/admin/review_handlers.py`
- Modify: `src/rights_filter/admin/review_presenters.py`
- Test: `tests/rights_filter/admin/test_detailed_review_presenter.py`
- Test: `tests/rights_filter/admin/test_review_handlers.py`

**Approach:**
- Produce a detailed review dictionary or dataclass containing submission ID, image reference, score, band, top reasons, grouped evidence, provider statuses, LLM summaries, failures, and allowed manual actions.
- Group evidence into internal, Naver, Google, LLM, failure/skipped, and knowledge-base sections.
- Keep applicant summaries minimal and unchanged except for explicit regression coverage.
- Surface missing analysis as a review state rather than hiding the submission.
- Keep action affordances separate from automated recommendation data.

**Patterns to follow:**
- Existing `operator_summary_for` and `applicant_summary_for` behavior in `src/rights_filter/admin/review_handlers.py`.
- Existing tests in `tests/rights_filter/admin/test_review_handlers.py`.

**Test scenarios:**
- Happy path: detailed review output includes image reference, score, band, top reasons, Naver evidence group, Google evidence group, and LLM summary group.
- Happy path: high-risk analysis appears to operators but does not change review status automatically.
- Edge case: missing analysis returns a review model with analysis unavailable and manual actions still controlled by the host workflow.
- Error path: failed Naver, failed LLM, or disabled Google evidence appears under provider status/failure groups.
- Security: applicant summary excludes score, reasons, evidence, LLM summaries, provider status, and failure details.

**Verification:**
- A real admin UI can be built from the presenter output without exposing internal evidence to applicants.

### U7. Decision Feedback and Contamination Control

**Goal:** Strengthen rejection-derived knowledge accumulation and correction flows so automatic entries are useful but reversible.

**Requirements:** R6, R8

**Dependencies:** U1, U6

**Files:**
- Create: `src/rights_filter/admin/knowledge_base_handlers.py`
- Create: `src/rights_filter/admin/correction_handlers.py`
- Modify: `src/rights_filter/admin/decision_feedback.py`
- Modify: `src/rights_filter/domain/knowledge_base.py`
- Test: `tests/rights_filter/admin/test_knowledge_base_handlers.py`
- Test: `tests/rights_filter/admin/test_correction_handlers.py`
- Test: `tests/rights_filter/domain/test_knowledge_base.py`

**Approach:**
- Keep automatic rejection-derived entries distinct from manual operator registrations and search-result candidates.
- Add correction handling that deactivates entries derived from a corrected decision while preserving audit history.
- Allow manual entity registration with names, aliases, related keywords, policy memo, exception notes, and sample fingerprints.
- For this workspace, implement repository-level behavior only; defer role-based UI and persistent audit tables to target app integration.

**Execution note:** Test first around stale automatic entries: a corrected rejection must stop influencing future matching.

**Patterns to follow:**
- `InMemoryRightsFilterRepository.create_rejected_image_entry`.
- `KnowledgeBaseEntry` provenance and active/deactivation fields in `src/rights_filter/domain/records.py`.

**Test scenarios:**
- Happy path: rejecting a submission creates an automatic rejected-image entry with source decision provenance.
- Happy path: manually registering a celebrity or character entry creates a manual entry with aliases and policy memo.
- Edge case: automatic and manual entries with similar names remain separate and independently deactivatable.
- Error path: correcting a rejection deactivates derived automatic entries but leaves manual entries untouched.
- Privacy: attempts to register face embeddings or biometric templates are rejected by governance validation.

**Verification:**
- The knowledge base can grow from real operator decisions without making false positives permanent.

### U8. Governance, Operations, and Public Module Layout

**Goal:** Update operations guidance, public module tests, and governance policies so the new enrichment modes remain discoverable and safe to operate.

**Requirements:** R2, R3, R4, R5, R7, R9

**Dependencies:** U1, U7

**Files:**
- Modify: `docs/operations/image-rights-risk-filter.md`
- Modify: `tests/rights_filter/test_public_module_layout.py`
- Modify: `src/rights_filter/integrations/__init__.py`
- Modify: `src/rights_filter/analysis/__init__.py`
- Modify: `src/rights_filter/admin/__init__.py`
- Test: `tests/rights_filter/governance/test_policies.py`

**Approach:**
- Document search-enriched mode and LLM-assisted mode with explicit disable guidance.
- Add public module imports for planned modules so missing boundaries fail fast.
- Update governance tests for Naver text-only evidence, LLM source-linking, applicant isolation, and no-scraping boundaries.
- Keep external API enablement documentation explicit: Naver credentials and Google credentials are separate provider risks.

**Patterns to follow:**
- Existing operations doc structure in `docs/operations/image-rights-risk-filter.md`.
- Existing public module import test in `tests/rights_filter/test_public_module_layout.py`.

**Test scenarios:**
- Happy path: all new public modules import successfully.
- Policy: operations guidance states Naver uses text queries only and LLM summaries are reading aids.
- Security: governance tests cover applicant non-exposure for Naver evidence and LLM summaries.
- Regression: existing Cloud Vision compliance-gated mode remains documented and tested.

**Verification:**
- A deployer can identify how to enable, disable, audit, and explain search and LLM enrichment without weakening existing safety boundaries.

---

## System-Wide Impact

- **Interaction graph:** The enrichment flow touches analysis runs, evidence source typing, provider adapters, scoring, operator presenters, decision feedback, and governance policies.
- **Error propagation:** Naver, LLM, and Google failures become evidence/failure groups for operators, not silent success and not low-risk proof.
- **State lifecycle risks:** Enrichment must be idempotent by run/provider/query to avoid duplicate evidence; rejection-derived knowledge entries must be deactivatable when decisions are corrected.
- **API surface parity:** Operator-only surfaces gain richer evidence; applicant-facing summaries must stay intentionally sparse.
- **Integration coverage:** End-to-end tests should cover analysis run -> enrichment -> score -> detailed review -> rejection -> knowledge entry -> correction.
- **Unchanged invariants:** The filter never changes review status automatically, never exposes provider evidence to applicants, and never uses face identity recognition.
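
The idempotency requirement under state lifecycle risks (by run, provider, and query, plus analysis version as stated in U4) could be met with a deterministic signature. The sketch below is illustrative only: `enrichment_signature` and `EnrichmentLedger` are hypothetical names, and the whitespace/case normalization of queries is an assumption about what should count as "the same query".

```python
import hashlib


def enrichment_signature(run_id, analysis_version, provider, query):
    """Build a stable key for one enrichment attempt.

    Assumption: queries that differ only in whitespace or letter case are
    the same query. The unit separator keeps field boundaries unambiguous.
    """
    normalized_query = " ".join(query.split()).casefold()
    raw = "\x1f".join([run_id, analysis_version, provider, normalized_query])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class EnrichmentLedger:
    """Hypothetical in-memory ledger that stores each signature at most once."""

    def __init__(self):
        self._seen = set()
        self.evidence = []

    def add_once(self, signature, evidence):
        """Append evidence unless this signature was already recorded."""
        if signature in self._seen:
            return False
        self._seen.add(signature)
        self.evidence.append(evidence)
        return True
```

Including the analysis version in the key means a re-analysis under a new version is allowed to enrich again, while a retry within the same version is a no-op.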

---

## Risks & Dependencies

| Risk | Mitigation |
|------|------------|
| Naver API is mistaken for reverse image search | Keep adapter text-query only, reject image payload inputs, and document the official API limitation. |
| LLM hallucination pollutes scores | Require source references for summaries and assign zero direct scoring weight to LLM evidence. |
| Search results create false positives | Promote only strongly linked person/work/character evidence, keep context-only results visible but low impact, and preserve operator judgment. |
| External provider cost or quota spikes | Add provider policies, daily limits, skipped evidence, and operational counters. |
| Detailed presenter leaks to applicants | Keep separate operator and applicant presenters with explicit regression tests. |
| Automatic rejection entries poison future matching | Preserve provenance, add correction/deactivation flows, and test stale-entry removal from active matching. |
| Real app integration differs from portable core | Keep this plan focused on backend contracts; defer routes, UI components, auth, and persistence wiring to target app integration. |

---

## Alternative Approaches Considered

- Build the real web admin screen now: rejected because this workspace contains no admin app, frontend stack, routes, auth, or database.
- Search-enrichment-only implementation: rejected because operators need a detailed surface to judge evidence quality.
- LLM-first scoring: rejected because source-less LLM output is not reliable enough for rights-risk decisions and conflicts with the origin safety boundary.
- Naver scraping or browser automation: rejected because official APIs are available for text-query search and UI automation would create unnecessary legal and operational risk.

---

## Phased Delivery

### Phase 1: Evidence and provider foundation

- Land U1 and U2 so Naver search evidence has a safe, typed, text-only path.
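
As a rough picture of the text-only path Phase 1 lands, the sketch below combines a policy gate with a fake-client adapter. Every name here (`SearchPolicy`, `FakeNaverClient`, `NaverSearchAdapter`, `search_image`, the evidence keys) is a hypothetical stand-in rather than the planned module's interface; only the result fields `title`, `link`, and `thumbnail` come from the Naver image search reference cited in this plan, and no live call is made.

```python
from dataclasses import dataclass


@dataclass
class SearchPolicy:
    """Hypothetical policy gate: disabled state, compliance, and daily quota."""

    enabled: bool = True
    compliance_approved: bool = True
    daily_quota: int = 25000
    calls_made: int = 0

    def block_reason(self):
        if not self.enabled:
            return "disabled"
        if not self.compliance_approved:
            return "compliance_not_approved"
        if self.calls_made >= self.daily_quota:
            return "quota_exhausted"
        return None


class FakeNaverClient:
    """Test double that returns canned items and records each query it sees."""

    def __init__(self, items):
        self.items = items
        self.calls = []

    def search_image(self, query, display):
        self.calls.append(query)
        return self.items[:display]


class NaverSearchAdapter:
    def __init__(self, client, policy):
        self.client = client
        self.policy = policy

    def search(self, query, display=10):
        # Reject anything that is not text before the policy or client is touched.
        if not isinstance(query, str):
            raise TypeError("Naver adapter accepts text queries only")
        if not query.strip():
            raise ValueError("query must not be empty")
        reason = self.policy.block_reason()
        if reason:
            return [{"source": "search_skipped", "provider": "naver",
                     "query": query, "reason": reason}]
        self.policy.calls_made += 1
        try:
            items = self.client.search_image(query, display)
        except Exception as exc:  # provider errors become visible evidence
            return [{"source": "enrichment_failure", "provider": "naver",
                     "query": query, "reason": str(exc)}]
        if not items:
            return [{"source": "naver_search", "query": query, "status": "no_results"}]
        return [{"source": "naver_search", "query": query, "rank": rank,
                 "status": "ok", "title": item.get("title"),
                 "link": item.get("link"), "thumbnail": item.get("thumbnail")}
                for rank, item in enumerate(items, start=1)]
```

Checking the input type before consulting the policy keeps the "no image payloads" rule independent of whether the provider happens to be enabled.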

### Phase 2: LLM and enrichment pipeline

- Land U3 and U4 so query generation, search execution, summaries, and provider failure handling can run around existing analysis.

### Phase 3: Scoring and operator review

- Land U5 and U6 so promoted evidence affects risk conservatively and operators can inspect grouped evidence in one view model.

### Phase 4: Feedback, correction, and operations

- Land U7 and U8 so decisions improve future matching without permanent false-positive contamination.

---

## Documentation / Operational Notes

- Update the operations doc with Naver credential handling, query-only usage, provider quotas, and emergency disable mode.
- Document that LLM summaries are reading aids and must cite source evidence.
- Document how to interpret search-skipped, no-results, provider-failure, and LLM-failure states.
- Document correction flow for rejection-derived knowledge entries.

---

## Sources & References

- **Origin document:** [docs/brainstorms/2026-05-25-image-rights-review-enrichment-requirements.md](docs/brainstorms/2026-05-25-image-rights-review-enrichment-requirements.md)
- **Base plan:** [docs/plans/2026-05-25-001-feat-image-rights-risk-filter-plan.md](docs/plans/2026-05-25-001-feat-image-rights-risk-filter-plan.md)
- **Operations doc:** [docs/operations/image-rights-risk-filter.md](docs/operations/image-rights-risk-filter.md)
- Related code: [src/rights_filter/domain/records.py](src/rights_filter/domain/records.py)
- Related code: [src/rights_filter/admin/review_handlers.py](src/rights_filter/admin/review_handlers.py)
- Related code: [src/rights_filter/jobs/batch_analyzer.py](src/rights_filter/jobs/batch_analyzer.py)
- Related code: [src/rights_filter/integrations/cloud_vision_web_detection.py](src/rights_filter/integrations/cloud_vision_web_detection.py)
- Related tests: [tests/rights_filter/admin/test_review_handlers.py](tests/rights_filter/admin/test_review_handlers.py)
- Naver Image Search API: https://developers.naver.com/docs/serviceapi/search/image/image.md
- Naver Search API product page: https://developers.naver.com/products/service-api/search/search.md
- Naver API terms: https://developers.naver.com/products/terms
- Google Cloud Vision Data Usage FAQ: https://docs.cloud.google.com/vision/docs/data-usage
- Google Cloud Vision Web Detection: https://docs.cloud.google.com/vision/docs/detecting-web