cw2007/POSA_Copyrighter

Author	SHA1	Message	Date
유창욱	40501e13f1	refactor: extract SQLite persistence primitives into StorePersistenceMixin Move connection/transaction management and the generic _put/_get/_all row access plus evidence-read helpers into a mixin; CopyrighterStore now inherits it. Methods rely on the host class's self.db_path / self._write_lock. Behavior-preserving. sqlite_store.py 3368 -> 3072 lines.	2026-06-20 21:46:01 +09:00
유창욱	3be7b016ce	refactor: extract store constants and remaining domain helpers Move shared constants + _bounded_int_env into store_constants (a leaf module), and the remaining module-level domain helpers (validation, query signatures, search-hint evidence, watchlist selection, knowledge type/provenance) into store_serialization. sqlite_store.py is now the CopyrighterStore class plus thin imports: 3613 -> 3368 lines (5333 -> 3368 overall, -37%). All behavior-preserving.	2026-06-20 21:38:03 +09:00
유창욱	8e53139029	refactor: extract payload serialization helpers into store_serialization Move submission/evidence payload builders, provider-state derivation, UI<->domain evidence mapping, weak-label handling, and id/label/image helpers into store_serialization (depends only on stdlib + domain + url/text helpers, no store coupling). Behavior-preserving; imported back into sqlite_store. 3992 -> 3613 lines.	2026-06-20 21:24:58 +09:00
유창욱	e3bc99e6b9	refactor: extract text helpers and HTML/CSS image-scraping from sqlite_store Move the pure text helpers (_text_list, _unique_texts) into store_text and the ~950-line page/CSS/JSON/srcset image-URL extraction (the _PageImageParser and its helpers) into store_page_scrape. Both behavior-preserving; store_page_scrape depends only on stdlib + url/text helpers + domain Evidence (no store coupling). sqlite_store.py 4955 -> 3992 lines.	2026-06-20 21:10:22 +09:00
유창욱	bd35cf6f3f	docs: record remediation implementation status in plan	2026-06-20 20:57:02 +09:00
유창욱	da917755dd	refactor: extract remote-fetch and schema modules from sqlite_store Move the pure network layer (image/page/stylesheet fetchers, SSRF guard, redirect-validating opener) into store_remote_fetch, and the DDL/typed-column/constraint-migration helpers into store_schema. Both are behavior-preserving relocations imported back into sqlite_store; tests repoint their fetch monkeypatches to store_remote_fetch. sqlite_store.py 5333 -> 4948 lines.	2026-06-20 20:50:29 +09:00
유창욱	e66f9d5001	refactor: extract URL helpers into store_url_utils Move the SQLite store's 5 URL helpers (_decoded_nested_url, _is_http_url, _url_path_has_image_suffix, _url_has_image_format_hint, _url_looks_like_image) into a focused module and import them back. Pure relocation, no behavior change. First step of splitting the 5300-line sqlite_store god file.	2026-06-20 20:35:21 +09:00
유창욱	7317bfb2b3	fix: reject protocol-relative and backslash URLs in safeUrl Address commit security review: the same-origin branch of safeUrl accepted //host and /\host, which browsers normalize to an external host (open redirect). Allow only true same-origin paths.	2026-06-20 18:47:13 +09:00
유창욱	f8aa10f91b	fix: frontend URL scheme allowlist, fetch ok-check, image onerror Add safeUrl() to gate external search-result URLs into href/src (blocks javascript:/data:), parse the response body before the ok check in apiJson so non-JSON error bodies surface the real status, and hide broken evidence preview images via onerror.	2026-06-20 18:44:20 +09:00
유창욱	7f5799e5e1	fix: PII retention, write-race serialization, and correctness fixes Governance: purge biometric face crops past a retention window (env COPYRIGHTER_FACE_CROP_RETENTION_DAYS, default 90d) with an audit trail, run at startup and reload; audit personal-image transmission to external Vision. Concurrency: a process write lock + atomic provider-usage delta stop lost counter updates; candidate promotion is idempotent (deterministic id + status guard); seeding is serialized. Correctness: skip LLM summarize when a summary already exists; constraint migration cleans orphan temp tables on failure. Add provider-readiness startup log. Tests pin all of the above plus risk-band boundaries (29/30/69/70, 100 cap) and media path-traversal guards.	2026-06-20 18:44:08 +09:00
유창욱	1abb1107a2	fix: cookie-based operator auth keeps token out of URLs Address commit security review: replace the ?token= query fallback (which leaked the token into logs/referrers) with an HttpOnly, SameSite=Strict session cookie minted on the first header-authenticated request, so <img> media loads authenticate without a URL token. Use hmac.compare_digest for constant-time comparison and add Cache-Control: no-store + Referrer-Policy: no-referrer on untrusted biometric media. Also cover upload/import boundary validation (400) at the HTTP layer.	2026-06-20 18:43:53 +09:00
유창욱	62e2d183f8	fix: block SSRF to internal addresses in remote fetchers Resolve each candidate image/page/stylesheet URL and refuse loopback, RFC1918, link-local (cloud-metadata), reserved, multicast, and unspecified targets before fetching; re-validate on every redirect hop via a custom opener. URLs originate from external search-result content, so this stops the operator server from fetching internal services.	2026-06-20 18:22:10 +09:00
유창욱	8958dd1b83	chore: pin runtime dependencies for offline air-gapped install Add requirements.txt (numpy/opencv-python-headless/pillow — the only third-party runtime imports) and requirements-dev.txt, plus an offline install runbook. Ignore .coverage and wheelhouse/.	2026-06-20 18:19:08 +09:00
유창욱	20a6f55408	fix: SQLite concurrency safety and atomic decision writes Enable WAL + busy_timeout in _connect (ThreadingHTTPServer concurrent operators no longer hit 'database is locked'), add a _transaction helper and thread an optional conn through _put/_get/add_audit_event so record_decision commits its status change, watchlist entry, and audit event atomically.	2026-06-20 18:19:08 +09:00
유창욱	e9a15e8110	fix: harden operator HTTP server Remove wildcard CORS (prevents cross-origin reads of biometric/case data from localhost), add optional shared-token auth gate on data routes (COPYRIGHTER_AUTH_TOKEN; GUI shell + /health stay open), cap request body size (413), and map malformed JSON to 400 and SQLite lock contention to 503.	2026-06-20 18:18:54 +09:00
유창욱	62c13faafa	docs: implementation plan for project-review remediation	2026-06-20 18:18:54 +09:00
유창욱	37294dc140	fix: resolve multi-agent review findings for workbench efficiency round	2026-06-12 18:44:35 +09:00
유창욱	4d98582ed3	feat: rerun enrichment evidence diff with score delta and new-evidence badges	2026-06-12 18:00:43 +09:00
유창욱	1e0f4f8690	feat: persist and display detected face crop thumbnails in workbench	2026-06-12 17:56:09 +09:00
유창욱	646b871b76	feat: knowledge base search/filter, inline edit, and server-backed lifecycle actions	2026-06-12 17:51:36 +09:00
유창욱	cd9d69dddb	feat: knowledge entry update/deactivate/reactivate endpoints with audit events	2026-06-12 17:48:26 +09:00
유창욱	cf342425c5	feat: expose google_search as operator manual text-query provider	2026-06-12 17:46:45 +09:00
유창욱	4abb837aaa	feat: one-click and batch execution for suggested evidence queries	2026-06-12 17:44:48 +09:00
유창욱	63bbf0d755	docs: implementation plan for operator workbench efficiency (F1-F5)	2026-06-12 17:40:34 +09:00
유창욱	b4b8f4b5d8	docs: operator workbench efficiency design (F1-F5)	2026-06-11 16:03:15 +09:00
유창욱	7cac0b3835	fix: resolve code-review findings from the clean-review restyle Correctness: - Make the local-artifact audit test skip on fresh clones (data/ is gitignored), so the suite passes outside this workstation - Drop the transform from the viewRise entrance animation: an animated transform made .view.active a containing block for 320ms and threw the fixed decision panel off-screen on every workbench entry - Collapse the queue toolbar at 1380px instead of 1180px; 1280x800 laptops no longer get a horizontal scrollbar (verified live) - Serve .woff2 as font/woff2 with an immutable cache header so the 2MB bundled font is fetched once, not per page load (with test) - Clip overflow on top-bar status chips (long apiError strings spilled over neighbors at 981-1180px) - Give queue-row selection a selector that outranks the even-row zebra stripe (selection background was parity-dependent) Cleanup: - Replace the stale old-palette focus ring and ::selection literals with color-mix over var(--teal) - Delete dead tokens: unused back-compat aliases (the comment claiming they were referenced was false), --rail-bot, --ochre-deep, and --font-stamp (identical to --font-ui since the Pretendard switch) - Tokenize scattered raw colors: rail ink scale, soft tint levels, inset-well and bevel shadows, naver/internal source-chip triplets - Remove the asset-preload div and three orphan SVGs nothing renders; tests now reject reintroducing them Verified: 359 tests pass; Playwright audit at 1440/1280/390 shows zero horizontal overflow on all views, Pretendard active, decision panel fixed at the viewport corner mid-animation.	2026-06-11 11:13:46 +09:00
유창욱	ed701bd436	feat: clean review-instrument restyle with bundled Pretendard font - Bundle Pretendard Variable woff2 locally (air-gapped safe, no CDN) and switch UI/stamp font stacks to it; preload in index.html - Replace the forensic-dossier paper theme with a flat neutral cool palette: single teal accent, white cards, no noise texture, and zero linear/radial gradients (per design contract) - Restore the product-purpose top-bar block and its CSS, drop the unused global search form, and strip the stray UTF-8 BOM - Re-skin queue hover/selection, eyebrows, nav rail, chips, and empty states to the neutral palette; tabular numerals for numbers - Regenerate ui-overhaul final audit artifacts: zero horizontal overflow across 8 views at 1440x900 and 390x844, Pretendard active Design spec: docs/superpowers/specs/2026-06-11-operator-console-clean-review-ui-design.md Plan: docs/plans/2026-06-11-001-feat-operator-console-clean-review-ui-plan.md Tests: 358 passed (full suite incl. browser smoke)	2026-06-11 10:31:16 +09:00
유창욱	3f7b3a9cf2	chore: initial commit of copyrighter (rights_filter) Image rights / copyright detection system: SQLite store, HTTP app, search integrations (Naver, Google Custom Search, Google Cloud Vision web detection), image analysis (fingerprints, face/person detection, evidence enrichment, risk scoring), an admin/review layer, governance and retention policies, batch jobs, and a browser-based operator GUI. This baseline incorporates a full code-review remediation pass (46 fixes; 358 tests passing). Highlights: CRITICAL - Prevent evidence cascade-delete during the schema-constraint migration by disabling FK enforcement around the table rebuild. Security - Sandbox served media (neutralize stored XSS from uploaded/collected SVGs) via CSP + nosniff on the untrusted media routes. - Strip embedded EXIF/GPS from external image derivatives before they are sent to third-party APIs. - Return a clean 404 (not an uncaught StopIteration) for PATCH on an unknown provider. Correctness - LLM-summary failures no longer add +30 to the risk score. - Decode only explicit JS escapes so Korean image URLs are not mangled. - Consume search quota only after a successful request. - Naver/Google adapters map responses inside the failure boundary, so a malformed response degrades to evidence instead of crashing enrichment. - Domain-aware provider attribution; face-box IoU de-duplication; count searches (not result items); per-box crop isolation; clamp evidence confidence and Google CSE num; real submittedEpoch; and more. Robustness - Offline LLM connect fast-fails (short connect timeout) so seed/reload requests are not stalled; full read timeout preserved for generation. - Malformed numeric env vars fall back to defaults instead of crashing startup. Performance - Per-submission evidence reads (no full-table scan per rescore), audit-log LIMIT, lazy active-store lookup, hoisted timestamps. Tests - ~24 regression tests added pinning the above fixes. Runtime data (data/, outputs/, .sqlite3, .log), secrets (.env), and node_modules are gitignored.	2026-06-09 09:50:31 +09:00

28 commits
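
For readers unfamiliar with the address classes named in commit 62e2d183f8 (loopback, RFC1918, link-local, reserved, multicast, unspecified), the check can be sketched with Python's standard `ipaddress` module. This is only an illustration under assumptions, not the repository's code: the function name `is_blocked_address` is hypothetical, and the sketch classifies an already-resolved address only. The DNS resolution and per-redirect re-validation described in the commit are not shown.

```python
import ipaddress


def is_blocked_address(addr: str) -> bool:
    """Return True if addr falls in a range an SSRF guard would typically refuse.

    Hypothetical helper for illustration; the actual guard lives in the
    repository's store_remote_fetch module and may differ.
    """
    ip = ipaddress.ip_address(addr)

    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) so the IPv4 rules apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped

    return (
        ip.is_loopback        # 127.0.0.0/8, ::1
        or ip.is_private      # RFC1918 ranges and other private-use blocks
        or ip.is_link_local   # 169.254.0.0/16 (cloud metadata), fe80::/10
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified  # 0.0.0.0, ::
    )
```

A caller would resolve the hostname first and apply the check to every returned address, then repeat it for each redirect target, since a public hostname can redirect or resolve to an internal address.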