2026-05-11 16:05 KST Status
Checked local WSL/drive processes and logs.
- Current free space: E: ZaraStudio about 878 GiB free, F: ZaraServer about 7.3 TiB free, H: Zara1 about 107 GiB free, VHDX `/mnt/zara-insu-data` about 52 GiB free.
- VHDX -> ZaraServer copy: `소설` finished at 11:24:53; `메타데이터` finished at 13:07:35. `만화` and `웹툰` rsync workers are still running. Manga log: around 16.39G transferred, 74%; the current ETA line shows about 1h40m but the progress2 line shows about 5h. Webtoon log: around 5.42G transferred, 98%, likely near finish for the current pass.
- Cutover watcher is still waiting for active webtoon/manga collectors. It has not switched `/home/insu/인수창고/자료` away from the VHDX yet.
- Webtoon collection batch `site_targets_전체_인기순_20260511_080253_webtoon.txt`: 30 titles total; 29 completed. Remaining title is `백련성신`, 1358 episodes total. At check time 1227 episode directories existed; the recent filesystem mtime rate suggests about 0.7-1.4 hours remaining for that title.
- AVIF converter remains stopped (`T` state), so it should not be adding IO pressure.
- ZaraStudio residual encode pipeline is running `E_ZaraStudio_additional_encode_queue_20260511_153259.csv`. The first 16 queue starts were skips/errors: destination exists, interlaced MXF manual-review, or ffprobe failure. Current file was `E:\영상\르네셀\촬영본\GH5M2\P1002144.MOV`; ffmpeg had reached about 5m58s of a 7m07s source at ~0.197x.
- E residual queue after current file: 28 items / about 67.3 GiB source, of which non-MXF is 10 items / about 29.3 GiB. The next non-MXF is `E:\영상\후즈아트\촬영본\a7m4\C8895-001.MP4` (18.16 GiB), previously suspected risky due to decode errors, so filtering it out before continuing would avoid wasted time.
- H Zara1 -> ZaraStudio encode queue is not yet started; the pipeline starts it only after the E queue finishes. H queue has 1423 items / about 3330 GiB source (3575.7 decimal GB). Top-30 ffprobe sample was 1194.7 decimal GB source and an estimated 657.5 decimal GB output, ratio about 55.0%, duration 26.87h. At the current ~0.205x encode speed, the full H queue projects to about 15 days sequential; plausible range is about 9-17 days depending on actual speed.
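The encode-time figures in this status follow from dividing footage duration by the encode speed. A minimal sketch of that arithmetic, assuming a constant speed; the function names are illustrative and not part of the pipeline:

```python
def remaining_wall_seconds(source_s: float, encoded_s: float, speed: float) -> float:
    """Wall-clock seconds left for one file at a given encode speed (x realtime)."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return max(source_s - encoded_s, 0.0) / speed


def queue_wall_days(total_duration_h: float, speed: float) -> float:
    """Wall-clock days to encode a queue sequentially at a constant speed."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return total_duration_h / speed / 24.0


# Current file from the log: 7m07s source, about 5m58s encoded, ~0.197x.
current_left = remaining_wall_seconds(7 * 60 + 7, 5 * 60 + 58, 0.197)
print(f"current file: about {current_left / 60:.1f} min left")

# Top-30 H sample from the log: 26.87 h of footage at ~0.205x.
sample_days = queue_wall_days(26.87, 0.205)
print(f"top-30 H sample alone: about {sample_days:.1f} days")
```

The full-queue projection additionally needs the total footage duration of all 1423 items, which the log does not state, so it is left out here.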
Capacity Finding
Direct H ZaraStudio merge to E is not feasible now. H ZaraStudio class is about 3459 GiB and E has only about 878 GiB free, so a direct copy is short by about 2581 GiB (about 2.52 TiB), or about 2781 GiB (about 2.72 TiB) if preserving a 200 GiB guard.
Even after encoding H videos, the expected final H->E payload is still roughly 1.63-2.13 TiB including non-video residue, with a central estimate around 1.96 TiB. Against current E free space this is still short by about 0.75-1.25 TiB raw, or about 0.95-1.45 TiB with a 200 GiB guard. Central shortage is about 1.1-1.3 TiB.
Recommended direction: do not try to finish all H->E on ZaraStudio immediately. Use F: ZaraServer as overflow/staging for verified encoded outputs or keep H parked until more E space is freed. Also filter current queues to skip interlaced MXF, existing-destination items, corrupt/ffprobe-failed files, and known risky C8895 before spending more encode time.
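The direct-copy gap can be rechecked with a few lines of arithmetic. This is a sketch only; it keeps every input in GiB so binary and decimal units are not mixed, and the helper name is hypothetical:

```python
def shortfall_gib(payload_gib: float, free_gib: float, guard_gib: float = 0.0) -> float:
    """GiB still missing on the destination; 0.0 when the payload fits."""
    return max(payload_gib + guard_gib - free_gib, 0.0)


H_CLASS_GIB = 3459.0  # H ZaraStudio class, from the finding above
E_FREE_GIB = 878.0    # E: free space at check time

raw = shortfall_gib(H_CLASS_GIB, E_FREE_GIB)
guarded = shortfall_gib(H_CLASS_GIB, E_FREE_GIB, guard_gib=200.0)

# 1 TiB = 1024 GiB
print(f"direct copy short by {raw:.0f} GiB ({raw / 1024:.2f} TiB)")
print(f"with 200 GiB guard short by {guarded:.0f} GiB ({guarded / 1024:.2f} TiB)")
```

The same function applies to the post-encode payload estimates once they are expressed in GiB.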
Sagwan Revalidation 2026-05-11T07:18:47Z
- verdict: `refresh`
- note: Progress, ETA, remaining capacity, and similar figures are real-time state and need rechecking.
Correction 2026-05-11 16:28 KST
The earlier wording that manga/webtoon copy was nearly done was misleading. Recheck showed F: ZaraServer used only about 47 GiB while the VHDX mount reports about 415 GiB used. The active rsync progress2 percentages for manga/webtoon should not be treated as total-source completion because the file list is still incremental and the source is changing.
Verified details:
- VHDX top-level contains only `/mnt/zara-insu-data/자료` plus `lost+found` and a write test file. `/자료` contains only `만화`, `웹툰`, `소설`, `메타데이터`.
- Finished rsync totals: `소설` about 17.20G, `메타데이터` about 2.15G.
- `만화` source `du` completed: about 73G disk usage / 72G apparent size. Current rsync had only transferred around 17.58G at the recheck, so manga is not complete.
- `웹툰` source `du` did not finish within 60 seconds, indicating a large/high-file-count tree. By subtracting known source dirs from VHDX `df`, webtoon is likely hundreds of GiB, roughly around 320GiB before precise measurement. Current rsync had only transferred around 5.91G, so webtoon is definitely not complete.
- AVIF stopped process held only about 0.33GiB of deleted open files, so it does not explain the 415GiB vs 47GiB gap.
Operational correction: keep the live link on the VHDX; do not cut over based only on rsync progress percentages. After active collection ends, run a verified final copy/check using source-vs-destination size/stat comparison or rsync dry-run stats before switching symlink to ZaraServer.
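One way to do the source-vs-destination size comparison is sketched below. It is an assumption about how such a check could look, not the check that was actually run, and it walks both trees in full, so it belongs after collection has stopped rather than during live IO:

```python
import os
from pathlib import Path


def tree_sizes(root: Path) -> dict[str, int]:
    """Map each file path (relative to root) to its size in bytes."""
    sizes: dict[str, int] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            # lstat: do not follow symlinks, so a broken link cannot raise here
            sizes[full.relative_to(root).as_posix()] = full.lstat().st_size
    return sizes


def compare_trees(src: Path, dst: Path) -> dict[str, list[str]]:
    """List source files that are absent from dst or differ in size."""
    src_sizes, dst_sizes = tree_sizes(src), tree_sizes(dst)
    missing = sorted(p for p in src_sizes if p not in dst_sizes)
    mismatch = sorted(
        p for p in src_sizes if p in dst_sizes and src_sizes[p] != dst_sizes[p]
    )
    return {"missing": missing, "size_mismatch": mismatch}


def safe_to_cut_over(src: Path, dst: Path) -> bool:
    """True only when every source file exists in dst with the same size."""
    report = compare_trees(src, dst)
    return not report["missing"] and not report["size_mismatch"]
```

Equal sizes do not prove equal content; a checksum pass or an rsync dry run would be the stricter follow-up.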
Copy Method Recheck 2026-05-11 16:30 KST
Rechecked whether the requested robocopy multi-thread copy was actively running. It is not currently active. tasklist.exe found no robocopy.exe, and /mnt/f/insu-warehouse-data/_robocopy_logs/robocopy_fast_20260511_105047.log is only a 640-byte header showing the intended /MT:32 /R:1 /W:1 command setup, with no completion/progress summary. The active copy processes are WSL rsync workers for 만화 and 웹툰; 소설 and 메타데이터 already completed. Webtoon collection is still active: at 16:30 KST, 백련성신 had reached episode 1278 of 1358, with about 80 episodes remaining. Operational takeaway: do not say robocopy is currently applied; the current live method is safer low-priority rsync plus final-delta/cutover watcher. After collection stops, a robocopy retry can be considered for speed, but should not be run concurrently with the active rsync/collector unless deliberately accepting extra IO pressure.
Decision 2026-05-11 16:33 KST
User approved switching to a safer completed-folder copy strategy: collection/encoding should continue in WSL/staging, and already completed content should be copied to ZaraServer using robocopy with limited threading, excluding actively collecting folders and temporary/partial files. Recommended starting setting is MT 8, then raise to 16 if IO remains healthy; avoid MT 32 while live collection is writing to the same VHDX/source tree.
Completed-Only Robocopy Started 2026-05-11 17:04 KST
Implemented the approved completed-only copy strategy. Stopped old whole-tree rsync workers and the automatic cutover watcher to avoid duplicate IO and a premature symlink switch. Created and started F:\insu-warehouse-data\_robocopy_logs\robocopy_completed_to_zaraserver.ps1 with -Mt 8. The script runs robocopy sequentially for 소설, 메타데이터, 만화, then 웹툰, using /E without deletion and excluding .rsync-partial, temp/partial file patterns, and the active webtoon title 백련성신. WSL PID recorded in /home/insu/인수창고/창고스크랩퍼/logs/completed_only_robocopy.pid was 1065710. Status file written to /home/insu/인수창고/창고스크랩퍼/logs/completed_only_robocopy.status. At verification, Robocopy.exe was running, F: usage had risen from about 47G to 49G, and webtoon collection was still active, with 백련성신 at 1332/1358.
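The shape of the per-folder robocopy invocations can be sketched as an argument builder. This is not the contents of the `.ps1` script: the source and destination roots, the distro name, the temp-file patterns, and the `/R:1 /W:1` retry settings (carried over from the earlier command setup) are all assumptions for illustration. The commands are only built here, never executed:

```python
FOLDERS = ["소설", "메타데이터", "만화", "웹툰"]  # copy order from the log
TEMP_PATTERNS = ["*.tmp", "*.part", "*.partial"]     # assumed temp/partial patterns


def build_robocopy_args(src, dst, threads=8, exclude_dirs=(), exclude_files=()):
    """Build a robocopy argument list: /E copies subfolders, nothing is deleted."""
    if not 1 <= threads <= 128:
        raise ValueError("robocopy /MT accepts 1 to 128 threads")
    args = ["robocopy.exe", src, dst, "/E", f"/MT:{threads}", "/R:1", "/W:1"]
    if exclude_dirs:
        args += ["/XD", *exclude_dirs]   # directory names to skip
    if exclude_files:
        args += ["/XF", *exclude_files]  # file patterns to skip
    return args


def plan_completed_copy(src_root, dst_root, active_titles, threads=8):
    """One command per top-level folder, skipping rsync partials and live titles."""
    skip_dirs = [".rsync-partial", *active_titles]
    return [
        build_robocopy_args(
            f"{src_root}\\{name}", f"{dst_root}\\{name}",
            threads, exclude_dirs=skip_dirs, exclude_files=TEMP_PATTERNS,
        )
        for name in FOLDERS
    ]


# Hypothetical roots; the real script's paths are not shown in this log.
commands = plan_completed_copy(
    r"\\wsl.localhost\Ubuntu\mnt\zara-insu-data\자료",
    r"F:\insu-warehouse-data\자료",
    active_titles=["백련성신"],
)
for cmd in commands:
    print(" ".join(cmd))
```

Leaving out `/MIR` and `/PURGE` is what keeps the copy non-destructive on the destination side.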
IO Spike Response 2026-05-11 17:09 KST
User reported IO looked dangerous. Checked PSI/vmstat/processes. At 17:07 KST WSL IO pressure was not catastrophic (io some avg10 about 1.00, full avg10 about 0.87) but Windows had both Robocopy.exe and ffmpeg.exe active. The encode queue had advanced to risky file E:\영상\후즈아트\촬영본\a7m4\C8895-001.MP4, with ffmpeg.exe writing output while robocopy was copying completed content via \\wsl.localhost. To reduce overlapping IO, stopped the completed-only robocopy (Robocopy.exe PID 11184 / PowerShell WSL pid 1065710). After pause, robocopy was gone, ffmpeg remained active, collector remained active, and IO pressure was around some avg10 0.90 / full avg10 0.87. Status file was appended with pause reason and restart suggestion MT 2 after pressure settles.
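The `some`/`full` avg10 values quoted above come from the Linux pressure-stall file `/proc/pressure/io`. A minimal parser sketch follows; the thresholds in `io_is_calm` are hypothetical placeholders, not limits taken from this log:

```python
def parse_psi(text: str) -> dict[str, dict[str, float]]:
    """Parse PSI text such as 'some avg10=1.00 avg60=0.50 avg300=0.20 total=123'."""
    result: dict[str, dict[str, float]] = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # tolerate blank lines
        kind, fields = parts[0], parts[1:]
        result[kind] = {key: float(val) for key, val in (f.split("=", 1) for f in fields)}
    return result


def read_io_pressure(path: str = "/proc/pressure/io") -> dict[str, dict[str, float]]:
    """Read and parse the kernel's IO pressure file (Linux with PSI enabled)."""
    with open(path, encoding="ascii") as fh:
        return parse_psi(fh.read())


def io_is_calm(psi, some_limit: float = 0.5, full_limit: float = 0.25) -> bool:
    """True when both 10-second averages are under the (assumed) limits."""
    return psi["some"]["avg10"] < some_limit and psi["full"]["avg10"] < full_limit
```

A restart gate for the paused robocopy could call `io_is_calm(read_io_pressure())` several times in a row before resuming at a low thread count.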
Actual Pipeline State 2026-05-11 17:15 KST
Rechecked current collection/encoding/transfer state after the user's question. The desired WSL staging -> AVIF encode -> ZaraServer final transfer pipeline is not currently active. /home/insu/인수창고/자료 is still a symlink to /mnt/zara-insu-data/자료 (VHDX), so the live ordered_webtoon_collector.py is still writing to the VHDX, not /home/insu/insu-warehouse-staging/자료. The AVIF node process exists but is stopped (T state), so webtoon image encoding is not actively running. Robocopy completed-only transfer is paused and no robocopy.exe is running. The only active collection is the old batch finishing 백련성신, which had reached about 1350/1358 episodes at 17:15 KST. The desired pipeline exists in scripts (run_media_collection_pipeline.sh with DATA_ROOT=$STAGING_ROOT and FINAL_DATA_ROOT=$NEW_ROOT), but it was not launched because cutover was stopped to avoid premature symlink switch and duplicate IO.
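Whether the live data path still resolves into the VHDX can be checked by resolving the symlink. A small sketch, assuming the paths named above; the helper name is illustrative:

```python
import os


def resolves_under(link_path: str, root: str) -> bool:
    """True when link_path, with symlinks resolved, lies inside root."""
    real = os.path.realpath(link_path)
    root_real = os.path.realpath(root)
    return os.path.commonpath([real, root_real]) == root_real


# Example call with the paths from this log (not executed here):
# resolves_under("/home/insu/인수창고/자료", "/mnt/zara-insu-data")
```

A collector launcher could refuse to start a staging run while this returns True for the VHDX root.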
Staging IO Probe Watcher 2026-05-11 17:23 KST
Implemented the user's requested automatic transition. Created /home/insu/인수창고/창고스크랩퍼/tools/start_staging_pipeline_after_current_round.sh and launched watcher pid 1135776. The watcher observed the current VHDX round finish (백련성신 completed 1358/1358), stopped old AVIF job pid 561182, then started staging pipeline pid 1135818. Staging configuration: DATA_ROOT=/home/insu/insu-warehouse-staging/자료, FINAL_DATA_ROOT=/mnt/f/insu-warehouse-data/자료, TRANSFER_AFTER_BATCH=0, WEBTOON_ENABLED=1, MANGA_ENABLED=0, MAX_CYCLES=1, low webtoon concurrency (batch 2, image concurrency 2, AVIF concurrency 1). Purpose is IO probing only; no ZaraServer copy and no final source deletion. At 17:23 KST, staging pipeline was in prepare_site_collection_targets.py duplicate-scan phase using staging root plus VHDX/F extra roots; IO pressure was low (io some/full avg10 about 0.00, avg60 about 0.01).
Staging Queue Pipeline 2026-05-11 17:40 KST
- User clarified desired flow: exclude already collected/encoded media, build 30-item batches, collect/encode in WSL staging, then move completed works into ZaraServer.
- Added `/home/insu/인수창고/창고스크랩퍼/tools/prepare_staging_media_queue.py` to build shallow queues from the live source TSVs without recursively scanning VHDX/ZaraServer trees. It excludes catalog items plus first-level work folders/manifests in staging, ZaraServer, and `/mnt/zara-insu-data/자료`, plus active webtoon target files.
- Added `/home/insu/인수창고/창고스크랩퍼/tools/run_staging_queue_pipeline.sh`, a detached controller for WSL staging collection, AVIF conversion, and safe completed-work transfer to `/mnt/f/insu-warehouse-data/자료` via `transfer_completed_media_to_final.py --delete-source`.
- Patched `ililtoon_collector_v2.py` to accept `--targets-file`, allowing manga to use the generated queue instead of rescanning live lists every batch.
- Dry-run queue at `/home/insu/인수창고/창고스크랩퍼/logs/staging_queues_dryrun_20260511_173352` reported webtoon source 14,295 / remaining 13,409, manga source 16,650 / remaining 16,338, batch size 30 each.
- Active old staging webtoon batch is still running (`ordered_webtoon_collector.py` pid 1140248) for `site_targets_전체_인기순_20260511_172128_webtoon.txt`; new queue controller pid 1146278 is detached with `setsid` and waiting for existing collection/encoding jobs to finish before it starts queue cycles.
- IO pressure after switching away from deep prepare scan was low at check time: `/proc/pressure/io` avg10 around 0.21, avg60 around 0.30. Staging webtoon usage was ~562M with `죽이고-싶다` and `꽃을-든-여자` in progress.
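The shallow-queue idea above (look only at first-level work folders, then cut the remainder into batches of 30) can be sketched as follows. This is not the code of `prepare_staging_media_queue.py`; it assumes a work folder's name equals its title, whereas the real script may normalize names and also consult the catalog and manifests:

```python
import os
from typing import Iterable, Iterator


def first_level_names(root: str) -> set[str]:
    """Names directly under root; never descends into the work folders."""
    try:
        with os.scandir(root) as entries:
            return {entry.name for entry in entries}
    except FileNotFoundError:
        return set()


def remaining_titles(source_titles: Iterable[str], exclude_roots: Iterable[str],
                     extra_excluded: Iterable[str] = ()) -> list[str]:
    """Source titles not yet present under any root, in order, without duplicates."""
    seen = set(extra_excluded)
    for root in exclude_roots:
        seen |= first_level_names(root)
    queue = []
    for title in source_titles:
        if title not in seen:
            queue.append(title)
            seen.add(title)
    return queue


def batches(items: list[str], size: int = 30) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Because only one `scandir` call is made per root, the cost scales with the number of works rather than the number of episode files.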
D/E IO Spike Response 2026-05-11 22:08 KST
- User reported D: and E: spent a long time at 100% usage and the PC became unresponsive.
- Checked live counters after the spike: D: read/write 0 B/s and queue 0; E: write about 7-8 MB/s, read 0, queue 0-1; H: read about 24-26 MB/s. WSL `/proc/pressure/io` avg10 about 0.21, avg60 about 0.23.
- Root cause candidate found: Windows PowerShell PID 32928 running `E:\_MAIN1_ARCHIVE_MIGRATION\06_scripts\wait_main3_drain_then_inventory.ps1 -WaitProcessId 999999`, which recursively scans D/E/F/H with `Get-ChildItem -Recurse -Force -File`. This matches simultaneous D/E 100% usage and UI freeze.
- Stopped PID 32928. Also killed my own accidental heavy WSL `ls -t /mnt/e/_MAIN1_ARCHIVE_MIGRATION/05_transcode_logs/*` inspection commands.
- Left active H->E ffmpeg encode running: Windows ffmpeg PID 27744 encoding `H:\자라스튜디오\마이스컴퍼니\2024 세번째 안테나\촬영본\스케치\C6667.MP4` into `E:\_MAIN1_ARCHIVE_MIGRATION\07_zara1_to_zarastudio_encode_scratch\...tmp.mp4`; current impact was modest (~7-8 MB/s E write, ~24-26 MB/s H read).
- Recent Windows System event search for disk/NTFS/storage-style errors returned no matching events in the checked window, so this looked like workload saturation rather than a confirmed hardware/NTFS error at check time.
Inventory IO Guard Fix 2026-05-11 22:15 KST
- Rechecked root cause after user asked for a fix. `wait_main3_drain_then_inventory.ps1` had two unsafe behaviors: if `-WaitProcessId` was already gone (e.g. `999999`), it treated that as ready-to-scan and immediately started inventory; and its default path recursively indexed every file on D/E/F/H with `Get-ChildItem -Recurse -Force -File`.
- Patched `/mnt/e/_MAIN1_ARCHIVE_MIGRATION/06_scripts/wait_main3_drain_then_inventory.ps1`:
  - default `WaitProcessId` is now `0`, and if a supplied wait PID is not running, the script exits without scanning unless `-RunNow` is explicitly supplied.
  - default mode is safe summary only; recursive duplicate/latest-file indexing now requires `-RunNow -FullFileIndex`.
  - skip top folders now include `$RECYCLE.BIN`, `System Volume Information`, and `_MAIN1_ARCHIVE_MIGRATION`, so the script will not recurse into its own generated logs/work root.
  - full indexing has `MaxFilesPerDrive`, throttling, and throws/stops enumeration when the limit is reached rather than continuing to enumerate the whole disk.
  - safe-summary report wording now says `scanMode=safe-summary`, `indexedFiles=0`, and `dataGB not calculated` to avoid misreading skipped indexing as empty drives.
- Verification: running with `-WaitProcessId 999999` exited immediately with `wait pid is not running and -RunNow was not supplied; exiting without disk scan.` Running safe `-RunNow -DriveLetters E` completed in seconds with no recursive file index. Running `-RunNow -FullFileIndex -MaxFilesPerDrive 1` stopped at exactly one indexed file and logged `remaining files intentionally skipped to protect disk IO.` Current D/E/H queue lengths were 0-1 and WSL IO pressure stayed low.
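The guard behavior described above can be modeled in a few lines. This is a Python illustration of the patched logic as the log describes it, not a port of the PowerShell script; the waiting itself and the throttling are omitted, and the function names are hypothetical:

```python
import os

SKIP_TOP = {"$RECYCLE.BIN", "System Volume Information", "_MAIN1_ARCHIVE_MIGRATION"}


def choose_scan_mode(wait_pid: int, pid_running: bool,
                     run_now: bool, full_file_index: bool) -> str:
    """Return 'exit', 'full-index', or 'safe-summary' for the given switches."""
    if wait_pid and not pid_running and not run_now:
        return "exit"  # a stale wait PID no longer counts as ready-to-scan
    if run_now and full_file_index:
        return "full-index"
    return "safe-summary"


def bounded_file_index(root: str, max_files: int) -> tuple[list[str], bool]:
    """Collect at most max_files paths; the bool reports whether the cap cut it short."""
    found: list[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # prune protected top-level folders so os.walk never enters them
            dirnames[:] = [d for d in dirnames if d not in SKIP_TOP]
        for name in filenames:
            if len(found) >= max_files:
                return found, True  # stop enumerating instead of scanning on
            found.append(os.path.join(dirpath, name))
    return found, False
```

Returning as soon as the cap is reached is the important part: the walk generator is abandoned, so the rest of the drive is never enumerated.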
Sagwan Revalidation 2026-05-12T07:47:37Z
- verdict: `refresh`
- note: Run state, remaining time, and capacity figures are time-sensitive and need rechecking.
Sagwan Revalidation 2026-05-13T08:24:26Z
- verdict: `refresh`
- note: Progress, free space, and job state are time-dependent and need rechecking now.
Sagwan Revalidation 2026-05-14T08:31:15Z
- verdict: `refresh`
- note: Progress state, ETA, and free-space figures are point-in-time, so they need rechecking after 3 days.
Sagwan Revalidation 2026-05-15T08:58:08Z
- verdict: `refresh`
- note: Progress state, capacity, and ETA are all point-in-time, so a fresh recheck is needed.
Sagwan Revalidation 2026-05-16T09:06:09Z
- verdict: `refresh`
- note: In-progress jobs, remaining time, and free-space figures are time-sensitive and need rechecking now.
Sagwan Revalidation 2026-05-17T09:28:41Z
- verdict: `refresh`
- note: Progress state, free space, and ETA are point-in-time information and need rechecking now.
Sagwan Revalidation 2026-05-18T09:54:13Z
- verdict: `refresh`
- note: Run state, remaining capacity, and ETA are time-sensitive and need rechecking now.
Sagwan Revalidation 2026-05-19T10:22:17Z
- verdict: `refresh`
- note: Run state, capacity, and ETA are time-sensitive and need rechecking now.
Sagwan Revalidation 2026-05-20T10:44:58Z
- verdict: `refresh`
- note: Run state, remaining capacity, and ETA are a snapshot from 9 days ago and need rechecking now.
Sagwan Revalidation 2026-05-21T11:21:43Z
- verdict: `refresh`
- note: This is real-time capacity and process state from 5/11, so it needs rechecking now.
Sagwan Revalidation 2026-05-22T11:52:15Z
- verdict: `refresh`
- note: Run state, free space, and ETA are time-sensitive and need rechecking now.
Sagwan Revalidation 2026-05-23T12:24:49Z
- verdict: `refresh`
- note: Run state, remaining capacity, and ETA are time-sensitive and need rechecking now.
Sagwan Revalidation 2026-05-24T12:29:37Z
- verdict: `refresh`
- note: Run state, free space, and ETA are time-sensitive and need rechecking now.
Sagwan Revalidation 2026-05-25T12:47:02Z
- verdict: `refresh`
- note: Progress state and free-space figures are time-dependent and need to be replaced by a fresh check.
Sagwan Revalidation 2026-05-26T13:10:39Z
- verdict: `ok`
- note: [chatgpt error] The read operation timed out
Sagwan Revalidation 2026-05-27T13:19:44Z
- verdict: `refresh`
- note: Run state, free space, and ETA are point-in-time information and need rechecking now.
Sagwan Revalidation 2026-05-28T13:48:17Z
- verdict: `refresh`
- note: Progress state, free space, and ETA are all point-in-time, so a fresh recheck is needed.
Sagwan Revalidation 2026-05-29T14:17:00Z
- verdict: `refresh`
- note: Real-time capacity, process, and ETA figures are from 18 days ago, so a fresh check is needed.
Sagwan Revalidation 2026-05-30T14:28:13Z
- verdict: `refresh`
- note: Real-time capacity, job state, and ETA are as of 5/11 and need rechecking now.
Sagwan Revalidation 2026-05-31T14:39:16Z
- verdict: `refresh`
- note: Run state, remaining time, and capacity figures are point-in-time and need rechecking now.
Sagwan Revalidation 2026-06-01T16:14:38Z
- verdict: `refresh`
- note: Run state, free space, and progress are point-in-time information and need rechecking now.
Sagwan Revalidation 2026-06-02T20:35:04Z
- verdict: `refresh`
- note: Progress state, free space, and the job queue are time-dependent, so current values need rechecking.
Sagwan Revalidation 2026-06-03T20:56:46Z
- verdict: `refresh`
- note: This is a record of real-time capacity and job state, so it needs rechecking now.
Sagwan Revalidation 2026-06-04T21:32:27Z
- verdict: `refresh`
- note: Progress state, free space, and job ETA are point-in-time and need rechecking now.