===============================================
IndexTTS2 GUI - Project Size Analysis
项目大小分析报告
===============================================
Analysis Date: 2025-11-22
Total Project Size: ~27 GB


===============================================
SIZE BREAKDOWN | 空间占用分析
===============================================

Directory                    Size        Status          Action
------------------------------------------------------------------------
checkpoints/                 8.26 GB     ✓ Required      Keep
tmp_hf_home/                 8.20 GB     ⚠️ Duplicate!   Can DELETE
venv/                        5.58 GB     ✓ Current       Keep
venv_old/                    4.47 GB     ✗ Old           Can DELETE
Others                       ~0.5 GB     ✓ Required      Keep
------------------------------------------------------------------------
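TOTAL                        ~27.0 GB    (26.51 GB in the four named directories + ~0.5 GB others)

Potential Space Savings: ~12.7 GB (about 47%) from deleting tmp_hf_home/ and venv_old/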
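
The table above can be reproduced with a short script. The following is a minimal Python sketch, not the tool that produced this report: the function names are hypothetical, and it assumes sizes are reported in decimal GB (10**9 bytes), which this report does not state.

```python
import os
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip links so a linked file is not counted twice
            try:
                total += os.path.getsize(path)
            except OSError:
                pass  # file vanished or is unreadable; leave it out
    return total


def size_table(project_root: Path) -> dict[str, float]:
    """Map each top-level directory name to its size in GB (assumed 10**9 bytes)."""
    return {
        entry.name: dir_size_bytes(entry) / 1e9
        for entry in sorted(project_root.iterdir())
        if entry.is_dir() and not entry.is_symlink()
    }
```

Usage: call size_table(Path(".")) from the project root and print the entries sorted by size. Files sitting directly in the project root are not counted by this sketch.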


===============================================
DUPLICATE MODELS DETECTED | 发现重复模型
===============================================

Model: facebook--w2v-bert-2.0
Location 1: checkpoints/hf_cache/        2,214 MB
Location 2: tmp_hf_home/hub/             4,436 MB (about 2x Location 1)
Duplication: YES ⚠️

Model: nvidia--bigvgan_v2_22khz_80band_256x
Location 1: checkpoints/hf_cache/        428 MB
Location 2: tmp_hf_home/hub/             428 MB
Duplication: YES ⚠️

Model: amphion--MaskGCT
Location 1: checkpoints/hf_cache/        169 MB
Location 2: tmp_hf_home/hub/             169 MB
Duplication: YES ⚠️

Model: funasr--campplus
Location 1: checkpoints/hf_cache/        27 MB
Location 2: tmp_hf_home/hub/             27 MB
Duplication: YES ⚠️

Total Duplicate Size: ~5.1 GB (sum of the four copies under tmp_hf_home/hub/)
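
Matching directory names and sizes suggest duplication but do not prove it. One way to confirm it is to compare file contents. The sketch below is an illustration under stated assumptions: it hashes files with SHA-256, the function names are hypothetical, and it is not how this report was generated.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root_a: Path, root_b: Path) -> list[tuple[Path, Path]]:
    """Return (file_in_a, file_in_b) pairs whose contents are byte-identical."""
    # Index root_a by file size so only same-size files ever get hashed.
    by_size = defaultdict(list)
    for p in root_a.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    digests = {}  # cache of digests for files under root_a
    pairs = []
    for q in root_b.rglob("*"):
        if not q.is_file():
            continue
        candidates = by_size.get(q.stat().st_size, [])
        if not candidates:
            continue
        q_digest = file_digest(q)
        for p in candidates:
            if p not in digests:
                digests[p] = file_digest(p)
            if digests[p] == q_digest:
                pairs.append((p, q))
    return pairs
```

Usage: find_duplicates(Path("checkpoints/hf_cache"), Path("tmp_hf_home/hub")). Hashing multi-gigabyte weight files takes a while; the size index keeps the number of hashed files small.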


===============================================
WHY DUPLICATES EXIST | 重复原因
===============================================

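1. Hugging Face Cache Duplication
   - checkpoints/hf_cache/ - Project-specific cache
   - tmp_hf_home/hub/ - Temporary HF_HOME cache
   - Both hold the same four models; the w2v-bert-2.0 copy under
     tmp_hf_home/ is about twice as large, so that copy is not
     byte-for-byte identical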

2. Virtual Environment Duplication
   - venv/ - Current environment (5.58 GB)
   - venv_old/ - Old environment (4.47 GB)
   - PyTorch CUDA libraries duplicated in both
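
A second Hugging Face cache typically appears when the cache location changes between runs. The sketch below mirrors the documented default lookup order of huggingface_hub as I understand it (HF_HUB_CACHE, then HF_HOME/hub, then the per-user cache directory). It is an illustration only: resolve_hub_cache is a hypothetical helper, it does not import the library, and code that passes an explicit cache_dir bypasses these variables. That tmp_hf_home/ was created by setting HF_HOME is an assumption based on the directory name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_hub_cache(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the hub cache directory implied by the given environment."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        # An explicit hub cache path wins over everything else.
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        # Otherwise the hub cache lives in the "hub" folder under HF_HOME.
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Fallback: the per-user cache directory.
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg) if xdg else Path.home() / ".cache"
    return base / "huggingface" / "hub"
```

With HF_HOME=tmp_hf_home this resolves to tmp_hf_home/hub, which matches the layout seen in this project.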


===============================================
DETAILED SIZE ANALYSIS | 详细分析
===============================================

checkpoints/ (8.26 GB) - KEEP
-------------------------------
✓ gpt.pth                      ~1.5 GB    - GPT model (required)
✓ s2mel.pth                    ~1.3 GB    - S2Mel model (required)
✓ qwen0.6bemo4-merge/          ~1.2 GB    - Qwen emotion model
✓ hf_cache/                    ~2.8 GB    - HF models cache
  ├─ facebook--w2v-bert-2.0    2.2 GB
  ├─ nvidia--bigvgan           0.4 GB
  ├─ amphion--MaskGCT          0.17 GB
  └─ funasr--campplus          0.03 GB
✓ Other files                  ~1.4 GB

tmp_hf_home/ (8.20 GB) - CAN DELETE
------------------------------------
⚠️ hub/models--facebook--w2v-bert-2.0     4.4 GB  - DUPLICATE!
⚠️ hub/models--nvidia--bigvgan            0.4 GB  - DUPLICATE!
⚠️ hub/models--amphion--MaskGCT           0.17 GB - DUPLICATE!
⚠️ hub/models--funasr--campplus           0.03 GB - DUPLICATE!
⚠️ xet/cache/                             ~3.2 GB - Temporary cache

venv/ (5.58 GB) - KEEP (Current)
---------------------------------
✓ PyTorch + CUDA libraries     ~4.5 GB
✓ Other Python packages        ~1.0 GB

venv_old/ (4.47 GB) - CAN DELETE
---------------------------------
✗ PyTorch + CUDA libraries     ~3.8 GB  - OLD VERSION
✗ Other Python packages        ~0.6 GB  - NOT NEEDED


===============================================
RECOMMENDED ACTIONS | 建议操作
===============================================

Priority 1: Delete tmp_hf_home/ (Save ~8.2 GB)
----------------------------------------------
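This deletion is safe provided the app loads its models from checkpoints/hf_cache/:
✓ All four models are already in checkpoints/hf_cache/
✓ The directory is recreated if anything writes to it again
✓ The remainder (xet/cache/) is temporary download cache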

Command:
Remove-Item "tmp_hf_home" -Recurse -Force


Priority 2: Delete venv_old/ (Save ~4.5 GB)
--------------------------------------------
Safe if venv/ is working:
✓ Old virtual environment
✓ Not used anymore
✓ Can be recreated if needed

Command:
Remove-Item "venv_old" -Recurse -Force


Priority 3: Optional Cleanups
------------------------------
a) Clean temp/ directory
   - Temporary WAV files (~150 files)
   - Command: Remove-Item "temp\*" -Recurse -Force
   - Savings: ~100-200 MB

b) Clean __pycache__/ directories
   - Python bytecode cache
   - Auto-regenerated
   - Command: Get-ChildItem -Path . -Filter __pycache__ -Recurse -Directory | Remove-Item -Recurse -Force
   - Savings: ~50-100 MB

c) Clean build/ directory
   - PyInstaller build files
   - Can be recreated
   - Command: Remove-Item "build" -Recurse -Force
   - Savings: ~500 MB
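
The __pycache__ command in (b) also walks into venv/, which is harmless but slow. As a hedged alternative, this Python sketch only lists the cache directories and prunes the virtual environments from the walk. The function name and the default skip list are assumptions for illustration; names in skip are pruned at any depth.

```python
import os
from pathlib import Path


def find_pycache_dirs(
    root: Path,
    skip: frozenset = frozenset({"venv", "venv_old"}),
) -> list[Path]:
    """List __pycache__ directories under root, ignoring directories named in skip."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in skip]
        if "__pycache__" in dirnames:
            found.append(Path(dirpath) / "__pycache__")
            dirnames.remove("__pycache__")  # nothing useful inside to walk
    return sorted(found)
```

Usage: print the result of find_pycache_dirs(Path(".")) first, then delete the listed directories once the list looks right.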


===============================================
SAFE DELETION SCRIPT | 安全清理脚本
===============================================

Run this in PowerShell (in project root):

# Confirm the working directory before deleting anything
Write-Host "Current directory: $(Get-Location)"
Write-Host "Press Ctrl+C to cancel, or"
Pause

# Delete tmp_hf_home (save 8.2 GB)
if (Test-Path "tmp_hf_home") {
    Write-Host "Deleting tmp_hf_home..."
    Remove-Item "tmp_hf_home" -Recurse -Force
    Write-Host "✓ Deleted tmp_hf_home (saved ~8.2 GB)"
}

# Delete venv_old (save 4.5 GB)
if (Test-Path "venv_old") {
    Write-Host "Deleting venv_old..."
    Remove-Item "venv_old" -Recurse -Force
    Write-Host "✓ Deleted venv_old (saved ~4.5 GB)"
}

# Optional: Clean temp
if (Test-Path "temp") {
    Write-Host "Cleaning temp directory..."
    Remove-Item "temp\*" -Recurse -Force -ErrorAction SilentlyContinue
    Write-Host "✓ Cleaned temp directory"
}

# Optional: Clean __pycache__ (directories only; Python regenerates them)
Write-Host "Cleaning Python cache..."
Get-ChildItem -Path . -Filter __pycache__ -Recurse -Directory | Remove-Item -Recurse -Force
Write-Host "✓ Cleaned Python cache"

Write-Host ""
Write-Host "Cleanup complete!"
Write-Host "Estimated space saved: up to ~12.7 GB"
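
For anyone who prefers a dry run before deleting, the same idea can be sketched in Python. This is a hypothetical helper, not part of the project: it refuses anything that is not a real directory directly inside the project root, and by default only prints what it would delete.

```python
import shutil
from pathlib import Path


def remove_tree(target: Path, project_root: Path, dry_run: bool = True) -> bool:
    """Delete target only if it is a real directory directly inside project_root.

    Returns True if the directory was deleted (or would be, in a dry run).
    """
    root = project_root.resolve()
    resolved = target.resolve()
    # Refuse symlinks, missing paths, files, and anything not a direct child.
    if target.is_symlink() or not resolved.is_dir() or resolved.parent != root:
        return False
    if dry_run:
        print(f"would delete {resolved}")
        return True
    # Note: unlike Remove-Item -Force, rmtree can fail on read-only files.
    shutil.rmtree(resolved)
    return True
```

Usage: run remove_tree(Path("tmp_hf_home"), Path(".")) and check the printed path, then repeat with dry_run=False.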


===============================================
AFTER CLEANUP SIZE | 清理后大小
===============================================

checkpoints/                 8.26 GB     (unchanged)
venv/                        5.58 GB     (unchanged)
Others                       ~0.5 GB     (unchanged)
------------------------------------------------------------------------
TOTAL (After Cleanup)        ~14.3 GB

Space Saved: ~12.7 GB (8.20 GB + 4.47 GB, about 47% reduction)
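
The arithmetic behind these figures, worked from the rows of the size breakdown table. The "others" value is an estimate in the source, so the results are approximate; the function and variable names are only for illustration.

```python
# Sizes in GB, copied from the size breakdown table ("others" is approximate).
SIZES_GB = {
    "checkpoints": 8.26,
    "tmp_hf_home": 8.20,
    "venv": 5.58,
    "venv_old": 4.47,
    "others": 0.5,
}
DELETABLE = ("tmp_hf_home", "venv_old")


def cleanup_summary(sizes, deletable):
    """Return (before, saved, after, percent_saved) for the given sizes."""
    before = sum(sizes.values())
    saved = sum(sizes[name] for name in deletable)
    after = before - saved
    return before, saved, after, 100 * saved / before


before, saved, after, pct = cleanup_summary(SIZES_GB, DELETABLE)
print(f"before={before:.2f} GB  saved={saved:.2f} GB  after={after:.2f} GB  ({pct:.0f}%)")
```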


===============================================
WILL IT STILL WORK? | 会影响功能吗？
===============================================

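YES ✓ The project should keep working after cleanup, as long as the app
loads its models from checkpoints/ and runs from venv/.

What happens:
1. Models in checkpoints/ are untouched
2. tmp_hf_home/ is recreated if anything writes to it again
3. venv_old/ is not used by the current setup
4. No functionality depends on the deleted directories

If tmp_hf_home/ is needed again:
- It is created automatically
- Whether models are reused from checkpoints/hf_cache/ or downloaded
  again depends on how the app sets its Hugging Face cache path
- Run run_tts_gui.bat once after cleanup and confirm no download starts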
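
Before deleting tmp_hf_home/, it is worth checking that every model it holds also exists in checkpoints/hf_cache/. The sketch below compares directory names only, under two assumptions: model folders may carry the "models--" prefix seen under tmp_hf_home/hub/, and hidden folders (names starting with a dot) are not models. The helper names are hypothetical.

```python
from pathlib import Path

PREFIX = "models--"


def model_names(cache_dir: Path) -> set[str]:
    """Return model folder names in cache_dir, with any 'models--' prefix removed."""
    names = set()
    if not cache_dir.is_dir():
        return names
    for entry in cache_dir.iterdir():
        if not entry.is_dir() or entry.name.startswith("."):
            continue  # skip plain files and hidden bookkeeping folders
        name = entry.name
        if name.startswith(PREFIX):
            name = name[len(PREFIX):]
        names.add(name)
    return names


def only_in_secondary(primary: Path, secondary: Path) -> set[str]:
    """Models present in secondary but missing from primary."""
    return model_names(secondary) - model_names(primary)
```

Usage: only_in_secondary(Path("checkpoints/hf_cache"), Path("tmp_hf_home/hub")) should return an empty set before tmp_hf_home/ is deleted. A name match does not guarantee identical contents.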


===============================================
WHY SO BIG? | 为什么这么大？
===============================================

This is normal for AI/ML projects:

1. AI Models are Large
   - GPT model: ~1.5 GB
   - S2Mel model: ~1.3 GB
   - Qwen model: ~1.2 GB
   - W2V-BERT: ~2.2 GB
   - Others: ~1 GB
   Total: ~7-8 GB (unavoidable)

2. PyTorch + CUDA
   - PyTorch library: ~2 GB
   - CUDA libraries: ~2-3 GB
   Total: ~5 GB per venv

3. Duplicate Caches (fixable!)
   - HF cache duplication
   - Old venv

After cleanup: 14 GB is reasonable for an AI project!


===============================================
COMPARISON WITH SIMILAR PROJECTS
===============================================

Typical sizes:
- Stable Diffusion: 10-30 GB
- LLM projects: 20-100 GB
- TTS projects: 5-15 GB

IndexTTS2 after cleanup: 14 GB ✓ (reasonable)


===============================================
.GITIGNORE RECOMMENDATIONS | Git忽略建议
===============================================

Already configured in .gitignore:
✓ venv/
✓ venv_old/
✓ tmp_hf_home/
✓ temp/
✓ __pycache__/
✓ build/

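This keeps the repository small (under 100 MB, code only). Note that
checkpoints/ (8.26 GB) is not in the list above; it must also be excluded
for that figure to hold.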
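
A quick way to confirm the entries is to compare .gitignore against the expected list. This sketch does a literal line comparison only and does not implement Git's pattern matching (globs, negation, nested rules); the function name is hypothetical.

```python
from typing import Iterable


def missing_ignore_entries(gitignore_text: str, required: Iterable[str]) -> list[str]:
    """Return the required entries that do not appear literally in .gitignore."""
    present = set()
    for line in gitignore_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        present.add(line.strip("/"))  # treat "venv", "venv/" and "/venv" alike
    return [entry for entry in required if entry.strip("/") not in present]
```

Usage: read the file with Path(".gitignore").read_text(encoding="utf-8") and pass the directory names from the list above, plus checkpoints/ if it should stay out of the repository.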


===============================================
FOR USERS CLONING PROJECT | 给克隆用户的说明
===============================================

When users clone from GitHub:
- They only download ~100 MB (code + configs)
- Models download on first run (~8 GB)
- Virtual environment created locally (~5 GB)
- Total on their machine: ~14 GB

This is MUCH better than including everything in repo!


===============================================
SUMMARY | 总结
===============================================

Current Issue:
✗ Duplicate HF cache (tmp_hf_home/ vs checkpoints/hf_cache/)
✗ Old virtual environment (venv_old/)
✗ Total: ~27 GB

Solution:
✓ Delete tmp_hf_home/ (save 8.2 GB)
✓ Delete venv_old/ (save 4.5 GB)
✓ Optional cleanups (save 0.5 GB)
✓ New total: ~14 GB

Result:
✓ ~47% size reduction
✓ No functionality lost
✓ More manageable project size
✓ Faster backups/transfers


===============================================
READY TO CLEAN? | 准备清理？
===============================================

Run the cleanup script above, or manually:

1. Delete tmp_hf_home:
   Remove-Item "tmp_hf_home" -Recurse -Force

2. Delete venv_old:
   Remove-Item "venv_old" -Recurse -Force

3. Test project:
   run_tts_gui.bat

Everything should work normally!


===============================================
Last Updated: 2025-11-22
Analysis Tool: PowerShell
===============================================

