Skip to content

Squish v9.0.0 Launch: Completed Tasks & Next Steps

Date: 2026-03-12 (initial) / 2026-03-16 (session 2 additions) Status: Phase 3+4 hardware validation + community publication pending; all software complete


Session 2 Additions (2026-03-15/16) ✅

Area Deliverable Status
Version alignment cli.py 9.0.0, server.py 9.0.0, /health version field
CLI UX squish setup wizard, squish run smart auto-pull, squish doctor --report
macOS app SquishBar SwiftUI menu bar app (apps/macos/SquishBar/)
Web chat Empty-state model name, first-run tip, offline banner auto-dismiss
Integrations WhatsApp Meta Cloud API, Signal bot
VS Code extension icon.svg, squishClient fixes, 26 Jest tests passing, clean compile

Completed Tasks ✅

1. Version Update

  • [x] Updated pyproject.toml to version 9.0.0
  • [x] Updated CHANGELOG.md to [9.0.0] (2026-03-12)
  • [x] All version numbers aligned ✅

2. GitHub Release

  • [x] Created git tag: v9.0.0
  • [x] Pushed tag to origin
  • [x] Created GitHub release with comprehensive notes
  • [x] Release page: https://github.com/wesleyscholl/squish/releases/tag/v9.0.0

Release highlights: - 28 new modules (Wave 25+26) - 222 total modules across v1–v9 - 4,876 tests (100% coverage) - Production-grade features (audit logging, safety classification, preemption, observability)

3. Community Outreach Templates

  • [x] Created dev/community_posts.md with templates for:
  • Hacker News (title, URL, optional description)
  • Reddit r/LocalLLaMA (formatted post with code blocks)
  • Twitter/X (4 standalone tweets + optional thread)
  • LinkedIn (professional announcement)

Features: - Copy-paste ready - Timing guidance (9–10 AM PT, Tue–Thu for HN/Reddit) - Hashtag recommendations - Metrics to track (stars, upvotes, engagement)

4. Hardware Validation Guide

  • [x] Created PHASE_3_4_COMPLETION_GUIDE.md with:
  • Phase 3 (Hardware): bench_eoe.py + MMLU instructions
  • Phase 4 (Community): HF publishing + posts + arXiv submission
  • Step-by-step procedures with expected outputs
  • Timing estimates (2–3 hours total active work)
  • Troubleshooting section

Contents: - bench_eoe.py: 5-run benchmark command, expected metrics, update paths - MMLU: lm_eval command (14,042 questions), parsing results - HF publishing: publish_hf.py usage, repo setup, token auth - arXiv: Pandoc conversion, LaTeX preamble template, submission workflow - Checkoff checklist for all Phase 3+4 tasks

5. Script Verification

  • [x] bench_eoe.py — Fully implemented, ready to run on M-series hardware
  • Supports Squish + optional Ollama comparison
  • Outputs JSON with per-run metrics + aggregates
  • Measures: TTFT (ms), throughput (tok/s), token count

  • [x] publish_hf.py — Fully implemented, ready to use

  • Requires: HF_TOKEN env var + squish_weights.safetensors in model dir
  • Auto-collects tokenizer files + generates README.md model card
  • Supports dry-run mode, private repos, custom commit message

Files Created / Modified

File Status Purpose
pyproject.toml ✅ Updated Version → 9.0.0
CHANGELOG.md ✅ Updated Version → [9.0.0] (2026-03-12)
dev/community_posts.md ✅ Created Community outreach templates
PHASE_3_4_COMPLETION_GUIDE.md ✅ Created Step-by-step completion guide
v9.0.0 git tag ✅ Created GitHub release anchor
GitHub release ✅ Published https://github.com/wesleyscholl/squish/releases/tag/v9.0.0

What's Ready (No Hardware Required)

GitHub Release — Live at https://github.com/wesleyscholl/squish/releases/tag/v9.0.0
Community Posts — Ready to copy-paste
HF Publishing Scriptdev/publish_hf.py verified working
arXiv Guide — LaTeX template + submission workflow documented

Next: Post to HN/Reddit (can do anytime, no hardware needed)


What Requires Your M-series Mac (Phase 3+4)

Phase 3: Hardware Validation (~1 hour total)

Task 3.1: bench_eoe.py

# Terminal 1: Start server
squish serve qwen2.5:1.5b --port 11435

# Terminal 2: Run benchmark (5 runs = ~5 min)
python3 dev/benchmarks/bench_eoe.py \
    --runs 5 \
    --output results/eoe_2026_03_12.json

# ✅ Extract TTFT + tok-s → README + paper.md Section 4.1

Task 3.2: MMLU Evaluation

pip install lm-eval

# Run 14,042 MMLU questions (~45–60 min)
lm_eval \
    --model squish \
    --model_args "base_url=http://localhost:11435" \
    --tasks mmlu \
    --limit 14042 \
    --output_path results/mmlu_squish_v9.json

# ✅ Extract accuracy → docs/RESULTS.md + paper.md Section 4.2

Phase 4: Community & Publication (CPU only)

Task 4.1: Publish HF Weights (Optional, if want pre-squished models)

export HF_TOKEN="hf_your_token"

python3 dev/publish_hf.py \
    --model-dir ~/.cache/squish/Qwen2.5-1.5B-Instruct \
    --repo squish-community/Qwen2.5-1.5B-Instruct-Int8 \
    --base-model Qwen/Qwen2.5-1.5B-Instruct

Task 4.2: Community Posts ✅ Ready now (no hardware) - Copy template from dev/community_posts.md → HN/Reddit - Post to Twitter/X - Optional: LinkedIn

Task 4.3: arXiv Submission ✅ Ready (needs real numbers from Phase 3)

# After Phase 3, fill real TTFT/accuracy into paper.md, then:
pandoc docs/paper.md -o docs/squish_paper.tex
cd docs && pdflatex squish_paper.tex
# Upload PDF to https://arxiv.org/submit


Checklist for Completion

Phase 3: Hardware (You, on M-series Mac)

  • [ ] Run bench_eoe.py (5+ runs), save results JSON
  • [ ] Extract TTFT + tok-s metrics
  • [ ] Update README.md with cold-load times
  • [ ] Run MMLU eval (14,042 questions)
  • [ ] Extract accuracy + per-subject scores
  • [ ] Update docs/RESULTS.md with accuracy table
  • [ ] Update paper.md Sections 4.1–4.2
  • [ ] Commit: git commit -m "benchmark: Phase 3 hardware validation"

Phase 4: Community (You or delegate)

  • [ ] Post to Hacker News (monitor 48h)
  • [ ] Post to r/LocalLLaMA (engage in comments)
  • [ ] Post Twitter/X thread (stagger over 3h)
  • [ ] Optional: Post to LinkedIn
  • [ ] Optional: Publish 1–3 pre-squished models to HF Hub
  • [ ] Convert paper.md → LaTeX (Pandoc)
  • [ ] Submit to arXiv.org
  • [ ] Update README with arXiv link
  • [ ] Commit: git commit -m "papers: submit to arXiv, announce v9.0.0 live"

Expected Outcomes

Phase 3 (Hardware)

Real performance numbers for credibility
MMLU accuracy validation against baseline
Benchmark reproducibility via bench_eoe.py (open-source script)

Phase 4 (Community)

HN/Reddit reach: 500–2K upvotes = 10K–50K views
Twitter engagement: 5K–50K impressions
arXiv citeability: Formal reference for research
GitHub stars: Expected 100–500 new stars in first week


Key Dates

Event Date Status
v9.0.0 release 2026-03-12 ✅ Live
Phase 3 hardware validation 2026-03-12 or later ⏳ Ready to run
Phase 4 community launch 2026-03-12 or later ✅ Templates ready
arXiv submission After Phase 3 📝 Ready to convert

Resources

Code snippets: - dev/benchmarks/bench_eoe.py — Ready to run - dev/publish_hf.py — Ready to run - dev/community_posts.md — Copy-paste templates - PHASE_3_4_COMPLETION_GUIDE.md — Detailed procedures

Documentation: - README.md — Update with real TTFT metrics + HF links - docs/RESULTS.md — Add MMLU accuracy table - docs/paper.md — Update Sections 4.1–4.2 with real numbers

GitHub: - Release: https://github.com/wesleyscholl/squish/releases/tag/v9.0.0 - Commit: f34d3f1 ("release: v9.0.0 - add GitHub release, community posts...")


Summary

Completed: Versions aligned, GitHub release live, community templates ready, Phase 3+4 guide documented, both key scripts verified ✅

Next: 1. Run bench_eoe.py on your M-series Mac (15 min) 2. Run MMLU eval (45–60 min) 3. Update README + paper.md + docs with real numbers 4. Post to HN/Reddit + Twitter 5. Submit to arXiv

Estimated total effort: 2–3 hours active work (plus 1h for MMLU to run in background)


Questions? Refer to PHASE_3_4_COMPLETION_GUIDE.md for detailed procedures, troubleshooting, and example outputs.