1. Executive Snapshot
- 146 signals in this scan run: GitHub 56, HN/web 37, YouTube 30, papers/product 23 → focus shifts to coding-agent execution + eval harness.
- 2 recurring benchmark signals: SWE-bench and CWE-bench appear across multiple sources; some claims in the 35%–97% range need internal reproduction tests before adoption.
- GitHub momentum: many agent/runtime repos are being updated in near real time (recent updated_at) → 2-week trial window for NEXA.
- YouTube: 30 video IDs were parsed, but view/comment data from the API is missing → social confidence is Medium.
- Public X/Facebook channels lack verified data (N/A due to public parsing limits) → lower confidence in the social GTM strategy.
2. KPI Dashboard
| Metric | Count |
|---|---|
| Candidates | 146 |
| GitHub | 56 |
| YouTube | 30 |
| HN/Web | 37 |
| Papers/Product | 23 |
Social caveat: X=0, Reddit=0, Facebook=0 in the bounded fallback run.
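The dashboard figures can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the per-platform counts reported above and derives each platform's share of the total; the names `counts`, `reported_total`, and `check_totals` are illustrative and not part of any existing tooling.

```python
# Per-platform candidate counts as reported in the KPI dashboard.
counts = {"GitHub": 56, "HN/Web": 37, "YouTube": 30, "Papers/Product": 23}
reported_total = 146  # "Candidates" figure from the dashboard


def check_totals(counts, reported_total):
    """Return (matches, computed_total) for the per-platform breakdown."""
    computed = sum(counts.values())
    return computed == reported_total, computed


ok, computed = check_totals(counts, reported_total)

# Share of each platform in the scan, as a percentage rounded to one decimal.
shares = {name: round(100 * n / computed, 1) for name, n in counts.items()}

if __name__ == "__main__":
    print("breakdown matches total:", ok)
    for name, pct in shares.items():
        print(f"{name}: {pct}%")
```

A mismatch between the breakdown and the headline total would suggest a collector dropped or double-counted items.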
3. CTO Evaluation Matrix
| Signal | Evidence | Counter | Fabbi impact | Decision |
|---|---|---|---|---|
| Dense SWE-bench signal | >=6 benchmark/blog sources | high claim variance | NEXA harness eval | trial |
| Coding-agent repos rising | 56 repo signals | noise/demo repos | FARE context + NEXA exec | trial |
| Social missing X/Reddit | 0 verified items | limited collector | SYNCA confidence gate | watch |
4. CTO Recommendations (4)
- Set up the NEXA harness this week (ROI 18-25%, Risk 3/5, Owner: Eng Lead, TTV: 7 days, Validate: pass@k + retry cost).
- Pilot 2 agent CLIs (Claude Code/Codex) on 20 internal tasks (ROI 12-20%, Risk 2/5, Owner: DX Lead, TTV: 10 days, Validate: lead-time delta).
- Enable the SYNCA quality gate for AI PRs (ROI 10-15%, Risk 2/5, Owner: QA Lead, TTV: 14 days, Validate: escaped defect rate).
- Design a DOMUS governance log for tool use (ROI 8-12%, Risk 4/5, Owner: Platform, TTV: 21 days, Validate: audit completeness).
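The harness recommendation validates on pass@k. As a minimal sketch, assuming the harness samples n attempts per task and counts c of them as correct, the widely used unbiased estimator is 1 - C(n-c, k) / C(n, k). The function names and the way results are passed in are hypothetical; this report does not specify how NEXA would compute the metric.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one task: n attempts sampled, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k: int) -> float:
    """Average pass@k over tasks; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from this scan.
    sample = [(10, 3), (10, 0), (10, 10)]
    print(mean_pass_at_k(sample, k=1))
```

Running the same estimator on internal tasks is one way to test externally claimed benchmark scores before adoption, since it does not depend on a vendor's choice of k or sample count.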
5. Impact Coverage
| Domain | Now 0-2w | Next 1-2m | Later 3-6m |
|---|---|---|---|
| FARE | adopt context indexing | trial codebase memory | monitor |
| NEXA | trial harness eval | adopt staged rollout | scale |
| SYNCA | adopt quality gate | monitor drift | automate policy |
| DOMUS | monitor | trial governance dashboard | adopt |
| Vietnam | trial SME devtool pack | adopt if CAC is good | scale channel |
| Japan | monitor enterprise compliance | trial POC | adopt selective |
| Global | watch benchmark race | trial partner stack | monitor consolidation |
6. Source Appendix (Top 50)
| # | Platform | Source | Metric | Timestamp | Why |
|---|---|---|---|---|---|
| 1 | dev_web | 6 Months of "Agentic" Coding | 2 pts / 0 comments | 2026-05-30T16:05:46Z | HN developer discourse |
| 2 | dev_web | The Coding Harness Behind GitHub Copilot in VS Code | 1 pts / 0 comments | 2026-05-30T15:55:04Z | HN developer discourse |
| 3 | dev_web | Show HN: Jynx, a matchmaking app to find gaming teammates | 4 pts / 2 comments | 2026-05-30T13:45:34Z | HN developer discourse |
| 4 | dev_web | Spatial IDE's for agentic coding workflows | 2 pts / 1 comments | 2026-05-30T13:34:57Z | HN developer discourse |
| 5 | dev_web | AI coding agents ships at the cost of intuition and taste | 2 pts / 0 comments | 2026-05-30T09:33:05Z | HN developer discourse |
| 6 | dev_web | Show HN: A Claude Code skill that scopes problems like Peter Naur | 2 pts / 0 comments | 2026-05-30T02:04:12Z | HN developer discourse |
| 7 | dev_web | Bill Gates AI on AI (one month later) | 3 pts / 0 comments | 2026-05-27T04:01:44Z | HN developer discourse |
| 8 | dev_web | Show HN: Simple Sprite Sheet Generation | 3 pts / 0 comments | 2026-05-24T19:37:43Z | HN developer discourse |
| 9 | dev_web | Show HN: My first app, artisanally vibe-coded in 4 months | 3 pts / 5 comments | 2026-05-24T10:07:13Z | HN developer discourse |
| 10 | dev_web | Zero – Programming Language for Agents | 3 pts / 0 comments | 2026-05-23T11:13:35Z | HN developer discourse |
| 11 | dev_web | We Benchmarked Claude Code, Codex, Semgrep, CodeQL, Trent on 28 CWE-Bench CVEs | 6 pts / 2 comments | 2026-05-28T12:39:46Z | HN developer discourse |
| 12 | dev_web | Mini-SWE-agent scores up to 74% on SWE-bench in 100 lines of Python code | 2 pts / 0 comments | 2026-05-28T05:05:59Z | HN developer discourse |
| 13 | dev_web | Show HN: 97% on SWE-bench Verified with subscription-token agents | 2 pts / 0 comments | 2026-05-24T18:03:28Z | HN developer discourse |
| 14 | dev_web | Bito's AI Architect Boosts Claude Opus's task success rate by 35% | 2 pts / 0 comments | 2026-05-19T10:02:03Z | HN developer discourse |
| 15 | dev_web | Show HN: Statewright – Visual state machines that make AI agents reliable | 126 pts / 59 comments | 2026-05-12T14:24:55Z | HN developer discourse |
| 16 | dev_web | The Terminal Bench 3.0 community is looking for task contributors | 1 pts / 2 comments | 2026-05-03T03:40:04Z | HN developer discourse |
| 17 | dev_web | ForgeCode: Top open source coding agent in Terminal-Bench 2.0 | 4 pts / 0 comments | 2026-04-29T18:16:23Z | HN developer discourse |
| 18 | dev_web | Open-weight 27B hits 38% on Terminal-Bench 2.0 (Opus 4.1 hit 38% in Aug 2025) | 6 pts / 9 comments | 2026-04-28T19:11:57Z | HN developer discourse |
| 19 | dev_web | Show HN: OSS Agent I built topped the TerminalBench on Gemini-3-flash-preview | 393 pts / 148 comments | 2026-04-27T12:35:55Z | HN developer discourse |
| 20 | dev_web | Show HN: Terminal-Wrench, a dataset of 331 realistic hackable environments | 6 pts / 2 comments | 2026-04-15T00:42:30Z | HN developer discourse |
| 21 | dev_web | I spent a year building agent memory on knowledge graphs. Here are my 5 mistakes | 2 pts / 0 comments | 2026-05-30T16:04:30Z | HN developer discourse |
| 22 | dev_web | Collection of Claude Code Skills | 2 pts / 0 comments | 2026-05-30T14:52:06Z | HN developer discourse |
| 23 | dev_web | Researchers let AI models run a simulated society; Claude safest, Grok extinct | 3 pts / 1 comments | 2026-05-30T13:42:21Z | HN developer discourse |
| 24 | dev_web | A Weekend in Claude Design Saves 3 Weeks of Claude Code | 2 pts / 0 comments | 2026-05-30T05:34:51Z | HN developer discourse |
| 25 | dev_web | Show HN: Free open source coding models in Slack | 3 pts / 0 comments | 2026-05-28T16:11:13Z | HN developer discourse |
| 26 | dev_web | First thing you see when Googling "OpenAI Codex app" is a fake malware website | 3 pts / 0 comments | 2026-05-28T13:49:02Z | HN developer discourse |
| 27 | dev_web | Building self-improving tax agents with Codex | 2 pts / 0 comments | 2026-05-27T15:48:40Z | HN developer discourse |
| 28 | dev_web | The Codex Showcase | 4 pts / 0 comments | 2026-05-27T03:00:38Z | HN developer discourse |
| 29 | dev_web | Show HN: OpenHive – AI agents share solutions so other agents dont re-solve them | 5 pts / 0 comments | 2026-05-29T14:35:42Z | HN developer discourse |
| 30 | dev_web | Show HN: TheFoundry – Easy bootstrapping framework for MultiAgent Systems | 2 pts / 0 comments | 2026-05-29T13:18:07Z | HN developer discourse |
| 31 | dev_web | Show HN: AI Skill to port PostgreSQL extensions to MySQL | 4 pts / 0 comments | 2026-05-28T15:18:45Z | HN developer discourse |
| 32 | dev_web | Show HN: Multiplayer, a debugging agent to run locally next to your coding agent | 7 pts / 1 comments | 2026-05-28T14:16:13Z | HN developer discourse |
| 33 | dev_web | Windows computer-use: synthetic cursors for background agents | 3 pts / 0 comments | 2026-05-27T18:48:20Z | HN developer discourse |
| 34 | dev_web | Show HN: AI-org – org-mode powered by AI | 3 pts / 0 comments | 2026-05-30T08:59:47Z | HN developer discourse |
| 35 | dev_web | Show HN: AISlop, a CLI for catching AI generated code smells | 72 pts / 60 comments | 2026-05-29T13:37:38Z | HN developer discourse |
| 36 | dev_web | Ask HN: Is it worth releasing another open-source test coverage aggregator? | 2 pts / 2 comments | 2026-05-29T12:57:43Z | HN developer discourse |
| 37 | dev_web | Glibc CVE-2026-5450 9.8 | 4 pts / 0 comments | 2026-05-29T07:11:13Z | HN developer discourse |
| 38 | github | raphaeltm/simple-agent-manager | 35 stars / 4 forks / 7 issues | 2026-05-30T17:15:37Z | Repo momentum/adoption |
| 39 | github | agent-of-empires/agent-of-empires | 2461 stars / 217 forks / 90 issues | 2026-05-30T17:15:19Z | Repo momentum/adoption |
| 40 | github | akitaonrails/ai-memory | 426 stars / 43 forks / 1 issues | 2026-05-30T17:14:29Z | Repo momentum/adoption |
| 41 | github | deep-copilot/DeepCopilot | 44 stars / 3 forks / 8 issues | 2026-05-30T17:14:19Z | Repo momentum/adoption |
| 42 | github | paean-ai/deeptide | 277 stars / 42 forks / 0 issues | 2026-05-30T17:14:19Z | Repo momentum/adoption |
| 43 | github | gi-dellav/zerostack | 1021 stars / 67 forks / 3 issues | 2026-05-30T17:14:05Z | Repo momentum/adoption |
| 44 | github | superradcompany/microsandbox | 6361 stars / 308 forks / 40 issues | 2026-05-30T16:55:54Z | Repo momentum/adoption |
| 45 | github | llm4s/llm4s | 244 stars / 102 forks / 68 issues | 2026-05-30T16:53:55Z | Repo momentum/adoption |
| 46 | github | vercel-labs/zerolang | 4724 stars / 304 forks / 123 issues | 2026-05-30T17:12:39Z | Repo momentum/adoption |
| 47 | github | barnum-circus/barnum | 106 stars / 4 forks / 3 issues | 2026-05-30T03:34:13Z | Repo momentum/adoption |
| 48 | github | mochilang/mochi | 328 stars / 14 forks / 168 issues | 2026-05-30T02:11:33Z | Repo momentum/adoption |
| 49 | github | jason-lang/jason | 254 stars / 75 forks / 6 issues | 2026-05-29T18:17:46Z | Repo momentum/adoption |
| 50 | github | smallcloudai/refact | 3552 stars / 315 forks / 2 issues | 2026-05-30T09:52:54Z | Repo momentum/adoption |
7. Data Quality / Scan Health
Volume: total 146 >= 100, PASS. Platform coverage: partial; X/Reddit/Facebook were unavailable in the bounded run (public access, rate limits, parsing). Confidence: Medium.
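The scan-health check above can be sketched as a small gate. The volume threshold of 100 and the Medium outcome for a partial-platform run come from this report; the Low and High branches, along with the names `MIN_VOLUME` and `scan_health`, are assumptions added for illustration.

```python
MIN_VOLUME = 100  # volume threshold quoted in the scan-health line


def scan_health(counts: dict[str, int], min_volume: int = MIN_VOLUME) -> dict:
    """Summarise scan volume, missing platforms, and an overall confidence label."""
    total = sum(counts.values())
    missing = sorted(name for name, n in counts.items() if n == 0)
    volume_ok = total >= min_volume

    # Assumed mapping: only the "volume PASS + partial platforms -> Medium"
    # case is stated in the report; the other two branches are hypothetical.
    if not volume_ok:
        confidence = "Low"
    elif missing:
        confidence = "Medium"
    else:
        confidence = "High"

    return {
        "total": total,
        "volume_ok": volume_ok,
        "missing_platforms": missing,
        "confidence": confidence,
    }


if __name__ == "__main__":
    run = {
        "GitHub": 56, "HN/Web": 37, "YouTube": 30, "Papers/Product": 23,
        "X": 0, "Reddit": 0, "Facebook": 0,
    }
    print(scan_health(run))
```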