App Clinic: cal.com
A dissection of the cal.com codebase — how it works, what the team does well, where the technical debt sits, and what other projects can learn from it. The full Octokraft clinic.
What we're learning about AI-generated code, quality drift, and keeping velocity without the mess.
A dissection of the Grafana codebase — how it works, what the team does well, where the technical debt sits, and what other projects can learn from it. The full Octokraft clinic.
Rust, Go, TypeScript, Python, PHP, and Swift scored by the same pipeline across 24 open source projects. Rust's compiler sets a floor no other language matches. But Go proves the same toolchain can produce a 97.9 and a 50.1, depending on the team.
One engineer and 800 AI sessions built a Next.js replacement in under a week for $1,100. Octokraft's analysis pipeline gave it a 99.6 on security. Then external researchers found 31 vulnerabilities that static analysis never caught.
Four AI coding agents received the same weather service to port and the same bug to fix. Three ports work. All four scored D- or worse. Two agents fixed the root cause. The data shows what these tools produce without human review.
The same analysis pipeline scored 24 open source repositories. Heritage projects averaged 76.0. AI-heavy projects averaged 74.4. The methodology does not predict the score. Language choice and engineering discipline do.
285,000 GitHub stars. 8 CVEs. 42,000 exposed instances. We analyzed OpenClaw's 1.1 million lines of code across four dimensions. Here's what we found.
OpenClaw looked great by every standard metric. Clean code, strong architecture, massive test suite. It also had 8 CVEs. Here's why good analysis isn't enough.
We ran the exact same code analysis on the same codebase with the same LLM. The only variable: which agent CLI executed the work. 2.7x difference in findings. 2.2x difference in cost.
Most technical debt processes fail because they rely on separate tracking rituals. The sustainable model is tooling that surfaces friction where work already happens.
Octokraft's full analysis pipeline scored three open-source AI coding agents. OpenCode edges Codex by 0.1 at B+. Gemini CLI trails at B-. Security is no longer a clean sweep for compiled languages.