
But Ralph has problems when you move beyond one-off tasks:

1. Overcooking: The loop runs forever. Leave it too long and the AI starts adding features nobody asked for, refactoring code that was fine, writing documentation for the sake of it.

2. Undercooking: Ctrl+C too early and you're left with half-done features.

3. Fragile state: Ralph uses markdown files (TASKS.md, PLANNING.md) as source of truth. LLMs can corrupt these - add extra formatting, forget sections, change structure.

4. No memory between sessions: Each run starts fresh. The AI can't see what was done yesterday.

5. Vendor lock-in: Ralph was built for Claude Code. Switching to Codex or Gemini means rewriting your workflow.

So I built juno-code to fix these problems.

What juno-code does differently

Iteration control: Instead of `while :; do`, you get `-i 5` for exactly five iterations, or run_until_completion.sh, which stops when the kanban tasks are done.
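For reference, Ralph's core is an unbounded shell loop - roughly the sketch below (paraphrased; the prompt file name and flags are assumptions, not the exact script):

  # Ralph-style loop: re-run the same prompt until you Ctrl+C
  while :; do
    claude -p "$(cat PROMPT.md)"
  done

juno-code replaces that open-ended loop with an explicit iteration budget.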

Structured task tracking: Instead of markdown, tasks are stored in NDJSON files via juno-kanban. The format can't be corrupted by LLM formatting errors. You can query tasks programmatically:

  ./.juno_task/scripts/kanban.sh list --status in_progress
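(For illustration: NDJSON is just one JSON object per line, so a task file might look something like this - the field names here are hypothetical, not the actual juno-kanban schema:)

  {"id": "task-001", "title": "Add login form validation", "status": "done", "commit": "a1b2c3d"}
  {"id": "task-002", "title": "Migrate user service to TypeScript", "status": "in_progress"}

An agent appending or editing one line can't silently break the rest of the file the way a mangled markdown section can.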

Backend agnostic: Switch between Claude, Codex, Gemini, or Cursor with one flag:

  juno-code -b shell -s claude -m :opus -i 5 -v
  juno-code -b shell -s codex -m :codex -i 5 -v
  juno-code -b shell -s gemini -m :flash -i 5 -v

Stuck on a bug? Try a different model's perspective with a one-word change.

Full traceability: Every completed task links to a git commit. Time travel through development history. The AI can search git history instead of re-reading everything.
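(Assuming the task ids end up in the commit messages - my guess at the convention, not documented behavior - tracing a task back is a plain git query:)

  # find the commits that mention a given task id
  git log --oneline --grep "task-002"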

Hooks without lock-in: Run tests, linters, or any script at lifecycle points. Works with any backend:

  {
    "hooks": {
      "START_ITERATION": { "commands": ["./scripts/lint.sh"] },
      "END_ITERATION": { "commands": ["npm test"] }
    }
  }

Real-time feedback: Send feedback to the running AI without stopping it:

  juno-code feedback "found a bug in the auth flow"

Quick start

  npm install -g juno-code

  cd your-project
  juno-code init --task "Migrate from JavaScript to TypeScript" --subagent claude

  # Run until kanban tasks are complete
  ./.juno_task/scripts/run_until_completion.sh -s claude -i 10 -v

The key insight

Ralph proved that AI works better in loops. juno-code adds the structure that makes loops sustainable:

- Controlled cooking time (not infinite)
- Strict task format (not corruptible markdown)
- Any AI backend (not vendor locked)
- Full audit trail (not blended changes)

GitHub: https://github.com/askbudi/juno-code
npm: https://www.npmjs.com/package/juno-code

Built with TypeScript. MIT licensed. Feedback welcome.


I read this a few years ago and started doing it, and I never looked back. I can search what I did on a specific day, look up a task and see all its traces, and it's accessible everywhere via Dropbox.

No upgrade CTA, no nonsense. Now I can even feed it to an LLM and get feedback on my planning, routines, and everything else.


I created a simple .sh command that does the testing via browser-use, and documented how to invoke it in CLAUDE.md. Because it's just a shell command, it's more deterministic than Skills, and it uses close to zero context window compared to an MCP. A rough sketch of the setup is below.
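(Everything here is a placeholder - the script name, port, and the browser_test.py helper that wraps browser-use are mine, not from any real project:)

  #!/usr/bin/env bash
  # browser_test.sh - one deterministic command the agent can run after UI changes
  set -euo pipefail
  TASK="${1:?usage: ./browser_test.sh \"describe the check to run\"}"
  # browser_test.py is assumed to hold your actual browser-use setup (model, browser, etc.)
  python browser_test.py --task "$TASK" --url "http://localhost:3000"

The matching CLAUDE.md entry is then just a couple of lines telling the agent to run ./browser_test.sh "<what to verify>" and read the output.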

Recently I've found myself more interested in shell commands than MCPs. There's nothing to set up, debugging is far easier, and I'm free to use whichever model I like for a specific function. For example, for Playwright I use GPT-5, just because I have free credits, and I can save my Claude Code quota for more important tasks.


TinyAgent lets you use basically any AI model without rewriting your code. GPT-5, Claude, that new Llama model everyone's talking about, local Ollama stuff - just swap the config. Oh, and yes, it's sandboxed, before anyone asks. That `rm -rf` incident taught me to sandbox EVERYTHING. It uses proper containers and OS-level sandboxing.

I've been using it for a couple of side projects - one that chats with GitHub repos: askbudi.ai

It's the core engine for juno-agent: https://askbudi.ai/juno-cli

You can check the docs here: https://askbudi.ai/tinycodeagent

Installation: `pip install tinyagent-py[all]`

P.S. - I know there are like 50+ agent frameworks out there, but this one actually works for my weird use cases.

P.P.S. - Yes, I realize I'm just reinventing wheels at this point, but they're MY wheels, dammit.


Hey HN,

Last week, I spent 40 minutes debugging a production issue that should have taken 5. Not because the bug was complex, but because I kept switching between Claude Code, Cursor, Codex, and Gemini - copying context, losing the thread, starting over.

The workflow was painful:
1. Claude Code couldn't reproduce a React rendering bug
2. Copy-pasted 200 lines to Cursor - different answer, still wrong
3. Tried Codex - needed to re-explain the database schema
4. Finally Gemini spotted it, but I'd lost the original error logs

This context-switching tax happens weekly. So I built Roundtable AI MCP Server.


What makes it different: Unlike existing multi-agent tools that require custom APIs or complex setup, Roundtable works with your existing AI CLI tools through the Model Context Protocol. Zero configuration - it auto-discovers what's installed and just works.

Architecture: Your IDE → MCP Server → Multiple AI CLIs (parallel execution). It runs CLI coding agents in headless mode and shares the results with the LLM of your choice.

Real examples I use daily:

Example 1 - Parallel Code Review:

  Claude Code > Run Gemini, Codex, Cursor and Claude Code Subagent in parallel and task them to review my landing page at '@frontend/src/app/roundtable/page.tsx'

  → Gemini: React performance, component architecture, UX patterns
  → Codex: Code quality, TypeScript usage, best practices
  → Cursor: Accessibility, SEO optimization, modern web standards
  → Claude: Business logic, user flow, conversion optimization

  Save their review in {subagent_name}_review.md then aggregate their feedback

  Example 2 - Sequential Task Delegation:
  First: Assign Gemini Subagent to summarize the logic of '@server.py'
  Then: Send summary to Codex Subagent to implement Feature X from 'feature_x_spec.md'
  Finally: I run the code and provide feedback to Codex until all tests in 'test_cases.py' pass
  (Tests hidden from Codex to avoid overfitting)

  Example 3 - Specialized Debugging:
  Assign Cursor with GPT-5 and Cursor with Claude-4-thinking to debug issues in 'server.py'
  Here's the production log: [memory leak stacktrace]
  Create comprehensive fix plan with root cause analysis

  All run in parallel with shared project context. Takes 2-5 minutes vs 20+ minutes of manual copy-paste coordination.
Try it:

  pip install roundtable-ai
  roundtable-ai --check  # Shows which AI tools you have

I'd love feedback on:
1. Which AI combinations work best for your debugging workflows?
2. Any IDE integration pain points?
3. Team adoption blockers I should address?

GitHub: https://github.com/askbudi/roundtable
Website: https://askbudi.ai/roundtable
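(If your client doesn't auto-discover it and you prefer to register it by hand, a standard mcpServers-style entry would look roughly like this - the server name and empty args are my assumption, so check the README for the exact invocation:)

  {
    "mcpServers": {
      "roundtable": {
        "command": "roundtable-ai",
        "args": []
      }
    }
  }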


I'm using FingerprintJS. Overnight, they changed their pricing and removed the free plan, so I've ended up paying for the subscription for the past 3 years. I also can't remove them, because it's critical to our anti-fraud system.

The reason I pay for their library is their accuracy. It would be amazingly interesting if your library could compete. Then I would switch immediately.

By the way, I don't have a problem with paying for a service; the problem is that their plans aren't matched to usage volume (the minimum is $100 for 20,000 verifications), and I use only 2,000.


Hey - Fingerprint team here.

Thank you for sharing your experience and for highlighting what matters most to you: accuracy and pricing for lower-usage tiers. I definitely hear your frustration with the removal of the Free plan a few years ago and the challenge of not having a plan that fit your usage level.

Based on feedback like yours, we recently introduced a new Free plan (in May) that includes 1,000 API calls for iOS and web, plus 500,000 calls for Android. Depending on which platform your ~2,000 calls are coming from, this might be a better fit.

Your feedback is exactly what helps us shape these updates, so we’ll continue to take this into account as we refine our plans. If you’d like, I’d be happy to connect directly to see how we can make sure you’re on the plan that best fits your needs.


I get that open-source in fraud prevention is really hard; I'm sympathetic to the challenges here.

FingerprintJS open-source (and the discussed FingerprinterJS) are both trivial to spoof since the entire codebase is easily examined, and the implementation is totally open as an oracle to someone who wants to bypass it or construct arbitrary fingerprints. It's a nice proof of concept (and I like the attention to unstable signals in FingerprinterJS here) but ultimately doesn't hold up against any dedicated attackers.

I work on a competing commercial product (Stytch Device Fingerprinting) and your usage would be within our free tier. Unfortunately we don't have an open-source version or self-serve onboarding because of the adversarial problems mentioned above. Happy to chat if that helps, bchen at stytch dot com.


Working on creating an open-source stack for developing agents. I started a year ago by playing with CrewAI, LangGraph, Ango, and others, then realised the learning curve was steeper than necessary and that I didn't use most of their fancy features. So I started building tinyAgent, inspired by a post on Hugging Face: https://github.com/askbudi/tinyagent

Later on, I found myself switching between Claude, Cursor, Codex, and Gemini CLI (not much Gemini CLI, to be honest :D). I also wanted to play with MCP servers, so I built Roundtable: https://askbudi.ai/roundtable. When I hit a bug or need to brainstorm, I have it create subagents from Claude, Codex, ..., task each of them to analyze the issue, and then aggregate their opinions. It's fun, and I feel I get more out of what I've already paid for. (I paid for a 1-year Cursor plan, later switched to Claude Code, and Codex is part of the Plus plan I already have access to.)


Example use case:

Prompt:

```
The user dashboard is randomly slow for enterprise customers.

Use Gemini SubAgent to analyze frontend performance issues in the React components, especially expensive re-renders and inefficient data fetching.

Use Codex SubAgent to examine the backend API endpoint for N+1 queries and database bottlenecks.

Use Claude SubAgent to review the infrastructure logs and identify memory/CPU pressure during peak hours.
```


Any plans to support GitHub Copilot CLI?

