
Research Methodology

What to do this week: Start tracking your AI development work. Even rough notes beat nothing.

Most AI content is "I built X with Claude" without any way to verify the claim. We track everything: real costs, actual time, every error and intervention. Try this yourself: you'll learn more in one tracked session than in a month of undocumented work.

Try This Workflow

1. Build (Claude Code): Work with AI agents as development partners.

2. Track (Auto-Log): Hooks capture every prompt, error, and intervention (see the sketch below).

3. Analyze (Real Data): Actual costs, time, and errors pulled from APIs.

4. Publish (Honest Results): What worked, what didn't, and why.

Notice what changes when you track: you catch patterns you'd otherwise miss.
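The Track step leans on Claude Code hooks, which can run a command of your choosing on events like prompt submission or tool use and pass the event details as JSON on stdin. Here's a minimal logging sketch, assuming that stdin behavior and a `hook_event_name` field in the payload; the log path and record shape are just suggestions, so verify the field names against the current hooks documentation before relying on them.

```python
#!/usr/bin/env python3
"""Minimal Claude Code hook logger: append whatever arrives on stdin to a JSONL file.

Register it in .claude/settings.json as the command for the hook events you care
about (e.g. UserPromptSubmit, PostToolUse). The payload field names used below
are assumptions; check the current hooks documentation.
"""
import json
import sys
import time
from pathlib import Path

# Arbitrary log location; any writable path works.
LOG_FILE = Path.home() / ".claude" / "experiment-log.jsonl"


def main() -> None:
    raw = sys.stdin.read()  # hook input arrives as JSON on stdin
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}

    record = {
        "ts": time.time(),  # our own timestamp, useful for session analysis later
        "event": payload.get("hook_event_name", "unknown"),  # assumed field name
        "raw": raw,  # keep the full input verbatim; disk is cheap
    }
    LOG_FILE.parent.mkdir(parents=True, exist_ok=True)
    with LOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    main()
```

Point the relevant hook entries in your Claude Code settings at a script like this and every prompt and tool call lands in one append-only file you can analyze later.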

What to Capture (Start Simple)

  • Prompts: Real-time logging via Claude Code hooks (e.g., 47 iterations logged)
  • Errors: Precise counts and resolution times (e.g., 23 errors, avg fix: 8 min)
  • Costs: Token usage plus infrastructure, pulled from APIs (e.g., $18.50 Claude + $8.30 Cloudflare)
  • Interventions: When the AI needed human help, and why (e.g., 12 manual fixes documented)
  • Time: Session duration, not guesswork (e.g., 26 hours actual vs. 120 estimated)
  • Architecture: Decisions made and alternatives considered (e.g., Workflows over Workers, and why)
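If the list above feels abstract, a flat JSONL file with one record per event is plenty to start with. The sketch below is one possible shape, not a format the tracking skill prescribes; every field name and example value is illustrative.

```python
from dataclasses import dataclass, asdict, field
import json
import time


@dataclass
class ExperimentEvent:
    """One row in a flat JSONL experiment log. All field names are illustrative."""
    kind: str            # "prompt" | "error" | "cost" | "intervention" | "note"
    detail: str          # what happened, in a sentence
    minutes: float = 0.0   # time spent, if any
    cost_usd: float = 0.0  # dollars spent, if any
    ts: float = field(default_factory=time.time)


def log_event(event: ExperimentEvent, path: str = "experiment-log.jsonl") -> None:
    """Append one event; a plain append-only text file is enough for later analysis."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(event)) + "\n")


# Invented example entries showing the kinds of things the list above describes:
log_event(ExperimentEvent("error", "Worker deploy failed: missing KV binding", minutes=8))
log_event(ExperimentEvent("intervention", "Hand-edited the config Claude generated", minutes=5))
log_event(ExperimentEvent("cost", "Claude API usage for the day", cost_usd=2.40))
```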

Pick Your Starting Point

Already mid-project? That's fine. You can start tracking anywhere—just be honest about what you measured vs. estimated.

Start Fresh (Best Data)

Track from the first prompt. You get complete data on every iteration, error, and decision.

  • Data quality: high confidence, precise metrics
  • Use case: new experiments starting from scratch

Pick Up Mid-Project (Most Common)

Realize halfway through that this is worth tracking? Combine real-time data going forward with what you can reconstruct from git history.

  • Data quality: mixed; estimates for past work, precise for future
  • Use case: active projects you realize are experiment-worthy

Document After the Fact (Still Valuable)

Already shipped? Reconstruct from git commits, API logs, and memory. Just be transparent about limitations.

  • Data quality: lower confidence, acknowledged limitations
  • Use case: completed projects with production data
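For the mid-project and after-the-fact cases, git history is the main source you can mine. One rough trick is to cluster commit timestamps into sessions: gaps under some threshold count as continuous work, larger gaps count as breaks. The sketch below does exactly that; the 60-minute threshold is an arbitrary assumption, and the result is a lower bound on hands-on time, not a precise figure.

```python
import subprocess


def estimate_hours(repo: str = ".", max_gap_minutes: int = 60) -> float:
    """Rough session-hour estimate from commit timestamps.

    Commits closer together than max_gap_minutes are treated as one continuous
    session; the threshold is a guess you should tune for your own rhythm.
    """
    out = subprocess.run(
        ["git", "-C", repo, "log", "--pretty=format:%at"],  # unix timestamp per commit
        capture_output=True, text=True, check=True,
    ).stdout.split()
    stamps = sorted(int(s) for s in out)
    if len(stamps) < 2:
        return 0.0

    total_seconds = 0
    for earlier, later in zip(stamps, stamps[1:]):
        gap = later - earlier
        if gap <= max_gap_minutes * 60:
            total_seconds += gap  # same session: count the gap as work time
        # larger gaps are breaks between sessions and are not counted
    return total_seconds / 3600


if __name__ == "__main__":
    print(f"Estimated hands-on hours (lower bound): {estimate_hours():.1f}")
```

Whatever number this gives you, label it as an estimate in the write-up; that's the "estimates for past work" half of the mixed data quality above.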

What Changes When You Track

Try tracking one project. Notice the difference in what you learn.

Before: Just Building

  • "I built X with AI" (anecdote)
  • No way to replicate your success
  • Can't prove it worked
  • You forget what actually happened

After: Building + Tracking

  • "I built X: 26 hrs, $27, 78% savings" (proof)
  • Others can try your approach
  • You spot patterns across projects
  • You actually remember what worked

The real benefit: you learn from your own work. Without data, every project is a one-off. With data, patterns emerge across experiments.

Ready to Try It?

The experiment tracking system is available as a Claude Code Skill. Here's how to get started:

1. Install the Skill: Add experiment tracking to your Claude Code setup.

2. Build & Track: Work with Claude Code while automatic logging captures everything.

3. Generate Papers: Transform tracked data into reproducible research.
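To give a feel for what the Generate Papers step starts from, here's a sketch that rolls a JSONL event log up into the headline numbers a write-up needs. It assumes the illustrative record shape from the earlier sketch (kind, minutes, cost_usd), not whatever format the skill itself uses.

```python
import json
from collections import Counter


def summarize(path: str = "experiment-log.jsonl") -> dict:
    """Aggregate a JSONL event log into headline metrics.

    Assumes the illustrative record shape from the earlier sketch; adjust the
    field names to whatever you actually log.
    """
    counts = Counter()
    minutes = 0.0
    cost = 0.0
    with open(path, encoding="utf-8") as f:
        for line in f:
            event = json.loads(line)
            counts[event.get("kind", "unknown")] += 1
            minutes += event.get("minutes", 0.0)
            cost += event.get("cost_usd", 0.0)
    return {
        "hours": round(minutes / 60, 1),
        "errors": counts["error"],
        "interventions": counts["intervention"],
        "cost_usd": round(cost, 2),
    }


if __name__ == "__main__":
    print(summarize())
```

From there, the write-up is mostly narration around numbers you can actually defend.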

See It in Practice

Here's what tracking looks like on a real project—Experiment #1: Zoom Transcript Automation:

  • 26 hours
  • 47 errors
  • 12 interventions
  • 78% time savings

Data sources: Real-time prompt logging via hooks, Claude Code Analytics API, Cloudflare billing API, git commit history

What you can try: Start with just time and error counts. Add cost tracking once you've got the habit.