Library note. Tool comparisons stay qualitative until dated rerun logs are ready.

ToolLab — honest comparisons between tools we actually use

ToolLab is the comparison half of Leefan Reports. Three pages today: Claude Code vs Cursor for pSEO ops; Firecrawl vs Browserless for structure analysis; Gumroad vs BOOTH vs Stripe for productized assets. Each one is a comparison we ran for ourselves, on a real piece of work, with the criteria written down before the result came in. None of the three pages is a vendor-supplied take.

This hub exists to set expectations about how we compare tools before you read any individual page. The methodology is more important than any one verdict.

1. What ToolLab is, and what it deliberately isn’t

ToolLab is operator-grade orientation. Small comparisons done on a single operator’s machine, against a fixed input, with criteria stated up-front and a counter-case section that names the situation in which the losing tool is the right choice.

ToolLab is not a procurement benchmark. If you are buying a tool for a 200-engineer team and need a controlled cross-vendor benchmark on production-grade load, read the page anyway to set up your own criteria, then run your own benchmark on your own workload.

ToolLab is also not sponsored. No vendor has paid for placement; no link on a ToolLab page is an affiliate link; no vendor reviewed or approved any text.

2. The site-wide benchmark posture

Each comparison page declares, at the top, which posture it takes:

PagePosturePreview status
T01 — Claude Code vs Cursor for pSEO ops Single-operator measurement numeric rerun pending — rerun pending; no placeholder numbers shown.
S01 — Firecrawl vs Browserless for structure analysis Hybrid numeric rerun pending — free-tier rerun pending; no placeholder numbers shown.
M01 — Gumroad vs BOOTH vs Stripe for productized assets Rubric-only rubric-only — synthetic net-of-fee numbers stripped; vendor-fee table replaced with [fee table pending current public-fee check] until current public fee pages are re-checked.

3. The standard benchmark-posture disclosure block

Benchmark posture: Each comparison states whether it is a single-operator measurement, hybrid, or rubric-only synthesis, last verified 2026-05-21. Raw inputs and run logs can be requested at contact@leefan.co.jp. The numbers are orientation, not a procurement-grade benchmark. If you are buying this software for your team, rerun the input yourself on your own workload before deciding.

In this preview, {date} renders as {T01_RERUN_DATE} / {S01_RERUN_DATE} / {M01_LAST_VERIFIED} literally; none of those values is bound.

4. The three pages, in the order a new reader should take them

#PageWhat it gives you (qualitatively)Verdict shape
1 T01 — Claude Code vs Cursor for pSEO ops numeric rerun pending Card-×-iteration comparison framing on a fixed input. Output-shape distinction (wrote-file-to-disk vs wrote-file-into-chat). Explicit counter-case for where Cursor is the right tool. Claude Code wins on file-driven, repeatable, auditable operations; Cursor wins on live-edit-in-IDE, conversational debug.
2 S01 — Firecrawl vs Browserless for structure analysis numeric rerun pending 10-URL run on free-tier of each (rerun pending). Load-bearing artifact: the “structure-only, NOT body” framing. Firecrawl wins on structure extraction with zero JS execution; Browserless wins on full-page rendering for JS-heavy sites.
3 M01 — Gumroad vs BOOTH vs Stripe for productized assets rubric-only Rubric-only comparison; vendor-fee table held at [fee table pending current public-fee check] in preview; JP-specific framing (BOOTH audience overlap, payout cadence, 印紙税 applicability). Depends — Gumroad on cross-border simplicity; BOOTH on JP-domestic audience overlap for niche assets; Stripe on margin once SKU volume is high and you have your own checkout.

If you only have time for one, read T01 once it is rerun — it is the page with the clearest “we ran this, here is what happened” shape.

5. The criteria we use across ToolLab

FamilyWhat it capturesWhen it dominates
ReproducibilityCan a stranger rerun the comparison and check our claim?Always at least 20%; load-bearing for honesty.
Operator-fitDoes the tool fit a small-team / single-operator workflow?Heavy when the audience is a 1–5-person practice.
Trust contractDoes the tool tell you when it has done something dangerous (overwritten a file, fired a network request, billed you)?Heavy for any tool that can spend money or modify state on your behalf.
Time-to-first-real-outputHow long from install to first useful output?Heavy for new-tool evaluation.
JP applicabilityFor JP-domestic operators: payout cadence, tax form support, language coverage, audience overlap.Heavy whenever the artifact is a JP-domestic operation.
Vendor exit costIf we stop paying, what do we lose?Heavy for any tool we propose to add to the Leefan Reports stack.

6. What ToolLab does not compare (and why)

7. Where these pages plug into the rest of the site

We do not link from ToolLab to vendor partner programs, affiliate networks, or trial sign-up CTAs.

8. What’s not here yet