Citation Archiving Tool for Researchers
A browser extension that archives web sources at the moment of citation, preventing link rot in academic work.
Build
Link rot is a real, painful problem for researchers: roughly one in five citations breaks within five years. Existing citation tools ignore the issue, leaving researchers to archive sources manually. The challenge is distribution: reaching individual researchers and convincing institutions to pay. The technical build is straightforward (Wayback API + local capture), but earning trust and changing habits are hard. For this to work, researchers must adopt the extension as part of their workflow, and libraries must see it as a budget-worthy preservation tool.
At a Glance
Market Size
~$500M
Adjacent to citation management and digital preservation markets.
Confidence 60%
Competition Density
Low
No direct competitor; adjacent tools ignore archiving.
Confidence 90%
Defensibility
7/10
Accumulated archive of captured pages creates switching costs.
Confidence 70%
Time to Validate
4 weeks
10-user pilot + librarian interest signals.
Confidence 80%
Quick Metrics
Entry Difficulty
Medium (80%)
Technical build easy; distribution and trust harder.
Time to MVP
14–28 days
Browser extension + Wayback API integration.
Time to First $
72–120h
Sell to individual researchers via $15/mo subscription.
Opportunity Breakdown
Opportunity
8/10
Clear pain with no direct competitor.
Problem
9/10
Broken citations erode research credibility.
Feasibility
8/10
Buildable with existing APIs and browser tech.
Why Now?
Superpowers Unlocked
7/10
Wayback API + browser extensions mature.
Cultural Tailwinds
8/10
Growing concern over digital preservation.
Blue Ocean Gap
9/10
No citation tool archives sources.
Ship Now or Regret Later
6/10
Existing tools could add this feature.
Creator Economy Boost
4/10
Not directly relevant to researchers.
Economic Pressure
5/10
Libraries seek cost-effective preservation.
Heuristic scoring based on model judgment, not factual measurement.
Scorecard
Strength Profile
Demand
8.0/10
Researchers actively complain about broken links.
Problem Severity
9.0/10
Broken citations undermine research integrity.
Monetization Readiness
7.0/10
Libraries already pay for preservation tools.
Competitive Gap
8.0/10
No tool combines citation + archiving seamlessly.
Timing
8.0/10
Rising awareness of digital preservation.
Founder Fit
7.0/10
Achievable for a solo developer with API knowledge.
Revenue Criticality
8.0/10
Directly preserves research value for institutions.
Risk Profile
Operational Complexity
Moderate complexity
Mostly software; some support for edge cases.
Liquidity Risk
Low risk
Low capital; revenue from day one possible.
Regulatory Risk
Low risk
Standard data privacy compliance only.
Lower values indicate lower risk.
Demand Signals
Researchers manually saving web pages as PDFs for citations.
Complaints about broken links in academic papers on Twitter and Reddit.
Libraries investing in digital preservation tools (e.g., LOCKSS, Portico).
Studies quantifying link rot rates (e.g., 20% in 5 years).
Existing citation tools' forums requesting archiving features.
Growth in digital preservation job postings and conferences.
Insights
Link rot affects 20% of citations within 5 years, per studies.
Researchers manually archive sources using browser bookmarks or the Wayback Machine.
Citation tools (Zotero, EndNote) ignore source preservation.
Libraries spend heavily on journal access but not on archiving cited pages.
Wayback Machine API is free but rate-limited; local capture needed for paywalls.
Institutional sales cycles are long but high-value ($5K+ per library).
Distribution via Code4Lib and preservation listservs is effective.
User habit change is the biggest barrier: the tool must integrate into existing workflows.
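The Wayback-plus-local-capture insight above can be sketched as a small decision routine. This is a minimal sketch under stated assumptions, not the extension's actual implementation: it relies on the Internet Archive's public availability endpoint (archive.org/wayback/available) and its Save Page Now URL prefix (web.archive.org/save/), while the function names, the paywalled flag, and the plan format are hypothetical. It makes no network calls; it only builds request URLs and interprets a response body.

```python
import json
from typing import Optional
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"
SAVE_ENDPOINT = "https://web.archive.org/save/"


def availability_url(page_url: str) -> str:
    """Build the URL that asks whether a snapshot of page_url exists."""
    return f"{AVAILABILITY_ENDPOINT}?{urlencode({'url': page_url})}"


def closest_snapshot(payload: str) -> Optional[str]:
    """Return the closest snapshot URL from an availability response, or None."""
    data = json.loads(payload)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def plan_capture(page_url: str, payload: str, paywalled: bool) -> dict:
    """Decide how to archive a cited page (hypothetical plan format).

    Paywalled pages are captured locally, because a public crawler cannot
    see them. Everything else gets a fresh Wayback capture request, with
    any existing snapshot kept as a fallback if that request fails.
    """
    if paywalled:
        return {"action": "local_capture", "request": None, "fallback": None}
    return {
        "action": "wayback_save",
        "request": SAVE_ENDPOINT + page_url,
        "fallback": closest_snapshot(payload),
    }
```

Requesting a fresh capture rather than reusing an older snapshot matches the goal of archiving at the moment of citation; the older snapshot is only a safety net.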
Risks
Wayback API rate limits may slow down archiving for heavy users.
Paywalled pages require local storage, increasing storage costs.
Researchers may not change their habit of using existing citation tools.
Library sales cycles are long (6-12 months), delaying revenue.
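One common way to soften the rate-limit risk above is to retry archive requests with capped exponential backoff. The sketch below is illustrative only: the base delay, the cap, and the choice of which HTTP status codes count as retryable are assumptions, not documented Wayback limits.

```python
from typing import List


def backoff_delays(max_retries: int,
                   base_seconds: float = 2.0,
                   cap_seconds: float = 60.0) -> List[float]:
    """Return the wait before each retry: base * 2^attempt, capped."""
    return [min(cap_seconds, base_seconds * 2 ** attempt)
            for attempt in range(max_retries)]


def should_retry(status_code: int) -> bool:
    """Assumption: retry on 429 (Too Many Requests) and on 5xx server errors."""
    return status_code == 429 or 500 <= status_code < 600
```

With the default values, six retries wait 2, 4, 8, 16, 32, and 60 seconds. A heavy user's captures could be queued and drained in the background on this schedule so that citing never blocks on the archive.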
Superpowers
First mover in combining citation formatting with source archiving.
Leverages free Wayback API for low-cost archiving.
Integrates with popular tools (Zotero, Notion), reducing switching costs.
Institutional pricing aligns with existing library budgets.
Honest Read
What we know for certain versus what still needs testing.
What we know for certain
- Link rot affects 20% of citations within 5 years (peer-reviewed studies).
- Researchers manually archive sources using bookmarks or PDFs.
- No existing citation tool offers automatic source archiving.
- Libraries already spend on digital preservation (LOCKSS, Portico).
Open questions
- Will researchers pay $15/mo individually, or only via institutions?
- Can local capture reliably handle paywalled and JavaScript-heavy pages?
- How long does it take to convert a librarian lead into a $5K contract?
These need user testing or more data before you should bet on the answer.
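To put the individual-versus-institution question in numbers, the arithmetic below compares the two price points mentioned in this report. It assumes the $5K library contract is an annual figure, which the report does not state, and it ignores churn, discounts, and sales costs.

```python
import math


def subscribers_to_match(contract_per_year: float, monthly_price: float) -> int:
    """Individual subscribers needed to equal one institutional contract per year."""
    annual_per_subscriber = monthly_price * 12
    return math.ceil(contract_per_year / annual_per_subscriber)


# Assumption: $5,000 is an annual contract value; $15/mo is the individual plan.
print(subscribers_to_match(5000, 15))  # 28
```

Under that assumption, one library contract is worth about 28 individual subscribers retained for a full year ($180 each), which is the trade-off to weigh against a 6-12 month sales cycle.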
Live Fast