For AI labs running autonomous research agents

Your research agents should remember what worked

ResearchFS turns papers, repos, lab notes, and experiment results into a memory graph your research agents can use. The next run does not start from a blank prompt. It starts from everything your lab has already learned.


paper → result

turn research into reusable agent memory

10.5%

fewer tokens on the GLM 5.2 run

9.5%

less wall-clock time on the GLM 5.2 run

v0 running

papers, methods, retrieval, experiments, results

Every useful paper should become a reusable lesson

A paper in a folder is just a file. A useful paper becomes a method, an implementation target, an experiment, and a result your research agents can retrieve later.

read paper → extract method → map implementation → edit code target → learn result
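One way to picture that chain is as a record that links each stage to the next. The sketch below is a hypothetical data model, not the ResearchFS schema: the class names and fields (`Paper`, `Method`, `Lesson`, `MemoryGraph`, `outcome`) are assumptions chosen for illustration, and the example values are taken from the sample response shown further down this page.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Paper:
    title: str


@dataclass(frozen=True)
class Method:
    paper: Paper       # where the idea came from
    summary: str       # what the paper proposes, in one line
    code_target: str   # where it would land in the repo


@dataclass
class Lesson:
    method: Method
    outcome: str       # what happened when the method was tried


@dataclass
class MemoryGraph:
    """Hypothetical in-memory stand-in for the graph (assumption, not the real store)."""

    lessons: list[Lesson] = field(default_factory=list)

    def learn(self, lesson: Lesson) -> None:
        # Keep the lesson so a later run can find it.
        self.lessons.append(lesson)

    def for_target(self, code_target: str) -> list[Lesson]:
        # Return every lesson recorded against one code location.
        return [item for item in self.lessons if item.method.code_target == code_target]


paper = Paper("SwiGLU improves transformer MLPs")
method = Method(paper, "replace GELU MLP with gated SwiGLU", "train.py::MLP")
graph = MemoryGraph()
graph.learn(Lesson(method, "helped on H100, hurt on MPS"))
```

Keying lessons by code target is only one possible design; it is used here because it lets an agent ask "what do we already know about this part of the code?" before editing it.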

The shift

Not a paper library

ResearchFS is not storage for PDFs. It is the working memory between your papers, codebase, experiments, and agents.

The user

Researchers and their agents

Humans search and inspect the graph. Research agents retrieve the methods, caveats, and prior results before touching code.

The compounding loop

Every run leaves the lab smarter

When an experiment works or fails, the lesson stays behind. Future research agents do not have to rediscover it.

Start with one repo. Grow into the lab memory layer.

The path is deliberately simple: prove one paper can create one useful code change, then turn that loop into a team workflow, an agent API, and enterprise research memory.

One repo, one research loop

Ingest papers, extract methods, retrieve methods, run a code-edit research agent, log results, and compare baseline vs. ResearchFS.

prove one paper can become one useful code change

A workspace for research teams

Upload papers, ingest arXiv and GitHub, search methods, watch experiments, inspect provenance, and review what changed after each run.

a research team can use it manually and through research agents

A memory API for research agents

Let any coding or research agent ask for relevant methods, prior outcomes, implementation notes, and lessons before it proposes a change.

any AI coding/research agent can use ResearchFS as memory

Institutional research memory

Bring in private papers, internal repos, Slack, Notion, lab notes, experiment trackers, permissions, audit logs, and hardware-specific lessons.

become the institutional memory layer for AI research orgs

A memory API your research agents can call

Instead of pasting twenty papers into every prompt, a research agent asks ResearchFS what the lab already knows: which methods map to this code, where they came from, and what happened last time.

POST /ingest/arxiv

POST /ingest/repo

GET /retrieve?query=...

GET /method/:id

POST /experiment

POST /lesson

# What a research agent should get back
paper        = "SwiGLU improves transformer MLPs"
method       = "replace GELU MLP with gated SwiGLU"
code_target  = "train.py::MLP"
prior_result = "helped on H100, hurt on MPS"
next_action  = "try a smaller hidden multiplier"
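To make the call pattern concrete, here is a minimal sketch of how an agent might build two of these requests and fold a retrieved record into its context. Only the endpoint paths and the five record fields come from this page. The base URL, the JSON body fields (`method_id`, `lesson`), and the helper names are assumptions, and the requests are built but never sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: hypothetical local deployment


def retrieve_request(query: str) -> Request:
    """Build GET /retrieve?query=... without sending it."""
    query_string = urlencode({"query": query})
    return Request(f"{BASE_URL}/retrieve?{query_string}", method="GET")


def lesson_request(method_id: str, text: str) -> Request:
    """Build POST /lesson with a JSON body (field names are assumptions)."""
    body = json.dumps({"method_id": method_id, "lesson": text}).encode("utf-8")
    return Request(
        f"{BASE_URL}/lesson",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(record: dict) -> str:
    """Turn one retrieved record into a single line of agent context."""
    return (
        f"{record['method']} (from: {record['paper']}) -> {record['code_target']}; "
        f"prior: {record['prior_result']}; next: {record['next_action']}"
    )


req = retrieve_request("MLP SwiGLU")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; that step is left out because the response format beyond the five fields above is not specified here.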
      

The budget comes from wasted research

Teams will not buy “paper storage.” They will buy fewer repeated failures, lower context cost, fewer wasted GPU runs, and research knowledge that survives team churn.

Lower token spend

Research agents retrieve the relevant method instead of stuffing entire papers into context.

Less researcher rework

People stop re-explaining the same papers, failures, and implementation details.

Fewer wasted GPU runs

Bad ideas get filtered against prior results before they hit expensive training jobs.

Memory that survives turnover

When a researcher leaves, the method/result graph stays with the lab.

Better research agents

Agents propose stronger experiments because they can retrieve tested internal lessons.
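The "fewer wasted GPU runs" point can be sketched as a simple gate in front of the training job. This is an illustrative assumption about how such a filter could work, not a description of ResearchFS internals: `PriorResult`, `should_run`, and the method identifier `swiglu_mlp` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorResult:
    method: str
    hardware: str
    helped: bool


def should_run(method: str, hardware: str, history: list[PriorResult]) -> bool:
    """Skip a proposal only if it was already tried on this hardware and did not help."""
    for result in history:
        if result.method == method and result.hardware == hardware and not result.helped:
            return False
    return True  # untested ideas, and ideas that helped before, still run


history = [
    PriorResult("swiglu_mlp", "H100", helped=True),
    PriorResult("swiglu_mlp", "MPS", helped=False),
]

print(should_run("swiglu_mlp", "MPS", history))
```

The gate is deliberately conservative: it blocks only exact repeats of a known failure, so a method that hurt on one device is still allowed on another.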

Early GLM 5.2 smoke-test signal

Cheaper runs today. Better research memory tomorrow.

In a local nanochat benchmark using zai-org-glm-5-2 via Venice, ResearchFS made the research-agent loop cheaper and faster: 10.5% fewer tokens and 9.5% less wall-clock time. The baseline still won on BPB (bits per byte), which is exactly the kind of lesson the memory graph should keep.

BPB quality gap

0.02419

Baseline best BPB was 2.128834 vs. ResearchFS 2.153019. Lower BPB is better.

Token savings

10.5%

ResearchFS used 28,431 tokens vs. 31,755 for the baseline.

Wall-clock savings

9.5%

ResearchFS finished in 284.6 seconds vs. 314.6 seconds for the baseline.

Query reuse

Run 3

ResearchFS reused the prior query "MLP SwiGLU" during the GLM run.

Honest read: this is an early research-agent loop smoke test, not a SOTA claim. The useful signal is that ResearchFS cut tokens and wall time while preserving the full paper-method-result trail. The next milestone is turning that cheaper exploration into better BPB.
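The headline percentages follow directly from the raw numbers reported above. The short check below recomputes them; the variable and function names are illustrative, and the only inputs are the token counts, wall-clock times, and BPB values already quoted on this page.

```python
baseline_tokens, researchfs_tokens = 31_755, 28_431
baseline_seconds, researchfs_seconds = 314.6, 284.6
baseline_bpb, researchfs_bpb = 2.128834, 2.153019


def percent_saved(baseline: float, candidate: float) -> float:
    """Relative reduction of candidate versus baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


token_savings = percent_saved(baseline_tokens, researchfs_tokens)
time_savings = percent_saved(baseline_seconds, researchfs_seconds)
bpb_gap = researchfs_bpb - baseline_bpb  # positive means the baseline is better

print(f"{token_savings:.1f}% fewer tokens")
print(f"{time_savings:.1f}% less wall-clock time")
print(f"BPB gap: {bpb_gap:.6f}")
```

The exact BPB difference is 0.024185, which the page reports rounded as 0.02419.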

Make every research run compound

If your team is experimenting with autonomous research agents, the question is not whether they can read papers. It is whether they remember what those papers did in your codebase.
