
Future House is a nonprofit research organization that develops tools based on artificial intelligence (AI) to accelerate scientific discovery. Established in 2024 with the support of former Google CEO Eric Schmidt, the organization aims to build a functional “AI scientist” within a decade. Its work focuses on AI systems that can automate research processes in biology and other complex scientific fields. The organization is headquartered in the United States.
Future House was co-founded by CEO Sam Rodriques and Chief Science Officer Andrew White. The organization is staffed by a multidisciplinary team of technical and operational experts. Members of the team include Sam Cox, Michael Skarlinski, Jon Laurent, Lauren Jaeger, Siddharth Narayanan, James Braza, Michaela Hinks, Ryan-Rhys Griffiths, Tyler Nadolski, Kiki Szostkiewicz, Geemi Wellawatte, Jasmine Dhaliwal, Mayk Caldas, Ludovico Mitchener, Ali Ghareeb, Albert Bou, and Remo Storni.
Future House’s main goal is to develop an AI scientist that can automate scientific research and accelerate the discovery process. This vision is structured across four layers:
In 2025, Future House launched its first platform and API, consisting of four AI tools: Crow, Falcon, Owl, and Phoenix. These tools assist researchers in literature review, data mining, literature comparison, and experiment planning.
All tools offer access to high-quality open-access scientific literature and domain-specific utilities. They support multi-step reasoning processes, enabling in-depth evaluation of each source. The platform is accessible via both a web interface and an API.
Crow, Falcon, and Owl are reported to have outperformed existing frontier models in information retrieval accuracy and summarization quality, at times exceeding the performance of PhD-level researchers.
However, Future House has not yet achieved a scientific breakthrough or original discovery using these tools. The reliability of AI systems on complex scientific problems remains a concern, because such systems tend to produce incorrect or fabricated information.
To evaluate the scientific proficiency of AI models, Future House released BixBench, a bioinformatics-focused benchmark dataset comprising 53 scenarios and 296 research questions. Claude 3.5 Sonnet and GPT-4o scored only 17% and 9%, respectively, indicating that autonomous scientific discovery is still in its infancy.
By contrast, the organization's literature tool PaperQA2 achieved the highest accuracy in the RAG-QA Arena benchmark for scientific tasks. PaperQA2 answers scientific queries with a multi-stage process that combines query expansion, source re-ranking, and contextual summarization.
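The three stages named above can be illustrated with a minimal, self-contained sketch. This is not PaperQA2's implementation or API: every name here (SYNONYMS, STOPWORDS, expand_query, rerank, summarize_in_context, answer) is hypothetical, and simple token overlap stands in for the language-model calls a real system would presumably use at each stage.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical lookup tables; a real system would likely use an LLM instead.
SYNONYMS = {"tumor": ["cancer"], "drug": ["compound"]}
STOPWORDS = {"the", "a", "of", "in", "does", "was", "were"}


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens with stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def expand_query(query: str) -> list[str]:
    """Query expansion: the original query plus one variant per synonym."""
    variants = [query]
    for word, alternatives in SYNONYMS.items():
        if word in tokenize(query):
            for alt in alternatives:
                variants.append(
                    re.sub(rf"\b{word}\b", alt, query, flags=re.IGNORECASE)
                )
    return variants


@dataclass
class Source:
    title: str
    text: str


def rerank(sources: list[Source], queries: list[str]) -> list[Source]:
    """Re-ranking: order sources by token overlap with any query variant."""
    query_tokens = set().union(*(tokenize(q) for q in queries))
    scored = [(len(tokenize(s.text) & query_tokens), s) for s in sources]
    scored = [pair for pair in scored if pair[0] > 0]  # drop irrelevant sources
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]


def summarize_in_context(source: Source, queries: list[str]) -> str:
    """Contextual summarization: keep the sentence most relevant to the query."""
    query_tokens = set().union(*(tokenize(q) for q in queries))
    sentences = [
        s.strip() for s in re.split(r"(?<=[.!?])\s+", source.text) if s.strip()
    ]
    return max(sentences, key=lambda s: len(tokenize(s) & query_tokens), default="")


def answer(query: str, sources: list[Source], top_k: int = 2) -> list[tuple[str, str]]:
    """Run the three stages and return (title, evidence sentence) pairs."""
    queries = expand_query(query)
    ranked = rerank(sources, queries)[:top_k]
    return [(s.title, summarize_in_context(s, queries)) for s in ranked]


if __name__ == "__main__":
    corpus = [
        Source("A", "Mice were housed in cages. The compound reduced cancer growth in mice."),
        Source("B", "Weather patterns shift yearly."),
        Source("C", "The drug was well tolerated."),
    ]
    for title, evidence in answer("Does the drug shrink the tumor?", corpus):
        print(title, "->", evidence)
```

In this toy corpus, source A shares no words with the original query and is only retrieved because the expanded variants mention "compound" and "cancer", which shows why expansion comes before ranking. The summarization step then reduces each retained source to the single sentence that bears on the question.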
Future House plans to expand the BixBench dataset, conduct performance comparisons against human researchers, and develop new models with improved scientific reasoning abilities. Its ultimate objective is to create an AI scientist capable of automating tasks such as hypothesis generation, experimental design, data analysis, and scientific writing. These systems are expected to significantly accelerate research processes in fields such as medicine, climate change, and emerging technologies.

Founding
Mission and Research Layers
Platform and AI Tools
Tool Performance and Criticism
Benchmarks and PaperQA2
Future Outlook
This article was generated with the support of artificial intelligence.