Arthur AI is an artificial intelligence platform that provides tools for evaluating, monitoring, securing, and improving the performance of AI models. The company aims to increase the auditability and reliability of several model types, particularly large language models (LLMs), natural language processing (NLP) models, computer vision (CV) models, and models built on tabular data. Through open-source tools, customizable security systems, and enterprise-grade observability features, the platform is intended to let organizations run their AI solutions securely and effectively in production environments.
Founding
Arthur AI was founded to help artificial intelligence systems operate more securely, transparently, and effectively in production environments. The company's founders include Adam Wenchel (CEO) and John Dickerson (Chief Scientist). John Dickerson is also a faculty member in the Computer Science Department at the University of Maryland and is known for his academic work at the intersection of artificial intelligence and economics. The company was established in response to growing corporate demand for transparency, performance, and security, especially for large language models and complex machine learning systems. It is headquartered in the United States, and its investors include venture capital firms such as Acrew, Greycroft, Index Ventures, Homebrew, Plex Capital, and AME Cloud Ventures. Since its establishment, the company has pursued a strategy focused on open-source product development, research-based innovation, and enterprise-grade AI solutions.
General Features and Products
The Arthur AI platform offers various features such as performance monitoring, data drift detection, explainability, bias reduction, real-time protection, model benchmarking, and chat interfaces. Some of the platform's main components include:
Arthur Engine
Arthur Engine, also referred to as the Arthur Evals Engine, is an open-source evaluation engine. It can be installed and run with Docker, and it evaluates AI models on multiple metrics such as accuracy, bias, fairness, and toxicity. Because it evaluates in real time, it can be used to observe model behavior in production environments. It also includes configurable protections against sensitive data leakage, hallucination, prompt injection, and toxic language generation.
Arthur Shield
Shield is a firewall developed for large language models. It sits between the application layer and the deployment layer and inspects both user inputs and model outputs. Compatible with providers such as OpenAI, Shield detects and blocks sensitive data leakage, hallucination, toxic output, and malicious prompt injections in real time. Because it is independent of any particular model or platform, it can be integrated into different infrastructures.
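The idea of a firewall that checks both the prompt and the response can be illustrated with a minimal sketch. This is not Arthur Shield's API or detection logic: the function names, the regular expression, and the phrase list below are hypothetical stand-ins chosen only to show where input and output checks sit relative to the model call.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical rules for illustration only; a production firewall would rely
# on far more robust detectors than a regex and a phrase list.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
INJECTION_PHRASES = ("ignore previous instructions", "disregard the system prompt")


@dataclass
class CheckResult:
    allowed: bool
    reasons: List[str] = field(default_factory=list)


def check_prompt(prompt: str) -> CheckResult:
    """Inspect user input before it reaches the model."""
    reasons = []
    lowered = prompt.lower()
    if any(phrase in lowered for phrase in INJECTION_PHRASES):
        reasons.append("possible prompt injection")
    if EMAIL_PATTERN.search(prompt):
        reasons.append("sensitive data (email address) in input")
    return CheckResult(allowed=not reasons, reasons=reasons)


def check_response(response: str) -> CheckResult:
    """Inspect model output before it is returned to the application."""
    reasons = []
    if EMAIL_PATTERN.search(response):
        reasons.append("sensitive data (email address) in output")
    return CheckResult(allowed=not reasons, reasons=reasons)


def guarded_call(prompt: str, model: Callable[[str], str]) -> str:
    """Run the input check, call the model, then run the output check."""
    pre = check_prompt(prompt)
    if not pre.allowed:
        return "Request blocked: " + "; ".join(pre.reasons)
    response = model(prompt)
    post = check_response(response)
    if not post.allowed:
        return "Response withheld: " + "; ".join(post.reasons)
    return response


def echo_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption for this sketch)."""
    return "You said: " + prompt
```

Because the model is passed in as a plain callable, the same wrapper could in principle sit in front of any provider, which mirrors the model and platform independence described above.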
Arthur Bench
Bench is an open-source tool for the comparative evaluation of large language models. It allows companies to analyze different LLM alternatives on criteria such as cost, privacy, and performance. Users can apply predefined metrics such as summarization quality and hallucination rate, or plug in their own custom metrics. The Bench interface visualizes model results so they can be compared side by side. Both local and cloud-based versions are available.
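Comparative evaluation with predefined and custom metrics can be sketched in a few lines. The example below does not use the Arthur Bench library; `exact_match`, `token_overlap`, and `compare_models` are hypothetical names, and the metrics are deliberately simple placeholders for richer ones such as summarization quality.

```python
from statistics import mean
from typing import Callable, Dict, List

# A metric maps (candidate output, reference output) to a score in [0, 1].
Metric = Callable[[str, str], float]


def exact_match(candidate: str, reference: str) -> float:
    """Built-in style metric: 1.0 when the texts match, ignoring case and outer whitespace."""
    return 1.0 if candidate.strip().lower() == reference.strip().lower() else 0.0


def token_overlap(candidate: str, reference: str) -> float:
    """Custom metric: fraction of reference tokens that also appear in the candidate."""
    ref_tokens = set(reference.lower().split())
    if not ref_tokens:
        return 0.0
    cand_tokens = set(candidate.lower().split())
    return len(ref_tokens & cand_tokens) / len(ref_tokens)


def compare_models(
    references: List[str],
    candidates_by_model: Dict[str, List[str]],
    metrics: Dict[str, Metric],
) -> Dict[str, Dict[str, float]]:
    """Average every metric over the test set for each model."""
    results: Dict[str, Dict[str, float]] = {}
    for model_name, candidates in candidates_by_model.items():
        if len(candidates) != len(references):
            raise ValueError(f"{model_name}: expected {len(references)} outputs")
        results[model_name] = {
            metric_name: mean(metric(c, r) for c, r in zip(candidates, references))
            for metric_name, metric in metrics.items()
        }
    return results


# Illustrative data only; these are not real model outputs.
references = ["Paris", "4"]
candidates_by_model = {
    "model_a": ["Paris", "5"],
    "model_b": ["paris", "4"],
}
scores = compare_models(
    references,
    candidates_by_model,
    {"exact_match": exact_match, "token_overlap": token_overlap},
)
```

Keeping each metric as an ordinary function with the same signature is one way to let user-defined metrics slot in beside the predefined ones without changing the comparison loop.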
Arthur Scope
Scope is a performance monitoring system developed for NLP, CV, LLM, and tabular model types. It is used to detect data drift and accuracy degradation, provide explainability, and assess fairness and bias in model outputs. Its real-time alerting system allows potential performance issues to be reported early. The platform's microservice architecture provides enterprise-grade scalability.
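Data drift detection can be made concrete with one common statistic, the Population Stability Index (PSI), which compares the binned distribution of a feature in production against a reference sample. The source does not state which drift measures Scope uses, so PSI is an assumption chosen for illustration, and the function names and bin count below are hypothetical.

```python
import bisect
import math
from typing import List, Sequence


def histogram_fractions(values: Sequence[float], edges: Sequence[float]) -> List[float]:
    """Fraction of values per bin; values outside the edges are clipped into the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = bisect.bisect_right(edges, v) - 1
        idx = min(max(idx, 0), len(counts) - 1)
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def population_stability_index(
    reference: Sequence[float],
    production: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    if not reference or not production:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if lo == hi:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    expected = histogram_fractions(reference, edges)
    actual = histogram_fractions(production, edges)
    psi = 0.0
    for e, a in zip(expected, actual):
        # Floor empty bins at eps so the logarithm stays defined.
        e, a = max(e, eps), max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


# Synthetic data for illustration: the production sample is shifted by 0.5.
reference_sample = [i / 100 for i in range(100)]
shifted_sample = [x + 0.5 for x in reference_sample]
psi_same = population_stability_index(reference_sample, reference_sample)
psi_shifted = population_stability_index(reference_sample, shifted_sample)
```

A widely used rule of thumb treats a PSI above roughly 0.2 as a sign of meaningful shift, although an alerting threshold would normally be tuned per feature.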
Arthur Chat
Chat allows organizations to build custom AI chat applications on top of their own documents and data. The system draws on user-specific data sources and integrates with Arthur Shield to provide a secure experience. With quick setup and customization options, Chat is designed to increase business productivity.
Model Types and Application Areas
Arthur AI has developed customized solutions for different model types:
Recommender Systems: It offers accuracy, data drift, and bias analysis for personalized recommendation engines. It includes explainability features for identifying the causes of errors through segment-based analysis and cause-and-effect relationships.
Tabular Models: It provides automatic anomaly detection, explainability, bias reduction, and performance visualization features for models using tabular data.
Computer Vision: It offers image region-based evaluation for explainability and error analysis in visual classification and object recognition applications. The detection of biases in visual data is also supported by the system.
Natural Language Processing: It provides monitoring of information extraction accuracy, data drift analysis, explainability techniques, and document content-based prediction explanations for NLP models.
Research and Development
Arthur AI follows a research-based approach in its product development process. The company's Chief Scientist, John Dickerson, conducts research at the intersection of artificial intelligence and economics. Under the Research Fellows Program conducted within Arthur AI, researchers from various universities contribute to projects on AI safety, fair modeling, and explainability. The company produces scientific publications on topics such as fair classification, counterfactual explanations, model behavior monitoring, and large language model evaluation methods.
Organization and Leadership
Adam Wenchel, co-founder and CEO of Arthur AI, leads the company's vision. John Dickerson provides scientific leadership, and experienced staff work across the engineering, product management, and customer support teams. The company's investors include venture capital firms such as Acrew, Greycroft, Index Ventures, Work Bench, Homebrew, and Plex Capital. Arthur AI is a holistic platform offering tools for monitoring, securing, explaining, and optimizing the performance of artificial intelligence models used in production environments. With its open-source components, scalable architecture, and solutions that are independent of model type, it supports enterprise AI applications across various sectors.
Future Vision
While offering solutions for the evaluation, monitoring, and security of artificial intelligence systems, Arthur AI bases its long-term strategy on developing a control infrastructure that covers the entire lifecycle of these systems. The company's focus areas include model observability, firewalls, explainability mechanisms, bias detection, and performance analytics. Future plans include enabling user communities to develop open-source components, allowing users to create custom metrics and analysis systems, and providing customizable monitoring solutions for industry-specific data structures. The company also aims to expand collaborations that help AI systems operate in compliance with regulations and ethical principles.