Open Source Research & Tools

Independent AI safety evaluation frameworks, alignment protocols, and governance tools for frontier model testing. The Human Mark classification system, AI Inspector browser extension, GyroDiagnostics evaluation suite, Alignment Infrastructure Routing for collective superintelligence, Moments Economy for transformative AI mitigation, and Gyroscopic Global Governance sandbox. Production-ready solutions for AI risk assessment, dangerous capability evaluations, AI pathology detection, and responsible AI development. All repositories are open source and actively maintained.

The Human Mark (THM)

AI Safety Framework

Formal classification system mapping all AI safety failures to four structural displacement risks: Governance Traceability (GTD), Information Variety (IVD), Inference Accountability (IAD), and Intelligence Integrity (IID). Machine-readable grammar grounded in evidence law, epistemology, and speech act theory. Applications include jailbreak testing, control evaluations, alignment detection, research funding, and regulatory compliance.

AI Safety Framework · Jailbreak Testing · Control Evaluations · Alignment Detection · Regulatory Compliance
View Repository
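
For illustration, a minimal Python sketch of how THM's four displacement risk codes could be represented and attached to a flagged output. The class and field names are hypothetical, not the repository's actual API, and the enum expansions assume the trailing "D" stands for "Displacement":

```python
from dataclasses import dataclass
from enum import Enum

class DisplacementRisk(Enum):
    """The four structural displacement risks defined by THM
    (expansions assume the trailing D = 'Displacement')."""
    GTD = "Governance Traceability Displacement"
    IVD = "Information Variety Displacement"
    IAD = "Inference Accountability Displacement"
    IID = "Intelligence Integrity Displacement"

@dataclass
class SafetyFinding:
    """One flagged failure mapped to a THM risk class (hypothetical schema)."""
    output_id: str
    risk: DisplacementRisk
    evidence: str  # excerpt or pointer supporting the classification

# Example: a jailbreak that obscures where an instruction came from
finding = SafetyFinding(
    output_id="resp-0042",  # hypothetical identifier
    risk=DisplacementRisk.GTD,
    evidence="Model presents injected instructions as user intent.",
)
print(finding.risk.name, "->", finding.risk.value)
```
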
🔍

AI Inspector Browser Extension

AI Output Evaluation & Governance

Transform AI outputs for evaluation, interpretability, and governance. Features gadgets for rapid testing, policy auditing, AI infection sanitization, content enhancement, and THM meta-evaluation. Includes an evaluation suite with a quality index, a superintelligence index, an alignment rate, and 20+ metrics. Local-first storage works with ChatGPT, Claude, and Gemini; no API keys required.

Browser Extension · AI Evaluation · Policy Auditing · Content Enhancement · Local-first
View Repository
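
As a sketch of how summary indices like these can be derived from per-metric scores, here is a hypothetical Python aggregation; the extension's real metric names, weights, and thresholds live in the repository:

```python
# Hypothetical aggregation: combine per-metric scores (each in [0, 1])
# into a quality index and an alignment rate.
def quality_index(scores: dict[str, float]) -> float:
    """Unweighted mean across all metrics (illustrative weighting)."""
    return sum(scores.values()) / len(scores)

def alignment_rate(scores: dict[str, float], threshold: float = 0.7) -> float:
    """Fraction of metrics meeting a pass threshold (threshold assumed)."""
    passing = sum(1 for s in scores.values() if s >= threshold)
    return passing / len(scores)

sample = {"traceability": 0.82, "accountability": 0.74, "integrity": 0.61}
print(f"quality index:  {quality_index(sample):.2f}")
print(f"alignment rate: {alignment_rate(sample):.2f}")
```
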
🌟

GyroDiagnostics

AI Safety Evaluation Framework

Independent AI testing framework for frontier model safety evaluation and dangerous capability assessments. Detects AI pathologies including deceptive alignment, hallucination, sycophancy, goal drift, and semantic instability through diagnostics grounded in mathematical physics. Enables third-party AI evaluation and AI risk assessment with five targeted challenges and a 20-metric quantitative analysis. First framework to operationalize superintelligence measurement from axiomatic principles.

AI Safety Evaluation · Pathology Detection · Risk Assessment · Frontier Models
View Repository
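
To make the shape of a run concrete, here is a sketch of the five-challenge, multi-metric result layout with a cross-challenge summary; the challenge and metric names are placeholders, not the suite's actual identifiers:

```python
from statistics import mean

# Placeholder names for the suite's five targeted challenges.
CHALLENGES = ["challenge_1", "challenge_2", "challenge_3",
              "challenge_4", "challenge_5"]

def summarize(results: dict[str, dict[str, float]]) -> dict[str, float]:
    """Average each metric across all five challenges."""
    metrics = results[CHALLENGES[0]].keys()
    return {m: mean(results[c][m] for c in CHALLENGES) for m in metrics}

# Toy scores for three pathology-oriented metrics (lower = better here).
run = {c: {"hallucination": 0.10, "sycophancy": 0.20, "goal_drift": 0.05}
       for c in CHALLENGES}
print(summarize(run))
```
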
🍃

Alignment Infrastructure Routing (AIR)

Collective Superintelligence Architecture

Coordination infrastructure that amplifies human potential alongside AI. Routes workforce capacity, funding, and safety tasks into a unified, verifiable history. Connects three critical groups to build collective superintelligence: labs, which gain scaling without chaos; funders, which gain portfolio risk visibility; and everyone else, who earn paid, verifiable contribution units. Treats AI as part of a collective network, ensuring human agency scales with the systems it joins.

Collective Superintelligence · Workforce Routing · Safety Tasks · Human-AI Integration · Coordination Infrastructure
View Repository
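
A toy sketch of what a "unified, verifiable history" can look like in code: a hash-chained ledger where each contribution record commits to its predecessor. AIR's actual record format and verification scheme are defined in the repository; everything below is illustrative:

```python
import hashlib
import json

def append(chain: list[dict], record: dict) -> list[dict]:
    """Append a record that commits to the hash of its predecessor."""
    prev = chain[-1]["hash"] if chain else "genesis"
    payload = json.dumps({"prev": prev, **record}, sort_keys=True)
    entry = {"prev": prev, **record,
             "hash": hashlib.sha256(payload.encode()).hexdigest()}
    return chain + [entry]

# Hypothetical contribution units routed into one shared history.
history: list[dict] = []
history = append(history, {"kind": "safety_task", "worker": "alice", "units": 3})
history = append(history, {"kind": "funding", "funder": "lab-a", "units": 10})
print(history[-1]["hash"][:16], "commits to", history[-1]["prev"][:16])
```
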
💰

Moments Economy

Mitigating Risks of Transformative AI (TAI)

Monetary system grounded in physical capacity rather than debt, using the caesium-133 atomic standard to quantify physical states. Provides unconditional high income (UHI) as a baseline for everyone, with four tiers up to 60× UHI for higher-responsibility roles. Supports both monetary distribution and complete governance records, including scientific research provenance, AI model auditing, supply chain traceability, and personal consent tracking. The design aims to make adversarial manipulation operationally impossible.

Transformative AI · Physical Capacity · Unconditional Income · Governance Records · Monetary System
View Repository
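
A worked example of the tier arithmetic described above. Only the four-tier structure topping out at 60× UHI comes from the text; the baseline amount and the intermediate multipliers are assumptions for illustration:

```python
UHI = 1.0  # baseline unconditional high income, arbitrary units
TIER_MULTIPLIERS = [5, 20, 40, 60]  # intermediates assumed; 60x from the text

print("baseline: 1x UHI for everyone")
for tier, mult in enumerate(TIER_MULTIPLIERS, start=1):
    print(f"tier {tier}: {mult * UHI:.0f}x UHI")
```
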
🌐

Gyroscopic Global Governance (GGG)

Post-AGI Multi-domain Governance Sandbox

Models how human-AI systems align across Economy, Employment, Education, and Ecology, showing robust convergence to a stable equilibrium under seven coordination strategies. Demonstrates that poverty resolves through coherent surplus distribution, unemployment becomes alignment work rather than residual labour, miseducation shifts toward epistemic literacy, and ecological degradation appears as upstream displacement, not external constraint.

Post-AGI Governance · Multi-domain Modeling · Economic Equilibrium · Alignment Strategies · Governance Simulation
View Repository
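
A toy dynamical sketch of the convergence claim: treat each domain's misalignment as a scalar that a coordination strategy contracts toward zero at every step. The four domains come from the text; the dynamics and the gain parameter are purely illustrative stand-ins for the repository's models:

```python
# Misalignment per domain (arbitrary starting values).
domains = {"economy": 0.9, "employment": 0.7, "education": 0.8, "ecology": 0.6}
GAIN = 0.5  # stand-in for one of the seven coordination strategies

for _ in range(10):
    # Each step, coordination removes a fixed fraction of the misalignment.
    domains = {d: (1 - GAIN) * x for d, x in domains.items()}

# All values approach 0: the stable equilibrium of this toy map.
print({d: round(x, 4) for d, x in domains.items()})
```
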
⚙️

Gyroscope Protocol

LLM Alignment Protocol

AI alignment protocol implementing scalable oversight and AI control mechanisms for responsible AI development. Delivers measured quality gains of +32.9% for ChatGPT and +37.7% for Claude Sonnet. Enhances structural reasoning, AI accountability, AI traceability, and behavioral integrity without model retraining. Addresses AI misalignment through a systematic approach to AI governance and transparency metrics. Works with any foundation model, including large language models and AI agents.

LLM Alignment · AI Control · Scalable Oversight · Safety Protocol
View Repository
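
Because the protocol works without retraining, it can be sketched as a wrapper around any model call. The directive text and function names below are placeholders, not the protocol's actual wording:

```python
from typing import Callable

# Placeholder directives; the real protocol text lives in the repository.
PROTOCOL_PREAMBLE = (
    "Follow the reasoning protocol: state assumptions, trace inferences, "
    "and flag uncertainty before answering."
)

def with_protocol(call_model: Callable[[str], str], prompt: str) -> str:
    """Prepend protocol directives to the prompt; no weights are changed."""
    return call_model(f"{PROTOCOL_PREAMBLE}\n\n{prompt}")

def dummy_model(prompt: str) -> str:
    """Stand-in for any foundation-model API."""
    return f"[model saw {len(prompt)} chars]"

print(with_protocol(dummy_model, "Summarize the risks of goal drift."))
```
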

Gyroscopic Alignment Research Lab

Mathematical Physics Foundations

AI alignment theory grounded in mathematical physics and gyroscopic dynamics for structural AI alignment research. Explores mechanistic interpretability, AI value alignment, and quantitative AI safety metrics from first principles. Provides theoretical foundations for understanding the AI control problem, catastrophic AI risks, and alignment challenges in complex intelligent systems. Advances AI safety science through physics-informed approaches to stability, coherence, and temporal dynamics.

AI Alignment Theory · Mathematical Physics · Mechanistic Interpretability · Safety Science
View Repository
❤️

Gyroscopic Alignment Models Lab

Artificial Superintelligence Architecture (ASI/AGI)

AGI safety research and superintelligence alignment architectures addressing fundamental challenges in artificial general intelligence development. Explores AI control problem solutions, AI value alignment frameworks, and mechanisms for safe superintelligence by design. Addresses coherence degradation, AI autonomy risks, and behavioral alignment in advanced AI systems. Develops AI governance tools and safety frameworks that prioritize AI transparency, human values, and responsible AI development for transformative AI.

AGI Safety · Superintelligence Alignment · AI Control Problem · Advanced AI
View Repository

Contribute to AI Safety Research

All repositories welcome contributions. Whether you're a researcher, developer, or AI safety enthusiast, your insights and code contributions help advance the field of AI alignment and governance.

AI Safety Frameworks, Alignment Tools & Governance Solutions

Gyro Governance develops comprehensive open source AI safety frameworks, AI alignment protocols, and AI governance tools for frontier model testing, dangerous capability assessments, and AI pathology detection. Our repositories include The Human Mark classification system, AI Inspector browser extension, GyroDiagnostics evaluation suite, Alignment Infrastructure Routing for collective superintelligence, Moments Economy for transformative AI mitigation, and Gyroscopic Global Governance sandbox. Production-ready solutions for AI risk assessment, AI safety evaluation, and responsible AI development.

AI Safety Evaluation & Risk Assessment

  • AI Pathology Detection: Identify AI hallucination, AI sycophancy, deceptive AI alignment, AI goal drift, and AI semantic drift through structural diagnostics (a toy illustration follows this list)
  • Dangerous Capability Evaluations: Assess AI scheming, AI autonomy risks, and potential for catastrophic failure in large language models (LLMs) and frontier models
  • AI Alignment Metrics: Measure structural AI alignment, behavioral integrity, and AI transparency using physics-informed quantitative methods
  • Third-Party AI Evaluation: External AI evaluation framework enabling democratic AI evaluation and independent AI testing by researchers worldwide
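
The toy illustration referenced in the first bullet: a crude lexical proxy for semantic drift across a dialogue. The frameworks above use physics-informed structural diagnostics, not this heuristic; it only shows the general shape of a drift score:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Vocabulary overlap between two turns (1.0 = identical sets)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def drift_score(turns: list[str]) -> float:
    """1 - overlap between first and last turn; higher means more drift."""
    first = set(turns[0].lower().split())
    last = set(turns[-1].lower().split())
    return 1.0 - jaccard(first, last)

dialogue = [
    "the audit traces every inference step",
    "the audit traces every inference step carefully",
    "unrelated musings about weather and sports",
]
print(f"drift score: {drift_score(dialogue):.2f}")
```
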

Collective Superintelligence & Transformative AI

Alignment Infrastructure Routing (AIR) provides coordination infrastructure that amplifies human potential alongside AI, routing workforce capacity, funding, and safety tasks into a unified, verifiable history. The Moments Economy implements a monetary system grounded in physical capacity rather than debt, using the caesium-133 atomic standard, with unconditional high income (UHI) and complete governance records. Together these address transformative AI risks while preserving human authority and accountability.

Post-AGI Multi-domain Governance

Gyroscopic Global Governance (GGG) models how human-AI systems align across Economy, Employment, Education, and Ecology, demonstrating robust convergence to stable equilibrium under seven coordination strategies. Shows that poverty resolves through coherent surplus distribution, unemployment becomes alignment work, miseducation shifts toward epistemic literacy, and ecological degradation appears as upstream displacement.

LLM Alignment & AI Control Mechanisms

Our AI alignment protocol addresses core challenges in AI safety governance by providing AI control mechanisms that improve AI accountability, traceability, and responsible AI development. The Gyroscope protocol demonstrates measured improvements in AI model evaluation across leading foundation models, enhancing scalable oversight and reducing risks of superficial AI optimization.

AGI Safety & Superintelligence Research

Our research addresses AGI safety and superintelligence alignment through mechanistic interpretability, AI safety theory, and gyroscopic physics foundations. We explore AI control problem solutions, AI value alignment frameworks, and architectures for safe artificial general intelligence (AGI) development that prioritize AI safety governance and human values.

For AI Safety Researchers & Developers

These repositories serve AI safety researchers, AI evaluators, machine learning engineers, and organizations implementing AI risk assessment and AI safety testing. Each project provides comprehensive documentation, AI safety benchmarks, and practical implementation guides for AI red teaming, AI safety audits, and continuous AI safety monitoring. Contributions welcome from researchers working on AI alignment research, AI safety frameworks, and AI governance solutions.

Open Source AI Safety Commitment

All tools support AI safety transparency, AI whistleblower protection, and AI public benefit goals. Our open-weight models approach enables an AI safety culture through independent review, third-party oversight, and community-driven AI safety best practices. Mathematical physics foundations ensure structural coherence, gyroscopic stability, and quantitative rigor in all implementations.