How It Works

Transparency in action — Learn how we collect, validate, and publish AI environmental data from public sources to create the most comprehensive sustainability ranking.

100% Transparent Process
Open Source at Launch - Q1 2026

Our Mission

GreenCodeAI is an activist transparency platform — the 'Greenpeace of AI' — that exposes the real environmental impact of artificial intelligence. We believe that transparency drives change, and that every consumer, investor, and journalist should have access to verified data about AI companies' environmental footprint.

100% Independent

No corporate sponsorships from ranked companies. No conflicts of interest. We accept no funding from AI vendors or cloud providers.

Open Source

All our code, algorithms, and data sources will be public at launch (Q1 2026). Anyone will be able to audit, contribute, or replicate our work.

Data Integrity

We only use verifiable public data: ESG reports, scientific papers, government databases, and verified APIs. Any estimate we publish carries a clear disclaimer.

How We Collect Data

A fully automated and transparent pipeline

Our data collection process combines multiple sources to create the most comprehensive view of each AI company's environmental impact. Everything is automated, scheduled, and version-controlled.

Our Data Sources

1. Corporate ESG Reports

We scrape sustainability reports published by companies (Google, Microsoft, Meta, Amazon, etc.)

Examples: Google Environmental Report, Microsoft Sustainability Report, Meta Sustainability Report

Updated: Annually (when companies publish new reports)

2. Public APIs

We integrate real-time data from verified environmental APIs

  • Electricity Maps API — Real-time grid carbon intensity by region (gCO₂/kWh)
  • Google Cloud Carbon Footprint — Public carbon data for GCP datacenters
  • ML CO2 Impact — Estimated emissions for AI models based on scientific research
  • Company APIs — Some companies (like Anthropic, Hugging Face) provide public metrics
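As a rough illustration of how a grid carbon intensity figure (gCO₂/kWh) can be turned into an emissions estimate, the sketch below multiplies energy use by intensity. The function name and the sample numbers are hypothetical and are not taken from the GreenCodeAI codebase or from any API response.

```python
def estimate_emissions_kg(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Estimate emissions in kg CO2 from energy use and grid carbon intensity."""
    if energy_kwh < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and intensity must be non-negative")
    # gCO2/kWh * kWh gives grams; divide by 1000 to get kilograms.
    return energy_kwh * intensity_g_per_kwh / 1000.0


# Illustrative inputs only (not measured data): 1,200 kWh on a 350 gCO2/kWh grid.
print(estimate_emissions_kg(1200.0, 350.0))  # 420.0
```

The same workload run on a lower-intensity grid yields a proportionally lower estimate, which is why regional intensity data matters.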

3. Scientific Research

Peer-reviewed studies and academic papers about AI energy consumption

  • "Energy and Policy Considerations for Deep Learning in NLP" (Strubell et al., 2019)
  • "Carbon Emissions and Large Neural Network Training" (Patterson et al., 2021)
  • University research labs (Stanford HAI, MIT Climate, Berkeley AI Research)

4. Manual Verification

When data is incomplete, our team verifies it manually using:

  • Press releases and company blog posts
  • Investor presentations (ESG sections)
  • Regulatory filings (CSRD compliance in EU)
  • Datacenter specifications (PUE, WUE, renewable %)
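The datacenter metrics named above follow standard definitions: PUE is total facility energy divided by IT equipment energy, and WUE is site water use in liters divided by IT equipment energy in kWh. The sketch below computes them from annual totals. The class and field names are hypothetical, the renewable share is assumed here to be renewable energy as a percentage of total facility energy, and the sample figures are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatacenterYear:
    """Annual totals for one datacenter (hypothetical structure)."""
    total_facility_kwh: float
    it_equipment_kwh: float
    water_liters: float
    renewable_kwh: float

    def pue(self) -> float:
        # Power Usage Effectiveness: 1.0 would mean zero overhead.
        return self.total_facility_kwh / self.it_equipment_kwh

    def wue(self) -> float:
        # Water Usage Effectiveness, in liters per kWh of IT energy.
        return self.water_liters / self.it_equipment_kwh

    def renewable_pct(self) -> float:
        # Assumption: share of total facility energy covered by renewables.
        return 100.0 * self.renewable_kwh / self.total_facility_kwh


# Illustrative numbers only.
dc = DatacenterYear(1_200_000, 1_000_000, 1_800_000, 900_000)
print(dc.pue(), dc.wue(), dc.renewable_pct())
```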

The Data Pipeline

From raw data to rankings

Our automated ETL (Extract, Transform, Load) pipeline runs on a weekly schedule, ensuring our rankings reflect the latest available data.

01

Scraping & Extraction

Every Monday at 3:00 AM UTC, our Python scraping bots extract data from corporate reports, press releases, and research papers.

Tech: Scrapy (web scraping) + Playwright (JavaScript-heavy pages) + Pydantic (data validation)
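For readers who want to reason about the schedule, here is a minimal sketch that computes the next Monday 03:00 UTC run from a given moment. It uses only the standard library; the function name is hypothetical, and the actual scheduler used by the pipeline is not specified in this page.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now: datetime) -> datetime:
    """Return the next Monday 03:00 UTC strictly after `now` (timezone-aware)."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=3, minute=0, second=0, microsecond=0)
    # weekday() is 0 for Monday; move forward to the coming Monday.
    candidate += timedelta(days=(0 - candidate.weekday()) % 7)
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


print(next_weekly_run(datetime(2026, 1, 7, 12, 0, tzinfo=timezone.utc)))
```

In cron syntax the same schedule would be written `0 3 * * 1` on a host whose clock is set to UTC.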

02

API Integration

We fetch the latest available data from Electricity Maps, Google Cloud Carbon Footprint, and other verified APIs. Carbon intensity data is refreshed daily.

Tech: REST APIs + rate limiting + caching (5-minute intervals)
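One way to read "caching (5-minute intervals)" is a time-to-live cache that reuses an API response for 300 seconds before fetching again. The sketch below shows that idea under this assumption; the class name, the key format, and the injected clock are hypothetical, and no real API is called.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Reuse a fetched value until it is older than `ttl_seconds`."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the cache can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the upstream call
        value = fetch()
        self._store[key] = (now, value)
        return value


cache = TTLCache()
# `lambda: 56.0` stands in for a real HTTP request to a carbon intensity API.
print(cache.get_or_fetch("zone:FR", lambda: 56.0))
```

Caching like this also helps stay within upstream rate limits, since repeated lookups for the same region do not trigger new requests.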

03

Data Validation

All collected data passes through validation rules that detect outliers, cross-reference sources, and flag missing data.

  • ✓ Cross-reference with 2+ sources when possible
  • ✓ Flag values outside expected ranges (outliers)
  • ✓ Verify datacenter locations via geolocation APIs
  • ✓ Timestamp all data points for historical tracking
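The checks above can be sketched as a single function that inspects all readings of one metric for one company. This is an illustrative stand-in written with dataclasses rather than the pipeline's Pydantic models; the flag names, the 10% disagreement tolerance, and the expected range passed in by the caller are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Reading:
    source: str             # e.g. "esg_report", "press_release"
    value: Optional[float]  # None when the source did not report the metric


def validate_metric(readings: List[Reading],
                    expected_range: Tuple[float, float],
                    tolerance: float = 0.10) -> Dict[str, object]:
    """Return validation flags plus a UTC timestamp for one metric."""
    low, high = expected_range
    flags: List[str] = []
    present = [r for r in readings if r.value is not None]
    if len(present) < len(readings):
        flags.append("missing_value")
    if any(not (low <= r.value <= high) for r in present):
        flags.append("outlier")
    if len({r.source for r in present}) < 2:
        flags.append("single_source")  # cannot cross-reference
    else:
        values = [r.value for r in present]
        # Sources disagree when their spread exceeds the relative tolerance.
        if max(values) - min(values) > tolerance * max(abs(v) for v in values):
            flags.append("sources_disagree")
    return {"flags": flags, "checked_at": datetime.now(timezone.utc).isoformat()}


result = validate_metric(
    [Reading("esg_report", 1.10), Reading("press_release", 1.12)],
    expected_range=(1.0, 3.0),  # hypothetical plausible range for PUE
)
print(result["flags"])  # []
```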

04

Score Calculation

We apply our open methodology to calculate the Environmental Impact Score (EIS) for each company. All algorithms will be public on GitHub at launch (Q1 2026).

See detailed methodology

05

Storage & Versioning

Processed data is stored in PostgreSQL (relational data) with full version history. Every change is tracked with timestamps.

Tech: Supabase PostgreSQL + Time-series optimization
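A common way to keep full version history is an append-only table: every update inserts a new timestamped row instead of overwriting the old one. The sketch below shows the pattern with Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The table and column names are hypothetical and are not the project's actual schema.

```python
import sqlite3
from datetime import datetime, timezone
from typing import List, Optional


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with an append-only version table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE metric_versions (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               company TEXT NOT NULL,
               metric TEXT NOT NULL,
               value REAL NOT NULL,
               recorded_at TEXT NOT NULL)"""
    )
    return conn


def record(conn: sqlite3.Connection, company: str, metric: str, value: float) -> None:
    # Never UPDATE: each change becomes a new row with its own timestamp.
    conn.execute(
        "INSERT INTO metric_versions (company, metric, value, recorded_at) "
        "VALUES (?, ?, ?, ?)",
        (company, metric, value, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def latest(conn: sqlite3.Connection, company: str, metric: str) -> Optional[float]:
    row = conn.execute(
        "SELECT value FROM metric_versions WHERE company = ? AND metric = ? "
        "ORDER BY id DESC LIMIT 1",
        (company, metric),
    ).fetchone()
    return None if row is None else row[0]


def history(conn: sqlite3.Connection, company: str, metric: str) -> List[float]:
    rows = conn.execute(
        "SELECT value FROM metric_versions WHERE company = ? AND metric = ? "
        "ORDER BY id",
        (company, metric),
    )
    return [r[0] for r in rows]
```

With this layout, the current ranking reads the latest row per metric, while historical charts read the whole sequence.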

06

Publication

Updated rankings are automatically published to the website. Our public API makes all data available to journalists, researchers, and developers.

Tech: Next.js ISR (Incremental Static Regeneration) + Public REST API

Update Schedule

Our data is refreshed automatically:

Daily

Carbon intensity updates (Electricity Maps API)

Weekly

Web scraping of news, blog posts, press releases

Monthly

Full ranking recalculation with new data

Quarterly

ESG report ingestion (as companies publish new reports) + methodology review

Radical Transparency

Every step is auditable

Unlike proprietary ranking systems, we make everything public. Our commitment to transparency is non-negotiable.

Open Data

All our processed data will be available via:

  • Public API (free, rate-limited to 100 req/hour)
  • CSV/JSON exports (download raw rankings)
  • Open datasets (historical data available at launch)
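To make the 100 requests per hour limit concrete, here is a minimal sliding-window limiter that tracks request times per client. It is a sketch of one possible approach, not the API's actual implementation; the class name, the per-client keying, and the injected clock are assumptions.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict


class HourlyRateLimiter:
    """Allow at most `limit` requests per client within a sliding window."""

    def __init__(self, limit: int = 100, window_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._calls: Dict[str, Deque[float]] = {}

    def allow(self, client_id: str) -> bool:
        now = self._clock()
        calls = self._calls.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self._window:
            calls.popleft()
        if len(calls) >= self._limit:
            return False  # over the limit: the caller should retry later
        calls.append(now)
        return True


limiter = HourlyRateLimiter()
print(limiter.allow("demo-client"))  # True
```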

Open Algorithms

Our scoring methodology is fully documented:

  • Methodology page — Explains EIS formula in detail
  • Source code — All calculations will be public at launch (Q1 2026)
  • Peer review — We welcome scientific feedback

Community Corrections

We welcome corrections from anyone:

  • Contact form — Report data errors or outdated info
  • Submit Data form — Companies can submit official metrics
  • Community contributions — Open at launch (Q1 2026)

No Conflicts of Interest

We maintain complete independence:

  • ❌ No funding from ranked AI companies
  • ❌ No cloud provider sponsorships
  • ❌ No pay-to-improve or pay-to-remove options
  • ✓ Funded by donations, ESG-aligned partners, and paid audits

Our Impact

How transparency drives change

By exposing environmental data, we empower multiple audiences to make informed decisions and pressure companies to improve.

Consumers & Developers

Choose greener AI providers

  • Prefer companies with high EIS scores
  • Avoid blacklisted companies with poor transparency
  • Support open-source models hosted on renewable energy

ESG Investors

Assess sustainability risks

  • Integrate EIS scores into investment decisions
  • Demand better environmental reporting from portfolio companies
  • Benchmark companies against industry averages

Journalists & Media

Investigate greenwashing

  • Expose companies making false sustainability claims
  • Compare public commitments vs. real metrics
  • Use our data in investigative journalism

Policymakers & Regulators

Inform AI sustainability policies

  • Reference our data in climate legislation
  • Mandate environmental reporting requirements
  • Establish industry benchmarks for AI emissions

Green List & Blacklist

We amplify the best and expose the worst:

Green List

Companies that excel in transparency and sustainability

Criteria: 100% renewable energy, PUE <1.2, quarterly public reporting, verified certifications

Blacklist

Companies with poor transparency or high environmental impact

Criteria: No public data for more than 12 months, high-carbon datacenters without offsets, water stress violations
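The two sets of criteria can be expressed as a simple classification rule. The sketch below assumes that a company must meet all Green List criteria to be listed, that any single Blacklist criterion is enough for blacklisting, and that "12 months" can be approximated as 365 days. These readings, along with the class and field names, are assumptions and not the published methodology.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CompanyProfile:
    """Inputs to the list decision (hypothetical structure)."""
    renewable_pct: float
    pue: float
    reports_quarterly: bool
    verified_certifications: bool
    last_public_data: Optional[date]  # None if nothing was ever published
    high_carbon_without_offsets: bool = False
    water_stress_violation: bool = False


def classify(p: CompanyProfile, today: date) -> str:
    """Return "blacklist", "green_list", or "unlisted"."""
    # Assumption: 12 months is approximated as 365 days.
    stale = p.last_public_data is None or (today - p.last_public_data).days > 365
    if stale or p.high_carbon_without_offsets or p.water_stress_violation:
        return "blacklist"
    if (p.renewable_pct >= 100.0 and p.pue < 1.2
            and p.reports_quarterly and p.verified_certifications):
        return "green_list"
    return "unlisted"


# Illustrative profile, not a real company.
example = CompanyProfile(100.0, 1.15, True, True, date(2026, 1, 15))
print(classify(example, today=date(2026, 3, 1)))  # green_list
```

Checking the Blacklist conditions first means a company with strong efficiency figures but stale data is still flagged, which matches the emphasis on transparency.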

Join the Movement

Every AI request has a carbon cost. It's time to make that cost visible.

💚 Code will be open source at public launch (Q1 2026)