Governing the Gold Rush: Visualizing AI Policy vs. Private Investment#

Project 2: Working Across Datasets#

Hello, it’s Wuhao here! Welcome to my Project 2 notebook. The goal of this assignment is to take two different datasets, combine them in Python, and create a single visualization that shows a relationship.

For my project, I wanted to explore a topic I’m passionate about: AI Governance.

My research question is: Globally, is the adoption of national AI policies growing at the same rate as private investment in AI?

To answer this, I’ll be combining two world-class datasets:

Dataset 1: The OECD AI Policy Observatory.

Dataset 2: Our World in Data (from Stanford AI Index).

Let’s get started!

Part 1: Loading & Cleaning Dataset 1 (AI Policies)#

First, I’ll load the data on AI policies from the OECD.

OECD.AI is an online interactive platform dedicated to promoting trustworthy, human-centric artificial intelligence (AI). Launched by the Organisation for Economic Co-operation and Development in 2020, the Observatory is an essential resource for policymakers, researchers, businesses, and civil society, offering a comprehensive view of global AI initiatives, trends, and governance frameworks.

Source: OECD.AI Database of National AI Policies

import plotly.io as pio

pio.renderers.default = "notebook_connected+plotly_mimetype"

import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

policies_df = pd.read_csv("oecd-ai-all-ai-policies.csv", encoding='utf-8', encoding_errors='ignore')

print("OECD AI Policy Raw Data")
policies_df.info()
policies_df.head()
OECD AI Policy Raw Data
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1884 entries, 0 to 1883
Data columns (total 52 columns):
 #   Column                                                                                                                          Non-Null Count  Dtype  
---  ------                                                                                                                          --------------  -----  
 0   Policy initiative ID                                                                                                            1884 non-null   object 
 1   Platform URL                                                                                                                    1884 non-null   object 
 2   English name                                                                                                                    1883 non-null   object 
 3   Original name(s)                                                                                                                985 non-null    object 
 4   Acronym                                                                                                                         529 non-null    object 
 5   Country                                                                                                                         1884 non-null   object 
 6   Start date                                                                                                                      1830 non-null   float64
 7   End date                                                                                                                        466 non-null    float64
 8   Description                                                                                                                     1854 non-null   object 
 9   Theme area(s)                                                                                                                   1884 non-null   object 
 10  Theme(s)                                                                                                                        1884 non-null   object 
 11  Background                                                                                                                      1151 non-null   object 
 12  Objective(s)                                                                                                                    1831 non-null   object 
 13  Target group type(s)                                                                                                            1826 non-null   object 
 14  Target group(s)                                                                                                                 1826 non-null   object 
 15  Responsible organisation(s)                                                                                                     1793 non-null   object 
 16  Yearly budget range                                                                                                             1884 non-null   object 
 17  Budget amount (in local currency)                                                                                               5 non-null      float64
 18  Has funding from private sector ?                                                                                               1884 non-null   bool   
 19  Public access URL                                                                                                               1546 non-null   object 
 20  Is a structural reform ?                                                                                                        1884 non-null   bool   
 21  Is evaluated ?                                                                                                                  1884 non-null   bool   
 22  Evaluation URL                                                                                                                  83 non-null     object 
 23  AI Principle(s)                                                                                                                 1745 non-null   object 
 24  AI Policy Area(s)                                                                                                               1555 non-null   object 
 25  Other AI Policy Area(s)                                                                                                         18 non-null     object 
 26  Shift(s) related to Covid                                                                                                       36 non-null     object 
 27  Evaluation performed by                                                                                                         71 non-null     object 
 28  Evaluation type                                                                                                                 69 non-null     object 
 29  Evaluation provides input to                                                                                                    58 non-null     object 
 30  Policy instrument ID                                                                                                            1884 non-null   object 
 31  Policy instrument type category                                                                                                 1837 non-null   object 
 32  Policy instrument type                                                                                                          1837 non-null   object 
 33  Policy instrument name                                                                                                          859 non-null    object 
 34  Policy instrument description(s)                                                                                                500 non-null    object 
 35  Strategy priority targets and deadlines                                                                                         48 non-null     object 
 36  Coordinating institution name                                                                                                   20 non-null     object 
 37  Consultation process objective                                                                                                  15 non-null     object 
 38  Consultation process begin date                                                                                                 18 non-null     object 
 39  Consultation process end date                                                                                                   12 non-null     object 
 40  Link                                                                                                                            431 non-null    object 
 41  Policy instrument mini-field(s)                                                                                                 1351 non-null   object 
 42  Objective                                                                                                                       70 non-null     object 
 43  Deployment year                                                                                                                 48 non-null     float64
 44  Cancellation reason                                                                                                             9 non-null      object 
 45  Entities involvement                                                                                                            23 non-null     object 
 46  Allocated funding                                                                                                               13 non-null     float64
 47  Methodology in place to assess the risk and evaluate the impact of AI in public services                                        5 non-null      object 
 48  Measures taken to communicate the use of the AI system to citizens (transparency)                                               23 non-null     object 
 49  Measures taken to enable citizens to understand and challenge the outcome of the AI system (explainability and accountability)  4 non-null      object 
 50  Audit, certification, monitoring, evaluation or regulation process                                                              12 non-null     object 
 51  Entered into force on                                                                                                           42 non-null     object 
dtypes: bool(3), float64(5), object(44)
memory usage: 726.9+ KB
Policy initiative ID Platform URL English name Original name(s) Acronym Country Start date End date Description Theme area(s) ... Objective Deployment year Cancellation reason Entities involvement Allocated funding Methodology in place to assess the risk and evaluate the impact of AI in public services Measures taken to communicate the use of the AI system to citizens (transparency) Measures taken to enable citizens to understand and challenge the outcome of the AI system (explainability and accountability) Audit, certification, monitoring, evaluation or regulation process Entered into force on
0 2021/data/policyInitiatives/1335 https://oecd.ai/en/dashboards/policy-initiativ... SPACERESOURCES.LU NaN NaN Luxembourg 2016.0 NaN Within the SpaceResources.lu initiative, the c... National AI Policies ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 2021/data/policyInitiatives/1337 https://oecd.ai/en/dashboards/policy-initiativ... DIGITAL LUXEMBOURG Digital L??tzebuerg NaN Luxembourg 2014.0 NaN Consolidating Luxembourgs position in the ICT ... National AI Policies ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 2021/data/policyInitiatives/1337 https://oecd.ai/en/dashboards/policy-initiativ... DIGITAL LUXEMBOURG Digital L??tzebuerg NaN Luxembourg 2014.0 NaN Consolidating Luxembourgs position in the ICT ... National AI Policies ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 2021/data/policyInitiatives/1355 https://oecd.ai/en/dashboards/policy-initiativ... DIGITAL TECH FUND NaN NaN Luxembourg 2016.0 NaN A seed fund was set up in 2016 jointly by the ... National AI Policies ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 2021/data/policyInitiatives/13968 https://oecd.ai/en/dashboards/policy-initiativ... GAMEINN NaN NaN Poland 2016.0 NaN Funding opportunities for the producers of vid... National AI Policies ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 52 columns

Cleaning the Policy Data#

The raw data is great, but info() shows that Start date is stored as float64 and is missing in 54 of the 1,884 rows (1830 non-null), so it needs cleaning.

My goal is to get a global, cumulative count of policies by year.

To get there, I’ll take the following steps:

  1. Drop any rows where the Start date is missing.

  2. Rename Start date to Year for simplicity.

  3. Convert Year to an integer.

  4. Filter for the modern AI era (2015 onwards).

  5. Group by Year and count policies.

  6. Calculate the cumsum() (cumulative sum).

# We'll drop rows without a 'Start date'.
policies_cleaned = policies_df.dropna(subset=['Start date']).copy()

# Rename 'Start date' to 'Year' for better understanding
policies_cleaned = policies_cleaned.rename(columns={'Start date': 'Year'})

# Convert 'Year' to integer
policies_cleaned['Year'] = policies_cleaned['Year'].astype(int)

# Check the policy counts for the 10 most recent years.
print("Years present in data:\n", policies_cleaned['Year'].value_counts().sort_index().tail(10))
Years present in data:
 2015     24
2016     56
2017     85
2018    299
2019    397
2020    363
2021    263
2022    119
2023    100
2024      9
Name: Year, dtype: int64
# Filter for the modern AI era (2015 onwards)
policies_modern = policies_cleaned[policies_cleaned['Year'] >= 2015].copy()

# Group by year and count policies
policies_by_year = policies_modern.groupby('Year')['Policy initiative ID'].count().reset_index()
policies_by_year = policies_by_year.rename(columns={'Policy initiative ID': 'annual_policies'})

# Calculate the cumulative sum
policies_by_year['cumulative_policies'] = policies_by_year['annual_policies'].cumsum()

print("\nProcessed Policy Data (Global, Cumulative)")
policies_by_year.tail(10)
Processed Policy Data (Global, Cumulative)
Year annual_policies cumulative_policies
0 2015 24 24
1 2016 56 80
2 2017 85 165
3 2018 299 464
4 2019 397 861
5 2020 363 1224
6 2021 263 1487
7 2022 119 1606
8 2023 100 1706
9 2024 9 1715
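
One thing worth double-checking here: the first rows of the raw data show the same Policy initiative ID (ending in 1337) twice, so counting rows may count some initiatives more than once. Below is a minimal sketch of how count() and nunique() differ in that situation. It uses a small made-up frame, not the OECD file, and the IDs "A", "B", "C" are placeholders.

```python
import pandas as pd

# Toy data (hypothetical): initiative "B" appears twice, like the 1337 rows above.
toy = pd.DataFrame({
    "Policy initiative ID": ["A", "B", "B", "C"],
    "Year": [2016, 2016, 2016, 2017],
})

grouped = toy.groupby("Year")["Policy initiative ID"]
rows_per_year = grouped.count()      # counts every row
unique_per_year = grouped.nunique()  # counts each initiative ID once per year

comparison = pd.DataFrame({"rows": rows_per_year, "unique_ids": unique_per_year})
print(comparison)
```

If the two columns differ on the real data, annual_policies is counting rows rather than distinct initiatives. Which one is right depends on what you want the bars to measure.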

Part 2: Loading & Cleaning Dataset 2 (AI Investment)#

Now for the AI investment. I’m using the Our World in Data (OWID) dataset, sourced from the Stanford AI Index.

Source: Our World in Data - Private Investment in AI

investment_df = pd.read_csv("private-investment-in-artificial-intelligence.csv")

print("OWID AI Investment Raw Data")
investment_df.info()
investment_df.head()
OWID AI Investment Raw Data
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 48 entries, 0 to 47
Data columns (total 4 columns):
 #   Column                                 Non-Null Count  Dtype 
---  ------                                 --------------  ----- 
 0   Entity                                 48 non-null     object
 1   Code                                   36 non-null     object
 2   Year                                   48 non-null     int64 
 3   Global total private investment in AI  48 non-null     int64 
dtypes: int64(2), object(2)
memory usage: 1.6+ KB
Entity Code Year Global total private investment in AI
0 China CHN 2013 717196188
1 China CHN 2014 771392286
2 China CHN 2015 2385249620
3 China CHN 2016 5102962786
4 China CHN 2017 7314146469

Cleaning the Investment Data#

This data is already in great shape. The only steps needed are:

  1. Filter for just the ‘World’ total investment amount in AI.

  2. Convert the investment column to billions of USD and rename it.

  3. Ensure Year is an integer.

  4. Select only the columns we need.

# Filter for just the 'World' total
investment_global = investment_df[investment_df['Entity'] == 'World'].copy()

# Convert the investment amounts to billions of USD and rename the column
investment_global['Global total private investment in AI'] = investment_global['Global total private investment in AI'] / 1000000000
investment_global = investment_global.rename(columns={
    'Global total private investment in AI': 'Investment_Billions_USD'
})

# Ensure Year is an integer
investment_global['Year'] = investment_global['Year'].astype(int)

# Select only the columns we need
investment_global_clean = investment_global[['Year', 'Investment_Billions_USD']]
print("\n--- Processed Investment Data (Global, Annual) ---")
print(investment_global_clean)
--- Processed Investment Data (Global, Annual) ---
    Year  Investment_Billions_USD
36  2013                 6.013620
37  2014                10.942456
38  2015                15.262405
39  2016                19.339919
40  2017                28.432395
41  2018                46.509286
42  2019                61.664788
43  2020                77.256670
44  2021               145.400000
45  2022               104.636244
46  2023                92.789054
47  2024               130.255020

Part 3: Merging for the Final Visualization#

Now I’ll merge the two DataFrames on the Year column.

# Merge the two datasets on the 'Year' column
merged_df = pd.merge(policies_by_year, investment_global_clean, on='Year', how='inner')

print(' Merged Data for Plotting')
merged_df
 Merged Data for Plotting
Year annual_policies cumulative_policies Investment_Billions_USD
0 2015 24 24 15.262405
1 2016 56 80 19.339919
2 2017 85 165 28.432395
3 2018 299 464 46.509286
4 2019 397 861 61.664788
5 2020 363 1224 77.256670
6 2021 263 1487 145.400000
7 2022 119 1606 104.636244
8 2023 100 1706 92.789054
9 2024 9 1715 130.255020
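
The inner merge silently drops any year that appears in only one dataset (here, the 2013 and 2014 investment rows). A quick way to see what gets dropped is an outer merge with pandas' indicator and validate options. This is only a sketch on small stand-in frames, with values copied from the tables above:

```python
import pandas as pd

# Small stand-ins for policies_by_year and investment_global_clean.
policies = pd.DataFrame({"Year": [2015, 2016], "annual_policies": [24, 56]})
investment = pd.DataFrame({
    "Year": [2013, 2014, 2015, 2016],
    "Investment_Billions_USD": [6.013620, 10.942456, 15.262405, 19.339919],
})

# indicator=True adds a "_merge" column saying where each row came from;
# validate="one_to_one" raises an error if a year is duplicated on either side.
check = pd.merge(policies, investment, on="Year", how="outer",
                 indicator=True, validate="one_to_one")

# Years that an inner merge would drop because they are missing on one side.
dropped_by_inner = sorted(check.loc[check["_merge"] != "both", "Year"].tolist())
print(dropped_by_inner)
```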

Part 4: The Main Visualization#

In this section I’ll use a dual-axis chart to show annual_policies (bars) and Investment_Billions_USD (line).

# Create a figure with a secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add Annual Policies as a Bar chart
fig.add_trace(
    go.Bar(
        x=merged_df['Year'],
        y=merged_df['annual_policies'],
        name='Annual Number of New AI Policies',
        marker_color='royalblue'
    ),
    secondary_y=False,
)

# Add Annual Investment as a Line chart
fig.add_trace(
    go.Scatter(
        x=merged_df['Year'],
        y=merged_df['Investment_Billions_USD'],
        name='Annual AI Investment (Billions USD)',
        marker_color='red'
    ),
    secondary_y=True,
)

# Add figure titles and axis labels
fig.update_layout(
    title_text='<b>AI Policy Adoption vs. Private Investment (Global)</b>',
    xaxis_title='Year',
    legend_title='Metrics',
    legend=dict(
        orientation="h",
        yanchor="bottom",
        y=1.02,
        xanchor="right",
        x=1
    )
)

# Set the y-axes titles
fig.update_yaxes(
    title_text='Annual Number of New AI Policies',
    secondary_y=False,
    color='royalblue'
)
fig.update_yaxes(
    title_text='Annual Private AI Investment (Billions USD)',
    secondary_y=True,
    color='red'
)

# Make the X-axis show proper years
fig.update_xaxes(
    tickvals=merged_df['Year']
)

# Display the interactive plot
fig.show()
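
A dual-axis chart can make two series look more or less aligned depending on how each axis is scaled. One common alternative, sketched here as an option and not as what the chart above does, is to rebase both series to an index where 2015 = 100 so they share a single axis. The values are copied from the merged table.

```python
import pandas as pd

# Values copied from the merged table above.
df = pd.DataFrame({
    "Year": [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024],
    "annual_policies": [24, 56, 85, 299, 397, 363, 263, 119, 100, 9],
    "Investment_Billions_USD": [15.262405, 19.339919, 28.432395, 46.509286, 61.664788,
                                77.256670, 145.400000, 104.636244, 92.789054, 130.255020],
}).set_index("Year")

# Rebase each column so that its 2015 value equals 100.
indexed = df / df.loc[2015] * 100
print(indexed.round(1))
```

Both indexed columns could then be drawn as two go.Scatter traces on one y-axis, which removes the need to pick two separate axis ranges.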

Part 5: Takeaways#

This “annual vs. annual” chart tells a complex and interesting story about proactive governments and an explosive market.

Takeaway 1: Policy and Investment Come in Waves#

Policy (Blue Bars): Starting around 2018, governments around the world suddenly got busy. There’s a clear wave of new strategies, regulations, and guidelines: annual counts jump from 85 in 2017 to 299 in 2018, peak at 397 in 2019, and stay high in 2020 (363).

Investment (Red Line): Private investment doesn’t follow that pattern at all. It climbs steadily through 2020 ($77.3B), then goes absolutely vertical in 2021, nearly doubling to $145.4B.

It’s not two curves following each other; it’s two totally different rhythms.
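
To put numbers on the two rhythms, and to address the “same rate” part of the research question directly, year-over-year growth can be computed with pct_change(). This is a minimal sketch using values copied from the merged table (2017-2021 only, to keep it short):

```python
import pandas as pd

# Values copied from the merged table above.
df = pd.DataFrame(
    {
        "annual_policies": [85, 299, 397, 363, 263],
        "Investment_Billions_USD": [28.432395, 46.509286, 61.664788, 77.256670, 145.400000],
    },
    index=pd.Index([2017, 2018, 2019, 2020, 2021], name="Year"),
)

# pct_change() returns (this_year - last_year) / last_year; multiply by 100 for percent.
yoy_growth = df.pct_change() * 100
print(yoy_growth.round(1))
```

The first row is NaN because 2017 has no previous year inside this subset.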

Takeaway 2: Governments Weren’t Reacting. They Were Preparing.#

This is the most critical insight, and it runs against the common assumption that government AI policy only reacts to the market.

The data shows governments were proactive. The global policy wave (2018-2020) clearly precedes the 2021 investment explosion. This suggests that governments saw the AI “gold rush” coming and were actively trying to build frameworks, strategies, and guardrails before the market peaked.
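
The “policy first” reading can be checked in a simple way by comparing the peak year of each series with idxmax(). The sketch below uses the annual values from the merged table. The variable names are just for illustration.

```python
import pandas as pd

years = [2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
policies = pd.Series([24, 56, 85, 299, 397, 363, 263, 119, 100, 9], index=years)
investment = pd.Series([15.262405, 19.339919, 28.432395, 46.509286, 61.664788,
                        77.256670, 145.400000, 104.636244, 92.789054, 130.255020],
                       index=years)

policy_peak = policies.idxmax()        # year with the most new policies
investment_peak = investment.idxmax()  # year with the highest investment
lead_years = investment_peak - policy_peak
print(policy_peak, investment_peak, lead_years)
```

A gap between the peaks is consistent with policy moving first, although ten annual points cannot show on their own that governments anticipated the spike.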

Takeaway 3: The Market’s Scale is Unimaginable#

Even though governments were proactive, the sheer scale of the 2021 investment spike ($140B+) shows that the market’s eventual force was beyond anyone’s predictions.

This suggests that while policy can be forward-thinking, it cannot fully contain or predict the explosive, speculative nature of a technological gold rush.

This finding is made even stronger by our knowledge from the readme.doc. That $140B+ spike is a conservative underestimate that excludes all R&D from public companies (Google, Microsoft, etc.) and all public spending. The true market explosion that policymakers were trying to get ahead of was even larger.

Takeaway 4: The Post-2021 Policy Decline#

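The chart shows a sharp drop in new AI policies after 2021. This doesn’t mean governments gave up on governance. Rather, it signals a critical shift into the second phase of policymaking. (The 2024 bar, with only 9 policies, probably reflects an incomplete year: the path of the source file in Part 7 is dated 2024/03.)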

Phase I (2018-2020): This was the High-Level Strategy phase. Governments were racing to publish broad National AI Strategy blueprints, producing the wave that peaked in 2019.

Phase II (2021-Present): This is the Execution & Regulation phase. The focus shifted from announcing new strategies to the much slower, harder work of writing specific regulations (like the multi-year EU AI Act) and handling implementation details. This work is more difficult, takes far longer, and doesn’t appear in the database as a large number of new initiatives.
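
The two-phase explanation is a hypothesis, and the raw file has a column that could help test it: Policy instrument type category (see the info() output in Part 1). Here is a minimal sketch of how the yearly mix of categories could be tabulated. The rows and the labels "Strategy" and "Regulation" are made-up placeholders, not actual OECD values.

```python
import pandas as pd

# Hypothetical rows: the category labels are placeholders, not real OECD categories.
toy = pd.DataFrame({
    "Year": [2019, 2019, 2019, 2022, 2022],
    "Policy instrument type category": [
        "Strategy", "Strategy", "Regulation", "Regulation", "Regulation",
    ],
})

# Rows = year, columns = category, cells = share of that year's entries.
shares = pd.crosstab(
    toy["Year"],
    toy["Policy instrument type category"],
    normalize="index",
)
print(shares.round(2))
```

On the real data, a rising share of regulatory instruments after 2021 would support the Phase II story, while a flat mix would suggest looking for another explanation.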

Part 6: Conclusions#

This project successfully combined two datasets to illustrate the complex relationship between AI governance and private investment.

Our final analysis, using a more rigorous “annual vs. annual” comparison, refutes the simple narrative of a “governance gap.” Instead, it reveals a more sophisticated story: Proactive governments laid the policy groundwork from 2018-2020, only to be followed by a private investment explosion in 2021 of a magnitude no one could have fully anticipated. After this peak, policymaking has shifted from “strategic breadth” to “regulatory depth,” entering a slower, more difficult phase of implementation.

It’s a great reminder that in AI, the visualization you choose isn’t just about aesthetics; it can completely change the narrative.

Part 7: Data Sources#

  1. AI Policy Data:

Source: OECD.AI Policy Observatory

Dataset: “Database of National AI Policies”

Link: https://wp.oecd.ai/app/uploads/2024/03/oecd-ai-all-ai-policies.csv

  2. AI Investment Data:

Source: Our World in Data

Dataset: “Global total private investment in AI”

Link: https://ourworldindata.org/grapher/private-investment-in-artificial-intelligence