5-minute quickstart guide

Gutenberg uses sparse autoencoders (SAEs) for hypothesis generation. As an example, we used it to discover features that characterize AI-generated text in Pangram’s EditLens dataset. You can run the same kind of experiment on your own dataset with our SDK.

1. Sign up and create an API key

Create an account at console.gutenberg.ai, then create a key on the API keys settings page.

2. Install the SDK

uv add gutenberg-sdk
export GUTENBERG_API_KEY=gtn_...
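
Before moving on, it can help to confirm that the key is visible to Python. The check below is a minimal sketch that uses only the standard library. It assumes the client reads GUTENBERG_API_KEY from the environment (suggested by the export above) and that keys start with gtn_ (inferred from the example value, not a documented rule). The helper name api_key_problem is hypothetical.

```python
import os


def api_key_problem(env=os.environ):
    """Return a description of what looks wrong with the key, or None if it looks fine."""
    key = env.get("GUTENBERG_API_KEY", "")
    if not key:
        return "GUTENBERG_API_KEY is not set"
    # Assumption: the "gtn_" prefix is inferred from the example export line.
    if not key.startswith("gtn_"):
        return "GUTENBERG_API_KEY does not start with 'gtn_'"
    return None


if __name__ == "__main__":
    problem = api_key_problem()
    print(problem or "API key found")
```

If the script reports that the key is not set, re-run the export in the same shell session you use to run Python.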

3. Upload and run an experiment

Upload your dataset, extract SAE features over it, and run an experiment. In our example, we use a 100-sample subset of Pangram’s EditLens dataset to see which features distinguish AI-generated from human-generated text.

from gutenberg import Gutenberg

client = Gutenberg()

# 1. upload your data: a parquet file with a text column and a binary label column (is_ai here)
dataset = client.datasets.upload(
    "pangram-editlens-100.parquet",
    name="Pangram EditLens 100",
)

# 2. extract sae features over the text
job = client.jobs.wait(
    client.jobs.create(
        dataset_id=dataset.dataset_id,
        model_id="google/gemma-3-27b-it",
        sae_id="layer_31_width_262k_l0_medium",
    ).job_id
)

# 3. run an experiment, scoring features against your label
exp = client.experiments.create(
    job_id=job.job_id,
    target_column="is_ai",
    target_column_type="binary",
)
print(exp.url)  # the experiment's page, open it to watch features populate
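
If your data is not in parquet form yet, the sketch below shows one way to prepare a file like the one uploaded in step 1 of the snippet above. It is an illustration under assumptions, not part of the SDK: the text column name "text" is a guess, "is_ai" matches the target_column used above, the two rows are made-up placeholders, and the helpers validate_rows and write_parquet are hypothetical. Writing the file requires pandas plus a parquet engine such as pyarrow.

```python
from pathlib import Path

# Assumption: "text" is a placeholder column name; "is_ai" matches the
# target_column passed to client.experiments.create above.
TEXT_COLUMN = "text"
LABEL_COLUMN = "is_ai"


def validate_rows(rows):
    """Raise ValueError unless every row has non-empty text and a 0/1 label."""
    for i, row in enumerate(rows):
        text = row.get(TEXT_COLUMN)
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"row {i}: {TEXT_COLUMN!r} must be a non-empty string")
        if row.get(LABEL_COLUMN) not in (0, 1):
            raise ValueError(f"row {i}: {LABEL_COLUMN!r} must be 0 or 1")
    return rows


def write_parquet(rows, path):
    """Validate the rows, then write them to a parquet file."""
    import pandas as pd  # imported lazily so validation works without pandas

    df = pd.DataFrame(validate_rows(rows))
    df.to_parquet(Path(path), index=False)
    return df


# Made-up placeholder rows, for illustration only.
rows = [
    {"text": "I scribbled this note on the train this morning.", "is_ai": 0},
    {"text": "Certainly! Here is a concise summary of the topic.", "is_ai": 1},
]
validate_rows(rows)
# write_parquet(rows, "my-dataset.parquet")  # uncomment to write the file
```

Validating before writing catches missing text or non-binary labels locally, before the upload and feature extraction run.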

4. Review results

Check the console to see the results of your experiment. Here’s ours for reference. In this case, the features are sorted by how strongly they correlate with AI-generated text.
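
To build intuition for what sorting by correlation means, here is a toy sketch that ranks features by the Pearson correlation between their activations and a binary label. It is not Gutenberg's scoring code, and the exact statistic the console uses is not specified here. The feature names and activation values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


# Toy data: 1 = AI-generated, 0 = human-generated.
labels = [1, 1, 0, 0]

# Invented per-sample activations for three hypothetical features.
activations = {
    "feature_a": [0.9, 0.8, 0.1, 0.0],  # fires mostly on AI-labeled samples
    "feature_b": [0.2, 0.1, 0.7, 0.9],  # fires mostly on human-labeled samples
    "feature_c": [0.5, 0.5, 0.5, 0.5],  # constant, so uninformative
}

# Most positively correlated with the AI label first.
ranked = sorted(
    activations,
    key=lambda name: pearson(activations[name], labels),
    reverse=True,
)
print(ranked)
```

In this toy ranking, features that fire more on AI-labeled samples come first and features that fire more on human-labeled samples come last.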