# Transcripts

> Read full video transcripts with entity annotations, submit videos for transcription, and correct transcripts through community review.

Arcmira serves **premium transcripts** — diarized, speaker-identified, entity-annotated, community-correctable — and can generate one on demand for any public YouTube video. Every segment is matched against the entity graph, so people, organizations, products, and topics are annotated inline with character-level spans.

Many videos in the index carry a lightweight **preliminary analysis** (detected entities, moments, counts) without a premium transcript. Transcript surfaces are premium-only: until premium transcription runs, the transcript endpoint returns a counts-only summary of what the preliminary analysis detected, plus the quote to run the full premium analysis.

```text theme={null}
GET  /v1/transcripts/{video_id}          # read (teaser until unlocked)
POST /v1/transcriptions                  # submit a video for transcription
GET  /v1/transcriptions/{id}             # poll status (Retry-After)
POST /v1/videos/{video_id}/corrections   # submit corrections (free)
```

## Pricing

Transcript access is priced in **rows**, the same row credits the rest of the API uses, and charged per **15-minute block** of video:

* **125 rows per 15-minute block** of video, rounded up, minimum one block.
* A 62-minute podcast is 5 blocks = **625 rows**. A 9-minute clip is 1 block = **125 rows**.
* **Same price either way**, whether a premium transcript already exists in the index or Arcmira generates it fresh for you.
* Unlocks are **permanent and per-account**. You pay once per video; every subsequent read is free — and if you unlocked a video before its premium transcript existed, running the premium analysis later costs nothing extra.
* If a transcription job fails permanently, the rows are **refunded automatically** and the unlock is revoked.

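Transcript responses carry the exact quote in `meta.quote` (for `not_transcribed` videos, only when the duration is known); `quarters` is the number of 15-minute blocks: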

```json theme={null}
{ "quarters": 5, "rows": 625 }
```
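
The block arithmetic above can be reproduced locally as a sanity check. This is a minimal sketch rather than Arcmira's implementation: `estimate_quote` is a hypothetical helper name, and it assumes `quarters` counts 15-minute blocks. The `meta.quote` returned by the API remains the authoritative figure.

```python
import math

ROWS_PER_BLOCK = 125      # from the pricing rules above
BLOCK_SECONDS = 15 * 60   # one 15-minute block


def estimate_quote(duration_seconds: float) -> dict:
    """Estimate a quote: blocks are rounded up, with a minimum of one block."""
    blocks = max(1, math.ceil(duration_seconds / BLOCK_SECONDS))
    return {"quarters": blocks, "rows": blocks * ROWS_PER_BLOCK}


print(estimate_quote(62 * 60))  # 62-minute podcast: {'quarters': 5, 'rows': 625}
print(estimate_quote(9 * 60))   # 9-minute clip:     {'quarters': 1, 'rows': 125}
```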

<Note>
  Corrections are always free (zero rows), and reading a transcript you have already unlocked is free.
</Note>

## Reading a transcript

```bash theme={null}
curl 'https://api.arcmira.com/v1/transcripts/dQw4w9WgXcQ' \
  -H "Authorization: Bearer $ARCMIRA_API_KEY"
```

The `access` field tells you where you stand:

| `access`          | Meaning                                                                                                                                                                                                                                                |
| ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `unlocked`        | Full premium transcript in the payload.                                                                                                                                                                                                                |
| `locked`          | Premium transcript exists; you get \~5 teaser segments and the `meta.quote`.                                                                                                                                                                           |
| `premium_pending` | A preliminary analysis exists but premium transcription hasn't run. The payload carries `detected` — entity-type counts only (`{ people, organizations, products, topics }`) — plus the quote. Run the premium analysis via `POST /v1/transcriptions`. |
| `not_transcribed` | Not in the index; `meta.quote` is present when the duration is known — submit via `POST /v1/transcriptions`.                                                                                                                                           |
| `unauthenticated` | Teaser only; authenticate to unlock.                                                                                                                                                                                                                   |

Gating is enforced server-side — locked responses simply do not contain the rest of the transcript, and `premium_pending` responses contain no transcript text, entity names, or timestamps at all.
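
A client can branch on `access` to decide what to do next. The sketch below is one possible mapping, assuming the five values in the table are the complete set; `next_step` and the action labels it returns are illustrative names, not part of the API.

```python
def next_step(payload: dict) -> str:
    """Map a transcript response to a suggested client action (labels are illustrative)."""
    access = payload.get("access")
    if access == "unlocked":
        return "read"            # the full transcript is already in the payload
    if access == "locked":
        return "unlock"          # repeat the GET with ?unlock=true
    if access in ("premium_pending", "not_transcribed"):
        return "submit"          # POST /v1/transcriptions
    if access == "unauthenticated":
        return "authenticate"    # send an API key, then read again
    raise ValueError(f"unexpected access value: {access!r}")
```

Raising on unknown values is a deliberate choice in this sketch: it surfaces a new `access` state loudly instead of silently treating it as locked.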

### Unlocking

When a premium transcript exists (`locked`), pass `?unlock=true` to purchase in the same request: the quoted rows are debited, the permanent unlock is granted, and the full transcript comes back in one round trip. On `premium_pending` videos there is nothing to unlock yet — `unlock=true` is ignored; the purchase is the premium generation itself (`POST /v1/transcriptions`).

```bash theme={null}
curl 'https://api.arcmira.com/v1/transcripts/dQw4w9WgXcQ?unlock=true' \
  -H "Authorization: Bearer $ARCMIRA_API_KEY"
```

An unlock request can fail with:

* `402` — not enough rows remaining this period (the body includes the quote).
* `403` — transcript access requires a paid plan.
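
The same unlock call can be sketched in Python with only the standard library. The helper names (`build_unlock_request`, `explain_unlock_failure`) are hypothetical, and the request is built but not sent, so the example has no network side effects.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.arcmira.com"


def build_unlock_request(video_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET that reads and unlocks in one round trip."""
    path = urllib.parse.quote(video_id, safe="")
    url = f"{API_BASE}/v1/transcripts/{path}?unlock=true"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def explain_unlock_failure(status: int) -> str:
    """Translate the two documented unlock error codes into a short message."""
    if status == 402:
        return "not enough rows remaining this period; the body includes the quote"
    if status == 403:
        return "transcript access requires a paid plan"
    return f"unexpected status {status}"
```

To send it, pass the request to `urllib.request.urlopen` and catch `urllib.error.HTTPError`; its `code` attribute holds the status to hand to `explain_unlock_failure`.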

### Response shape

```json theme={null}
{
  "video": { "videoId": "…", "title": "…", "channelName": "…", "durationSeconds": 3720 },
  "access": "unlocked",
  "segments": [
    { "index": 0, "start": 0.4, "end": 6.1, "text": "Welcome back to the show…", "speaker": 0 }
  ],
  "speakers": [
    { "id": 0, "label": "Speaker 0", "entity": { "id": 147403, "name": "Emad Mostaque", "slug": "emad-mostaque" } }
  ],
  "annotations": [
    { "segment_index": 12, "char_start": 34, "char_end": 40, "entity_id": 120034, "entity_type": "organization", "name": "OpenAI", "slug": "openai" }
  ],
  "meta": { "diarized": true, "locked": false, "revision": "rwmksmk", "quote": { "quarters": 5, "rows": 625 } }
}
```

* **`segments`** — transcript lines with start/end seconds and a diarization `speaker` id.
* **`speakers`** — the diarization map; entries gain an `entity` once a speaker has been identified as a person.
* **`annotations`** — entity mentions as character spans inside segment text.
* **`meta.revision`** — an opaque id for the served transcript plus its approved-correction state. **Save it**: corrections echo it back, and a changed revision means the transcript changed underneath you.
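
As an illustration of how these fields fit together, the sketch below resolves each annotation to the text it covers and maps speaker ids to display names. It rests on two assumptions drawn from the sample response rather than from a stated contract: spans are half-open (`char_end` exclusive, since 40 - 34 equals the length of "OpenAI"), and offsets line up with Python string indices. If the API counts UTF-16 code units, segments containing characters outside the Basic Multilingual Plane would need an offset conversion. Both function names are hypothetical.

```python
def mention_texts(transcript: dict) -> list[tuple[str, str]]:
    """Return (entity name, text covered by the span) for each annotation."""
    by_index = {seg["index"]: seg for seg in transcript.get("segments", [])}
    out = []
    for ann in transcript.get("annotations", []):
        seg = by_index.get(ann["segment_index"])
        if seg is None:
            continue  # the referenced segment is not in this payload
        out.append((ann["name"], seg["text"][ann["char_start"]:ann["char_end"]]))
    return out


def speaker_names(transcript: dict) -> dict[int, str]:
    """Map diarization speaker ids to a display name, preferring the linked entity."""
    names = {}
    for sp in transcript.get("speakers", []):
        entity = sp.get("entity")
        names[sp["id"]] = entity["name"] if entity else sp["label"]
    return names
```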

## Submitting a video for transcription

Any public YouTube video up to 12 hours long can be submitted; submission is limited to paid tiers. Rows are debited **up front** and the permanent unlock is granted at submit time, so once the pipeline finishes, `GET /v1/transcripts/{video_id}` returns the full transcript.

```bash theme={null}
curl -X POST 'https://api.arcmira.com/v1/transcriptions' \
  -H "Authorization: Bearer $ARCMIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{ "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ" }'
```

```json theme={null}
{
  "request": {
    "id": "0b0e8f3a-…",
    "videoId": "dQw4w9WgXcQ",
    "status": "queued",
    "stage": "queued",
    "quote": { "quarters": 5, "rows": 625 },
    "etaSeconds": 678,
    "nextPollSeconds": 30
  }
}
```

Useful properties of this endpoint:

* **Premium transcript already exists?** The request short-circuits to `complete` and the unlock still applies — unified pricing, no duplicate work. A video with only a preliminary analysis does **not** short-circuit: you're buying the premium generation, so the pipeline runs.
* **Already unlocked?** You are not charged again (`quote.rows` in the response shows what was actually debited) — including unlocks purchased before the premium transcript existed.
* **Already in flight?** A second submit for the same video returns the existing request with `existing: true`.
* **Idempotency-Key** is supported, per the standard [idempotency](/idempotency) contract.
* **User requests ride a reserved fast lane** through the pipeline — submission latency doesn't degrade when background indexing is busy.
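
The submit call from the curl example can be sketched in Python using only the standard library. `build_submit_request` is a hypothetical helper; it builds the request without sending it. Generating the key once per logical submission and reusing it on retries is an assumption about how the linked idempotency contract is meant to be used.

```python
import json
import urllib.request
import uuid

API_BASE = "https://api.arcmira.com"


def build_submit_request(video_url, api_key, idempotency_key=None):
    """Build (but do not send) the POST that submits a video for transcription."""
    key = idempotency_key or str(uuid.uuid4())  # reuse the same key when retrying
    return urllib.request.Request(
        f"{API_BASE}/v1/transcriptions",
        data=json.dumps({"url": video_url}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Idempotency-Key": key,
        },
    )
```

Pass the result to `urllib.request.urlopen` to send it; the JSON body of the response carries the `request` object shown above.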

### Polling

`GET /v1/transcriptions/{id}` derives live status from the pipeline. While the request is in flight the response carries a **`Retry-After` header** (seconds) plus `etaSeconds` and `nextPollSeconds` in the body. The polite loop is simply:

```python theme={null}
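import os
import time
import requests  # third-party HTTP client (an assumption; any client works)

headers = {"Authorization": f"Bearer {os.environ['ARCMIRA_API_KEY']}"}
while True:  # request_id: the `request.id` returned by POST /v1/transcriptions
    res = requests.get(f"https://api.arcmira.com/v1/transcriptions/{request_id}", headers=headers)
    if res.json()["status"] in ("complete", "failed", "refunded"):
        break  # terminal status: stop polling
    time.sleep(int(res.headers.get("Retry-After", "30")))  # wait as long as the server asks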
```

Status walks `queued → downloading → transcribing → analyzing → complete`. Terminal statuses drop the `Retry-After` header. On `complete`, fetch `GET /v1/transcripts/{video_id}` — you were unlocked at submission.

If the pipeline fails permanently (or a request goes stale past 24 hours), the status becomes `refunded`: the rows come back and the unlock is revoked.
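
The wait logic can be isolated into a small pure function, which makes it easy to test without a network. This sketch assumes `status` and `nextPollSeconds` sit at the top level of the poll response (as the loop's `status` lookup suggests) and that `Retry-After` is always an integer number of seconds; `next_poll_delay` is a hypothetical name.

```python
TERMINAL_STATUSES = {"complete", "failed", "refunded"}


def next_poll_delay(headers: dict, body: dict, default: int = 30):
    """Return the seconds to wait before polling again, or None once the request is terminal."""
    if body.get("status") in TERMINAL_STATUSES:
        return None  # terminal responses drop Retry-After; stop polling
    retry_after = headers.get("Retry-After")
    if retry_after is not None and str(retry_after).isdigit():
        return int(retry_after)  # prefer the server's header
    return int(body.get("nextPollSeconds", default))  # fall back to the body hint
```

A plain `dict` lookup is case-sensitive; with a client whose header mapping is case-insensitive (such as `requests`), the response headers can be passed in directly.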

`GET /v1/transcriptions` (optionally `?video_id=`) lists your recent requests.

## Corrections

Transcripts are community-correctable, and corrections are **free** — they cost zero rows and accrue to the submitting account and key, like [Community Review](/feedback). Everything you submit is **optimistic for you, pending review for everyone else**: your own pending corrections ride back on the transcript GET immediately; once a reviewer approves them, they apply for all callers (and bump `meta.revision`).

### Lifecycle

1. **Submit** — the correction lands as a `pending` row, attributed to your key. Speaker identifications additionally create a community-flagged appearance on the person's page right away.
2. **Pending** — visible to you in the transcript GET (`edits[]`, `speakerIdentifications[]`, …); withdrawable via the matching `DELETE`.
3. **Approved** — applied for everyone at read time; `meta.revision` changes.
4. **Rejected / reverted** — removed from view; reverting a speaker identification also deletes the appearance it created.

### Unified endpoint

`POST /v1/videos/{video_id}/corrections` accepts every correction kind with a discriminated `kind`:

| `kind`             | What it does                                                                                                     |
| ------------------ | ---------------------------------------------------------------------------------------------------------------- |
| `line_edit`        | Fix the text of one segment.                                                                                     |
| `speaker_reassign` | Move lines (or part of a line — a sub-line split) to another speaker, a new speaker, or a non-person voice role. |
| `speaker_identify` | Link a diarization speaker to a person entity.                                                                   |
| `add_person`       | Propose a person not in the index yet and link the speaker in one action.                                        |
| `entity_tag`       | Tag an entity mention the pipeline missed, as a character span.                                                  |

```bash theme={null}
curl -X POST 'https://api.arcmira.com/v1/videos/dQw4w9WgXcQ/corrections' \
  -H "Authorization: Bearer $ARCMIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
    "kind": "line_edit",
    "revision": "rwmksmk",
    "anchor": { "segmentIndex": 12, "contentHash": "1a2b3c" },
    "payload": {
      "segmentIndex": 12,
      "originalText": "Anthropics new cloud 4.5",
      "correctedText": "Anthropic'\''s new Claude 4.5"
    }
  }'
```

### Anchors, revisions, and ordering

Kinds that reference segment content (`line_edit`, `speaker_reassign`, `entity_tag`) must prove they were made against the transcript you actually saw:

* **`revision`** — echo `meta.revision` from the transcript GET.
* **`anchor.contentHash`** — a djb2 hash of the covered segment text (segments joined with `\n` for multi-segment selections), base-36 encoded:

```js theme={null}
function djb2(input) {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = ((hash << 5) + hash + input.charCodeAt(i)) >>> 0;
  }
  return hash.toString(36);
}
```

* **`seq`** (optional) — a per-video monotonic counter for clients that submit streams of dependent corrections (sub-line splits re-index later segments, so order matters). One-off submissions can omit it.
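
For clients not written in JavaScript, the djb2 helper above can be ported as follows. This is a sketch: it iterates UTF-16 code units to match `charCodeAt`, wraps to 32 bits to match `>>> 0`, and encodes in lowercase base 36 to match `toString(36)`. The `build_line_edit` helper is hypothetical and assumes the anchor hash for a `line_edit` covers the segment text as served, which is also sent as `originalText`.

```python
def to_base36(value: int) -> str:
    """Encode a non-negative integer in lowercase base 36, like JS toString(36)."""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if value == 0:
        return "0"
    out = []
    while value:
        value, rem = divmod(value, 36)
        out.append(digits[rem])
    return "".join(reversed(out))


def djb2(text: str) -> str:
    """Port of the JavaScript djb2: unsigned 32-bit hash over UTF-16 code units."""
    data = text.encode("utf-16-le", "surrogatepass")
    h = 5381
    for i in range(0, len(data), 2):
        unit = data[i] | (data[i + 1] << 8)  # one UTF-16 code unit, as charCodeAt returns
        h = (h * 33 + unit) & 0xFFFFFFFF     # (h << 5) + h equals h * 33; wrap to 32 bits
    return to_base36(h)


def content_hash(segment_texts: list[str]) -> str:
    """Hash the covered segments, joined with newlines for multi-segment selections."""
    return djb2("\n".join(segment_texts))


def build_line_edit(revision: str, segment_index: int, original_text: str, corrected_text: str) -> dict:
    """Assemble a line_edit body in the shape of the curl example (hypothetical helper)."""
    return {
        "kind": "line_edit",
        "revision": revision,
        "anchor": {"segmentIndex": segment_index, "contentHash": content_hash([original_text])},
        "payload": {
            "segmentIndex": segment_index,
            "originalText": original_text,
            "correctedText": corrected_text,
        },
    }
```

Iterating code units rather than code points matters only for characters outside the Basic Multilingual Plane (emoji, for example), where a naive `ord()` loop would produce a different hash than the JavaScript version.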

The error semantics are designed for at-least-once, queue-style clients:

| Status        | Meaning                                                                                                              | What to do                                                                        |
| ------------- | -------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------- |
| `409`         | Revision or anchor mismatch — the transcript changed underneath the correction. Body: `{ reason, currentRevision }`. | Drop or re-anchor this correction and continue; the sequence number was consumed. |
| `412`         | Sequence mismatch. Body: `{ expectedSeq }`.                                                                          | Refetch the transcript, rebase your local counter, resend.                        |
| `401`         | Authentication expired.                                                                                              | Pause the queue, re-authenticate, resume — never drop.                            |
| `429` / `5xx` | Transient.                                                                                                           | Retry the same event with backoff.                                                |

**Idempotency-Key** (use the correction's client-generated UUID) makes retries safe: replays return the stored final response verbatim with `Idempotency-Replayed: true`.
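
A queue-style client might encode the table above as a single dispatch function. The sketch below is one way to do it: the `Action` names are illustrative, the exponential backoff parameters are arbitrary defaults, and treating any other status as a non-retryable failure is an assumption the table does not cover.

```python
from enum import Enum


class Action(Enum):
    DONE = "done"
    DROP_OR_REANCHOR = "drop_or_reanchor"      # 409: sequence number was consumed
    REBASE_AND_RESEND = "rebase_and_resend"    # 412: refetch, rebase counter, resend
    REAUTHENTICATE = "reauthenticate"          # 401: pause the queue, never drop
    RETRY_WITH_BACKOFF = "retry_with_backoff"  # 429 / 5xx: transient
    FAIL = "fail"                              # not covered by the table (assumption)


def classify(status: int) -> Action:
    """Map an HTTP status from the corrections endpoint to a queue action."""
    if 200 <= status < 300:
        return Action.DONE
    if status == 409:
        return Action.DROP_OR_REANCHOR
    if status == 412:
        return Action.REBASE_AND_RESEND
    if status == 401:
        return Action.REAUTHENTICATE
    if status == 429 or 500 <= status < 600:
        return Action.RETRY_WITH_BACKOFF
    return Action.FAIL


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff for transient errors: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))
```

Because retries reuse the same `Idempotency-Key`, resending after `RETRY_WITH_BACKOFF` should not create a duplicate correction.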

### Purpose-built wrappers

If you don't need the unified contract, three simpler routes cover the common cases — each supports `Idempotency-Key`, and each `POST` returns the row `id` you can `DELETE` to withdraw while still pending:

```text theme={null}
POST   /v1/transcripts/{video_id}/edits          { segmentIndex, originalText, correctedText }
DELETE /v1/transcripts/{video_id}/edits/{id}

POST   /v1/transcripts/{video_id}/speakers       { speakerId, entityId }  or  { speakerId, name }
DELETE /v1/transcripts/{video_id}/speakers/{id}

POST   /v1/transcripts/{video_id}/merges         { sourceName, targetEntityId, replaceWith? }
GET    /v1/transcripts/{video_id}/merges
DELETE /v1/transcripts/{video_id}/merges/{id}
```

* **Edits** fix segment text.
* **Speakers** identify who a diarized voice is — this creates the community appearance immediately; withdrawing removes it.
* **Merges** fix misattributed name mentions within one video ("mentions of *Imad* in this video are Emad Mostaque"), optionally respelling the transcript text. Mentions of the same name in other videos are untouched.
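
The request bodies for the three wrappers are small enough to build with plain helpers. The function names below are hypothetical; the field names come from the route listing above, and the entity id is the one from the sample response.

```python
def edit_body(segment_index: int, original_text: str, corrected_text: str) -> dict:
    """Body for POST /v1/transcripts/{video_id}/edits."""
    return {
        "segmentIndex": segment_index,
        "originalText": original_text,
        "correctedText": corrected_text,
    }


def speaker_body(speaker_id: int, entity_id=None, name=None) -> dict:
    """Body for POST .../speakers: link by entity id, or by name, but not both."""
    if (entity_id is None) == (name is None):
        raise ValueError("pass exactly one of entity_id or name")
    if entity_id is not None:
        return {"speakerId": speaker_id, "entityId": entity_id}
    return {"speakerId": speaker_id, "name": name}


def merge_body(source_name: str, target_entity_id: int, replace_with=None) -> dict:
    """Body for POST .../merges; replaceWith is optional and omitted when unset."""
    body = {"sourceName": source_name, "targetEntityId": target_entity_id}
    if replace_with is not None:
        body["replaceWith"] = replace_with
    return body


# "Mentions of Imad in this video are Emad Mostaque", respelling the text as well.
print(merge_body("Imad", 147403, replace_with="Emad"))
```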

Rows created through the unified endpoint's `speaker_reassign` / `entity_tag` kinds are withdrawn at:

```text theme={null}
DELETE /v1/corrections/speaker-edits/{id}
DELETE /v1/corrections/entity-tags/{id}
```
