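Turn your story into illustrations

Six transparent stages. Stop at any point. Every stage's recipe and output artifacts are free to copy; only image generation consumes credits.

Start creating

Uses credits

Paste a story or give a theme, pick a style, and generate the board images (characters, supporting cast pool, world, composition, lighting) in one step. This step consumes credits.

Story input

One-line theme

No story material entered yet

Style input

Pick one from the style library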

Input incomplete · Fill in the story and style first

Fill in the story and style above, then click "Generate boards".

View recipe

Engine recipe · 03-boards.md

# 03 — Production Boards (the board set — a pool, not a fixed four)

The board set is a **pool**, not a fixed four. It forms the production bible; the board images are the **image references for final scene generation**. The pool is: **one Character board per main character (normally ≥2)** + **one Supporting Cast / Extras Pool board** (skip if `cast_pool` is null) + **World** + **Composition** + **Lighting**. Each template below is a fill-slot generation prompt: fill `{…}` from `style-lock.json` / `story.json`, send to the image model. Foundation boards (character, world) first, from style ref images; secondary boards (composition, lighting) after, attaching the character + world board images as additional references.

**The board set is the production bible; per-scene generation selects a *subset* of it** (present characters' boards + world/composition/lighting, plus the cast board only when the scene has extras) — see 05. There is **no fixed 4-reference cap**: attach exactly the boards a scene needs.
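As a rough illustration of the pool rule above, the following TypeScript sketch derives the list of boards to generate from a Story Brief. The `BriefForBoards` and `BoardPlan` shapes and the `planBoards` name are hypothetical, and the position of the cast board in the order is an assumption, since the recipe only fixes "foundation first, secondary after".

```typescript
// Hypothetical minimal shape: only the Story Brief fields this sketch reads.
interface BriefForBoards {
  character_briefs: { name: string }[];
  cast_pool: { figures: unknown[] } | null;
}

type BoardKind = "character" | "world" | "cast" | "composition" | "lighting";

interface BoardPlan {
  kind: BoardKind;
  character?: string; // set only on character boards
  // Secondary boards attach the character + world board images as references.
  attachFoundationRefs: boolean;
}

function planBoards(brief: BriefForBoards): BoardPlan[] {
  const plans: BoardPlan[] = [];
  // Foundation boards first: one Character board per main character, then World.
  for (const c of brief.character_briefs) {
    plans.push({ kind: "character", character: c.name, attachFoundationRefs: false });
  }
  plans.push({ kind: "world", attachFoundationRefs: false });
  // Cast board only when a cast pool exists. Its slot in the order and its
  // reference handling are assumptions; the recipe does not specify them.
  if (brief.cast_pool !== null) {
    plans.push({ kind: "cast", attachFoundationRefs: false });
  }
  // Secondary boards last.
  plans.push({ kind: "composition", attachFoundationRefs: true });
  plans.push({ kind: "lighting", attachFoundationRefs: true });
  return plans;
}
```

A two-character story with no cast pool yields five boards; adding a cast pool yields six.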

## Common rules (apply to all boards)

**Produce the board image directly. Do NOT ask clarifying questions, do NOT request more detail, do NOT reply with text — generate the image from the specification given.** (gpt-image is chat-capable and will stall an unattended run by asking follow-ups otherwise — gap-bucket ② / story-v1/014.)

Every board prompt **opens with** `{style_sentence}` and **ends with** the negative block:

> Board sheet on a warm off-white background, clean separated studies, generous spacing, small hand-lettered labels only. Not a story illustration, not a single scene. No logos, no watermarks, no brand names, no fake UI panels, no dense paragraphs of text, no copied reference content. {3-5 top items from negative_style_rules}.
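The open-with / end-with rule can be enforced in one small helper instead of being repeated in every template. This is a minimal sketch: `wrapBoardPrompt` is a hypothetical name, and it assumes `negative_style_rules` is ordered most important first so that the first five entries are the "top items".

```typescript
// Fixed negative block, copied from the common rule above.
const BOARD_NEGATIVE_BLOCK =
  "Board sheet on a warm off-white background, clean separated studies, " +
  "generous spacing, small hand-lettered labels only. Not a story illustration, " +
  "not a single scene. No logos, no watermarks, no brand names, no fake UI panels, " +
  "no dense paragraphs of text, no copied reference content.";

function wrapBoardPrompt(
  styleSentence: string,
  body: string,
  negativeStyleRules: string[],
): string {
  // Assumption: the rules are ordered by importance, so slice(0, 5) takes the top items.
  const top = negativeStyleRules.slice(0, 5);
  const tail = top.length > 0 ? ` ${top.join(", ")}.` : "";
  // The style sentence always opens the prompt; the negative block always closes it.
  return `${styleSentence.trim()} ${body.trim()} ${BOARD_NEGATIVE_BLOCK}${tail}`;
}
```

The `body` argument is the filled board template without its `{style_sentence}` opener, so the opener cannot be duplicated or forgotten.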

## Board A — Character (one board per main character → 2..N boards)

Normally there are **≥2** of these — one per entry in `character_briefs` (a single Character board only when the story is genuinely single-character, i.e. `solo_justification` is set).

> {style_sentence} Character identity board for {name}: {visual_anchor}. Personality: {personality}.
> Layout: left — large main full-body identity pose; center — neutral front view plus side and back turnaround, one attitude pose typical of the character; right — four to six expression studies, outfit and detail close-ups, close-up of {signature prop from key_props}; bottom strip — solid silhouette study, color palette swatches ({palette}), short identity labels (name, role, traits).
> Same face, same body type, same outfit logic in every view. {character avoids from things_to_avoid.character}.

## Board B — World / Setting

> {style_sentence} World and setting board for this story: {world_brief.anchor}.
> Layout: top — one wide world-overview panel establishing geography and atmosphere; center — thumbnails of key locations: {locations: name + visual, 3-5}; right — material and surface swatches (the world's woods, stones, fabrics, waters as small studies), studies of recurring props: {motif names}; bottom strip — era/technology cue objects (what belongs in this world), color palette swatches ({palette}).
> One material logic, one era, one palette across all panels. {world avoids from things_to_avoid.world}.

## Board C — Composition (secondary; attach boards A+B as references)

> {style_sentence} Composition grammar board for this story, drawn as small simplified thumbnail studies using this story's characters and locations (consistent with the reference boards).
> Layout: row 1 — shot distance library: extreme wide, wide, full-body environmental, medium, close-up, object insert; row 2 — camera angle library: eye-level, high angle, low angle, over-the-shoulder, profile, back view, through-doorway; row 3 — depth studies showing distinct foreground / midground / background layering, and character placement patterns: left third, right third, small figure in large space, edge of frame, partially obscured by foreground; row 4 — narrative focal point studies (face, hands, prop, threshold, light source, empty space) and one negative-space quiet composition.
> Thumbnails are grammar samples, not finished scenes; keep them simple and readable. No centered-portrait default anywhere.

## Board D — Lighting / Color (secondary; attach boards A+B as references)

> {style_sentence} Lighting and color board for this story.
> Layout: top — emotional color arc strip: one swatch-and-mini-thumbnail per story beat following {emotional_arc}; center — time-of-day lighting library (morning, midday, evening, night) applied to {primary location}, and interior versus exterior light studies; right — warm/cool contrast studies, the main character under three different lights with face readable in each; bottom strip — material light response (how {2-3 signature materials} catch light), full palette swatches ({palette}) with one accent row.
> Lighting stays simplified per the style: {lighting_logic}. {lighting avoids from negative_style_rules if any}.

## Board E — Supporting Cast / Extras Pool (skip if `cast_pool` is null)

One 16:9 board drawn as a **lineup / casting sheet** of the `cast_pool.figures` (7–8 figures). This is the casting pool for crowd / passerby / onlooker scenes so they stop being avoided. NOT full multi-view turnarounds (too dense to be usable as a reference).

> {style_sentence} Supporting cast / extras casting sheet for this world: {cast_pool.rationale}.
> Layout: a clean lineup of the world's background figures, one study block per figure for all of {cast_pool.figures}: each block — one clear full-body study + one face close-up + a small hand-lettered label ({archetype}); per figure render {visual}. Group them as a casting sheet with generous spacing, varied age, role, and silhouette across the set.
> Shared world palette and material logic across all figures ({palette}); consistent with the world board. These are anonymous background people, not the main characters. {world avoids from things_to_avoid.world}.

## Output handling

Persist each board image before proceeding (OSS in production; local file in spike runs). The board set then goes to 04 (QA). At generation each scene attaches **only its relevant subset** (05) — there is no fixed reference count and no 4-image cap. Regenerate only the board that 04 marks REGEN.

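Pick scenes

The story has been split into sections and scored; you pick which scenes to illustrate.

Optional: steps 2 and 3 split the story into scenes and compose a prompt for each scene; step 4 then turns the prompts into scene images.

Generate the boards first, then come back to pick scenes

View recipe

Engine recipe · 02-story-scenes.md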

# 02 — Story Text → Sections → Scenes (text-first, all provenance tiers)

One pipeline stage, four sub-steps. **Every story is text-first**: scenes are picked from real paragraphs, never invented free-floating — this is what makes every image slot land on a `story_sections` anchor (`section_id`) with zero fuzzy matching at render time.

```
(A) story text          T1: 1 LLM call writes it | T2/T3: provided rights-cleared text
(B) split → sections    script, deterministic (this also populates story_sections)
(C) score paragraphs    cheap LLM (deepseek-flash tier), batched, EVERY paragraph judged
(D) select → scenes     script picks sections by rule, then TWO main LLM calls:
                          Call 1 = Story Brief (story facts; no scenes),
                          Call 2 = Scene Expansion (expands picks into scenes)
```

The story-facts call is split out from scene expansion **on purpose**: a downstream UI needs the Story Brief (and the boards it drives) available *before* scenes exist, so the brief must be produced — and persisted — first.

Artifacts, persisted in order: `sections.json` → `scores.json` → `story.json` (two sub-artifacts: `story-brief` from Call 1, then `scenes` from Call 2, both merged into `story.json`).

## (A) Story text — generative writing (canonical prompt lives in `06-story-writer.md`)

The writing prompt body is **not** duplicated here. The single source of truth is
the stage-06 asset `06-story-writer.md`:

- **T1** (original story) → `06-story-writer.md` § (T1)
- **T2** (public-domain rework) → `06-story-writer.md` § (T2)
- **T3** (in-copyright) → rejected at the writer contract; never reaches a prompt.

The writer entry (`story/write.ts`) validates the `provenance` contract field,
loads the matching section, and runs it on the `story-main` lane (M3). For T2/T3
source text that is *provided* rather than generated, skip stage 06 and feed the
rights-cleared text straight into (B) below. Either way, the prose handed to (B)
is **plain paragraphs separated by blank lines, no headings** — the format below
assumes nothing more.

## (B) Split — script spec (no LLM)

Split on blank lines / paragraph boundaries; assign sparse `idx` (10, 20, 30…) matching `story_sections` insertion. Output `sections.json`: `[{ "idx": 10, "text": "..." }]`. Headings/pull-quotes (if any) get `kind` tags and are excluded from scoring.
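A minimal sketch of the split step, assuming the input is already plain paragraphs separated by blank lines. The `splitSections` name is hypothetical, and `kind` tagging for headings and pull-quotes is left out because the recipe does not say how they are detected.

```typescript
interface Section {
  idx: number;
  text: string;
}

function splitSections(story: string): Section[] {
  return story
    .replace(/\r\n/g, "\n")       // normalize Windows line endings
    .split(/\n\s*\n/)             // one or more blank lines end a paragraph
    .map((p) => p.trim())
    .filter((p) => p.length > 0)  // drop empty fragments at the edges
    .map((text, i) => ({ idx: (i + 1) * 10, text })); // sparse idx: 10, 20, 30...
}
```

Because the function uses no randomness and no model call, the same text always produces the same `sections.json`.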

## (C) Paragraph scoring — cheap-model prompt, batched (~30-40 paragraphs per call)

> You will receive numbered story paragraphs. For EVERY paragraph — no skipping — judge how well it could become one illustrated scene. Return one JSON array, one entry per paragraph:
> `{ "idx": 10, "score": 0-10, "why": "one short line", "type": "opening | routine | social | object | transition | tension | turning_point | aftermath | closing" }`
> Score high (7+): one concrete moment, characters acting, strong objects/light/setting, emotional charge made visible. Score low (0-3): exposition, summary, abstract reflection, dialogue with no stage. Judge each paragraph on its own; do not ration high scores.

Persist merged `scores.json`. Every scoring call must return exactly one entry per input paragraph — re-ask on count mismatch.
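The "exactly one entry per input paragraph" rule is easy to check mechanically before merging a batch. The sketch below returns a list of problems; an empty list means the batch can be merged, anything else means re-ask. `checkScoreBatch` is a hypothetical name, and the range check on `score` is an added sanity check rather than something the recipe requires.

```typescript
interface ScoreEntry {
  idx: number;
  score: number;
  why: string;
  type: string;
}

function checkScoreBatch(inputIdx: number[], entries: ScoreEntry[]): string[] {
  const problems: string[] = [];
  // Count how many entries came back for each idx.
  const seen = new Map<number, number>();
  for (const e of entries) {
    seen.set(e.idx, (seen.get(e.idx) ?? 0) + 1);
  }
  for (const idx of inputIdx) {
    const n = seen.get(idx) ?? 0;
    if (n === 0) problems.push(`missing idx ${idx}`);
    if (n > 1) problems.push(`duplicate idx ${idx}`);
  }
  const expected = new Set(inputIdx);
  for (const e of entries) {
    if (!expected.has(e.idx)) problems.push(`unexpected idx ${e.idx}`);
    // Added sanity check (not in the recipe): scores must lie in 0..10.
    if (!Number.isFinite(e.score) || e.score < 0 || e.score > 10) {
      problems.push(`score out of range for idx ${e.idx}`);
    }
  }
  return problems;
}
```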

## (D) Selection — script rule (deterministic, no LLM judgment)

Given target scene count K (default 10):
1. Rank paragraphs by score; take top K.
2. **Coverage**: split the text into thirds; each third must contribute ≥1 pick — if a third has none, drop the global lowest pick and take that third's highest-scoring paragraph instead (floor: accept any score ≥1 rather than leave a third empty).
3. **Adjacent merge**: consecutive picked paragraphs merge into one scene spanning a contiguous `idx` range (then backfill from the next-ranked paragraph to keep K scenes, re-checking coverage).
4. Output: K entries of `{ "sections": [idx, ...] }`, document order. Same text + same scores ⇒ same picks. Thresholds (K, floor, batch size) are tunable constants, not judgment calls.
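The four rules can be sketched as a small deterministic function. Several details are assumptions the recipe leaves open: thirds are measured by paragraph count; ties in score go to the earlier paragraph; the pick dropped for coverage is the lowest-ranked one whose own third keeps at least one pick; and backfill skips candidates adjacent to an existing pick so that each backfill adds a new scene instead of lengthening one. Short texts can still yield fewer than K scenes, and rare swaps that split a run are not handled.

```typescript
interface Scored { idx: number; score: number }
interface Pick { sections: number[] }

function selectScenes(paragraphs: Scored[], k = 10): Pick[] {
  const n = paragraphs.length;
  const third = (pos: number) => Math.min(2, Math.floor((pos * 3) / n));
  // Rule 1: rank positions by score (desc); ties go to the earlier paragraph.
  const ranked = paragraphs
    .map((_, pos) => pos)
    .sort((a, b) => paragraphs[b].score - paragraphs[a].score || a - b);
  const picked = new Set<number>(ranked.slice(0, k));

  // Rule 3: consecutive picked positions merge into one scene.
  const runs = (): Pick[] => {
    const out: Pick[] = [];
    let prev = -2;
    for (const pos of Array.from(picked).sort((a, b) => a - b)) {
      if (pos === prev + 1) out[out.length - 1].sections.push(paragraphs[pos].idx);
      else out.push({ sections: [paragraphs[pos].idx] });
      prev = pos;
    }
    return out;
  };

  // Rule 2: every third contributes at least one pick (floor: score >= 1).
  const enforceCoverage = (): void => {
    if (k < 3) return; // coverage is impossible with fewer than 3 picks
    for (let t = 0; t < 3; t++) {
      const positions = Array.from(picked);
      if (positions.some((p) => third(p) === t)) continue;
      const best = ranked.find((p) => third(p) === t && paragraphs[p].score >= 1);
      if (best === undefined) continue;
      const counts = [0, 0, 0];
      positions.forEach((p) => counts[third(p)]++);
      // Assumption: drop the lowest pick that does not empty its own third.
      const drop = ranked.slice().reverse()
        .find((p) => picked.has(p) && counts[third(p)] > 1);
      if (drop === undefined) continue;
      picked.delete(drop);
      picked.add(best);
    }
  };

  enforceCoverage();
  // Backfill from the next-ranked paragraphs until K scenes exist.
  for (const pos of ranked.slice(k)) {
    if (runs().length >= k) break;
    // Assumption: skip candidates already picked or touching an existing pick.
    if (picked.has(pos) || picked.has(pos - 1) || picked.has(pos + 1)) continue;
    picked.add(pos);
    enforceCoverage();
  }
  return runs(); // Rule 4: document order, one entry per scene
}
```

Since ranking, coverage, and merging use only the scores and positions, the same text and the same scores always give the same picks.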

## Call 1 — Story Brief (story facts only; NO scenes)

Input: full text (with idx numbers) + style lock (optional). Picks are **not** needed here. Return **exactly one JSON object** then the brief digest table. All values in English.

```json
{
  "story_summary": "2-4 sentences: premise, protagonist, movement, emotional core.",
  "audience_tone": "audience + tone + visual maturity in one line.",
  "emotional_arc": "one line tracing the emotional progression start → end.",
  "character_briefs": [
    {
      "name": "",
      "role": "protagonist / companion / minor",
      "visual_anchor": "30-50 words, prompt-ready physical description: age impression, face, build, outfit, signature marks/props, posture language. Pasted verbatim into board and scene prompts.",
      "personality": "one line: temperament + how it shows in body language.",
      "key_props": ["objects bound to this character"]
    }
  ],
  "solo_justification": "OMIT this field entirely when there are ≥2 characters. Include it ONLY for a genuine single-character story — one line explaining why the story has exactly one character present throughout (and, if applicable, why cast_pool is null).",
  "world_brief": {
    "anchor": "40-60 words, prompt-ready: place, era/technology level, materials, atmosphere, what belongs and what must never appear.",
    "locations": [{ "name": "", "visual": "one line", "function": "one line" }]
  },
  "cast_pool": {
    "rationale": "one line: why these figures populate this world",
    "figures": [
      { "archetype": "e.g. harbor fishmonger", "visual": "20-30 words, prompt-ready physical description + typical dress", "role_in_world": "one line" }
    ]
  },
  "motifs": [{ "name": "", "meaning": "one line", "arc": "first appears → changes → ends as" }],
  "things_to_avoid": {
    "character": ["story-level character avoids"],
    "world": ["story-level world avoids"]
  }
}
```

**`character_briefs` rule (important).** Extract **at least 2 main characters** (role protagonist/companion). Actively look for a real second character present in the text before concluding there is only one — do not lazily return a single protagonist. ONLY if the story genuinely has a single character present throughout (e.g. solo survival, a lone keeper) may you return one — and then you MUST set a top-level `solo_justification` string explaining why. **Never invent a phantom character to hit the count.**

**`cast_pool` rule.** A single grouped set of **7–8 world-fitting background archetypes** (passersby / NPCs / extras), NOT required to be named in the story text — they are a reusable "casting pool" so crowd/passerby scenes stop being avoided. The 7–8 figures must vary in age, role, and silhouette and all be consistent with `world_brief`. If the world genuinely has no other people possible (e.g. alone at sea), set `"cast_pool": null` and explain in `solo_justification`.
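Both rules can be checked on the returned JSON before it is persisted. The sketch below is one possible reading, not the engine's validator: `checkStoryBrief` is a hypothetical name, and it treats `solo_justification` as allowed whenever `cast_pool` is null, which is how the two rules above appear to fit together.

```typescript
// Hypothetical minimal shape: only the fields the two rules mention.
interface BriefToCheck {
  character_briefs: { name: string; role: string }[];
  solo_justification?: string;
  cast_pool: { figures: unknown[] } | null;
}

function checkStoryBrief(brief: BriefToCheck): string[] {
  const problems: string[] = [];
  const mainCount = brief.character_briefs.filter(
    (c) => c.role === "protagonist" || c.role === "companion",
  ).length;
  const justified = (brief.solo_justification ?? "").trim().length > 0;

  // character_briefs rule: at least 2 main characters unless justified.
  if (mainCount < 2 && !justified) {
    problems.push("fewer than 2 main characters and no solo_justification");
  }
  if (brief.cast_pool === null) {
    // cast_pool rule: a null pool must be explained in solo_justification.
    if (!justified) problems.push("cast_pool is null but solo_justification is missing");
  } else {
    const n = brief.cast_pool.figures.length;
    if (n < 7 || n > 8) problems.push(`cast_pool has ${n} figures, expected 7-8`);
    // Interpretation: with 2+ characters and a cast pool, the field should be omitted.
    if (mainCount >= 2 && brief.solo_justification !== undefined) {
      problems.push("solo_justification should be omitted");
    }
  }
  return problems;
}
```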

After the JSON, output `## Story brief digest` — `summary | audience_tone | characters (names + roles) | cast_pool (figure count or null) | locations | motifs`. This is the brief review surface, and the input that boards (03) and Scene Expansion (Call 2) both build on.

## Call 2 — Scene Expansion (scenes only)

Input: full text (with idx numbers), the K picks with their `why`/`type`, the **Story Brief from Call 1**, style lock (optional). Return **exactly one JSON object** (`scenes` only) then the digest table. All values in English.

```json
{
  "scenes": [
    {
      "id": "scene_01",
      "sections": [30, 40],
      "title": "short concrete title (what the image shows, not poetry)",
      "role": "from the pick's type tag",
      "moment": "2-4 sentences: ONE exact moment grounded in the anchored paragraphs — do not import events from elsewhere in the text.",
      "characters_present": ["names; [] for empty scenes"],
      "extras": false,
      "character_state": "emotion + body language of each present character, one line.",
      "location": "from world_brief.locations",
      "time_light": "time of day + weather + light quality, one line.",
      "key_objects": ["2-5 objects present in the anchored paragraphs"],
      "mood": "one line emotional beat.",
      "visual_hook": "one line: the single most image-worthy thing in this scene.",
      "carry": "one line: what object/state carries over from the previous scene."
    }
  ]
}
```

`extras` is a boolean — set `true` when this scene's setting should include background people (crowd, passersby, onlookers). This drives whether the cast board is attached at generation; `characters_present` still lists only named characters in shot.

Hard rules: one scene per pick, `sections` copied from the pick unchanged (never re-anchor); scene order = document order; characters accumulate emotion across scenes (use `carry`); concrete over abstract — every scene drawable from its own fields.
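The structural part of these rules (one scene per pick, unchanged `sections`, document order) can be verified without a model. A minimal sketch, assuming the picks are already in document order as the selection step outputs them; `checkScenes` is a hypothetical name.

```typescript
interface Pick { sections: number[] }
interface SceneAnchor { id: string; sections: number[] }

function checkScenes(picks: Pick[], scenes: SceneAnchor[]): string[] {
  const problems: string[] = [];
  // One scene per pick.
  if (scenes.length !== picks.length) {
    problems.push(`expected ${picks.length} scenes, got ${scenes.length}`);
  }
  const count = Math.min(picks.length, scenes.length);
  for (let i = 0; i < count; i++) {
    // Sections must be copied from the pick unchanged (never re-anchored).
    if (JSON.stringify(scenes[i].sections) !== JSON.stringify(picks[i].sections)) {
      problems.push(`${scenes[i].id}: sections differ from pick ${i + 1}`);
    }
    // Scene order must follow document order (first idx strictly increasing).
    if (i > 0 && scenes[i].sections[0] <= scenes[i - 1].sections[0]) {
      problems.push(`${scenes[i].id}: out of document order`);
    }
  }
  return problems;
}
```

The qualitative rules (emotional carry, concrete over abstract) still need the human review of the scene digest.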

After the JSON, output `## Scene digest` — one row per scene: `id | sections | role | location | time_light | characters | extras | key object | mood`. This is the human review surface.

## Downstream contract

- `sections.json` rows → `story_sections` (text body, reader v2). `story.json` → `story_bibles.structured_brief` (permissive jsonb; `scenes[].sections` is the lean-v1 **Section Anchor**). The Story Brief (Call 1) and the `scenes` array (Call 2) are persisted into the same `story.json` — Call 1 first (so boards/UI can start), Call 2 merged in once scenes exist.
- Image slots are created per scene with `section_id` = first idx of `scenes[].sections` (join by scene `id` when 05's prompts come back).
- `visual_anchor` / `world_brief.anchor` / `things_to_avoid` are pasted verbatim into board prompts (03) and compact prompts (05).
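The slot contract above can be sketched as a join between the scenes and the prompts returned by 05. The `ImageSlot` shape and the `buildSlots` name are hypothetical; only `section_id` and `prompt` are named in the recipes, and mapping an `idx` to a database row id is out of scope here.

```typescript
interface SceneAnchor { id: string; sections: number[] }
interface FinalPrompt { id: string; prompt: string }

// Hypothetical slot shape for illustration.
interface ImageSlot {
  scene_id: string;
  section_id: number;      // first idx of the scene's sections
  prompt: string | null;   // null until 05's prompts come back
}

function buildSlots(scenes: SceneAnchor[], prompts: FinalPrompt[]): ImageSlot[] {
  // Join by scene id, as the contract describes.
  const promptById = new Map<string, string>(
    prompts.map((p): [string, string] => [p.id, p.prompt]),
  );
  return scenes.map((scene) => ({
    scene_id: scene.id,
    section_id: scene.sections[0],
    prompt: promptById.get(scene.id) ?? null,
  }));
}
```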

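Per-scene prompts

Each scene gets one composed prompt that is ready for image generation. Free to copy.

Pick your scenes first

View recipe

Engine recipe · 05-final-prompts.md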

# 05 — Compact Scene Prompts (one LLM call per story, all scenes)

Replaces spike's master-scene + per-scene final-builder + platform exporter. One call writes the compact prompt for **every** scene. Input: `story.json` + `style-lock.json` + `board-qa.json` `patch_instructions`. Output: `final-prompts.json` — each `prompt` string is what gets stored in `story_image_slots.prompt` and sent to the image model with the scene's **selected subset of boards** attached (see the generation block below — no fixed reference count).

## Prompt

You are a senior image-generation prompt writer. Using the story data, style lock, and QA patch instructions provided, write one **compact, self-contained** image-generation prompt per scene.

**Lock vs choose.** Locked by the inputs: character identity (`visual_anchor`), world logic (`world_brief.anchor`), style (`prompt_insert`, `palette`), story-level avoids. You choose per scene: ONE moment, ONE camera distance, ONE camera angle, one character placement, one main light source, one narrative focal point, a small set of props from `key_objects`.

Each prompt, in this order, as flowing English prose (no markdown, no headings, no lists):

1. **Style opener** — `{prompt_insert}` as the first clause.
2. **Camera** — distance (extreme wide / wide / full-body environmental / medium / medium close-up / close-up / object insert) + angle (eye-level / high / low / over-the-shoulder / profile / back view / top-down / through-doorway / through-window) + viewpoint in one sentence.
3. **Subject** — each present character: full `visual_anchor` inline (verbatim, compressed only if >50 words) + this scene's pose, gesture, expression from `character_state` and `moment`.
4. **Staging** — explicit foreground, midground, background; placement off-center unless justified; the focal point stated.
5. **World & props** — location rendered per `world_brief.anchor`; only this scene's `key_objects`, each placed concretely.
6. **Light & color** — `time_light` expanded into one light source, shadow quality, warm/cool balance; then `Palette: {palette}.`
7. **Mood line** — `Mood: {mood}.` plus one short closing image-sentence (the scene's emotional truth).
8. **Negative tail** — one sentence: top style negatives + this scene's specific avoids from `things_to_avoid`, e.g. *"No hard outlines, no anime, no vector polish, no photorealism, no logos, no text."*

**Hard rules**
- 250–350 words per prompt. Self-contained: a reader with only this string can paint the scene — no "see board", no "than scene 04" cross-references, no field names, no Chinese.
- Integrate every `patch_instructions` item into every applicable prompt as positive description.
- **Batch anti-repetition**: between consecutive scenes vary at least 3 of {camera distance, camera angle, character placement, foreground element, focal point, lighting}. Never use centered front-facing medium shot as a default.

Return **exactly one JSON object**, then the digest:

```json
{
  "scenes": [ { "id": "scene_01", "title": "", "prompt": "the compact prompt string" } ]
}
```

After the JSON, output `## Variety map` — one row per scene: `id | distance | angle | placement | focal point | light`. This is how the human (and the next batch) checks rhythm at a glance.
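Two of the hard rules lend themselves to an automated check once the variety map is parsed: the "vary at least 3 of 6" rule between consecutive scenes, and the 250 to 350 word budget. In this sketch the `VarietyRow` shape is hypothetical, and its `foreground` field is an assumption, since the variety map row as specified has no foreground column even though the anti-repetition rule names that axis.

```typescript
interface VarietyRow {
  id: string;
  distance: string;
  angle: string;
  placement: string;
  foreground: string; // assumed extra column, see the note above
  focal: string;
  light: string;
}

const AXES = ["distance", "angle", "placement", "foreground", "focal", "light"] as const;

function repetitionProblems(rows: VarietyRow[], minChanged = 3): string[] {
  const problems: string[] = [];
  for (let i = 1; i < rows.length; i++) {
    // Count the axes whose value differs from the previous scene.
    const changed = AXES.filter((axis) => rows[i][axis] !== rows[i - 1][axis]).length;
    if (changed < minChanged) {
      problems.push(`${rows[i - 1].id} -> ${rows[i].id}: only ${changed} of ${AXES.length} axes vary`);
    }
  }
  return problems;
}

function wordCountInRange(prompt: string, min = 250, max = 350): boolean {
  // Whitespace-separated tokens are a rough proxy for words.
  const words = prompt.trim().split(/\s+/).filter((w) => w.length > 0).length;
  return words >= min && words <= max;
}
```

Comparing raw strings only works if the model uses the same vocabulary for each axis across scenes, so a real check would likely normalize the values first.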

## Exemplar (spike output that passed acceptance, 儿童气象站 "Children's Weather Station", scene 01)

> Stylized 2D matte gouache coastal-storybook illustration. High-angle overview of a top-floor apartment balcony in a seaside town, mid-morning of midsummer, looking down from just inside the doorway. Pale cream tiled floor. In the center stands Pip — eight-year-old child, round face, dot eyes, small smile, candy-yellow t-shirt, soft blue shorts, sneakers, tape roll on pocket — hands on hips, chin up, excited breath. Behind her along the railing a freshly tied string of paper bunting (candy red, sun yellow, leaf green, soft pink, sky blue triangles) curves across the frame. To her right a plastic pinwheel windmill on a taped cardboard base. To her left three clean empty glass bottles in a loose row with soft horizontal highlights. On the floor in front of her: an open blank notebook, scattered crayons, tape roll, cardboard sign, safety scissors, colored paper scraps. At the railing edge lower-right a small soft rounded pigeon shape — round body, tilted head, wings tight — suggested not formally introduced. Beyond the railing: pale rooftops of a seaside town, calm turquoise sea, wide open pale-blue sky with simple rounded clouds. Visible crayon-collage and soft gouache texture throughout. Rounded simplified shapes, chalky pastel edges, large calm areas of sky and sea. No hard outlines, no anime, no vector polish, no photorealism. Palette: pale sky cream, seafoam blue, sunny turquoise, candy red, sun yellow, leaf green, soft pink. Mood: bright, cheerful, slightly breathless with announcement. A child has just opened a small world on a balcony.

## Generation call (per-scene reference selection)

Send the scene's `prompt` plus a **selected subset** of the board set — not a fixed list. There is **no fixed reference count and no 4-image cap**; attach exactly the boards this scene needs:

- **Always attach**: World board, Composition board, Lighting board.
- **Character board(s)** — attach only the board(s) for the characters in this scene's `characters_present`. Do NOT attach a character board for a character who is absent from the scene: it pulls that character into the image.
- **Cast / Extras Pool board** — attach only if the scene's `extras` is `true`.

Label whichever boards are attached and instruct consistency, e.g.:

```
World / Setting Board — location, materials, era, atmosphere
Composition Board — camera grammar only
Lighting / Color Board — light recipe and palette only
Character Board (<name>) — identity (face, body, outfit, palette)   [one per present character]
Supporting Cast / Extras Pool Board — background figures            [only when extras is true]
Follow the references for consistency. Do not copy any board layout or thumbnail as the final scene.
```
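The selection rules and the label block can be combined in a short helper. This is a sketch under assumptions: `BoardImages`, `BoardRef`, `selectReferences`, and `referenceBlock` are hypothetical names, board images are represented as plain strings (a path or URL), a present character without a board is silently skipped, and the label and its description are joined with a colon.

```typescript
interface SceneForRefs { characters_present: string[]; extras: boolean }

interface BoardImages {
  world: string;
  composition: string;
  lighting: string;
  characters: Record<string, string>; // character name -> board image
  cast: string | null;                // null when cast_pool is null
}

interface BoardRef { label: string; use: string; image: string }

function selectReferences(scene: SceneForRefs, boards: BoardImages): BoardRef[] {
  // Always attached: World, Composition, Lighting.
  const refs: BoardRef[] = [
    { label: "World / Setting Board", use: "location, materials, era, atmosphere", image: boards.world },
    { label: "Composition Board", use: "camera grammar only", image: boards.composition },
    { label: "Lighting / Color Board", use: "light recipe and palette only", image: boards.lighting },
  ];
  // Character boards only for characters present in the scene.
  for (const name of scene.characters_present) {
    const image = boards.characters[name];
    if (image === undefined) continue; // assumption: no board, no reference
    refs.push({ label: `Character Board (${name})`, use: "identity (face, body, outfit, palette)", image });
  }
  // Cast board only when the scene has extras.
  if (scene.extras && boards.cast !== null) {
    refs.push({ label: "Supporting Cast / Extras Pool Board", use: "background figures", image: boards.cast });
  }
  return refs;
}

function referenceBlock(refs: BoardRef[]): string {
  const lines = refs.map((r) => `${r.label}: ${r.use}`);
  lines.push("Follow the references for consistency. Do not copy any board layout or thumbnail as the final scene.");
  return lines.join("\n");
}
```

Because absent characters never enter `refs`, the failure mode described above (an unused character board pulling that character into the image) cannot occur by accident.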

Provider notes: **midjourney** — append `--ar 16:9`, boards via image prompts/`--sref`; **gpt-image / nano-banana** — boards as input images, ref block prepended to the prompt; **text-only providers** — send the prompt alone (it is self-sufficient; expect weaker cross-scene consistency).

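Generate scene images

Uses credits

Turn each scene prompt into a scene image. This also consumes credits.

Generate the per-scene prompts in the previous step first

N scene images + copyable prompts · The pipeline stops here: no book rendering, no page flipping, no publishing.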