This time we're doing something different: we're splitting the 0-to-1 production process for AI short dramas into two articles. This public-account article covers methods, results, and judgment calls, along with the prompts, so that patient readers can follow the tutorial and get started with AI short dramas.
Here are the main contents of this article.
1. Tool preparation
2. The AI short drama workflow in detail
- Script creation and storyboard breakdown
- Core asset design and accumulation
- Multimodal direct-to-video generation
- Local touch-up of video frames
- Post-production editing and audio-visual reconstruction
3. Quick short drama scriptwriting
- Prompt template for quickly writing short drama scripts with AI
4. Asset construction
- Character design
- Scene design
- Prop asset design
All right, let's get started.
I. Tool preparation
Note:
Only the desktop tools listed above need to be downloaded and installed on your computer. The remaining tools can be used directly online.
Try not to work on a phone. For AI video, a computer is easier and gives better results.
II. The AI short drama workflow in detail
The five-step workflow we first introduced for AI video also applied to AI short drama creation, with minor adjustments. It suited every AI short film and short drama workflow before Seedance 2.0 came out.

But with the arrival of a new generation of multimodal video models such as Seedance 2.0, the industry's underlying production logic has changed completely.
The new generation of models accepts combined text, image, video, and audio references, and can precisely follow instructions about composition, camera movement, action, and even sound. This brings content production very close to an industrial pipeline. Once the input side can constrain the output this strongly, the key to producing finished films at scale is no longer blind, inefficient re-rolling of images, but the preparation, reuse, and accumulation of assets. For short dramas whose quality bar does not demand perfect detail, we can even skip first-frame-driven storyboard images.
We have therefore formally upgraded the traditional production process to a new "script-asset-video-cut" system:

Script: the screenplay.
Asset: the core assets, including highly reusable character images, scenes, props, and character voice material. They carry the main responsibility for stability and consistency.
Video: drive the model with "character/scene assets + storyboard prompts" to directly generate dynamic video clips with native audio.
Cut: post-production editing and touch-up of frames.
Under this new system, our work is refined into the following five core steps:
Script creation and storyboard breakdown
The first step of a short drama is always the story. In the past, because AI struggled with delicate emotions and expressions, we preferred fast-paced genres with simple character relationships.
But now that these tools have made a qualitative leap in controlling micro-expressions and body movement, AI handles emotional expression very well. This means that the decisive effect of an excellent, solid script on a work's success has been amplified as never before.

Once we have a high-quality story, we need to turn it into a standard screenplay format: clear visual sense, inner activity made visible, and refined emotional dialogue. This process is screenwriting. In practice, in the vast majority of cases it is done by a professional screenwriter, and we usually receive the finished script directly.
Next, we use AI to break the script down quickly into a storyboard script containing shot size, camera movement, and duration. This gives later generation a precise "construction drawing" and converts the script into storyboard prompts that AI video models understand more easily.
Core asset design and accumulation
This is the most disruptive step in the new process: replacing "drawing storyboard images" with "building an asset library".
Assets are the cornerstone. They can be called repeatedly throughout the drama to ensure visual consistency.
Core assets fall into two main categories. The first is character assets: like the costume and makeup test photos of a film production, we need to create stable multi-view images of the key characters (front, side, back) and finalize each character's voice material.
The second is scene assets, such as the Songmen Palace or a modern office. Scene assets should be generated as far as possible as "empty shots" with no people in them, and prepared from several viewpoints, such as a front shot, a reverse shot, and a panorama. Once established, these assets can be reused throughout the project's production cycle.
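To make the idea of a reusable asset library concrete, here is a minimal sketch in Python. It is only an illustration: the class names, the view and angle labels, and the file paths are hypothetical and do not correspond to any tool's real data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumption: a character needs front, side, and back views, as described above.
REQUIRED_CHARACTER_VIEWS = ("front", "side", "back")


@dataclass
class CharacterAsset:
    name: str
    # View label -> image path (labels and paths are hypothetical).
    views: Dict[str, str] = field(default_factory=dict)
    # Path to the finalized voice material, if any.
    voice_sample: Optional[str] = None

    def missing_views(self) -> List[str]:
        """Return the required views that have not been generated yet."""
        return [v for v in REQUIRED_CHARACTER_VIEWS if v not in self.views]


@dataclass
class SceneAsset:
    name: str
    # Angle label -> image path, e.g. "front", "reverse", "panorama".
    angles: Dict[str, str] = field(default_factory=dict)


@dataclass
class AssetLibrary:
    characters: Dict[str, CharacterAsset] = field(default_factory=dict)
    scenes: Dict[str, SceneAsset] = field(default_factory=dict)

    def add_character(self, asset: CharacterAsset) -> None:
        self.characters[asset.name] = asset

    def add_scene(self, asset: SceneAsset) -> None:
        self.scenes[asset.name] = asset

    def references_for_shot(self, character_names, scene_name, angle) -> List[str]:
        """Collect the reference images to attach to one video-generation request."""
        refs = [self.characters[n].views["front"] for n in character_names]
        refs.append(self.scenes[scene_name].angles[angle])
        return refs
```

Keeping every shot's references in one lookup like this is what lets the same character and scene images be reused across the whole drama instead of being regenerated per shot.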

Multimodal direct-to-video generation
When the "script" and the "assets" are ready, we skip the tedious image-drawing stage and go straight to high-throughput video generation.
The heart of this step is to feed the prepared character images and the set of scene images into a video model such as Seedance 2.0 and enter the corresponding storyboard prompt. Thanks to the model's strong comprehension, it can directly output video clips with precise character movement and smooth camera work.
More importantly, the current process generates sound and picture together: while generating the images, the model directly incorporates the voice assets prepared earlier, completing character dubbing and lip-sync alignment and eliminating the tedious separate dubbing step of the traditional process.
Local touch-up of video frames
Direct video generation greatly improves efficiency, but the AI still sometimes produces clipping, broken limbs, or missing details. In this new process, we do not go back to the start and re-roll; we intervene at key frames instead.
We can fine-tune and repair video frames with fine-grained controls such as redrawing the first frame, local redrawing, and regenerating the video. This step ensures that every clip sent to the editing software meets a commercially usable standard.
Post-production editing and audio-visual reconstruction
The last step is editing. In the "script-asset-video-cut" process, the "cut" step has become more focused, because most motion and voice problems were already solved in the earlier stages.
We assemble all the approved video clips in the editing software to control the overall narrative rhythm. On top of that, we lay in background music generated by tools such as Suno, plus fuller ambient sound and action sound effects to enhance immersion. Finally, we polish the transitions, add subtitles, and grade the color. Once this audio-visual polish is complete, an efficient, high-quality commercial AI short drama is finished.
III. Quick short drama scriptwriting
What does a short drama look like, and what is its overall rhythm?
Single-episode structure formula
Up-front conflict + emotional pull + staged face-slap payoff + ending hook
A short drama episode usually runs 1-2 minutes. There is no time for slow background setup, so it must cut straight to the point.
Here we share a prompt template for quickly writing short drama scripts, which incorporates the techniques described above.
Note: manually modify [type] and [set]
Open any AI chat tool (DeepSeek/Doubao, or ChatGPT/Gemini/Claude if you have access)
The following example uses Doubao:
Whichever tool you use, turn on at least "thinking mode" when writing scripts; the fast mode is meant for everyday questions.

The model will then output a short drama outline based on the story setup you entered. Once the outline is right, have it write the script from the outline. We recommend writing one episode at a time.

Outline

Script
IV. Asset construction
i) Character design
Character design is like casting actors for your own script.
A character's look cannot be set at random. Read the script and understand the character's nature before starting on "how to realize the character with AI".
Overall, we recommend this workflow:
Take the character information from the script or novel, extract visual keywords from the character's short bio, organize those keywords into an AI image prompt, and add the three-view only after the look has stabilized.
AI-assisted extraction of character information
In practice, you can send the following prompt template to an AI chat tool (DeepSeek/Doubao/Gemini/ChatGPT/Claude) and let the AI extract a first version of the asset prompts.
You are a professional AI image prompt designer and short drama visual asset breakdown specialist. Based on my story script, break down the visual assets the drama needs to generate, and output AI image prompts for "character assets", "scene assets", and "prop assets" respectively.
Overall style requirements:
Realistic, natural, live-action look. Characters, costumes, scenes, and props must fit the script's character settings, era, geography, social class, and dramatic atmosphere.
I. Character asset requirements
List every character in the script that needs to appear on screen and produce a Chinese AI image prompt for each. The prompts must meet the following requirements:
1. Must be a full-body, front-facing image on a pure white background.
2. "Front view" and "from head to sole" must be written explicitly.
3. The shoes must be described, to avoid generating half-body images.
4. The character holds nothing.
5. Hands hang naturally and the body stands upright.
6. The character has no expression; the face is neutral.
7. The character's height, build, age, hairstyle, hair color, clothing, shoes, face, and eyes must be described.
8. The character's look must be consistent with their ethnicity, era, identity and occupation, hair and makeup, and costume.
9. Adult characters should be as close as possible to a 2:8 body proportion, good-looking, natural, and photogenic.
10. Do not make characters too ugly or too extreme, even villains; keep them as attractive as possible while preserving their character traits.
11. No children or elderly people should appear unless the script explicitly requires them.
Character output format:
[Character name]
Role positioning:
AI image prompt:
II. Scene asset requirements
List all scenes that need images, ordered by their importance in the script, and generate an AI prompt for each scene.
The scene prompts must meet the following requirements:
1. No person may appear in the scene.
2. The scene style must be realistic, with a live-action film look.
3. The scene must fit the script's era, geography, social class, atmosphere, and dramatic function.
4. Each important scene produces at least three prompts:
– Front shot
– Reverse shot
– Side panorama
5. The reverse shot should be designed around where the characters are likely to stand and the direction of the dialogue, to serve as a reference for later storyboarding.
6. The side panorama should show the full spatial relationship, making character positions and movement easy to understand.
7. No text, watermarks, posters, character silhouettes, or irrelevant elements may appear in the scene.
Scene output format:
[Scene name]
Scene purpose:
Front shot prompt:
Reverse shot prompt:
Side panorama prompt:
III. Prop asset requirements
List all important props in the script and generate a Chinese AI prompt for each prop.
The prop prompts must meet the following requirements:
1. Must use a pure white background.
2. No person may appear.
3. No real environmental background.
4. The prop should be shown alone, with a clear subject.
5. Shape, material, degree of wear, and traces of use must match the script.
6. If the prop is tied to the main plot, an identity reversal, a suspense point, or a payoff moment, reflect its importance in the prompt.
7. No text, watermarks, or irrelevant decorations may appear.
Prop output format:
[Prop name]
Purpose of the prop:
AI image prompt:
Final output requirements:
1. Output only the asset list and the AI image prompts.
2. Do not explain your creative thinking.
3. Do not actually generate images.
4. All prompts are in Chinese.
5. Each prompt must be specific and ready to copy directly into an AI image tool.
6. If the script lacks information, fill it in reasonably according to the short drama's genre and setting, but do not deviate from the script.
The generated results are shown below.

Focus on three things: the background story, the character traits, and the visual style. The background story tells you whether the setting is ancient or modern. Character traits cover how the outward look matches the inner personality (what the personality is and how it shows outwardly) and whether the character is male or female. Visual style matters too: costume color, for example, must fit the character's setup.
Feeding this information to the AI produces output of much higher quality and richness than a one-sentence prompt. Note that the shoes must be described; otherwise the character's full-body image will probably be incomplete.
Basic LibTV image generation operations
Open LibTV and click "Start creating" to make a new canvas.

Double-click the canvas to add an image node

Select an image model as needed

Enter the character/scene/prop prompt:

Select the resolution and aspect ratio:
For an ordinary full-body character image, a 9:16 aspect ratio is sufficient.
For a character three-view or four-view, choose 16:9 or 21:9 at 2K resolution.
For scene images, choose 16:9 at 2K resolution.
For prop images, choose 1:1 at 1K or 2K resolution.
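The settings above can be kept as a small lookup table so that every asset type is generated with consistent parameters. The sketch below is illustrative only: the preset names are hypothetical, and treating "2K" as a 2048-pixel long edge is an assumption, since the actual pixel size depends on the tool.

```python
# Aspect ratios and resolution labels per asset type, taken from the list above.
# The dictionary keys are hypothetical names chosen for this sketch.
ASSET_PRESETS = {
    "character_full_body": {"ratios": ["9:16"], "resolutions": []},
    "character_multi_view": {"ratios": ["16:9", "21:9"], "resolutions": ["2K"]},
    "scene": {"ratios": ["16:9"], "resolutions": ["2K"]},
    "prop": {"ratios": ["1:1"], "resolutions": ["1K", "2K"]},
}


def pixel_size(ratio: str, long_edge: int) -> tuple:
    """Convert an aspect ratio such as "16:9" into (width, height) in pixels.

    Assumption: the resolution label refers to the long edge of the image.
    """
    w, h = (int(part) for part in ratio.split(":"))
    if w >= h:
        return long_edge, round(long_edge * h / w)
    return round(long_edge * w / h), long_edge
```

For example, `pixel_size("16:9", 2048)` gives a landscape canvas for a scene image, while `pixel_size("9:16", 2048)` gives the portrait canvas used for a full-body character image.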

After a moment we get a first draft of the character image. Keep adjusting the prompt to fit the script's world view, plot, and aesthetic style, and try again and again until you get a character look you are happy with.

Results from different image models
Character images for live-action AI short dramas mainly rely on four models:
The first is the Nano Banana series
The second is the Jimeng Seedream image model
The third is the Z-Image series
The fourth is the new GPT Image 2 model
As long as the character fits the script, the better-looking, more realistic, and more lifelike, the better.
There is usually no clear boundary between models; use whichever produces the best result.
For example, the female lead Shen Ning in our own short drama "The White-Eyed Wolves of Rebirth":

Jimeng Seedream series: understands Chinese-style elements better and is cheaper, but the generated characters have a more obvious "AI look" overall.
Z-Image Turbo: trained on a large amount of real-person photos from Xiaohongshu, so its output feels strongly realistic. However, it tends toward an "influencer face". It is the cheapest and most accessible open-source model. We recommend it only for single-character images, not for scenes, three-views, or local redraws.
Nano Banana series: launched by Google. Its underlying training data is mostly European and American, so the generated characters skew toward Western looks. With a detailed prompt it can generate excellent character images, and it suits 3D-style images. Nano Banana 2, for example, strikes a balance between speed, quality, and cost.
GPT Image 2: a model OpenAI opened for public testing on April 21, 2026, with impressive results, strong aesthetics, and good prompt comprehension. It integrates reasoning into image generation, adds features such as web search, and significantly improves text accuracy. Its style reproduction is excellent, but it is relatively expensive.
Common problems in character design
1) Costumes that do not match the period dress code
Modern dramas rarely have this problem; period dramas are the disaster zone.
Period costumes raise the question of historical dress codes. Clothing differs from one historical period to another, and a costume drama can make targeted choices according to its genre.
A period idol drama can favor beauty and elegance. A xianxia look can stress an ethereal, otherworldly air. A wuxia look can show a free-spirited, unrestrained style. Traditional dynasty dramas, such as classic court intrigue and household intrigue stories, are best based on the dress code of one specific dynasty, which creates a sense of history.

If you have no concept in mind and don't know what looks good, look for films and series of the same genre, find their stills, study how the characters look, and have the AI imitate that character styling.

⚠️: Never use an actor's face directly for video production. The AI may reference an actor's styling, but never their likeness directly.
2) Character proportions

The character makeup photo must include at least one full-body front image, not just a half-body shot. Later videos need some long shots and wide shots that show the character as a whole.
During creation, keep the character's body proportions as close as possible to mainstream aesthetic standards. Place the character in the center of the frame so the visual focus stays on them. Pay particular attention to avoiding short legs, so the image looks harmonious and attractive.
The prompt should therefore include these key elements wherever possible: a full-body image, the character standing naturally, and a 2:8 body proportion. If a full-body image still does not come out, describe the shoes or feet in the prompt.
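A quick pre-flight check can catch a prompt that forgets one of these elements before any generation credits are spent. This is a rough sketch under stated assumptions: the keyword lists are hypothetical English stand-ins (the prompts in this workflow are written in Chinese, so the keywords would need to be translated), and simple substring matching will miss paraphrases.

```python
# Required element -> keywords that signal it is present in the prompt.
# Keyword lists are illustrative assumptions, not an exhaustive vocabulary.
REQUIRED_ELEMENTS = {
    "full-body framing": ("full body", "full-body", "head to toe"),
    "natural standing pose": ("standing",),
    "2:8 body proportion": ("2:8",),
    "shoes or feet": ("shoes", "boots", "sandals", "barefoot", "feet"),
}


def missing_elements(prompt: str) -> list:
    """Return the names of required elements that the prompt does not mention."""
    text = prompt.lower()
    return [
        name
        for name, keywords in REQUIRED_ELEMENTS.items()
        if not any(keyword in text for keyword in keywords)
    ]
```

An empty list means the prompt mentions every element; anything returned is a reminder of what to add before generating.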
3) Looks
For a truly mass-market platform, there is a difference between "plain" and "ugly". Plain can mean clean, neat, and understated; ugly usually means disharmonious features, a stiff expression, and muddy colors.
You can set up a basic checklist for character design:
Thumbnail check: shrink the image to about 1/6 of a phone screen and see whether the character's silhouette is still recognizable
Expression check: whether the eyes are symmetric, the gaze direction is not odd or vacant, and the mouth looks natural
Color check: whether the whole figure looks grey, and whether large areas of mixed mid-tones make the image look "covered in dust".
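The color check can be roughly automated by measuring how saturated the character's pixels are. The sketch below uses only the Python standard library and works on a plain list of RGB tuples; how those pixels are read from the image file is left out. The 0.15 threshold and the rule for skipping near-white background pixels are assumptions made for illustration, not calibrated values.

```python
import colorsys


def mean_saturation(pixels, ignore_white=True):
    """Average HSV saturation (0.0 to 1.0) over a list of (r, g, b) tuples.

    Near-white pixels are skipped by default so a pure white background
    does not drag the average down (assumption: all channels above 240).
    """
    total = 0.0
    count = 0
    for r, g, b in pixels:
        if ignore_white and min(r, g, b) > 240:
            continue
        _, saturation, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        total += saturation
        count += 1
    if count == 0:
        raise ValueError("no non-background pixels to measure")
    return total / count


def looks_washed_out(pixels, threshold=0.15):
    """Flag a character image whose colors are mostly grey (hypothetical threshold)."""
    return mean_saturation(pixels) < threshold
```

A flagged image is only a hint to look again; the final call on whether a character looks "dusty" remains a visual judgment.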
However strong the AI is, it can only magnify the author's own taste. The best way to improve here is to keep comparing your work against outstanding films and short dramas.
4) Other issues

Unless the character holds an object throughout the whole drama, the character image should not hold anything.
Character images should have no other background; otherwise it may affect later generation.
In the full-body makeup photo, the character should stand as upright as possible and be fully visible.
Character three-view and four-view assets
Once the front-facing makeup image of each character is settled, we usually generate a face close-up plus a character three-view, which is what we call the "four-view".
The prompt is as follows:
Generate a full-body three-view and a facial close-up. (The left third is an extra-large facial close-up; the right two thirds contain the front view, side view, and back view; 2:8 body proportion; pure white background.) Live-action realistic style.
Note: the prompt should state the art style; otherwise different styles may appear within the same image.

The result is as follows:

If you want live-action video, there is one last step: right-click the image and choose "Seedance 2.0 Compliance". As long as the image passes verification, you can freely generate live-action video in LibTV.
Generally, as long as the generated image does not resemble a celebrity, an AI-generated image will basically pass. Because of this mechanism, LibTV currently has relatively loose restrictions on real-person review, which is why we recommend LibTV first.
If you plan to generate video in Jimeng, this face close-up will probably have trouble passing the real-person review. In that case the image needs other processing, which will be discussed in detail later.


ii) Scene design
Design principles
• Wide coverage: the scene should take in as much as possible, using a panorama or ultra-wide lens wherever you can to cover more content.
• Three views / front and reverse shots: depending on where the characters stand, the same scene may need different angles.
• No people: when designing a scene, avoid having any person in it. Once people appear in the scene image, they may be carried into later generations and disrupt the result we expect.
Note: front and reverse scenes are not both required every time. It depends on the script. In the example below, both are needed because a girl kneels at the door facing her master. Otherwise, a front scene usually suffices.

Multi-panel scene images
Once we have a front scene we are happy with, we can also generate a four-panel grid with the prompt below to get different angles, and then pick the panels we like.
Generate a 2×2 four-panel multi-view reference image based on the uploaded image.
The uploaded image may be taken from any angle; use it as the only visual basis for the scene. Reconstruct the same complete three-dimensional space from the uploaded image and generate four standard views: top left, a top-down view showing the overall floor-plan relationship of the scene from above; top right, a front view facing the main area of the scene; bottom left, a left view looking toward the main area; bottom right, a right view looking toward the main area. The four panels must show the same scene, with every element keeping the same count, relative position, scale, orientation, front-to-back order, material, and lighting. Areas not directly shown in the original image may be completed reasonably from visible clues, but do not add new elements unrelated to the original image. The left and right views must be level, stable, and clear, not tilted, and not occluded or blocked by large foreground objects such as columns, trees, fences, wall corners, vehicles, or rocks. The whole image keeps the same space, the same lighting, the same tone, and the same atmosphere, with no text or labels added.

As you can see, the details generally stay consistent across the different angles. Indoor scenes are relatively complex, so if some local details (for example, the arrangement of tables and chairs) need changing, fine-tune them separately with a model such as Nano Banana.
Do not use the four-panel image directly to generate video. Crop out the single panel you need instead.
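Cropping a single panel out of the 2×2 grid comes down to computing four rectangles. The sketch below assumes the grid is split exactly in half both ways with no borders or gutters, and it maps each quadrant to the view requested in the prompt above. The boxes use the (left, upper, right, lower) convention, which is the form Pillow's `Image.crop` accepts if you use that library for the actual cropping.

```python
def quadrant_boxes(width: int, height: int) -> dict:
    """Return crop boxes (left, upper, right, lower) for a 2x2 grid image.

    Assumption: the four panels are equal in size and have no padding.
    Keys follow the layout requested in the grid prompt.
    """
    half_w, half_h = width // 2, height // 2
    return {
        "top_down": (0, 0, half_w, half_h),        # top-left panel
        "front": (half_w, 0, width, half_h),       # top-right panel
        "left": (0, half_h, half_w, height),       # bottom-left panel
        "right": (half_w, half_h, width, height),  # bottom-right panel
    }
```

If the model adds margins between panels, the boxes will need a manual inset; checking the cropped result by eye is still the safest step.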

LibTV Panorama
If you only have a single front scene and want another angle, you can use the panorama feature built into LibTV.

You can create a 720-degree panorama, rotate it manually to the angle you want, and click the screenshot button at the top. For indoor scenes, however, a large change of angle causes distortion and the result looks less natural. The feature suits outdoor scenes better.

Manually generating a reverse-shot scene
Basic logic: find the visual anchor
Most complex reverse-shot scenes can be generated by following the prompt formula below.
Prompt formula:
Handheld + style + angle and image + image + environment

(iii) Prop asset design
For prop assets, paste the AI prompt directly into LibTV and choose an aspect ratio, usually 1:1.

Prop assets are relatively simple, but for period dramas keep the props true to the era: modern elements must not appear.

Click Generate and pick the most suitable image.
Note that prop images should also use a white background, so that nothing else contaminates them.
Because there is too much content to fit in one public-account article, the rest is shared in "Introduction to the AI Short Play, from 0 to 1".