The Art of Stepped Animation: Why Modern CGI Anime Deliberately Skips Frames
We have spent decades calibrating our visual expectations to a Western animation pipeline that prizes fluidity above all else: Disney's twelve principles, Pixar's subframe blur, the silky 60-frames-per-second motion of modern gaming. When CGI anime resists that fluidity, it triggers a cognitive dissonance that audiences frequently misread as incompetence. The reality is far more interesting. What studios like Orange, Sanzigen, and Graphinica are doing is *animating on twos and threes* — a technique borrowed directly from the hand-drawn tradition — and they are doing it with extraordinary deliberateness. Understanding why requires us to look past the surface friction and into the visual grammar that has defined anime for half a century.
The Aesthetic Philosophy of Animating on Twos and Threes
To grasp why CGI anime looks the way it does, we first need to understand what "animating on twos" actually means, because the terminology itself is a source of widespread confusion.
In traditional cinema, the standard projection speed is 24 frames per second. "Animating on ones" means every single one of those 24 frames contains a unique drawing. This is full animation, the ideal associated with classic Disney features like *Sleeping Beauty* and *One Hundred and One Dalmatians*, although in practice even Disney features commonly mixed ones and twos depending on the shot. "Animating on twos" means a new drawing appears every *second* frame, yielding 12 unique images per second, each held for two frames of projected time. "Animating on threes" pushes that further: one drawing every three frames, or exactly 8 unique images per second.
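The arithmetic behind these terms can be sketched in a few lines. This is only an illustration of the division described above; the function names `unique_images_per_second` and `drawing_index` are invented for this example and do not come from any animation tool.

```python
PROJECTION_FPS = 24  # standard cinema projection rate cited in the text


def unique_images_per_second(hold: int, fps: int = PROJECTION_FPS) -> float:
    """Distinct drawings shown per second when each drawing is held for `hold` frames."""
    if hold < 1:
        raise ValueError("hold must be at least 1 frame")
    return fps / hold


def drawing_index(frame: int, hold: int) -> int:
    """Which drawing (0-based) is on screen at a given projected frame."""
    return frame // hold


for name, hold in [("ones", 1), ("twos", 2), ("threes", 3)]:
    print(f"on {name}: {unique_images_per_second(hold):g} unique images per second")

# On twos, projected frames 0..5 show drawings 0, 0, 1, 1, 2, 2.
print([drawing_index(f, 2) for f in range(6)])
```

Note that the projected frame count never changes; only the number of distinct drawings does.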
Traditional hand-drawn anime has operated on twos and threes for decades, not out of laziness, but out of a confluence of production economics and aesthetic intent that eventually crystallized into something far more purposeful than mere cost-cutting. When Osamu Tezuka's Mushi Production pioneered television anime in the 1960s, budget constraints demanded fewer drawings per episode. But animators quickly discovered that limiting frames didn't just save money: it fundamentally changed the *rhythm* of motion. Fewer in-between frames meant each pose carried more graphic weight. Impact landed harder. Emotional beats read more clearly. The stillness between movements became expressive in its own right.
Anime's perceived "choppiness" is not a limitation to be overcome — it is a visual language built on the conviction that a single, perfectly held pose communicates more than a hundred transitional blurs.
By the time CGI became a regular part of the anime pipeline in the early 2000s, this frame-rate philosophy was deeply embedded in the medium's identity. The question for 3D studios was never "how do we make CGI look smooth?" It was, and remains, "how do we make CGI look like *anime*?"
Breaking Down the 12 FPS Standard in Modern CGI Production
When we watch a typical Western 3D animated film — a Pixar feature, a DreamWorks sequel — the characters move with interpolated fluidity. Every frame is calculated by software to fill in the motion between key poses, resulting in buttery, continuous movement rendered at the full 24 frames per second. The computational pipeline is designed to *eliminate* the gap between poses.
CGI anime production inverts this logic entirely. Studios like Orange, which has become the vanguard of the form through works like *Beastars* and *Land of the Lustrous*, employ a technique called stepped animation. Instead of allowing the 3D software to smoothly interpolate between keyframes, filling in all those transitional positions automatically, animators *lock* the models at specific poses and hold them for two or even three frames of film time. The software is deliberately prevented from doing what it does best.
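The difference between interpolated and stepped playback can be sketched with a single animated value. This is a minimal illustration under stated assumptions, not a description of any studio's actual pipeline: the keyframes in `KEYS` are made up, and `stepped_sample` mimics holds by evaluating the curve only on every second (or third) frame, which is just one simple way to get the effect.

```python
from bisect import bisect_right

# Hypothetical keyframes: (frame number, value), e.g. an arm rotation in degrees.
KEYS = [(0, 0.0), (4, 80.0), (8, 40.0)]


def linear_sample(keys, frame):
    """Smooth playback: blend between the surrounding keys on every frame."""
    frames = [f for f, _ in keys]
    i = bisect_right(frames, frame) - 1
    if i < 0:
        return keys[0][1]       # before the first key
    if i >= len(keys) - 1:
        return keys[-1][1]      # at or after the last key
    (f0, v0), (f1, v1) = keys[i], keys[i + 1]
    t = (frame - f0) / (f1 - f0)
    return v0 + t * (v1 - v0)


def stepped_sample(keys, frame, hold=2):
    """Stepped playback: sample only every `hold`-th frame and repeat that pose."""
    held_frame = (frame // hold) * hold
    return linear_sample(keys, held_frame)


smooth = [linear_sample(KEYS, f) for f in range(9)]
on_twos = [stepped_sample(KEYS, f, hold=2) for f in range(9)]
print(smooth)   # a new value on every frame
print(on_twos)  # each value repeated for two frames
```

Both lists contain nine projected frames; the stepped one simply changes value half as often, which is the held-pose cadence the paragraph above describes.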
The result is a 3D-rendered image that moves at the perceptual cadence of hand-drawn anime. Here is how the frame rates break down across different animation traditions:
| Animation Type | Frame Rate | Unique Images Per Second | Perceptual Feel |
|---|---|---|---|
| Full classical animation (on ones) | 24 FPS | 24 | Silky, fluid, expensive |
| Traditional anime (on twos) | 24 FPS | 12 | Stylized, punchy, graphic |
| Traditional anime (on threes) | 24 FPS | 8 | Dramatic, deliberate, stark |
| Western CGI (Pixar, Disney) | 24 FPS | 24 (interpolated) | Smooth, continuous, "cinematic" |
| CGI anime with stepped animation (on twos) | 24 FPS | 12 | Anime cadence in 3D space |
| Modern gaming / high-motion displays | 60+ FPS | 60+ | Hyperfluid, immersive |
The critical detail here is that CGI anime is still *projected* at the standard cinema rate of 24 frames per second. It is not rendered at a "lower" frame rate in the way a budget video game might struggle to maintain performance. Every frame is present in the timeline; each pose is simply repeated for two or three consecutive frames before the next one appears. The choppiness is a ghost, an artifact of intentional repetition rather than absent information.
Why Motion Interpolation Destroys the Animator's Intent
Here is where we encounter the most common misunderstanding audiences bring to CGI anime, and it is worth addressing directly because it affects how millions of viewers experience these works.
Many modern televisions ship with motion interpolation enabled by default. This feature, sold under names such as LG's "TruMotion" and Sony's "Motionflow" and derisively nicknamed the "soap opera effect," analyzes successive frames and generates synthetic in-between frames to smooth out motion. For live-action cinema, this is almost universally regarded as a degradation; it makes a $200 million Christopher Nolan film look like a daytime soap opera shot on video. For CGI anime, it is something far worse: it is an *annihilation of authorial intent*.
When a motion-interpolation algorithm encounters an anime pose held for two frames, it doesn't simply leave the timing alone. It tries to "fix" it: at each pose change it generates synthetic transitional frames by guessing at what movement *should* exist in the gap, while the held frames on either side stay static. The result is a smeared, uncanny approximation of motion that was never meant to exist. Smear frames, those beautifully distorted single drawings in hand-drawn anime where a character's fist or sword blurs across the image to convey violent speed, get averaged into meaningless visual noise. The precise timing of an impact, where an animator deliberately holds a pose for three frames before snapping to a recoil, gets dissolved into a gradual, momentumless slide.
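A toy model makes the problem concrete. Real televisions use motion-compensated estimation that is far more elaborate than this; the sketch below is a deliberately naive stand-in that just inserts the midpoint between neighbouring frames. The `punch` data and the function name `interpolate_double` are hypothetical, chosen only to mirror the three-frame hold and snap described above.

```python
def interpolate_double(frames):
    """Insert the midpoint between every pair of neighbouring frames.

    A naive stand-in for TV motion smoothing, not a real TV algorithm.
    """
    if not frames:
        return []
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append((a + b) / 2)  # synthetic in-between the animator never made
    out.append(frames[-1])
    return out


# Hypothetical fist position per projected frame:
# a three-frame hold at 0, then a snap straight to impact at 100.
punch = [0, 0, 0, 100, 100]

smoothed = interpolate_double(punch)
invented = [p for p in smoothed if p not in punch]

print(smoothed)  # the snap now passes through an intermediate position
print(invented)  # positions that exist only because of the interpolator
```

In the original timing the fist is either at rest or at impact. After smoothing, a halfway position appears that the animator deliberately left out, which is the softening of the jolt that the text objects to.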
The greatest enemy of anime's visual language in the streaming age is not low resolution or small screens — it is the well-meaning algorithm that believes it knows motion better than the artist who choreographed it.
We experience this most acutely in action sequences. Consider the work of Yoh Yoshinari at Studio Trigger, or the battle choreography in *Jujutsu Kaisen*'s second season by MAPPA: these sequences derive their visceral power from the *absence* of transitional frames. A punch doesn't flow from chamber to impact — it *jolts* there, and in that jolt lies every ounce of force the animator intended. Motion interpolation smooths the jolt, and in smoothing it, *removes the force*. It is the visual equivalent of auto-tuning a blues singer's deliberate pitch cracks: technically "correct" and emotionally gutted.
Technical Mimicry: How Studios Force 3D Models to Feel Hand-Drawn
The stepped animation technique is only the most visible tool in the CGI anime pipeline. Studios committed to bridging the 2D-3D gap employ a constellation of technical choices, each designed to suppress the innate smoothness of computer graphics and recreate the graphic qualities of cel animation.
The most sophisticated of these studios — Orange stands as the clearest example — approaches every frame as though it were a drawing to be *composed*, not a render to be calculated. This manifests in several specific production practices:
Pose-first keyframing. Rather than blocking out motion paths and letting the software interpolate, animators at Orange define each key pose as a distinct compositional unit — evaluated for silhouette clarity, line of action, and graphic impact the way a traditional key animator would evaluate a drawing on paper. The model is then *held* at that pose until the next key pose is reached.
Manual camera framing. In Western 3D production, cameras frequently sweep through continuous virtual space, exploiting the medium's freedom. In CGI anime, camera moves are often restricted to pans, tilts, and cuts that mimic the limited "camera" of traditional anime production — because those limitations are part of the visual grammar audiences unconsciously read as *anime*.
Cel-shading and line work. The toon-shading technique that gives CGI anime its distinctive flat, outlined look is not merely an aesthetic veneer. It serves a functional purpose: it flattens the dimensional illusion of 3D rendering, pulling the image back toward the graphic plane of a hand-drawn cel. Combined with stepped animation, the effect can be remarkably convincing — *Beastars* moves and reads like hand-drawn work in ways that would have seemed impossible a decade ago.
Selective frame-rate variation. Some productions don't apply a uniform stepped rate across every element. A character might be animated on twos while the background camera movement operates on ones, creating a layered temporal texture that draws the eye toward the performance rather than the environment — a technique directly inherited from traditional anime's practice of animating characters on fewer frames while panning painted backgrounds at full rate.
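The layered timing described in the last item can be sketched as a per-layer update schedule. This is an illustration only: the `Layer` class, the layer names, and the hold values are assumptions made for the example, not details of any studio's tooling.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    hold: int  # projected frames each pose or position is held for

    def updates_on(self, frame: int) -> bool:
        """True if this layer shows a new pose on the given projected frame."""
        return frame % self.hold == 0


def update_schedule(layers, num_frames):
    """Map each layer name to the projected frames on which it changes."""
    return {
        layer.name: [f for f in range(num_frames) if layer.updates_on(f)]
        for layer in layers
    }


# Hypothetical shot: the camera pan moves on ones, the character on twos.
layers = [Layer("camera pan", 1), Layer("character", 2)]
schedule = update_schedule(layers, 6)

for name, frames in schedule.items():
    print(f"{name}: new image on frames {frames}")
```

Over six projected frames the camera moves six times while the character changes pose only three times, so the two layers share one timeline but run at different cadences.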
This body of technical mimicry represents something we rarely see in animation history: a new technology being *disciplined* to serve an older aesthetic rather than replace it. The 3D pipeline at Orange doesn't exist to transcend hand-drawn anime. It exists to *sustain* it, at a scale and consistency that hand-drawn production increasingly cannot achieve within modern television budgets and schedules.
The Visual Conflict Between Cinema Standards and Anime Timing
The tension at the heart of this conversation is ultimately not a technical one — it is a cultural one. Different animation traditions have arrived at fundamentally different conclusions about what motion should *look like*, and those conclusions reflect deep assumptions about storytelling, attention, and the relationship between image and viewer.
Western feature animation, rooted in the Disney tradition and refined by decades of Pixar innovation, treats motion as continuity. The goal is immersion: draw the viewer so seamlessly into the flow of movement that the mechanics of animation disappear. Every frame serves the illusion of a living, breathing world operating on its own internal physics.
Anime, by contrast, treats motion as *emphasis*. The frame rate is a rhetorical tool — high-density sequences for climactic action, low-density holds for emotional weight, sudden stillness for shock or beauty. The animator's hand, visible in the timing choices and the held poses, is not a flaw to be concealed but a signature to be recognized. We see this in the legendary work of Yoshifumi Kondō at Studio Ghibli, in Masaaki Yuasa's deliberately uneven rhythms at Science SARU, and now in Orange's polygonal figures that pause and jolt with the cadence of drawings on paper.
These two philosophies are not reconcilable through better software or higher frame rates. Suggesting that CGI anime would "improve" at 60 frames per second misunderstands the question as thoroughly as suggesting a haiku would improve by becoming a sonnet. The constraint is the art.
What makes this moment in animation history genuinely fascinating is that the proliferation of CGI anime — through studios like Orange, through hybrid productions that blend 2D and 3D elements, through the growing acceptance of CG-heavy shows across the industry — is forcing a global audience to confront these differences in real time. Viewers who grew up on *Toy Story* and *Frozen* encounter *Beastars* or *Trigun Stampede* and feel an instinctive friction. That friction is not a failure of technology or taste. It is the productive discomfort of encountering a visual language that operates on different assumptions than the ones you were raised on.
The legacy of what studios like Orange are building extends well beyond any single series. They are demonstrating, with each production, that the tools of 3D animation need not dictate an aesthetic — that an artist's hand, even when mediated through software and polygonal rigs, can still choose *when to hold and when to release*, when to show and when to withhold. In an industry increasingly obsessed with resolution, frame rate, and computational fidelity, that insistence on deliberate imperfection may be the most radical creative stance in contemporary animation. The choppy frame is not a stumble. It is a signature — and it is here to stay.




