Hugo: Convergence of 3-D and Visual Effects
August 6, 2012, SIGGRAPH, Los Angeles—The production team from Pixomondo described their work on Martin Scorsese's "Hugo". The size and scale of the production dwarfed previous efforts, and shooting in 3-D exacerbated the problems.
The sets for this movie were very large; some were four times larger than the largest studios. As a result, the art department needed real-time, on-set visualization capabilities to start designs and make blueprints. The mechanical crews provided feedback to morph the images through the same pre-visualization tools, which also helped to set up the green-screen shots and CGI and to coordinate the sets for tracking.
One difficult shot was the opening sequence, where the camera moves down the station platform. They found that CGI looked too obviously CGI when the camera got too close to a person. As a result, they made full turntable images of the people and created digital doubles. The actors had to perform some simple actions, like walking in place and picking up a briefcase.
The digital doubles were then dressed and given props. The use of digital doubles enabled the same extras to appear in other scenes with different clothes and accessories. This technique is easier than creating full CG people and switching to the real person in close-ups. They shot the extras in full stereo on a green screen using four cameras. The files included metadata on alignment, sync, geometry, and the other usual shot data.
The effects were processed in 3D Studio Max. They also had to simulate smoke, steam, and other atmospherics, especially for the steam cloud at the end of the opening sequence, where it served as the transition element from a long to a short perspective, and also from CGI to live images. The images of the people had to be built as cards, but they still needed the geometry to make the images work.
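The "card" trick is essentially a projection problem: a flat image of a person only reads correctly if it is scaled for the depth at which it sits in the scene. As a rough sketch of the geometry involved (a simple pinhole-camera model; the numbers are illustrative and do not come from the production):

```python
# Minimal sketch (not Pixomondo's pipeline): sizing a 2-D "person card"
# so it reads correctly when placed at depth Z in a CG scene.
# Assumes a simple pinhole camera; all numeric values are illustrative.

def card_world_width(pixel_width, image_width_px, sensor_width_mm, focal_mm, depth_m):
    """World-space width a card must have at `depth_m` so that it
    projects to `pixel_width` pixels, under a pinhole camera model."""
    # Width of the subject on the sensor, in millimeters.
    width_on_sensor_mm = pixel_width / image_width_px * sensor_width_mm
    # Similar triangles: world_width / depth = sensor_width / focal_length.
    return width_on_sensor_mm / focal_mm * depth_m

# An extra who covers 300 px of a 2048-px plate, shot on a 24 mm-wide
# sensor with a 35 mm lens, placed 8 m from the render camera:
print(card_world_width(300, 2048, 24.0, 35.0, 8.0))  # ~0.80 m wide card
```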
One of the biggest problems was adjusting the stereo. The people cards carried all of the setting information, but the team still needed to plan and manage all of the stereo parameters. Aligning the people information with the choreography, and doing it in stereo, is not easy when the scene is changing. They started with all rigs set to an inter-ocular spacing of 2.2 inches.
All of the independent shots were captured at the same setting to allow for incorporation into other scenes with minimal work. The problem is that the two plates often needed work to align and merge. They then converged the information in the render camera by positioning the plates in stereo space.
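The relationship being managed here is the standard parallel-rig one: screen parallax grows with inter-ocular spacing and shrinks with subject distance, and re-convergence amounts to sliding one plate horizontally. A minimal sketch, using the 2.2-inch spacing quoted above and an assumed focal length:

```python
import numpy as np

# Minimal sketch of re-converging parallel-rig plates by horizontal image
# translation (HIT). Focal length and distances are assumptions for
# illustration, not values from the production.

def disparity_px(interaxial_m, focal_px, depth_m):
    """Screen disparity (px) of a point at depth_m, parallel stereo rig."""
    return focal_px * interaxial_m / depth_m

def reconverge(right_plate, interaxial_m, focal_px, convergence_m):
    """Shift the right-eye plate so points at convergence_m sit at zero
    parallax. Real pipelines crop or fill the revealed edge instead of
    wrapping, and the shift direction depends on the rig convention."""
    shift = int(round(disparity_px(interaxial_m, focal_px, convergence_m)))
    return np.roll(right_plate, shift, axis=1)

t = 2.2 * 0.0254   # the 2.2-inch inter-ocular spacing, in meters
f = 2000.0         # assumed focal length in pixels
print(disparity_px(t, f, 5.0))  # ~22 px of parallax for a subject 5 m away
```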
They captured high dynamic range (HDR) images of all the extras to get full 360-degree coverage for lighting references and color spaces. To the extent possible, all parameters were held constant for the turntable images. The HDR images were bracketed in three exposures and stitched into the various scenes. They used multiple stitches for image triangulation and used the resulting information to recreate the digital sets.
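Merging a three-exposure bracket into a single radiance image is a standard operation. As a hedged illustration of the step, using OpenCV's Debevec method with made-up file names and exposure times, not the studio's actual tools:

```python
import cv2
import numpy as np

# Minimal sketch of merging a three-stop exposure bracket into an HDR
# image with OpenCV's Debevec method; file names and exposure times
# below are illustrative placeholders.

files = ["extra_under.jpg", "extra_mid.jpg", "extra_over.jpg"]
exposure_times = np.array([1/250, 1/60, 1/15], dtype=np.float32)
images = [cv2.imread(f) for f in files]

# Recover the camera response curve, then merge to a radiance map.
calibrate = cv2.createCalibrateDebevec()
response = calibrate.process(images, exposure_times)
merge = cv2.createMergeDebevec()
hdr = merge.process(images, exposure_times, response)

cv2.imwrite("extra_lighting_ref.hdr", hdr)  # Radiance .hdr lighting reference
```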
They checked the resolution accuracy and color matching with standard color and alignment charts. Lighting changes called for a shot-by-shot adjustment of the images. They used the lighting in the CGI as the base for live shots and tried to maintain the director's vision of the lighting as he would have lit a live shot.
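Chart-based matching of this kind usually reduces to fitting a small correction transform between the captured patch values and the chart's published values. A minimal sketch with illustrative patch data, not actual chart measurements:

```python
import numpy as np

# Minimal sketch of chart-based color matching: fit a 3x3 matrix that
# maps RGB values measured off the shot chart to the chart's reference
# values. The patch values below are illustrative, not real chart data.

measured = np.array([[0.18, 0.16, 0.15],   # gray patch as captured
                     [0.45, 0.12, 0.10],   # red patch
                     [0.10, 0.40, 0.12],   # green patch
                     [0.09, 0.11, 0.42]])  # blue patch
reference = np.array([[0.18, 0.18, 0.18],
                      [0.50, 0.10, 0.08],
                      [0.08, 0.45, 0.10],
                      [0.07, 0.09, 0.48]])

# Least-squares solve for M in measured @ M ~= reference.
M, *_ = np.linalg.lstsq(measured, reference, rcond=None)

def correct(rgb):
    """Apply the fitted correction to an (..., 3) array of linear RGB."""
    return np.clip(rgb @ M, 0.0, 1.0)

print(correct(measured))  # should land close to the reference patches
```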
The lighting for all the sets was photographed with a high-resolution camera and a stereo still rig. The data formed a reference collection that allowed the shooting units to learn about image issues on the fly. The Leica photo station recorded the camera position, lights, camera movements, and other metadata.
Due to the volume of data, they could not record the data manually, so all of it was machine generated. A Maya scene file was created with LIDAR (light detection and ranging) data and encoded into the set data. They also appended the camera start and end locations so they could recreate a scene for retakes. As a result, the artists had access to all the stereo camera data, including geometry, LIDAR, etc.
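Purely as a hypothetical illustration of what such machine-generated set data might look like in sidecar form (the schema, field names, and file names below are invented, not Pixomondo's format):

```python
import json

# Hypothetical per-shot metadata sidecar of the kind described above.
# Every field name and value here is an invented example.

shot_metadata = {
    "shot": "plat_open_0120",
    "camera": {
        "interocular_in": 2.2,
        "start_pos": [12.4, 1.6, -3.0],   # meters, set coordinates
        "end_pos":   [2.1, 1.6, -3.0],    # appended for retakes
    },
    "lidar_scan": "stage_a_scan_004.ply",  # geometry fed into the Maya scene
    "maya_scene": "plat_open_0120_layout.ma",
}

with open("plat_open_0120.json", "w") as fh:
    json.dump(shot_metadata, fh, indent=2)
```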
Mathematics is important, but it tells you what you don't want to see. Math does help to minimize the work in post. Even with the best tools, a better way to match move is needed. It takes them about three days to match move, so the costs, about $1M per day, are the limiting factor in doing more of this type of work.
The basic workflow had to address the tradeoffs between production quality and tight schedules. They needed fast imaging to confirm shots in near real time. Their infrastructure included high-speed file transfer, hot sync from the hubs, and some on-the-fly stereo and color corrections. The stereo corrections were to eliminate keystoning and rig errors.
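Keystoning from a misaligned rig is a perspective distortion, so an on-the-fly fix can be modeled as a homography warp fitted to chart corners. A minimal sketch along those lines, with illustrative corner points rather than real calibration data:

```python
import cv2
import numpy as np

# Minimal sketch of a keystone correction: warp one eye's plate with a
# homography fitted to four corner correspondences (e.g., from an
# alignment chart). All points and file names are illustrative.

h, w = 1080, 1920
# Where the chart corners actually landed in the keystoned plate...
src = np.float32([[12, 8], [1905, 22], [1898, 1070], [5, 1062]])
# ...and where they should sit after correction.
dst = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])

H = cv2.getPerspectiveTransform(src, dst)
plate = cv2.imread("right_eye_plate.png")
corrected = cv2.warpPerspective(plate, H, (w, h))
cv2.imwrite("right_eye_corrected.png", corrected)
```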
The editing flow processed two versions so that the stereo depth budgets and flow could be turned around in under 24 hours. Color, stereo, VFX, digital intermediate, and audio were all processed in parallel. Even with all the creation and correction capabilities, they still needed to create some scenes as live action. In a scene where the conductor was dragged along the platform, they found that they could not move the car, so they built a moving platform and moved the platform and cameras in sync to give the impression that the car was moving.
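A depth budget is typically expressed as maximum positive and negative parallax, often as a percentage of screen width. As a rough sketch of how such a check could be automated (the disparity method, thresholds, and file names here are assumptions, not the production's pipeline):

```python
import cv2
import numpy as np

# Rough sketch of a depth-budget check: estimate disparities between the
# two eyes and flag shots whose parallax exceeds the agreed budget.
# Thresholds and file names are illustrative assumptions.

left = cv2.imread("shot_left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("shot_right.png", cv2.IMREAD_GRAYSCALE)

sgbm = cv2.StereoSGBM_create(minDisparity=-64, numDisparities=128, blockSize=9)
disp = sgbm.compute(left, right).astype(np.float32) / 16.0  # fixed point -> px

valid = disp > (sgbm.getMinDisparity() - 1)  # mask out unmatched pixels
max_pos = disp[valid].max() / left.shape[1] * 100   # % of screen width
min_neg = disp[valid].min() / left.shape[1] * 100

BUDGET_POS, BUDGET_NEG = 2.0, -1.0  # % of screen width, illustrative
if max_pos > BUDGET_POS or min_neg < BUDGET_NEG:
    print("Depth budget exceeded; flag shot for stereo review.")
```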
Integration and handoff were a challenge. The base data was in RAW format, so they could see only pieces in the camera comps. They needed real-time comps and ways to view the 3-D on set. All of the finished 3-D components in the final movie had to be processed many times before they could be viewed.