Photographic Image Synthesis with Cascaded Refinement Networks

Qifeng Chen, Vladlen Koltun
IEEE International Conference on Computer Vision (ICCV), 2017
Oral Presentation



Abstract

We present an approach to synthesizing photographic images conditioned on semantic layouts. Given a semantic label map, our approach produces an image with photographic appearance that conforms to the input layout. The approach thus functions as a rendering engine that takes a two-dimensional semantic specification of the scene and produces a corresponding photographic image. Unlike recent and contemporaneous work, our approach does not rely on adversarial training. We show that photographic images can be synthesized from semantic layouts by a single feedforward network with appropriate structure, trained end-to-end with a direct regression objective. The presented approach scales seamlessly to high resolutions; we demonstrate this by synthesizing photographic images at 2-megapixel resolution, the full resolution of our training data. Extensive perceptual experiments on datasets of outdoor and indoor scenes demonstrate that images synthesized by the presented approach are considerably more realistic than those produced by alternative approaches.

Materials

Paper

Source code on GitHub

Video download

Supplement

Media

AI artist conjures up convincing fake worlds from memories
AI creates fictional scenes out of real-life photos
AI paints dreamy cityscape that blurs the line between code and art
Scientists help AI dream new worlds, like taking a picture of a subconscious landscape
This AI Draws Cities From Scratch
THIS AI GENERATES FAKE STREET VIEW IMAGES IN IMPRESSIVE HIGH DEFINITION
The Intel Labs painter produces ultra-realistic pictures
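Images this realistic and high-definition were generated by an end-to-end network? GANs concede defeat | ICCV 2017
These dreamlike street scenes are all faked by AI | The paper+code that crushes GANs
Generating synthetic images from known semantic segmentation: Photographic Image Synthesis with Cascaded Refinement Networks
Sidelining PC processors, Intel works on new "black tech": good news for gamers
These dreamlike street scenes are all faked by AI (crushing GANs) | Paper+Code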