Edit model card
YAML Metadata Warning: The pipeline tag "normals-estimation" is not in the official list: text-classification, token-classification, table-question-answering, question-answering, zero-shot-classification, translation, summarization, feature-extraction, text-generation, text2text-generation, fill-mask, sentence-similarity, text-to-speech, text-to-audio, automatic-speech-recognition, audio-to-audio, audio-classification, voice-activity-detection, depth-estimation, image-classification, object-detection, image-segmentation, text-to-image, image-to-text, image-to-image, image-to-video, unconditional-image-generation, video-classification, reinforcement-learning, robotics, tabular-classification, tabular-regression, tabular-to-text, table-to-text, multiple-choice, text-retrieval, time-series-forecasting, text-to-video, image-text-to-text, visual-question-answering, document-question-answering, zero-shot-image-classification, graph-ml, mask-generation, zero-shot-object-detection, text-to-3d, image-to-3d, image-feature-extraction, other

Marigold Normals (LCM) Model Card

This model belongs to the family of diffusion-based Marigold models for solving various computer vision tasks. The Marigold Normals model focuses on the surface normals task. It takes an input image and computes surface normals in each pixel. The LCM stands for Latent Consistency Models, which is a technique for making the diffusion model fast. The Marigold Normals model is trained from Stable Diffusion with synthetic data, and the LCM model is further fine-tuned from it. Thanks to the rich visual knowledge stored in Stable Diffusion, Marigold models possess deep scene understanding and excel at solving computer vision tasks. Read more about Marigold in our paper titled "Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation".

Website GitHub Paper Hugging Face Space

Developed by: Bingxin Ke, Anton Obukhov, Shengyu Huang, Nando Metzger, Rodrigo Caye Daudt, Konrad Schindler

teaser

πŸŽ“ Citation

@InProceedings{ke2023repurposing,
      title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
      author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      year={2024}
}

🎫 License

This work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).

By downloading and using the code and model you agree to the terms in the LICENSE.

License

Downloads last month
82
Unable to determine this model’s pipeline type. Check the docs .