Multimodal RAG with Granite Vision 3.2

This experimental demo showcases the capabilities of granite-vision-3.2-2b within a multimodal retrieval-augmented generation (RAG) pipeline, demonstrating Granite's document understanding in real-world applications. Explore the sample document excerpts, then try the sample prompts or enter your own. Keep in mind that AI can occasionally make mistakes.

{
  "path": "/tmp/gradio/d70ba0badd754690e2dea91db204158ea4fe8a5032edf29367d15457f96bcd1b/IBM_Annual_Report_2007_3-20.pdf",
  "url": "http://0.0.0.0:7860/gradio_api/file=/tmp/gradio/d70ba0badd754690e2dea91db204158ea4fe8a5032edf29367d15457f96bcd1b/IBM_Annual_Report_2007_3-20.pdf",
  "size": null,
  "orig_name": null,
  "mime_type": null,
  "is_stream": false,
  "meta": {
    "_type": "gradio.FileData"
  }
}
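The payload above is the file record the demo holds for its sample document. As a minimal sketch of how such a record could be read, the Python below parses it and recovers the document's file name. The helper name `describe_file` and the fallback from `orig_name` to the last path component are assumptions made for illustration, not part of the demo's code.

```python
import json
from pathlib import PurePosixPath

# Payload copied from the record above (values unchanged).
PAYLOAD = """
{
  "path": "/tmp/gradio/d70ba0badd754690e2dea91db204158ea4fe8a5032edf29367d15457f96bcd1b/IBM_Annual_Report_2007_3-20.pdf",
  "url": "http://0.0.0.0:7860/gradio_api/file=/tmp/gradio/d70ba0badd754690e2dea91db204158ea4fe8a5032edf29367d15457f96bcd1b/IBM_Annual_Report_2007_3-20.pdf",
  "size": null,
  "orig_name": null,
  "mime_type": null,
  "is_stream": false,
  "meta": {"_type": "gradio.FileData"}
}
"""


def describe_file(payload: str) -> dict:
    """Hypothetical helper: summarize a file record like the one above."""
    data = json.loads(payload)
    path = PurePosixPath(data["path"])
    return {
        # Assumption: when orig_name is null, use the last path component.
        "filename": data.get("orig_name") or path.name,
        "suffix": path.suffix.lower(),
        "is_gradio_file": data.get("meta", {}).get("_type") == "gradio.FileData",
        "is_stream": bool(data.get("is_stream")),
    }


info = describe_file(PAYLOAD)
print(info["filename"])
```

Because `size`, `orig_name`, and `mime_type` are null in this record, the sketch relies only on `path` and `meta`, the fields that are actually populated.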