A Culturally Aware Multimodal AI Model

Vansh Kumar

Abstract
This paper introduces Vision, a novel 175-billion-parameter multimodal AI model. Unlike existing models, Vision is trained from scratch to natively understand text, images, video, and audio and to generate both text and images. Developed with a focus on incorporating Indian context, values, and culture, Vision aims to provide users with a culturally relevant AI experience. A unique security feature allows generated images to be traced back to Vision, mitigating concerns about potential misuse for misinformation. Evaluations on standard benchmarks demonstrate that Vision achieves state-of-the-art performance across a diverse range of tasks, including reasoning, mathematical problem solving, code generation, and image understanding. Furthermore, Vision exhibits remarkable proficiency in multilingual chat, supporting a wide array of global languages as well as regional Indian languages such as Hindi, Punjabi, and Marathi. We believe Vision represents a significant step toward building more inclusive and culturally relevant AI systems, with the potential to positively impact various domains in India and beyond.