AI Researcher & Data Scientist

Souvik Ghosh

Innovating at the intersection of AI and generative multimedia. Currently researching advanced lipreading and speech synthesis technologies at IIIT Hyderabad.

Education

MS by Research, CSE

IIIT Hyderabad | Grade: 8.5

2024-2026

  • Courses taken: Statistical Methods in AI, Digital Image Processing, Advanced NLP (LLMs), Computer Vision, Technology, Product and Entrepreneurship
  • Research focus on multimodal AI, speech technologies, LLMs, and representation learning
  • Working with the Audio-Visual team at CVIT Lab, guided by Professor CV Jawahar and Professor Vinay Namboodiri

BTech, Applied Electronics and Instrumentation

HITK, Kolkata | Grade: 8.12

2019-2023

  • Explored the beauty of interdisciplinary education.
  • Final-year thesis on IoT and edge ML for patients with epilepsy
  • Major projects in harassment detection and women's safety; runner-up at Nasscom Lab 2 Market

Experience

Technical Expertise

AI & ML

  • Multimodal Generative AI
  • LipSync & Speech Technologies
  • Computer Vision & LLMs

MLOps

  • AWS, Azure, Google Cloud
  • Docker
  • FastAPI & Flask

Frameworks

  • PyTorch
  • TensorFlow
  • Hugging Face

Publications

reSenseNet: Ensemble Early Fusion Deep Learning Architecture

IHCI 2021

Published in the International Conference on Intelligent Human Computer Interaction

Speech@SCIS: Annotated Indian Video Dataset

Smart Intelligent Computing and Applications

Published in Smart Intelligent Computing and Applications, Volume 1