MACHINE LEARNING, VISION & LANGUAGE LAB

Machine Learning, Vision and Language Processing Lab (기계 학습 비전 및 언어처리 랩)

Our lab aims to understand and implement human intelligence across the most common communication media: vision, natural language, and speech. Because these modalities are connected and correlated with one another, we develop effective and efficient machine learning models that work across multiple modalities.

In the Machine Learning, Vision & Language Lab, we are interested in machine learning and its applications to computer vision and language processing. Specifically, we work on multimodal learning, generative models, and deep learning. Our research topics include (but are not limited to) embodied AI, text-to-image generation, multi-modal conversational models, video understanding and question answering, and explainable AI.

Major Research Field

Machine learning and its applications to computer vision and language processing

Desired Field of Research

Multimodal Learning, Generative Models, Machine Learning, and Deep Learning

Research Keywords and Topics

Text-to-image/video generation
Multi-modal conversational models
Embodied AI
Video understanding and question answering

Research Publications

Taegyeong Lee, Soyeong Kwon and Taehwan Kim, Grid Diffusion Models for Text-to-Video Generation, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024
Taegyeong Lee, Jeonghun Kang, Hyeonyu Kim and Taehwan Kim, Generating Realistic Images from In-the-wild Sounds, IEEE/CVF International Conference on Computer Vision (ICCV), October 2023
Taehwan Kim, Yisong Yue, Sarah Taylor and Iain Matthews, A Decision Tree Framework for Spatiotemporal Sequence Prediction, ACM Conference on Knowledge Discovery and Data Mining (KDD), August 2015
Taehwan Kim, Greg Shakhnarovich and Karen Livescu, Fingerspelling Recognition with semi-Markov Conditional Random Fields, IEEE International Conference on Computer Vision (ICCV), December 2013
Taehwan Kim, Greg Shakhnarovich, and Raquel Urtasun, Sparse Coding for Learning Interpretable Spatio-Temporal Primitives, Neural Information Processing Systems (NeurIPS), December 2010