Course Details
01:198:334 - Introduction to Imaging and Multimedia
- Course Number: 01:198:334
- Instructor: Ahmed Elgammal
- Course Type: Undergraduate
- Semester(s) Offered: Fall, Spring
- Credits: 4
- Description:
This is a basic undergraduate-level class that covers the fundamentals of multimedia computing and multimodal AI, including an introduction to image processing, computer vision, and the basics of audio and video processing. Students learn the basics of image, video, and audio formation, representation, and processing, as well as the basics of multimedia compression. The course also introduces artificial intelligence concepts for working with multimedia through multimodal large language models. Students gain hands-on experience with image and video data through programming assignments in Java and Python.
- Syllabus: https://rutgersconnect-my.sharepoint.com/:w:/g/personal/elgammal_cs_rutgers_edu/ETRv8OG6AKVDs7r--GdcT5sBOtSh5VWcjNtZVsT4V_cEtw?e=nWAcBt
- Instructor Profile: Elgammal, Ahmed
- Prerequisite Information:
01:198:112 or 14:332:351; 01:198:206 or 14:332:226 or 01:640:477; 01:640:250.
- A grade below a "C" in a prerequisite course will not satisfy that prerequisite requirement.
- Course Links: 01:198:112 - Data Structures, 01:198:206 - Introduction to Discrete Structures II
- Topics:
Introduction to Multimedia and Multimodal Computing
Multimedia Digitization with digital cameras as an example. Standard image formats. Colors in images and videos.
Image Computing: point operations, filters, and convolution.
Convolutional Neural Networks: recognition, detection, and segmentation.
Multimedia in the age of AI: embeddings of text, images, and other media and their applications (text-to-image, text-to-speech, …), Transformer basics, Multimodal Large Language Models.
Fourier Transform: Understanding frequency components of signals, focusing on imaging.
Multimedia compression basics. Lossless compression: variable-length coding and dictionary-based coding. Lossy compression basics: Fourier Transform and Discrete Cosine Transform. Applications to image compression (JPEG), video compression (MPEG), and audio compression (MP3).
- Expected Work: Homework/programming assignments and small projects
- Exams: Quizzes, midterm, and final
- Learning Goals:
The aim of CS334 is to introduce fundamental techniques and concepts used in computational imaging and multimedia. Upon completion of this course, a successful student should be able to design and implement programs that deal with image, video, and audio data.