Advanced Computer Vision with Deep Learning
This three-day workshop is designed to give participants the skills needed to build cutting-edge computer vision models with deep learning.
What You'll Learn
- ✓ Building networks such as ResNets and Inception networks
- ✓ How to build segmentation models with U-Nets
- ✓ How facial recognition models and their network architectures work
- ✓ How object detection networks work
- ✓ Image-to-image translation with Pix2Pix networks
- ✓ Advanced image augmentation strategies
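As a taste of the augmentation topic above: many augmentation strategies boil down to simple, randomized image transforms applied during training. Here is a minimal, illustrative sketch of a random horizontal flip on an image represented as nested lists of pixels (a real pipeline would use a library transform such as those in TorchVision, covered in the course):

```python
import random

def random_hflip(image, p=0.5, rng=None):
    """Randomly flip an image (a list of pixel rows) left-to-right with probability p."""
    rng = rng or random.Random()
    if rng.random() < p:
        # Reverse each row to mirror the image horizontally
        return [list(reversed(row)) for row in image]
    # Otherwise return an unmodified copy
    return [list(row) for row in image]

img = [[1, 2, 3],
       [4, 5, 6]]
flipped = random_hflip(img, p=1.0)  # p=1.0 forces the flip for demonstration
print(flipped)  # [[3, 2, 1], [6, 5, 4]]
```

The same pattern (apply a transform with some probability) underlies crops, rotations, and color jitter in production augmentation pipelines.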
In the second course in the series, we venture beyond the basic skills to delve into advanced usage of Convolutional Neural Networks and modern image network architectures. This journey takes us into the future of image segmentation with Meta's Segment Anything Model (SAM), a groundbreaking project that democratizes segmentation, allowing for adaptation to specific tasks without additional training. This makes SAM versatile for a broad set of use cases, from underwater photography to cell microscopy.
To understand the current state of the art, we will review the history of ImageNet-winning models, the impact of Inception, residual architectures, NASNet, and EfficientNet, and how they enabled the field to move beyond hand-engineered models. Alongside these foundational concepts, we will explore the fascinating world of Stable Diffusion, where randomness meets structure and chaos turns into order through neural networks like VAEs and GANs.
Our exploration continues with a deep dive into advanced skills and techniques such as object detection with models like YOLO, person tracking with Deep Sort, and pixel-level image segmentation with U-Nets and DenseNets. We will also examine top-performing models built on Transformers, and the principles of Generative Adversarial Networks (GANs), including how StyleGAN architectures have evolved over time.
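A small building block behind all object detection models mentioned above is intersection-over-union (IoU), which scores how well a predicted bounding box matches a ground-truth box. A minimal, self-contained sketch (boxes here are assumed to be `(x1, y1, x2, y2)` corner coordinates):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Coordinates of the intersection rectangle
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    # Clamp width/height to zero when the boxes do not overlap
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # partial overlap -> 25/175 ≈ 0.143
```

Detectors like YOLO use IoU both to match predictions to ground truth during training and to suppress duplicate boxes at inference time.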
One of the exciting new areas we'll touch on is Diffusion Models, guiding random noise into meaningful structures for tasks like image denoising and inpainting. We'll also introduce BLIP-2, a vision-language pre-training paradigm that bridges the gap between vision and language models, enabling state-of-the-art results on tasks like visual Q&A.
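The "random noise into meaningful structures" idea runs in two directions: the forward diffusion process gradually corrupts an image with Gaussian noise, and the learned model reverses it. The forward step has a simple closed form, sketched below with NumPy on a toy image (the linear beta schedule is an illustrative assumption, not the course's exact setup):

```python
import numpy as np

def add_noise(x0, t, betas, rng):
    """Forward diffusion: sample x_t ~ q(x_t | x_0) in closed form.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]  # cumulative product up to step t
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # common linear schedule (assumption)
x0 = np.ones((4, 4))                   # toy "image"
xt = add_noise(x0, t=999, betas=betas, rng=rng)
# At large t, x_t is almost pure noise; at t=0 it stays close to x_0.
```

Training a denoising network to invert this process, step by step, is what lets diffusion models generate or inpaint images from noise.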
Participants will have the opportunity to build with tools like TensorFlow, Keras, PyTorch, and TorchVision, which are often used for cutting-edge computer vision research. Hands-on experience will be an essential part of the course, allowing participants to build models and apply new skills to projects in their field.
The course runs 3 days, encompassing 28 hours including online sessions. It offers an introduction to PyTorch and TorchVision, advanced classification, object detection, and the skills to create applications like image search and similarity comparisons.
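Image search and similarity comparison typically reduce to comparing feature vectors (embeddings) extracted by a network. A minimal sketch with toy vectors (in practice the embeddings would come from a model such as a TorchVision ResNet; the tiny 3-dimensional vectors here are purely illustrative):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query, database):
    """Rank database entries (name, vector) by similarity to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy embeddings; a real system would use high-dimensional CNN features.
db = [("cat.jpg", [0.9, 0.1, 0.0]),
      ("dog.jpg", [0.1, 0.9, 0.0]),
      ("car.jpg", [0.0, 0.1, 0.9])]
print(search([1.0, 0.0, 0.0], db)[0][0])  # most similar: cat.jpg
```

Swapping the toy vectors for real model embeddings turns this ranking loop into a basic image search engine.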
Join us in this comprehensive exploration of computer vision's newest frontiers, arming yourself with skills to apply in your own area of work. This workshop brings together the past, present, and future of computer vision, offering a thrilling opportunity to learn and grow in this dynamic field.
In this course, participants will learn:
- Advanced usage of Convolutional Neural Networks and modern image network architectures
- Understanding and working with the Segment Anything Model (SAM) for versatile image segmentation
- Exploration of Stable Diffusion, where randomness is transformed into meaningful structures
- Techniques for object detection, person tracking, and pixel-level image segmentation with models like YOLO, Deep Sort, U-Nets, and DenseNets
- Examination of Transformers in computer vision and principles of Generative Adversarial Networks (GANs), including StyleGAN architectures
- Introduction to cutting-edge concepts like Diffusion Models and BLIP-2
- Building experience with tools like TensorFlow, Keras, PyTorch, and TorchVision
- Hands-on practice creating applications for advanced classification, object detection, image search, and similarity comparisons
- Building various types of deep learning computer vision models
- Image segmentation and pixel-level classification with architectures like U-Nets and DenseNets, and how they are used across a variety of segmentation tasks
- Tracking people and objects through video
- Person re-identification and object tracking with Deep Sort
- ViT (Vision Transformer)
- StyleGANs and visual GANs for image augmentation
Technologies we teach include TensorFlow, Keras, PyTorch, and TorchVision.
Earn a certificate upon completion
Time to Complete
Approx. 28 hours to complete
Why Study Artificial Intelligence?
1. Demand for AI/DL jobs is at an all-time high.
2. Developers increasingly need these skills.
3. AI/DL jobs pay more than standard developer jobs.
Request More Info
Sign up to receive additional information about this course. Find out what other learners are doing with the skills they gained, and evaluate whether this course is the right fit for you.
Frequently Asked Questions
Do I have to log in at a set time? How does the 360-degree assessment work? At this point, you probably have a few questions, and we’ve got answers.