Tuesday, October 12, 2021

Computer vision phd thesis

Ph.D. Theses: The doctoral dissertation represents the culmination of the entire graduate school experience. It is a snapshot of all that a student has accomplished and learned about their dissertation topics. In computer vision classification problems, it is often possible to generate an informative feature vector representation of an …

The PhD thesis exposes students to cutting-edge and unsolved research problems in the field of Computer Vision, where they are required to propose new solutions and significantly contribute towards the body of knowledge. Students pursue an independent research study, under the guidance of a supervisory panel, for a period of 3-4 years.



PhD in Computer Vision | MBZUAI



A list of completed theses and new thesis topics from the Computer Vision Group. Are you about to start a BSc or MSc thesis? Please read our instructions for preparing and delivering your work. Below we list possible thesis topics for Bachelor and Master students in the areas of Computer Vision, Machine Learning, Deep Learning and Pattern Recognition. The project descriptions leave plenty of room for your own ideas.


If you would like to discuss a topic in detail, please contact the supervisor listed below and Prof. Paolo Favaro to schedule a meeting.


Note that for MSc students in Computer Science it is required that the official advisor is a professor in CS.

Using sensors to detect sleep and waking behavior has as yet unexplored potential to reveal insights into health. In this study, we make use of a watch-like device, called an actigraph, which tracks motion to quantify sleep behavior and waking activity. Participants in the study consist of healthy and depressed adolescents who wear actigraphs for a year, during which time we query their mental health status monthly using online questionnaires.


For this Master's thesis we aim to make use of machine learning methods to predict mental health based on the data from the actigraph. The ability to predict mental health crises based on sleep and wake behavior would provide an opportunity for intervention, significantly impacting the lives of patients and their families.
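As a rough illustration of what such a pipeline could look like, the sketch below trains a classifier on per-participant-month summary features and evaluates it with cross-validation. The feature layout, the synthetic data and the choice of model are assumptions made for illustration only, not the study's actual data or method.

# Illustrative sketch only: classify monthly mental-health status from
# aggregated actigraphy features (e.g. mean sleep duration, sleep-onset
# variability, daytime activity). The data below is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))                      # one row per participant-month
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print("cross-validated AUC: %.2f" % scores.mean())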


This Master's thesis is a collaboration between Professor Paolo Favaro at the Institute of Computer Science and Dr. Leila Tarokh at the Universitäre Psychiatrische Dienste (UPD).


We are looking for a highly motivated individual interested in bridging disciplines.

The Gerontechnology and Rehabilitation group at the ARTORG Center for Biomedical Engineering is offering multiple BSc and MSc thesis projects to students who are interested in working with real patient data, artificial intelligence and machine learning algorithms. Contact: Dr. Stephan Gerber, Michael Single.


Visual Transformers have obtained state-of-the-art classification accuracies [ViT, DeiT, T2T, BoTNet]. A mixture of experts could be used to increase the capacity of a neural network by learning instance-dependent execution pathways in the network [MoE].


In this research project we aim to push transformers to their limit and combine their dynamic attention with MoEs. Compared to the Switch Transformer [Switch], we will use a much more efficient formulation of mixing [CondConv, DynamicConv], and we will apply this idea in the attention part of the transformer rather than in the fully connected layer.
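To make the idea concrete, here is a minimal PyTorch sketch, purely illustrative and not the thesis implementation, of CondConv-style mixing applied to the attention projections: a small router produces per-example mixing coefficients, the expert weight banks are blended into a single qkv projection, and standard multi-head attention follows.

# Sketch: mixture-of-experts inside the attention block, CondConv-style.
# Experts are mixed into one projection weight per example, so only a single
# matrix multiply is needed instead of running every expert separately.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoEAttention(nn.Module):
    def __init__(self, dim, num_heads=8, num_experts=4):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.experts = nn.Parameter(torch.randn(num_experts, dim, 3 * dim) * 0.02)
        self.router = nn.Linear(dim, num_experts)       # mixing coefficients
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                               # x: [B, N, dim]
        B, N, D = x.shape
        gates = F.softmax(self.router(x.mean(dim=1)), dim=-1)     # [B, E]
        w_qkv = torch.einsum("be,eio->bio", gates, self.experts)  # [B, D, 3D]
        q, k, v = torch.einsum("bnd,bdo->bno", x, w_qkv).chunk(3, dim=-1)
        q, k, v = (t.view(B, N, self.num_heads, -1).transpose(1, 2) for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, D)
        return self.proj(out)

print(MoEAttention(64)(torch.randn(2, 16, 64)).shape)   # torch.Size([2, 16, 64])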


Publication Opportunity: Dynamic Neural Networks Meets Computer Vision, a CVPR workshop. Contact: Sepehr Sameni.

Visual Transformers have obtained state-of-the-art classification accuracies for 2D images [ViT, DeiT, T2T, BoTNet]. In this project, we aim to extend the same ideas to 3D data (videos), which requires a more efficient attention mechanism [Performer, Axial, Linformer].


In order to accelerate the training process, we could use the multigrid technique [Multigrid].
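The core of the multigrid idea is to cycle through coarser space-time resolutions with proportionally larger batches, so that the cost per iteration stays roughly constant while early training sees many more clips. The schedule below is a toy sketch with made-up values, not the published recipe.

# Toy multigrid-style schedule: batch * frames * H * W is kept constant
# across phases, so coarse phases process more clips per iteration.
BASE = dict(batch=8, frames=32, size=224)

def multigrid_schedule(num_epochs):
    phases = [(16, 4, 2), (4, 4, 1), (2, 2, 1), (1, 1, 1)]  # (batch x, frames /, size /)
    per_phase = num_epochs // len(phases)
    schedule = []
    for bm, fd, sd in phases:
        schedule += [dict(batch=BASE["batch"] * bm,
                          frames=BASE["frames"] // fd,
                          size=BASE["size"] // sd)] * per_phase
    return schedule

for epoch, cfg in enumerate(multigrid_schedule(8)):
    # A real training loop would rebuild its dataloader with these settings.
    print(epoch, cfg)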


Publication Opportunity: LOVEU (a CVPR workshop), Holistic Video Understanding (a CVPR workshop), ActivityNet (a CVPR workshop).

GIRAFFE is a newly introduced GAN that can generate scenes via composition with minimal supervision [GIRAFFE]. Generative methods can implicitly learn interpretable representations, as can be seen in GAN image interpretations [GANSpace, GanLatentDiscovery].


Decoding GIRAFFE could give us per-object interpretable representations that could be used for scene manipulation, data augmentation, scene understanding, semantic segmentation, pose estimation [iNeRF], and more. In order to invert a GIRAFFE model, we will first train the generative model on the Clevr and CompCars datasets, then add a decoder to the pipeline and train this autoencoder.
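The inversion step might look roughly like the sketch below: freeze a pretrained generator and train an encoder so that generator(encoder(image)) reconstructs the input. The generator here is a small stand-in module, and the single flat latent code is a simplification (GIRAFFE actually uses per-object shape, appearance and pose codes), so treat this only as an outline of the autoencoder training loop.

# Outline of GAN inversion via an encoder; the "generator" is a frozen
# stand-in for a pretrained GIRAFFE-style model.
import torch
import torch.nn as nn

latent_dim = 256
encoder = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, latent_dim),
)
generator = nn.Sequential(nn.Linear(latent_dim, 3 * 32 * 32),
                          nn.Unflatten(1, (3, 32, 32)))      # placeholder generator
for p in generator.parameters():
    p.requires_grad_(False)                                   # generator stays frozen

opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)

def train_step(images):                                       # images: [B, 3, 32, 32]
    recon = generator(encoder(images))
    loss = nn.functional.mse_loss(recon, images)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

print(train_step(torch.rand(4, 3, 32, 32)))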


Scene Manipulation and Decomposition by Inverting the GIRAFFE. Publication Opportunity: DynaVis, a CVPR workshop on Dynamic Scene Reconstruction.

Visual Transformers have obtained state-of-the-art classification accuracies [ViT, CLIP, DeiT], but the best ViT models are extremely compute-heavy, and running them even just for inference (let alone training) is expensive.


Running transformers cheaply via quantization is not a new problem; it has been tackled before for BERT [BERT] in NLP [Q-BERT, Q8BERT, TernaryBERT, BinaryBERT]. In this project we will try to quantize pretrained ViT models. The goal is to quantize ViT models for faster inference and smaller models without losing accuracy. Publication Opportunity: Binary Networks for Computer Vision, a CVPR workshop.
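A simple baseline for such a project is post-training dynamic quantization, sketched below with torchvision's pretrained vit_b_16 (an assumption; any pretrained ViT would do). It converts the reachable nn.Linear layers to int8 and compares checkpoint sizes; the thesis would go further with static quantization or quantization-aware training and a proper accuracy evaluation on ImageNet.

# Baseline: dynamic int8 quantization of a pretrained ViT's nn.Linear layers.
# Downloads torchvision weights on first run; inference happens on CPU.
import io
import torch
from torchvision.models import vit_b_16, ViT_B_16_Weights

model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1).eval()
quantized = torch.quantization.quantize_dynamic(model, {torch.nn.Linear},
                                                dtype=torch.qint8)

with torch.no_grad():
    print(quantized(torch.randn(1, 3, 224, 224)).shape)   # torch.Size([1, 1000])

def size_mb(m):
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print("fp32: %.0f MB, dynamic int8: %.0f MB" % (size_mb(model), size_mb(quantized)))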


Recently, contrastive learning has gained a lot of attention for self-supervised image representation learning [SimCLR, MoCo]. Contrastive learning can be extended to multimodal data, such as videos, images and audio [CMC, CoCLR]. Most contrastive methods require large batch sizes or large memory pools, which makes them expensive to train. In this project we are going to use contrastive methods that do not depend on the batch size [SwAV, BYOL, SimSiam] to train multimodal representation extractors.


Our main goal is to compare the proposed method with the CMC baseline, so we will be working with the STL10, ImageNet, UCF, HMDB51, and NYU Depth-V2 datasets. Inspired by recent works on smaller datasets [ConVIRT, CPD], and to accelerate training, we could start with two pretrained single-modal models and finetune them with the proposed method.
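A minimal sketch of the negative-free objective across two modalities is given below: each modality gets its own encoder and predictor, and each branch predicts the other branch's stop-gradient embedding, SimSiam-style, so no large batch or memory bank is needed. The tiny MLP encoders stand in for real video and audio backbones.

# SimSiam-style multimodal objective: negative cosine similarity with a
# stop-gradient on the target branch, symmetrized over the two modalities.
import torch
import torch.nn as nn
import torch.nn.functional as F

def mlp(i, o):
    return nn.Sequential(nn.Linear(i, 256), nn.ReLU(), nn.Linear(256, o))

enc_a, enc_b = mlp(512, 128), mlp(128, 128)    # modality A / modality B encoders
pred_a, pred_b = mlp(128, 128), mlp(128, 128)  # predictors

def neg_cos(p, z):
    return -F.cosine_similarity(p, z.detach(), dim=-1).mean()  # stop-grad on z

def loss_fn(xa, xb):                            # xa, xb: paired features
    za, zb = enc_a(xa), enc_b(xb)
    return 0.5 * (neg_cos(pred_a(za), zb) + neg_cos(pred_b(zb), za))

print(loss_fn(torch.randn(32, 512), torch.randn(32, 128)).item())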


Publication Opportunity: MULA, a CVPR workshop on Multimodal Learning and Applications.

Neural networks have been found to achieve surprising performance in several tasks such as classification, detection and segmentation. However, they are also very sensitive to small, controlled changes to the input.


It has been shown that some changes to an image that are not visible to the naked eye may lead the network to output an incorrect label.
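The classic demonstration is the fast gradient sign method (FGSM): perturb the input by eps times the sign of the loss gradient. The sketch below uses an untrained stand-in model, so it only shows the mechanics; on trained classifiers such a small perturbation routinely changes the predicted label.

# FGSM sketch: a perturbation bounded by eps in the L-infinity norm.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))   # untrained stand-in
x = torch.rand(1, 3, 32, 32, requires_grad=True)
label = model(x).argmax(dim=1)            # take the current prediction as the label

loss = nn.functional.cross_entropy(model(x), label)
loss.backward()

eps = 8 / 255                             # common "imperceptible" budget
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()
print("clean:", label.item(), "after FGSM:", model(x_adv).argmax(dim=1).item())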


This thesis will focus on studying recent progress in this area and aim to build a procedure for a trained network to self-assess its reliability in classification or one of the popular computer vision tasks.


The Personalised Medicine Research Group at the sitem Center for Translational Medicine and Biomedical Entrepreneurship is offering multiple MSc thesis projects to biomedical engineering MSc students that may also be of interest to computer science students. Contact: Kate Gerber.

Chronocam is a rapidly growing startup developing event-based technology, with more than 15 PhDs working on problems like tracking, detection, classification, SLAM, etc.


Event-based computer vision has the potential to solve many long-standing problems in traditional computer vision, and this is a super exciting time as this potential is becoming more and more tangible in many real-world applications.


PhD internships will be more research focused and could possibly lead to a publication. For each intern we offer compensation to cover the expenses of living in Paris. Some of the topics we want to explore: … Email with attached CV to Daniele Perrone (dperrone at chronocam).

Today we have many 3D scanning techniques that allow us to capture the shape and appearance of objects. It is easier than ever to scan real 3D objects and transform them into a digital model for further processing, such as modeling, rendering or animation.


However, the output of a 3D scanner is often a raw point cloud with little to no annotations. The unstructured nature of the point cloud representation makes it difficult to process, e.g. for surface reconstruction. One application is the detection and segmentation of an object of interest. In this project, the student is challenged to design a system that takes a point cloud (a 3D scan) as input and outputs the names of the objects contained in the scan.
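One way such a system could start is with a PointNet-style classifier: a shared per-point MLP followed by symmetric max-pooling, which makes the prediction invariant to the order of the points. The sketch below uses placeholder sizes and random data and is only meant to show the structure.

# PointNet-style classifier sketch: per-point MLP + order-invariant pooling.
import torch
import torch.nn as nn

class PointNetClassifier(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.per_point = nn.Sequential(       # applied to each point independently
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
        )
        self.head = nn.Sequential(nn.Linear(256, 128), nn.ReLU(),
                                  nn.Linear(128, num_classes))

    def forward(self, points):                # points: [B, N, 3]
        feats = self.per_point(points)        # [B, N, 256]
        global_feat = feats.max(dim=1).values # symmetric pooling over points
        return self.head(global_feat)         # [B, num_classes]

scan = torch.randn(2, 2048, 3)                # two point clouds of 2048 points each
print(PointNetClassifier()(scan).shape)       # torch.Size([2, 10])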


This output can then be used to eliminate outliers or points that belong to the background. The approach involves collecting a large dataset of 3D scans and training a neural network on it.

A photograph accurately captures the world in a moment of time and from a specific perspective. Since it is a projection of the 3D space to a 2D image plane, the depth information is lost. Is it possible to restore it, given only a single photograph?


In general, the answer is no. This problem is ill-posed, meaning that many different plausible depth maps exist, and there is no way of telling which one is the correct one. However, if we cover one of our eyes, we are still able to recognize objects and estimate how far away they are.


This motivates the exploration of an approach where prior knowledge can be leveraged to reduce the ill-posedness of the problem. Such a prior could be learned by a deep neural network, trained with many images and depth maps.

Deblurring finds many applications in our everyday life. It is particularly useful when taking pictures on handheld devices (e.g. smartphones), where camera shake can degrade important details.


Therefore, it is desirable to have a good deblurring algorithm implemented directly in the device. In this project, the student will implement and optimize a state-of-the-art deblurring method based on a deep neural network for deployment on mobile phones (Android). The goal is to reduce the number of network weights in order to reduce the memory footprint while preserving the quality of the deblurred images.
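One common way to shrink such a network, sketched below, is to replace standard convolutions with depthwise-separable ones and compare parameter counts. This only illustrates the memory-footprint reduction; the actual project would also involve pruning or quantization and exporting the model to Android.

# Parameter comparison: standard 3x3 convolution vs. depthwise-separable block.
import torch.nn as nn

def standard_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU())

def separable_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in),   # depthwise
        nn.Conv2d(c_in, c_out, 1),                          # pointwise
        nn.ReLU(),
    )

def num_params(m):
    return sum(p.numel() for p in m.parameters())

print("standard :", num_params(standard_block(64, 64)))     # 36,928 parameters
print("separable:", num_params(separable_block(64, 64)))    # 4,800 parameters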


The result will be a camera app that automatically deblurs the pictures, giving the user a choice of keeping the original or the deblurred image.

If an object in front of the camera or the camera itself moves while the aperture is open, the region of motion becomes blurred because the incoming light is accumulated in different positions across the sensor. If there is camera motion, there is also parallax.


Thus, a motion-blurred image contains depth information. In this project, the student will tackle the problem of recovering a depth map from a motion-blurred image. This includes the collection of a large dataset of blurred and sharp images or videos using a pair or triplet of GoPro action cameras. Two cameras will be used in stereo to estimate the depth map, and the third captures the blurred frames.


This data is then used to train a convolutional neural network that will predict the depth map from the blurry image.

The idea of this project is that we have two types of neural networks that work together: there is one network A that assigns images to k clusters, and k simple networks of type B that perform a self-supervised task on those clusters.


The goal of all the networks is to make the k networks of type B perform well on the task. The assumption is that clustering into semantically similar groups will help the networks of type B perform well. This could be done on the MNIST dataset, with B being linear classifiers and the task being rotation prediction.
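A toy version of this setup is sketched below on MNIST-sized inputs: network A softly assigns each rotated image to one of k clusters, the k linear networks of type B predict the applied rotation (0/90/180/270 degrees), and both are trained to minimize the assignment-weighted rotation loss. The sizes and the use of soft assignments are illustrative choices, not a prescribed design.

# Toy sketch: cluster-assignment network A + k rotation-prediction networks B.
import torch
import torch.nn as nn
import torch.nn.functional as F

k = 4
net_a = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, k))          # cluster assigner
nets_b = nn.ModuleList(nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 4))
                       for _ in range(k))                           # rotation predictors
opt = torch.optim.Adam(list(net_a.parameters()) + list(nets_b.parameters()), lr=1e-3)

def train_step(images):                       # images: [B, 1, 28, 28]
    rot = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, int(r), dims=(1, 2))
                           for img, r in zip(images, rot)])
    assign = F.softmax(net_a(rotated), dim=1)                       # [B, k]
    losses = torch.stack([F.cross_entropy(b(rotated), rot, reduction="none")
                          for b in nets_b], dim=1)                  # [B, k]
    loss = (assign * losses).sum(dim=1).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

print(train_step(torch.rand(16, 1, 28, 28)))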


The student designs a data augmentation network that transforms training images in such a way that image realism is preserved, e.g. …




Joachim Dehais - PhD Defense - Computer Vision for Diet Assessment

time: 1:03:35





Theses - Computer Vision Group



© FKI: Research Group on Computer Vision and Artificial Intelligence, INF, University of Bern
