Max Planck ETH Center for Learning Systems
Acronym: MPG ETH CLS | Homepage: http://learning-systems.org/ | Type: Alliance
Open Opportunities

The process of evaluating sleep examinations and diagnosing sleep disorders through polysomnographies (PSGs) is labor-intensive, as it requires manual analysis by sleep technicians and doctors. In collaboration with Clinic Barmelweid, a leading sleep and rehabilitation clinic in northwestern Switzerland, we plan to automate this process using machine learning models. Clinic Barmelweid conducts approximately 400-450 PSGs annually and has access to a dataset of more than 5,000 recordings. - Artificial Intelligence and Signal and Image Processing, Biomedical Engineering, Medical and Health Sciences
- Collaboration, ETH Zurich (ETHZ), Internship, Master Thesis, Semester Project
| This thesis aims to utilize deep learning techniques to analyze eye-tracking data during a goal-directed upper limb task, particularly focusing on participants under the influence of alcohol. The objective is to develop digital health metrics that can elucidate differences in movement planning. - Engineering and Technology, Information, Computing and Communication Sciences, Medical and Health Sciences
- Bachelor Thesis, Master Thesis, Semester Project
| The field of image restoration is continually evolving with the introduction of advanced deep learning models capable of tackling increasingly complex restoration tasks. The use of foundation models, which are pre-trained on diverse data before being fine-tuned for specific tasks, has demonstrated considerable promise in various domains of artificial intelligence. This proposal aims to develop a new foundation model for image restoration by incorporating the state-space model and enhancing it with text prompt capabilities. This approach will allow the model to perform targeted restorations based on descriptive textual prompts, significantly improving the precision and quality of the restoration process. - Computer Vision
- Collaboration, ETH for Development (ETH4D) (ETHZ), Master Thesis
| Reinforcement learning (RL) can potentially solve complex problems in a purely data-driven manner. Still, the state of the art in applying RL to robotics relies heavily on high-fidelity simulators. While learning in simulation makes it possible to circumvent the sample-complexity challenges common in model-free RL, even a slight distribution shift ("sim-to-real gap") between the simulation and the real system can cause these algorithms to fail. Recent advances in model-based reinforcement learning have led to superior sample efficiency, enabling online learning without a simulator. Nonetheless, for obvious reasons, online learning must not cause any damage and should adhere to safety requirements. The proposed project aims to demonstrate how existing safe model-based RL methods can be used to address these challenges. - Engineering and Technology
- Master Thesis
| The objective of this project is to create a comprehensive robotic platform capable of autonomously administering injections into the human eye. The project includes mechanical design, motion planning, and the implementation of a force control algorithm. - Mechanical and Industrial Engineering
- Bachelor Thesis, ETH Zurich (ETHZ), Master Thesis, Semester Project
| This project aims to automatically learn problem-dependent uncertainty sets by exploiting available data on the uncertain parameters. This surpasses the limitations of traditional methods such as robust and stochastic optimization, which assume exact knowledge of the support set and of the probability distribution, respectively. - Information, Computing and Communication Sciences, Optimisation, Systems Theory and Control
- Master Thesis, Semester Project
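As an illustration of the data-driven idea above (a hypothetical sketch, not the project's prescribed method), one simple choice is an ellipsoidal uncertainty set fitted to samples of the uncertain parameter and sized to cover a chosen fraction of the data:

```python
import numpy as np

def fit_ellipsoidal_set(samples, coverage=0.9):
    """Fit a data-driven ellipsoidal uncertainty set
    {x : (x - mu)^T Sigma^{-1} (x - mu) <= r^2},
    sized so that roughly `coverage` of the samples fall inside."""
    mu = samples.mean(axis=0)
    sigma_inv = np.linalg.inv(np.cov(samples, rowvar=False))
    # squared Mahalanobis distance of every sample to the center
    d = np.einsum("ni,ij,nj->n", samples - mu, sigma_inv, samples - mu)
    r2 = np.quantile(d, coverage)
    return mu, sigma_inv, r2

def contains(mu, sigma_inv, r2, x):
    diff = x - mu
    return diff @ sigma_inv @ diff <= r2

# illustrative synthetic data for the uncertain parameter
rng = np.random.default_rng(0)
data = rng.multivariate_normal([1.0, -2.0], [[1.0, 0.3], [0.3, 0.5]], size=500)
mu, sigma_inv, r2 = fit_ellipsoidal_set(data, coverage=0.9)
frac = sum(contains(mu, sigma_inv, r2, x) for x in data) / len(data)
```

Unlike classical robust optimization, nothing here assumes the support set or the distribution is known; the set's shape and size come entirely from the data.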
| 3D hand pose forecasting is a new benchmark introduced by HoloAssist [1]. Existing action forecasting work mostly focuses on providing semantic labels of future actions and does not provide explicit 3D guidance on hand poses. Predicting 3D hand poses can be useful for various applications, as it can augment instructions and spatially guide users in different tasks. In this benchmark, we take 3-second inputs, similar to other 3D body-location forecasting literature, and forecast the continuous 3D hand poses for the next 0.5, 1.0, and 1.5 seconds. The evaluation metric is the mean per-joint position error in centimeters with respect to the ground truth, averaged over time. To make the metric suitable for 3D action guidance, we remove mistakes from the action sequences and only forecast 3D hand poses for the correct labels.
[1] Wang, X., Kwon, T., Rad, M., Pan, B., Chakraborty, I., Andrist, S., ... & Pollefeys, M. (2023). HoloAssist: An egocentric human interaction dataset for interactive AI assistants in the real world. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 20270-20281). - Computer Vision, Virtual Reality and Related Simulation
- ETH Zurich (ETHZ), Master Thesis, Semester Project
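The benchmark's metric, mean per-joint position error averaged over time, can be sketched as follows (the array shapes and the 21-joint hand model are illustrative assumptions):

```python
import numpy as np

def mpjpe_cm(pred, gt):
    """Mean per-joint position error in centimeters: the Euclidean
    distance between predicted and ground-truth joints, averaged
    over time steps and joints.  pred, gt: (T, J, 3) arrays in cm."""
    return np.linalg.norm(pred - gt, axis=-1).mean()

# toy example: 5 future frames, 21 hand joints
gt = np.zeros((5, 21, 3))
pred = gt + np.array([3.0, 0.0, 4.0])  # every joint off by a 3-4-5 cm vector
print(mpjpe_cm(pred, gt))  # 5.0
```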
| Action recognition is an essential task in computer vision and has numerous applications in various fields, including robotics, surveillance, and healthcare. The recognition of actions involves the analysis of temporal and spatial information within a video sequence. Current state-of-the-art methods use 3D hand and object poses for action recognition, where the object is commonly represented by its corner points. However, this approach has limitations in accurately modeling the hand-object interaction. In [1], we showed that leveraging a hand-object contact-map representation improves action recognition. This project investigates whether this representation can be learned implicitly for the task of action recognition.
[1] https://arxiv.org/pdf/2309.10001.pdf - Computer Vision, Virtual Reality and Related Simulation
- ETH Zurich (ETHZ), Master Thesis, Semester Project
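For intuition, a hand-object contact map of the kind referenced above can be derived from pairwise distances between hand joints and object points; the threshold, point counts, and coordinates below are illustrative assumptions, not the exact representation used in [1]:

```python
import numpy as np

def contact_map(hand_joints, object_points, thresh=0.5):
    """Binary hand-object contact map: entry (i, j) is 1 when hand
    joint i lies within `thresh` (same units as the inputs) of
    object point j, else 0."""
    d = np.linalg.norm(
        hand_joints[:, None, :] - object_points[None, :, :], axis=-1
    )
    return (d < thresh).astype(np.float32)

# toy example: 2 hand joints, 2 object points
hand = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
obj = np.array([[0.2, 0.0, 0.0], [5.0, 5.0, 5.0]])
cm = contact_map(hand, obj, thresh=0.5)
print(cm)  # only joint 0 touches object point 0
```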
| The recent development of LLMs (Large Language Models), such as ChatGPT and Llama, opens up new possibilities for understanding procedural actions. In the past, action recognition was restricted to the classification of visual frames. With LLMs, however, the model can observe the whole action sequence more effectively and even predict future actions [1]. In this project, students will explore how LLMs can improve action recognition in procedural tasks. Specifically, given a high-level procedural task (e.g., making coffee, copying a paper), students will use existing pretrained action recognition models to predict the top 5 actions for each clip and feed them into an LLM to refine and correct the predicted actions. As a comparison, students will also establish a baseline that corrects actions using simple machine learning and statistical methods.
[1] Palm: Predicting Actions through Language Models @ Ego4D Long-Term Action Anticipation Challenge 2023, CVPR'23 workshop
- Computer Vision, Text Processing
- ETH Zurich (ETHZ), Master Thesis, Semester Project
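A minimal sketch of the statistical baseline mentioned above (the action names and scores are made up for illustration): rescore each clip's top-5 candidates with a first-order transition prior learned from training sequences, instead of querying an LLM:

```python
from collections import defaultdict

def learn_transitions(train_sequences):
    """Count first-order action transitions in training sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in train_sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts

def refine(top5_per_clip, counts, alpha=0.5):
    """Greedy left-to-right rescoring of the recognizer's top-5 lists,
    mixing each candidate's score with a transition prior given the
    previously chosen action."""
    refined, prev = [], None
    for candidates in top5_per_clip:  # each: list of (action, score)
        def combined(item):
            action, score = item
            if prev is None or not counts[prev]:
                prior = 0.0
            else:
                prior = counts[prev][action] / sum(counts[prev].values())
            return (1 - alpha) * score + alpha * prior
        prev = max(candidates, key=combined)[0]
        refined.append(prev)
    return refined

# toy run: training data says "pour water" usually follows "grind beans"
train = [["grind beans", "pour water", "press"], ["grind beans", "pour water"]]
counts = learn_transitions(train)
top5 = [[("grind beans", 0.9), ("press", 0.1)],
        [("press", 0.55), ("pour water", 0.45)]]
print(refine(top5, counts))  # ['grind beans', 'pour water']
```

The transition prior overrides the recognizer's slightly stronger "press" score in the second clip, which is exactly the kind of correction the LLM variant is expected to make with richer context.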
| Reading text manuals to set up and manipulate devices takes a lot of time, and text is not intuitive when it comes to 3D instruction. Despite the advent of Mixed Reality (MR) devices, 3D instruction is still limited and expensive to set up. In this project, we will develop an adaptive 3D hand-guidance app that projects instructional 3D hand poses into MR devices, derived from pre-recorded instructional videos. - Computer Vision, Virtual Reality and Related Simulation
- ETH Zurich (ETHZ), Master Thesis, Semester Project