Key Laboratory of Visual Synthesis

2021 Early Summer International Academic Symposium

Date: May 29, 2021    Editor: Administrator

Schedule

June 1, Morning
08:00 ~ 08:50  Registration
08:50 ~ 09:00  Opening remarks
09:00 ~ 10:00  High-speed 3D structured light imaging and applications (Prof. Song Zhang, Purdue University, USA)
10:00 ~ 11:00  Face Recognition Beyond Visible (Dr. Cunjian Chen, Michigan State University, USA)
11:00 ~ 12:00  Deep Learning for Robot Perception (Dr. Yu Liu, Bytedance AI Lab)

June 1, Noon
12:00 ~ 14:30  Lunch / break

June 1, Afternoon
14:30 ~ 15:30  Fast Fringe Projection 3D Imaging Based on Deep Learning (Prof. Chao Zuo, Nanjing University of Science and Technology)
15:30 ~ 16:30  Machine Learning: An Information Theoretic Perspective (Prof. Josef Kittler, University of Surrey, UK)
16:30 ~ 17:30  Automatic Face Pixelation in Live Video Streaming (Dr. Jizhe Zhou, University of Macau)
17:30 ~ 17:40  Closing remarks

Talk 1 Title: High-speed 3D structured light imaging and applications

Speaker: Song Zhang (Professor and Assistant Head for Experiential Learning, School of Mechanical Engineering, Purdue University, West Lafayette, IN, USA)

Time: June 1, 09:00-10:00

Venue: Studio Hall, 1st Floor, Yifu Science Building

Abstract:

Advances in optical imaging and machine/computer vision could have a profound impact on biomedical engineering. My research addresses the challenges in high-speed, high-resolution 3D imaging and optical information processing. My current research focuses on achieving speed breakthroughs by developing binary defocusing techniques; effectively storing enormously large 3D data by innovating geometry/video compression methods; and automating structured light 3D imaging techniques using the recently developed electrically tunable lens (ETL). The binary defocusing methods coincide with the inherent operating mechanism of digital-light-processing (DLP) technology, permitting tens-of-kHz 3D imaging speed at camera-pixel spatial resolution. Our novel methods of converting 3D data to regular 2D counterparts offer the opportunity to leverage mature 2D data compression platforms, achieving extremely high compression ratios without reinventing the whole data compression infrastructure. The electronic controllability of the ETL shows promise for 3D imaging autofocusing. In this talk, I will present our recent work in these areas and discuss some of the applications that we have been exploring, including biomedical engineering and forensic sciences, among others.
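The binary defocusing idea can be sketched with a toy one-dimensional simulation (an illustration of the principle only, not the speaker's code; the pattern size, period, and blur amount below are invented). A 1-bit square-wave pattern, once optically defocused (modeled here as a Gaussian blur), loses its higher harmonics and becomes nearly a pure sinusoidal fringe, which is why a DLP projector can switch binary patterns at very high rates and still project sinusoidal fringes:

```python
import numpy as np

N, period = 1024, 32
x = np.arange(N)
binary = ((x % period) < period // 2).astype(float)  # 1-bit fringe pattern

# Model optical defocus as a Gaussian blur (sigma chosen for illustration).
sigma = period / 4
t = np.arange(-4 * int(sigma), 4 * int(sigma) + 1)
kernel = np.exp(-t**2 / (2 * sigma**2))
kernel /= kernel.sum()
# Tile and crop so the blur wraps periodically (avoids edge artifacts).
blurred = np.convolve(np.tile(binary, 3), kernel, mode="same")[N:2 * N]

def harmonic_ratio(signal):
    """Third-harmonic amplitude relative to the fundamental."""
    spec = np.abs(np.fft.rfft(signal))
    f0 = N // period                     # fundamental frequency bin
    return spec[3 * f0] / spec[f0]

print(harmonic_ratio(binary))   # roughly 1/3: strong square-wave harmonics
print(harmonic_ratio(blurred))  # nearly zero: defocus leaves almost a pure sinusoid
```

The Gaussian attenuates the third harmonic far more strongly than the fundamental, so the blurred pattern behaves like the ideal sinusoid that phase-shifting algorithms expect.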

Speaker Bio:

Song Zhang is a Professor and the Assistant Head for Experiential Learning in the School of Mechanical Engineering at Purdue University. He received his Ph.D. degree in mechanical engineering from Stony Brook University in 2005. His research addresses the challenges in high-speed structured light 3D imaging techniques and 3D image/video analysis. His work has been cited over 12,000 times with an h-index of 54. Besides being utilized in academia, technologies developed by his team have been used by Radiohead (a rock band) to create the music video House of Cards, and by law enforcement personnel to document crime scenes. His work has been recognized by the AIAA Best Paper Award, the IEEE ROBIO Best Conference Paper Award, the IS&T Electronic Imaging Best Student Paper Award, the Best of SIGGRAPH Disney Emerging Technologies Award, the NSF CAREER award, Stony Brook University's "40 under 40 Alumni Award", and the College of Engineering Early Career Faculty Research Excellence Awards from both Iowa State and Purdue. He serves as an associate editor for Optics Express and Optics and Lasers in Engineering. He is a Purdue University Faculty Scholar and a Fellow of OSA and SPIE.

Talk 2 Title: Face Recognition Beyond Visible

Speaker: Dr. Cunjian Chen (Michigan State University, USA)

Time: June 1, 10:00-11:00

Venue: Studio Hall, 1st Floor, Yifu Science Building

Abstract:

Thermal sensors are essential for developing face recognition systems that can operate in low-light and nighttime environments, as well as for detecting presentation attacks such as 3D masks and makeup. In this talk, I will showcase two approaches developed to address thermal-to-visible face matching: SG-GAN, in which we embed semantic information to better preserve face shape; and LG-GAN, a latent-guided generative adversarial network that offers useful insights for interpreting and explaining thermal-to-visible face image translation. I will also describe potential applications of thermal sensors for face presentation attack detection and identify key challenges for future thermal-to-visible face recognition technologies.

Speaker Bio:

Dr. Cunjian Chen is currently a Senior Research Associate at Michigan State University. He obtained his Ph.D. in computer science from West Virginia University. His work on presentation attack detection has been recognized with Best Paper Awards from ISBA 2017, WACVW 2018, and WACVW 2021. He has published over 20 peer-reviewed journal and conference papers and received more than 1,000 citations. He is currently an Associate Editor of IET Image Processing and has served as Tutorial Chair for IJCB and Area Chair for ICME, ICIP, and FG. He is a Senior Member of IEEE.

Talk 3 Title: Deep Learning for Robot Perception

Speaker: Yu Liu (Bytedance AI Lab)

Time: June 1, 11:00-12:00

Venue: Studio Hall, 1st Floor, Yifu Science Building

Abstract:

This talk covers a body of work that investigates the use of deep learning for 2D and 3D scene understanding. Although significant progress has been made in computer vision using deep learning, most of that progress has been measured against performance benchmarks, and for static images; good performance on one benchmark does not necessarily mean good generalisation to the kind of viewing conditions that might be encountered by an autonomous robot or agent. We address a range of problems motivated by the desire to see deep learning algorithms generalise better to robotic vision scenarios.

Speaker Bio:

Yu Liu is a research scientist at the Bytedance AI Lab. He was a jointly trained Ph.D. student at the University of Adelaide and ETH Zurich. He received his B.S. degree in software engineering from Southwest University in 2013 and his M.S. degree from the State Key Lab of CAD&CG, Zhejiang University, in 2016, and visited the University of Southern California in 2015. His research interests lie mainly in computer vision, robotics, and deep learning.

Talk 4 Title: Fast Fringe Projection 3D Imaging Based on Deep Learning

Speaker: Chao Zuo (Professor and Doctoral Supervisor, School of Electronic and Optical Engineering, Nanjing University of Science and Technology)

Time: June 1, 14:30-15:30

Venue: Studio Hall, 1st Floor, Yifu Science Building

Abstract:

Following the invention of the laser, which made interferometric recording possible, and of the CCD camera, which digitized optical recording, deep learning has opened up new opportunities for the next round of innovation in optical metrology. In this talk, we present a series of our recent works applying deep learning to fringe projection profilometry. We show that, compared with the conventional Fourier transform and windowed Fourier transform methods, deep-learning-based fringe analysis significantly improves the accuracy and quality of phase reconstruction. Deep learning can also be applied to temporal phase unwrapping, where it outperforms conventional multi-frequency temporal phase unwrapping in both unwrapping reliability and robustness. With the help of deep learning, absolute phase retrieval can be achieved from fewer fringe images, or even a single one, which pushes fringe projection profilometry a step further toward high-speed, high-accuracy, and even transient 3D imaging.
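For context, the conventional Fourier transform method that serves as the baseline above can be sketched in a few lines (a simplified 1D simulation with invented signal parameters, not the authors' implementation): the recorded fringe is Fourier-transformed, the positive carrier lobe is isolated and shifted to baseband, and the wrapped phase is read off as the angle of the inverse transform.

```python
import numpy as np

N = 1024
x = np.arange(N)
f0 = 64                                        # carrier: 64 fringe periods
phi = 1.5 * np.sin(2 * np.pi * x / N)          # hypothetical object phase
I = 100 + 50 * np.cos(2 * np.pi * f0 * x / N + phi)   # recorded fringe

# Fourier-transform fringe analysis: keep only the +1 spectral lobe
# around the carrier, discard DC and the conjugate lobe.
F = np.fft.fft(I)
H = np.zeros(N, dtype=complex)
band = slice(f0 - 32, f0 + 32)                 # window around the carrier
H[band] = F[band]
c = np.fft.ifft(H)                             # analytic fringe signal

# Remove the carrier; the angle is the wrapped phase estimate.
wrapped = np.angle(c * np.exp(-2j * np.pi * f0 * x / N))

# Compare against ground truth modulo 2*pi.
err = np.abs(np.angle(np.exp(1j * (wrapped - phi)))).max()
print(f"max phase error: {err:.2e} rad")
```

In this clean, noise-free simulation the recovery is essentially exact; the deep-learning comparison in the talk concerns realistic cases where the spectral lobes overlap and such filtering degrades the phase.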

Speaker Bio:

Chao Zuo is a professor and doctoral supervisor at the School of Electronic and Optical Engineering, Nanjing University of Science and Technology, and the academic leader of the Smart Computational Imaging Laboratory (SCILab: www.scilaboratory.com). His research focuses on computational optical imaging and optical information processing, with a series of results in non-interferometric quantitative phase microscopy and high-speed structured-light 3D sensing. He has published more than 140 papers in SCI-indexed journals, including more than 80 in JCR Q1 journals; 12 of his papers have been selected as cover articles of journals such as AP, OL, and OE, and 10 have been selected as ESI Highly Cited/Hot Papers. His results have been reported by SPIE Newsroom, OSA Image of the Week, and others. He holds 57 Chinese invention patents and 15 PCT/US patents. He has been selected for the National Science Fund for Excellent Young Scholars, the Jiangsu Distinguished Young Scholars program, Elsevier's Most Cited Chinese Researchers, and Stanford's World's Top 2% Scientists (career and single-year lists). His work has won the First Prize (Basic Research) of the Jiangsu Science and Technology Award and the First Prize for Technological Invention of the Chinese Society for Optical Engineering, among other honors. Graduate students under his supervision have won national grand prizes and gold medals in the "Challenge Cup", "Internet+", and "Chuang Qingchun" competitions, and first place nationally in the Graduate Electronic Design Contest and the IoT Competition. He currently serves as a Topical Editor of PhotoniX and Photonics, an Associate Editor of Microwave and Optical Technology Letters and IEEE Access, a topical editor of Acta Optica Sinica and Laser & Optoelectronics Progress, and a young editorial board member of Infrared and Laser Engineering and the journals of the Chinese Laser Press.

Talk 5 Title: Machine Learning: An Information Theoretic Perspective

Speaker: Josef Kittler (Centre for Vision, Speech and Signal Processing, University of Surrey, UK)

Time: June 1, 15:30-16:30

Venue: Studio Hall, 1st Floor, Yifu Science Building

Abstract:

The core of any Artificial Intelligence (AI) application is machine learning. During the last decade the huge potential of AI has been accentuated by revolutionary progress in deep learning, whereby a task is solved by training a deep neural network (DNN) using training data and an appropriate objective function. The quest for an effective DNN architecture, as well as the learning objective, is the subject of hundreds, if not thousands, of papers published annually. The talk will focus on the problem of measuring the loss of a DNN that drives the learning process. Noting that most researchers use heuristic methods to define the loss function, we resort to information theory to provide a better basis for selecting an objective function that is cognizant of the fact that in machine learning we are dealing with a multitude of probability distributions. The first question to consider is whether classical information measures such as Shannon entropy and Kullback-Leibler divergence, which were developed for communication applications, are equally relevant for decision-making tasks. We will show that there are arguments for adopting or developing variants that are better suited for machine learning. We will also address the problem of modelling the various distributions that play an important part in deep learning. The advocated comprehensive information theoretic approach to machine learning will be illustrated on a number of AI tasks, including classification, retrieval, regression, and classifier incongruence detection.

Speaker Bio:

Josef Kittler is Professor of Machine Intelligence at the Centre for Vision, Speech and Signal Processing, University of Surrey. He received his BA, PhD, and DSc degrees from the University of Cambridge. He teaches and conducts research in machine intelligence, with a focus on biometrics, video and image database retrieval, and cognitive vision. He published the Prentice Hall textbook Pattern Recognition: A Statistical Approach and several edited volumes, as well as more than 500 scientific papers, including more than 200 journal papers. He serves on the editorial boards of several scientific journals in pattern recognition and computer vision and was Series Editor of Springer Lecture Notes in Computer Science from 2004 to 2016.

He served as President of the International Association for Pattern Recognition (IAPR) from 1994 to 1996. He received Honorary Doctorates from the Lappeenranta University of Technology and the Czech Technical University in Prague. He is a Fellow of IET, IAPR, EURASIP, and BMVA, and was elected Fellow of the Royal Academy of Engineering in 2000. He was awarded the IAPR K.S. Fu Prize in 2006 and the IET Faraday Medal in 2008. He has consulted for many companies and was one of the founders of OmniPerception Ltd.

Talk 6 Title: Automatic Face Pixelation in Live Video Streaming

Speaker: Jizhe Zhou (Image Processing and Pattern Recognition Laboratory, University of Macau)

Time: June 1, 16:30-17:30

Venue: Studio Hall, 1st Floor, Yifu Science Building

Abstract:

Pixelation (mosaicking) is the most common means of privacy protection in images and video. Although video and image editing tools can partially assist in generating mosaics, mosaic placement to date still relies mainly on manual work. YouTube and Microsoft have both begun research on automatic pixelation, but mainly for offline processing of hosted videos. In today's live-streaming era, it is imperative to develop a method that can automatically detect private or sensitive content in live streams in real time and generate mosaics for it. In this talk, I will present FPVLS, our automatic real-time pixelation algorithm developed specifically for bystanders' faces in live streams. Built on our proposed PIAP clustering algorithm and a re-detection algorithm, FPVLS clusters per-frame face detections and face embedding vectors to generate stable and accurate face trajectories. PIAP exploits inter-frame correlations to give conventional affinity propagation (AP) clustering a degree of noise resistance and, combined with the re-detection algorithm, addresses the unstable per-frame behavior of face detection and embedding networks. PIAP and the re-detection algorithm use incremental clustering and an empirical likelihood ratio test to meet the speed requirements while preserving as much of the original live footage as possible. FPVLS can run on mobile devices. With only a small amount of manual initialization on the face trajectories, FPVLS automatically pixelates non-streamer faces in live streams, effectively protecting personal privacy.
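The mosaicking operation itself, separate from the face tracking that FPVLS actually contributes, is plain block averaging over a detected face region. A minimal sketch (the `pixelate` helper and the demo frame are hypothetical illustrations, not part of FPVLS):

```python
import numpy as np

def pixelate(img, box, block=8):
    """Mosaic the rectangle box = (top, left, bottom, right) in place by
    replacing each block x block cell with its mean intensity."""
    t, l, b, r = box
    face = img[t:b, l:r].astype(float)
    for y in range(0, face.shape[0], block):
        for x in range(0, face.shape[1], block):
            cell = face[y:y + block, x:x + block]
            cell[...] = cell.mean()            # flatten the cell to one value
    img[t:b, l:r] = face.astype(img.dtype)
    return img

# Demo on a horizontal-gradient "frame": only the (assumed) detected face
# region is mosaicked; the rest of the frame is untouched.
frame = np.tile(np.arange(64, dtype=np.uint8), (64, 1))
pixelate(frame, (16, 16, 48, 48), block=8)
print(frame[16:24, 16:24].ptp())               # 0: one flat mosaic cell
print(frame[0, 0], frame[0, 63])               # 0 63: outside unchanged
```

In a live system this operation would be applied per frame to every box along each non-streamer face trajectory produced by the tracker.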

Speaker Bio:

Jizhe Zhou will soon receive his Ph.D. from the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau. He studies at the Image Processing and Pattern Recognition Laboratory (IPPRLAB) of the University of Macau and has published several first-author papers in information content security and video semantics, three of which appeared in CCF Rank-A journals and conferences. He has conducted in-depth research on automatic content moderation for video and images, successfully applying artificial intelligence to the automatic moderation of short videos and live streams to maintain information security and protect user privacy in real time. He has also obtained results in adversarial training and image steganography, making neural networks more secure against potential attacks. He has participated in several projects on network information, medical imaging, and multimedia security funded by China's Ministry of Science and Technology and the Macau Science and Technology Development Fund, and has served as a reviewer for international journals and conferences including WWW 2021 (CCF-A), AsiaCCS (CCF-C), and IEEE Signal Processing Letters (CCF-C).