ABSTRACT
Real-world data, though massive, are hard for machines to interpret because they are largely unstructured and expressed as natural-language text. One of the grand challenges is to turn such massive corpora into machine-actionable structures. Yet most existing systems rely heavily on human effort to structure corpora, which slows the development of downstream applications. In this talk, Dr. Ren will introduce an effort-light framework that extracts structured facts from massive corpora without task-specific human labeling effort. He will briefly introduce several learning frameworks for structure extraction and share some directions toward mining corpus-specific structured networks for knowledge discovery.
BIO
Xiang Ren is an Assistant Professor in the Department of Computer Science at USC, affiliated with USC ISI. He was a visiting researcher at Stanford University and received his PhD in computer science from UIUC. He is interested in computational methods and systems that extract machine-actionable knowledge from massive unstructured text data, and is particularly excited about problems in modeling sequence and graph data under weak supervision (learning with partial or noisy labels, and semi-supervised learning) and indirect supervision (multi-task learning, transfer learning, and reinforcement learning). His research has been recognized with several prestigious awards, including an ACM SIGKDD Dissertation Award, a Google PhD Fellowship, a Yahoo!-DAIS Research Excellence Award, a Yelp Dataset Challenge Award, and a WWW Best Poster Runner-up.