Research Projects

I am broadly interested in developing data mining techniques to analyze spatial data, such as trajectory data, urban data, and environmental data. I am passionate about interdisciplinary research: our projects are built on collaborations with animal scientists, social scientists, criminologists, and geoscientists. We aim to provide efficient, effective, and practical computational methods that address real-world challenges in understanding such data.

Trajectory Analysis

Advances in location-acquisition technologies and the prevalence of location-based services have generated massive spatial trajectory data, which represent the mobility of diverse moving objects, such as people, vehicles, and animals. Such trajectories offer unprecedented information for understanding both the moving objects and the locations they visit, which could benefit a broad range of applications in business, transportation, ecology, and many other areas.

Collaborators: Roland Kays (NC Museum of Natural Sciences and Department of Forestry & Environmental Resources, North Carolina State University), Margaret Crofoot (Anthropology Department, UC Davis), Wang-Chien Lee (Computer Science and Engineering, Penn State), Stephen Matthews (Department of Sociology and Criminology, Penn State)

Trajectory Analysis: Semantic Trajectory Mining with Contexts (Current Focus)

The increasing availability of contextual information (e.g., venue information, local events, weather, and landscape) can enrich the semantics of trajectory data. We have studied how to annotate the true destination of a mobility record (CIKM'16) and how to annotate the events a person is attending (WWW'15).

Funding: This research is funded by NSF award #1618448 (2016-2019).

Trajectory Analysis: Inferring Social Relationships based on Spatial-Temporal Interactions

Spatiotemporal data collected from GPS has the potential to provide insights into the relationship dynamics of individuals. For example, people gathering on a Saturday night could be an informative signal of a friend relationship, while being together during the day on weekdays indicates a potential colleague relationship. Our recent research projects have studied attraction/avoidance relationships (VLDB'14), friend relationships (ICDM'14, SSTD'11), follower/leader relationships (ICDM'13), and moving object clusters (VLDB'10).

Trajectory Analysis: Periodic Behavior Analysis

We have designed an algorithm, Periodica (KDD'10), to mine multiple interleaved periodic behaviors in complex movement. This is the first work that studies how to automatically detect the hidden periods in the movement. Periodicity can be successfully used to fill in missing data and predict future movement (DAMI'12).

Due to the limitations of positioning technology and data collection mechanisms, movement data collected from GPS or sensors can be highly sparse, noisy, and unsynchronized. In our recent work (KDD'12, TKDE'15), a segment-and-overlay idea is explored to uncover the hidden period: even when the observations are incomplete, the limited periodic observations will cluster together when the data are segmented and overlaid with the correct period.
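
The segment-and-overlay intuition can be illustrated with a minimal Python sketch. This is a simplified illustration, not the published method from KDD'12 or TKDE'15: the function names, the scoring rule (share of observations in the densest time slot, minus the 1/period share expected if observations were spread uniformly), and the toy data are all assumptions made for this example.

```python
from collections import Counter


def overlay_score(timestamps, period):
    """Score a candidate period by overlaying segments of that length.

    Overlaying is done by mapping each timestamp to its offset within the
    period (t % period). The score is the fraction of observations in the
    densest offset, minus 1/period, the fraction expected if observations
    were spread uniformly. This scoring rule is an illustrative assumption.
    """
    slots = Counter(t % period for t in timestamps)
    densest = max(slots.values())
    return densest / len(timestamps) - 1.0 / period


def detect_period(timestamps, candidate_periods):
    """Return the candidate period with the highest overlay score."""
    return max(candidate_periods, key=lambda p: overlay_score(timestamps, p))


# Hypothetical toy data: an event recurs at hour 8 of every day (true period
# of 24 hours), but every third day is unobserved, and two unrelated noisy
# observations (hours 100 and 333) are mixed in.
observed_days = [d for d in range(30) if d % 3 != 0]
timestamps = [24 * d + 8 for d in observed_days] + [100, 333]

best_period = detect_period(timestamps, range(2, 49))
print(best_period)
```

In this toy setting the incomplete observations pile up in a single slot only when the timeline is cut into 24-hour segments; shorter divisors of 24 also align the observations but are penalized by the larger uniform baseline, and other candidate periods scatter them across several slots.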


Urban Data Computing

An increasing amount of urban data is being accumulated in digital form, such as human traces, traffic, venues, crime, weather, local events, vehicle collisions, and many more. Many U.S. cities (e.g., New York City, Chicago, and Los Angeles) have joined the open data initiative and created websites to release city data to the public. Analyzing such data could empower us to address many critical urban issues, such as crime, traffic congestion, education, health, and quality of life. Our recent study has shown that using taxi flow data and Point-Of-Interest data can significantly improve crime rate inference in Chicago (KDD'16).

Funding: This research project is funded by an NSF CAREER award (2017-2022). The crime data analysis research is funded by NSF award #1544455 (2015-2017).

Collaborators: Corina Graif (Department of Sociology and Criminology, Penn State), Daniel Kifer (Department of Computer Science and Engineering, Penn State)


Environmental Data Mining

High-volume hydraulic fracturing, also called fracking, allows drillers to extract natural gas from shale deep within the earth. However, such natural gas development has also led to environmental concerns. Methane gas sometimes escapes from shale gas wells and can contaminate water resources or leak into the atmosphere, where it contributes to greenhouse gas emissions. We explore how to analyze the heterogeneous spatial data that describe distributions of methane concentrations in natural waters.

Funding: This research is funded by NSF award #1639150 (2016-2019) and the CCRINGSS center at Penn State (2015-2016).

Collaborators: Susan Brantley (Department of Geosciences, Penn State)