21 April, 2020
Title: AI-Powered Data Management and the Future of Software
Abstract: The software industry is going through a revolution where we are replacing traditional, human-written software with machine-learned models. This is also happening for database systems, and I will describe recent progress and open research problems for AI-powered database systems. I will also discuss how the practice of software development is changing as a result of the AI shift.
Bio: Johannes Gehrke is a Technical Fellow at Microsoft, where he leads architecture and machine learning in the intelligent communication and conversations cloud. From 1999 to 2015, he was on the faculty of the Department of Computer Science at Cornell University, where he graduated 25 PhD students. Johannes has received an NSF Career Award, a Sloan Research Fellowship, a Humboldt Research Award, and the 2011 IEEE Computer Society Technical Achievement Award, and he is an ACM Fellow. He co-authored the undergraduate textbook “Database Management Systems” (McGraw-Hill, 2002; currently in its third edition), and he was Program Co-Chair of ACM KDD 2004, VLDB 2007, IEEE ICDE 2012, ACM SOCC 2014, and IEEE ICDE 2015.
21 April, 2020
Title: Data Management Challenges in Building and Deploying Smart Spaces
Abstract: It is expected that in the near future the number of devices connected to the internet will exceed 50 billion. Continuous data from such devices offers a unique opportunity to create fine-grained representations of the evolving physical world that can bring transformative improvements to all walks of life: healthcare, transportation, disaster response, sustainable energy, and smart buildings. While opportunities abound, creating a computing and data processing infrastructure for future smart communities poses significant challenges from both the societal and the systems perspectives. From the systems perspective, ingesting and managing IoT data at scale, inferring semantically meaningful information from it, and offering such inferences to developers to create smart applications pose significant challenges. The deployment of new services and capabilities in society has raised questions about the privacy implications of continuous monitoring of physical spaces using sensors, as illustrated by several studies (including ours).
In recent years, at UC Irvine, we have been building a novel data management middleware for the emerging IoT, called TIPPERS, that provides a plug-n-play architecture to integrate diverse privacy technologies (secure computing, policy-based data processing, differential privacy, user preferences, accountability), mechanisms to seamlessly abstract sensor data into semantically meaningful observations that ease application development, and a progressive data processing engine to support real-time analytics. TIPPERS has been deployed on the UCI campus over the past year to support a suite of smart applications and services that are in daily use. In this talk, I will describe our experience building TIPPERS, highlighting novel challenges that emerged in migrating TIPPERS from a laboratory demo to a campus-level deployment. TIPPERS is being transitioned from a campus-scale deployment to new smart-space settings, including at the Naval Information Warfare Center, where the researchers have repurposed TIPPERS for shipboard settings aboard a naval vessel, as part of the Navy's Trident Warrior 2019 and the upcoming 2020 exercises. The talk will focus on our ongoing research on scaling privacy technologies and (near) real-time analytics to real-world deployments.
Bio: Sharad Mehrotra received the PhD degree in computer science from the University of Texas, Austin, in 1993. He is currently a professor in the Department of Computer Science, University of California, Irvine. Previously, he was a professor with the University of Illinois at Urbana-Champaign. He has received numerous awards and honors, including the 2011 SIGMOD Best Paper Award, the 2007 DASFAA Best Paper Award, the 2012 SIGMOD Test of Time Award, the DASFAA Ten-Year Best Paper Awards for 2013 and 2014, the 2013 ACM ICMR Best Paper Award, the 2019 IEEE NCA Best Paper Award, the 2016 Dean’s Award for Research, and the 1998 CAREER Award from the US National Science Foundation (NSF). His primary research interests include database management, distributed systems, secure databases, privacy, and the Internet of Things.
22 April, 2020
Title: A Data-Centric Lens on Cloud Programming and Serverless Computing
Abstract: Major shifts in computing platforms are typically accompanied by new programming models. The public cloud emerged a decade ago, but we have yet to see a new generation of programming platforms arise in response. All the traditional challenges of distributed programming and data are present in the cloud, only they are now faced by the general population of software developers. Added to these challenges are new desires for "serverless" computing, including consumption-based pricing and autoscaling, which raise particular challenges for data-centric applications.
This talk will highlight some key principles for cloud programming that came out of database research, including the CALM Theorem and constructive approaches to monotonic coordination-free consistency. I will discuss a new platform called Hydro that we are building at Berkeley to take these ideas and combine them into a polyglot, pay-as-you-go platform for cloud programming and deployment. Early results on Hydro and on Anna, its underlying key-value store, point to major improvements that researchers can offer to serverless computing and public clouds. The talk will also illustrate emerging cloud opportunities for application areas of interest to our community, including prediction serving, data science, and robotics.
Bio: Joe Hellerstein is the Jim Gray Professor of Computer Science at the University of California, Berkeley. His research focuses on data-centric systems and the way they drive computing. Hellerstein is an ACM Fellow, a Sloan Research Fellow and the recipient of three ACM SIGMOD Test of Time awards. Fortune Magazine included him in their list of 50 smartest people in tech, and MIT's Technology Review magazine included his work on cloud programming in their TR10 list of the 10 technologies "most likely to change our world".
In addition to his academic work, Hellerstein is the co-founder and CSO of Trifacta, which brought academic research on data wrangling to market. Trifacta's software is available in public cloud marketplaces, and also powers Google Cloud Dataprep and IBM Infosphere Advanced Data Preparation. Outside his interests in computing, Hellerstein is an amateur jazz trumpet player, and has performed at notable venues including Yoshi's, ICDE and VLDB.
23 April, 2020
Title: Big Data in Climate and Earth Sciences: Challenges and Opportunities for Data Science
Abstract: The climate and earth sciences have recently undergone a rapid transformation from a data-poor to a data-rich environment. In particular, a massive amount of data about Earth and its environment is now continuously being generated by a large number of Earth-observing satellites as well as physics-based earth system models running on large-scale computational platforms. These massive and information-rich datasets offer huge potential for understanding how the Earth's climate and ecosystem have been changing and how they are being impacted by human actions. This talk will discuss various challenges involved in analyzing these massive datasets, as well as the opportunities they present for advancing both machine learning and the science of climate change, in the context of monitoring the state of the tropical forests and surface water on a global scale.
Bio: Vipin Kumar is a Regents Professor at the University of Minnesota, where he holds the William Norris Endowed Chair in the Department of Computer Science and Engineering. He has authored over 300 research articles and has coedited or coauthored 10 books, including two textbooks, "Introduction to Parallel Computing" and "Introduction to Data Mining", that are used worldwide and have been translated into many languages. Kumar's current major research focus is on bringing the power of big data and machine learning to understand the impact of human-induced changes on the Earth and its environment. Kumar served as the Lead PI of a 5-year, $10 million project, "Understanding Climate Change - A Data Driven Approach", funded by the NSF's Expeditions in Computing program, which is aimed at pushing the boundaries of computer science research. Kumar has served as chair/co-chair for many international conferences in the areas of data mining, big data, and high performance computing, including the 25th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2019).
Kumar has been elected a Fellow of the American Association for the Advancement of Science (AAAS), the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers (IEEE), and the Society for Industrial and Applied Mathematics (SIAM). Kumar's foundational research in data mining and high performance computing has been honored by the ACM SIGKDD 2012 Innovation Award, which is the highest award for technical excellence in the field of Knowledge Discovery and Data Mining (KDD), and the 2016 IEEE Computer Society Sidney Fernbach Award, one of the IEEE Computer Society's highest awards in high-performance computing.