Dynamic Monitoring of Correctness for Stateful Network Behavior

The goal of this project is to facilitate network debugging by exploring runtime checking of correctness properties about stateful network behavior. Monitoring stateful properties at runtime intrinsically requires maintaining information about packet history, which presents unique challenges compared to traditional monitoring approaches.
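
To make the need for packet history concrete, here is a minimal sketch of a runtime monitor for one stateful property: an inbound packet should be forwarded only if the internal host previously sent a packet to that external host. The property, the `Packet` fields, and the `StatefulFirewallMonitor` class are all illustrative assumptions, not part of the project itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    direction: str   # "out" (internal -> external) or "in" (external -> internal)
    forwarded: bool  # whether the network delivered the packet


@dataclass
class StatefulFirewallMonitor:
    """Hypothetical monitor: checks inbound traffic against outbound history."""
    seen_outbound: set = field(default_factory=set)
    violations: list = field(default_factory=list)

    def observe(self, pkt: Packet) -> None:
        if pkt.direction == "out":
            # Packet history: remember which (internal, external) pairs have talked.
            self.seen_outbound.add((pkt.src, pkt.dst))
        elif pkt.forwarded and (pkt.dst, pkt.src) not in self.seen_outbound:
            # An inbound packet was delivered without a prior outbound packet.
            self.violations.append(pkt)


monitor = StatefulFirewallMonitor()
monitor.observe(Packet("10.0.0.1", "8.8.8.8", "out", True))
monitor.observe(Packet("8.8.8.8", "10.0.0.1", "in", True))   # allowed: reply traffic
monitor.observe(Packet("1.2.3.4", "10.0.0.1", "in", True))   # violation: unsolicited
```

The `seen_outbound` set is the state that a stateless, per-packet check cannot provide, and keeping it bounded and up to date at line rate is where the difficulty lies.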

Baggage: End-to-End Execution Context for Distributed Systems

Today's distributed systems do not support the notion of an end-to-end request context in a well-defined or coherent manner. This has led to a fragmented landscape of poorly supported, siloed tracing frameworks. In this work we propose a principled layered design for end-to-end request context in distributed systems that enables tracing systems to share common underlying layers.

Resource Management in Shared Distributed Systems

In distributed systems shared by multiple tenants, effective resource management is an important prerequisite to providing quality-of-service guarantees. However, traditional resource management mechanisms in the operating system and in the hypervisor are ineffective due to a mismatch in the management granularity.

Pivot Tracing

Pivot Tracing is a monitoring framework for distributed systems that can seamlessly correlate statistics across applications, components, and machines at runtime, without needing to change or redeploy system code. Users can define and install monitoring queries on the fly to collect arbitrary statistics from one point in the system while selecting, filtering, and grouping by events meaningful at other points in the system.
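
The idea of grouping a measurement taken at one point by an event observed at another can be sketched as follows. This is a toy illustration under stated assumptions, not Pivot Tracing's actual API: the `Request` class, the two tracepoint functions, and the `client` key are hypothetical, and the tracepoints are hard-coded here, whereas the real system installs queries dynamically.

```python
from collections import defaultdict


class Request:
    """A request carrying baggage: key-value context that travels with it."""
    def __init__(self):
        self.baggage = {}


# Aggregated query result: bytes read, grouped by the originating client.
bytes_by_client = defaultdict(int)


def client_entry_tracepoint(request, client_name):
    # Upstream point: pack the grouping key into the request's baggage.
    request.baggage["client"] = client_name


def disk_read_tracepoint(request, num_bytes):
    # Downstream point: combine the local measurement with upstream context.
    client = request.baggage.get("client", "unknown")
    bytes_by_client[client] += num_bytes


def handle(client_name, read_sizes):
    # Simulates one request flowing from the client entry point to disk reads.
    req = Request()
    client_entry_tracepoint(req, client_name)
    for n in read_sizes:
        disk_read_tracepoint(req, n)


handle("analytics", [4096, 8192])
handle("backup", [1024])
handle("analytics", [100])
```

After these three requests, `bytes_by_client` attributes 12388 bytes to `analytics` and 1024 bytes to `backup`, even though the disk-read point never sees the client name directly.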

Towards a Network Marketplace in a Cloud

Virtually all public clouds today are run by single providers, which creates near-monopolies and inefficient markets and hinders innovation at the infrastructure level. In this project we borrow ideas from the Internet architecture and propose to structure the cloud datacenter network as a marketplace where multiple service providers can offer connectivity services to tenants.

Efficient Job Scheduling for Big Data

Job scheduling in Big Data clusters is crucial both for cluster operators’ return on investment and for overall user experience. In this work we address several anomalies in how modern cluster schedulers manage queues, and argue that maintaining queues of tasks at worker nodes has significant benefits.

Energy Management for Mobile Devices

Background activities on mobile devices can cause significant battery drain while giving the user little visibility or recourse. They range from useful but sometimes overly aggressive tasks, such as polling for messages or updates from sensors and online services, to outright bugs that cause resources to be held unnecessarily.

Planck: Millisecond-scale Monitoring and Control for Commodity Networks

State-of-the-art monitoring mechanisms for conventional networks require hundreds of milliseconds to seconds to extract global network state, like link utilization or the identity of “elephant” flows. Planck is a novel network measurement architecture that employs oversubscribed port mirroring to extract network information at timescales ranging from hundreds of microseconds to a few milliseconds, over 11x and 18x faster than other recent approaches on 1 Gbps and 10 Gbps commodity switches, respectively.

Participatory Networking

PANE is an API that lets applications control a software-defined network (SDN). It delegates read and write authority from the network’s administrators to end users, applications, and devices. Users can then work with the network, rather than around it, to achieve better performance, security, or predictable behavior.