Wednesday, March 21, 2018

Stabilizing Embedology: Geometry-Preserving Delay-Coordinate Maps

Chris just sent me the following:

Hi Igor-
I hope you are well. I wanted to alert you that our paper on delay-coordinate maps and Takens' embeddings has finally appeared.
Eftekhari, Armin, Han Lun Yap, Michael B. Wakin, and Christopher J. Rozell. "Stabilizing embedology: Geometry-preserving delay-coordinate maps." Physical Review E 97, no. 2 (2018): 022222.
You had mentioned a much earlier preliminary result on your blog but this is the full and final result. It uses the tools familiar to this community (random measurements, stable embeddings) to address a fundamental observability result about nonlinear (perhaps even chaotic) dynamical systems from the physics community. The key question is "how much information is there in a time series measurement about the dynamical system that created it?". I think this result is a unique convergence of different fields, and our previous results analyzing recurrent neural networks were a distinct outgrowth of working on this problem.
Thanks Chris for the update !

Delay-coordinate mapping is an effective and widely used technique for reconstructing and analyzing the dynamics of a nonlinear system based on time-series outputs. The efficacy of delay-coordinate mapping has long been supported by Takens' embedding theorem, which guarantees that delay-coordinate maps use the time-series output to provide a reconstruction of the hidden state space that is a one-to-one embedding of the system's attractor. While this topological guarantee ensures that distinct points in the reconstruction correspond to distinct points in the original state space, it does not characterize the quality of this embedding or illuminate how the specific parameters affect the reconstruction. In this paper, we extend Takens' result by establishing conditions under which delay-coordinate mapping is guaranteed to provide a stable embedding of a system's attractor. Beyond only preserving the attractor topology, a stable embedding preserves the attractor geometry by ensuring that distances between points in the state space are approximately preserved. In particular, we find that delay-coordinate mapping stably embeds an attractor of a dynamical system if the stable rank of the system is large enough to be proportional to the dimension of the attractor. The stable rank reflects the relation between the sampling interval and the number of delays in delay-coordinate mapping. Our theoretical findings give guidance to choosing system parameters, echoing the trade-off between irrelevancy and redundancy that has been heuristically investigated in the literature. Our initial result is stated for attractors that are smooth submanifolds of Euclidean space, with extensions provided for the case of strange attractors.
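To make the delay-coordinate map concrete, here is a quick sketch (mine, not the authors' code) that builds delay vectors from a scalar observation of the Lorenz attractor and then compares pairwise distances in state space with distances in delay space, which is what a stable embedding is supposed to preserve. It assumes NumPy and SciPy, and the number of delays M and sampling interval tau are illustrative choices rather than values prescribed by the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Simulate the Lorenz system (a standard strange attractor) and observe only x(t).
def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

dt = 0.01
sol = solve_ivp(lorenz, (0.0, 60.0), [1.0, 1.0, 1.0], t_eval=np.arange(0.0, 60.0, dt))
states = sol.y.T[2000:]      # discard the transient; rows are points near the attractor
obs = states[:, 0]           # scalar time-series measurement h(x) = x

# Delay-coordinate map: Phi_t = [obs_t, obs_{t+tau}, ..., obs_{t+(M-1)tau}]
M, tau = 7, 10               # number of delays and sampling interval (in samples); illustrative choices
def delay_map(series, M, tau):
    n = len(series) - (M - 1) * tau
    return np.stack([series[i:i + n] for i in range(0, M * tau, tau)], axis=1)

Phi = delay_map(obs, M, tau)
X = states[:len(Phi)]        # original states aligned with the delay vectors

# A stable embedding means pairwise distances in delay space track distances in state space,
# i.e. the ratio below stays within a (scaled) band instead of collapsing toward zero.
rng = np.random.default_rng(0)
idx = rng.choice(len(Phi), size=500, replace=False)
d_state = np.linalg.norm(X[idx][:, None] - X[idx][None], axis=-1)
d_delay = np.linalg.norm(Phi[idx][:, None] - Phi[idx][None], axis=-1)
mask = d_state > 1e-3
ratio = d_delay[mask] / d_state[mask]
print("distance ratio (5th / 95th percentile):", np.percentile(ratio, 5), np.percentile(ratio, 95))
```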


Tuesday, March 20, 2018

Sparse Representations and Compressed Sensing Workshop, March 23rd 2018, Inria Paris

Mark just sent me the following the other day:

Dear Igor, 
We thought that readers of Nuit Blanche may be interested in this free one-day workshop in Sparse Representations and Compressed Sensing, being held in Paris next week. There is also an opportunity for PhD students and Early Career Researchers to bring a poster (more below). Best wishes,  
Sure Mark. Here is the announcement:


  Sparse Representations and Compressed Sensing Workshop

  Inria Paris
  2 Rue Simone IFF, 75012 Paris, France


This one-day workshop will bring together researchers working in the area of sparse representations and compressed sensing to find out about the latest developments in theory and applications of these approaches, and to explore directions for future research.

The concept of sparse representations deals with systems of linear equations where only a small number of the coefficients are non-zero. The technique of compressed sensing aims to efficiently sense and reconstruct a signal from few measurements, typically by exploiting the sparse structure of the underlying representation. These techniques have proved very popular over the last decade or so, with new theoretical developments, and successful applications in areas such as hyperspectral imaging, brain imaging, audio signal processing and graph signal processing.
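For newcomers, here is a minimal sketch of the compressed sensing setup described above: a random Gaussian sensing matrix, far fewer measurements than unknowns, and a sparse recovery step. It assumes NumPy and scikit-learn and uses orthogonal matching pursuit purely for convenience; an l1 / basis-pursuit solver would do just as well.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
n, m, k = 256, 80, 8                      # ambient dimension, number of measurements, sparsity

# k-sparse ground-truth vector
x = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x[support] = rng.standard_normal(k)

# Random Gaussian sensing matrix and compressed measurements y = A x (m << n)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x

# Greedy sparse recovery; an l1 minimization solver would serve the same purpose.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k, fit_intercept=False)
omp.fit(A, y)
x_hat = omp.coef_

print("relative reconstruction error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
print("support recovered exactly:", set(np.flatnonzero(x_hat)) == set(support))
```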

This one-day workshop, organized by the SpaRTaN and MacSeNet Initial/Innovative Training Networks*, will include invited keynote talks by Karin Schnass (Universität Innsbruck, Austria) and Jean-Luc Starck (CEA-Saclay, France), oral presentations and posters. The talks and posters will include theoretical advances in sparse representations, dictionary learning and compressed sensing, as well as advances in areas such as brain imaging and MRI, hyperspectral imaging, audio and visual signal processing, inverse imaging problems, and graph-structured signals.

PhD students and Early Career Researchers wishing to bring along a poster of their work for the poster session are encouraged to contact the organizers with a brief abstract of their work. Posters do not need to be novel: this is an opportunity to showcase work and discuss it with others in the field. There will be an opportunity for discussions to continue after the end of the formal workshop.

For more information and to register (free), please visit the workshop website.

* European Union's Seventh Framework Programme (FP7-PEOPLE-2013-ITN) under grant agreement n° 607290 SpaRTaN and H2020 Framework Programme (H2020-MSCA-ITN-2014) under grant agreement n° 642685 MacSeNet

Prof Mark D Plumbley
Professor of Signal Processing
Centre for Vision, Speech and Signal Processing (CVSSP)
University of Surrey, Guildford, Surrey, GU2 7XH, UK


Monday, March 19, 2018

Institute for Advanced Study - Princeton University Joint Symposium on "The Mathematical Theory of Deep Neural Networks", Tuesday March 20th

Adam just sent me the following:
Hi Igor,

I'm a long-time reader of your blog from back in the day when compressed sensing was still up-and-coming. I wanted to bring to your attention a workshop a few of my fellow post-docs at Princeton and I are hosting this Tuesday at the Princeton Neuroscience Institute: The "Institute for Advanced Study - Princeton University Joint Symposium on 'The Mathematical Theory of Deep Neural Networks'". I thought that this symposium would be of interest to both yourself and your readers. Since space is limited, we are going to be live-streaming the talks online (and will post videos once the dust settles). The link to the live-stream is available on the symposium website.



Adam Charles
Post-doctoral associate
Princeton Neuroscience Institute
Princeton, NJ, 08550
Awesome, Adam ! I love the streaming bit. Here is the announcement and the program

Institute for Advanced Study - Princeton University Joint Symposium on "The Mathematical Theory of Deep Neural Networks" 
Tuesday March 20th, Princeton Neuroscience Institute.
PNI  Lecture Hall A32 
This event will be live-streamed (see the symposium website for the link). Additionally, video recordings of the talks will be posted after the event.
Registration is now open: register here. 
Recent advances in deep networks, combined with open, easily accessible implementations, have moved the field's empirical results far faster than formal understanding. The lack of rigorous analysis for these techniques limits their use in addressing scientific questions in the physical and biological sciences, and prevents systematic design of the next generation of networks. Recently, long-overdue theoretical results have begun to emerge. These results, and those that will follow in their wake, will begin to shed light on the properties of large, adaptive, distributed learning architectures, and stand to revolutionize how computer science and neuroscience understand these systems.
This intensive one-day technical workshop will focus on the state of the art in the theoretical understanding of deep learning. We aim to bring together researchers from the Princeton Neuroscience Institute (PNI) and the Center for Statistics and Machine Learning (CSML) at Princeton University, and from the theoretical machine learning group at the Institute for Advanced Study (IAS), who are interested in more rigorously understanding deep networks, to foster increased discussion and collaboration across these intrinsically related groups.

10:00-10:15: Adam Charles (PNI), "Introductory remarks"
10:15-11:15: Sanjeev Arora (IAS), "Why do deep nets generalize, that is, predict well on unseen data?"
11:15-12:15: Sebastian Musslick (PNI), "Multitasking Capability Versus Learning Efficiency in Neural Network Architectures"
12:15-1:30: Lunch
1:30-2:30: Joan Bruna (NYU), "On the Optimization Landscape of Neural Networks"
2:30-3:30: Andrew Saxe (Harvard), "A theory of deep learning dynamics: Insights from the linear case"
3:30-4:00: Break
4:00-5:00: Anna Gilbert (U Mich), "Towards Understanding the Invertibility of Convolutional Neural Networks"
5:00-6:00: Nadav Cohen (IAS), "Expressiveness of Convolutional Networks via Hierarchical Tensor Decompositions"
6:00-6:15: Michael Shvartsman and Ahmed El Hady (PNI), "Outgoing remarks"
6:15-8:00: Reception


Friday, March 16, 2018

Gradients explode - Deep Networks are shallow - ResNet explained

So last night at the Paris Machine Learning meetup, we had the good folks from Snips announce the release and open sourcing of their Natural Language Understanding code. Joseph also mentioned that, after an extensive architecture search, a simple CRF, a single-layer model, did as well as the commercial alternatives. It's NLP, so the representability issue has already been parsed. In a different corner of the galaxy, the following paper seems to suggest that ResNets, while rendering these deep networks effectively shallower, do not solve the gradient explosion problem.

Abstract: Whereas it is believed that techniques such as Adam, batch normalization and, more recently, SeLU nonlinearities "solve" the exploding gradient problem, we show that this is not the case and that in a range of popular MLP architectures, exploding gradients exist and that they limit the depth to which networks can be effectively trained, both in theory and in practice. We explain why exploding gradients occur and highlight the collapsing domain problem, which can arise in architectures that avoid exploding gradients. ResNets have significantly lower gradients and thus can circumvent the exploding gradient problem, enabling the effective training of much deeper networks, which we show is a consequence of a surprising mathematical property. By noticing that any neural network is a residual network, we devise the residual trick, which reveals that introducing skip connections simplifies the network mathematically, and that this simplicity may be the major cause for their success.
TL;DR: We show that in contrast to popular wisdom, the exploding gradient problem has not been solved and that it limits the depth to which MLPs can be effectively trained. We show why gradients explode and how ResNet handles them.
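As a toy illustration of the question being asked (and not the authors' code), the sketch below compares, in PyTorch, how much gradient reaches the first layer of a 50-layer tanh MLP with and without identity skip connections. Whether the per-layer gradients grow or shrink depends on the nonlinearity and the initialization, which is part of the paper's point, but the two variants typically behave very differently.

```python
import torch
import torch.nn as nn

depth, width, batch = 50, 128, 64

def layer_gradient_norms(use_skip):
    torch.manual_seed(0)                                  # same init and data for both variants
    layers = nn.ModuleList([nn.Linear(width, width) for _ in range(depth)])
    h = torch.randn(batch, width)
    for lin in layers:
        out = torch.tanh(lin(h))
        h = h + out if use_skip else out                  # identity skip connection vs. plain stack
    loss = h.pow(2).mean()
    loss.backward()
    return [lin.weight.grad.norm().item() for lin in layers]

plain = layer_gradient_norms(use_skip=False)
skipped = layer_gradient_norms(use_skip=True)
print("plain MLP : first-layer / last-layer gradient norm =", plain[0] / plain[-1])
print("with skips: first-layer / last-layer gradient norm =", skipped[0] / skipped[-1])
```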

Residual Networks Behave Like Ensembles of Relatively Shallow Networks by Andreas Veit, Michael Wilber, Serge Belongie

In this work we propose a novel interpretation of residual networks showing that they can be seen as a collection of many paths of differing length. Moreover, residual networks seem to enable very deep networks by leveraging only the short paths during training. To support this observation, we rewrite residual networks as an explicit collection of paths. Unlike traditional models, paths through residual networks vary in length. Further, a lesion study reveals that these paths show ensemble-like behavior in the sense that they do not strongly depend on each other. Finally, and most surprising, most paths are shorter than one might expect, and only the short paths are needed during training, as longer paths do not contribute any gradient. For example, most of the gradient in a residual network with 110 layers comes from paths that are only 10-34 layers deep. Our results reveal one of the key characteristics that seem to enable the training of very deep networks: Residual networks avoid the vanishing gradient problem by introducing short paths which can carry gradient throughout the extent of very deep networks.
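For intuition, the path-counting argument can be reproduced in a few lines (an illustration, not the authors' lesion experiments): with n residual blocks there are 2^n paths whose lengths follow a Binomial(n, 1/2) law, and if one assumes, purely for illustration, that the gradient is attenuated by a factor r < 1 each time it traverses a module, the gradient-weighted distribution over path lengths shifts sharply toward short paths.

```python
import numpy as np
from math import comb

n = 110          # residual blocks, as in the 110-layer example above
r = 0.25         # assumed gradient attenuation per traversed module (purely illustrative)

lengths = np.arange(n + 1)
counts = np.array([comb(n, k) for k in lengths], dtype=float)

# Path lengths follow Binomial(n, 1/2): most of the 2^n paths have length near n/2 ...
path_dist = counts / counts.sum()
# ... but if the gradient shrinks by a factor r each time it passes through a module,
# the gradient-weighted length distribution concentrates on much shorter paths.
grad_weighted = counts * r ** lengths
grad_weighted /= grad_weighted.sum()

print("mean path length:                %.1f" % (lengths @ path_dist))
print("mean gradient-weighted length:   %.1f" % (lengths @ grad_weighted))
print("gradient mass from length 10-34: %.2f" % grad_weighted[(lengths >= 10) & (lengths <= 34)].sum())
```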

Deep Residual Learning for Image Recognition by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers.
The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
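The "residual functions with reference to the layer inputs" formulation simply means computing y = F(x) + x instead of y = H(x). A generic basic block (a simplified sketch, not the exact ImageNet architecture from the paper) looks like this in PyTorch:

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """y = relu(F(x) + x), where F is two 3x3 conv + batch-norm layers."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        residual = self.bn2(self.conv2(torch.relu(self.bn1(self.conv1(x)))))
        return torch.relu(residual + x)   # identity shortcut: the block only learns F(x) = H(x) - x

block = BasicResidualBlock(64)
x = torch.randn(1, 64, 32, 32)
print(block(x).shape)                     # torch.Size([1, 64, 32, 32])
```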


Wednesday, March 14, 2018

Paris Machine Learning Meetup #7 Season 5, Natural Language Understanding (NLU), AI for HR, decentralized AI

Tonight we will be hosted by Urban Linker ! The video stream is here, and the presentation slides will be available here as well before the meetup. Stay tuned.


Joseph Dureau, Snips NLU, an Open Source, Private by Design alternative to cloud-based solutions

As part of its mission to expand the use of privacy-preserving AI solutions, the Snips team has decided to fully open source its solution for Natural Language Understanding. Snips NLU is an alternative to all cloud-based NLU solutions powering chatbots or voice assistants: Dialogflow, Recast, Amazon Lex, Watson, etc. You can run it on the edge or on premises, thus avoiding giving away your user data to a third party service.
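For readers who want to try it, basic usage of the open-sourced snips-nlu Python package looks roughly like the sketch below; dataset.json is a placeholder for a training dataset in Snips' JSON format (intents with utterances, entities, language), and the exact API is best double-checked against the project's documentation.

```python
import json

# Assumptions: `pip install snips-nlu` has been run and `dataset.json` is a training
# dataset in Snips' JSON format (intents, entities, language).
from snips_nlu import SnipsNLUEngine

with open("dataset.json", encoding="utf-8") as f:
    dataset = json.load(f)

engine = SnipsNLUEngine()                 # runs entirely locally: no cloud round-trip
engine.fit(dataset)                       # trains the intent classifier and slot filler

parsing = engine.parse("Turn the lights on in the kitchen")
print(json.dumps(parsing, indent=2))      # detected intent and extracted slots
```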

Erik Mathiesen, An AI Careers Advisor: Using Machine Learning to Predict Your Career Path

The company specializes in smart solutions for recruitment. In this talk, I will describe how we use AI, and in particular Neural Networks and Deep Learning, to analyse and predict people's career paths. Having analysed millions of CVs, our system can predict from a person's CV what jobs are most likely to be next in the career path of that individual, as well as when the next job move is most likely to happen. By doing this, we enable companies to predict and find better candidates as well as forecast future hiring needs within an organisation. I will outline the technologies and techniques used in this application and give a few illustrative examples of its usage.

An open-source community focused on building technology to facilitate the decentralized ownership of data and intelligence.


Monday, March 12, 2018

Random projections in gravitational wave searches of compact binaries

Randomized matrix factorization and gravitational waves, this is cool !

Random projection (RP) is a powerful dimension reduction technique widely used in analysis of high dimensional data. We demonstrate how this technique can be used to improve the computational efficiency of gravitational wave searches from compact binaries of neutron stars or black holes. Improvements in low-frequency response and bandwidth due to detector hardware upgrades pose a data analysis challenge in the advanced LIGO era as they result in increased redundancy in template databases and longer templates due to higher number of signal cycles in band. The RP-based methods presented here address both these issues within the same broad framework. We first use RP for an efficient, singular value decomposition inspired template matrix factorization and develop a geometric intuition for why this approach works. We then use RP to calculate approximate time-domain correlations in a lower dimensional vector space. For searches over parameters corresponding to non-spinning binaries with a neutron star and a black hole, a combination of the two methods can reduce the total on-line computational cost by an order of magnitude over a nominal baseline. This can, in turn, help free-up computational resources needed to go beyond current spin-aligned searches to more complex ones involving generically spinning waveforms.
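The core idea, computing correlations against a large template bank approximately in a randomly projected lower-dimensional space, can be sketched generically as follows (synthetic "templates" and made-up sizes, not the authors' pipeline):

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_templates, d = 4096, 2000, 256   # samples per segment, template-bank size, projected dimension

# Synthetic unit-norm "templates" and a noisy data segment containing one of them.
T = rng.standard_normal((n_templates, n))
T /= np.linalg.norm(T, axis=1, keepdims=True)
true_index = 137
data = T[true_index] + 0.3 * rng.standard_normal(n)

# Johnson-Lindenstrauss-style random projection to d << n dimensions.
R = rng.standard_normal((d, n)) / np.sqrt(d)
T_low = T @ R.T            # project the whole template bank once, offline
data_low = R @ data        # project each incoming data segment online

exact = T @ data           # full correlations:        cost ~ n_templates * n per segment
approx = T_low @ data_low  # approximate correlations: cost ~ n_templates * d per segment

print("best-matching template (exact vs. projected):", exact.argmax(), approx.argmax())
print("max absolute correlation error:", np.abs(exact - approx).max())
```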


Saturday, March 10, 2018

Saturday Morning Videos: NIPS2017 Meta Learning Symposium videos

Pieter Abbeel mentioned that the #nips2017 Meta Learning Symposium videos are now available here.

Thanks to Risto Miikkulainen, Quoc Le, Kenneth Stanley, and Chrisantha Fernando for organizing and getting the videos online !

Opening remarks, Quoc Le (slides, video)

Topic I: Evolutionary Optimization

  • Evolving Multitask Neural Network Structure, Risto Miikkulainen (slides, video)
  • Evolving to Learn through Synaptic Plasticity, Ken Stanley (slides, video)
  • PathNet and Beyond, Chrisantha Fernando (slides, video)
Topic II: Bayesian Optimization

  • Bayesian Optimization for Automated Model Selection, Roman Garnett (slides, video)
  • Automatic Machine Learning (AutoML) and How To Speed It Up, Frank Hutter (slides, video)

Topic III: Gradient Descent

  • Contrasting Model- and Optimization-based Metalearning, Oriol Vinyals (slides, video)
  • Population-based Training for Neural Network Meta-Optimization, Max Jaderberg (slides, video)
  • Learning to Learn for Robotic Control, Pieter Abbeel (slides, video)
  • On Learning How to Learn Learning Strategies, Juergen Schmidhuber (slides, video)

Topic IV: Reinforcement Learning

  • Intrinsically Motivated Reinforcement Learning, Satinder Singh (video)
  • Self-Play, Ilya Sutskever (slides, video)
  • Neural Architecture Search, Quoc Le (slides, video)
  • Multiple scales of reward and task learning, Jane Wang (slides, video)

Panel discussion, Moderator: Risto Miikkulainen, Panelists: Frank Hutter, Juergen Schmidhuber, Ken Stanley, Ilya Sutskever (video)

Photo credit: NASA, Starshine 2; more on project Starshine.


Tuesday, March 06, 2018

Randomness in Deconvolutional Networks for Visual Representation

So random weight networks seem to have better generalization properties, huh.

Toward a deeper understanding of the inner workings of deep neural networks, we investigate CNNs (convolutional neural networks) using DCNs (deconvolutional networks) and a randomization technique, and gain new insights into the intrinsic properties of this network architecture. For the random representations of an untrained CNN, we train the corresponding DCN to reconstruct the input images. Compared with the image inversion on a pre-trained CNN, our training converges faster and the resulting network exhibits higher quality for image reconstruction. This indicates there is rich information encoded in the random features; the pre-trained CNN may discard information irrelevant for classification and encode relevant features in a way favorable for classification but harder for reconstruction. We further explore the properties of the overall random CNN-DCN architecture. Surprisingly, images can be inverted with satisfactory quality. Extensive empirical evidence as well as theoretical analysis are provided.
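The setup in the abstract, a frozen randomly initialized CNN whose features a deconvolutional network is trained to invert back to the input image, can be sketched as follows (a toy version with random images and made-up layer sizes, not the authors' architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Randomly initialized, frozen CNN "encoder" (the random representation).
encoder = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),   # 32x32 -> 16x16
    nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 16x16 -> 8x8
)
for p in encoder.parameters():
    p.requires_grad_(False)

# Trainable deconvolutional "decoder" that learns to invert the random features.
decoder = nn.Sequential(
    nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),  # 8x8 -> 16x16
    nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),              # 16x16 -> 32x32
)

opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
images = torch.rand(64, 3, 32, 32)        # stand-in for a batch of real images (e.g. CIFAR-10)

for step in range(200):
    features = encoder(images)            # random representations of the inputs
    recon = decoder(features)
    loss = F.mse_loss(recon, images)
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final reconstruction MSE:", loss.item())
```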


Sunday, February 18, 2018

Sunday Morning Insight: LightOn Cloud: Light Based Technology for AI on the Cloud

As some of you may know, part of the reason I am a little less active on Nuit Blanche these days is that I am involved with LightOn. At LightOn, we build hardware that uses light to perform computations of interest to Machine Learning. In short, we bring light to AI.

Quite simply, we are building a hardware product that does random projections... for now. If you are a student of history, or know how technologies begin and thrive, you know it is essential for a technology to meet its eventual end users very early on.

At LightOn, we want to get as much feedback as possible from the Machine Learning community as early as possible. And so for the past year, we have been working on integrating our technology so that it can be accessible on the web.  

Thanks to the OVH Labs program, we got one of our prototypes running in a nearby data center. On December 20th, we had first light, and it was beautiful.

Since then, we have been going through our Verification and Validation (V&V) program and have started to run some algorithms on the prototype. On Friday, we issued a press release announcing that we are opening up our cloud to the Machine Learning community. If you want to be a beta user on our cluster, please register your interest here.

Forward we go !

How to find us on the web ?


Saturday, February 17, 2018

Posters: SysML 2018 Conference

The SysML 2018 Conference is currently taking place at Stanford; while the live stream is over, the poster session is on. Here are the posters in each session:

Session I: 4:30pm - 6:00pm
1-1 A SIMD-MIMD Acceleration with Access-Execute Decoupling for Generative Adversarial Networks Amir Yazdanbakhsh, Kambiz Samadi, Hadi Esmaeilzadeh, Nam Sung Kim
1-2 Slice Finder: Automated Data Slicing for Model Interpretability Yeounoh Chung, Tim Kraska, Steven Euijong Whang, Neoklis Polyzotis
1-3 Data Infrastructure for Machine Learning Eric Breck, Neoklis Polyzotis, Sudip Roy, Steven Euijong Whang, Martin Zinkevich
1-4 Speeding up ImageNet Training on Supercomputers Yang You, Zhao Zhang, Cho-Jui Hsieh, James Demmel, Kurt Keutzer
1-5 Aloha: A Machine Learning Framework for Engineers Ryan M Deak, Jonathan H Morra
1-6 Parameter Hub: High Performance Parameter Servers for Efficient Distributed Deep Neural Network Training Liang Luo, Jacob Nelson, Luis Ceze, Amar Phanishayee, Arvind Krishnamurthy
1-7 Stitch-X: An Accelerator Architecture for Exploiting Unstructured Sparsity in Deep Neural Networks Ching-En Lee, Yakun Sophia Shao, Jie-Fang Zhang, Angshuman Parashar, Joel Emer, Stephen W. Keckler, Zhengya Zhang
1-8 DeepVizdom: Deep Interactive Data Exploration Carsten Binnig, Kristian Kersting, Alejandro Molina, Emanuel Zgraggen
1-9 Massively Parallel Video Networks João Carreira, Viorica Pătrăucean, Andrew Zisserman, Simon Osindero
1-10 EVA: An Efficient System for Exploratory Video Analysis Ziqiang Feng, Junjue Wang, Jan Harkes, Padmanabhan Pillai, Mahadev Satyanarayanan
1-11 Declarative Metadata Management: A Missing Piece in End-To-End Machine Learning Sebastian Schelter, Joos-Hendrik Böse, Johannes Kirschnick, Thoralf Klein, Stephan Seufert
1-12 Runway: machine learning model experiment management tool Jason Tsay, Todd Mummert, Norman Bobroff, Alan Braz, Peter Westerink, Martin Hirzel
1-13 STRADS-AP: Simplifying Distributed Machine Learning Programming Jin Kyu Kim, Garth A. Gibson, Eric P. Xing
1-14 A Deeper Look at FFT and Winograd Convolutions Aleksandar Zlateski, Zhen Jia, Kai Li, Fredo Durand
1-15 Efficient Deep Learning Inference on Edge Devices Ziheng Jiang, Tianqi Chen, Mu Li
1-16 On Human Intellect and Machine Failures: Troubleshooting Integrative Machine Learning Systems Besmira Nushi, Ece Kamar, Eric Horvitz, Donald Kossmann
1-17 DeepThin: A Self-Compressing Library for Deep Neural Networks Matthew Sotoudeh, Sara S. Baghsorkhi
1-18 MAERI: Enabling Flexible Dataflow Mapping over DNN Accelerators via Programmable Interconnects Hyoukjun Kwon, Ananda Samajdar, Tushar Krishna
1-19 On Machine Learning and Programming Languages Mike Innes, Stefan Karpinski, Viral Shah, David Barber, Pontus Stenetorp, Tim Besard, James Bradbury, Valentin Churavy, Simon Danisch, Alan Edelman, Jon Malmaud, Jarrett Revels, Deniz Yuret
1-20 "I Like the Way You Think!" - Inspecting the Internal Logic of Recurrent Neural Networks Thibault Sellam, Kevin Lin, Ian Yiran Huang, Carl Vondrick, Eugene Wu
1-21 Automatic Differentiation in Myia Olivier Breuleux, Bart van Merriënboer
1-22 TFX Frontend: A Graphical User Interface for a Production-Scale Machine Learning Platform Peter Brandt, Josh Cai, Tommie Gannert, Pushkar Joshi, Rohan Khot, Chiu Yuen Koo, Chenkai Kuang, Sammy Leong, Clemens Mewald, Neoklis Polyzotis, Herve Quiroz, Sudip Roy, Po-Feng Yang, James Wexler, Steven Euijong Whang
1-23 Learned Index Structures Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, Neoklis Polyzotis
1-24 Towards Optimal Winograd Convolution on Manycores Zhen Jia, Aleksandar Zlateski, Fredo Durand, Kai Li
1-25 Mobile Machine Learning Hardware at ARM: A Systems-on-Chip (SoC) Perspective Yuhao Zhu, Matthew Mattina, Paul Whatmough
1-26 Deep Learning with Apache SystemML Niketan Pansare, Michael Dusenberry, Nakul Jindal, Matthias Boehm, Berthold Reinwald, Prithviraj Sen
1-27 Scalable Language Modeling: WikiText-103 on a Single GPU in 12 hours Stephen Merity, Nitish Shirish Keskar, James Bradbury, Richard Socher
1-28 PipeDream: Pipeline Parallelism for DNN Training Aaron Harlap, Deepak Narayanan, Amar Phanishayee, Vivek Seshadri, Gregory R. Ganger, Phillip B. Gibbons
1-29 Efficient Mergeable Quantile Sketches using Moments Edward Gan, Jialin Ding, Peter Bailis
1-30 Systems Optimizations for Learning Certifiably Optimal Rule Lists Nicholas Larus-Stone, Elaine Angelino, Daniel Alabi, Margo Seltzer, Vassilios Kaxiras, Aditya Saligrama, Cynthia Rudin
1-31 Accelerating Model Search with Model Batching Deepak Narayanan, Keshav Santhanam, Matei Zaharia
1-32 Programming Language Support for Natural Language Interaction Alex Renda, Harrison Goldstein, Sarah Bird, Chris Quirk, Adrian Sampson
1-33 Factorized Deep Retrieval and Distributed TensorFlow Serving Xinyang Yi, Yi-Fan Chen, Sukriti Ramesh, Vinu Rajashekhar, Lichan Hong, Noah Fiedel, Nandini Seshadri, Lukasz Heldt, Xiang Wu, Ed H. Chi
1-34 Relaxed Pruning: Memory-Efficient LSTM Inference Engine by Limiting the Synaptic Connection Patterns Jaeha Kung, Junki Park, Jae-Joon Kim
1-35 Deploying Deep Ranking Models for Search Verticals Rohan Ramanath, Gungor Polatkan, Liqin Xu, Harold Lee, Bo Hu, Shan Zhou
1-36 Understanding the Error Structure as a Key to Regularize Convolutional Neural Networks Bilal Alsallakh, Amin Jourabloo, Mao Ye, Xiaoming Liu, Liu Ren
1-37 On Scale-out Deep Learning Training for Cloud and HPC Srinivas Sridharan, Karthikeyan Vaidyanathan, Dhiraj Kalamkar, Dipankar Das, Mikhail E. Smorkalov, Mikhail Shiryaev, Dheevatsa Mudigere, Naveen Mellempudi, Sasikanth Avancha, Bharat Kaul, Pradeep Dubey
1-38 In-network Neural Networks Giuseppe Siracusano, Roberto Bifulco
1-39 Compressing Deep Neural Networks with Probabilistic Data Structures Brandon Reagen, Udit Gupta, Robert Adolf, Michael M. Mitzenmacher, Alexander M. Rush, Gu-Yeon Wei, David Brooks
1-40 Greenhouse: A Zero-Positive Machine Learning System for Time-Series Anomaly Detection Tae Jun Lee, Justin Gottschlich, Nesime Tatbul, Eric Metcalf, Stan Zdonik
1-41 Precision and Recall for Range-Based Anomaly Detection Tae Jun Lee, Justin Gottschlich, Nesime Tatbul, Eric Metcalf, Stan Zdonik
1-42 Whetstone: An accessible, platform-independent method for training spiking deep neural networks for neuromorphic processors William M. Severa, Craig M. Vineyard, Ryan Dellana, James B. Aimone
1-43 SparseCore: An Accelerator for Structurally Sparse CNNs Sharad Chole, Ramteja Tadishetti, Sree Reddy
1-44 SGD on Random Mixtures: Private Machine Learning under Data Breach Threats Kangwook Lee, Kyungmin Lee, Hoon Kim, Changho Suh, Kannan Ramchandran
1-45 Towards High-Performance Prediction Serving Systems Yunseong Lee, Alberto Scolari, Matteo Interlandi, Markus Weimer, Byung-Gon Chun
1-46 Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization Fabian Pedregosa, Rémi Leblond, Simon Lacoste–Julien
1-47 Corpus Conversion Service: A machine learning platform to ingest documents at scale. Peter W J Staar, Michele Dolfi, Christoph Auer, Costas Bekas
1-48 Representation Learning for Resource Usage Prediction Florian Schmidt, Mathias Niepert, Felipe Huici
1-49 TVM: End-to-End Compilation Stack for Deep Learning Tianqi Chen, Thierry Moreau, Ziheng Jiang, Haichen Shen, Eddie Yan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
1-50 vectorflow: a minimalist neural-network library Benoît Rostykus, Yves Raimond
1-51 Learning Heterogeneous Cloud Storage Configuration for Data Analytics Ana Klimovic, Heiner Litz, Christos Kozyrakis
1-52 Salus: Fine-Grained GPU Sharing Among CNN Applications Peifeng Yu, Mosharaf Chowdhury
1-53 OpenCL Acceleration for TensorFlow Mehdi Goli, Luke Iwanski, John Lawson, Uwe Dolinsky, Andrew Richards
1-54 Picking Interesting Frames in Streaming Video Christopher Canel, Thomas Kim, Giulio Zhou, Conglong Li, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, Subramanya R. Dulloor
1-55 SLAQ: Quality-Driven Scheduling for Distributed Machine Learning Haoyu Zhang, Logan Stafman, Andrew Or, Michael J. Freedman
1-56 A Comparison of Bottom-Up Approaches to Grounding for Templated Markov Random Fields Eriq Augustine, Lise Getoor
1-57 Growing Cache Friendly Decision Trees Niloy Gupta, Adam Johnston
1-58 Parallelizing Hyperband for Large-Scale Tuning Lisha Li, Kevin Jamieson, Afshin Rostamizadeh, Ameet Talwalkar
1-59 Towards Interactive Curation and Automatic Tuning of ML Pipelines Carsten Binnig, Benedetto Buratti, Yeounoh Chung, Cyrus Cousins, Dylan Ebert, Tim Kraska, Zeyuan Shang, Isabella Tromba, Eli Upfal, Linnan Wang, Robert Zeleznik, Emanuel Zgraggen

Session II: 6:00pm - 7:30pm
2-1 Ternary Residual Networks Abhisek Kundu, Kunal Banerjee, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Bharat Kaul, Pradeep Dubey
2-2 Neural Architect: A Multi-objective Neural Architecture Search with Performance Prediction Yanqi Zhou, Gregory Diamos
2-3 Federated Kernelized Multi-Task Learning Sebastian Caldas, Virginia Smith, Ameet Talwalkar
2-4 Materialization Trade-offs for Feature Transfer from Deep CNNs for Multimodal Data Analytics Supun Nakandala, Arun Kumar
2-5 Scaling HDBSCAN Clustering with kNN Graph Approximation Jacob Jackson, Aurick Qiao, Eric P. Xing
2-6 BlazeIt: An Optimizing Query Engine for Video at Scale Daniel Kang, Peter Bailis, Matei Zaharia
2-7 Time Travel based Feature Generation Kedar Sadekar, Hua Jiang
2-8 Controlling AI Engines in Dynamic Environments Nikita Mishra, Connor Imes, Henry Hoffmann, John D. Lafferty
2-9 Intermittent Deep Neural Network Inference Graham Gobieski, Nathan Beckmann, Brandon Lucia
2-10 CascadeCNN: Pushing the performance limits of quantisation Alexandros Kouris, Stylianos I. Venieris, Christos-Savvas Bouganis
2-11 Making Machine Learning Easy with Embeddings Dan Shiebler, Abhishek Tayal
2-12 CrossBow: Scaling Deep Learning on Multi-GPU Servers Alexandros Koliousis, Pijika Watcharapichat, Matthias Weidlich, Paolo Costa, Peter Pietzuch
2-13 Better Caching with Machine Learned Advice Thodoris Lykouris, Sergei Vassilvitskii
2-14 Large Model Support for Deep Learning in Caffe and Chainer Minsik Cho, Tung D. Le, Ulrich A. Finkler, Haruiki Imai, Yasushi Negishi, Taro Sekiyama, Saritha Vinod, Vladimir Zolotov, Kiyokuni Kawachiya, David S. Kung, Hillery C. Hunter
2-15 Learning Graph-based Cluster Scheduling Algorithms Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Mohammad Alizadeh
2-16 Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning Scott Cyphers, Arjun K. Bansal, Anahita Bhiwandiwalla, Jayaram Bobba, Matthew Brookhart, Avijit Chakraborty, Will Constable, Christian Convey, Leona Cook, Omar Kanawi, Robert Kimball, Jason Knight, Nikolay Korovaiko, Varun Kumar, Yixing Lao, Christopher R. Lishka, Jaikrishnan Menon, Jennifer Myers, Sandeep Aswath Narayana, Adam Procter, Tristan J. Webb
2-17 Efficient Multi-Tenant Inference on Video using Microclassifiers Giulio Zhou, Thomas Kim, Christopher Canel, Conglong Li, Hyeontaek Lim, David G. Andersen, Michael Kaminsky, Subramanya R. Dulloor
2-18 Abstractions for Containerized Machine Learning Workloads in the Cloud Balaji Subramaniam, Niklas Nielsen, Connor Doyle, Ajay Deshpande, Jason Knight, Scott Leishman
2-19 Not All Ops Are Created Equal! Liangzhen Lai, Naveen Suda, Vikas Chandra
2-20 Robust Gradient Descent via Moment Encoding with LDPC Codes Raj Kumar Maity, Ankit Singh Rawat, Arya Mazumdar
2-21 Buzzsaw: A System for High Speed Feature Engineering Andrew Stanton, Liangjie Hong, Manju Rajashekhar
2-22 Predicate Optimization for a Visual Analytics Database Michael R. Anderson, Michael Cafarella, Thomas F. Wenisch, German Ros
2-23 Understanding the Limitations of Current Energy-Efficient Design Approaches for Deep Neural Networks Yu-Hsin Chen, Tien-Ju Yang, Joel Emer, Vivienne Sze
2-24 Compiling machine learning programs via high-level tracing Roy Frostig, Matthew James Johnson, Chris Leary
2-25 Dynamic Stem-Sharing for Multi-Tenant Video Processing Angela Jiang, Christopher Canel, Daniel Wong, Michael Kaminsky, Michael A. Kozuch, Padmanabhan Pillai, David G. Andersen, Gregory R. Ganger
2-26 A Hierarchical Model for Device Placement Azalia Mirhoseini, Anna Goldie, Hieu Pham, Benoit Steiner, Quoc V. Le, Jeff Dean
2-27 Blink: A fast NVLink-based collective communication library Guanhua Wang, Amar Phanishayee, Shivaram Venkataraman, Ion Stoica
2-28 TOP: A Compiler-Based Framework for Optimizing Machine Learning Algorithms through Generalized Triangle Inequality Yufei Ding, Lin Ning, Hui Guang, Xipeng Shen, Madanlal Musuvathi, Todd Mytkowicz
2-29 UberShuffle: Communication-efficient Data Shuffling for SGD via Coding Theory Jichan Chung, Kangwook Lee, Ramtin Pedarsani, Dimitris Papailiopoulos, Kannan Ramchandran
2-30 Toward Scalable Verification for Safety-Critical Deep Networks Lindsey Kuper, Guy Katz, Justin Gottschlich, Kyle Julian, Clark Barrett, Mykel J. Kochenderfer
2-31 DAWNBench: An End-to-End Deep Learning Benchmark and Competition Cody Coleman, Deepak Narayanan, Daniel Kang, Tian Zhao, Jian Zhang, Luigi Nardi, Peter Bailis, Kunle Olukotun, Chris Ré, Matei Zaharia
2-32 Learning Network Size While Training with ShrinkNets Guillaume Leclerc, Raul Castro Fernandez, Samuel Madden
2-33 Have a Larger Cake and Eat It Faster Too: A Guideline to Train Larger Models Faster Newsha Ardalani, Joel Hestness, Gregory Diamos
2-34 Retrieval as a defense mechanism against adversarial examples in convolutional neural networks Junbo Zhao, Jinyang Li, Kyunghyun Cho
2-35 DNN-Train: Benchmarking and Analyzing Deep Neural Network Training Hongyu Zhu, Bojian Zheng, Bianca Schroeder, Gennady Pekhimenko, Amar Phanishayee
2-36 High Accuracy SGD Using Low-Precision Arithmetic and Variance Reduction (for Linear Models) Alana Marzoev, Christopher De Sa
2-37 SkipNet: Learning Dynamic Routing in Convolutional Networks Xin Wang, Fisher Yu, Zi-Yi Dou, Joseph E. Gonzalez
2-38 Memory-Efficient Data Structures for Learning and Prediction Damian Eads, Paul Baines, Joshua S. Bloom
2-39 Efficient and Programmable Machine Learning on Distributed Shared Memory via Static Analysis Jinliang Wei, Garth A. Gibson, Eric P. Xing
2-40 Parle: parallelizing stochastic gradient descent Pratik Chaudhari, Carlo Baldassi, Riccardo Zecchina, Stefano Soatto, Ameet Talwalkar, Adam Oberman
2-41 Optimal Message Scheduling for Aggregation Leyuan Wang, Mu Li, Edo Liberty, Alex J. Smola
2-42 Analog electronic deep networks for fast and efficient inference Jonathan Binas, Daniel Neil, Giacomo Indiveri, Shih-Chii Liu, Michael Pfeiffer
2-43 Network Evolution for DNNs Michael Alan Chang, Aurojit Panda, Domenic Bottini, Lisa Jian, Pranay Kumar, Scott Shenker
2-44 BinaryCmd: Keyword Spotting with deterministic binary basis Javier Fernández-Marqués, Vincent W.-S. Tseng, Sourav Bhattachara, Nicholas D. Lane
2-45 YellowFin: Adaptive Optimization for (A)synchronous Systems Jian Zhang, Ioannis Mitliagkas
2-46 GPU-acceleration for Large-scale Tree Boosting Huan Zhang, Si Si, Cho-Jui Hsieh
2-47 Treelite: toolbox for decision tree deployment Hyunsu Cho, Mu Li
2-48 On Importance of Execution Ordering in Graph-Based Distributed Machine Learning Systems Sayed Hadi Hashemi, Sangeetha Abdu Jyothi, Roy Campbell
2-49 Draco: Robust Distributed Training against Adversaries Lingjiao Chen, Hongyi Wang, Dimitris Papailiopoulos
2-50 Clustering System Data using Aggregate Measures Johnnie C-N. Chang, Robert H-J. Chen, Jay Pujara, Lise Getoor
2-51 A Framework for Searching a Predictive Model Yoshiki Takahashi, Masato Asahara, Kazuyuki Shudo
2-52 Distributed Placement of Machine Learning Operators for IoT applications spanning Edge and Cloud Resources Tarek Elgamal, Atul Sandur, Klara Nahrstedt, Gul Agha
2-53 Finding Heavily-Weighted Features with the Weight-Median Sketch Kai Sheng Tai, Vatsal Sharan, Peter Bailis, Gregory Valiant
2-54 Flexible Primitives for Distributed Deep Learning in Ray Yaroslav Bulatov, Robert Nishihara, Philipp Moritz, Melih Elibol, Ion Stoica, Michael I. Jordan
2-55 BLAS-on-flash: an alternative for training large ML models? Suhas Jayaram Subramanya, Srajan Garg, Harsha Vardhan Simhadri
2-56 Treating Machine Learning Algorithms As Declaratively Specified Circuits Jason Eisner, Nathaniel Wesley Filardo
2-57 Tasvir: Distributed Shared Memory for Machine Learning Amin Tootoonchian, Aurojit Panda, Aida Nematzadeh, Scott Shenker

Rest of the program:

  • 9:00 am - 9:15 am Opening Remarks: Ameet Talwalkar
  • Session I (moderator: Virginia Smith)
  • 9:15 am - 9:55 am Invited talk: Michael I. Jordan
  • 9:55 am - 10:05 am Contributed talk: TVM: End-to-End Compilation Stack for Deep Learning, Tianqi Chen
  • 10:05 am - 10:15 am Contributed talk: Robust Gradient Descent via Moment Encoding with LDPC Codes, Arya Mazumdar
  • 10:15 am - 10:25 am Contributed talk: Analog electronic deep networks for fast and efficient inference, Jonathan Binas
  • 10:25 am - 10:50 am Coffee Break
  • Session II (moderator: Virginia Smith)
  • 10:50 am - 11:30 am Invited talk: Hardware for Deep Learning, Bill Dally
  • 11:30 am - 11:40 am Contributed talk: YellowFin: Adaptive Optimization for (A)synchronous Systems, Ioannis Mitliagkas
  • 11:40 am - 12:20 pm Invited talk: Security, Privacy, and Democratization: Challenges & Future Directions for ML Systems beyond Scalability, Dawn Song
  • 12:20 pm - 1:30 pm Lunch
  • Session III (moderator: Sarah Bird)
  • 1:30 pm - 2:10 pm Invited talk: Structured ML: Opportunities and Challenges for the SysML Community, Lise Getoor
  • 2:10 pm - 2:20 pm Contributed talk: Understanding the Limitations of Current Energy-Efficient Design Approaches for Deep Neural Networks, Vivienne Sze
  • 2:20 pm - 2:30 pm Contributed talk: Towards High-Performance Prediction Serving Systems, Matteo Interlandi
  • 2:30 pm - 2:55 pm Coffee Break
  • Session IV (moderator: Sarah Bird)
  • 2:55 pm - 3:05 pm Contributed talk: "I Like the Way You Think!" - Inspecting the Internal Logic of Recurrent Neural Networks, Thibault Sellam
  • 3:05 pm - 3:45 pm Invited talk: Systems and Machine Learning Symbiosis, Jeff Dean
  • 3:45 pm - 4:00 pm Closing Remarks: Matei Zaharia

Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.