Alex's research aims to bridge the gap between theory and practice in deep learning. He focuses on explaining the effect of individual components in neural networks and how their interactions shape the optimization dynamics of these models. Ultimately, his goal is to turn the current empirical approach to defining architectures into an engineering problem: given a task and the data, how can we pick the optimal combination of layers and hyperparameters?