Research
Selected publications:

Generalization of two-layer neural networks: an asymptotic viewpoint. Ba, J., Erdogdu, M., Suzuki, T., Wu, D., & Zhang, T. (2020). International Conference on Learning Representations.
On solving minimax optimization locally: a follow-the-ridge approach. Wang, Y., Zhang, G., & Ba, J. (2020). International Conference on Learning Representations. arXiv preprint arXiv:1910.07512.
Interplay between optimization and generalization of stochastic gradient descent with covariance noise. Wen, Y., Luk, K., Gazeau, M., Zhang, G., Chan, H., & Ba, J. (2020). International Conference on Artificial Intelligence and Statistics.
BatchEnsemble: an alternative approach to efficient ensemble and lifelong learning. Wen, Y., Tran, D., & Ba, J. (2020). International Conference on Learning Representations.
An inductive bias for distances: neural nets that respect the triangle inequality. Pitis, S., Chan, H., Jamali, K., & Ba, J. (2020). International Conference on Learning Representations.
Towards characterizing the high-dimensional bias of kernel-based particle inference algorithms. Ba, J., Erdogdu, M. A., Ghassemi, M., Suzuki, T., Sun, S., Wu, D., & Zhang, T. (2019). Preprint.
ACTRCE: augmenting experience via teacher's advice for multi-goal reinforcement learning. Chan, H., Wu, Y., Kiros, J., Fidler, S., & Ba, J. (2019). arXiv preprint arXiv:1902.04546.
Dream to control: learning behaviors by latent imagination. Hafner, D., Lillicrap, T., Ba, J., & Norouzi, M. (2019). International Conference on Learning Representations. arXiv preprint arXiv:1912.01603.
DOM-Q-NET: grounded RL on structured language. Jia, S., Kiros, J., & Ba, J. (2019). International Conference on Learning Representations. arXiv preprint arXiv:1902.07257.
Graph normalizing flows. Liu, J. S., Kumar, A., Ba, J., Kiros, J. R., & Swersky, K. (2019). Advances in Neural Information Processing Systems.
Exploring model-based planning with policy networks. Wang, T., & Ba, J. (2019). International Conference on Learning Representations. arXiv preprint arXiv:1906.08649.
Benchmarking model-based reinforcement learning. Wang, T., Bao, X., Clavera, I., Hoang, J., Wen, Y., Langlois, E., … Ba, J. (2019). arXiv preprint arXiv:1907.02057.
Neural graph evolution: towards efficient automatic robot design. Wang, T., Zhou, Y., Fidler, S., & Ba, J. (2019). International Conference on Learning Representations.
Lookahead optimizer: k steps forward, 1 step back. Zhang, M., Lucas, J., Hinton, G. E., & Ba, J. (2019). Advances in Neural Information Processing Systems (pp. 9593–9604).
Towards permutation-invariant graph generation. Liu, J., Kumar, A., Ba, J., & Swersky, K. (2018). Preprint.
Reversible recurrent neural networks. MacKay, M., Vicol, P., Ba, J., & Grosse, R. B. (2018). Advances in Neural Information Processing Systems (pp. 9029–9040).
Kronecker-factored curvature approximations for recurrent neural networks. Martens, J., Ba, J., & Johnson, M. (2018). International Conference on Learning Representations.
On the convergence and robustness of training GANs with regularized optimal transport. Sanjabi, M., Ba, J., Razaviyayn, M., & Lee, J. D. (2018). Advances in Neural Information Processing Systems (pp. 7091–7101).
NerveNet: learning structured policy with graph neural networks. Wang, T., Liao, R., Ba, J., & Fidler, S. (2018). International Conference on Learning Representations.
Flipout: efficient pseudo-independent weight perturbations on mini-batches. Wen, Y., Vicol, P., Ba, J., Tran, D., & Grosse, R. (2018). International Conference on Learning Representations.
Distributed second-order optimization using Kronecker-factored approximations. Ba, J., Grosse, R., & Martens, J. (2017). International Conference on Learning Representations.
Automated analysis of high-content microscopy data with deep learning. Kraus, O. Z., Grys, B. T., Ba, J., Chong, Y., Frey, B. J., Boone, C., & Andrews, B. J. (2017). Molecular Systems Biology, 13(4).
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. Wu, Y., Mansimov, E., Grosse, R. B., Liao, S., & Ba, J. (2017). Advances in Neural Information Processing Systems (pp. 5279–5288).
Using fast weights to attend to the recent past. Ba, J., Hinton, G. E., Mnih, V., Leibo, J. Z., & Ionescu, C. (2016). Advances in Neural Information Processing Systems (pp. 4331–4339).
Layer normalization. Ba, J., Kiros, J. R., & Hinton, G. E. (2016). NIPS 2016 Deep Learning Symposium. arXiv preprint arXiv:1607.06450.
Classifying and segmenting microscopy images with deep multiple instance learning. Kraus, O. Z., Ba, J. L., & Frey, B. J. (2016). Bioinformatics, 32(12), i52–i59.
Generating images from captions with attention. Mansimov, E., Parisotto, E., Ba, J., & Salakhutdinov, R. (2016). International Conference on Learning Representations. arXiv preprint arXiv:1511.02793.
Actor-mimic: deep multitask and transfer reinforcement learning. Parisotto, E., Ba, J., & Salakhutdinov, R. (2016). International Conference on Learning Representations. arXiv preprint arXiv:1511.06342.
Multiple object recognition with visual attention. Ba, J., Mnih, V., & Kavukcuoglu, K. (2015). International Conference on Learning Representations.
Learning wake-sleep recurrent attention models. Ba, J., Salakhutdinov, R. R., Grosse, R. B., & Frey, B. J. (2015). Advances in Neural Information Processing Systems (pp. 2593–2601).
Predicting deep zero-shot convolutional neural networks using textual descriptions. Ba, J., Swersky, K., & Fidler, S. (2015). Proceedings of the IEEE International Conference on Computer Vision (pp. 4247–4255).
Show, attend and tell: neural image caption generation with visual attention. Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., … Bengio, Y. (2015). International Conference on Machine Learning (pp. 2048–2057).
Do deep nets really need to be deep? Ba, J., & Caruana, R. (2014). Advances in Neural Information Processing Systems (pp. 2654–2662).
Making dropout invariant to transformations of activation functions and inputs. Ba, J., Xiong, H. Y., & Frey, B. (2014). NIPS 2014 Workshop on Deep Learning.
Adaptive dropout for training deep neural networks. Ba, J., & Frey, B. (2013). Advances in Neural Information Processing Systems (pp. 3084–3092).