My long-term research goal is to address a computational question: How can we build general problem-solving machines with human-like efficiency and adaptability? My research focuses on developing efficient learning algorithms for deep neural networks. I am also broadly interested in reinforcement learning, natural language processing, and artificial intelligence.

For future students: I am starting as an Assistant Professor in the Department of Computer Science in mid-2018. Please apply through the department's admissions process.

Short bio: I am completing my PhD under the supervision of Geoffrey Hinton. I received both my master's degree (2014) and my undergraduate degree (2011) from the University of Toronto, where I worked with Brendan Frey and Ruslan Salakhutdinov. I was a recipient of the 2016 Facebook Graduate Fellowship in machine learning.

Google Scholar page

Contact me: jba at cs.toronto.edu

Current postdocs

Bradly C. Stadie

Current students

Harris Chan (joint with Sanja Fidler)

Danijar Hafner

Jenny Liu

Silviu Pitis

Tingwu Wang (joint with Sanja Fidler)

Yeming Wen

Denny Wu (joint with Marzyeh Ghassemi)

Michael Zhang


Research

ACTRCE: augmenting experience via teacher's advice for multi-goal reinforcement learning. Chan, H., Wu, Y., Kiros, J., Fidler, S., & Ba, J. (2019). arXiv preprint arXiv:1902.04546.

DOM-Q-NET: grounded RL on structured language. Jia, S., Kiros, J., & Ba, J. (2019). International Conference on Learning Representations.

Neural graph evolution: automatic robot design. Wang, T., Zhou, Y., Fidler, S., & Ba, J. (2019). International Conference on Learning Representations.

Interplay between optimization and generalization of stochastic gradient descent with covariance noise. Wen, Y., Luk, K., Gazeau, M., Zhang, G., Chan, H., & Ba, J. (2019). arXiv preprint arXiv:1902.08234.

Reversible recurrent neural networks. MacKay, M., Vicol, P., Ba, J., & Grosse, R. B. (2018). Advances in Neural Information Processing Systems (pp. 9029–9040).

Kronecker-factored curvature approximations for recurrent neural networks. Martens, J., Ba, J., & Johnson, M. (2018). International Conference on Learning Representations.

On the convergence and robustness of training GANs with regularized optimal transport. Sanjabi, M., Ba, J., Razaviyayn, M., & Lee, J. D. (2018). Advances in Neural Information Processing Systems (pp. 7091–7101).

NerveNet: learning structured policy with graph neural networks. Wang, T., Liao, R., Ba, J., & Fidler, S. (2018). International Conference on Learning Representations.

Flipout: efficient pseudo-independent weight perturbations on mini-batches. Wen, Y., Vicol, P., Ba, J., Tran, D., & Grosse, R. (2018). International Conference on Learning Representations.

Automated analysis of high-content microscopy data with deep learning. Kraus, O. Z., Grys, B. T., Ba, J., Chong, Y., Frey, B. J., Boone, C., & Andrews, B. J. (2017). Molecular Systems Biology, 13(4), 924.

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation. Wu, Y., Mansimov, E., Grosse, R. B., Liao, S., & Ba, J. (2017). Advances in Neural Information Processing Systems (pp. 5279–5288).

Distributed second-order optimization using Kronecker-factored approximations. Ba, J., Grosse, R., & Martens, J. (2016). International Conference on Learning Representations.

Using fast weights to attend to the recent past. Ba, J., Hinton, G. E., Mnih, V., Leibo, J. Z., & Ionescu, C. (2016). Advances in Neural Information Processing Systems (pp. 4331–4339).

Layer normalization. Ba, J., Kiros, J. R., & Hinton, G. E. (2016). arXiv preprint arXiv:1607.06450.

Classifying and segmenting microscopy images with deep multiple instance learning. Kraus, O. Z., Ba, J. L., & Frey, B. J. (2016). Bioinformatics, 32(12), i52–i59.

Multiple object recognition with visual attention. Ba, J., Mnih, V., & Kavukcuoglu, K. (2015). International Conference on Learning Representations.

Learning wake-sleep recurrent attention models. Ba, J., Salakhutdinov, R. R., Grosse, R. B., & Frey, B. J. (2015). Advances in Neural Information Processing Systems (pp. 2593–2601).

Adam: a method for stochastic optimization. Kingma, D., & Ba, J. (2015). International Conference on Learning Representations.

Predicting deep zero-shot convolutional neural networks using textual descriptions. Lei Ba, J., Swersky, K., Fidler, S., & Salakhutdinov, R. (2015). Proceedings of the IEEE International Conference on Computer Vision (pp. 4247–4255).

Generating images from captions with attention. Mansimov, E., Parisotto, E., Ba, J. L., & Salakhutdinov, R. (2015). International Conference on Learning Representations.

Actor-mimic: deep multitask and transfer reinforcement learning. Parisotto, E., Ba, J. L., & Salakhutdinov, R. (2015). International Conference on Learning Representations.

Show, attend and tell: neural image caption generation with visual attention. Xu, K., Ba, J., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., … Bengio, Y. (2015). International Conference on Machine Learning (pp. 2048–2057).

Do deep nets really need to be deep? Ba, J., & Caruana, R. (2014). Advances in Neural Information Processing Systems (pp. 2654–2662).

Adaptive dropout for training deep neural networks. Ba, J., & Frey, B. (2013). Advances in Neural Information Processing Systems (pp. 3084–3092).