1. Capsule networks by Geoffrey Hinton

There is considerable hype around what seems to be a new approach to deep learning, even though Hinton invented the basic idea long ago. Recently he demonstrated state-of-the-art results using a relatively shallow network built from capsules. His talks also cover the philosophy behind the mechanism that makes this possible.
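
As a first taste of the mechanism, here is a minimal NumPy sketch of the "squashing" nonlinearity from the dynamic-routing paper, which shrinks short capsule vectors toward zero and long ones toward unit length (the function name and shapes are mine, for illustration):

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """v = (|s|^2 / (1 + |s|^2)) * (s / |s|): the norm encodes confidence."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

capsules = np.random.randn(3, 8)                  # 3 capsules, 8-dim pose vectors
print(np.linalg.norm(squash(capsules), axis=-1))  # all norms fall in (0, 1)
```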

Papers:

http://www.cs.toronto.edu/~fritz/absps/transauto6.pdf 

https://arxiv.org/pdf/1710.09829.pdf 

https://openreview.net/pdf?id=HJWLfGWRb 

Blogs and video tutorials:

https://medium.com/ai%C2%B3-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b 

https://jhui.github.io/2017/11/03/Dynamic-Routing-Between-Capsules/

https://jhui.github.io/2017/11/14/Matrix-Capsules-with-EM-routing-Capsule-Network/

https://www.youtube.com/watch?v=rTawFwUvnLE&t=3589s (Hinton explaining)

https://www.youtube.com/watch?v=YqazfBLLV4U&t=2353s

Comments:

2. VQA to Visual Dialog by Dhruv Batra’s group

A few months ago, the press reported that Facebook had to shut down an experiment after its bots invented a secret language. That description is, of course, not accurate. It refers to work by Dhruv Batra’s group: a remarkable set of papers dealing with vision, natural language, and deep reinforcement learning, starting from visual question answering and moving on to bots that chat with each other about images.

Papers:

https://arxiv.org/abs/1505.00468

https://arxiv.org/abs/1612.00837

https://arxiv.org/abs/1611.08669

https://arxiv.org/abs/1706.08502

https://arxiv.org/abs/1610.02391

https://arxiv.org/abs/1703.06585

Blogs and video tutorials:

https://www.youtube.com/watch?v=7cGbl_muKIY&t=1545s

https://www.youtube.com/watch?v=Xbl-rQls77U&t=733s

http://videolectures.net/site/normal_dl/tag=1137915/deeplearning2017_parikh_batra_deep_rl.pdf

Comments:

3. Deep image enhancement and restoration

CNNs are very effective at low-level image processing tasks; the papers below apply them to deblurring, recolorization, super-resolution, and denoising.
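
One recurring design in restoration networks is residual learning: the network predicts only a correction to the degraded input. Below is a minimal PyTorch sketch of that idea (the layer widths are illustrative, not taken from any specific paper):

```python
import torch
import torch.nn as nn

class ResidualRestorer(nn.Module):
    """Predicts a correction and adds it back to the degraded input."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # global skip: only the residual is learned

degraded = torch.randn(1, 3, 32, 32)
print(ResidualRestorer()(degraded).shape)  # torch.Size([1, 3, 32, 32])
```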

Papers:

https://cv.snu.ac.kr/publication/conf/2017/DeepDeblur.pdf (Deblurring)

https://cv.snu.ac.kr/publication/conf/2017/thkim_iccv2017_online.pdf (Video Deblurring)

https://cv.snu.ac.kr/publication/conf/2017/PaletteNet.pdf (Recolorization)

https://cv.snu.ac.kr/publication/conf/2017/EDSR.pdf (EDSR super-resolution)

https://arxiv.org/abs/1701.01698 (Class-aware denoising)

https://arxiv.org/abs/1803.02735 (Back-projection networks for super-resolution)

https://dmitryulyanov.github.io/deep_image_prior (deep image prior)

Blogs and video tutorials:

https://www.youtube.com/watch?v=nSugL7HsKmg 

Comments:


4. Domain transfer

A recent challenge in AI is translating images from one domain to another. An even harder version of the problem is doing so without example pairs of images, given only two large unpaired sets. In the words of Yaniv Taigman: “True AI needs no explicit supervision”.
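
The trick that makes unpaired transfer possible in CycleGAN-style methods is a cycle-consistency loss: mapping an image to the other domain and back should recover the original. Here is a hedged sketch of that term, with trivial invertible functions standing in for the two generators:

```python
import torch
import torch.nn.functional as nnf

# Stand-in "generators"; in a real model these are CNNs mapping A->B and B->A.
g_ab = lambda x: x * 0.9 + 0.1
g_ba = lambda y: (y - 0.1) / 0.9

def cycle_consistency(x_a, y_b):
    """L1 cycle loss: g_ba(g_ab(x)) should recover x, and vice versa."""
    return (nnf.l1_loss(g_ba(g_ab(x_a)), x_a)
            + nnf.l1_loss(g_ab(g_ba(y_b)), y_b))

x, y = torch.rand(4, 3, 16, 16), torch.rand(4, 3, 16, 16)
print(cycle_consistency(x, y).item())  # 0.0 here: the stand-ins are exact inverses
```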

Papers:

https://arxiv.org/abs/1611.07004 (Pix2Pix - Efros)

https://arxiv.org/abs/1703.10593 (CycleGAN - Efros)

https://papers.nips.cc/paper/6650-toward-multimodal-image-to-image-translation.pdf (BicycleGAN - Efros)

https://openreview.net/pdf?id=BkN_r2lR- (Analogies across domains - Wolf)

https://arxiv.org/pdf/1706.00826.pdf (DistanceGAN - Wolf)

Blogs and video tutorials:

https://affinelayer.com/pixsrv/

https://affinelayer.com/pix2pix/

https://www.youtube.com/watch?v=AxrKVfjSBiA

https://hardikbansal.github.io/CycleGANBlog/ 

https://www.youtube.com/watch?v=JvGysD2EFhw 

Comments:

5. Special architectures for non-local CNNs

This is a set of papers on irregular ConvNet architectures that exploit non-local information in the data.
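
The central operation in the first paper computes every output position as a weighted sum over all positions, y_i = (1/C(x)) * sum_j f(x_i, x_j) g(x_j). A NumPy sketch of the embedded-Gaussian variant on a flattened feature map (the projection matrices are random stand-ins):

```python
import numpy as np

def nonlocal_block(x, w_theta, w_phi, w_g):
    """x: (N, C) features at N positions; returns the non-local response."""
    theta, phi, g = x @ w_theta, x @ w_phi, x @ w_g  # embeddings
    logits = theta @ phi.T                           # f(x_i, x_j) for all pairs
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)          # softmax over positions j
    return attn @ g                                  # y_i = sum_j attn_ij * g(x_j)

rng = np.random.default_rng(0)
x = rng.standard_normal((49, 32))                    # e.g. a flattened 7x7 map, 32 channels
w = [rng.standard_normal((32, 16)) * 0.1 for _ in range(3)]
print(nonlocal_block(x, *w).shape)                   # (49, 16)
```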

Papers:

https://arxiv.org/abs/1711.07971 (Non local neural networks)

http://openaccess.thecvf.com/content_ICCV_2017/papers/Dai_Deformable_Convolutional_Networks_ICCV_2017_paper.pdf (deformable convolutions)

https://arxiv.org/abs/1611.06757 (Non-local denoising: CNNs + nearest neighbors)

Blogs and video tutorials:

https://www.youtube.com/watch?v=HRLMSrxw2To

https://medium.com/@phelixlau/notes-on-deformable-convolutional-networks-baaabbc11cf3 

6. Advanced Recurrent Neural Networks

A set of recent RNN approaches, with an emphasis on images and video.
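
As a reference while reading, here is one GRU step in minimal NumPy form: two gates decide how much of the old state to reset and how much to overwrite (the weights are random stand-ins):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    z = sigmoid(x @ Wz + h @ Uz)                 # update gate
    r = sigmoid(x @ Wr + h @ Ur)                 # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)     # candidate state
    return (1 - z) * h + z * h_tilde             # blend old state and candidate

rng = np.random.default_rng(0)
x, h = rng.standard_normal(10), np.zeros(20)
W = [rng.standard_normal(s) * 0.1 for s in [(10, 20), (20, 20)] * 3]
print(gru_cell(x, h, *W).shape)                  # (20,)
```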

Papers:

https://arxiv.org/pdf/1502.02367.pdf (GRU)

https://arxiv.org/pdf/1502.03240.pdf (CRF as RNN)

https://arxiv.org/abs/1801.10308 (Nested LSTMs)

https://arxiv.org/abs/1511.06432 (A convolutional GRU for video)

Blogs and video tutorials:

https://jhui.github.io/2017/03/15/RNN-LSTM-GRU/ 

http://karpathy.github.io/2015/05/21/rnn-effectiveness/ 

Comments:

7. Improving time and memory efficiency

Most of these papers are from Song Han’s lab at MIT, which specializes in the efficiency of DNNs: making deep learning feasible on mobile devices and bringing the computation time of neural networks down to practical levels. Their famous SqueezeNet gained a lot of popularity, and they have impressive recent work to be presented at ICLR’18.
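
As one concrete example of the compression ideas below, the trained-ternary-quantization paper keeps only three values per weight layer. A hedged NumPy sketch of the quantization step (the threshold heuristic follows the paper; there, the two scales are learned during training):

```python
import numpy as np

def ternarize(w, w_pos, w_neg, t=0.05):
    """Quantize weights to {+w_pos, 0, -w_neg} by a magnitude threshold."""
    delta = t * np.abs(w).max()      # layer-wise threshold
    q = np.zeros_like(w)
    q[w > delta] = w_pos             # positive scale (learned in the paper)
    q[w < -delta] = -w_neg           # negative scale (learned in the paper)
    return q

w = np.random.randn(4, 4) * 0.3
print(ternarize(w, w_pos=0.8, w_neg=0.5))
```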

Papers:

https://arxiv.org/pdf/1602.07360.pdf (SqueezeNet 2016)

https://arxiv.org/pdf/1607.04381.pdf (dense-sparse-dense training 2017)

https://arxiv.org/pdf/1612.01064.pdf (trained ternary quantization 2017)

https://openreview.net/pdf?id=HJzgZ3JCW (sparse-Winograd convolution 2018)

https://arxiv.org/pdf/1712.01887.pdf (deep gradient compression 2018)

https://arxiv.org/pdf/1710.07739.pdf (Learning discrete weights using the reparameterization trick - Fetaya)

Blogs and video tutorials:

https://www.youtube.com/watch?v=eZdOkDtYMoo 

8. Relations and compositions of objects

These are some recent novel works that pose more advanced learning tasks and examine reasoning and understanding in machine learning.
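
The last paper below reduces relational reasoning to one formula, RN(O) = f(sum over pairs (i, j) of g(o_i, o_j)): score every pair of objects with a small network g, sum the scores, and read out an answer with f. A NumPy sketch with tiny stand-in networks:

```python
import numpy as np

def relation_network(objects, g_w, f_w):
    """Sum a learned pair function over all object pairs, then read out."""
    pair_sum = 0.0
    for o_i in objects:
        for o_j in objects:
            pair = np.concatenate([o_i, o_j])
            pair_sum = pair_sum + np.maximum(pair @ g_w, 0.0)  # g: one ReLU layer
    return pair_sum @ f_w                                      # f: linear readout

rng = np.random.default_rng(0)
objs = rng.standard_normal((5, 4))        # five 4-dim "object" embeddings
g_w = rng.standard_normal((8, 16)) * 0.1  # pair (8-dim) -> hidden (16-dim)
f_w = rng.standard_normal((16, 3)) * 0.1  # hidden -> 3 answer logits
print(relation_network(objs, g_w, f_w))
```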

Papers:

https://www.cs.cmu.edu/~imisra/data/composing_cvpr17.pdf (red wine to red tomato)

https://arxiv.org/pdf/1707.03389.pdf (DeepMind - Learning Compositional Visual Concepts)

https://arxiv.org/pdf/1705.03633.pdf (Fei-Fei Li, visual reasoning)

https://arxiv.org/pdf/1706.01427.pdf (DeepMind - a simple module for relational reasoning)

Blogs and video tutorials:

https://www.youtube.com/watch?v=57KKh2BIFuc 

9. Analyzing and visualizing the loss of neural nets

This is a set of papers that analyze the weight space, loss surfaces, and minima of neural networks, trying to understand why optimization works on such a highly non-convex surface and how different architectures influence the optimization.
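
The visualizations in the first paper boil down to evaluating the loss along random directions around a trained minimizer, f(a, b) = L(theta* + a*delta + b*eta), with the directions rescaled to match the weights' norms. A 1-D NumPy sketch on a toy loss (both the loss and the scales are illustrative):

```python
import numpy as np

def loss(theta):
    """Toy stand-in; in the paper this is the network's full training loss."""
    return np.sum((theta - 1.0) ** 2) + 0.1 * np.sum(np.sin(5.0 * theta))

theta_star = np.ones(100)                  # pretend trained minimizer
delta = np.random.randn(100)
delta *= np.linalg.norm(theta_star) / np.linalg.norm(delta)  # scale-matched direction

alphas = np.linspace(-1.0, 1.0, 21)
profile = [loss(theta_star + a * delta) for a in alphas]     # 1-D slice of the landscape
print(["%.1f" % v for v in profile])
```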

Papers:

https://arxiv.org/abs/1712.09913 (Visualizing the Loss Landscape of Neural Nets)

https://arxiv.org/pdf/1702.08591.pdf (If resnets are the answer, then what is the question?)

https://arxiv.org/pdf/1702.05777.pdf  (Exponentially vanishing sub-optimal local minima)

https://arxiv.org/pdf/1707.04926.pdf (optimization landscape of over-parameterized shallow neural networks)

Comments:

10. Analyzing and improving GANs

Once the basics of GANs are understood, the next step is to analyze them and to find ways to train them better.
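
For instance, the improved-WGAN paper (fourth link below) stabilizes training by penalizing the critic's gradient norm at points interpolated between real and fake samples. A hedged PyTorch sketch of just that penalty term, with a linear stand-in critic:

```python
import torch

critic = torch.nn.Linear(16, 1)                   # stand-in for a real critic network
real, fake = torch.randn(8, 16), torch.randn(8, 16)

eps = torch.rand(8, 1)                            # per-sample mixing coefficients
x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
grad = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)[0]
gradient_penalty = ((grad.norm(2, dim=1) - 1.0) ** 2).mean()  # push norm toward 1
print(gradient_penalty.item())
```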

Papers:

https://arxiv.org/pdf/1606.03498.pdf (Improved techniques for training GANs by Goodfellow)

https://arxiv.org/pdf/1706.08224.pdf (Do GANs learn the distribution?)

http://proceedings.mlr.press/v70/arjovsky17a/arjovsky17a.pdf (Wasserstein GAN)

https://arxiv.org/pdf/1704.00028.pdf (Improved training of Wasserstein GAN)

https://openreview.net/pdf?id=SJx9GQb0- (Improving the Improved Wasserstein GAN)

https://arxiv.org/pdf/1609.03126.pdf (Energy based GANs by LeCun)

11. Detection and Segmentation

Papers:

https://arxiv.org/abs/1311.2524 (R-CNN)

https://arxiv.org/abs/1504.08083 (Fast R-CNN)

https://arxiv.org/abs/1506.01497 (Faster R-CNN)

https://arxiv.org/abs/1703.06870 (Mask R-CNN)

https://arxiv.org/pdf/1506.02640v5.pdf (YOLO)

https://arxiv.org/abs/1612.08242 (YOLO 9000)

https://arxiv.org/abs/1512.02325 (SSD)

Blogs and video tutorials:

https://www.youtube.com/watch?v=nDPWywWRIRo&list=PL3FW7Lu3i5JvHM8ljYj-zLfQRF3EO8sYv 

http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf 

https://www.youtube.com/watch?v=GBu2jofRJtk

https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4 

12. Gradient-based optimization - advanced algorithms

Since the introduction of plain stochastic gradient descent, many advanced techniques have appeared. The most popular is Adam, but there are more recent approaches as well. These papers analyze the existing optimization algorithms and suggest new ones.
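
For reference, here is the Adam update from the first paper in minimal NumPy form (default hyperparameters as in the paper):

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update from bias-corrected first/second moment estimates."""
    m = b1 * m + (1 - b1) * g          # first moment (mean of gradients)
    v = b2 * v + (1 - b2) * g ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - b1 ** t)          # bias corrections for the zero init
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

w, m, v = np.array([2.0]), np.zeros(1), np.zeros(1)
for t in range(1, 201):                # minimize (w - 1)^2
    w, m, v = adam_step(w, 2.0 * (w - 1.0), m, v, t, lr=0.05)
print(w)                               # approaches [1.0]
```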

Papers:

https://arxiv.org/pdf/1412.6980.pdf (Adam & AdaMax)

https://openreview.net/pdf?id=OM0jvwB8jIp57ZJjtNEZ (Nadam)

https://openreview.net/pdf?id=ryQu7f-RZ (AMSGrad)

https://openreview.net/pdf?id=rJTutzbA- (Insufficiency of momentum schemes for optimization)

https://openreview.net/pdf?id=B1YfAfcgl (Entropy-SGD)