Style Transfer

A Neural Algorithm of Artistic Style

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge : Sep 2015
Source

  • In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities.
  • Deep Neural Networks change this: they make it possible to separate and recombine the content and style of arbitrary images.
  • Higher layers in the network capture the high-level content in terms of objects and their arrangement in the input image. These feature responses in higher layers serve as the content representation; below, \(F^l\) and \(P^l\) are the layer-\(l\) feature matrices of the generated image \(\vec x\) and the photograph \(\vec p\):
    $$\mathcal L_{content}(\vec p,\vec x,l) = \frac12\sum_{i,j}{(F^l_{ij} - P^l_{ij})^2}$$
  • For style, we capture correlations between different filter responses, given by the Gram matrix \(G^l \in \mathbb R^{N_l \times N_l}\) where \(G^l_{ij} = \sum_k F^l_{ik}F^l_{jk}\). This representation captures the texture information of the input, but not the global arrangement. Computed over multiple layers, this multi-scale representation is called the style representation; \(A^l\) below is the Gram matrix of the style image \(\vec a\):
    $$E_l = \frac1{4N^2_lM^2_l}\sum_{ij}(G^l_{ij}-A^l_{ij})^2$$
    $$\mathcal L_{style}(\vec a,\vec x) = \sum_{l=0}^Lw_lE_l$$
  • So, we can manipulate both content and style separately.
  • The images are synthesised by finding an image that simultaneously matches the content representation of the photograph and the style representation of the respective piece of art, by minimising the combined loss below (a NumPy sketch of all three losses follows this list):
    $$\mathcal L_{total}(\vec p,\vec a,\vec x) = \alpha\mathcal L_{content}(\vec p,\vec x) + \beta\mathcal L_{style}(\vec a,\vec x)$$
  • Galleries
  • Implementations
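
A minimal NumPy sketch of the three losses, assuming each layer's activations have already been flattened into an \(N_l \times M_l\) matrix (filters × spatial positions) extracted from a network such as VGG. The function names and the \(\alpha,\beta\) values are illustrative; in practice \(\vec x\) is optimised by gradient descent on \(\mathcal L_{total}\):

```python
import numpy as np

def content_loss(F, P):
    # L_content(p, x, l) = 1/2 * sum_ij (F^l_ij - P^l_ij)^2,
    # where F and P are the (N_l, M_l) feature matrices of the
    # generated image x and the photograph p at layer l.
    return 0.5 * np.sum((F - P) ** 2)

def gram(F):
    # G^l_ij = sum_k F^l_ik F^l_jk: correlations between filter responses.
    return F @ F.T

def style_loss(Fs, As, ws):
    # L_style(a, x) = sum_l w_l * E_l, with
    # E_l = 1 / (4 N_l^2 M_l^2) * sum_ij (G^l_ij - A^l_ij)^2.
    # Fs: per-layer features of x; As: per-layer features of the style image a.
    loss = 0.0
    for F, A, w in zip(Fs, As, ws):
        N, M = F.shape
        loss += w * np.sum((gram(F) - gram(A)) ** 2) / (4 * N**2 * M**2)
    return loss

def total_loss(P, F_content, Fs, As, ws, alpha=1.0, beta=1e3):
    # L_total = alpha * L_content + beta * L_style (alpha, beta illustrative).
    return alpha * content_loss(F_content, P) + beta * style_loss(Fs, As, ws)
```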

Improving the Neural Algorithm of Artistic Style

Roman Novak, Yaroslav Nikulin : May 2016
Source

  • Objectives addressed in this paper:
    • Similar areas of the content image should be repainted in a similar way.
    • Different areas should be painted differently.
  • Useful Modifications (a code sketch of the Gram-matrix variants follows this list):
    • A better per-layer content/style weighting scheme.
      • \(w_l^s = 2^{D-d(l)},\quad w_l^c=2^{d(l)}\), where \(D\) is the total number of layers used and \(d(l)\) is the depth of layer \(l\).
      • This indicates that the most important style properties come from the bottom layers, while content is mostly represented by activations in the upper layers.
    • Using more layers to capture more style properties.
      • Used all 16 conv layers of VGG-19 for calculating Gram matrices.
    • Using shifted activations when computing Gram matrices eliminates sparsity, makes individual entries more informative, and also speeds up convergence of the style transfer.
      • \(G^l=(F^l+s)(F^l+s)^T\), where \(s\) is a scalar shift added to every activation (\(s=-1\) gave the best results).
    • Targeting correlations of features belonging to different layers to capture more feature interactions.
      • \(G^{lk}=F^l[up(F^k)]^T\) if \(X_k \leq X_l\), where \(up\) upsamples the smaller feature map \(F^k\) to the spatial size \(X_l\) of \(F^l\).
      • With 16 layers there are \(16^2\) layer pairs, so the number of possible style definitions (choices of which \(G^{lk}\) to include) blows up to \(2^{16^2}\) for VGG-19.
      • However, experiments also show that tying distant layers together gives poor results.
    • Correlation Chain
      • Instead of considering all layer combinations, use only a "chained" representation, \(\{G^{l,l-1} \mid l=2,\dots,16\}\).
      • So, only correlations with immediate neighbors are considered.
    • Blurred Correlations
      • While calculating cross-layer correlations, the smaller feature map is upsampled; but even once the dimensions match, the two feature maps may still correspond to features of different scales.
      • To compensate, the upsampled map is blurred once per layer of separation:
      • \(G^{lk}=F^l[blur^{l-k}\circ up(F^k)]^T\)
      • This gives positive results, but it does complicate the objective function and results in slow and unreliable convergence.
  • Some Modifications that did not work out in the end:
    • Gradient Masking
    • Amplifying Activations
    • Adjacent Activations Correlations
    • Content-aware Gram Matrices
    • Gram Cubes
  • Experiments
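
A rough NumPy sketch of the Gram-matrix variants above (per-layer weights, shifted activations, chained cross-layer correlations). Feature maps are kept as \((N, H, W)\) arrays; `upsample2x` is a hypothetical nearest-neighbour stand-in for the paper's \(up(\cdot)\) operator, depths are indexed \(1..D\), and applying the shift in the cross-layer case is an assumption:

```python
import numpy as np

def layer_weights(D):
    # w_l^s = 2^(D - d(l)), w_l^c = 2^(d(l)) for depths d(l) = 1..D:
    # style weights decay with depth, content weights grow with it.
    d = np.arange(1, D + 1)
    return 2.0 ** (D - d), 2.0 ** d

def shifted_gram(F, s=-1.0):
    # G^l = (F^l + s)(F^l + s)^T on the flattened (N, H*W) features;
    # the shift removes ReLU sparsity (s = -1 gave the best results).
    F = F.reshape(F.shape[0], -1) + s
    return F @ F.T

def upsample2x(F):
    # Nearest-neighbour 2x upsampling of an (N, H, W) feature map,
    # a simple stand-in for the paper's up(.) operator.
    return F.repeat(2, axis=1).repeat(2, axis=2)

def chained_gram(F_a, F_b, s=-1.0):
    # Cross-layer Gram G^{l,l-1} = F^l [up(F^k)]^T, upsampling whichever
    # map is spatially smaller (in VGG, the deeper layer) until sizes match.
    while F_a.shape[1] < F_b.shape[1]:
        F_a = upsample2x(F_a)
    while F_b.shape[1] < F_a.shape[1]:
        F_b = upsample2x(F_b)
    Fa = F_a.reshape(F_a.shape[0], -1) + s
    Fb = F_b.reshape(F_b.shape[0], -1) + s
    return Fa @ Fb.T
```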


Preserving Color in Neural Artistic Style Transfer

Leon A. Gatys, Matthias Bethge, Aaron Hertzmann, Eli Shechtman : Jun 2016
Source


  • The original style transfer method also copies the colors of the style image, which is often undesirable. Two remedies are considered (a NumPy sketch of both follows this list).
  • Approach #1: Color histogram matching
  • Transform the style image \((S)\) so that its colors match those of the content image \((C)\), producing a new style image \((S')\). The style transfer algorithm is otherwise unchanged.
  • We have several different options for the initial color transfer.
    • Linear method:
    • \(\mathbf x_{S'}\leftarrow \mathbf A\mathbf x_S+\mathbf b\)
    • \(\mathbf b=\mu_C- \mathbf A\mu_S\), where \(\mu_C\) and \(\mu_S\) are the mean colors of the content and style images.
    • \(\mathbf A\Sigma_S \mathbf A^T=\Sigma_C\), where \(\Sigma_C\) and \(\Sigma_S\) are the pixel covariances.
    • \(\mathbf A\) can be computed using a Cholesky decomposition, or using the Image Analogies color transfer.
  • Color transfer before style transfer generally gives better results.
  • Approach #2: Luminance-only transfer
  • This approach is motivated by the observation that visual perception is far more sensitive to changes in luminance than to changes in color.
  • \(L_S\) and \(L_C\) are the luminance channels extracted from the style and content images, and style transfer is run on them alone, producing an output luminance \(L_T\).
  • Using the YIQ color space, the color information of the content image (its I and Q channels) is combined with \(L_T\) to produce the final output image.
  • If the luminance histograms of style and content differ substantially, the style luminance can first be matched to the content: \(L_{S'}=\frac {\sigma_C}{\sigma_S}(L_S - \mu_S) + \mu_C\), where \(\mu\) and \(\sigma\) are the means and standard deviations of the respective luminance channels.

  • Comparison
  • Linear color transfer onto the style image, before style transfer.
    • Limited by how well the color transfer from content to style works.
  • Style transfer only in the luminance channel.
    • Preserves the colors of the content image perfectly.
    • However, dependencies between the luminance and the color channels are lost in the output image.
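
A minimal NumPy sketch of both color-preservation steps, assuming float RGB images in \([0,1]\) with non-degenerate pixel covariances. The Cholesky factorisation is one of the paper's two options for \(\mathbf A\), and luminance is taken as the Y channel of YIQ:

```python
import numpy as np

def match_colors(style, content):
    # Approach 1: linear color transfer x_S' = A x_S + b, with
    # b = mu_C - A mu_S and A Sigma_S A^T = Sigma_C.
    xs = style.reshape(-1, 3)
    xc = content.reshape(-1, 3)
    mu_s, mu_c = xs.mean(axis=0), xc.mean(axis=0)
    # Cholesky option: Sigma = L L^T gives A = L_C L_S^{-1}, since then
    # A Sigma_S A^T = L_C L_S^{-1} (L_S L_S^T) L_S^{-T} L_C^T = Sigma_C.
    L_s = np.linalg.cholesky(np.cov(xs, rowvar=False))
    L_c = np.linalg.cholesky(np.cov(xc, rowvar=False))
    A = L_c @ np.linalg.inv(L_s)
    b = mu_c - A @ mu_s
    out = xs @ A.T + b
    return out.reshape(style.shape).clip(0.0, 1.0)

def luminance(img):
    # Y channel of YIQ, used as the luminance for approach 2.
    return img @ np.array([0.299, 0.587, 0.114])

def match_luminance(L_s, L_c):
    # L_S' = (sigma_C / sigma_S)(L_S - mu_S) + mu_C: align the style
    # luminance statistics with the content before luminance-only transfer.
    return (L_c.std() / L_s.std()) * (L_s - L_s.mean()) + L_c.mean()
```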

