<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="http://blog.lufficc.com/feed.xml" rel="self" type="application/atom+xml" /><link href="http://blog.lufficc.com/" rel="alternate" type="text/html" /><updated>2026-01-02T19:37:50+00:00</updated><id>http://blog.lufficc.com/feed.xml</id><title type="html">Congcong Li’s Blog</title><subtitle>Stay Hungry. Stay Foolish.
</subtitle><author><name>lufficc</name></author><entry xml:lang="zh-CN"><title type="html">Vim 基本命令</title><link href="http://blog.lufficc.com/vim-basic-commands/" rel="alternate" type="text/html" title="Vim 基本命令" /><published>2020-07-18T01:47:00+00:00</published><updated>2020-07-18T01:47:00+00:00</updated><id>http://blog.lufficc.com/vim-basic-commands</id><content type="html" xml:base="http://blog.lufficc.com/vim-basic-commands/"><![CDATA[<p>注意，Vim 区分大小写。</p>

<h2 id="移动">Movement</h2>

<h3 id="方向键移动">Moving with the direction keys</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">h</code> or <code class="language-plaintext highlighter-rouge">←</code>: move the cursor left</li>
  <li><code class="language-plaintext highlighter-rouge">l</code> or <code class="language-plaintext highlighter-rouge">→</code>: move the cursor right</li>
  <li><code class="language-plaintext highlighter-rouge">j</code> or <code class="language-plaintext highlighter-rouge">↓</code>: move the cursor down</li>
  <li><code class="language-plaintext highlighter-rouge">k</code> or <code class="language-plaintext highlighter-rouge">↑</code>: move the cursor up</li>
</ul>

<h3 id="单词移动">Moving by word</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">w</code> (“word”): move forward to the start of the next word</li>
  <li><code class="language-plaintext highlighter-rouge">b</code> (“back”): move backward to the start of the current or previous word</li>
  <li><code class="language-plaintext highlighter-rouge">e</code> (“end”): move to the last character of the current word</li>
</ul>

<h3 id="行首行末移动">Moving to the start or end of a line</h3>
<p>These keys mirror the anchors used in regular expressions.</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">^</code>: move to the first non-blank character of the line</li>
  <li><code class="language-plaintext highlighter-rouge">$</code>: move to the end of the line</li>
</ul>

<h3 id="屏幕位置移动">Moving within the screen</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">H</code> (“high”): move the cursor to the top of the screen</li>
  <li><code class="language-plaintext highlighter-rouge">M</code> (“middle”): move the cursor to the middle of the screen</li>
  <li><code class="language-plaintext highlighter-rouge">L</code> (“low”): move the cursor to the bottom of the screen</li>
</ul>

<h3 id="页面滚动">Scrolling</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">Ctrl-f</code> (“forward”): scroll down one full screen</li>
  <li><code class="language-plaintext highlighter-rouge">Ctrl-d</code> (“down”): scroll down half a screen</li>
  <li><code class="language-plaintext highlighter-rouge">Ctrl-b</code> (“backward”): scroll up one full screen</li>
  <li><code class="language-plaintext highlighter-rouge">Ctrl-u</code> (“up”): scroll up half a screen</li>
</ul>

<h2 id="插入文本">Inserting text</h2>

<ul>
  <li><code class="language-plaintext highlighter-rouge">a</code>: insert text after the cursor</li>
  <li><code class="language-plaintext highlighter-rouge">A</code>: insert text at the end of the line</li>
  <li><code class="language-plaintext highlighter-rouge">i</code>: insert text before the cursor</li>
  <li><code class="language-plaintext highlighter-rouge">I</code>: insert text at the start of the line</li>
  <li><code class="language-plaintext highlighter-rouge">o</code>: open a new line below the cursor</li>
  <li><code class="language-plaintext highlighter-rouge">O</code>: open a new line above the cursor</li>
</ul>

<h2 id="修改文本">Changing text</h2>

<ul>
  <li><code class="language-plaintext highlighter-rouge">cw</code>: delete from the cursor to the end of the current word and enter Insert mode</li>
  <li><code class="language-plaintext highlighter-rouge">cc</code>: clear the current line and enter Insert mode</li>
  <li><code class="language-plaintext highlighter-rouge">s</code>: delete the character under the cursor and enter Insert mode</li>
  <li><code class="language-plaintext highlighter-rouge">r</code>: replace the character under the cursor; after one character is typed, Vim returns to Normal mode</li>
</ul>

<h2 id="撤销修改">Undoing changes</h2>

<ul>
  <li><code class="language-plaintext highlighter-rouge">u</code>: undo the last change</li>
  <li><code class="language-plaintext highlighter-rouge">U</code>: undo all recent changes on the current line</li>
  <li><code class="language-plaintext highlighter-rouge">Ctrl-r</code>: redo the last undone change</li>
</ul>

<h2 id="删除文本">Deleting text</h2>

<h3 id="删除字母">Deleting characters</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">x</code>: delete the character under the cursor</li>
  <li><code class="language-plaintext highlighter-rouge">X</code>: delete the character before the cursor</li>
</ul>

<h3 id="删除单词">Deleting words</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">dw</code> (“delete word”): delete from the cursor to the start of the next word (<code class="language-plaintext highlighter-rouge">cw</code> does the same up to the end of the word and then enters Insert mode)</li>
  <li><code class="language-plaintext highlighter-rouge">daw</code> (“delete a word”): delete the whole word under the cursor, including the whitespace after it</li>
  <li><code class="language-plaintext highlighter-rouge">diw</code> (“delete inner word”): delete the whole word under the cursor, leaving the surrounding whitespace</li>
</ul>

<h3 id="删除行">Deleting lines and parts of lines</h3>

<ul>
  <li><code class="language-plaintext highlighter-rouge">dd</code>: delete the current line</li>
  <li><code class="language-plaintext highlighter-rouge">dt&lt;char&gt;</code>: delete from the cursor up to, but not including, the next <code class="language-plaintext highlighter-rouge">&lt;char&gt;</code> on the current line</li>
</ul>

<h2 id="参考">References</h2>
<ul>
  <li>https://docs.oracle.com/cd/E19683-01/806-7612/editorvi-43/index.html</li>
  <li>https://til.hashrocket.com/posts/fbfwnjxgtd-deleting-words-in-vim</li>
</ul>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[注意，Vim 区分大小写。 移动 方向键移动 h 或 ← 光标左移 l 或 → 光标右移 j 或 ↓ 光标下移 k 或 ↑ 光标上移 单词移动 w (“word”) 光标向右移动一个单词 b (“back”) 光标向左移动一个单词 e (“end”) 移动光标到当前单词的最后一个字母 行首行末移动 类似正则表达式 ^ 移动光标到行首 $ 移动光标到行末 屏幕位置移动 H (“high”) 移动光标到屏幕上端 M (“middle”) 移动光标到屏幕中端 L (“low”) 移动光标到屏幕下端 页面滚动 Ctrl-f (“forward”) 向下翻页(整个屏幕) Ctrl-d (“down”) 向下翻半页(半个屏幕) Ctrl-b (“backward”) 向上翻页(整个屏幕) Ctrl-u (“up”) 向上翻半页(半个屏幕) 插入文本 a 在光标右侧插入文本 A 在行末插入文本 i 在光标左侧插入文本 I 在行首插入文本 o 在光标下插入新行 O 在光标上插入新行 修改文本 cw 删除当前单词的光标右侧部分，进入编辑模式 cc 将当前行替换为空行，进入编辑模式 s 删除当前字母，进入编辑模式 r 替换当前字母，输入一个字母后自动返回命令模式 撤销修改 u 撤销上次修改 U 撤销对当前行的所有修改 Ctrl-r 恢复上次修改 删除文本 删除字母 x 删除光标右侧字母 X 删除光标左侧字母 删除单词 dw (“delete word”) 删除当前单词的光标右侧部分 (cw 会进入编辑模式) daw (“delete a word”) 删除光标所在的整个单词 (包括该单词后面的空格) diw (“delete inside word”) 删除光标所在的整个单词 (不包括该单词后面的空格) 删除行 dd 删除一行 dt&lt;char&gt; 删除当前行光标到指定字母 &lt;char&gt; 参考 https://docs.oracle.com/cd/E19683-01/806-7612/editorvi-43/index.html https://til.hashrocket.com/posts/fbfwnjxgtd-deleting-words-in-vim]]></summary></entry><entry xml:lang="zh-CN"><title type="html">Docker 基本命令</title><link href="http://blog.lufficc.com/docker-basic-commandline/" rel="alternate" type="text/html" title="Docker 基本命令" /><published>2020-07-17T21:47:00+00:00</published><updated>2020-07-17T21:47:00+00:00</updated><id>http://blog.lufficc.com/docker-basic-commandline</id><content type="html" xml:base="http://blog.lufficc.com/docker-basic-commandline/"><![CDATA[<h3 id="下载镜像">下载镜像</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker pull <span class="o">[</span>OPTIONS] NAME[:TAG|@DIGEST]
<span class="c"># e.g.:</span>
docker pull nvcr.io/nvidia/pytorch:20.06-py3
</code></pre></div></div>

<h3 id="启动镜像">Running an image</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker run <span class="o">[</span>OPTIONS] IMAGE <span class="o">[</span>COMMAND] <span class="o">[</span>ARG...]
<span class="c"># e.g.:</span>
docker run <span class="nt">--gpus</span> all <span class="nt">-it</span> <span class="nt">-v</span> /:/data <span class="nt">--ipc</span><span class="o">=</span>host <span class="nt">-p</span> 8000:8000 <span class="nt">--name</span> lufficc nvcr.io/nvidia/pytorch:20.06-py3
</code></pre></div></div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">--gpus</code>: GPUs to expose to the container (<code class="language-plaintext highlighter-rouge">all</code> exposes every GPU)</li>
  <li><code class="language-plaintext highlighter-rouge">-it</code>: run interactively with a terminal attached</li>
  <li><code class="language-plaintext highlighter-rouge">--rm</code>: remove the container automatically when it exits</li>
  <li><code class="language-plaintext highlighter-rouge">-v</code>: bind-mount a directory; <code class="language-plaintext highlighter-rouge">/:/data</code> mounts the host's <code class="language-plaintext highlighter-rouge">/</code> directory at <code class="language-plaintext highlighter-rouge">/data</code> inside the container</li>
  <li><code class="language-plaintext highlighter-rouge">--ipc</code>: IPC mode; <code class="language-plaintext highlighter-rouge">--ipc=host</code> shares the host's IPC namespace (including shared memory) with the container</li>
  <li><code class="language-plaintext highlighter-rouge">-p</code>: publish a port, written as <code class="language-plaintext highlighter-rouge">HOST:CONTAINER</code></li>
  <li><code class="language-plaintext highlighter-rouge">--name</code>: name of the container</li>
</ul>
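
<p>When a container is launched from a script, the options listed above can be assembled programmatically. The following is only a minimal sketch: <code class="language-plaintext highlighter-rouge">docker_run_command</code> and its parameters are hypothetical helper names, not part of Docker, and the function only builds the argument list without running anything.</p>

```python
import shlex


def docker_run_command(image, name, gpus="all", volumes=(), ports=(),
                       ipc=None, remove=False, interactive=True):
    """Build the argument list for `docker run` (hypothetical helper)."""
    cmd = ["docker", "run"]
    if gpus:
        cmd += ["--gpus", gpus]
    if interactive:
        cmd.append("-it")
    if remove:
        cmd.append("--rm")
    for host_path, container_path in volumes:
        cmd += ["-v", f"{host_path}:{container_path}"]
    if ipc:
        cmd.append(f"--ipc={ipc}")
    for host_port, container_port in ports:
        cmd += ["-p", f"{host_port}:{container_port}"]
    cmd += ["--name", name, image]
    return cmd


cmd = docker_run_command(
    "nvcr.io/nvidia/pytorch:20.06-py3",
    name="lufficc",
    volumes=[("/", "/data")],
    ports=[(8000, 8000)],
    ipc="host",
)
print(shlex.join(cmd))
```

<p>The resulting list can be passed to <code class="language-plaintext highlighter-rouge">subprocess.run</code>; keeping it as a list avoids shell-quoting problems with paths that contain spaces.</p>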

<h3 id="启动容器">Starting a container</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker start <span class="o">[</span>OPTIONS] CONTAINER <span class="o">[</span>CONTAINER...]
<span class="c"># e.g.:</span>
docker start lufficc
</code></pre></div></div>

<h3 id="停止容器">Stopping a container</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker stop <span class="o">[</span>OPTIONS] CONTAINER <span class="o">[</span>CONTAINER...]
<span class="c"># e.g.:</span>
docker stop lufficc
</code></pre></div></div>

<h3 id="显示所有容器">Listing all containers</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker container <span class="nb">ls</span> <span class="nt">-a</span>
</code></pre></div></div>

<h3 id="进入容器">Opening a shell in a running container</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker <span class="nb">exec</span> <span class="nt">-it</span> CONTAINER bash
<span class="c"># or, if the container uses zsh</span>
docker <span class="nb">exec</span> <span class="nt">-it</span> CONTAINER zsh
</code></pre></div></div>

<h3 id="删除容器">Removing a container</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker <span class="nb">rm </span>CONTAINER
</code></pre></div></div>

<h3 id="删除镜像">Removing an image</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker rmi IMAGE
</code></pre></div></div>

<h3 id="参考">References</h3>
<ul>
  <li><a href="https://docs.docker.com/engine/reference/commandline/cli/">Docker command line</a></li>
  <li><a href="https://ngc.nvidia.com/catalog/containers/nvidia:pytorch">NVIDIA NGC</a></li>
</ul>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[下载镜像 docker pull [OPTIONS] NAME[:TAG|@DIGEST] # eg: docker pull nvcr.io/nvidia/pytorch:20.06-py3 启动镜像 docker run [OPTIONS] IMAGE [COMMAND] [ARG...] # eg: docker run --gpus all -ti -v /:/data --ipc=host -p 8000:8000 --name lufficc nvcr.io/nvidia/pytorch:20.03-py3 -it 交互模式运行 --rm 容器退出时自动删除此容器 -v 绑定磁盘，/:/data 即将容器下 /data 目录映射到主服务器的 / 目录 --ipc IPC mode -p 映射端口 --name 容器名称 启动容器 docker start [OPTIONS] CONTAINER [CONTAINER...] # eg: docker start lufficc 停止容器 docker stop [OPTIONS] CONTAINER [CONTAINER...] # eg: docker stop lufficc 显示所有容器 docker container ls -a 进入容器 docker exec -it CONTAINER bash # 如果使用 zsh docker exec -it CONTAINER zsh 删除容器 docker rm CONTAINER 删除镜像 docker rmi IMAGE 参考 Docker command line NVIDIA NGC]]></summary></entry><entry xml:lang="zh-CN"><title type="html">SENet: Squeeze-and-Excitation Networks</title><link href="http://blog.lufficc.com/senet/" rel="alternate" type="text/html" title="SENet: Squeeze-and-Excitation Networks" /><published>2020-06-10T17:28:21+00:00</published><updated>2020-06-10T17:28:21+00:00</updated><id>http://blog.lufficc.com/senet</id><content type="html" xml:base="http://blog.lufficc.com/senet/"><![CDATA[<p><a href="https://arxiv.org/abs/1709.01507">Squeeze-and-Excitation Networks</a> 提出了 SENet，进一步提高了 <a href="/resnet">ResNet</a> 的表达能力。</p>

<p><img src="https://static.lufficc.com/2020/06/10/dd1341495d9528ef.png" alt="image.png" /></p>

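<p>In a feature map produced by a convolutional neural network (CNN), each output channel is obtained by convolving all channels of the previous feature map and taking a <strong>weighted sum</strong> of the results. Different output channels come from separate, independent sets of parameters, and these parameters do not interact directly within the layer.</p>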

<p>且卷积操作是局部的，而 SENet 用全局的 Global pooling 操作计算权值，动态地调整了特征图不同通道之间的权值，<strong>给予了通道层与层之间直接交互的能力</strong>，提高了表达能力。</p>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[Squeeze-and-Excitation Networks 提出了 SENet，进一步提高了 ResNet 的表达能力。 对于由卷积神经网络（Convolutional Neural Networks）得到的特征图（Feature Map），其每一层通道（Channel）由上一个特征图所有通道经过卷积操作然后加权相加得到。不同通道由不同组独立的参数得到，这些参数在当前层并无直接交互，互不影响。 且卷积操作是局部的，而 SENet 用全局的 Global pooling 操作计算权值，动态地调整了特征图不同通道之间的权值，给予了通道层与层之间直接交互的能力，提高了表达能力。]]></summary></entry><entry xml:lang="zh-CN"><title type="html">ResNet: Deep Residual Learning for Image Recognition</title><link href="http://blog.lufficc.com/resnet/" rel="alternate" type="text/html" title="ResNet: Deep Residual Learning for Image Recognition" /><published>2020-06-10T16:44:58+00:00</published><updated>2020-06-10T16:44:58+00:00</updated><id>http://blog.lufficc.com/resnet</id><content type="html" xml:base="http://blog.lufficc.com/resnet/"><![CDATA[<p><a href="https://arxiv.org/abs/1512.03385">Deep Residual Learning for Image Recognition</a> 一文提出了残差连接，并以此为 building block 构建了 ResNet，大大提高了网络深度，在多项计算机视觉任务中取得最佳成绩。</p>

<!-- more -->

<p><img src="https://static.lufficc.com/2020/06/10/8174b526087d6f62.png" alt="Residual learning: a building block" /></p>

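<p>The residual connection is shown in the figure above: the block computes the mapping $H(x) = F(x) + x$. Residual connections are what allow ResNet to be this deep without vanishing gradients. <strong>While stacking layers forward, the network keeps a reference to the shallower layer, so during backpropagation the gradient has an extra path: it can flow through $F(x)$ or directly through $x$</strong>, since $\partial H / \partial x = \partial F / \partial x + 1$. $H(x) = F(x) + x$ is a generalization of $H(x) = F(x)$: any target the latter can fit, the former can fit as well, with one more path available.</p>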
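
<p>The effect of the extra gradient path can be illustrated with a toy calculation. This is only a sketch under strong simplifying assumptions: every layer is a scalar linear map $F_i(x) = w_i x$, and <code class="language-plaintext highlighter-rouge">chain_gradient</code> is a hypothetical helper, not anything from the paper.</p>

```python
def chain_gradient(weights, residual):
    """Derivative of the output with respect to the input for a stack of
    scalar layers F_i(x) = w_i * x, with or without a skip connection."""
    grad = 1.0
    for w in weights:
        # Plain layer: dH/dx = w.  Residual layer: dH/dx = w + 1.
        local = w + 1.0 if residual else w
        grad *= local
    return grad


weights = [0.1] * 20  # 20 layers, each with a small weight
plain_grad = chain_gradient(weights, residual=False)
residual_grad = chain_gradient(weights, residual=True)
print(plain_grad, residual_grad)
```

<p>With these numbers the plain stack gives a gradient of about $10^{-20}$, while the residual stack gives about 6.7: the identity term keeps the product from collapsing toward zero.</p>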

<p><strong>Network structure determines everything.</strong></p>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[Deep Residual Learning for Image Recognition 一文提出了残差连接，并以此为 building block 构建了 ResNet，大大提高了网络深度，在多项计算机视觉任务中取得最佳成绩。]]></summary></entry><entry xml:lang="zh-CN"><title type="html">Batch Norm, Layer Norm, Instance Norm, Group Norm</title><link href="http://blog.lufficc.com/norm/" rel="alternate" type="text/html" title="Batch Norm, Layer Norm, Instance Norm, Group Norm" /><published>2020-05-27T23:12:33+00:00</published><updated>2020-05-27T23:12:33+00:00</updated><id>http://blog.lufficc.com/norm</id><content type="html" xml:base="http://blog.lufficc.com/norm/"><![CDATA[<p>Since <a href="https://arxiv.org/abs/1502.03167">Batch Normalization</a> was proposed by Google in 2015, many other normalization methods have appeared, such as <a href="https://arxiv.org/abs/1607.06450">Layer Normalization</a>, <a href="https://arxiv.org/abs/1607.08022">Instance Normalization</a> and <a href="https://arxiv.org/abs/1803.08494">Group Normalization</a>. These methods differ in purpose and effect, but they share the same core: compute the mean and variance of the input over certain dimensions, normalize, and then map the normalized features with learnable parameters. This can be written uniformly as:</p>

<!-- more -->

\[y = \frac{x - \mathrm{E}[x]}{ \sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta\]
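
<p>As a quick sanity check of the formula, the sketch below applies it to a short list of numbers in plain Python. The function name <code class="language-plaintext highlighter-rouge">normalize</code> and the sample values are illustrative assumptions; $\gamma$ and $\beta$ are plain scalars here rather than learned parameters.</p>

```python
import math


def normalize(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply y = (x - E[x]) / sqrt(Var[x] + eps) * gamma + beta to a list."""
    mean = sum(values) / len(values)
    # Biased (population) variance, as normalization layers use.
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)
    return [(v - mean) / std * gamma + beta for v in values]


out = normalize([1.0, 2.0, 3.0, 4.0])
out_mean = sum(out) / len(out)
out_var = sum((v - out_mean) ** 2 for v in out) / len(out)
print(out_mean, out_var)
```

<p>The output has mean close to 0 and variance slightly below 1; the small gap comes from $\epsilon$ in the denominator.</p>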

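<p>Take image data as an example: the input is $x \in \mathbb{R}^{N \times C \times H \times W}$, where $N, C, H, W$ are the batch size, the number of channels, and the image height and width.</p>

<p><img src="https://static.lufficc.com/2020/06/04/19db89eb950adf3a.png" alt="Normalization methods" /></p>

<p>As the figure shows, BN computes the mean and variance over the $N, H, W$ dimensions, LN over $C, H, W$, IN over $H, W$, and GN over $C', H, W$, where $C'$ is the number of channels in each group.</p>

<p><strong>The dimensions over which the statistics are computed are the essential difference between these methods, and that difference is what leads to their different behavior and properties.</strong></p>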

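<ul>
  <li><strong>BN</strong> depends on $N, H, W$, so it may perform poorly when the batch size is small, and the batch size strongly affects the result.</li>
  <li><strong>IN</strong> depends on $H, W$ only and not on the batch size. It normalizes each channel of each individual instance separately, which is why IN is widely used for style transfer.</li>
  <li><strong>LN</strong> depends on $C, H, W$ and drops the dependence on the batch size, so it is common in models where the batch size varies, such as RNNs. LN also differs from BN and IN in its affine mapping: BN and IN scale and shift a whole channel with one scalar, whereas LN uses a separate scalar for every element. The learnable parameters of BN and IN therefore have shape $(C)$, while those of LN have shape $(C, H, W)$.</li>
  <li><strong>GN</strong> first splits the channels into groups, $(N, C, H, W) \rightarrow (N, G, C', H, W)$ with $C' = \frac{C}{G}$, and computes statistics over $C', H, W$, so it clearly does not depend on the batch size. Grouping features is somewhat like normalizing similar features together (for example shape, brightness and texture), and experiments show that GN works well. GN can in fact be seen as interpolating between LN and IN: with a single group ($G = 1$) the statistics are computed over $C, H, W$, which is LN's normalization, and with as many groups as channels ($G = C$) they are computed over $H, W$, which reduces to IN.</li>
</ul>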
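
<p>The claim that GN sits between LN and IN can be checked on a tiny example without any framework. The sketch below is an illustration under stated assumptions: a single sample is represented as a list of $C$ channels, each a flat list of $H \times W$ values, the affine parameters are omitted, and all function names are hypothetical.</p>

```python
import math


def _mean_var(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)  # biased variance
    return mean, var


def _standardize(values, mean, var, eps):
    std = math.sqrt(var + eps)
    return [(v - mean) / std for v in values]


def group_norm(sample, num_groups, eps=1e-5):
    """Normalize one sample: statistics are shared within each channel group."""
    assert len(sample) % num_groups == 0
    channels_per_group = len(sample) // num_groups
    out = []
    for start in range(0, len(sample), channels_per_group):
        group = sample[start:start + channels_per_group]
        mean, var = _mean_var([v for channel in group for v in channel])
        out.extend(_standardize(channel, mean, var, eps) for channel in group)
    return out


def layer_norm(sample, eps=1e-5):
    """Statistics over all channels and positions (C, H, W)."""
    mean, var = _mean_var([v for channel in sample for v in channel])
    return [_standardize(channel, mean, var, eps) for channel in sample]


def instance_norm(sample, eps=1e-5):
    """Statistics per channel (H, W)."""
    return [_standardize(channel, *_mean_var(channel), eps) for channel in sample]


def same(a, b):
    return all(math.isclose(x, y, abs_tol=1e-12)
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


# One sample with C = 4 channels and H * W = 4 values per channel.
sample = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0],
          [0.0, -1.0, 0.0, 1.0], [5.0, 5.0, 6.0, 6.0]]

print(same(group_norm(sample, 1), layer_norm(sample)))     # G = 1 matches LN
print(same(group_norm(sample, 4), instance_norm(sample)))  # G = C matches IN
print(same(group_norm(sample, 2), layer_norm(sample)))     # G = 2 is neither
```

<p>Note that this only compares the normalization step. With affine parameters enabled, the PyTorch modules still differ in parameter shape, as described in the list above.</p>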

<p>Each method is easy to reproduce by hand in PyTorch and to check against the built-in module:</p>

<p><strong>BN</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn

inputs = torch.randn(5, 256, 32, 32)  # (N, C, H, W)
bn = nn.BatchNorm2d(256)  # Weight shape: (C, ); training mode by default, so batch statistics are used

# weight default is 1
bn.weight.data = torch.rand_like(bn.weight)

# compute on (N, H, W)
var, mean = torch.var_mean(inputs, dim=(0, 2, 3), keepdim=True, unbiased=False)  # (1, C, 1, 1)
std = (var + bn.eps).sqrt()  # (1, C, 1, 1)
norm = (inputs - mean) / std  # (N, C, H, W)

print(torch.allclose(
    norm * bn.weight.view(1, 256, 1, 1) + bn.bias.view(1, 256, 1, 1),
    bn(inputs))
)  # True
</code></pre></div></div>

<p><strong>IN</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn

inputs = torch.randn(5, 256, 32, 32)  # (N, C, H, W)
ins = nn.InstanceNorm2d(256, affine=True)  # Weight shape: (C, )

# weight default is 1
ins.weight.data = torch.rand_like(ins.weight)

# compute on (H, W)
var, mean = torch.var_mean(inputs, dim=(2, 3), keepdim=True, unbiased=False)  # (N, C, 1, 1)
std = (var + ins.eps).sqrt()  # (N, C, 1, 1)
norm = (inputs - mean) / std  # (N, C, H, W)

print(torch.allclose(
    norm * ins.weight.view(1, 256, 1, 1) + ins.bias.view(1, 256, 1, 1),
    ins(inputs))
)  # True
</code></pre></div></div>

<p><strong>LN</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn

inputs = torch.randn(5, 256, 32, 32)  # (N, C, H, W)
normalized_shape = inputs.shape[1:]  # Normalize on (C, H, W)
ln = nn.LayerNorm(normalized_shape)  # Weight shape: (C, H, W)

# weight default is 1
ln.weight.data = torch.rand_like(ln.weight)

# compute on (C, H, W)
var, mean = torch.var_mean(inputs, dim=(1, 2, 3), keepdim=True, unbiased=False)  # (N, 1, 1, 1)
std = (var + ln.eps).sqrt()  # (N, 1, 1, 1)
norm = (inputs - mean) / std  # (N, C, H, W)

print(torch.allclose(norm * ln.weight + ln.bias, ln(inputs)))  # True
</code></pre></div></div>

<p><strong>GN</strong></p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn

inputs = torch.randn(5, 256, 32, 32)  # (N, C, H, W)
num_groups = 32
gn = nn.GroupNorm(num_groups=num_groups, num_channels=256)  # Weight shape: (C, )

# weight default is 1
gn.weight.data = torch.rand_like(gn.weight)

grouped_inputs = inputs.view(5, num_groups, 256 // num_groups, 32, 32)  # (N, G, C', H, W)

# compute on (C', H, W)
var, mean = torch.var_mean(grouped_inputs, dim=(2, 3, 4), keepdim=True, unbiased=False)  # (N, G, 1, 1, 1)
std = (var + gn.eps).sqrt()  # (N, G, 1, 1, 1)
norm = (grouped_inputs - mean) / std  # (N, G, C', H, W)

print(torch.allclose(
    norm.view(5, 256, 32, 32) * gn.weight.view(1, 256, 1, 1) + gn.bias.view(1, 256, 1, 1),
    gn(inputs))
)  # True
</code></pre></div></div>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[自 Batch Normalization 从 2015 年被 Google 提出来之后，又诞生了很多 Normalization 方法，如 Layer Normalization, Instance Normalization, Group Normalization。 这些方法作用、效果各不相同，但却有着统一的内核和本质：计算输入数据在某些维度上的方差和均值，归一化，最后用可学习参数映射归一化后的特征。这可以统一表达为：]]></summary></entry><entry xml:lang="zh-CN"><title type="html">如何上传论文到 arXiv</title><link href="http://blog.lufficc.com/how-to-obtain-and-use-the-bbl-file-for-arxiv-submission/" rel="alternate" type="text/html" title="如何上传论文到 arXiv" /><published>2020-04-04T22:41:00+00:00</published><updated>2020-04-04T22:41:00+00:00</updated><id>http://blog.lufficc.com/how-to-obtain-and-use-the-bbl-file-for-arxiv-submission</id><content type="html" xml:base="http://blog.lufficc.com/how-to-obtain-and-use-the-bbl-file-for-arxiv-submission/"><![CDATA[<p>上传你的论文到 <a href="https://arxiv.org/">arXiv</a>，看似麻烦，实则简单。</p>

<h3 id="步骤">Steps</h3>

<p>Create a new folder and put <code class="language-plaintext highlighter-rouge">mwe.tex</code> and <code class="language-plaintext highlighter-rouge">mwe.bib</code> in it.</p>

<p>The file <code class="language-plaintext highlighter-rouge">mwe.bib</code>:</p>

<div class="language-tex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>@Book<span class="p">{</span>Goossens,
  author    = <span class="p">{</span>Goossens, Michel and Mittelbach, Frank and 
               Samarin, Alexander<span class="p">}</span>,
  title     = <span class="p">{</span>The LaTeX Companion<span class="p">}</span>,
  edition   = <span class="p">{</span>1<span class="p">}</span>,
  publisher = <span class="p">{</span>Addison-Wesley<span class="p">}</span>,
  location  = <span class="p">{</span>Reading, Mass.<span class="p">}</span>,
  year      = <span class="p">{</span>1994<span class="p">}</span>,
<span class="p">}</span>
@Book<span class="p">{</span>adams,
  title     = <span class="p">{</span>The Restaurant at the End of the Universe<span class="p">}</span>,
  author    = <span class="p">{</span>Douglas Adams<span class="p">}</span>,
  series    = <span class="p">{</span>The Hitchhiker's Guide to the Galaxy<span class="p">}</span>,
  publisher = <span class="p">{</span>Pan Macmillan<span class="p">}</span>,
  year      = <span class="p">{</span>1980<span class="p">}</span>,
<span class="p">}</span>
</code></pre></div></div>

<p>The file <code class="language-plaintext highlighter-rouge">mwe.tex</code>:</p>

<div class="language-tex highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">\documentclass</span><span class="na">[10pt,a4paper]</span><span class="p">{</span>article<span class="p">}</span>

<span class="k">\usepackage</span><span class="p">{</span>hyperref<span class="p">}</span> <span class="c">% for better urls</span>


<span class="nt">\begin{document}</span>

This is text with <span class="k">\cite</span><span class="p">{</span>Goossens<span class="p">}</span> and <span class="k">\cite</span><span class="p">{</span>adams<span class="p">}</span>.

<span class="k">\nocite</span><span class="p">{</span>*<span class="p">}</span> <span class="c">% to test all bib entries</span>
<span class="k">\bibliographystyle</span><span class="p">{</span>unsrt<span class="p">}</span>
<span class="k">\bibliography</span><span class="p">{</span>mwe<span class="p">}</span> <span class="c">% file mwe.bib</span>

<span class="nt">\end{document}</span>
</code></pre></div></div>

<p>First run <code class="language-plaintext highlighter-rouge">pdflatex mwe.tex</code>. This generates several new files, for example <code class="language-plaintext highlighter-rouge">mwe.aux</code>.</p>

<p>Then run <code class="language-plaintext highlighter-rouge">bibtex mwe</code>. BibTeX produces two new files: <code class="language-plaintext highlighter-rouge">mwe.bbl</code> and <code class="language-plaintext highlighter-rouge">mwe.blg</code>. <code class="language-plaintext highlighter-rouge">mwe.blg</code> is the log of the BibTeX run, and <code class="language-plaintext highlighter-rouge">mwe.bbl</code> is the file needed for the submission.</p>

<p>Contents of <code class="language-plaintext highlighter-rouge">mwe.bbl</code>:</p>

<div class="language-tex highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">\begin{thebibliography}</span><span class="p">{</span>1<span class="p">}</span>

<span class="k">\bibitem</span><span class="p">{</span>Goossens<span class="p">}</span>
Michel Goossens, Frank Mittelbach, and Alexander Samarin.
<span class="k">\newblock</span> <span class="p">{</span><span class="k">\em</span> The LaTeX Companion<span class="p">}</span>.
<span class="k">\newblock</span> Addison-Wesley, 1 edition, 1994.

<span class="k">\bibitem</span><span class="p">{</span>adams<span class="p">}</span>
Douglas Adams.
<span class="k">\newblock</span> <span class="p">{</span><span class="k">\em</span> The Restaurant at the End of the Universe<span class="p">}</span>.
<span class="k">\newblock</span> The Hitchhiker's Guide to the Galaxy. Pan Macmillan, 1980.

<span class="nt">\end{thebibliography}</span>
</code></pre></div></div>

<p>Run <code class="language-plaintext highlighter-rouge">pdflatex mwe.tex</code> <strong>again</strong> (usually twice) so that the citations and page numbers resolve correctly.</p>

<p>Copy <code class="language-plaintext highlighter-rouge">mwe.tex</code> to <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code> and remove the commands that call <code class="language-plaintext highlighter-rouge">bibtex</code>. Then either paste the contents of <code class="language-plaintext highlighter-rouge">mwe.bbl</code> into <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code> or include <code class="language-plaintext highlighter-rouge">mwe.bbl</code> from it.</p>

<p>Option 1: paste <code class="language-plaintext highlighter-rouge">mwe.bbl</code> directly into <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code> (suitable when there are few references; only <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code> needs to be uploaded):</p>

<div class="language-tex highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">\documentclass</span><span class="na">[10pt,a4paper]</span><span class="p">{</span>article<span class="p">}</span>

<span class="k">\usepackage</span><span class="p">{</span>hyperref<span class="p">}</span> <span class="c">% for better urls</span>


<span class="nt">\begin{document}</span>

This is text with <span class="k">\cite</span><span class="p">{</span>Goossens<span class="p">}</span> and <span class="k">\cite</span><span class="p">{</span>adams<span class="p">}</span>.

<span class="k">\nocite</span><span class="p">{</span>*<span class="p">}</span> <span class="c">% to test all bib entries</span>
<span class="c">%\bibliographystyle{unsrt} % &lt;======================== no longer needed!</span>
<span class="c">%\bibliography{mwe} % &lt;=============================== no longer needed!</span>
<span class="nt">\begin{thebibliography}</span><span class="p">{</span>1<span class="p">}</span> <span class="c">% &lt;================================== mwe.bbl</span>

<span class="k">\bibitem</span><span class="p">{</span>Goossens<span class="p">}</span>
Michel Goossens, Frank Mittelbach, and Alexander Samarin.
<span class="k">\newblock</span> <span class="p">{</span><span class="k">\em</span> The LaTeX Companion<span class="p">}</span>.
<span class="k">\newblock</span> Addison-Wesley, 1 edition, 1994.

<span class="k">\bibitem</span><span class="p">{</span>adams<span class="p">}</span>
Douglas Adams.
<span class="k">\newblock</span> <span class="p">{</span><span class="k">\em</span> The Restaurant at the End of the Universe<span class="p">}</span>.
<span class="k">\newblock</span> The Hitchhiker's Guide to the Galaxy. Pan Macmillan, 1980.

<span class="nt">\end{thebibliography}</span> <span class="c">% &lt;======================================= mwe.bbl</span>

<span class="nt">\end{document}</span>
</code></pre></div></div>

<p>Option 2: use <code class="language-plaintext highlighter-rouge">\input{mwe.bbl}</code> (upload both <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code> and <code class="language-plaintext highlighter-rouge">mwe.bbl</code>):</p>

<div class="language-tex highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">\documentclass</span><span class="na">[10pt,a4paper]</span><span class="p">{</span>article<span class="p">}</span>

<span class="k">\usepackage</span><span class="p">{</span>hyperref<span class="p">}</span> <span class="c">% for better urls</span>


<span class="nt">\begin{document}</span>

This is text with <span class="k">\cite</span><span class="p">{</span>Goossens<span class="p">}</span> and <span class="k">\cite</span><span class="p">{</span>adams<span class="p">}</span>.

<span class="k">\nocite</span><span class="p">{</span>*<span class="p">}</span> <span class="c">% to test all bib entries</span>
<span class="c">%\bibliographystyle{unsrt} % &lt;======================== no longer needed!</span>
<span class="c">%\bibliography{mwe} % &lt;=============================== no longer needed!</span>
<span class="k">\input</span><span class="p">{</span>mwe.bbl<span class="p">}</span> <span class="c">% &lt;============================================= mwe.bbl</span>

<span class="nt">\end{document}</span>
</code></pre></div></div>

<h3 id="总结">Summary</h3>

<ol>
  <li>Run <code class="language-plaintext highlighter-rouge">pdflatex mwe.tex</code> and then <code class="language-plaintext highlighter-rouge">bibtex mwe</code> to generate the required files.</li>
  <li>Run <code class="language-plaintext highlighter-rouge">pdflatex mwe.tex</code> again so that the citations and page numbers resolve correctly.</li>
  <li>Copy <code class="language-plaintext highlighter-rouge">mwe.tex</code> to <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code>, remove the BibTeX commands, and include <code class="language-plaintext highlighter-rouge">mwe.bbl</code> there.</li>
  <li>Upload <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code> and <code class="language-plaintext highlighter-rouge">mwe.bbl</code> (figure files are included as usual).</li>
</ol>
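
<p>The manual copy-and-paste step can also be scripted. The following is a minimal sketch, assuming the <code class="language-plaintext highlighter-rouge">\bibliographystyle</code> and <code class="language-plaintext highlighter-rouge">\bibliography</code> commands each sit on their own line as in the example above; <code class="language-plaintext highlighter-rouge">inline_bbl</code> is a hypothetical helper, not an arXiv or TeX tool.</p>

```python
import re


def inline_bbl(tex_source, bbl_source):
    """Return tex_source with the BibTeX commands replaced by the .bbl contents."""
    # Drop the \bibliographystyle{...} line; BibTeX already applied the style.
    without_style = re.sub(
        r"^[ \t]*\\bibliographystyle\{[^}]*\}.*\n?", "", tex_source,
        flags=re.MULTILINE,
    )
    # Replace the first \bibliography{...} line with the bibliography itself.
    # A function is used as the replacement so backslashes are kept literally.
    bibliography_line = re.compile(
        r"^[ \t]*\\bibliography\{[^}]*\}.*$", flags=re.MULTILINE
    )
    return bibliography_line.sub(
        lambda _match: bbl_source.strip(), without_style, count=1
    )


tex = (
    "\\begin{document}\n"
    "This is text with \\cite{adams}.\n"
    "\\bibliographystyle{unsrt}\n"
    "\\bibliography{mwe} % file mwe.bib\n"
    "\\end{document}\n"
)
bbl = (
    "\\begin{thebibliography}{1}\n"
    "\\bibitem{adams}\n"
    "Douglas Adams.\n"
    "\\end{thebibliography}\n"
)
arxiv_tex = inline_bbl(tex, bbl)
print(arxiv_tex)
```

<p>In practice the two strings would be read from <code class="language-plaintext highlighter-rouge">mwe.tex</code> and <code class="language-plaintext highlighter-rouge">mwe.bbl</code> and the result written to <code class="language-plaintext highlighter-rouge">mwe-arxiv.tex</code>. It is worth compiling the result once locally before uploading.</p>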

<h3 id="tips">Tips</h3>

<p>Supported figure formats:</p>

<ul>
  <li>PostScript (PS, EPS) — requires <a href="https://arxiv.org/help/submit_tex#latex">LaTeX processing</a></li>
  <li>JPEG, GIF, PNG or PDF figures — requires <a href="https://arxiv.org/help/submit_tex#pdflatex">PDFLaTeX processing</a></li>
</ul>

<p>File names may only contain the characters <code class="language-plaintext highlighter-rouge">a-z A-Z 0-9 _ + - . , =</code>. File names are case-sensitive: <code class="language-plaintext highlighter-rouge">Figure1.PDF</code> and <code class="language-plaintext highlighter-rouge">figure1.pdf</code> are not the same file.</p>
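
<p>File names can be checked before uploading. The sketch below simply encodes the character list quoted above as a regular expression; <code class="language-plaintext highlighter-rouge">is_valid_arxiv_filename</code> is a hypothetical helper, and arXiv may apply further checks that are not modeled here.</p>

```python
import re

# Allowed characters as listed above: a-z A-Z 0-9 _ + - . , =
ALLOWED_NAME = re.compile(r"[A-Za-z0-9_+\-.,=]+")


def is_valid_arxiv_filename(name):
    """True if every character of name is in the allowed set."""
    return ALLOWED_NAME.fullmatch(name) is not None


for candidate in ["Figure1.PDF", "fig_2+a=b,c.png", "my figure.pdf", "图1.png"]:
    print(candidate, is_valid_arxiv_filename(candidate))
```

<p>Names containing spaces or non-ASCII characters are rejected by this check and should be renamed before packaging the submission.</p>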

<p>Schedule (all times Eastern US):</p>

<table>
  <thead>
    <tr>
      <th>Submissions received between</th>
      <th>Will be announced</th>
      <th>Mailed to subscribers</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Monday 14:00 – Tuesday 14:00</td>
      <td>Tuesday 20:00</td>
      <td>Tuesday night / Wednesday morning</td>
    </tr>
    <tr>
      <td>Tuesday 14:00 – Wednesday 14:00</td>
      <td>Wednesday 20:00</td>
      <td>Wednesday night / Thursday morning</td>
    </tr>
    <tr>
      <td>Wednesday 14:00 – Thursday 14:00</td>
      <td>Thursday 20:00</td>
      <td>Thursday night / Friday morning</td>
    </tr>
    <tr>
      <td>Thursday 14:00 – Friday 14:00</td>
      <td>Sunday 20:00</td>
      <td>Sunday night / Monday morning</td>
    </tr>
    <tr>
      <td>Friday 14:00 – Monday 14:00</td>
      <td>Monday 20:00</td>
      <td>Monday night / Tuesday morning</td>
    </tr>
  </tbody>
</table>
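
<p>The table can be turned into a small lookup. This is only a sketch of the schedule as printed above: it ignores holidays, takes the time as already converted to US Eastern time, and treats a submission at exactly 14:00 as belonging to the next window, which is an assumption. <code class="language-plaintext highlighter-rouge">announcement_day</code> is a hypothetical helper.</p>

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
        "Saturday", "Sunday"]
CUTOFF_HOUR = 14  # 14:00 Eastern, the boundary used in the table


def announcement_day(weekday, hour):
    """Day on which a submission received at (weekday, hour) is announced."""
    idx = DAYS.index(weekday)
    if idx >= 5:
        return "Monday"  # weekend submissions wait for Monday's cutoff
    if hour >= CUTOFF_HOUR:
        idx += 1  # past the cutoff, the submission rolls into the next window
    # idx is now the day whose 14:00 cutoff closes the window.
    if idx <= 3:
        return DAYS[idx]  # windows closing Mon-Thu: announced that evening
    if idx == 4:
        return "Sunday"   # window closing Friday 14:00
    return "Monday"       # Friday after the cutoff


print(announcement_day("Thursday", 15))
print(announcement_day("Friday", 15))
```

<p>For example, a paper submitted on Thursday at 15:00 is announced on Sunday evening, while one submitted on Friday at 15:00 waits until Monday evening.</p>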

<h3 id="参考">References</h3>

<ol>
  <li>
    <p>https://tex.stackexchange.com/questions/329198/how-to-obtain-and-use-the-bbl-file-in-my-tex-document-for-arxiv-submission</p>
  </li>
  <li>
    <p>https://arxiv.org/help/submit</p>
  </li>
</ol>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[上传你的论文到 arXiv，看似麻烦，实则简单。 步骤 新建文件夹，将 mwe.tex 和 mwe.bib 放进去。 文件 mwe.bib： @Book{Goossens, author = {Goossens, Michel and Mittelbach, Frank and Samarin, Alexander}, title = {The LaTeX Companion}, edition = {1}, publisher = {Addison-Wesley}, location = {Reading, Mass.}, year = {1994}, } @Book{adams, title = {The Restaurant at the End of the Universe}, author = {Douglas Adams}, series = {The Hitchhiker's Guide to the Galaxy}, publisher = {Pan Macmillan}, year = {1980}, } 文件 mwe.tex： \documentclass[10pt,a4paper]{article} \usepackage{hyperref} % for better urls \begin{document} This is text with \cite{Goossens} and \cite{adams}. \nocite{*} % to test all bib entrys \bibliographystyle{unsrt} \bibliography{mwe} % file mwe.bib \end{document} 首先运行 pdflatex mwe.tex，会生成一系列新文件，例如 mwe.auc。 然后运行bibtex mwe。 BiBTeX 会编译生成新文件： mwe.bbl 和 mwe.blg. mwe.blg 是 BiBTeX 运行产生的日志文件， mwe.bbl 是提交需要的文件。 文件 mwe.bbl 内容： \begin{thebibliography}{1} \bibitem{Goossens} Michel Goossens, Frank Mittelbach, and Alexander Samarin. \newblock {\em The LaTeX Companion}. \newblock Addison-Wesley, 1 edition, 1994. \bibitem{adams} Douglas Adams. \newblock {\em The Restaurant at the End of the Universe}. \newblock The Hitchhiker's Guide to the Galaxy. Pan Macmillan, 1980. \end{thebibliography} 第二次运行 pdflatex mwe.tex 得到正确的页码。 复制 mwe.tex 到 mwe-arxiv.tex 并删除对 bibtex 的引用，复制 mwe.bbl 文件内容或引用 mwe.bbl 文件到 mwe-arxiv.tex。 直接复制mwe.bbl 到 mwe-arxiv.tex 文件（适合引用较少，此时上传文件只需要包含mwe-arxiv.tex即可）： \documentclass[10pt,a4paper]{article} \usepackage{hyperref} % for better urls \begin{document} This is text with \cite{Goossens} and \cite{adams}. \nocite{*} % to test all bib entrys %\bibliographystyle{unsrt} % &lt;======================== not longer needed! %\bibliography{\jobname} % &lt;========================== not longer needed! 
\begin{thebibliography}{1} % &lt;================================== mwe.bbl \bibitem{Goossens} Michel Goossens, Frank Mittelbach, and Alexander Samarin. \newblock {\em The LaTeX Companion}. \newblock Addison-Wesley, 1 edition, 1994. \bibitem{adams} Douglas Adams. \newblock {\em The Restaurant at the End of the Universe}. \newblock The Hitchhiker's Guide to the Galaxy. Pan Macmillan, 1980. \end{thebibliography} % &lt;======================================= mwe.bbl \end{document} 使用 \input{mwe.bbl} （上传文件包含mwe-arxiv.tex 和 mwe.bbl）： \documentclass[10pt,a4paper]{article} \usepackage{hyperref} % for better urls \begin{document} This is text with \cite{Goossens} and \cite{adams}. \nocite{*} % to test all bib entrys %\bibliographystyle{unsrt} % &lt;======================== not longer needed! %\bibliography{\jobname} % &lt;========================== not longer needed! \input{mwe.bbl} % &lt;============================================= mwe.bbl \end{document} 总结 首先运行 pdflatex mwe.tex 和 bibtex mwe 生成必要文件。 然后运行 pdflatex mwe.tex 得到正确的页码。 将文件mwe.bbl 引入到 mwe.tex。 上传mwe.tex 和 mwe.bbl 文件（其他 Figures 文件正常引入即可）。 Tips 支持的 Figure 格式： PostScript (PS, EPS) — requires LaTeX processing JPEG, GIF, PNG or PDF figures — requires PDFLaTeX processing 文件名只允许包含 a-z A-Z 0-9 _ + - . 
, = 字符，Figure1.PDF and figure1.pdf 不是同一个文件。 Schedule(all times Eastern US): Submissions received between Will be announced Mailed to subscribers Monday 14:00 – Tuesday 14:00 Tuesday 20:00 Tuesday night / Wednesday morning Tuesday 14:00 – Wednesday 14:00 Wednesday 20:00 Wednesday night / Thursday morning Wednesday 14:00 – Thursday 14:00 Thursday 20:00 Thursday night / Friday morning Thursday 14:00 – Friday 14:00 Sunday 20:00 Sunday night / Monday morning Friday 14:00 – Monday 14:00 Monday 20:00 Monday night / Tuesday morning 参考 https://tex.stackexchange.com/questions/329198/how-to-obtain-and-use-the-bbl-file-in-my-tex-document-for-arxiv-submission https://arxiv.org/help/submit]]></summary></entry><entry xml:lang="zh-CN"><title type="html">TensorFlow 的 TFRecord 和 QueueRunner 简介</title><link href="http://blog.lufficc.com/tf-record-and-queue_runner/" rel="alternate" type="text/html" title="TensorFlow 的 TFRecord 和 QueueRunner 简介" /><published>2017-10-31T23:36:00+00:00</published><updated>2017-10-31T23:36:00+00:00</updated><id>http://blog.lufficc.com/tf-record-and-queue_runner</id><content type="html" xml:base="http://blog.lufficc.com/tf-record-and-queue_runner/"><![CDATA[<p>通常我们下载的数据集都是以压缩文件的格式存在，解压后会有多个文件夹，像 <code class="language-plaintext highlighter-rouge">train</code>， <code class="language-plaintext highlighter-rouge">test</code>， <code class="language-plaintext highlighter-rouge">val</code> 等等。而文件也有可能多达数万或者数百万个。这种形式的数据集不但读取复杂、慢，而且占用磁盘空间。这时二进制的格式文件的优点便显现出来了。我们可以把数据集存储为<strong>一个二进制文件</strong>，这样就没有了 <code class="language-plaintext highlighter-rouge">train</code>， <code class="language-plaintext highlighter-rouge">test</code>， <code class="language-plaintext highlighter-rouge">val</code> 等等的文件夹。更重要的是，这些数据只会占据一块内存（Block of Memory），而不需要一个一个单独加载文件。因此使用<strong>二进制文件</strong>效率更高。</p>

<p>You might assume TensorFlow already wraps the reading, writing and parsing of binary files for you. It does. This post shows how to convert a dataset to the TFRecord format.</p>

<h2 id="cifar-10-数据集">CIFAR-10 dataset</h2>
<p>This post uses the CIFAR-10 dataset as its example. For an introduction to CIFAR-10, see <a href="https://lufficc.com/blog/machine-learning-image-datasets">Image datasets</a>.</p>

<p>Assume you already have the following data:</p>

<p><img src="https://static.lufficc.com/image/QOHlSIeeJVFPfHmi2DAMFJEEh9H5If9roTNQloH7.png" alt="CIFAR-10" /></p>

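<h2 id="写保存为-tfrecord-格式">Writing: saving as TFRecord</h2>
<p>The imports and constants used by the code in this post (which targets the TensorFlow 1.x API):</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">pickle</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>

<span class="n">_NUM_TRAIN_FILES</span> <span class="o">=</span> <span class="mi">5</span>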
<span class="c1"># The height and width of each image.
</span><span class="n">_IMAGE_SIZE</span> <span class="o">=</span> <span class="mi">32</span>
<span class="c1"># The names of the classes.
</span><span class="n">_CLASS_NAMES</span> <span class="o">=</span> <span class="p">[</span>
    <span class="s">'airplane'</span><span class="p">,</span>
    <span class="s">'automobile'</span><span class="p">,</span>
    <span class="s">'bird'</span><span class="p">,</span>
    <span class="s">'cat'</span><span class="p">,</span>
    <span class="s">'deer'</span><span class="p">,</span>
    <span class="s">'dog'</span><span class="p">,</span>
    <span class="s">'frog'</span><span class="p">,</span>
    <span class="s">'horse'</span><span class="p">,</span>
    <span class="s">'ship'</span><span class="p">,</span>
    <span class="s">'truck'</span><span class="p">,</span>
<span class="p">]</span>
</code></pre></div></div>
<p>Here we create two split files that hold the train and test data respectively:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dataset_dir</span> <span class="o">=</span> <span class="s">'data'</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">tf</span><span class="p">.</span><span class="n">gfile</span><span class="p">.</span><span class="n">Exists</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">):</span>
    <span class="n">tf</span><span class="p">.</span><span class="n">gfile</span><span class="p">.</span><span class="n">MakeDirs</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">)</span>
<span class="n">training_filename</span> <span class="o">=</span> <span class="n">_get_output_filename</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">,</span> <span class="s">'train'</span><span class="p">)</span>
<span class="n">testing_filename</span> <span class="o">=</span> <span class="n">_get_output_filename</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">,</span> <span class="s">'test'</span><span class="p">)</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">_get_output_filename</code> builds the file name:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def _get_output_filename(dataset_dir, split_name):
    """Creates the output filename.
    Args:
      dataset_dir: The dataset directory where the dataset is stored.
      split_name: The name of the train/test split.
    Returns:
      An absolute file path.
    """
    return '%s/cifar10_%s.tfrecord' % (dataset_dir, split_name)
</code></pre></div></div>
<p>Next, process the training data:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># First, process the training data:
</span><span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">python_io</span><span class="p">.</span><span class="n">TFRecordWriter</span><span class="p">(</span><span class="n">training_filename</span><span class="p">)</span> <span class="k">as</span> <span class="n">tfrecord_writer</span><span class="p">:</span>
    <span class="n">offset</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">_NUM_TRAIN_FILES</span><span class="p">):</span>
        <span class="n">filename</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="s">'./cifar-10-batches-py'</span><span class="p">,</span> <span class="s">'data_batch_%d'</span> <span class="o">%</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span>
        <span class="n">offset</span> <span class="o">=</span> <span class="n">_add_to_tfrecord</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">tfrecord_writer</span><span class="p">,</span> <span class="n">offset</span><span class="p">)</span>
</code></pre></div></div>
<p>That is, each <code class="language-plaintext highlighter-rouge">data_batch_?</code> file is read in turn, and <code class="language-plaintext highlighter-rouge">_add_to_tfrecord</code> writes its contents out in TFRecord format.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">_add_to_tfrecord</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">tfrecord_writer</span><span class="p">,</span> <span class="n">offset</span><span class="o">=</span><span class="mi">0</span><span class="p">):</span>
    <span class="s">"""Loads data from the cifar10 pickle files and writes files to a TFRecord.
    Args:
      filename: The filename of the cifar10 pickle file.
      tfrecord_writer: The TFRecord writer to use for writing.
      offset: An offset into the absolute number of images previously written.
    Returns:
      The new offset.
    """</span>
    <span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">gfile</span><span class="p">.</span><span class="n">Open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">data</span> <span class="o">=</span> <span class="n">pickle</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="s">'bytes'</span><span class="p">)</span>
    <span class="n">images</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="sa">b</span><span class="s">'data'</span><span class="p">]</span>
    <span class="n">num_images</span> <span class="o">=</span> <span class="n">images</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>

    <span class="n">images</span> <span class="o">=</span> <span class="n">images</span><span class="p">.</span><span class="n">reshape</span><span class="p">((</span><span class="n">num_images</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">))</span>
    <span class="n">labels</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="sa">b</span><span class="s">'labels'</span><span class="p">]</span>
    <span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">Graph</span><span class="p">().</span><span class="n">as_default</span><span class="p">():</span>
        <span class="n">image_placeholder</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">placeholder</span><span class="p">(</span><span class="n">tf</span><span class="p">.</span><span class="n">uint8</span><span class="p">)</span>
        <span class="n">encoded_image</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">image</span><span class="p">.</span><span class="n">encode_png</span><span class="p">(</span><span class="n">image_placeholder</span><span class="p">)</span>
        <span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
            <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">num_images</span><span class="p">):</span>
                <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="s">'</span><span class="se">\r</span><span class="s">&gt;&gt; Reading file [%s] image %d/%d'</span> <span class="o">%</span> <span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="n">offset</span> <span class="o">+</span> <span class="n">j</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="n">offset</span> <span class="o">+</span> <span class="n">num_images</span><span class="p">))</span>
                <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">flush</span><span class="p">()</span>
                <span class="n">image</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">squeeze</span><span class="p">(</span><span class="n">images</span><span class="p">[</span><span class="n">j</span><span class="p">]).</span><span class="n">transpose</span><span class="p">((</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
                <span class="n">label</span> <span class="o">=</span> <span class="n">labels</span><span class="p">[</span><span class="n">j</span><span class="p">]</span>
                <span class="n">png_string</span> <span class="o">=</span> <span class="n">sess</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">encoded_image</span><span class="p">,</span> <span class="n">feed_dict</span><span class="o">=</span><span class="p">{</span><span class="n">image_placeholder</span><span class="p">:</span> <span class="n">image</span><span class="p">})</span>
                <span class="n">example</span> <span class="o">=</span> <span class="n">image_to_tfexample</span><span class="p">(</span><span class="n">png_string</span><span class="p">,</span> <span class="sa">b</span><span class="s">'png'</span><span class="p">,</span> <span class="n">_IMAGE_SIZE</span><span class="p">,</span> <span class="n">_IMAGE_SIZE</span><span class="p">,</span> <span class="n">label</span><span class="p">)</span>
                <span class="n">tfrecord_writer</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">example</span><span class="p">.</span><span class="n">SerializeToString</span><span class="p">())</span>
    <span class="k">return</span> <span class="n">offset</span> <span class="o">+</span> <span class="n">num_images</span>
</code></pre></div></div>
<p>Each CIFAR-10 batch stores its images as a <code class="language-plaintext highlighter-rouge">10000x3072 numpy array</code>, so every row has to be <code class="language-plaintext highlighter-rouge">reshape</code>d and then transposed into the layout that <code class="language-plaintext highlighter-rouge">tf.image.encode_png</code> expects: <code class="language-plaintext highlighter-rouge">[height, width, channels]</code>. <code class="language-plaintext highlighter-rouge">tf.image.encode_png</code> returns the encoded string; the image's width, height and format also need to be saved. <code class="language-plaintext highlighter-rouge">image_to_tfexample</code> packs all of this into a <code class="language-plaintext highlighter-rouge">tf.train.Example</code>:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">image_to_tfexample</span><span class="p">(</span><span class="n">image_data</span><span class="p">,</span> <span class="n">image_format</span><span class="p">,</span> <span class="n">height</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">class_id</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Example</span><span class="p">(</span><span class="n">features</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Features</span><span class="p">(</span><span class="n">feature</span><span class="o">=</span><span class="p">{</span>
        <span class="s">'image/encoded'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Feature</span><span class="p">(</span><span class="n">bytes_list</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">BytesList</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="p">[</span><span class="n">image_data</span><span class="p">])),</span>
        <span class="s">'image/format'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Feature</span><span class="p">(</span><span class="n">bytes_list</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">BytesList</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="p">[</span><span class="n">image_format</span><span class="p">])),</span>
        <span class="s">'image/class/label'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Feature</span><span class="p">(</span><span class="n">int64_list</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Int64List</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="p">[</span><span class="n">class_id</span><span class="p">])),</span>
        <span class="s">'image/height'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Feature</span><span class="p">(</span><span class="n">int64_list</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Int64List</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="p">[</span><span class="n">height</span><span class="p">])),</span>
        <span class="s">'image/width'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Feature</span><span class="p">(</span><span class="n">int64_list</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Int64List</span><span class="p">(</span><span class="n">value</span><span class="o">=</span><span class="p">[</span><span class="n">width</span><span class="p">])),</span>
    <span class="p">}))</span>
</code></pre></div></div>
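<p>Returning to the layout conversion in <code class="language-plaintext highlighter-rouge">_add_to_tfrecord</code>: each CIFAR-10 row holds all red values first, then all green, then all blue, each plane in row-major order. The sketch below spells out the index arithmetic behind the <code class="language-plaintext highlighter-rouge">reshape</code> plus <code class="language-plaintext highlighter-rouge">transpose((1, 2, 0))</code> step in plain Python. The helper name <code class="language-plaintext highlighter-rouge">chw_row_to_hwc</code> and the toy 2x2 image are made up for illustration; real code would keep using NumPy as above.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def chw_row_to_hwc(row, height, width, channels=3):
    """Convert one flat channel-major row into nested [height][width][channels] lists."""
    plane = height * width  # number of values in one colour plane
    if len(row) != channels * plane:
        raise ValueError('row length does not match the image shape')
    return [
        [
            [row[c * plane + y * width + x] for c in range(channels)]
            for x in range(width)
        ]
        for y in range(height)
    ]


# A toy 2x2 "image": four red values, then four green, then four blue.
toy_row = [1, 2, 3, 4, 10, 20, 30, 40, 100, 200, 300, 400]
toy_image = chw_row_to_hwc(toy_row, height=2, width=2)
print(toy_image[0][0])  # [1, 10, 100]
print(toy_image[1][1])  # [4, 40, 400]
</code></pre></div></div>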
<p>The data is stored as a <code class="language-plaintext highlighter-rouge">tf.train.Example</code> Protobuf object. An <code class="language-plaintext highlighter-rouge">Example</code> contains a <code class="language-plaintext highlighter-rouge">Features</code>, which holds a <code class="language-plaintext highlighter-rouge">dict</code> that maps each key to a <code class="language-plaintext highlighter-rouge">Feature</code>. A <code class="language-plaintext highlighter-rouge">Feature</code> can contain a <code class="language-plaintext highlighter-rouge">FloatList</code>, a <code class="language-plaintext highlighter-rouge">BytesList</code> or an <code class="language-plaintext highlighter-rouge">Int64List</code>. Note that the keys used here, such as <code class="language-plaintext highlighter-rouge">image/encoded</code> and <code class="language-plaintext highlighter-rouge">image/format</code>, can be named however you like; these are the keys TensorFlow uses by default for image datasets, and we usually stick with them.</p>
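<p>As a rough illustration of how values map onto the three list types, the sketch below picks a list field name from the Python type of a value. It does not call TensorFlow at all; <code class="language-plaintext highlighter-rouge">feature_list_kind</code> is a hypothetical helper, and only the returned field names (<code class="language-plaintext highlighter-rouge">bytes_list</code>, <code class="language-plaintext highlighter-rouge">int64_list</code>, <code class="language-plaintext highlighter-rouge">float_list</code>) correspond to the real <code class="language-plaintext highlighter-rouge">tf.train.Feature</code> fields used in <code class="language-plaintext highlighter-rouge">image_to_tfexample</code>.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def feature_list_kind(value):
    """Return the Feature list field a Python value would go into (hypothetical helper)."""
    if isinstance(value, bytes):
        return 'bytes_list'
    if isinstance(value, int):
        return 'int64_list'
    if isinstance(value, float):
        return 'float_list'
    raise TypeError('unsupported value type: %s' % type(value).__name__)


# The same keys as in the post, with plain Python values.
record = {
    'image/format': b'png',
    'image/class/label': 3,
    'image/height': 32,
}
kinds = {key: feature_list_kind(value) for key, value in record.items()}
print(kinds)
</code></pre></div></div>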

<p>Once we have an <code class="language-plaintext highlighter-rouge">example</code>, we serialize it to a string and write it to the file. That completes the TFRecord file.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tfrecord_writer.write(example.SerializeToString())
</code></pre></div></div>

<p>The test set is built the same way:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Next, process the testing data:
with tf.python_io.TFRecordWriter(testing_filename) as tfrecord_writer:
    filename = os.path.join('./cifar-10-batches-py', 'test_batch')
    _add_to_tfrecord(filename, tfrecord_writer)
</code></pre></div></div>
<p>In the end we get two files:</p>

<p><img src="https://static.lufficc.com/image/0s1p9yWfcdqheX9Ju2DbE7QmbcK4hLvkSqAo2e5n.png" alt="file" /></p>

<p>These are the final TFRecord files, which are binary.</p>

<h2 id="读">Reading</h2>
<p>The simplest way is to read the file directly:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">reconstructed_images</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">record_iterator</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">python_io</span><span class="p">.</span><span class="n">tf_record_iterator</span><span class="p">(</span><span class="n">path</span><span class="o">=</span><span class="s">'./data/cifar10_train.tfrecord'</span><span class="p">)</span>
<span class="k">for</span> <span class="n">string_iterator</span> <span class="ow">in</span> <span class="n">record_iterator</span><span class="p">:</span>
    <span class="n">example</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">Example</span><span class="p">()</span>
    <span class="n">example</span><span class="p">.</span><span class="n">ParseFromString</span><span class="p">(</span><span class="n">string_iterator</span><span class="p">)</span>
    <span class="n">height</span> <span class="o">=</span> <span class="n">example</span><span class="p">.</span><span class="n">features</span><span class="p">.</span><span class="n">feature</span><span class="p">[</span><span class="s">'image/height'</span><span class="p">].</span><span class="n">int64_list</span><span class="p">.</span><span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">width</span> <span class="o">=</span> <span class="n">example</span><span class="p">.</span><span class="n">features</span><span class="p">.</span><span class="n">feature</span><span class="p">[</span><span class="s">'image/width'</span><span class="p">].</span><span class="n">int64_list</span><span class="p">.</span><span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">png_string</span> <span class="o">=</span> <span class="n">example</span><span class="p">.</span><span class="n">features</span><span class="p">.</span><span class="n">feature</span><span class="p">[</span><span class="s">'image/encoded'</span><span class="p">].</span><span class="n">bytes_list</span><span class="p">.</span><span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">label</span> <span class="o">=</span> <span class="n">example</span><span class="p">.</span><span class="n">features</span><span class="p">.</span><span class="n">feature</span><span class="p">[</span><span class="s">'image/class/label'</span><span class="p">].</span><span class="n">int64_list</span><span class="p">.</span><span class="n">value</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
        <span class="n">image_placeholder</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">placeholder</span><span class="p">(</span><span class="n">dtype</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">string</span><span class="p">)</span>
        <span class="n">decoded_img</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">image</span><span class="p">.</span><span class="n">decode_png</span><span class="p">(</span><span class="n">image_placeholder</span><span class="p">,</span> <span class="n">channels</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
        <span class="n">reconstructed_img</span> <span class="o">=</span> <span class="n">sess</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">decoded_img</span><span class="p">,</span> <span class="n">feed_dict</span><span class="o">=</span><span class="p">{</span><span class="n">image_placeholder</span><span class="p">:</span> <span class="n">png_string</span><span class="p">})</span>
    <span class="n">reconstructed_images</span><span class="p">.</span><span class="n">append</span><span class="p">((</span><span class="n">reconstructed_img</span><span class="p">,</span> <span class="n">label</span><span class="p">))</span>
</code></pre></div></div>
<p>This is simply the reverse of writing: create an <code class="language-plaintext highlighter-rouge">Example</code>, parse the string that was read, and fetch each value from <code class="language-plaintext highlighter-rouge">features</code> by its key. The image is decoded with <code class="language-plaintext highlighter-rouge">tf.image.decode_png</code>, the inverse of <code class="language-plaintext highlighter-rouge">tf.image.encode_png</code>.</p>

<p>Once read, an image can be displayed directly:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">plt</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">reconstructed_images</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">])</span>
<span class="n">plt</span><span class="p">.</span><span class="n">title</span><span class="p">(</span><span class="n">_CLASS_NAMES</span><span class="p">[</span><span class="n">reconstructed_images</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">1</span><span class="p">]])</span>
<span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

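<p>This approach is straightforward: read the file and parse it directly. With a lot of data, however, it becomes slow, and it does not deal with distribution, queues or multithreading. We can also read the data through a filename queue.</p>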

<h2 id="队列">Queues</h2>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># first construct a queue containing a list of filenames.
# this lets a user split up their dataset into multiple files to keep
# size down
</span><span class="n">filename_queue</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">string_input_producer</span><span class="p">([</span><span class="s">'./data/cifar10_train.tfrecord'</span><span class="p">])</span>
<span class="c1"># Unlike the TFRecordWriter, the TFRecordReader is symbolic: its operations are not executed immediately
</span><span class="n">reader</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">TFRecordReader</span><span class="p">()</span>
<span class="n">_</span><span class="p">,</span> <span class="n">serialized_example</span> <span class="o">=</span> <span class="n">reader</span><span class="p">.</span><span class="n">read</span><span class="p">(</span><span class="n">filename_queue</span><span class="p">)</span>
<span class="n">features</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">parse_single_example</span><span class="p">(</span><span class="n">serialized_example</span><span class="p">,</span> <span class="n">features</span><span class="o">=</span><span class="p">{</span>
    <span class="s">'image/encoded'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">((),</span> <span class="n">tf</span><span class="p">.</span><span class="n">string</span><span class="p">,</span> <span class="n">default_value</span><span class="o">=</span><span class="s">''</span><span class="p">),</span>
    <span class="s">'image/format'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">((),</span> <span class="n">tf</span><span class="p">.</span><span class="n">string</span><span class="p">,</span> <span class="n">default_value</span><span class="o">=</span><span class="s">'png'</span><span class="p">),</span>
    <span class="s">'image/height'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">((),</span> <span class="n">tf</span><span class="p">.</span><span class="n">int64</span><span class="p">),</span>
    <span class="s">'image/width'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">((),</span> <span class="n">tf</span><span class="p">.</span><span class="n">int64</span><span class="p">),</span>
    <span class="s">'image/class/label'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">([],</span> <span class="n">tf</span><span class="p">.</span><span class="n">int64</span><span class="p">,</span> <span class="n">default_value</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">zeros</span><span class="p">([],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">int64</span><span class="p">))</span>
<span class="p">})</span>
<span class="n">image</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">image</span><span class="p">.</span><span class="n">decode_png</span><span class="p">(</span><span class="n">features</span><span class="p">[</span><span class="s">'image/encoded'</span><span class="p">],</span> <span class="n">channels</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
<span class="n">image</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">image</span><span class="p">.</span><span class="n">resize_image_with_crop_or_pad</span><span class="p">(</span><span class="n">image</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">)</span>
<span class="n">label</span> <span class="o">=</span> <span class="n">features</span><span class="p">[</span><span class="s">'image/class/label'</span><span class="p">]</span>
<span class="n">init_op</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">group</span><span class="p">(</span><span class="n">tf</span><span class="p">.</span><span class="n">global_variables_initializer</span><span class="p">(),</span> <span class="n">tf</span><span class="p">.</span><span class="n">local_variables_initializer</span><span class="p">())</span>
<span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
    <span class="n">sess</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">init_op</span><span class="p">)</span>
    <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">start_queue_runners</span><span class="p">()</span>
    <span class="c1"># grab examples back.
</span>    <span class="c1"># first example from file
</span>    <span class="n">image_val_1</span><span class="p">,</span> <span class="n">label_val_1</span> <span class="o">=</span> <span class="n">sess</span><span class="p">.</span><span class="n">run</span><span class="p">([</span><span class="n">image</span><span class="p">,</span> <span class="n">label</span><span class="p">])</span>
    <span class="c1"># second example from file
</span>    <span class="n">image_val_2</span><span class="p">,</span> <span class="n">label_val_2</span> <span class="o">=</span> <span class="n">sess</span><span class="p">.</span><span class="n">run</span><span class="p">([</span><span class="n">image</span><span class="p">,</span> <span class="n">label</span><span class="p">])</span>
    <span class="k">print</span><span class="p">(</span><span class="n">image_val_1</span><span class="p">,</span> <span class="n">label_val_1</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span><span class="n">image_val_2</span><span class="p">,</span> <span class="n">label_val_2</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">image_val_1</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">title</span><span class="p">(</span><span class="n">_CLASS_NAMES</span><span class="p">[</span><span class="n">label_val_1</span><span class="p">])</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">show</span><span class="p">()</span>
</code></pre></div></div>

<p>First we define the filename queue <code class="language-plaintext highlighter-rouge">filename_queue</code>, which holds a list of filenames. This lets us split a large dataset across several smaller files so that no single file grows too big; this example uses only one file. The records are then read with <code class="language-plaintext highlighter-rouge">TFRecordReader</code>. The TensorFlow graph holds state that lets <code class="language-plaintext highlighter-rouge">TFRecordReader</code> remember how far into the <code class="language-plaintext highlighter-rouge">tfrecord</code> file it has read and where to resume, so we run <code class="language-plaintext highlighter-rouge">sess.run(init_op)</code> to initialize that state. Unlike <code class="language-plaintext highlighter-rouge">tf.python_io.tf_record_iterator</code>, <code class="language-plaintext highlighter-rouge">TFRecordReader</code> always operates on a queue of filenames (<code class="language-plaintext highlighter-rouge">filename_queue</code>): it dequeues a filename, reads records from that <code class="language-plaintext highlighter-rouge">tfrecord</code> file until it is exhausted, and then moves on to the file named by the next entry in the queue.</p>
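<p>The way a reader drains a filename queue can be pictured with a plain-Python analogy. The sketch below is not TensorFlow code: <code class="language-plaintext highlighter-rouge">read_records</code> and the in-memory <code class="language-plaintext highlighter-rouge">files</code> dict are hypothetical stand-ins for <code class="language-plaintext highlighter-rouge">TFRecordReader</code> and the files on disk. It only illustrates that the reader keeps its position between calls and moves on to the next filename once a file is exhausted.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import queue


def read_records(filename_queue, files):
    """Yield (filename, record) pairs, one file at a time.

    `files` maps a filename to its list of records; it stands in for
    the TFRecord files on disk (an assumption made for this sketch).
    """
    while True:
        try:
            filename = filename_queue.get_nowait()  # pop the next filename
        except queue.Empty:
            return  # no filenames left: the input is exhausted
        for record in files[filename]:  # read until this file is empty
            yield filename, record


files = {"part-0": ["a", "b"], "part-1": ["c"]}
filename_queue = queue.Queue()
for name in ["part-0", "part-1"]:
    filename_queue.put(name)

reader = read_records(filename_queue, files)
first = next(reader)    # the generator remembers where it stopped
second = next(reader)
rest = list(reader)     # drains the remaining file
print(first, second, rest)
</code></pre></div></div>
<p>Each <code class="language-plaintext highlighter-rouge">next(reader)</code> call plays the role of one <code class="language-plaintext highlighter-rouge">sess.run([image, label])</code>: it returns the next record rather than starting over.</p>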

<p>How is the filename queue filled? That is the job of <code class="language-plaintext highlighter-rouge">QueueRunners</code>. A <code class="language-plaintext highlighter-rouge">QueueRunners</code> object is essentially a thread that uses the <code class="language-plaintext highlighter-rouge">session</code> to run an enqueue operation over and over; TensorFlow wraps this in the <code class="language-plaintext highlighter-rouge">tf.train.QueueRunner</code> class. Most of the time <code class="language-plaintext highlighter-rouge">QueueRunner</code> is a low-level detail that we never touch directly; in this example it is created for us by <code class="language-plaintext highlighter-rouge">tf.train.string_input_producer</code>.</p>

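<p>At this point we have to signal TensorFlow to start the threads that execute the <code class="language-plaintext highlighter-rouge">QueueRunners</code>; otherwise the code blocks forever, waiting for data to be enqueued. That is what <code class="language-plaintext highlighter-rouge">tf.train.start_queue_runners()</code> does: the threads are created as soon as this call has run. Note that <strong>it must be called after the initialization op (<code class="language-plaintext highlighter-rouge">sess.run(init_op)</code>) has been executed</strong>.</p>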
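<p>Why a missing queue runner makes the reader block can be shown with the standard library alone. This is a minimal sketch under stated assumptions, not TensorFlow's implementation: <code class="language-plaintext highlighter-rouge">start_queue_runner</code> is a hypothetical helper that plays the role of <code class="language-plaintext highlighter-rouge">tf.train.start_queue_runners()</code>, and a short timeout is used so that the "blocks forever" case is visible without hanging the script.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import queue
import threading


def start_queue_runner(q, items):
    """Start a daemon thread that enqueues `items` into `q` one by one."""
    def enqueue_loop():
        for item in items:
            q.put(item)  # blocks while the queue is full

    thread = threading.Thread(target=enqueue_loop, daemon=True)
    thread.start()
    return thread


q = queue.Queue(maxsize=2)

# Nothing fills the queue yet, so a plain q.get() would wait forever.
# The timeout turns that endless wait into an exception we can observe.
try:
    q.get(timeout=0.1)
    blocked = False
except queue.Empty:
    blocked = True

# Once the runner thread is started, the consumer receives data.
thread = start_queue_runner(q, ["r0", "r1", "r2", "r3"])
records = [q.get(timeout=1.0) for _ in range(4)]
thread.join()
print(blocked, records)
</code></pre></div></div>
<p>The queue is deliberately smaller than the number of items: the producer thread pauses whenever the queue is full and resumes as the consumer dequeues, which is the same back-pressure a bounded TensorFlow queue provides.</p>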

<p><code class="language-plaintext highlighter-rouge">tf.parse_single_example</code> parses each record according to the <code class="language-plaintext highlighter-rouge">features</code> format we defined. In the end, <code class="language-plaintext highlighter-rouge">image_val_1</code> is one image from the dataset, with shape <code class="language-plaintext highlighter-rouge">(32, 32, 3)</code>.</p>

<p>The flow of queue-based reading is shown below:</p>

<p><img src="https://static.lufficc.com/image/7418mO0rbLmRtxJWF2m1WJazE2pbnkSloxQcBYwW.gif" alt="file" /></p>

<h2 id="batch">Batch</h2>
<p>In the example above, <code class="language-plaintext highlighter-rouge">image</code> and <code class="language-plaintext highlighter-rouge">label</code> each hold a single example, that is, one record of the dataset. Training on one record at a time is impractical, so how do we produce batches?</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">images_batch</span><span class="p">,</span> <span class="n">labels_batch</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">shuffle_batch</span><span class="p">(</span>
    <span class="p">[</span><span class="n">image</span><span class="p">,</span> <span class="n">label</span><span class="p">],</span> <span class="n">batch_size</span><span class="o">=</span><span class="mi">128</span><span class="p">,</span>
    <span class="n">capacity</span><span class="o">=</span><span class="mi">2000</span><span class="p">,</span>
    <span class="n">min_after_dequeue</span><span class="o">=</span><span class="mi">1000</span><span class="p">)</span>

<span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">Session</span><span class="p">()</span> <span class="k">as</span> <span class="n">sess</span><span class="p">:</span>
    <span class="n">sess</span><span class="p">.</span><span class="n">run</span><span class="p">(</span><span class="n">init_op</span><span class="p">)</span>
    <span class="n">tf</span><span class="p">.</span><span class="n">train</span><span class="p">.</span><span class="n">start_queue_runners</span><span class="p">()</span>
    <span class="n">labels</span><span class="p">,</span> <span class="n">images</span> <span class="o">=</span> <span class="n">sess</span><span class="p">.</span><span class="n">run</span><span class="p">([</span><span class="n">labels_batch</span><span class="p">,</span> <span class="n">images_batch</span><span class="p">])</span>
    <span class="k">print</span><span class="p">(</span><span class="n">labels</span><span class="p">.</span><span class="n">shape</span><span class="p">)</span>
</code></pre></div></div>
<p>Here we use <code class="language-plaintext highlighter-rouge">tf.train.shuffle_batch</code> to turn the single <code class="language-plaintext highlighter-rouge">image</code> and <code class="language-plaintext highlighter-rouge">label</code> examples into batches. Under the hood, <code class="language-plaintext highlighter-rouge">tf.train.shuffle_batch</code> builds another queue fed by its own <code class="language-plaintext highlighter-rouge">QueueRunner</code>: a <code class="language-plaintext highlighter-rouge">RandomShuffleQueue</code>. The <code class="language-plaintext highlighter-rouge">RandomShuffleQueue</code> accumulates individual <code class="language-plaintext highlighter-rouge">image</code> and <code class="language-plaintext highlighter-rouge">label</code> pairs until it holds <code class="language-plaintext highlighter-rouge">batch_size + min_after_dequeue</code> of them, and then returns <code class="language-plaintext highlighter-rouge">batch_size</code> randomly chosen elements. The return value of <code class="language-plaintext highlighter-rouge">shuffle_batch</code> is therefore the result of calling <code class="language-plaintext highlighter-rouge">dequeue_many</code> on the <code class="language-plaintext highlighter-rouge">RandomShuffleQueue</code>.</p>

<p>If a tensor has shape <code class="language-plaintext highlighter-rouge">[x, y, z]</code>, the corresponding tensor returned by <code class="language-plaintext highlighter-rouge">shuffle_batch</code> has shape <code class="language-plaintext highlighter-rouge">[batch_size, x, y, z]</code>. In this example, <code class="language-plaintext highlighter-rouge">labels</code> and <code class="language-plaintext highlighter-rouge">images</code> have shapes <code class="language-plaintext highlighter-rouge">(128,)</code> and <code class="language-plaintext highlighter-rouge">(128, 32, 32, 3)</code> respectively.</p>
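<p>The role of <code class="language-plaintext highlighter-rouge">min_after_dequeue</code> is easier to see in a toy model. The class below is a hypothetical stand-in written for this post, not the TensorFlow <code class="language-plaintext highlighter-rouge">RandomShuffleQueue</code>: it raises an error where the real queue would block, and it stores plain integers instead of tensors. It only demonstrates the rule that a batch may be dequeued when at least <code class="language-plaintext highlighter-rouge">min_after_dequeue</code> elements would remain afterwards.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import random


class ShuffleBuffer:
    """Toy stand-in for a shuffling queue (not the TensorFlow class)."""

    def __init__(self, capacity, min_after_dequeue, seed=0):
        self.capacity = capacity
        self.min_after_dequeue = min_after_dequeue
        self.items = []
        self.rng = random.Random(seed)

    def enqueue(self, item):
        if len(self.items) == self.capacity:
            raise OverflowError("buffer is full")
        self.items.append(item)

    def dequeue_many(self, n):
        # Refuse to dequeue unless min_after_dequeue items would remain;
        # keeping that reserve is what keeps the batches well mixed.
        if self.min_after_dequeue > len(self.items) - n:
            raise ValueError("not enough items buffered yet")
        batch = []
        for _ in range(n):
            index = self.rng.randrange(len(self.items))  # pick at random
            batch.append(self.items.pop(index))
        return batch


buffer = ShuffleBuffer(capacity=20, min_after_dequeue=10)
for example_id in range(14):
    buffer.enqueue(example_id)

batch = buffer.dequeue_many(4)   # allowed: 14 - 4 = 10 items remain
print(len(batch), len(buffer.items))

try:
    buffer.dequeue_many(1)       # would leave only 9 items
    refused = False
except ValueError:
    refused = True
print(refused)
</code></pre></div></div>
<p>A larger <code class="language-plaintext highlighter-rouge">min_after_dequeue</code> gives better mixing at the cost of more memory and a longer warm-up before the first batch is available.</p>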

<h2 id="datasetdataprovider">DatasetDataProvider</h2>
<p>If we use <a href="https://github.com/tensorflow/models/tree/master/research/slim"><code class="language-plaintext highlighter-rouge">tf.contrib.slim</code></a>, the reading logic can be wrapped up more elegantly.</p>

<p>We define our dataset in <code class="language-plaintext highlighter-rouge">cifar10.py</code>. After the code shown so far, the following should be readable without further explanation:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">absolute_import</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">division</span>
<span class="kn">from</span> <span class="nn">__future__</span> <span class="kn">import</span> <span class="n">print_function</span>

<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">tensorflow</span> <span class="k">as</span> <span class="n">tf</span>

<span class="n">slim</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">contrib</span><span class="p">.</span><span class="n">slim</span>

<span class="n">_FILE_PATTERN</span> <span class="o">=</span> <span class="s">'cifar10_%s.tfrecord'</span>

<span class="n">SPLITS_TO_SIZES</span> <span class="o">=</span> <span class="p">{</span><span class="s">'train'</span><span class="p">:</span> <span class="mi">50000</span><span class="p">,</span> <span class="s">'test'</span><span class="p">:</span> <span class="mi">10000</span><span class="p">}</span>

<span class="n">_NUM_CLASSES</span> <span class="o">=</span> <span class="mi">10</span>

<span class="n">_ITEMS_TO_DESCRIPTIONS</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s">'image'</span><span class="p">:</span> <span class="s">'A [32 x 32 x 3] color image.'</span><span class="p">,</span>
    <span class="s">'label'</span><span class="p">:</span> <span class="s">'A single integer between 0 and 9'</span><span class="p">,</span>
<span class="p">}</span>


<span class="k">def</span> <span class="nf">get_split</span><span class="p">(</span><span class="n">split_name</span><span class="p">,</span> <span class="n">dataset_dir</span><span class="p">,</span> <span class="n">file_pattern</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">reader</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
    <span class="s">"""Gets a dataset tuple with instructions for reading cifar10.
    Args:
      split_name: A train/test split name.
      dataset_dir: The base directory of the dataset sources.
      file_pattern: The file pattern to use when matching the dataset sources.
        It is assumed that the pattern contains a '%s' string so that the split
        name can be inserted.
      reader: The TensorFlow reader type.
    Returns:
      A `Dataset` namedtuple.
    Raises:
      ValueError: if `split_name` is not a valid train/test split.
    """</span>
    <span class="k">if</span> <span class="n">split_name</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">SPLITS_TO_SIZES</span><span class="p">:</span>
        <span class="k">raise</span> <span class="nb">ValueError</span><span class="p">(</span><span class="s">'split name %s was not recognized.'</span> <span class="o">%</span> <span class="n">split_name</span><span class="p">)</span>

    <span class="k">if</span> <span class="ow">not</span> <span class="n">file_pattern</span><span class="p">:</span>
        <span class="n">file_pattern</span> <span class="o">=</span> <span class="n">_FILE_PATTERN</span>
    <span class="n">file_pattern</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">,</span> <span class="n">file_pattern</span> <span class="o">%</span> <span class="n">split_name</span><span class="p">)</span>

    <span class="c1"># Allowing None in the signature so that dataset_factory can use the default.
</span>    <span class="k">if</span> <span class="ow">not</span> <span class="n">reader</span><span class="p">:</span>
        <span class="n">reader</span> <span class="o">=</span> <span class="n">tf</span><span class="p">.</span><span class="n">TFRecordReader</span>

    <span class="n">keys_to_features</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">'image/encoded'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">((),</span> <span class="n">tf</span><span class="p">.</span><span class="n">string</span><span class="p">,</span> <span class="n">default_value</span><span class="o">=</span><span class="s">''</span><span class="p">),</span>
        <span class="s">'image/format'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">((),</span> <span class="n">tf</span><span class="p">.</span><span class="n">string</span><span class="p">,</span> <span class="n">default_value</span><span class="o">=</span><span class="s">'png'</span><span class="p">),</span>
        <span class="s">'image/class/label'</span><span class="p">:</span> <span class="n">tf</span><span class="p">.</span><span class="n">FixedLenFeature</span><span class="p">(</span>
            <span class="p">[],</span> <span class="n">tf</span><span class="p">.</span><span class="n">int64</span><span class="p">,</span> <span class="n">default_value</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">zeros</span><span class="p">([],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">tf</span><span class="p">.</span><span class="n">int64</span><span class="p">)),</span>
    <span class="p">}</span>

    <span class="n">items_to_handlers</span> <span class="o">=</span> <span class="p">{</span>
        <span class="s">'image'</span><span class="p">:</span> <span class="n">slim</span><span class="p">.</span><span class="n">tfexample_decoder</span><span class="p">.</span><span class="n">Image</span><span class="p">(</span><span class="n">shape</span><span class="o">=</span><span class="p">[</span><span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">3</span><span class="p">]),</span>
        <span class="s">'label'</span><span class="p">:</span> <span class="n">slim</span><span class="p">.</span><span class="n">tfexample_decoder</span><span class="p">.</span><span class="n">Tensor</span><span class="p">(</span><span class="s">'image/class/label'</span><span class="p">),</span>
    <span class="p">}</span>

    <span class="n">decoder</span> <span class="o">=</span> <span class="n">slim</span><span class="p">.</span><span class="n">tfexample_decoder</span><span class="p">.</span><span class="n">TFExampleDecoder</span><span class="p">(</span>
        <span class="n">keys_to_features</span><span class="p">,</span> <span class="n">items_to_handlers</span><span class="p">)</span>

    <span class="n">labels_to_names</span> <span class="o">=</span> <span class="bp">None</span>
    <span class="k">if</span> <span class="n">has_labels</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">):</span>
        <span class="n">labels_to_names</span> <span class="o">=</span> <span class="n">read_label_file</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">)</span>

    <span class="k">return</span> <span class="n">slim</span><span class="p">.</span><span class="n">dataset</span><span class="p">.</span><span class="n">Dataset</span><span class="p">(</span>
        <span class="n">data_sources</span><span class="o">=</span><span class="n">file_pattern</span><span class="p">,</span>
        <span class="n">reader</span><span class="o">=</span><span class="n">reader</span><span class="p">,</span>
        <span class="n">decoder</span><span class="o">=</span><span class="n">decoder</span><span class="p">,</span>
        <span class="n">num_samples</span><span class="o">=</span><span class="n">SPLITS_TO_SIZES</span><span class="p">[</span><span class="n">split_name</span><span class="p">],</span>
        <span class="n">items_to_descriptions</span><span class="o">=</span><span class="n">_ITEMS_TO_DESCRIPTIONS</span><span class="p">,</span>
        <span class="n">num_classes</span><span class="o">=</span><span class="n">_NUM_CLASSES</span><span class="p">,</span>
        <span class="n">labels_to_names</span><span class="o">=</span><span class="n">labels_to_names</span><span class="p">)</span>


<span class="k">def</span> <span class="nf">has_labels</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">,</span> <span class="n">filename</span><span class="o">=</span><span class="s">'labels.txt'</span><span class="p">):</span>
    <span class="s">"""Specifies whether or not the dataset directory contains a label map file.
    Args:
      dataset_dir: The directory in which the labels file is found.
      filename: The filename where the class names are written.
    Returns:
      `True` if the labels file exists and `False` otherwise.
    """</span>
    <span class="k">return</span> <span class="n">tf</span><span class="p">.</span><span class="n">gfile</span><span class="p">.</span><span class="n">Exists</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">,</span> <span class="n">filename</span><span class="p">))</span>


<span class="k">def</span> <span class="nf">read_label_file</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">,</span> <span class="n">filename</span><span class="o">=</span><span class="s">'labels.txt'</span><span class="p">):</span>
    <span class="s">"""Reads the labels file and returns a mapping from ID to class name.
    Args:
      dataset_dir: The directory in which the labels file is found.
      filename: The filename where the class names are written.
    Returns:
      A map from a label (integer) to class name.
    """</span>
    <span class="n">labels_filename</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">path</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">dataset_dir</span><span class="p">,</span> <span class="n">filename</span><span class="p">)</span>
    <span class="k">with</span> <span class="n">tf</span><span class="p">.</span><span class="n">gfile</span><span class="p">.</span><span class="n">Open</span><span class="p">(</span><span class="n">labels_filename</span><span class="p">,</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">lines</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">().</span><span class="n">decode</span><span class="p">()</span>
    <span class="n">lines</span> <span class="o">=</span> <span class="n">lines</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">'</span><span class="se">\n</span><span class="s">'</span><span class="p">)</span>
    <span class="n">lines</span> <span class="o">=</span> <span class="nb">filter</span><span class="p">(</span><span class="bp">None</span><span class="p">,</span> <span class="n">lines</span><span class="p">)</span>

    <span class="n">labels_to_class_names</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="n">line</span> <span class="ow">in</span> <span class="n">lines</span><span class="p">:</span>
        <span class="n">index</span> <span class="o">=</span> <span class="n">line</span><span class="p">.</span><span class="n">index</span><span class="p">(</span><span class="s">':'</span><span class="p">)</span>
        <span class="n">labels_to_class_names</span><span class="p">[</span><span class="nb">int</span><span class="p">(</span><span class="n">line</span><span class="p">[:</span><span class="n">index</span><span class="p">])]</span> <span class="o">=</span> <span class="n">line</span><span class="p">[</span><span class="n">index</span> <span class="o">+</span> <span class="mi">1</span><span class="p">:]</span>
    <span class="k">return</span> <span class="n">labels_to_class_names</span>
</code></pre></div></div>
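<p>One detail that the code above leaves implicit is the layout of <code class="language-plaintext highlighter-rouge">labels.txt</code>. Judging from <code class="language-plaintext highlighter-rouge">read_label_file</code>, each line has the form <code class="language-plaintext highlighter-rouge">index:name</code>. The sketch below applies the same parsing logic to an in-memory string; <code class="language-plaintext highlighter-rouge">parse_label_lines</code> and the sample text are hypothetical and exist only for illustration.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def parse_label_lines(text):
    """Parse 'index:name' lines into a dict from int label to class name."""
    labels_to_class_names = {}
    for line in text.split("\n"):
        if not line:
            continue  # skip empty lines, e.g. after a trailing newline
        index = line.index(":")  # position of the first colon
        labels_to_class_names[int(line[:index])] = line[index + 1:]
    return labels_to_class_names


# Assumed file content: one "index:name" pair per line.
sample = "0:airplane\n1:automobile\n2:bird\n"
mapping = parse_label_lines(sample)
print(mapping)
</code></pre></div></div>
<p>Because only the first colon is used as the separator, class names that themselves contain a colon are preserved intact.</p>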
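<p>Reading then becomes very simple (here <code class="language-plaintext highlighter-rouge">cifar10</code> is the module defined above and <code class="language-plaintext highlighter-rouge">DATA_DIR</code> is the directory that holds the TFRecord files):</p>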
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">dataset</span> <span class="o">=</span> <span class="n">cifar10</span><span class="p">.</span><span class="n">get_split</span><span class="p">(</span><span class="s">'train'</span><span class="p">,</span> <span class="n">DATA_DIR</span><span class="p">)</span>
<span class="n">provider</span> <span class="o">=</span> <span class="n">slim</span><span class="p">.</span><span class="n">dataset_data_provider</span><span class="p">.</span><span class="n">DatasetDataProvider</span><span class="p">(</span><span class="n">dataset</span><span class="p">)</span>
<span class="p">[</span><span class="n">image</span><span class="p">,</span> <span class="n">label</span><span class="p">]</span> <span class="o">=</span> <span class="n">provider</span><span class="p">.</span><span class="n">get</span><span class="p">([</span><span class="s">'image'</span><span class="p">,</span> <span class="s">'label'</span><span class="p">])</span>
</code></pre></div></div>

<p>All of the code above can be found under <a href="https://github.com/tensorflow/models/tree/master/research/slim">slim</a> in the tensorflow/models repository.</p>

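<h2 id="总结">Summary</h2>
<p>The overall workflow:</p>
<ol>
  <li>Generate the TFRecord files</li>
  <li>Define a record reader that parses the TFRecord files</li>
  <li>Define the batcher</li>
  <li>Build the network model</li>
  <li>Run the initialization op for all variables</li>
  <li>Start the queue runners</li>
  <li>Run the training loop</li>
</ol>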

<h2 id="参考">References</h2>
<ol>
  <li><a href="http://warmspringwinds.github.io/tensorflow/tf-slim/2016/12/21/tfrecords-guide/">Tfrecords Guide</a></li>
  <li><a href="https://indico.io/blog/tensorflow-data-inputs-part1-placeholders-protobufs-queues/">TensorFlow Data Input (Part 1): Placeholders, Protobufs &amp; Queues</a></li>
  <li><a href="http://blog.csdn.net/sunquan_ok/article/details/51832442">QueueRunner: managing data-reading threads in TensorFlow (in Chinese)</a></li>
</ol>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[通常我们下载的数据集都是以压缩文件的格式存在，解压后会有多个文件夹，像 train， test， val 等等。而文件也有可能多达数万或者数百万个。这种形式的数据集不但读取复杂、慢，而且占用磁盘空间。这时二进制的格式文件的优点便显现出来了。我们可以把数据集存储为一个二进制文件，这样就没有了 train， test， val 等等的文件夹。更重要的是，这些数据只会占据一块内存（Block of Memory），而不需要一个一个单独加载文件。因此使用二进制文件效率更高。 你以为 TensorFlow 都为你封装好二进制文件文件的读写、解析方式了吗？是的，都封装好了~本文就是介绍如何将数据转换为 TFRecord 格式。 CIFAR-10 数据集 本文以 CIFAR-10 数据集为例，什么是 CIFAR-10 数据集？看这儿 =&gt; 图像数据集 ~ 假设你已经有了以下数据： 写，保存为 TFRecord 格式 定义的一些常量： _NUM_TRAIN_FILES = 5 # The height and width of each image. _IMAGE_SIZE = 32 # The names of the classes. _CLASS_NAMES = [ 'airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck', ] 这里我们创建两个 split 文件，分别存储 train 和 test 需要的数据： dataset_dir = 'data' if not tf.gfile.Exists(dataset_dir): tf.gfile.MakeDirs(dataset_dir) training_filename = _get_output_filename(dataset_dir, 'train') testing_filename = _get_output_filename(dataset_dir, 'test') _get_output_filename 函数用来生成文件名： def _get_output_filename(dataset_dir, split_name): """Creates the output filename. Args: dataset_dir: The dataset directory where the dataset is stored. split_name: The name of the train/test split. Returns: An absolute file path. """ return '%s/cifar10_%s.tfrecord' % (dataset_dir, split_name) 然后，处理训练数据： # First, process the training data: with tf.python_io.TFRecordWriter(training_filename) as tfrecord_writer: offset = 0 for i in range(_NUM_TRAIN_FILES): filename = os.path.join('./cifar-10-batches-py', 'data_batch_%d' % (i + 1)) offset = _add_to_tfrecord(filename, tfrecord_writer, offset) 即依次读取 data_batch_? 文件，调用 _add_to_tfrecord 将其保存为 TFRecord 格式。 def _add_to_tfrecord(filename, tfrecord_writer, offset=0): """Loads data from the cifar10 pickle files and writes files to a TFRecord. Args: filename: The filename of the cifar10 pickle file. tfrecord_writer: The TFRecord writer to use for writing. offset: An offset into the absolute number of images previously written. 
Returns: The new offset. """ with tf.gfile.Open(filename, 'rb') as f: data = pickle.load(f, encoding='bytes') images = data[b'data'] num_images = images.shape[0] images = images.reshape((num_images, 3, 32, 32)) labels = data[b'labels'] with tf.Graph().as_default(): image_placeholder = tf.placeholder(tf.uint8) encoded_image = tf.image.encode_png(image_placeholder) with tf.Session() as sess: for j in range(num_images): sys.stdout.write('\r&gt;&gt; Reading file [%s] image %d/%d' % (filename, offset + j + 1, offset + num_images)) sys.stdout.flush() image = np.squeeze(images[j]).transpose((1, 2, 0)) label = labels[j] png_string = sess.run(encoded_image, feed_dict={image_placeholder: image}) example = image_to_tfexample(png_string, b'png', _IMAGE_SIZE, _IMAGE_SIZE, label) tfrecord_writer.write(example.SerializeToString()) return offset + num_images 因为 CIFAR-10 数据集的图片是 10000x3072 numpy array 格式的，因此需要 reshape 为 tf.image.encode_png 需要的格式：[height, width, channels]。 tf.image.encode_png 返回编码后的字符串，然后还需要保存图片的宽高、格式信息。调用 image_to_tfexample 将这些数据保存到 tf.train.Example 中： def image_to_tfexample(image_data, image_format, height, width, class_id): return tf.train.Example(features=tf.train.Features(feature={ 'image/encoded': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_data])), 'image/format': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_format])), 'image/class/label': tf.train.Feature(int64_list=tf.train.Int64List(value=[class_id])), 'image/height': tf.train.Feature(int64_list=tf.train.Int64List(value=[height])), 'image/width': tf.train.Feature(int64_list=tf.train.Int64List(value=[width])), })) TensorFlow 会将数据转换为 tf.train.Example Protobuf 对象，Example 包含 Features， Features 包含 一个 dict，来区分不同的 Feature。Feature 可以包含 FloatList，ByteList 或者 Int64List。注意这里的 key，image/encoded，image/format等，是可以随便定义的，这里是TensorFlow 默认的图片数据集的 key ,我们一般采取 TensorFlow 默认的值。 有了 example，我们将其转换为字符串写入到文件就完成了整个 TFRecord 格式文件的制作。 tfrecord_writer.write(example.SerializeToString()) 同理，制作测试数据集： # 
Next, process the testing data: with tf.python_io.TFRecordWriter(testing_filename) as tfrecord_writer: filename = os.path.join('./cifar-10-batches-py', 'test_batch') _add_to_tfrecord(filename, tfrecord_writer) 最后，我们会得到两个文件： 这就是最后的 TFRecord 格式文件，二进制文件。 读 最简单的就是直接读取： reconstructed_images = [] record_iterator = tf.python_io.tf_record_iterator(path='./data/cifar10_train.tfrecord') for string_iterator in record_iterator: example = tf.train.Example() example.ParseFromString(string_iterator) height = example.features.feature['image/height'].int64_list.value[0] width = example.features.feature['image/width'].int64_list.value[0] png_string = example.features.feature['image/encoded'].bytes_list.value[0] label = example.features.feature['image/class/label'].int64_list.value[0] with tf.Session() as sess: image_placeholder = tf.placeholder(dtype=tf.string) decoded_img = tf.image.decode_png(image_placeholder, channels=3) reconstructed_img = sess.run(decoded_img, feed_dict={image_placeholder: png_string}) reconstructed_images.append((reconstructed_img, label)) 其实就是“写”的逆过程。生成一个 Example，分析读取的字符串，然后从 features 中根据 key 获取相应的对象即可。图片的话我们使用 tf.image.decode_png 解码，即 tf.image.encode_png 逆过程。 读取后可以直接来显示： plt.imshow(reconstructed_images[0][0]) plt.title(_CLASS_NAMES[reconstructed_images[0][1]]) plt.show() 这种方法比较直接，直接从文件读取并分析，但是如果数据较多就会你比较慢，而且没有考虑分布式、队列、多线程的问题。我们还可以使用文件队列来读取。 队列 # first construct a queue containing a list of filenames. 
# this lets a user split up there dataset in multiple files to keep # size down filename_queue = tf.train.string_input_producer(['./data/cifar10_train.tfrecord']) # Unlike the TFRecordWriter, the TFRecordReader is symbolic, 即所做的操作不会立即执行 reader = tf.TFRecordReader() _, serialized_example = reader.read(filename_queue) features = tf.parse_single_example(serialized_example, features={ 'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''), 'image/format': tf.FixedLenFeature((), tf.string, default_value='png'), 'image/height': tf.FixedLenFeature((), tf.int64), 'image/width': tf.FixedLenFeature((), tf.int64), 'image/class/label': tf.FixedLenFeature([], tf.int64, default_value=tf.zeros([], dtype=tf.int64)) }) image = tf.image.decode_png(features['image/encoded'], channels=3) image = tf.image.resize_image_with_crop_or_pad(image, 32, 32) label = features['image/class/label'] init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer()) with tf.Session() as sess: sess.run(init_op) tf.train.start_queue_runners() # grab examples back. 
# first example from file image_val_1, label_val_1 = sess.run([image, label]) # second example from file image_val_2, label_val_2 = sess.run([image, label]) print(image_val_1, label_val_1) print(image_val_2, label_val_2) plt.imshow(image_val_1) plt.title(_CLASS_NAMES[label_val_1]) plt.show() 首先定义我们的文件名队列 filename_queue，包含一个文件名列表，这样我们可以把大的文件分成多个小的文件，保证单个文件不会太大，本例只有一个文件。然后用 TFRecordReader 读取。TensorFlow 的 graphs 包含一些状态变量，允许 TFRecordReader 记住 tfrecord 读到了哪儿，下次从哪儿读起，因此我们需要 sess.run(init_op) 来初始化这些状态。与 tf.python_io.tf_record_iterator 不同的是 TFRecordReader 总是作用在文件名（filename_queue）队列上，它会弹出一个文件名读取数据，直到 tfrecord 为空，然后读取下一个文件名对应的文件。 如何生成文件名队列呢，这时我们需要 QueueRunners 来做。QueueRunners 其实就是一个线程，使用 session 不断执行入队操作，TensorFlow 已经封装好了 tf.train.QueueRunner 对象。但是大部分时间 QueueRunner 只是底层操作，我们不会直接操作它，本例使用 tf.train.string_input_producer 生成。 此时，需要发送信号让 TensorFlow 开起线程，执行 QueueRunners，否则，代码将会永远阻塞，等待数据入队。因此需要执行 tf.train.start_queue_runners()，此行代码执行完会立即创建线程。注意，必须在 initialization 运算符（sess.run(init_op)）执行之后调用。 tf.parse_single_example 根据我们定义的 features 数据格式解析。最终，image_val_1 就是图片数据集中的一张图片，shape 为 ( 32, 32, 3)。 队列读取的流程图如下： Batch 上例中，我们获得的 image 和 label 都是单个的 Example 对象，代表数据集中的一条数据。训练的时候不可能单条数据训练，如何生成 batches？ images_batch, labels_batch = tf.train.shuffle_batch( [image, label], batch_size=128, capacity=2000, min_after_dequeue=1000) with tf.Session() as sess: sess.run(init_op) tf.train.start_queue_runners() labels, images = sess.run([labels_batch, images_batch]) print(labels.shape) 这里我们使用 tf.train.shuffle_batch 将单个的 image 和 label Example 对象生成 batches 。tf.train.shuffle_batch 实际上构建了另一种 QueueRunner，RandomShuffleQueue。RandomShuffleQueue 将单个的 image 和 label 累积成队列，直到包含 batch_size + min_after_dequeue 个。然后随机选择 batch_size 条数据返回，因此 shuffle_batch 的返回值实际上是 RandomShuffleQueue 执行 dequeue_many 的返回值。 如果 tensor 的形状为 [x, y, z]，shuffle_batch 返回的对应的 tensor 形状为 [batch_size, x, y, z]，本例 labels 和 images 形状分别为(128, ) 和 (128, 32, 32, 3)。 DatasetDataProvider 如果我们使用 tf.contrib.slim，我们可以将读取过程封装的更优雅。 定义我们的数据集 
cifar10.py，具体怎么定义相信看了以上代码，下面的代码不用解释也能看懂了吧~ from __future__ import absolute_import from __future__ import division from __future__ import print_function import os import tensorflow as tf slim = tf.contrib.slim _FILE_PATTERN = 'cifar10_%s.tfrecord' SPLITS_TO_SIZES = {'train': 50000, 'test': 10000} _NUM_CLASSES = 10 _ITEMS_TO_DESCRIPTIONS = { 'image': 'A [32 x 32 x 3] color image.', 'label': 'A single integer between 0 and 9', } def get_split(split_name, dataset_dir, file_pattern=None, reader=None): """Gets a dataset tuple with instructions for reading cifar10. Args: split_name: A train/test split name. dataset_dir: The base directory of the dataset sources. file_pattern: The file pattern to use when matching the dataset sources. It is assumed that the pattern contains a '%s' string so that the split name can be inserted. reader: The TensorFlow reader type. Returns: A `Dataset` namedtuple. Raises: ValueError: if `split_name` is not a valid train/test split. """ if split_name not in SPLITS_TO_SIZES: raise ValueError('split name %s was not recognized.' % split_name) if not file_pattern: file_pattern = _FILE_PATTERN file_pattern = os.path.join(dataset_dir, file_pattern % split_name) # Allowing None in the signature so that dataset_factory can use the default. 
if not reader: reader = tf.TFRecordReader keys_to_features = { 'image/encoded': tf.FixedLenFeature((), tf.string, default_value=''), 'image/format': tf.FixedLenFeature((), tf.string, default_value='png'), 'image/class/label': tf.FixedLenFeature( [], tf.int64, default_value=tf.zeros([], dtype=tf.int64)), } items_to_handlers = { 'image': slim.tfexample_decoder.Image(shape=[32, 32, 3]), 'label': slim.tfexample_decoder.Tensor('image/class/label'), } decoder = slim.tfexample_decoder.TFExampleDecoder( keys_to_features, items_to_handlers) labels_to_names = None if has_labels(dataset_dir): labels_to_names = read_label_file(dataset_dir) return slim.dataset.Dataset( data_sources=file_pattern, reader=reader, decoder=decoder, num_samples=SPLITS_TO_SIZES[split_name], items_to_descriptions=_ITEMS_TO_DESCRIPTIONS, num_classes=_NUM_CLASSES, labels_to_names=labels_to_names) def has_labels(dataset_dir, filename='labels.txt'): """Specifies whether or not the dataset directory contains a label map file. Args: dataset_dir: The directory in which the labels file is found. filename: The filename where the class names are written. Returns: `True` if the labels file exists and `False` otherwise. """ return tf.gfile.Exists(os.path.join(dataset_dir, filename)) def read_label_file(dataset_dir, filename='labels.txt'): """Reads the labels file and returns a mapping from ID to class name. Args: dataset_dir: The directory in which the labels file is found. filename: The filename where the class names are written. Returns: A map from a label (integer) to class name. 
""" labels_filename = os.path.join(dataset_dir, filename) with tf.gfile.Open(labels_filename, 'rb') as f: lines = f.read().decode() lines = lines.split('\n') lines = filter(None, lines) labels_to_class_names = {} for line in lines: index = line.index(':') labels_to_class_names[int(line[:index])] = line[index + 1:] return labels_to_class_names Reading the data is then very simple: dataset = cifar10.get_split('train', DATA_DIR) provider = slim.dataset_data_provider.DatasetDataProvider(dataset) [image, label] = provider.get(['image', 'label']) All of the code above can be found under slim in the tensorflow/models repository~ Summary Workflow: generate the TFRecord files; define a record reader to parse the TFRecord files; define a batcher; build the network model; initialize all ops; start the queue runners; run the training loop. References Tfrecords Guide TensorFlow Data Input (Part 1): Placeholders, Protobufs &amp; Queues 关于tensorflow 的数据读取线程管理QueueRunner]]></summary></entry><entry xml:lang="zh-CN"><title type="html">Predicting Character Sequences with an LSTM</title><link href="http://blog.lufficc.com/lstm-char/" rel="alternate" type="text/html" title="用 LSTM 预测字符序列" /><published>2017-10-31T03:26:00+00:00</published><updated>2017-10-31T03:26:00+00:00</updated><id>http://blog.lufficc.com/lstm-char</id><content type="html" xml:base="http://blog.lufficc.com/lstm-char/"><![CDATA[<p>In <a href="https://blog.lufficc.com/an-introduction-of-recurrent-neural-network-and-difference-between-traditional-neural-network">An Introduction to Recurrent Neural Networks</a> we learned what an RNN is. This post implements a very simple character-prediction model in TensorFlow and explains the code in detail, so that I do not forget it later ( ╯□╰ ).</p>

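<p>First, define the RNN network:</p>
<pre><code class="language-py3">import random

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API
from tensorflow.contrib import rnn  # assumed source of the `rnn` module used below


class RNN: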
    def __init__(self,
                 in_size,
                 cell_size,
                 num_layers,
                 out_size,
                 sess,
                 lr=0.003):
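        self.in_size = in_size  # input size, i.e. the number of distinct characters (uppercase letters are lowercased, which shrinks the vocabulary and speeds up training)
        self.cell_size = cell_size  # number of units in each LSTM cell
        self.num_layers = num_layers  # number of stacked LSTM layers
        self.out_size = out_size  # output size, also the number of distinct characters
        self.sess = sess  # session
        self.lr = lr  # learning rate

        # state from the previous step, used at test time
        self.last_state = np.zeros([num_layers * 2 * cell_size])

        # input data, shape (batch, time_step, in_size)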
        self.inputs = tf.placeholder(tf.float32, shape=[None, None, in_size])
        self.lstm_cells = [
            rnn.BasicLSTMCell(cell_size, state_is_tuple=False)
            for _ in range(num_layers)
        ]
        self.lstm = rnn.MultiRNNCell(self.lstm_cells, state_is_tuple=False)

        self.init_state = tf.placeholder(tf.float32, [None, num_layers * 2 * cell_size])

        # define the recurrent neural network
        outputs, self.new_state = tf.nn.dynamic_rnn(
            self.lstm,
            self.inputs,
            initial_state=self.init_state,
            dtype=tf.float32)

        # a final fully connected layer produces the logits used for the loss; w and b are its parameters
        w = tf.Variable(tf.random_normal([cell_size, out_size], stddev=0.01))
        b = tf.Variable(tf.random_normal([out_size], stddev=0.01))

        reshaped_outputs = tf.matmul(tf.reshape(outputs, [-1, cell_size]), w) + b
        # convert the logits into probabilities
        fc = tf.nn.softmax(reshaped_outputs)

        shape = tf.shape(outputs)
        self.final_outputs = tf.reshape(fc, [shape[0], shape[1], out_size])

        # labels, shape (batch, time_step, out_size)
        self.targets = tf.placeholder(tf.float32, [None, None, out_size])

        # loss
        self.cost = tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(
                logits=reshaped_outputs,
                labels=tf.reshape(self.targets, [-1, out_size])))
        self.optimizer = tf.train.RMSPropOptimizer(lr).minimize(self.cost)

    def train(self, inputs, targets):
        # init_state: zeros of shape (batch_size, num_layers * 2 * cell_size)
        init_state = np.zeros(
            [inputs.shape[0], self.num_layers * 2 * self.cell_size])

        _, loss = self.sess.run(
            [self.optimizer, self.cost],
            feed_dict={
                self.inputs: inputs,
                self.targets: targets,
                self.init_state: init_state
            })

        return loss

    def get_next_char_pro(self, x, init=False):
        """Return the probability distribution over the next character given the input character x.

        x has shape (1, in_size), i.e. a single character in one-hot form.
        """
        if init:
            init_state = np.zeros([self.num_layers * 2 * self.cell_size])
        else:
            init_state = self.last_state

        out, next_state = self.sess.run(
            [self.final_outputs, self.new_state],
            feed_dict={self.inputs: [x], self.init_state: [init_state]})
        # store the current state for the next prediction so that the context is preserved
        self.last_state = next_state[0]
        return out[0][0]
</code></pre>
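<p>The state size <code class="language-plaintext highlighter-rouge">num_layers * 2 * cell_size</code> used above follows from <code class="language-plaintext highlighter-rouge">state_is_tuple=False</code>: each LSTM layer keeps its cell state and hidden state side by side, and the layers are concatenated along the last axis. The following is a minimal NumPy sketch of that layout, assuming the order is <code class="language-plaintext highlighter-rouge">[c_0, h_0, c_1, h_1, ...]</code>; the helper name <code class="language-plaintext highlighter-rouge">split_flat_state</code> is hypothetical and not part of TensorFlow.</p>

```python
import numpy as np


def split_flat_state(flat_state, num_layers, cell_size):
    """Split a flat (batch, num_layers * 2 * cell_size) state into per-layer (c, h) pairs.

    Assumes the concatenated layout [c_0, h_0, c_1, h_1, ...] along the last axis.
    """
    batch = flat_state.shape[0]
    per_layer = flat_state.reshape(batch, num_layers, 2, cell_size)
    return [(per_layer[:, i, 0, :], per_layer[:, i, 1, :]) for i in range(num_layers)]


num_layers, cell_size = 2, 128
flat = np.zeros([1, num_layers * 2 * cell_size])
layers = split_flat_state(flat, num_layers, cell_size)
print(len(layers), layers[0][0].shape)  # 2 (1, 128)
```

<p>This is why <code class="language-plaintext highlighter-rouge">init_state</code> and <code class="language-plaintext highlighter-rouge">last_state</code> can be plain zero vectors of that length.</p>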

<p>Prepare the data:</p>
<pre><code class="language-py3">def generate_one_hot_data(data, vocabulary):
    """Encode each character of `data` as a one-hot row of length len(vocabulary)."""
    data_ = np.zeros([len(data), len(vocabulary)])

    for count, char in enumerate(data):
        data_[count, vocabulary.index(char)] = 1.0
    return data_


path = 'data/data.txt'

with open(path, 'r') as f:
    data = f.read()

data = data.lower()  # lowercase everything to reduce complexity

vocabulary = sorted(set(data))  # list of distinct characters, sorted so that indices are stable across runs

one_hot_data = generate_one_hot_data(data, vocabulary)  # shape (len(data), len(vocabulary))
</code></pre>
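<p><code class="language-plaintext highlighter-rouge">vocabulary.index(char)</code> performs a linear search for every character. For a large corpus, a dictionary lookup combined with NumPy fancy indexing may be faster. This is only a sketch; the name <code class="language-plaintext highlighter-rouge">one_hot_fast</code> is hypothetical and the small string stands in for the real data.</p>

```python
import numpy as np


def one_hot_fast(data, vocabulary):
    """Return a (len(data), len(vocabulary)) one-hot matrix using a dict lookup."""
    char_to_id = {char: i for i, char in enumerate(vocabulary)}
    ids = np.array([char_to_id[char] for char in data], dtype=np.int64)
    encoded = np.zeros([len(data), len(vocabulary)])
    # One assignment sets encoded[row, ids[row]] = 1 for every row.
    encoded[np.arange(len(data)), ids] = 1.0
    return encoded


vocabulary = sorted(set("hello"))  # ['e', 'h', 'l', 'o']
print(one_hot_fast("hello", vocabulary).argmax(axis=1))  # [1 0 2 2 3]
```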

<p>Define the parameters:</p>
<pre><code class="language-py3">in_size = out_size = len(vocabulary)

cell_size = 128
num_layers = 2
batch_size = 64
time_steps = 128

NUM_TRAIN_BATCHES = 5000

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.InteractiveSession(config=config)

model = RNN(in_size=in_size,
            cell_size=cell_size,
            num_layers=num_layers,
            out_size=out_size,
            sess=sess)

sess.run(tf.global_variables_initializer())

inputs = np.zeros([batch_size, time_steps, in_size])
targets = np.zeros([batch_size, time_steps, out_size])

possible_batch_start_ids = range(len(data) - time_steps - 1)
</code></pre>

<p>Generate the training batches and train the model:</p>
<pre><code class="language-py3">for i in range(NUM_TRAIN_BATCHES):
    batch_start_ids = random.sample(possible_batch_start_ids, batch_size)

    for j in range(time_steps):
        inputs_ids = [k + j for k in batch_start_ids]
        targets_ids = [k + j + 1 for k in batch_start_ids]

        inputs[:, j, :] = one_hot_data[inputs_ids, :]
        targets[:, j, :] = one_hot_data[targets_ids, :]

    loss = model.train(inputs, targets)

    if i % 100 == 0:
        print('loss: {:.5f} of batch {}'.format(loss, i))

</code></pre>
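<p>In the loop above, each target window is the input window shifted by one character. The same batch can be built without the inner loop over <code class="language-plaintext highlighter-rouge">time_steps</code> by indexing with a 2-D array of offsets. The sketch below uses a small random stand-in corpus and small sizes; these values are assumptions chosen only for illustration.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
one_hot_data = np.eye(5)[rng.integers(0, 5, size=50)]  # stand-in corpus: 50 characters, 5 symbols
batch_size, time_steps = 4, 8

# Row i of `offsets` is starts[i], starts[i] + 1, ..., starts[i] + time_steps - 1.
starts = rng.choice(len(one_hot_data) - time_steps - 1, size=batch_size, replace=False)
offsets = starts[:, None] + np.arange(time_steps)[None, :]

inputs = one_hot_data[offsets]       # (batch_size, time_steps, vocab)
targets = one_hot_data[offsets + 1]  # the same windows shifted by one character
print(inputs.shape, targets.shape)   # (4, 8, 5) (4, 8, 5)
```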
<p>Test:</p>
<pre><code class="language-py3"># generate text that starts with 'we '
TEST_PREFIX = 'we '

for i in range(len(TEST_PREFIX)):
    out = model.get_next_char_pro(
        generate_one_hot_data(TEST_PREFIX[i], vocabulary), i == 0)

gen_str = TEST_PREFIX

for i in range(200):
    index = np.random.choice(range(len(vocabulary)), p=out)
    pred = vocabulary[index]
    # print(index, len(out), len(vocabulary))
    gen_str += pred

    out = model.get_next_char_pro(generate_one_hot_data(pred, vocabulary))

print(gen_str)
</code></pre>
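<p>Generation samples the next character from the predicted distribution instead of always taking the most likely one, which keeps the output varied. The sketch below isolates that step. Renormalising in float64 is an added precaution, not something the original code does, in case float32 rounding leaves the probabilities slightly off 1; the name <code class="language-plaintext highlighter-rouge">sample_index</code> is hypothetical.</p>

```python
import numpy as np


def sample_index(probs, rng):
    """Draw one index from `probs`, renormalising in float64 first."""
    p = np.asarray(probs, dtype=np.float64)
    p = p / p.sum()
    return int(rng.choice(len(p), p=p))


rng = np.random.default_rng(0)
probs = np.array([0.1, 0.7, 0.2], dtype=np.float32)
counts = np.bincount([sample_index(probs, rng) for _ in range(1000)], minlength=3)
print(counts.argmax())  # 1, the most probable index is drawn most often
```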
<p>Training output:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>loss: 3.12114 of batch 100
loss: 3.06464 of batch 200
loss: 2.49030 of batch 300
loss: 2.20478 of batch 400
loss: 2.02495 of batch 500
loss: 1.88005 of batch 600
loss: 1.74067 of batch 700
loss: 1.70199 of batch 800
loss: 1.69426 of batch 900
loss: 1.54315 of batch 1000
loss: 1.55390 of batch 1100
loss: 1.48157 of batch 1200
loss: 1.49068 of batch 1300
loss: 1.48002 of batch 1400
loss: 1.44718 of batch 1500
loss: 1.44463 of batch 1600
loss: 1.44997 of batch 1700
loss: 1.41875 of batch 1800
loss: 1.39939 of batch 1900
loss: 1.40289 of batch 2000
loss: 1.36561 of batch 2100
loss: 1.37862 of batch 2200
loss: 1.39195 of batch 2300
loss: 1.36577 of batch 2400
loss: 1.33385 of batch 2500
loss: 1.36705 of batch 2600
loss: 1.28989 of batch 2700
loss: 1.35604 of batch 2800
loss: 1.30150 of batch 2900
loss: 1.30576 of batch 3000
loss: 1.30832 of batch 3100
loss: 1.28968 of batch 3200
loss: 1.28770 of batch 3300
loss: 1.27296 of batch 3400
loss: 1.31154 of batch 3500
loss: 1.30212 of batch 3600
loss: 1.29709 of batch 3700
loss: 1.28448 of batch 3800
loss: 1.28766 of batch 3900
loss: 1.25970 of batch 4000
loss: 1.27008 of batch 4100
loss: 1.29763 of batch 4200
loss: 1.25666 of batch 4300
loss: 1.29813 of batch 4400
loss: 1.26807 of batch 4500
loss: 1.23903 of batch 4600
loss: 1.21010 of batch 4700
loss: 1.27084 of batch 4800
loss: 1.27161 of batch 4900
loss: 1.25789 of batch 5000
loss: 1.22986 of batch 5100
loss: 1.24404 of batch 5200
loss: 1.27089 of batch 5300
loss: 1.23036 of batch 5400
loss: 1.25348 of batch 5500
loss: 1.23626 of batch 5600
loss: 1.21493 of batch 5700
loss: 1.20419 of batch 5800
loss: 1.23771 of batch 5900
loss: 1.20754 of batch 6000
loss: 1.23489 of batch 6100
loss: 1.20233 of batch 6200
loss: 1.20366 of batch 6300
loss: 1.23586 of batch 6400
loss: 1.21687 of batch 6500
loss: 1.19479 of batch 6600
loss: 1.21297 of batch 6700
loss: 1.23598 of batch 6800
loss: 1.19476 of batch 6900
loss: 1.21584 of batch 7000
loss: 1.22816 of batch 7100
loss: 1.19449 of batch 7200
loss: 1.19346 of batch 7300
loss: 1.23466 of batch 7400
loss: 1.18541 of batch 7500
loss: 1.19469 of batch 7600
loss: 1.21069 of batch 7700
loss: 1.19641 of batch 7800
loss: 1.15550 of batch 7900
loss: 1.19861 of batch 8000
loss: 1.22582 of batch 8100
loss: 1.19766 of batch 8200
loss: 1.19041 of batch 8300
loss: 1.15410 of batch 8400
loss: 1.13109 of batch 8500
loss: 1.16434 of batch 8600
loss: 1.19457 of batch 8700
loss: 1.18558 of batch 8800
loss: 1.18043 of batch 8900
loss: 1.17171 of batch 9000
loss: 1.17663 of batch 9100
loss: 1.14107 of batch 9200
loss: 1.20001 of batch 9300
loss: 1.16926 of batch 9400
loss: 1.14761 of batch 9500
loss: 1.15305 of batch 9600
loss: 1.20601 of batch 9700
loss: 1.16141 of batch 9800
loss: 1.15704 of batch 9900
</code></pre></div></div>
<p>Some sample outputs:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>prefix:she
sheated here do in the weary blood
unless he single thou mayst break your ancients dren
lads, then i in to thy fair age.
so i say the nursed with mine eyes from my sleep.

clarence:
how is it pale and mo


prefix:the
the tears of mine.
hark! they may providy to the extremement,
make an office of day, like to be so run out.
dost than the heaven and your pardon did forght
the crown: all shall be too close hather change
prefix:i
in heavy part:
in your comforts of our state exposed
for in the kin a phalteres times disprison.

isabella:
dispatch and great to do a plance:
and now they do repent us to be pitteth caser
which in my 


prefix:i
it confined him to his necessities,
but to an e this young belonged,'
as it susminine of france of vices of folly,
when though 'i trust up me not, our reward,
sir, you shall be absent disposed.

warwic


prefix:me
me;
for will is angelo, take it,--and thou knows he did:
one catesby!

pronirerd:
uppecured for me hencefully, you make one flight,
making down thee thought in who thy by the world,
but name, though hat
</code></pre></div></div>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[在循环神经网络（Recurrent Neural Network）简介中我们了解了什么是 RNN，本文用 TensorFlow 实现一个超级简单的字符预测模型，并对代码进行详细说明，防止自己以后忘记( ╯□╰ )。 首先定义 RNN 网络： class RNN: def __init__(self, in_size, cell_size, num_layers, out_size, sess, lr=0.003): self.in_size = in_size # 输入数据大小，这里为字符数目（注意这里将大写字母转换为了小写字母，减小了字符数目，加快训练） self.cell_size = cell_size # LSTM cell 的unit数目 self.num_layers = num_layers # 因为是多层LSTM，num_layers指定了层数 self.out_size = out_size # 输出数据大小，同样为字符数目 self.sess = sess # session self.lr = lr # 学习速率 # 储存上一次的state, 测试的时候用 self.last_state = np.zeros([num_layers * 2 * cell_size]) # 输入数据，(batch, time_step, in_size) self.inputs = tf.placeholder(tf.float32, shape=[None, None, in_size]) self.lstm_cells = [ rnn.BasicLSTMCell(cell_size, state_is_tuple=False) for _ in range(num_layers) ] self.lstm = rnn.MultiRNNCell(self.lstm_cells, state_is_tuple=False) self.init_state = tf.placeholder(tf.float32, [None, num_layers * 2 * cell_size]) # 定义 recurrent neural network outputs, self.new_state = tf.nn.dynamic_rnn( self.lstm, self.inputs, initial_state=self.init_state, dtype=tf.float32) # 最后使用全连接层来计算loss，w b 为全连接层参数 w = tf.Variable(tf.random_normal([cell_size, out_size], stddev=0.01)) b = tf.Variable(tf.random_normal([out_size], stddev=0.01)) reshaped_outputs = tf.matmul(tf.reshape(outputs, [-1, cell_size]), w) + b # 将输出转换为概率 fc = tf.nn.softmax(reshaped_outputs) shape = tf.shape(outputs) self.final_outputs = tf.reshape(fc, [shape[0], shape[1], out_size]) # labels, (batch, time_step, in_size) self.targets = tf.placeholder(tf.float32, [None, None, out_size]) # loss self.cost = tf.reduce_mean( tf.nn.softmax_cross_entropy_with_logits( logits=reshaped_outputs, labels=tf.reshape(self.targets, [-1, out_size]))) self.optimizer = tf.train.RMSPropOptimizer(lr).minimize(self.cost) def train(self, inputs, targets): # init_state,(batch_size, num_layers * 2 * self.cell_size) init_state = np.zeros( [inputs.shape[0], 
self.num_layers * 2 * self.cell_size]) _, loss = self.sess.run( [self.optimizer, self.cost], feed_dict={ self.inputs: inputs, self.targets: targets, self.init_state: init_state }) return loss def get_next_char_pro(self, x, init=False): """根据输入字符x, 预测下一个字符并返回，x 的 shape 为(1, in_size), 即为单个字符的 one-hot 形式 """ if init: init_state = np.zeros([self.num_layers * 2 * self.cell_size]) else: init_state = self.last_state out, next_state = self.sess.run( [self.final_outputs, self.new_state], feed_dict={self.inputs: [x], self.init_state: [init_state]}) # 将当前state储存起来，下次预测的时候使用，这样就保留了上下文信息 self.last_state = next_state[0] return out[0][0] 处理数据： def generate_one_hot_data(data, vocabulary): data_ = np.zeros([len(data), len(vocabulary)]) count = 0 for char in data: i = vocabulary.index(char) data_[count, i] = 1.0 count += 1 return data_ path = 'data/data.txt' with open(path, 'r') as f: data = f.read() data = data.lower() # 全部小写，降低复杂度 vocabulary = list(set(data)) # 字符列表 one_hot_data = generate_one_hot_data(data, vocabulary) # one_hot_data，shape 为(len(data), len(vocabulary)) 定义参数： in_size = out_size = len(vocabulary) cell_size = 128 num_layers = 2 batch_size = 64 time_steps = 128 NUM_TRAIN_BATCHES = 5000 config = tf.ConfigProto() config.gpu_options.allow_growth = True sess = tf.InteractiveSession(config=config) model = RNN(in_size=in_size, cell_size=cell_size, num_layers=num_layers, out_size=out_size, sess=sess) sess.run(tf.global_variables_initializer()) inputs = np.zeros([batch_size, time_steps, in_size]) targets = np.zeros([batch_size, time_steps, out_size]) possible_batch_start_ids = range(len(data) - time_steps - 1) 生成训练数据： for i in range(NUM_TRAIN_BATCHES): batch_start_ids = random.sample(possible_batch_start_ids, batch_size) for j in range(time_steps): inputs_ids = [k + j for k in batch_start_ids] targets_ids = [k + j + 1 for k in batch_start_ids] inputs[:, j, :] = one_hot_data[inputs_ids, :] targets[:, j, :] = one_hot_data[targets_ids, :] loss = model.train(inputs, targets) if 
i % 100 == 0: print('loss: {:.5f} of batch {}'.format(loss, i)) 测试： # 预测一个以'we ' 开头的句子 TEST_PREFIX = 'we ' for i in range(len(TEST_PREFIX)): out = model.get_next_char_pro( generate_one_hot_data(TEST_PREFIX[i], vocabulary), i == 0) gen_str = TEST_PREFIX for i in range(200): index = np.random.choice(range(len(vocabulary)), p=out) pred = vocabulary[index] # print(index, len(out), len(vocabulary)) gen_str += pred out = model.get_next_char_pro(generate_one_hot_data(pred, vocabulary)) print(gen_str) 训练结果： loss: 3.12114 of batch 100 loss: 3.06464 of batch 200 loss: 2.49030 of batch 300 loss: 2.20478 of batch 400 loss: 2.02495 of batch 500 loss: 1.88005 of batch 600 loss: 1.74067 of batch 700 loss: 1.70199 of batch 800 loss: 1.69426 of batch 900 loss: 1.54315 of batch 1000 loss: 1.55390 of batch 1100 loss: 1.48157 of batch 1200 loss: 1.49068 of batch 1300 loss: 1.48002 of batch 1400 loss: 1.44718 of batch 1500 loss: 1.44463 of batch 1600 loss: 1.44997 of batch 1700 loss: 1.41875 of batch 1800 loss: 1.39939 of batch 1900 loss: 1.40289 of batch 2000 loss: 1.36561 of batch 2100 loss: 1.37862 of batch 2200 loss: 1.39195 of batch 2300 loss: 1.36577 of batch 2400 loss: 1.33385 of batch 2500 loss: 1.36705 of batch 2600 loss: 1.28989 of batch 2700 loss: 1.35604 of batch 2800 loss: 1.30150 of batch 2900 loss: 1.30576 of batch 3000 loss: 1.30832 of batch 3100 loss: 1.28968 of batch 3200 loss: 1.28770 of batch 3300 loss: 1.27296 of batch 3400 loss: 1.31154 of batch 3500 loss: 1.30212 of batch 3600 loss: 1.29709 of batch 3700 loss: 1.28448 of batch 3800 loss: 1.28766 of batch 3900 loss: 1.25970 of batch 4000 loss: 1.27008 of batch 4100 loss: 1.29763 of batch 4200 loss: 1.25666 of batch 4300 loss: 1.29813 of batch 4400 loss: 1.26807 of batch 4500 loss: 1.23903 of batch 4600 loss: 1.21010 of batch 4700 loss: 1.27084 of batch 4800 loss: 1.27161 of batch 4900 loss: 1.25789 of batch 5000 loss: 1.22986 of batch 5100 loss: 1.24404 of batch 5200 loss: 1.27089 of batch 5300 loss: 1.23036 of 
batch 5400 loss: 1.25348 of batch 5500 loss: 1.23626 of batch 5600 loss: 1.21493 of batch 5700 loss: 1.20419 of batch 5800 loss: 1.23771 of batch 5900 loss: 1.20754 of batch 6000 loss: 1.23489 of batch 6100 loss: 1.20233 of batch 6200 loss: 1.20366 of batch 6300 loss: 1.23586 of batch 6400 loss: 1.21687 of batch 6500 loss: 1.19479 of batch 6600 loss: 1.21297 of batch 6700 loss: 1.23598 of batch 6800 loss: 1.19476 of batch 6900 loss: 1.21584 of batch 7000 loss: 1.22816 of batch 7100 loss: 1.19449 of batch 7200 loss: 1.19346 of batch 7300 loss: 1.23466 of batch 7400 loss: 1.18541 of batch 7500 loss: 1.19469 of batch 7600 loss: 1.21069 of batch 7700 loss: 1.19641 of batch 7800 loss: 1.15550 of batch 7900 loss: 1.19861 of batch 8000 loss: 1.22582 of batch 8100 loss: 1.19766 of batch 8200 loss: 1.19041 of batch 8300 loss: 1.15410 of batch 8400 loss: 1.13109 of batch 8500 loss: 1.16434 of batch 8600 loss: 1.19457 of batch 8700 loss: 1.18558 of batch 8800 loss: 1.18043 of batch 8900 loss: 1.17171 of batch 9000 loss: 1.17663 of batch 9100 loss: 1.14107 of batch 9200 loss: 1.20001 of batch 9300 loss: 1.16926 of batch 9400 loss: 1.14761 of batch 9500 loss: 1.15305 of batch 9600 loss: 1.20601 of batch 9700 loss: 1.16141 of batch 9800 loss: 1.15704 of batch 9900 一些输出： prefix:she sheated here do in the weary blood unless he single thou mayst break your ancients dren lads, then i in to thy fair age. so i say the nursed with mine eyes from my sleep. clarence: how is it pale and mo prefix:the the tears of mine. hark! they may providy to the extremement, make an office of day, like to be so run out. dost than the heaven and your pardon did forght the crown: all shall be too close hather change prefix:i in heavy part: in your comforts of our state exposed for in the kin a phalteres times disprison. 
isabella: dispatch and great to do a plance: and now they do repent us to be pitteth caser which in my prefix:i it confined him to his necessities, but to an e this young belonged,' as it susminine of france of vices of folly, when though 'i trust up me not, our reward, sir, you shall be absent disposed. warwic prefix:me me; for will is angelo, take it,--and thou knows he did: one catesby! pronirerd: uppecured for me hencefully, you make one flight, making down thee thought in who thy by the world, but name, though hat]]></summary></entry><entry xml:lang="zh-CN"><title type="html">The Broadcasting Mechanism in TensorFlow and NumPy</title><link href="http://blog.lufficc.com/tensorflow-and-numpy-broadcasting/" rel="alternate" type="text/html" title="TensorFlow 和 NumPy 的 Broadcasting 机制" /><published>2017-10-27T13:21:12+00:00</published><updated>2017-10-27T13:21:12+00:00</updated><id>http://blog.lufficc.com/tensorflow-and-numpy-broadcasting</id><content type="html" xml:base="http://blog.lufficc.com/tensorflow-and-numpy-broadcasting/"><![CDATA[<p>TensorFlow adopts NumPy's broadcasting mechanism to handle arithmetic between tensors of different shapes, which saves memory and improves computational efficiency.</p>

<p>NumPy array operations are usually performed element by element, which requires the two arrays to have <strong>exactly the same shape</strong>:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; a = np.array([1.0, 2.0, 3.0])
&gt;&gt;&gt; b = np.array([2.0, 2.0, 2.0])
&gt;&gt;&gt; a * b
array([ 2.,  4.,  6.])
</code></pre></div></div>
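<p>NumPy's broadcasting mechanism relaxes this restriction: when the <strong>shapes</strong> of the two arrays satisfy <strong>certain conditions</strong>, arrays of different shapes can still be combined arithmetically. The simplest case is multiplying an array by a scalar:</p>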
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; a = np.array([1.0, 2.0, 3.0])
&gt;&gt;&gt; b = 2.0
&gt;&gt;&gt; a * b
array([ 2.,  4.,  6.])
</code></pre></div></div>
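<p>The result is the same as in the first example, where <code class="language-plaintext highlighter-rouge">b</code> was an array. You can think of the scalar <code class="language-plaintext highlighter-rouge">b</code> as being stretched into an array with the same shape as <code class="language-plaintext highlighter-rouge">a</code>, in which every element is a copy of the original scalar, so the operation takes the same form as the first example and naturally gives the same result. This stretching is only <strong>conceptual</strong>, however: no copy is actually made, and the scalar value is simply reused during the computation. <strong>The second example is therefore more efficient, because it saves memory</strong>.</p>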
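<p>One way to see that nothing is copied is to build the stretched array explicitly. In the sketch below, <code class="language-plaintext highlighter-rouge">np.broadcast_to</code> is used only to make the conceptual array visible; ordinary arithmetic does not require calling it. The stride of 0 means every element reads the same memory location.</p>

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.broadcast_to(np.float64(2.0), a.shape)  # read-only view, no data copied

print(b.shape, b.strides)  # (3,) (0,)
print((a * b).tolist())    # [2.0, 4.0, 6.0]
```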

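<h2 id="broadcasting-规则">Broadcasting rules</h2>
<p>When two arrays take part in an arithmetic operation, <strong>NumPy compares their shapes element by element, starting from the trailing dimension and working forward</strong>. Two dimensions are compatible for broadcasting when one of the following holds:</p>
<ol>
  <li><strong>they are equal</strong></li>
  <li><strong>one of them is 1</strong></li>
</ol>

<p>If neither condition holds, NumPy raises a <code class="language-plaintext highlighter-rouge">ValueError</code> (older NumPy documentation words it as <code class="language-plaintext highlighter-rouge">frames are not aligned</code>; current releases report <code class="language-plaintext highlighter-rouge">operands could not be broadcast together</code>). Each dimension of the result's shape is the maximum of the two corresponding input dimensions.</p>

<p>The two arrays may also have a different number of dimensions. For example, if a <code class="language-plaintext highlighter-rouge">256x256x3</code> array stores RGB values and each color channel needs a different scale factor, we can multiply it by a one-dimensional array of shape <code class="language-plaintext highlighter-rouge">(3, )</code>. Because the comparison runs <strong>from the trailing dimension forward</strong>, <code class="language-plaintext highlighter-rouge">3 == 3</code> and the broadcasting rule is satisfied.</p>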
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Image  (3d array): 256 x 256 x 3
Scale  (1d array):             3
Result (3d array): 256 x 256 x 3
</code></pre></div></div>
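<p>The image-scaling case above can be tried directly. In this sketch the image is just an array of ones and the per-channel factors are made-up values.</p>

```python
import numpy as np

image = np.ones((256, 256, 3))
scale = np.array([0.5, 1.0, 2.0])  # hypothetical factors for the R, G and B channels

scaled = image * scale             # (256, 256, 3) * (3,)
print(scaled.shape)                # (256, 256, 3)
print(scaled[0, 0].tolist())       # [0.5, 1.0, 2.0]
```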

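<p>When one of the dimensions is <code class="language-plaintext highlighter-rouge">1</code>, it is “stretched” to the size of the other; that is, its values are “copied” (without any real copy being made) to match the other array. For example:</p>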
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5
</code></pre></div></div>

<p>More examples:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A      (2d array):  5 x 4
B      (1d array):      1
Result (2d array):  5 x 4

A      (2d array):  5 x 4
B      (1d array):      4
Result (2d array):  5 x 4

A      (3d array):  15 x 3 x 5
B      (3d array):  15 x 1 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 5
Result (3d array):  15 x 3 x 5

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 1
Result (3d array):  15 x 3 x 5
</code></pre></div></div>
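<p>The rule behind all of these examples fits in a few lines. The function below is a sketch of the rule as described in this post, not NumPy's actual implementation, and the name <code class="language-plaintext highlighter-rouge">broadcast_shape</code> is hypothetical.</p>

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if the shapes are incompatible."""
    result = []
    # Walk both shapes from the trailing axis; a missing leading axis counts as size 1.
    for dim_a, dim_b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError("cannot broadcast {} with {}".format(shape_a, shape_b))
        result.append(dim_b if dim_a == 1 else dim_a)
    return tuple(reversed(result))


print(broadcast_shape((8, 1, 6, 1), (7, 1, 5)))  # (8, 7, 6, 5)
print(broadcast_shape((15, 3, 5), (3, 1)))       # (15, 3, 5)
# Cross-check one case against NumPy itself.
print((np.empty((15, 3, 5)) + np.empty((3, 1))).shape)  # (15, 3, 5)
```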

<p>Some counterexamples (shapes that do not satisfy the broadcasting rule):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A      (1d array):  3
B      (1d array):  4 # trailing dimensions do not match

A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3 # second from last dimensions mismatched
</code></pre></div></div>
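<p>Both counterexamples can be checked in NumPy. This sketch simply attempts each addition and records the pairs that fail; the exact error text depends on the NumPy version.</p>

```python
import numpy as np

pairs = [((3,), (4,)), ((2, 1), (8, 4, 3))]
failed = []
for shape_a, shape_b in pairs:
    try:
        np.empty(shape_a) + np.empty(shape_b)
    except ValueError as err:
        failed.append((shape_a, shape_b))
        print(shape_a, shape_b, "raises ValueError:", err)
```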

<p>In practice:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; x = np.arange(4)
&gt;&gt;&gt; xx = x.reshape(4,1)
&gt;&gt;&gt; y = np.ones(5)
&gt;&gt;&gt; z = np.ones((3,4))

&gt;&gt;&gt; x.shape
(4,)

&gt;&gt;&gt; y.shape
(5,)

&gt;&gt;&gt; x + y
&lt;type 'exceptions.ValueError'&gt;: shape mismatch: objects cannot be broadcast to a single shape

&gt;&gt;&gt; xx.shape
(4, 1)

&gt;&gt;&gt; y.shape
(5,)

&gt;&gt;&gt; (xx + y).shape
(4, 5)

&gt;&gt;&gt; xx + y
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 2.,  2.,  2.,  2.,  2.],
       [ 3.,  3.,  3.,  3.,  3.],
       [ 4.,  4.,  4.,  4.,  4.]])

&gt;&gt;&gt; x.shape
(4,)

&gt;&gt;&gt; z.shape
(3, 4)

&gt;&gt;&gt; (x + z).shape
(3, 4)

&gt;&gt;&gt; x + z
array([[ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.]])
</code></pre></div></div>
<p>Another example:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; a = np.array([0.0, 10.0, 20.0, 30.0])
&gt;&gt;&gt; b = np.array([1.0, 2.0, 3.0])
&gt;&gt;&gt; a[:, np.newaxis] + b
array([[  1.,   2.,   3.],
       [ 11.,  12.,  13.],
       [ 21.,  22.,  23.],
       [ 31.,  32.,  33.]])
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">newaxis</code> index inserts a new axis into array <code class="language-plaintext highlighter-rouge">a</code>, turning it into a two-dimensional <code class="language-plaintext highlighter-rouge">4x1</code> array. Adding the <code class="language-plaintext highlighter-rouge">4x1</code> array to the <code class="language-plaintext highlighter-rouge">(3, )</code> array therefore yields a <code class="language-plaintext highlighter-rouge">4x3</code> array.</p>
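<p>As a small check of the shapes involved, the sketch below compares <code class="language-plaintext highlighter-rouge">a[:, np.newaxis]</code> with an explicit reshape and with <code class="language-plaintext highlighter-rouge">np.add.outer</code>, which computes the same table of pairwise sums.</p>

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([1.0, 2.0, 3.0])

col = a[:, np.newaxis]                               # shape (4, 1)
print(col.shape, (col + b).shape)                    # (4, 1) (4, 3)
print(np.array_equal(col, a.reshape(4, 1)))          # True
print(np.array_equal(col + b, np.add.outer(a, b)))   # True
```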

<h2 id="参考">References</h2>
<ol>
  <li>https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html</li>
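</ol>]]></content><author><name>lufficc</name></author><summary type="html"><![CDATA[TensorFlow adopts NumPy's broadcasting mechanism to handle arithmetic between tensors of different shapes, which saves memory and improves computational efficiency. NumPy array operations are usually performed element by element, which requires the two arrays to have exactly the same shape: &gt;&gt;&gt; a = np.array([1.0, 2.0, 3.0]) &gt;&gt;&gt; b = np.array([2.0, 2.0, 2.0]) &gt;&gt;&gt; a * b array([ 2., 4., 6.]) NumPy's broadcasting mechanism relaxes this restriction: when the shapes of the two arrays satisfy certain conditions, arrays of different shapes can still be combined arithmetically. The simplest case is multiplying an array by a scalar: &gt;&gt;&gt; a = np.array([1.0, 2.0, 3.0]) &gt;&gt;&gt; b = 2.0 &gt;&gt;&gt; a * b array([ 2., 4., 6.]) The result is the same as in the first example, where b was an array. You can think of the scalar b as being stretched into an array with the same shape as a, in which every element is a copy of the original scalar, so the operation takes the same form as the first example and naturally gives the same result. This stretching is only conceptual, however: no copy is actually made, and the scalar value is simply reused during the computation. The second example is therefore more efficient, because it saves memory. Broadcasting rules When two arrays take part in an arithmetic operation, NumPy compares their shapes element by element, starting from the trailing dimension and working forward. Two dimensions are compatible for broadcasting when one of the following holds: they are equal one of them is 1 If neither condition holds, NumPy raises a ValueError (older NumPy documentation words it as frames are not aligned; current releases report operands could not be broadcast together). Each dimension of the result's shape is the maximum of the two corresponding input dimensions. The two arrays may also have a different number of dimensions. For example, if a 256x256x3 array stores RGB values and each color channel needs a different scale factor, we can multiply it by a one-dimensional array of shape (3, ). Because the comparison runs from the trailing dimension forward, 3 == 3 and the broadcasting rule is satisfied. Image (3d array): 256 x 256 x 3 Scale (1d array): 3 Result (3d array): 256 x 256 x 3 When one of the dimensions is 1, it is “stretched” to the size of the other; that is, its values are “copied” (without any real copy being made) to match the other array. For example: A (4d array): 8 x 1 x 6 x 1 B (3d array): 7 x 1 x 5 Result (4d array): 8 x 7 x 6 x 5 More examples: A (2d array): 5 x 4 B (1d array): 1 Result (2d array): 5 x 4 A (2d array): 5 x 4 B (1d array): 4 Result (2d array): 5 x 4 A (3d array): 15 x 3 x 5 B (3d array): 15 x 1 x 5 Result (3d array): 15 x 3 x 5 A (3d array): 15 x 3 x 5 B (2d array): 3 x 5 Result (3d array): 15 x 3 x 5 A (3d array): 15 x 3 x 5 B (2d array): 3 x 1 Result (3d array): 15 x 3 x 5 Some counterexamples (shapes that do not satisfy the broadcasting rule): A (1d array): 3 B (1d array): 4 # trailing dimensions do not match A (2d array): 2 x 1 B (3d array): 8 x 4 x 3 # second from last dimensions mismatched In practice: &gt;&gt;&gt; x = np.arange(4) &gt;&gt;&gt; xx = x.reshape(4,1) &gt;&gt;&gt; y = np.ones(5) &gt;&gt;&gt; z = np.ones((3,4)) &gt;&gt;&gt; x.shape (4,) &gt;&gt;&gt; y.shape (5,) &gt;&gt;&gt; x + y &lt;type 
'exceptions.ValueError'&gt;: shape mismatch: objects cannot be broadcast to a single shape &gt;&gt;&gt; xx.shape (4, 1) &gt;&gt;&gt; y.shape (5,) &gt;&gt;&gt; (xx + y).shape (4, 5) &gt;&gt;&gt; xx + y array([[ 1., 1., 1., 1., 1.], [ 2., 2., 2., 2., 2.], [ 3., 3., 3., 3., 3.], [ 4., 4., 4., 4., 4.]]) &gt;&gt;&gt; x.shape (4,) &gt;&gt;&gt; z.shape (3, 4) &gt;&gt;&gt; (x + z).shape (3, 4) &gt;&gt;&gt; x + z array([[ 1., 2., 3., 4.], [ 1., 2., 3., 4.], [ 1., 2., 3., 4.]]) Another example: &gt;&gt;&gt; a = np.array([0.0, 10.0, 20.0, 30.0]) &gt;&gt;&gt; b = np.array([1.0, 2.0, 3.0]) &gt;&gt;&gt; a[:, np.newaxis] + b array([[ 1., 2., 3.], [ 11., 12., 13.], [ 21., 22., 23.], [ 31., 32., 33.]]) The newaxis index inserts a new axis into array a, turning it into a two-dimensional 4x1 array. Adding the 4x1 array to the (3, ) array therefore yields a 4x3 array. References https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html]]></summary></entry><entry xml:lang="zh-CN"><title type="html">Frequently Used Code Snippets</title><link href="http://blog.lufficc.com/code-snippets/" rel="alternate" type="text/html" title="常用代码片段" /><published>2017-10-10T12:00:00+00:00</published><updated>2017-10-10T12:00:00+00:00</updated><id>http://blog.lufficc.com/code-snippets</id><content type="html" xml:base="http://blog.lufficc.com/code-snippets/"><![CDATA[<h3 id="ubuntu-安装最新-nginx">Install the latest Nginx on Ubuntu:</h3>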
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo</span> <span class="nt">-s</span>
<span class="nv">nginx</span><span class="o">=</span>stable <span class="c"># use nginx=development for latest development version</span>
add-apt-repository ppa:nginx/<span class="nv">$nginx</span>
apt-get update
apt-get <span class="nb">install </span>nginx
</code></pre></div></div>

<h3 id="查看-mysql-各数据表大小">Check the size of each MySQL database:</h3>
<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="n">table_schema</span>                                        <span class="nv">"DB Name"</span><span class="p">,</span> 
   <span class="n">Round</span><span class="p">(</span><span class="k">Sum</span><span class="p">(</span><span class="n">data_length</span> <span class="o">+</span> <span class="n">index_length</span><span class="p">)</span> <span class="o">/</span> <span class="mi">1024</span> <span class="o">/</span> <span class="mi">1024</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="nv">"DB Size in MB"</span> 
<span class="k">FROM</span>   <span class="n">information_schema</span><span class="p">.</span><span class="n">tables</span> 
<span class="k">GROUP</span>  <span class="k">BY</span> <span class="n">table_schema</span><span class="p">;</span> 
</code></pre></div></div>

<h3 id="安装55">Install Shadowsocks:</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>apt-get update
<span class="nb">sudo </span>apt-get <span class="nb">install </span>python3-pip
<span class="nb">sudo </span>pip3 <span class="nb">install </span>shadowsocks
<span class="nb">sudo mkdir</span> /var/ss
<span class="nb">sudo </span>vim /var/ss/ss.json
ssserver <span class="nt">-c</span> /var/ss/ss.json <span class="nt">-d</span> start
</code></pre></div></div>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
    </span><span class="nl">"server"</span><span class="p">:</span><span class="s2">"0.0.0.0"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"server_port"</span><span class="p">:</span><span class="mi">8388</span><span class="p">,</span><span class="w">
    </span><span class="nl">"local_address"</span><span class="p">:</span><span class="w"> </span><span class="s2">"127.0.0.1"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"local_port"</span><span class="p">:</span><span class="mi">1080</span><span class="p">,</span><span class="w">
    </span><span class="nl">"password"</span><span class="p">:</span><span class="s2">"mypassword"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"timeout"</span><span class="p">:</span><span class="mi">300</span><span class="p">,</span><span class="w">
    </span><span class="nl">"method"</span><span class="p">:</span><span class="s2">"aes-256-cfb"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"fast_open"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<h3 id="useful">Useful</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># NVIDIA GPUs </span>
nvidia-smi

<span class="c"># folder size. -h: human readable, -s: for summary</span>
<span class="nb">du</span> <span class="nt">-hs</span> /path/to/directory
<span class="c"># show the CUDA version</span>
<span class="nb">cat</span> /usr/local/cuda/version.txt
<span class="c"># show the cuDNN version</span>
<span class="nb">cat</span> /usr/local/cuda/include/cudnn.h | <span class="nb">grep </span>CUDNN_MAJOR <span class="nt">-A</span> 2
</code></pre></div></div>

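<h3 id="解压">Archive and extract</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Pack myfolder into an uncompressed archive (-c: create, -v: verbose, -f: archive file)</span>
<span class="nb">tar</span> <span class="nt">-cvf</span> myfolder.tar myfolder
<span class="c"># Extract a .tar into /target/directory (-x: extract, -C: change to this directory first)</span>
<span class="nb">tar</span> <span class="nt">-xf</span> archive.tar <span class="nt">-C</span> /target/directory
<span class="c"># Extract a gzip-compressed archive (-z: filter through gzip)</span>
<span class="nb">tar</span> <span class="nt">-xvzf</span> archive.tar.gz <span class="nt">-C</span> /target/directory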
</code></pre></div></div>
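
<p>The tar commands above can be tried end to end with a small round trip. This is only a sketch: it assumes a tar that supports <code class="language-plaintext highlighter-rouge">-C</code> (GNU and BSD tar both do), and the temp directory and file names are hypothetical.</p>

```shell
# Round-trip sketch: pack a directory, then unpack it somewhere else.
# All paths are hypothetical and live under a throwaway temp directory.
set -e
work=$(mktemp -d)
mkdir -p "$work/myfolder" "$work/target"
touch "$work/myfolder/a.txt" "$work/myfolder/b.txt"

# -C switches directory before packing, so the archive stores "myfolder/..."
# rather than the full temp path.
tar -cf "$work/myfolder.tar" -C "$work" myfolder

# List the archive contents without extracting anything.
tar -tf "$work/myfolder.tar"

# Extract into the target directory and look at the result.
tar -xf "$work/myfolder.tar" -C "$work/target"
ls "$work/target/myfolder"
```

<p>Listing with <code class="language-plaintext highlighter-rouge">-t</code> before extracting is a cheap way to check whether an archive unpacks into a single top-level folder or scatters files into the target directory.</p>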

<h3 id="文件">Files</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Count the files under the current directory (recursively)</span>
find <span class="nb">.</span> <span class="nt">-type</span> f | <span class="nb">wc</span> <span class="nt">-l</span>
<span class="c"># Count the lines in a file</span>
<span class="nb">wc</span> <span class="nt">-l</span> a.txt
<span class="c"># Count the words in a file</span>
<span class="nb">wc</span> <span class="nt">-w</span> a.txt
<span class="c"># Print the whole file with a line number in front of each line</span>
<span class="nb">cat</span> <span class="nt">-n</span> a.txt
<span class="c"># Show free disk space on local filesystems (-h: human readable, -l: local only)</span>
<span class="nb">df</span> <span class="nt">-hl</span>
<span class="c"># Show the total size of a directory</span>
<span class="nb">du</span> <span class="nt">-sh</span> /path/to/directory
</code></pre></div></div>
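
<p>To see what the counting commands above report, here is a small sketch run against a throwaway directory. The directory layout and file names are made up for illustration.</p>

```shell
# Sketch with hypothetical file names in a throwaway temp directory.
set -e
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'one two\nthree\n' | tee "$demo/a.txt"
touch "$demo/sub/b.txt"

# find recurses into sub/, so files in subdirectories are counted too.
file_count=$(find "$demo" -type f | wc -l)

# Piping the file into wc makes it print only the number, without the file name.
line_count=$(cat "$demo/a.txt" | wc -l)
word_count=$(cat "$demo/a.txt" | wc -w)

echo "files=$file_count lines=$line_count words=$word_count"
```

<p>If only the files directly inside a directory should be counted, GNU and BSD find accept <code class="language-plaintext highlighter-rouge">-maxdepth 1</code> to stop the recursion.</p>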

<h3 id="ssh">SSH</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Forward local port 16006 to 127.0.0.1:6006 on the remote server &lt;remote-ip&gt;.</span>
<span class="c"># Opening localhost:16006 locally then actually reaches 127.0.0.1:6006 on the remote server.</span>
ssh <span class="nt">-L</span> 16006:127.0.0.1:6006 &lt;username&gt;@&lt;remote-ip&gt; <span class="nt">-p</span> &lt;port&gt;

<span class="c"># Same forward, but run in the background without opening a remote shell</span>
ssh <span class="nt">-N</span> <span class="nt">-f</span> <span class="nt">-L</span> localhost:16006:localhost:6006 &lt;user@remote&gt;
<span class="c"># -N : do not execute a remote command</span>
<span class="c"># -f : put ssh in the background</span>
<span class="c"># -L &lt;machine1&gt;:&lt;portA&gt;:&lt;machine2&gt;:&lt;portB&gt; : listen on &lt;machine1&gt;:&lt;portA&gt; (local side) and forward connections to &lt;machine2&gt;:&lt;portB&gt; (resolved from the remote side)</span>
</code></pre></div></div>

<h3 id="copy--sync">Copy &amp; Sync</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># -a (archive mode) preserves permissions, timestamps, symlinks, etc.; -h prints human-readable sizes.</span>
<span class="c"># --info=progress2 shows overall transfer progress instead of per-file progress.</span>
rsync <span class="nt">-ah</span> <span class="nt">--info</span><span class="o">=</span>progress2 <span class="nb">source </span>destination
</code></pre></div></div>

<h3 id="安装-pyenv">Install pyenv</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>apt-get <span class="nb">install</span> <span class="nt">--no-install-recommends</span> make build-essential libssl-dev zlib1g-dev libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev
git clone https://github.com/pyenv/pyenv.git ~/.pyenv
<span class="nb">echo</span> <span class="s1">'export PYENV_ROOT="$HOME/.pyenv"'</span> <span class="o">&gt;&gt;</span> ~/.zshrc
<span class="nb">echo</span> <span class="s1">'export PATH="$PYENV_ROOT/bin:$PATH"'</span> <span class="o">&gt;&gt;</span> ~/.zshrc
<span class="nb">echo</span> <span class="nt">-e</span> <span class="s1">'if command -v pyenv 1&gt;/dev/null 2&gt;&amp;1; then\n  eval "$(pyenv init -)"\nfi'</span> <span class="o">&gt;&gt;</span> ~/.zshrc
<span class="nb">exec</span> <span class="s2">"</span><span class="nv">$SHELL</span><span class="s2">"</span>
pyenv <span class="nb">install </span>3.6.5
pyenv global 3.6.5
<span class="nb">exec</span> <span class="s2">"</span><span class="nv">$SHELL</span><span class="s2">"</span>
</code></pre></div></div>

<h3 id="安装-cocoapi">Install cocoapi</h3>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/cocodataset/cocoapi.git
<span class="nb">cd </span>cocoapi/PythonAPI
python setup.py build_ext <span class="nb">install</span>
</code></pre></div></div>]]></content><author><name>lufficc</name></author></entry></feed>