# Deep Learning from Scratch to GPU, Part 2: Bias and Activation Function

In its current state, the network combines all layers into a single linear transformation. We can introduce basic decision-making capability by adding a cutoff to the output of each neuron: when the weighted sum of its inputs is below that threshold, the output is zero, and when it is above, the output is one.

\begin{equation} output = \left\{ \begin{array}{ll} 0 & W\mathbf{x} \leq threshold \\ 1 & W\mathbf{x} > threshold \\ \end{array} \right. \end{equation}

Since we keep the current outputs in a (potentially) huge vector, it would be inconvenient to write scalar-based logic for that. I prefer to use a vectorized function, or to create one if none of the existing functions does exactly what we need.

Neanderthal does not have this exact cutoff function, but we can create one: take the element-wise maximum of the threshold and the signal, subtract the threshold from it, and map the signum function over the result. There are simpler ways to compute this, but I wanted to use the existing functions and do the computation in-place. It is of purely educational value anyway; we will soon see that there are better things to use for transforming the output than the vanilla step function.
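The snippets below assume that the relevant functions are already referred, roughly along these lines; I am restating the requires here for convenience, so treat the exact layout as an assumption rather than gospel:

```clojure
(require '[uncomplicate.neanderthal
           [native :refer [dv dge]]
           [core :refer [mv mv! axpy! copy]]
           [vect-math :refer [fmax!]]
           [math :refer [signum]]]
         '[uncomplicate.fluokitten.core :refer [fmap!]])
```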

```clojure
;; max(threshold, x) - threshold is 0 when x <= threshold and positive
;; otherwise, so mapping signum over it yields the 0/1 step.
(defn step! [threshold x]
  (fmap! signum (axpy! -1.0 threshold (fmax! threshold x x))))

(let [threshold (dv [1 2 3])
      x (dv [0 2 7])]
  (step! threshold x))
```

```
#RealBlockVector[double, n:3, offset: 0, stride:1]
[   0.00    0.00    1.00 ]
```
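As an aside, if in-place mutation is not desired, a pure variant can work on a copy of the input and leave both arguments intact. This is just a sketch assuming Neanderthal's `copy`; `step` is a name introduced here for illustration, not something we will use later:

```clojure
;; Non-destructive sketch: fmax! writes the element-wise maximum
;; into the copy, so neither threshold nor x is mutated.
(defn step [threshold x]
  (fmap! signum (axpy! -1.0 threshold (fmax! threshold x (copy x)))))
```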

I'm going to show you a few steps in the evolution of the code, so I will reuse the weights and `x`. To simplify the example, we will use global `def`s and not care about properly releasing the memory. That does not matter in a REPL session, but do not forget to do it in real code (a sketch follows the definitions below). Continuing the example from Part 1:

```clojure
(def x (dv 0.3 0.9))
(def w1 (dge 4 2 [0.3 0.6
                  0.1 2.0
                  0.9 3.7
                  0.0 1.0]
             {:layout :row}))
(def threshold (dv 0.7 0.2 1.1 2))
```
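As promised, here is a minimal sketch of how releasing might look in real code, assuming `with-release` from `uncomplicate.commons.core`:

```clojure
(require '[uncomplicate.commons.core :refer [with-release]])

;; with-release frees the native memory of the bound structures
;; deterministically when the body exits, even on exceptions.
(with-release [x (dv 0.3 0.9)
               w1 (dge 4 2 [0.3 0.6 0.1 2.0 0.9 3.7 0.0 1.0] {:layout :row})
               threshold (dv 0.7 0.2 1.1 2)
               y (mv w1 x)]
  (println (step! threshold y)))
```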

Since we do not care about extra instances at the moment, we'll use the pure `mv` function instead of `mv!`, for convenience. `mv` creates the resulting vector `y`, instead of mutating one that has to be provided as an argument.

```clojure
(step! threshold (mv w1 x))
```

```
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.00    1.00    1.00    0.00 ]
```
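For comparison, here is how the destructive variant might look; a sketch, where `y` is a hypothetical pre-allocated result vector:

```clojure
;; mv! overwrites y with W*x instead of allocating a fresh vector,
;; which pays off when the same computation runs many times.
(let [y (dv 4)]
  (step! threshold (mv! w1 x y)))
```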

The bias is simply the threshold moved to the left side of the equation:

\begin{equation} output = \left\{ \begin{array}{ll} 0 & W\mathbf{x} - bias \leq 0 \\ 1 & W\mathbf{x} - bias > 0 \\ \end{array} \right. \end{equation}

```clojure
(def bias (dv 0.7 0.2 1.1 2))
(def zero (dv 4))

(step! zero (axpy! -1.0 bias (mv w1 x)))
```

```
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.00    1.00    1.00    0.00 ]
```

Remember that the bias is the same as the threshold, so there is no need for the extra zero vector.

```clojure
(step! bias (mv w1 x))
```

```
#RealBlockVector[double, n:4, offset: 0, stride:1]
[   0.00    1.00    1.00    0.00 ]
```
