Using the @generated macro for symbolic gradients in Julia
For performance reasons, I need gradients and Hessians that evaluate as fast as user-defined functions (the ForwardDiff library, for example, makes my code significantly slower). So I tried metaprogramming with the @generated macro, testing it on a simple function:
using Calculus
hand_defined_derivative(x) = 2x - sin(x)
symbolic_primal = :( x^2 + cos(x) )
symbolic_derivative = differentiate(symbolic_primal,:x)
@generated functional_derivative(x) = symbolic_derivative
This does exactly what I want:
rand_x = rand(10000);
exact_values = hand_defined_derivative.(rand_x)
test_values = functional_derivative.(rand_x)
isequal(exact_values,test_values) # >> true
@btime hand_defined_derivative.(rand_x); # >> 73.358 μs (5 allocations: 78.27 KiB)
@btime functional_derivative.(rand_x); # >> 73.456 μs (5 allocations: 78.27 KiB)
I now need to generalize this to functions with more arguments. The obvious extension is:
symbolic_primal = :( x^2 + cos(x) + y^2 )
symbolic_gradient = differentiate(symbolic_primal,[:x,:y])
symbolic_gradient comes out as expected (just as in the one-dimensional case), but the @generated macro does not seem to cope with multiple dimensions:
@generated functional_gradient(x,y) = symbolic_gradient
functional_gradient(1.0,1.0)
>> 2-element Array{Any,1}:
:(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
:(2 * 1 * y ^ (2 - 1))
That is, it does not turn the symbolic expressions into a generated function. Is there a simple way to fix this?
PS: I know I could define the derivative with respect to each argument as a one-dimensional function and bundle them together to form the gradient (this is what I'm currently doing; see the sketch just below), but I believe there must be a better way.
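For concreteness, roughly what that workaround looks like (partial_x, partial_y and bundled_gradient are illustrative names, not part of any library):

# My current workaround, sketched: one @generated function per partial
# derivative, bundled into a tuple by a plain wrapper function.
dfdx = differentiate(symbolic_primal, :x)
dfdy = differentiate(symbolic_primal, :y)
@generated partial_x(x, y) = dfdx
@generated partial_y(x, y) = dfdy
bundled_gradient(x, y) = (partial_x(x, y), partial_y(x, y))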
Answer
First, I don't think you need @generated here: this is a "simple" case of code generation, for which I find @eval both simpler and less surprising.
So the 1D case can be rewritten like this:
julia> using Calculus
julia> symbolic_primal = :( x^2 + cos(x) )
:(x ^ 2 + cos(x))
julia> symbolic_derivative = differentiate(symbolic_primal,:x)
:(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
julia> hand_defined_derivative(x) = 2x - sin(x)
hand_defined_derivative (generic function with 1 method)
# Let's check first what code we'll be evaluating
# (`quote` returns the unevaluated expression passed to it)
julia> quote
functional_derivative(x) = $symbolic_derivative
end
quote
functional_derivative(x) = begin
2 * 1 * x ^ (2 - 1) + 1 * -(sin(x))
end
end
# Looks OK => let's evaluate it now
# (since `@eval` is macro, its argument will be left unevaluated
# => no `quote` here)
julia> @eval begin
functional_derivative(x) = $symbolic_derivative
end
functional_derivative (generic function with 1 method)
julia> rand_x = rand(10000);
julia> exact_values = hand_defined_derivative.(rand_x);
julia> test_values = functional_derivative.(rand_x);
julia> @assert isequal(exact_values,test_values)
# Don't forget to interpolate array arguments when using `BenchmarkTools`
julia> using BenchmarkTools
julia> @btime hand_defined_derivative.($rand_x);
104.259 μs (2 allocations: 78.20 KiB)
julia> @btime functional_derivative.($rand_x);
104.537 μs (2 allocations: 78.20 KiB)
Now, the 2D case doesn't work because the output of differentiate is an array of expressions (one expression per component), which you need to transform into a single expression that builds an array (or rather a tuple, for performance) of the components. This is symbolic_gradient_expr in the example below:
julia> symbolic_primal = :( x^2 + cos(x) + y^2 )
:(x ^ 2 + cos(x) + y ^ 2)
julia> hand_defined_gradient(x, y) = (2x - sin(x), 2y)
hand_defined_gradient (generic function with 1 method)
# This is a vector of expressions
julia> symbolic_gradient = differentiate(symbolic_primal,[:x,:y])
2-element Array{Any,1}:
:(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
:(2 * 1 * y ^ (2 - 1))
# Wrap expressions for all components of the gradient into a single expression
# generating a tuple of them:
julia> symbolic_gradient_expr = Expr(:tuple, symbolic_gradient...)
:((2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)), 2 * 1 * y ^ (2 - 1)))
julia> @eval functional_gradient(x, y) = $symbolic_gradient_expr
functional_gradient (generic function with 1 method)
As in the 1D case, this performs the same as the hand-written version:
julia> rand_x = rand(10000); rand_y = rand(10000);
julia> exact_values = hand_defined_gradient.(rand_x, rand_y);
julia> test_values = functional_gradient.(rand_x, rand_y);
julia> @assert isequal(exact_values,test_values)
julia> @btime hand_defined_gradient.($rand_x, $rand_y);
113.182 μs (2 allocations: 156.33 KiB)
julia> @btime functional_gradient.($rand_x, $rand_y);
112.283 μs (2 allocations: 156.33 KiB)
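Incidentally, if you would rather keep @generated, the same trick applies there: the body of a generated function must return a single Expr, so build the tuple expression inside the generator instead of returning the raw array (functional_gradient2 is an illustrative name):

# Sketch: construct one tuple Expr inside the generator, so the
# returned expression gets compiled into the function body.
@generated functional_gradient2(x, y) = Expr(:tuple, symbolic_gradient...)

functional_gradient2(1.0, 1.0)  # now returns a tuple of numbers, not expressions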
- `quote ... end` is more or less the same as `:( ... )`: it builds an `Expr`, usually (as here) with other expressions interpolated into it. Instead of `@eval begin ... end` (which quotes the block for you) you could write `eval(quote ... end)`, passing an expression to the `eval` function.
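For example, a minimal sketch of that last equivalence, reusing symbolic_derivative from above:

# Same effect as `@eval begin ... end`: quote the block explicitly and
# pass the resulting expression to the `eval` function.
eval(quote
    functional_derivative(x) = $symbolic_derivative
end)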