Why is this Julia snippet so much slower than the equivalent Python code? (with dictionaries)
I have the following code in Python Jupyter:
n = 10**7
d = {}
%timeit for i in range(n): d[i] = i
%timeit for i in range(n): _ = d[i]
%timeit d[10]
with the following timings:
763 ms ± 19.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
692 ms ± 3.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
39.5 ns ± 0.186 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
and this in Julia:
using BenchmarkTools
d = Dict{Int64, Int64}()
n = 10^7
r = 1:n
@btime begin
    for i in r
        d[i] = i
    end
end
@btime begin
    for i in r
        _ = d[i]
    end
end
@btime d[10]
with timings:
2.951 s (29999490 allocations: 610.34 MiB)
3.327 s (39998979 allocations: 762.92 MiB)
20.163 ns (0 allocations: 0 bytes)
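What I can't quite understand is why Julia appears to be much slower at assigning dictionary values and retrieving them in a loop (the first two tests), while at the same time being much faster at single-key retrieval (the last test). It appears to be about 4 times slower in a loop, but twice as fast when not in a loop. I'm new to Julia, so I'm not sure whether I'm doing something suboptimal or whether this is expected.
Answer
Since you are benchmarking at top-level scope with @btime, you have to interpolate the variables with $, so the way to benchmark your code is: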
julia> using BenchmarkTools
julia> d = Dict{Int64, Int64}()
Dict{Int64, Int64}()
julia> n = 10^7
10000000
julia> r = 1:n
1:10000000
julia> @btime begin
           for i in $r
               $d[i] = i
           end
       end
842.891 ms (0 allocations: 0 bytes)
julia> @btime begin
           for i in $r
               _ = $d[i]
           end
       end
618.808 ms (0 allocations: 0 bytes)
julia> @btime $d[10]
6.300 ns (0 allocations: 0 bytes)
10
The timings for Python 3 on the same machine, run in a Jupyter Notebook, are:
n = int(10.0**7)
d = {}
%timeit for i in range(n): d[i] = i
913 ms ± 87.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit for i in range(n): _ = d[i]
816 ms ± 92.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit d[10]
50.2 ns ± 2.97 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
However, for the first operation I assume you rather wanted to benchmark this:
julia> function f(n)
           d = Dict{Int64, Int64}()
           for i in 1:n
               d[i] = i
           end
       end
f (generic function with 1 method)
julia> @btime f($n)
1.069 s (72 allocations: 541.17 MiB)
against this:
def f(n):
    d = {}
    for i in range(n):
        d[i] = i
%timeit f(n)
1.18 s ± 65.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
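Outside Jupyter the %timeit magic is not available, so a comparable measurement can be sketched with the standard-library timeit module. The names fill and best_time below are hypothetical helpers, not part of the original post, and n is reduced so the sketch runs quickly. Note that this reports the fastest run rather than the mean ± std. dev. that %timeit prints.

```python
import timeit


def fill(n):
    """Build a fresh dict mapping i -> i for i in range(n)."""
    d = {}
    for i in range(n):
        d[i] = i
    return d


def best_time(func, *args, repeat=3, number=1):
    """Return the fastest wall-clock time in seconds over `repeat` runs."""
    timings = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(timings) / number


n = 10**5  # assumption: smaller than the 10**7 used in the post, to keep this quick
elapsed = best_time(fill, n)
print(f"fill({n}): {elapsed * 1e3:.2f} ms")
```

Wrapping the loop in a function, as in f above, means each timed run starts from an empty dictionary, so the measurement includes the cost of growing it.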
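It should also be noted that using one specific value of n might be misleading, because Julia and Python are not guaranteed to resize their collections at the same moments or to the same new sizes (to store a dictionary you normally allocate more memory than needed in order to avoid hash collisions, so the specific value of n that is tested can matter here).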
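The resizing behaviour mentioned above can be observed on the Python side with a rough sketch like the following. It assumes CPython, where sys.getsizeof on a dict reflects the currently allocated table; the helper name resize_points is hypothetical, and the exact sizes and thresholds are an implementation detail that varies between versions.

```python
import sys


def resize_points(n):
    """Return (length, size_in_bytes) pairs recorded whenever the dict's reported size changes."""
    d = {}
    points = [(0, sys.getsizeof(d))]
    for i in range(n):
        d[i] = i
        size = sys.getsizeof(d)
        if size != points[-1][1]:
            # The reported size changed, so the dict reallocated its storage here.
            points.append((len(d), size))
    return points


points = resize_points(10_000)
for length, size in points:
    print(f"{length:>6} entries -> {size:>8} bytes")
```

The dictionary grows in a small number of jumps rather than on every insertion, so a value of n just past a jump pays for one more reallocation than a value just before it. Julia's Dict has its own growth policy, which is why the two languages need not hit these jumps at the same n.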
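EDIT
Note that if I declare the global variables as const, everything is fast, because the compiler can then optimize the code (it knows that the type of the value bound to the global variable cannot change); therefore using $ is not needed: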
julia> using BenchmarkTools
julia> const d = Dict{Int64, Int64}()
Dict{Int64, Int64}()
julia> const n = 10^7
10000000
julia> const r = 1:n
1:10000000
julia> @btime begin
           for i in r
               d[i] = i
           end
       end
895.788 ms (0 allocations: 0 bytes)
julia> @btime begin
           for i in $r
               _ = $d[i]
           end
       end
582.214 ms (0 allocations: 0 bytes)
julia> @btime $d[10]
6.800 ns (0 allocations: 0 bytes)
10
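If you are curious what the benefit of having native support for threading is, here is a simple benchmark (this functionality is part of the language):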
julia> Threads.nthreads()
4
julia> @btime begin
           Threads.@threads for i in $r
               _ = $d[i]
           end
       end
215.461 ms (23 allocations: 2.17 KiB)
Without `$`, as you can see in your benchmarks, you get tens of millions of allocations. The reason is (among other things, but let me point out a single one) that Julia supports multi-threading natively. Therefore, if you use a non-constant global variable, the compiler cannot be sure that the variable will not be rebound to a value of a different type by another thread while the code is executing. Thus it does more work than needed, as it must be ready for the variable to be bound to `Any`thing. When you use `$`, the value is interpolated into the executed code block, so Julia knows its type will not change.