Is there a convenient way to replicate R's concept of 'named vectors' in Raku, possibly using Mixins?

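Recent questions on StackOverflow about Mixins in Raku have piqued my interest as to whether Mixins can be used to replicate features found in other programming languages.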

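For example, in the R programming language, the elements of a vector can be given a name (i.e. an attribute), which is very convenient for data analysis. For an excellent example, see "How to Name the Values in Your Vectors in R" by Andrie de Vries and Joris Meys, who illustrate this feature using R's built-in islands dataset. Below is a more prosaic example (code run in the R-REPL):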

> #R-code
> x <- 1:4
> names(x) <- LETTERS[1:4]
> str(x)
 Named int [1:4] 1 2 3 4
 - attr(*, "names")= chr [1:4] "A" "B" "C" "D"
> x
A B C D 
1 2 3 4 
> x[1]
A 
1 
> sum(x)
[1] 10

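Below I try to replicate 'named vectors' in Raku using the same islands dataset that de Vries and Meys use. While the script below runs and (generally, see #3 below) produces the desired/expected output, I'm left with three main questions, listed at the bottom: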

#Raku-script below;

put "Read in data.";

my $islands_A = <11506,5500,16988,2968,16,184,23,280,84,73,25,43,21,82,3745,840,13,30,30,89,40,33,49,14,42,227,16,36,29,15,306,44,58,43,9390,32,13,29,6795,16,15,183,14,26,19,13,12,82>.split(","); #Area

my $islands_N = <<"Africa" "Antarctica" "Asia" "Australia" "Axel Heiberg" "Baffin" "Banks" "Borneo" "Britain" "Celebes" "Celon" "Cuba" "Devon" "Ellesmere" "Europe" "Greenland" "Hainan" "Hispaniola" "Hokkaido" "Honshu" "Iceland" "Ireland" "Java" "Kyushu" "Luzon" "Madagascar" "Melville" "Mindanao" "Moluccas" "New Britain" "New Guinea" "New Zealand (N)" "New Zealand (S)" "Newfoundland" "North America" "Novaya Zemlya" "Prince of Wales" "Sakhalin" "South America" "Southampton" "Spitsbergen" "Sumatra" "Taiwan" "Tasmania" "Tierra del Fuego" "Timor" "Vancouver" "Victoria">>; #Name

"----".say;

put "Count elements (Area): ", $islands_A.elems; #OUTPUT 48
put "Count elements (Name): ", $islands_N.elems; #OUTPUT 48

"----".say;

put "Create 'named vector' array (and output):\n";
my @islands;
my $i=0;
for (1..$islands_A.elems) { 
    @islands[$i] := $islands_A[$i] but $islands_N[$i].Str;
    $i++;
};

say "All islands (returns Area): ",     @islands;             #OUTPUT: returns 48 areas (above)
say "All islands (returns Name): ",     @islands>>.Str;       #OUTPUT: returns 48 names (above)
say "Islands--slice (returns Area): ",  @islands[0..3];       #OUTPUT: (11506 5500 16988 2968)
say "Islands--slice (returns Name): ",  @islands[0..3]>>.Str; #OUTPUT: (Africa Antarctica Asia Australia)
say "Islands--first (returns Area): ",  @islands[0];          #OUTPUT: 11506
say "Islands--first (returns Name): ",  @islands[0]>>.Str;    #OUTPUT: (Africa)

put "Islands--first (returns Name): ",  @islands[0];          #OUTPUT: Africa
put "Islands--first (returns Name): ",  @islands[0]>>.Str;    #OUTPUT: Africa
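  1. Is there a simpler way to write the Mixin loop ...$islands_A[$i] but $islands_N[$i].Str; ? Can the loop be eliminated entirely?

  2. Can a named-vector or nvec wrapper be written around put that returns (name)\n(value) in the same manner as R does, even for single elements? Might Raku's Pair type be useful here?

  3. Related to #2 above, calling put on the single element @islands[0] returns the name Africa, not the Area value 11506. [Note this doesn't happen with the call to say.] Is there any simple code that can be implemented to ensure that put always returns the (numeric) value or always returns the (Mixin) name, for slices of an array of any length?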

Answer

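  1. Is there a simpler way?
    Yes: use the zip meta-operator Z combined with infix but.

    my @islands = $islands_A[] Z[but] $islands_N[];
    
  2. Why don't you just modify the array to change the format?

  3. put calls .Str on the value it gets; say calls .gist.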
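
As a small illustration of points 1 and 3, here is a minimal sketch that zips three areas with three names. It assumes the areas are real Int values (in the question's script they are strings produced by split), and the three-element list is only a stand-in for the full dataset.

```raku
# Zip areas with names: each element is an Int with a name mixed in.
my @islands = (11506, 5500, 16988) Z[but] <Africa Antarctica Asia>;

put @islands[0];        # put calls .Str, which the mixin overrides: the name
put @islands[0] + 1;    # arithmetic still sees the underlying Int
put @islands>>.Str;     # all three names
put @islands.sum;       # sum of the areas
```

The mixin only replaces .Str, so anything that asks for the string form gets the name, while numeric operations keep working on the area.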

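If you want put to output some specific text, make sure that the .Str method outputs that text.

I don't think you actually want put to output that format, though. I think you want say to output that format. That is because say is meant for human consumption, and you want the output to be nicer for humans.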
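
A minimal sketch of that division of labour, using a hypothetical Island class that is not part of the original answer: .Str returns the plain value for put, and .gist returns a friendlier form for say.

```raku
# Hypothetical class, for illustration only.
class Island {
    has Str $.name;
    has Int $.area;

    # What put prints: the bare value.
    method Str  { $!area.Str }

    # What say prints: a human-friendly form.
    method gist { $!name ~ ': ' ~ $!area }
}

my $africa = Island.new(name => 'Africa', area => 11506);
put $africa;   # uses .Str
say $africa;   # uses .gist
```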


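When you have a question of the form "can Raku do X", the answer is invariably yes. It is just a matter of how much work it would be, and whether you would still call it Raku at that point.

The question you really want to ask is how easy it is to do X.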


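I went and implemented something like what the link you provided describes.

Note that this is just a quick implementation that I created right before bed, so think of it as a first rough draft.

If I were to do this for real, I would probably throw this away and start over after spending a few days learning enough R to figure out what it is actually doing.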

class NamedVec does Positional does Associative {
  has @.names is List;
  has @.nums is List handles <sum>;
  has %!kv is Map;

  class Partial {
    has $.name;
    has $.num;
  }

  submethod TWEAK {
    %!kv := %!kv.new: @!names Z=> @!nums;
  }

  method from-pairlist ( +@pairs ) {
    my @names;
    my @nums;
    for @pairs -> (:$key, :$value) {
      push @names, $key;
      push @nums, $value;
    }
    self.new: :@names, :@nums
  }

  method from-list ( +@list ){
    my @names;
    my @nums;
    for @list -> (:$name, :$num) {
      push @names, $name;
      push @nums, $num;
    }
    self.new: :@names, :@nums
  }

  method gist () {
    my @widths = @!names».chars Zmax @!nums».chars;
    sub infix:<fmt> ( $str, $width is copy ){
      $width -= $str.chars;
      my $l = $width div 2;
      my $r = $width - $l;
      (' ' x $l) ~ $str ~ (' ' x $r)
    }
    (@!names Zfmt @widths) ~ "\n" ~ (@!nums Zfmt @widths)
  }

  method R-str () {
    chomp qq :to/END/
    Named num [1:@!nums.elems()] @!nums[]
     - attr(*, "names")= chr [1:@!names.elems()] @!names.map(*.raku)
    END
  }

  method of () {}
  method AT-POS ( $i ){
    Partial.new: name => @!names[$i], num => @!nums[$i]
  }
  method AT-KEY ( $name ){
    Partial.new: :$name, num => %!kv{$name}
  }
}

multi sub postcircumfix:<{ }> (NamedVec:D $v, Str:D $name){
  $v.from-list: callsame
}
multi sub postcircumfix:<{ }> (NamedVec:D $v, List \l){
  $v.from-list: callsame
}
 

my $islands_A = <11506,5500,16988,2968,16,184,23,280,84,73,25,43,21,82,3745,840,13,30,30,89,40,33,49,14,42,227,16,36,29,15,306,44,58,43,9390,32,13,29,6795,16,15,183,14,26,19,13,12,82>.split(","); #Area
my $islands_N = <<"Africa" "Antarctica" "Asia" "Australia" "Axel Heiberg" "Baffin" "Banks" "Borneo" "Britain" "Celebes" "Celon" "Cuba" "Devon" "Ellesmere" "Europe" "Greenland" "Hainan" "Hispaniola" "Hokkaido" "Honshu" "Iceland" "Ireland" "Java" "Kyushu" "Luzon" "Madagascar" "Melville" "Mindanao" "Moluccas" "New Britain" "New Guinea" "New Zealand (N)" "New Zealand (S)" "Newfoundland" "North America" "Novaya Zemlya" "Prince of Wales" "Sakhalin" "South America" "Southampton" "Spitsbergen" "Sumatra" "Taiwan" "Tasmania" "Tierra del Fuego" "Timor" "Vancouver" "Victoria">>; 

# either will work
#my $islands = NamedVec.from-pairlist( $islands_N[] Z=> $islands_A[] );
my $islands = NamedVec.new( names => $islands_N, nums => $islands_A );

put $islands.R-str;

say $islands<Asia Africa Antarctica>;

say $islands.sum;
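
Question 2 asked whether Pairs might help. Independently of the NamedVec class above, here is a minimal sketch that builds Pairs with Z=> and prints them as two right-aligned rows, roughly the way R does. The helper name r-format is hypothetical, and the three-element list stands in for the full dataset.

```raku
# Hypothetical helper: render Pairs as two right-aligned rows, R style.
sub r-format(@pairs --> Str) {
    my (@top, @bottom);
    for @pairs -> $p {
        my $name  = $p.key.Str;
        my $value = $p.value.Str;
        # Each column is as wide as the longer of name and value.
        my $width = max $name.chars, $value.chars;
        @top.push:    ' ' x ($width - $name.chars)  ~ $name;
        @bottom.push: ' ' x ($width - $value.chars) ~ $value;
    }
    @top.join(' ') ~ "\n" ~ @bottom.join(' ')
}

my @islands = <Africa Antarctica Asia> Z=> (11506, 5500, 16988);
put r-format(@islands);         # names on one row, areas beneath
put r-format(@islands[0..0]);   # a single element keeps the same layout
```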


Answer

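A named vector essentially combines a vector with a map from names to integer positions and allows you to address elements by name. Naming a vector alters the behavior of the vector, not that of its elements. So in Raku we need to define a role for an array: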

role Named does Associative {
    has $.names;
    has %!index;

    submethod TWEAK {
        my $i = 0;
        %!index = map { $_ => $i++ }, $!names.list;
    }

    method AT-KEY($key) {
        with %!index{$key} { return-rw self.AT-POS($_) }
        else { self.default }
    }

    method EXISTS-KEY($key) {
        %!index{$key}:exists;
    }

    method gist() {
        join "\n", $!names.join("\t"), map(*.gist, self).join("\t");
    }
}

multi sub postcircumfix:<[ ]>(Named:D \list, \index, Bool() :$named!) {
    my \slice = list[index];
    $named ?? slice but Named(list.names[index]) !! slice;
}

multi sub postcircumfix:<{ }>(Named:D \list, \names, Bool() :$named!) {
    my \slice = list{names};
    $named ?? slice but Named(names) !! slice;
}

Mixing in this role gives you most of the functionality of an R named vector:

my $named = [1, 2, 3] but Named<first second last>;
say $named;                 # OUTPUT: «first␉second␉last␤1␉2␉3␤»
say $named[0, 1]:named;     # OUTPUT: «first␉second␤1␉2␤»
say $named<last> = Inf;     # OUTPUT: «Inf␤»
say $named<end>:exists;     # OUTPUT: «False␤»
say $named<last end>:named; # OUTPUT: «last␉end␤Inf␉(Any)␤»

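As this is just a proof of concept, the Named role doesn't handle the naming of non-existent elements very well. It also doesn't support modifying a slice of names. It probably does support creating a pun that can be mixed into more than one list.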

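Note that this implementation relies on the undocumented fact that the subscript operators are multis. If you want to put the role and operators in a separate file, you probably want to apply the is export trait to the operators.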
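
The core idea of this answer, a map from names to positions kept beside the array, can also be sketched without any role or operator overrides. This is only an illustration of the lookup; the variable names are arbitrary.

```raku
my @values = 1, 2, 3;
my @names  = <first second last>;

# antipairs yields value => index pairs, i.e. name => position.
my %index = @names.antipairs;

put @values[%index<second>];        # look one element up by name
@values[%index<last>] = Inf;        # assign through a name
put @values[%index<first last>];    # a slice by several names
```

Wrapping this lookup in AT-KEY and EXISTS-KEY, as the role above does, is what lets the ordinary $named<last> syntax work.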

