[prev in list] [next in list] [prev in thread] [next in thread] 

List:       r-devel
Subject:    [Rd] as.matrix.dist patch (performance)
From:       Tim Taylor <tim.taylor () hiddenelephants ! co ! uk>
Date:       2023-08-10 21:38:44
Message-ID: c919dc37-e226-3b9c-74a8-455c5dea79af () hiddenelephants ! co ! uk
[Download RAW message or body]

Please find attached a small patch to improve the performance of 
as.matrix.dist().  It's a tiny bit more involved than the current code 
but does bring a reasonable speed improvement for larger <dist> objects 
(remaining comparable for smaller ones).

Example:

set.seed(1)
dat <- matrix(rnorm(20000), ncol = 2);
system.time(as.matrix(dist(dat)))

As of r84931:

    user  system elapsed
   3.370   1.154   4.535

With this patch:

    user  system elapsed
   1.925   0.754   2.685

Submitting here in the first instance but happy to move to Bugzilla if 
more appropriate.

Cheers

Tim

["patch.diff" (text/x-patch)]

Index: src/library/stats/R/dist.R
===================================================================
--- src/library/stats/R/dist.R	(revision 84931)
+++ src/library/stats/R/dist.R	(working copy)
@@ -49,10 +49,13 @@
 {
     size <- attr(x, "Size")
     df <- matrix(0, size, size)
-    lower <- row(df) > col(df)
+    idx <- seq_len(size)
+    d1 <- unlist(lapply(idx[-1L], seq.int, to = size, by = 1L))
+    d2 <- rep.int(idx[-size], times = rev(idx[-size]))
+    lower <- cbind(d1,d2)
+    upper <- cbind(d2,d1)
     df[lower] <- x ## preserving NAs in x
-    df <- t(df)
-    df[lower] <- x
+    df[upper] <- x
     labels <- attr(x, "Labels")
     dimnames(df) <-
 	if(is.null(labels)) list(seq_len(size), seq_len(size)) else list(labels,labels)


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic