List: r-devel
Subject: [Rd] as.matrix.dist patch (performance)
From: Tim Taylor <tim.taylor () hiddenelephants ! co ! uk>
Date: 2023-08-10 21:38:44
Message-ID: c919dc37-e226-3b9c-74a8-455c5dea79af () hiddenelephants ! co ! uk
Please find attached a small patch to improve the performance of
as.matrix.dist(). It is slightly more involved than the current code,
but it cuts elapsed time by roughly 40% for larger <dist> objects (see
the example below) while remaining comparable for smaller ones.
Example:
set.seed(1)
dat <- matrix(rnorm(20000), ncol = 2)
system.time(as.matrix(dist(dat)))
As of r84931:
   user  system elapsed
  3.370   1.154   4.535
With this patch:
   user  system elapsed
  1.925   0.754   2.685
Submitting here in the first instance but happy to move to Bugzilla if
more appropriate.
Cheers
Tim
["patch.diff" (text/x-patch)]
Index: src/library/stats/R/dist.R
===================================================================
--- src/library/stats/R/dist.R (revision 84931)
+++ src/library/stats/R/dist.R (working copy)
@@ -49,10 +49,13 @@
 {
     size <- attr(x, "Size")
     df <- matrix(0, size, size)
-    lower <- row(df) > col(df)
+    idx <- seq_len(size)
+    d1 <- unlist(lapply(idx[-1L], seq.int, to = size, by = 1L))
+    d2 <- rep.int(idx[-size], times = rev(idx[-size]))
+    lower <- cbind(d1,d2)
+    upper <- cbind(d2,d1)
     df[lower] <- x ## preserving NAs in x
-    df <- t(df)
-    df[lower] <- x
+    df[upper] <- x
     labels <- attr(x, "Labels")
     dimnames(df) <-
         if(is.null(labels)) list(seq_len(size), seq_len(size)) else list(labels,labels)
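
For anyone wanting to try the indexing without rebuilding R, the patched
body can be sketched as a stand-alone function. The name
as_matrix_dist_sketch is hypothetical and not part of the patch; the
logic is copied from the diff above. The idea is to build the
(row, column) pairs of the lower triangle explicitly, in the same
column-major order that a <dist> object stores its values, and to reuse
the swapped pairs for the upper triangle instead of transposing.

```r
# Hypothetical stand-alone version of the patched code (illustrative name).
as_matrix_dist_sketch <- function(x) {
    size <- attr(x, "Size")
    df <- matrix(0, size, size)
    idx <- seq_len(size)
    # Row indices of the lower triangle, column by column:
    # 2..size, 3..size, ..., size
    d1 <- unlist(lapply(idx[-1L], seq.int, to = size, by = 1L))
    # Matching column indices: 1 repeated size-1 times, 2 repeated size-2 times, ...
    d2 <- rep.int(idx[-size], times = rev(idx[-size]))
    df[cbind(d1, d2)] <- x   # lower triangle; NAs in x are preserved
    df[cbind(d2, d1)] <- x   # upper triangle, the mirrored positions
    labels <- attr(x, "Labels")
    dimnames(df) <-
        if (is.null(labels)) list(idx, idx) else list(labels, labels)
    df
}

# Small check against the installed as.matrix() method for <dist> objects.
set.seed(1)
d <- dist(matrix(rnorm(10), ncol = 2))   # 5 points, 10 pairwise distances
m <- as_matrix_dist_sketch(d)
stopifnot(isTRUE(all.equal(m, as.matrix(d))))
```

For size = 4 the index vectors are d1 = 2 3 4 3 4 4 and d2 = 1 1 1 2 2 3,
i.e. the pairs (2,1), (3,1), (4,1), (3,2), (4,2), (4,3).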
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel