List: r-sig-geo
Subject: Re: [R-sig-Geo] Odd behavior of dismo's extract function
From: Nick Matzke <matzke () nimbios ! org>
Date: 2016-07-25 2:35:04
Message-ID: CAJdu7BBs=qMhs+DH860XRLprDR1MB3FY0+Mko7BSaVgvLdtMOw () mail ! gmail ! com
I am on R 3.3.1 and don't get the huge time difference. There is a tiny
decrease though from 250 to 251:
=============
wd = "~/Downloads/extract_weirdness/"
setwd(wd)
library(raster)
library(dismo)
extract.test <- function(env, N){
  extract(env, randomPoints(env, N))
}
env.files <- list.files(path = ".", pattern = "pc", full.names = TRUE)
env <- stack(env.files)
system.time(extract.test(env, 250))
user system elapsed
1.455 0.043 1.492
Warning message:
In couldBeLonLat(mask) : CRS is NA. Assuming it is longitude/latitude
system.time(extract.test(env, 251))
user system elapsed
1.137 0.033 1.158
Warning message:
In couldBeLonLat(mask) : CRS is NA. Assuming it is longitude/latitude
=============
...but I won't worry about it myself. Thanks for the solution though, that
was weird behavior!
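
For anyone who wants to check their own setup for this kind of cliff, here is a minimal base-R sketch. The helper names time_sweep and find_dropoffs are made up for illustration (they are not part of raster or dismo), and the factor-of-10 threshold is an arbitrary assumption. The example numbers are the numpoints/time values Dan reported in his original post.

```r
# Hypothetical helper (not from raster or dismo): time f(n) for each n in ns.
time_sweep <- function(f, ns, reps = 1) {
  elapsed <- vapply(ns, function(n) {
    # Keep the fastest of `reps` runs to damp noise from other processes.
    min(vapply(seq_len(reps), function(i) {
      system.time(f(n))[["elapsed"]]
    }, numeric(1)))
  }, numeric(1))
  data.frame(numpoints = ns, time = elapsed)
}

# Hypothetical helper: return the numpoints values at which the time fell
# by more than `factor` relative to the previous step.
find_dropoffs <- function(timings, factor = 10) {
  ratio <- timings$time[-nrow(timings)] / timings$time[-1]
  timings$numpoints[-1][ratio > factor]
}

# Timings as reported in the original post (R 3.3.0, OS X).
reported <- data.frame(
  numpoints = c(1, 5, 10, 50, 100, 150, 200, 250, 251, 252, 254, 500, 1000),
  time = c(1.54, 3.93, 6.764, 29.939, 61.431, 79.295, 110.283, 120.118,
           2.748, 2.756, 2.767, 2.876, 3.153)
)
find_dropoffs(reported)  # 251
```

With the rasters loaded, something like time_sweep(function(n) extract.test(env, n), c(200, 250, 251, 300)) should produce a table in the same shape, which can then be passed to find_dropoffs().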
Nick
On Mon, Jul 25, 2016 at 12:29 PM, Dan Warren <dan.l.warren@gmail.com> wrote:
> Updating to R 3.3.1 fixed it. Thanks! Still baffled as to why the sudden
> dropoff between 250 and 251, but as long as it's working all is well.
>
> Cheers!
>
>
> On Mon, Jul 25, 2016 at 12:24 PM, Dan Warren <dan.l.warren@gmail.com>
> wrote:
>
> > How very odd. I'm using R 3.3.0, but as far as I can tell I'm using the
> > same package versions as you. I've tried this on two machines (12 core
> > Mac Pro and an older Macbook Pro) and I'm getting the same phenomenon on
> > both. Could it be a weird OSX thing? I'll try updating R and then if it
> > still persists I'll bootcamp over into Windows and see if it's happening
> > for me there.
> >
> >
> > My session info (sorry for not including that the first time):
> >
> > Session info
> > ----------------------------------------------------------------------
> > setting value
> > version R version 3.3.0 (2016-05-03)
> > system x86_64, darwin13.4.0
> > ui RStudio (0.99.491)
> > language (EN)
> > collate en_AU.UTF-8
> > tz Australia/Sydney
> > date 2016-07-25
> >
> > Packages
> > ----------------------------------------------------------------------
> > package * version date source
> > colorspace 1.2-6 2015-03-11 CRAN (R 3.3.0)
> > devtools 1.12.0 2016-06-24 CRAN (R 3.3.0)
> > digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
> > dismo * 1.1-1 2016-06-16 CRAN (R 3.3.0)
> > ENMTools * 0.1 2016-07-25 local
> > ggplot2 * 2.1.0 2016-03-01 CRAN (R 3.3.0)
> > gridExtra * 2.2.1 2016-02-29 CRAN (R 3.3.0)
> > gtable 0.2.0 2016-02-26 CRAN (R 3.3.0)
> > highr 0.6 2016-05-09 CRAN (R 3.3.0)
> > knitr * 1.13 2016-05-09 CRAN (R 3.3.0)
> > lattice 0.20-33 2015-07-14 CRAN (R 3.3.0)
> > memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
> > munsell 0.4.3 2016-02-13 CRAN (R 3.3.0)
> > plyr * 1.8.4 2016-06-08 CRAN (R 3.3.0)
> > raster * 2.5-8 2016-06-02 CRAN (R 3.3.0)
> > Rcpp 0.12.5 2016-05-14 CRAN (R 3.3.0)
> > rgeos * 0.3-19 2016-04-04 CRAN (R 3.3.0)
> > rJava 0.9-8 2016-01-07 CRAN (R 3.3.0)
> > scales 0.4.0 2016-02-26 CRAN (R 3.3.0)
> > sp * 1.2-3 2016-04-14 CRAN (R 3.3.0)
> > viridis * 0.3.4 2016-03-12 CRAN (R 3.3.0)
> > withr 1.0.2 2016-06-20 CRAN (R 3.3.0)
> >
> >
> > On Mon, Jul 25, 2016 at 12:15 PM, Michael Sumner <mdsumner@gmail.com>
> > wrote:
> >
> > >
> > >
> > > On Mon, 25 Jul 2016 at 11:35 Dan Warren <dan.l.warren@gmail.com> wrote:
> > >
> > > > Just realized I pasted in the results backwards. It should have been
> > > >
> > > > system.time(extract.test(env, 250))
> > > >
> > > > user system elapsed
> > > > 124.562 0.516 125.061
> > > >
> > > > system.time(extract.test(env, 251))
> > > >
> > > > user system elapsed
> > > > 2.807 0.084 2.891
> > > >
> > > >
> > > >
> > > >
> > > I don't see the effect.
> > >
> > > Perhaps it was fixed in a recent version of raster?
> > >
> > > Please post reproducible details, I downloaded your data files to
> > > "test/testdata/" to try this.
> > >
> > > Cheers, Mike.
> > >
> > >
> > > library(raster)
> > > library(dismo)
> > > extract.test <- function(env, N){
> > >   extract(env, dismo::randomPoints(env, N))
> > > }
> > >
> > > env.files <- list.files(path = "test/testdata/", pattern = "pc",
> > >                         full.names = TRUE)
> > > env <- raster::stack(env.files)
> > >
> > > library(rbenchmark)
> > > benchmark(n250 = extract.test(env, 250),
> > > n251 = extract.test(env, 251), replications = 4)
> > > #   test replications elapsed relative user.self sys.self user.child sys.child
> > > # 1 n250            4    6.31    1.008      5.13     1.14         NA        NA
> > > # 2 n251            4    6.26    1.000      5.02     1.22         NA        NA
> > > devtools::session_info()
> > > # Session info
> > > # --------------------------------------------------------------------
> > > # setting value
> > > # version R version 3.3.1 Patched (2016-07-09 r70874)
> > > # system x86_64, mingw32
> > > # ui RStudio (0.99.1261)
> > > # language (EN)
> > > # collate English_Australia.1252
> > > # tz Australia/Hobart
> > > # date 2016-07-25
> > > #
> > > # Packages
> > > # --------------------------------------------------------------------
> > > # package * version date source
> > > # devtools * 1.12.0 2016-06-24 CRAN (R 3.3.1)
> > > # digest 0.6.9 2016-01-08 CRAN (R 3.3.1)
> > > # dismo * 1.1-1 2016-06-16 CRAN (R 3.3.1)
> > > # evaluate 0.9 2016-04-29 CRAN (R 3.3.1)
> > > # htmltools 0.3.5 2016-03-21 CRAN (R 3.3.1)
> > > # knitr 1.13 2016-05-09 CRAN (R 3.3.1)
> > > # lattice 0.20-33 2015-07-14 CRAN (R 3.3.1)
> > > # magrittr 1.5 2014-11-22 CRAN (R 3.3.1)
> > > # memoise 1.0.0 2016-01-29 CRAN (R 3.3.1)
> > > # raster * 2.5-8 2016-06-02 CRAN (R 3.3.1)
> > > # rbenchmark * 1.0.0 2012-08-30 CRAN (R 3.3.0)
> > > # Rcpp 0.12.5 2016-05-14 CRAN (R 3.3.1)
> > > # rgdal 1.1-10 2016-05-12 CRAN (R 3.3.1)
> > > # rmarkdown 1.0.2 2016-07-19 Github (rstudio/rmarkdown@b65e177)
> > > # sp * 1.2-3 2016-04-14 CRAN (R 3.3.1)
> > > # stringi 1.1.1 2016-05-27 CRAN (R 3.3.0)
> > > # stringr 1.0.0 2015-04-30 CRAN (R 3.3.1)
> > > # withr 1.0.2 2016-06-20 CRAN (R 3.3.1)
> > >
> > >
> > >
> > >
> > >
> > > > Dan Warren, Ph.D.
> > > > Department of Biology
> > > > Macquarie University
> > > > Email: dan.warren@mq.edu.au <dan.warren@anu.edu.au>
> > > > Phone (US): 530-848-3809
> > > > Phone (Australia): 0468 696 897
> > > > Phone (Work): 02 9850 8587
> > > > Skype: dan.l.warren
> > > > Google Scholar
> > > > <https://scholar.google.com/citations?user=NTzu9c8AAAAJ&hl=en> Orcid
> > > > <http://orcid.org/0000-0002-8747-2451> ResearcherID
> > > > <http://www.researcherid.com/rid/B-3821-2010> Scopus
> > > > <http://www.scopus.com/authid/detail.url?authorId=7202133982>
> > > >
> > > >
> > > > On Mon, Jul 25, 2016 at 10:34 AM, Dan Warren <dan.l.warren@gmail.com>
> > > > wrote:
> > > >
> > > > > This is not an error per se so much as just something very weird
> > > > > that I have noticed with a project I've been working on recently.
> > > > > I'm wondering if anyone here has any insight as to what may be
> > > > > causing this behavior. I haven't yet been able to duplicate it with
> > > > > simulated rasters (more info on that below), but it appears very
> > > > > reliably with real environmental data including the PC rasters for
> > > > > Cuba I have hosted here:
> > > > >
> > > > > https://github.com/danlwarren/ENMTools/tree/master/test/testdata
> > > > >
> > > > > What's happening is this: if I go to extract data from those rasters
> > > > > using occurrence points, the amount of time it takes increases very
> > > > > rapidly up to exactly 250 points, and falls dramatically after that.
> > > > > So dramatically that it takes over two minutes to extract data for
> > > > > 250 points but just under three seconds for 251. I've established
> > > > > that it's not a question of the points themselves being wonky,
> > > > > because it happens with random points as well.
> > > > >
> > > > >
> > > > > extract.test <- function(env, N){
> > > > >   extract(env, randomPoints(env, N))
> > > > > }
> > > > >
> > > > > env.files <- list.files(path = "testdata/", pattern = "pc",
> > > > >                         full.names = TRUE)
> > > > > env <- stack(env.files)
> > > > >
> > > > > system.time(extract.test(env, 250))
> > > > >
> > > > > user system elapsed
> > > > > 2.807 0.084 2.891
> > > > >
> > > > > system.time(extract.test(env, 251))
> > > > >
> > > > > user system elapsed
> > > > > 124.562 0.516 125.061
> > > > >
> > > > > numpoints,time
> > > > > 1,1.54
> > > > > 5,3.93
> > > > > 10,6.764
> > > > > 50,29.939
> > > > > 100,61.431
> > > > > 150,79.295
> > > > > 200,110.283
> > > > > 250,120.118
> > > > > 251,2.748
> > > > > 252,2.756
> > > > > 254,2.767
> > > > > 500,2.876
> > > > > 1000,3.153
> > > > >
> > > > > The data being extracted looks perfectly reasonable in all cases.
> > > > > It's not just these layers, either. Although (as I mentioned above)
> > > > > I have yet to come up with simulated rasters that show this
> > > > > behavior, I see this behavior for both of the sets of rasters for
> > > > > real environmental data that I've tried. The results above are from
> > > > > a PCA on Worldclim data for Cuba, but I just tried them on some
> > > > > Climond data I've got for Australia and I get the same behavior.
> > > > > Those rasters are much larger, though, and as a result the times are
> > > > > longer; 251 points took about 43 seconds, whereas I just had to give
> > > > > up and stop the 250 point extraction after about 30 minutes.
> > > > >
> > > > > As for those simulated rasters, I've tried the following:
> > > > >
> > > > > Plain grids of sequential numbers
> > > > > As above, but with a bunch of NAs added
> > > > > Filling the Cuban rasters with sequential numbers
> > > > > Filling the Cuban rasters with random numbers from a uniform (0,1)
> > > > > distribution
> > > > >
> > > > > None of those show this issue. Anyone have any thoughts about what
> > > > > might be going on here?
> > > > >
> > > > >
> > > >
> > > > [[alternative HTML version deleted]]
> > > >
> > > > _______________________________________________
> > > > R-sig-Geo mailing list
> > > > R-sig-Geo@r-project.org
> > > > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> > > >
> > > --
> > > Dr. Michael Sumner
> > > Software and Database Engineer
> > > Australian Antarctic Division
> > > 203 Channel Highway
> > > Kingston Tasmania 7050 Australia
> > >
> > >
> >
>
>