
List:       apache-modperl
Subject:    Re: Confused about two development utils [EXT]
From:       Matthias Peng <pengmatthias () gmail ! com>
Date:       2020-12-26 8:01:14
Message-ID: CAOZE12p12mhw52yOaWSS37gf2bsgDs4qoVY7ovNfkSZw0gQaGA () mail ! gmail ! com

If I have been using mod_perl for development, does that affect whether I
drive a Maserati? :)


> unsubscribe.
> 
> On Fri, Dec 25, 2020 at 10:30 PM André Warnier (tomcat/perl) <
> aw@ice-sa.com> wrote:
> 
> > Hello James.
> > Bravo and many thanks for this excellent overview of your activities. Of
> > course the setup
> > (in your previous message) and the activities are very impressive by
> > themselves.
> > But in addition, even though your message is not in itself a perl
> > advocacy message, I feel
> > that it would have its right place in some perl/mod_perl advocacy forum,
> > because it
> > > touches on some general ideas which are valid /also/ for perl and mod_perl.
> > It was very refreshing to read for once a clear exposé of why it is still
> > important
> > nowadays to think before programming, to program efficiently, and to
> > choose the right tool
> > for the job at hand (be it perl, mod_perl, or any other) without the kind
> > of off-the-cuff
> > general a-priori which tend to plague these discussions.
> > 
> > And even though our own (commercial) activities and setups do not have
> > anything even close
> > to the scope which you describe, I would like to say that the same basic
> > principles which
> > you mention in your exposé are just as valid when you scale-down as when
> > you scale-up.
> > ("--you can't just throw memory, CPUs, power at a problem – you have to
> > think – how can I do what I need to do with the least resources..")
> > Even when you think of a single server, or a single server rack, at any
> > one period in time
> > there is always a practical limit as to how much memory or CPUs you can
> > fit in a given
> > server, or how many servers you can fit in a rack, or how many additional
> > Gb of bandwidth
> > you can allocate per server, beyond which there is a sudden "quantum
> > jump" as to how
> > practical and cost-effective a whole project becomes.
> > > In that sense, I particularly enjoyed your examples of the database and of
> > the additional
> > power line.
> > 
> > 
> > On 24.12.2020 02:38, James Smith wrote:
> > > We don't use perl for everything, yes we use it for web data, yes we
> > still use it as the
> > > glue language in a lot of cases, the most complex stuff is done with C
> > (not even C++ as
> > > that is too slow). Others on site use Python, Java, Rust, Go, PHP,
> > along with looking at
> > > using GPUs in cases where code can be highly parallelised
> > > 
> > > It is not just one application – but many, many applications… All with
> > a common goal of
> > > understanding the human genome, and using it to assist in developing
> > new understanding and
> > > techniques which can advance health care.
> > > 
> > > We are a very large sequencing centre (one of the largest in the world)
> > – what I was
> > > pointing out is that you can't just throw memory, CPUs, power at a
> > problem – you have to
> > > think – how can I do what I need to do with the least resources. Rather
> > than what
> > > resources can I throw at the problem.
> > > 
> > > Currently we are acting as the central repository for all COVID-19
> > sequencing in the UK,
> > > along with one of the largest "wet" labs sequencing data for it – and
> > that is half the
> > > sequenced samples in the whole world. The UK is sequencing more
> > COVID-19 genomes a day
> > > than most other countries have sequenced since the start of the
> > pandemic in Feb/Mar. This
> > > > has led to us discovering a new more transmissible version of the
> > > virus, and in what parts
> > > of the country the different strains are present – no other country in
> > the world has the
> > > information, technology or infrastructure in place to achieve this.
> > > 
> > > But this is just a small part of the genomic sequencing we are looking
> > at – we work on:
> > > * other pathogens – e.g. Plasmodium (Malaria);
> > > * cancer genomes (and how effective drugs are);
> > > * are a major part of the Human Cell Atlas which is looking at how the
> > expression of genes
> > > (in the simplest terms which ones are switched on and switched off) are
> > different in
> > > different tissues;
> > > * sequencing the genomes of other animals to understand their evolution;
> > > * and looking at some other species in detail, to see what we can learn
> > from them when
> > > they have defective genes;
> > > 
> > > Although all these are currently scaled back so that we can work
> > relentlessly to support
> > > > the medical teams and other researchers to get on top of COVID-19.
> > > 
> > > What is interesting is that many of the developers we have on campus
> > (well all wfh at the
> > > moment) are all (relatively) old as we learnt to develop code on
> > machines with limited CPU
> > > and limited memory – so that things had to be efficient, had to be
> > compact…. And that is
> > > as important now as it was 20 or 30 years ago – the data we handle is
> > going up faster than
> > > Moore's Law! Many of us have pride in doing things as efficiently as
> > possible.
> > > 
> > > It took around 10 years to sequence and assemble the first human genome
> > {well we are still
> > > tinkering with it and filling in the gaps} – now at the institute we
> > can sequence and
> > > assemble around 400 human genomes in a day – to the same quality!
> > > 
> > > So most of our issues are due to the scale of the problems we face –
> > e.g. the human genome
> > > has 3 billion base-pairs (A, C, G, Ts) , so normal solutions don't
> > scale to that (once
> > > many years ago we looked at setting up an Oracle database where there
> > was at least 1 row
> > > for every base pair – recording all variants (think of them as spelling
> > mistakes, for
> > > example a T rather than an A, or an extra letter inserted or deleted)
> > for that base pair…
> > > The schema was set up – and then they realised it would take 12 months
> > to load the data
> > > which we had then (which is probably less than a millionth of what we
> > have now)!
> > > 
> > > Moving compute off site is a problem as the transfer of the level of
> > data we have would
> > > cause a problem – you can't easily move all the data to the compute –
> > so you have to bring
> > > the compute to the data.
> > > 
> > > The site I worked on before I became a more general developer was doing
> > that – and the
> > > code that was written 12-15 years ago is actually still going strong –
> > it has seen a few
> > > > changes over the years – many displays have had to be redeveloped as the
> > scale of the data
> > > has got so big that even the summary pages we produced 10 years ago
> > have to be summarised
> > > because they are so large.
> > > 
> > > *From:*Mithun Bhattacharya <mithnb@gmail.com>
> > > *Sent:* 24 December 2020 00:06
> > > *To:* mod_perl list <modperl@perl.apache.org>
> > > *Subject:* Re: Confused about two development utils [EXT]
> > > 
> > > James would you be able to share more info about your setup ?
> > > 
> > > 1. What exactly is your application doing which requires so much memory
> > and CPU - is it
> > > something like gene splicing (no i don't know much about it beyond
> > Jurassic Park :D )
> > > 
> > > 2. Do you feel Perl was the best choice for whatever you are doing and
> > if yes then why ?
> > > How much of your stuff is using mod_perl considering you mentioned not
> > much is web related ?
> > > 
> > > 3. What are the challenges you are currently facing with your
> > implementation ?
> > > 
> > > On Wed, Dec 23, 2020 at 6:58 AM James Smith <js5@sanger.ac.uk <mailto:
> > js5@sanger.ac.uk>>
> > > wrote:
> > > 
> > > Oh but memory is a problem – but not if you have just a small
> > cluster of machines!
> > > 
> > > > Our boxes are larger than that – but they all run virtual machines
> > > {only a small
> > > > proportion web related} – machines/memory would rapidly become a problem in
> > > our data centre - we
> > > run VMWARE [995 hosts] and openstack [10,000s of hosts] + a
> > selection of large memory
> > > machines {measured in TBs of memory per machine }.
> > > 
> > > We would be looking at somewhere between 0.5 PB and 1 PB of memory
> > – not just the
> > > price of buying that amount of memory - for many machines we need
> > the fastest memory
> > > money can buy for the workload, but we would need a lot more CPUs
> > > than we currently
> > > have as we would need a larger amount of machines to have 64GB
> > virtual machines {we
> > > > would get 2 VMs per host}. We currently have approx. 1-2000 CPUs
> > running our hardware
> > > (last time I had a figure) – it would probably need to go to
> > approximately 5-10,000!
> > > It is not just the initial outlay but the environmental and
> > financial cost of running
> > > that number of machines, and finding space to run them without
> > putting the cooling
> > > costs through the roof!! That is without considering what
> > additional constraints on
> > > storage having the extra machines may have (at the last count a
> > year ago we had over
> > > > 30 PBytes of storage on site – and a large amount of offsite backup).
> > > 
> > > We would also stretch the amount of power we can get from the
> > national grid to power
> > > > it all - we currently have 3 feeds from different parts of the
> > > national grid (we are
> > > > fortunately in a position where this is possible) and the dedicated
> > link we would need
> > > to add more power would be at least 50 miles long!
> > > 
> > > So - managing cores/memory is vitally important to us – moving to
> > the cloud is an
> > > option we are looking at – but that is more than 4 times the price
> > of our onsite
> > > set-up (with substantial discounts from AWS) and would require an
> > upgrade of our
> > > existing link to the internet – which is currently 40Gbit of data
> > (I think).
> > > 
> > > > Currently we are analysing very large amounts of data directly
> > linked to the current
> > > major world problem – this is why the UK is currently being
> > isolated as we have
> > > discovered and can track a new strain, in near real time – other
> > countries have no
> > > ability to do this – we in a day can and do handle, sequence and
> > analyse more samples
> > > than the whole of France has sequenced since February. We probably
> > don't have more of
> > > the new variant strain than in other areas of the world – it is
> > just that we know we
> > > have because of the amount of sequencing and analysis that we in
> > the UK have done.
> > > 
> > > *From:*Matthias Peng <pengmatthias@gmail.com <mailto:
> > pengmatthias@gmail.com>>
> > > *Sent:* 23 December 2020 12:02
> > > *To:* mod_perl list <modperl@perl.apache.org <mailto:
> > modperl@perl.apache.org>>
> > > *Subject:* Re: Confused about two development utils [EXT]
> > > 
> > > > Today memory is not a serious problem; each of our servers has 64GB
> > memory.
> > > 
> > > 
> > > Forgot to add - so our FCGI servers need a lot (and I mean a
> > lot) more memory than
> > > the mod_perl servers to serve the same level of content (just
> > in case memory blows
> > > up with FCGI backends)
> > > 
> > > -----Original Message-----
> > > From: James Smith <js5@sanger.ac.uk <mailto:js5@sanger.ac.uk>>
> > > Sent: 23 December 2020 11:34
> > > To: André Warnier (tomcat/perl) <aw@ice-sa.com <mailto:
> > aw@ice-sa.com>>;
> > > modperl@perl.apache.org <mailto:modperl@perl.apache.org>
> > > Subject: RE: Confused about two development utils [EXT]
> > > 
> > > 
> > > > This costs memory, and all the more since many perl modules
> > are not
> > > thread-safe, so if you use them in your code, at this moment
> > the only safe way to
> > > do it is to use the Apache httpd prefork model. This means that
> > each Apache httpd
> > > child process has its own copy of the perl interpreter, which
> > means that the
> > > memory used by this embedded perl interpreter has to be counted
> > n times (as many
> > > times as there are Apache httpd child processes running at any
> > one time).
> > > 
> > > This isn't quite true - if you load modules before the process
> > forks then they can
> > > cleverly share the same parts of memory. It is useful to be
> > able to "pre-load"
> > > core functionality which is used across all functions {this is
> > the case in Linux
> > > anyway}. It also speeds up child process generation as the
> > modules are already in
> > > memory and converted to byte code.
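
A minimal sketch of the pre-loading described above, assuming a mod_perl 2 setup in which a startup file is pulled into the parent httpd process with the PerlRequire directive. The file path and the module list are illustrative assumptions, not taken from this thread.

```perl
# startup.pl -- loaded once by the parent httpd process, e.g. via
#   PerlRequire /etc/httpd/conf/startup.pl     (hypothetical path)
# Everything compiled here exists before the fork, so on Linux the
# children can share those memory pages copy-on-write.
use strict;
use warnings;

# Hypothetical examples of modules used by most handlers.
# The empty import list avoids exporting symbols into this file.
use POSIX ();
use DBI ();
use Apache2::RequestRec ();
use Apache2::RequestIO ();

1;  # a required file must return a true value
```

Pages stay shared only until a child writes to them, so the saving tends to shrink as child processes age.
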
> > > 
> > > One of the great advantages of mod_perl is Apache2::SizeLimit
> > which can blow away
> > > > large child processes - and then if needed create new ones. This
> > is not the case
> > > with some of the FCGI solutions as the individual processes can
> > grow if there is a
> > > memory leak or a request that retrieves a large amount of
> > content (even if not
> > > served), but perl can't give the memory back. So FCGI processes
> > only get bigger
> > > and bigger and eventually blow up memory (or hit swap first)
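
A sketch of how Apache2::SizeLimit is typically wired up, following the synopsis in the module's documentation. The thresholds are illustrative assumptions, not the values used at the Sanger Institute.

```perl
# In startup.pl (sizes are in KB; the numbers are illustrative only)
use Apache2::SizeLimit ();

# Retire a child once its total size exceeds roughly 150 MB ...
Apache2::SizeLimit->set_max_process_size(150_000);
# ... or once its unshared (private) memory exceeds roughly 120 MB.
Apache2::SizeLimit->set_max_unshared_size(120_000);

# And in httpd.conf, so the check runs after each request is served:
#   PerlCleanupHandler Apache2::SizeLimit
```

An oversized child exits after finishing its current request, and Apache's normal process management starts a replacement when one is needed.
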
> > > 
> > > 
> > > 
> > > 
> > > 
> > > --
> > > The Wellcome Sanger Institute is operated by Genome Research
> > Limited, a charity
> > > registered in England with number 1021457 and a  company
> > registered in England
> > > with number 2742969, whose registered  office is 215 Euston
> > Road, London, NW1
> > <https://www.google.com/maps/search/215+Euston+Road,+London,+NW1?entry=gmail&source=g>
> >  2
> > > [google.com]
> > > <
> > www.google.com_maps_search_s-" rel="nofollow">https://urldefense.proofpoint.com/v2/url?u=https-3A__www.google.com_maps_search_s-> \
> > 2B215-2BEuston-2BRoad-2C-2BLondon-2C-2BNW1-2B2-3Fentry-3Dgmail-26source-3Dg&d=DwMF \
> > aQ&c=D7ByGjS34AllFgecYw0iC6Zq7qlm8uclZFI0SqQnqBo&r=oH2yp0ge1ecj4oDX0XM7vQ&m=friR8y \
> > kiZ-NWYdX6SrbT_ogNXEVR-4ixdkrhy5khQjA&s=xU3F4xE2ugQuDWHZ4GtDn9mPBCKcJJOI0PYScsSNjSg&e=
> > 
> > > BE.
> > > 
> > > 
> > > 
> > > --
> > > The Wellcome Sanger Institute is operated by Genome Research
> > > Limited, a charity registered in England with number 1021457
> > and a
> > > company registered in England with number 2742969, whose
> > registered
> > > office is 215 Euston Road, London, NW1
> > <https://www.google.com/maps/search/215+Euston+Road,+London,+NW1?entry=gmail&source=g>
> >  2 [google.com]
> > > <
> > www.google.com_maps_search_s-" rel="nofollow">https://urldefense.proofpoint.com/v2/url?u=https-3A__www.google.com_maps_search_s-> \
> > 2B215-2BEuston-2BRoad-2C-2BLondon-2C-2BNW1-2B2-3Fentry-3Dgmail-26source-3Dg&d=DwMF \
> > aQ&c=D7ByGjS34AllFgecYw0iC6Zq7qlm8uclZFI0SqQnqBo&r=oH2yp0ge1ecj4oDX0XM7vQ&m=friR8y \
> > kiZ-NWYdX6SrbT_ogNXEVR-4ixdkrhy5khQjA&s=xU3F4xE2ugQuDWHZ4GtDn9mPBCKcJJOI0PYScsSNjSg&e=
> > 
> > > BE.
> > > 
> > > -- The Wellcome Sanger Institute is operated by Genome Research
> > Limited, a charity
> > > registered in England with number 1021457 and a company registered
> > in England with
> > > number 2742969, whose registered office is 215 Euston Road,
> > London, NW1 2BE
> > <https://www.google.com/maps/search/215+Euston+Road,+London,+NW1+2BE?entry=gmail&source=g>
> >                 
> > .
> > > 
> > > -- The Wellcome Sanger Institute is operated by Genome Research
> > Limited, a charity
> > > registered in England with number 1021457 and a company registered in
> > England with number
> > > 2742969, whose registered office is 215 Euston Road, London, NW1 2BE
> > <https://www.google.com/maps/search/215+Euston+Road,+London,+NW1+2BE?entry=gmail&source=g>
> >                 
> > .
> > 
> > 

