
List:       r-help
Subject:    Re: [R] Relative Cumulative Frequency of Event Occurence
From:       Burhan ul haq <ulhaqz () gmail ! com>
Date:       2013-11-30 6:58:20
Message-ID: CADw4Cksz7YFsYZeyVmvJWiiu+4sbLNkPGufCNZFZwgTiOxLH5A () mail ! gmail ! com

Hi Arun,

Thanks again. Comment noted :)
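
For anyone following along, here is a minimal sketch of why the ==TRUE comparison can be dropped: cumsum() coerces logical values to 0/1 on its own. The vector e_occur below is a made-up stand-in for df.1$E.Occur, not data from the thread.

```r
# Hypothetical logical vector standing in for df.1$E.Occur
e_occur <- c(FALSE, TRUE, FALSE, FALSE, TRUE)

# cumsum() treats TRUE as 1 and FALSE as 0, so it counts events directly
running_count <- cumsum(e_occur)

# Comparing a logical to TRUE returns the same logical vector,
# so this produces the same counts with an extra step
with_comparison <- cumsum(e_occur == TRUE)

identical(running_count, with_comparison)
```

A character column such as the "Yes"/"No" one in the original dput() still needs a comparison (== "Yes") to become logical first.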

Amazing use of regular expressions in your solutions. Is there any reference
or book you would recommend?


Cheers !


On Fri, Nov 29, 2013 at 10:56 PM, arun <smartpink111@yahoo.com> wrote:

> Hi Burhan,
> 
> No problem.  One suggestion for this code would be:
> with(df.1, cumsum(E.Occur==TRUE)/(seq_len(nrow(df.1))))  ## ==TRUE is not needed
> identical( with(df.1, cumsum(E.Occur)/(seq_len(nrow(df.1)))),
> with(df.1, cumsum(E.Occur==TRUE)/(seq_len(nrow(df.1)))) )
> 
> 
> is.logical(TRUE)
> #[1] TRUE
> 
> 
> is.logical("Yes")
> #[1] FALSE
> A.K.
> 
> 
> 
> 
> 
> 
> On Friday, November 29, 2013 12:36 PM, Burhan ul haq <ulhaqz@gmail.com> wrote:
> 
> Hi Arun,
> 
> Thanks a lot. It works perfectly.
> 
> Here is the complete code - for anyone interested in seeing "Rel
> Cum Freq oscillating to reach the Expected Value"
> 
> # Bernoulli trial where:
> v.fly=c("G","B") # Outcome is Green or Blue fly
> n=100 # No of Events / Trials
> v.smp = seq(1:n) # Event Id
> v.fst = sample(v.fly,n,rep=T) # Simulating First Draw
> v.sec = sample(v.fly,n,rep=T)  # Simulating Second Draw
> df.1 = data.frame(sample = v.smp, fst=v.fst, sec = v.sec) # Clumping in a DF
> df.1$E.Occur = with(df.1, ifelse(fst==sec,TRUE,FALSE)) # Event occurs if color is the same in both draws
> df.1$Rel.Freq = with(df.1, cumsum(E.Occur==TRUE)/(seq_len(nrow(df.1)))) # Relative Frequency
> df.1$Rel.Freq = round(df.1$Rel.Freq,2)
> 
> library(ggplot2) # needed for ggplot()
> ggplot(df.1, aes(x=sample,y=Rel.Freq)) +
>   geom_line(col="green",size=2) +
>   geom_abline(intercept=0.5,slope=0) +
>   geom_point(col="blue") +
>   labs(x="Sample No",y="Relative Cum Freq",title="Rel Cum Freq approaching 0.5 Value") +
>   annotate("text",x=60,y=0.53,label="Probability of 0.5")
> 
> 
> 
> Cheers !
> 
> 
> 
> On Thu, Nov 28, 2013 at 9:40 PM, arun <smartpink111@yahoo.com> wrote:
> 
> > Hi,
> > From the dput() version of df.1, it looks like you want:
> > cumsum(df.1[,4]=="Yes")/seq_len(nrow(df.1))
> > [1] 0.0000000 0.5000000 0.3333333 0.2500000 0.4000000 0.3333333 0.4285714
> > [8] 0.5000000 0.4444444 0.5000000
> > 
> > 
> > A.K.
> > 
> > 
> > 
> > On Thursday, November 28, 2013 11:26 AM, Burhan ul haq <ulhaqz@gmail.com> wrote:
> > Hi,
> > 
> > My objective is to calculate "Relative (Cumulative) Frequency of Event
> > Occurrence" - something as follows:
> > 
> > Sample.Number  1st.Fly  2nd.Fly  Did.E.occur?  Relative.Cum.Frequency.of.E
> > 1              G        B        No            0.000
> > 2              B        B        Yes           0.500
> > 3              B        G        No            0.333
> > 4              G        B        No            0.250
> > 5              G        G        Yes           0.400
> > 6              G        B        No            0.333
> > 7              B        B        Yes           0.429
> > 8              G        G        Yes           0.500
> > 9              G        B        No            0.444
> > 10             B        B        Yes           0.500
> > 
> > Please refer to the code below:
> > ##############################################################
> > # 1.
> > v.fly=c("G","B") # Outcome is Green or Blue fly
> > 
> > # 2.
> > n=10 # No of Events / Trials
> > 
> > # 3.
> > v.smp = seq(1:n) # Event Id
> > 
> > # 4.
> > v.fst = sample(v.fly,n,rep=T) # Simulating First Draw
> > 
> > # 5.
> > v.sec = sample(v.fly,n,rep=T)  # Simulating Second Draw
> > 
> > # 6.
> > df.1 = data.frame(sample = v.smp, fst=v.fst, sec = v.sec) # Clumping in a DF
> > 
> > # 7.
> > df.1$E.Occur = with(df.1, ifelse(fst==sec,TRUE,FALSE)) # Event occurs if color is the same in both draws
> > 
> > # 8.
> > df.1$Rel.Freq = with(df.1, cumsum(E.Occur)/(E.Occur)) # Relative Frequency
> > # >>> This line does NOT work; the denominator needs to be fixed
> > ##############################################################
> > 
> > The problem is with #8, specifically the part:
> > cumsum(E.Occur)/(E.Occur)
> > 
> > The denominator E.Occur is a fixed value instead of a moving count. I have
> > tried nrow() and length(), but neither provides a moving version of the row
> > count, as cumsum does for the "True" values occurring so far.
> > 
> > > dput(df.1)
> > structure(list(Sample.Number = 1:10, X1st.Fly = c("G", "B", "B",
> > "G", "G", "G", "B", "G", "G", "B"), X2nd.Fly = c("B", "B", "G",
> > "B", "G", "B", "B", "G", "B", "B"), Did.E.occur. = c("No", "Yes",
> > "No", "No", "Yes", "No", "Yes", "Yes", "No", "Yes"),
> > Relative.Cum.Frequency.of.E = c(0,
> > 0.5, 0.333, 0.25, 0.4, 0.333, 0.429, 0.5, 0.444, 0.5)), .Names =
> > c("Sample.Number",
> > "X1st.Fly", "X2nd.Fly", "Did.E.occur.", "Relative.Cum.Frequency.of.E"
> > ), class = "data.frame", row.names = c(NA, -10L))
> > 
> > 
> > Cheers !
> > 
> > [[alternative HTML version deleted]]
> > 
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> > 
> > 
> 
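
To tie the thread together, the running relative frequency can be sketched in base R against the Did.E.occur. column from the dput() output quoted above. The names did_occur, event and rel_freq are illustrative only, and seq_along() is used here in place of seq_len(nrow(df.1)) simply because the sketch works on a bare vector rather than a data frame.

```r
# Did.E.occur. column copied from the dput() output in the thread
did_occur <- c("No", "Yes", "No", "No", "Yes", "No", "Yes", "Yes", "No", "Yes")

# Character -> logical: TRUE where the event occurred
event <- did_occur == "Yes"

# Events so far divided by trials so far: the "moving" denominator
# is just the running index 1, 2, ..., n
rel_freq <- cumsum(event) / seq_along(event)

round(rel_freq, 3)
```

Rounded to three decimals, this should reproduce the Relative.Cum.Frequency.of.E column from the original question (0, 0.5, 0.333, 0.25, ...).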


