
List:       python-list
Subject:    Re: Update a specific element in all a list of N lists
From:       Friedrich Rentsch <anthra.norell () bluewin ! ch>
Date:       2021-12-19 12:41:34
Message-ID: d77abddc-ab35-4854-4b7c-7ca14ca5bdb6 () bluewin ! ch



On 12/16/21 3:00 PM, hanan lamaazi wrote:
> Dear All,
>
> I really need your assistance,
>
> I have a dataset with 1005000 rows and 25 columns,
>
> The main columns that I repeatedly use are Time, ID, and Reputation
>
> First I sliced the data based on the time, and I append the sliced data in
> a list called "df_list". So I get 201 lists with 25 columns
>
> The main code starts here:
>
> for elem in df_list:
>
> {do something.....}
>
> {Here I'm trying to calculate the outliers}
>
> Out.append(outliers)
>
> Now my problem is that I need to locate those outliers in the df_list and
> then update another column, which is the "Reputation"
>
> Note that there are duplicated IDs, but in different time slots
>
> For example, if ID = 1 is an outlier, I need to select all ID = 1 in the list and
> update their reputation column
>
> I tried those solutions:
> 1)
>
> grp = data11.groupby(['ID'])
> for i in GlobalNotOutliers.ID:
>     data11.loc[grp.get_group(i).index, 'Reput'] += 1
>
> for j in GlobalOutliers.ID:
>     data11.loc[grp.get_group(j).index, 'Reput'] -= 1
>
>
> It works for a dataframe but not for a list
>
> 2)
>
> for elem in df_list:
>     elem.loc[elem['ID'].isin(Outlier['ID'])]
>
>
> It doesn't select the right IDs, it gives the whole values in elem
>
> 3) Here I set the index using IDs:
>
> for i in Outlier.index:
>      for elem in df_list:
>          print(elem.Reput)
>          if i in elem.index:
> #             elem.loc[elem[i] , 'Reput'] += 1
>              m = elem.iloc[i, :]
>              print(m)
>
>
> It gives this error:
>
> IndexError: single positional indexer is out-of-bounds
>
>
> I'm greatly thankful to anyone who can help me,

I'd suggest you group your records by date and put each group into a 
dict keyed by date. As you collect each record into its group, store 
alongside it the index of that record in the original list. Then go 
through all your groups, record by record, looking for outliers. The 
index stored with each record identifies the record in the original 
list that you want to update. Something like this:

     # Group the records by date, keeping each record's position
     # in the original list alongside it.
     dictionary = {}
     for i, record in enumerate(original_list):
         date = record[DATE_INDEX]
         dictionary.setdefault(date, []).append((record, i))

     # Collect the positions of all records flagged as outliers.
     reputation_indexes = set()
     for date, records in dictionary.items():
         for record, i in records:
             if has_outlier(record):
                 reputation_indexes.add(i)

     # Update the flagged records in the original list.
     for i in reputation_indexes:
         update_reputation(original_list[i])
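
To make the sketch above concrete, here is a small self-contained version 
with made-up data. The record layout, the threshold test in has_outlier 
and the decrement in update_reputation are assumptions for illustration 
only; substitute whatever your real data and outlier test look like.

```python
from collections import defaultdict

# Hypothetical record layout: [date, id, value, reputation].
DATE_INDEX, ID_INDEX, VALUE_INDEX, REPUTATION_INDEX = 0, 1, 2, 3

original_list = [
    ["2021-12-01", 1, 10.0, 0],
    ["2021-12-01", 2, 11.0, 0],
    ["2021-12-01", 3, 95.0, 0],
    ["2021-12-02", 1, 12.0, 0],
    ["2021-12-02", 3, 9.0, 0],
]


def has_outlier(record, threshold=50.0):
    """Stand-in outlier test: flag values above a fixed threshold."""
    return record[VALUE_INDEX] > threshold


def update_reputation(record):
    """Stand-in update: lower the reputation by one, in place."""
    record[REPUTATION_INDEX] -= 1


# Group records by date, remembering each record's original position.
groups = defaultdict(list)
for i, record in enumerate(original_list):
    groups[record[DATE_INDEX]].append((record, i))

# Collect the positions of the outlier records.
reputation_indexes = set()
for date, records in groups.items():
    for record, i in records:
        if has_outlier(record):
            reputation_indexes.add(i)

# Apply the update to the flagged records in the original list.
for i in sorted(reputation_indexes):
    update_reputation(original_list[i])

reputations = [record[REPUTATION_INDEX] for record in original_list]
```

If every record sharing an outlier's ID should be updated, as the 
question describes, the set could hold IDs instead of positions, 
followed by one pass over the original list.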

Frederic


