
List:       wekalist
Subject:    Re: [Wekalist] Real-world datasets with skewed class distributions
From:       Paul <paul.m.nz () gmail ! com>
Date:       2006-10-29 18:49:47
Message-ID: a5ed8b040610291049sa0a4e17mc0e0b39e017b7f13 () mail ! gmail ! com

> >Does anyone know or is willing to contribute my students real-world
> >(not artificial) data sets of very skewed applications such as fraud
> >or spam detection? Multi-class files are ok too.

I've found most of the imbalanced datasets I've used by reviewing the
existing literature and seeing what other authors have used.

For instance, Forest Covertype (from the UCI Machine Learning
Repository) is a very large dataset that has been used in imbalanced
learning work by selecting 2 of the original 7 classes: one with few
instances, the other with many.
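
A minimal Python sketch of that selection step, assuming the data is
already loaded as (features, label) pairs. The function names and the
toy rows below are made up for illustration; they are not taken from
the Forest Cover data, and the class labels are arbitrary.

```python
from collections import Counter


def make_binary_subset(rows, minority_label, majority_label):
    """Keep only the two chosen classes; relabel minority as 1, majority as 0."""
    subset = []
    for features, label in rows:
        if label == minority_label:
            subset.append((features, 1))
        elif label == majority_label:
            subset.append((features, 0))
        # rows from any other class are dropped
    return subset


def imbalance_ratio(subset):
    """Return the number of majority rows per minority row."""
    counts = Counter(label for _, label in subset)
    if counts[1] == 0:
        raise ValueError("no minority-class rows in subset")
    return counts[0] / counts[1]


# Toy multi-class data (hypothetical): 95 rows of class 2, 5 of class 4,
# and 50 of class 1 that will be discarded.
toy_rows = (
    [([i], 2) for i in range(95)]
    + [([i], 4) for i in range(5)]
    + [([i], 1) for i in range(50)]
)

binary = make_binary_subset(toy_rows, minority_label=4, majority_label=2)
print(len(binary), imbalance_ratio(binary))
```

Which two classes to pair is a judgment call; the ratio helper just
makes it easy to check how skewed a given pairing turns out to be.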

/paul

