'[Lucene-hadoop Wiki] Update of "Hbase/Matrix" by udanax'


List:       hadoop-commits
Subject:    [Lucene-hadoop Wiki] Update of "Hbase/Matrix" by udanax
From:       Apache Wiki <wikidiffs () apache ! org>
Date:       2007-12-29 9:00:24
Message-ID: 20071229090024.16966.45288 () eos ! apache ! org

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/Matrix

The comment on the change is:
I'll re-write after some transformation.

------------------------------------------------------------------------------
- [[TableOfContents(4)]]
+ deleted
  
-  * This page is at the design stage.
- 
- ----
- == LINA, a Framework for Large-scale Sparse Linear Algebra ==
- 
- Using Hbase's two-dimensional Row, Column (Qualifier) space, we can store large sparse matrices.
- [[BR]]The auto-partitioned sparsity substructure will be efficiently managed and served by Hbase.
- 
- Row or column operations can be done in linear time, and algorithms such as structured Gaussian elimination
- [[BR]]or iterative methods run in '''O(~-the number of non-zero elements in the matrix-~)''' time, with Hbase providing both vertical and horizontal access.
- 
- Therefore, using iterative algorithms on a parallel processing platform like !MapReduce on Hadoop,
- [[BR]]we should be able to implement '''High Performance''' and '''World's Largest''' Matrix Computations.
- 
- === Applications by LINA ===
- 
- It will support a wide variety of applications in the domains of Physics, Linear Algebra, Relational Algebra,
- [[BR]]Statistics, Graphics Rendering, Computational Dynamics and others.
- 
-  * Scientific simulation and modeling 
-   * Matrix-vector/matrix-matrix multiply 
-   * Solving linear systems
-   * Collaborative filtering/recommendation systems
-  * Information retrieval 
-   * Sorting 
-   * Finding eigenvalues and eigenvectors 
-  * Computer graphics and computational geometry 
-   * Matrix multiply 
-   * Computing the matrix determinant
- 
- === Initial Contributors ===
- 
-  * [:udanax:Edward Yoon] (R&D center, NHN corp.)
-   * I am indebted to '''Prof. Samuel Kim''' and '''Dr. Yongchan Park'''
- 
- == Storing and manipulating numeric, sparse matrices on Hadoop + Hbase ==
- 
- To store a sparse matrix in Hbase, I will use an abstracted table with a single column family, and tune the existing partitioning conditions.
- The extensible column-family feature can combine matrices that have the same row dimension.
- 
- === Scheduling Algorithm ===
- 
- The scheduling algorithm is designed to drive the parallel execution of the factorization on a MapReduce model.
- 
- NOTE : indirect and irregular, the inefficient data access ....
- 
- NOTE : blocking factor........ code generator...... 
- 
-  * Needs
-   * Total Row/Column Key Count in the matrix table header.
-   * Big floating point number, Big integer calculator
- 
- {{{
-                                               
- +--------------------+               +------+ | +--+ 
- |  .          .      |               |.     | | |. |     
- |       .      .     |               |     .| | | .|     
- |                    |               +------+ | +--+     
- |                    |    ---->   ------------+-----------
- |                    |              +-----+   |+-------+          
- | .  .       .     . |              |.  . |   ||.     .|
- |                    |              +-----+   |+-------+        
- +--------------------+                        |
- 
- }}}
- 
- 
- === Sums on the MapReduce Tasks of Hadoop ===
- 
- == Examples ==
- 
- === Collaborative Filtering Via Ensembles of Matrix Factorizations ===
- 
- No work, no grub.
- 
- === Fast Page-Rank Computation Via Sparse Linear System ===
- 
- No work, no grub.
- 
- === Latent Semantic Analysis Via Singular Value Decomposition ===
- 
- No work, no grub.
- 
- == References ==
- 
-  * High performance numerical libraries in Java, Bjørn-Ove Heimsund
-  * Parallel Conjugate Gradients Assignment, a parallel implementation of the conjugate gradient algorithm
-  * ScaLAPACK, a library of high-performance linear algebra routines for distributed-memory message-passing MIMD computers
-  * [http://bebop.cs.berkeley.edu/oski/ OSKI], the Optimized Sparse Kernel Interface library
-   * [http://icl.cs.utk.edu/iclprojects/pages/files/sans/yelick-bebop.pdf Automatic Performance Tuning of Sparse Matrix Kernels], August 7, 2002
-  * Scheduling algorithms for parallel Gaussian elimination with communication costs, Amoura, A.K.; Bampis, E.; König, J.-C.
-  * Google's PageRank and Beyond: The Science of Search Engine Rankings, Amy N. Langville; Carl D. Meyer
- 

