[prev in list] [next in list] [prev in thread] [next in thread]
List: hadoop-commits
Subject: [Lucene-hadoop Wiki] Update of "Hbase/Matrix" by udanax
From: Apache Wiki <wikidiffs () apache ! org>
Date: 2007-12-29 9:00:24
Message-ID: 20071229090024.16966.45288 () eos ! apache ! org
[Download RAW message or body]
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for \
change notification.
The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/Matrix
The comment on the change is:
I'll re-write after some transformation.
------------------------------------------------------------------------------
- [[TableOfContents(4)]]
+ deleted
- * It's a designing stage.
-
- ----
- == LINA, a Framework for Large-scale Sparse Linear Algebra ==
-
- Using Hbase's Row,Column(Qualifier) two dimensional space, we are able to store \
large sparse matrix.
- [[BR]]The Auto-partitioned sparsity sub-structure will be efficiently managed and \
serviced by Hbase.
-
- Row or Column operations can be done in linear time and algorithms such as \
structured Gaussian elimination
- [[BR]]or iterative methods run in '''O(~-the number of non-zero elements in the \
matrix-~)''' time, Hbase Providing Both Vertical and Horizontal \
access.
-
- Therefore, using iterative algorithms on a parallel processing platform like \
!MapReduce on Hadoop,
- [[BR]]we should be able to implement '''High Performance''' and '''World's \
Largest''' Matrix Computations.
-
- === Applications by LINA ===
-
- It will support a wide variety of applications in the domain of Physics, Linear \
Algebra, Relational Algebra,
- [[BR]]Statistics, Graphics Rendering, Computational Dynamics and others.
-
- * Scientific simulation and modeling
- * Matrix-vector/matrix-matrix multiply
- * Soving linear systems
- * Collaborative filtering/recommendation systems
- * Information retrieval
- * Sorting
- * Finding eigenvalues and eigenvectors
- * Computer graphics and computational geometry
- * Matrix multiply
- * computing matrix determinate
-
- === Initial Contributors ===
-
- * [:udanax:Edward Yoon] (R&D center, NHN corp.)
- * I have profited by '''Prof. Samuel Kim''', '''Dr. Yongchan Park'''
-
- == Storing and manipulating numeric, sparse matrices on Hadoop + Hbase ==
-
- To store sparse matrix in hbase, I will use an abstracted table with a single \
column family, and tune existing partitioning conditions.
- The extendable columnfamily feature combine matrices that have the same row \
dimension.
-
- === Scheduling Algorithm ===
-
- The scheduling algorithm is designed for driving the parallel execution of the \
factorization on on a MapReduce model.
-
- NOTE : indirect and irregular, the inefficient data access ....
-
- NOTE : blocking factor........ code generator......
-
- * Needs
- * Total Row/Column Key Count in the matrix table header.
- * Big floating point number, Big integer calculator
-
- {{{
-
- +--------------------+ +------+ | +--+
- | . . | |. | | |. |
- | . . | | .| | | .|
- | | +------+ | +--+
- | | ----> ------------+-----------
- | | +-----+ |+-------+
- | . . . . | |. . | ||. .|
- | | +-----+ |+-------+
- +--------------------+ | \
-
- }}}
-
-
- === Sums on the MapReduce Tasks of Hadoop ===
-
- == Examples ==
-
- === Collaborative Filtering Via Ensembles of Matrix Factorizations ===
-
- No work, no grub.
-
- === Fast Page-Rank Computation Via Sparse Linear System ===
-
- No work, no grub.
-
- === Latent Semantic Analysis Via Singular Value Decomposition ===
-
- No work, no grub.
-
- == References ==
-
- * High performance numerical libraries in Java, Bjørn-Ove Heimsund
- * Parallel Conjugate Gradients Assignment, a parallel implementation of the \
conjugate gradient algorithm
- * ScaLAPACK, a library of high-performance linear algebra routines for \
distributed-memory message-passing MIMD computers
- * [http://bebop.cs.berkeley.edu/oski/ OSKI], optimized sparse kernel interface \
(OSKI) library
- * [http://icl.cs.utk.edu/iclprojects/pages/files/sans/yelick-bebop.pdf Automatic \
Performance Tuning of Sparse Matrix Kernels], August 7, 2002
- * Scheduling algorithms for parallel Gaussian elimination with communication \
costs, Amoura, A.K.; Bampis, E.; Konig, J.-C.
- * Google's Page-Rank and Beyond: The Science of Search Engine Rankigs, Amy N. \
Langville; Carl D. Meyer
-
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic