List: abiword-dev
Subject: NLP Inner Product Using OTS
From: Nadav Rotem <nadavrotem () mail ! ru>
Date: 2003-11-25 13:48:13
The inner product of two texts is defined as the number of topics they
share. One of my professors is doing research in this field and needed a
matrix of the inner products of chunks of text. Here is a quick example,
in a Bash script, of how to use OTS to generate the list of topics for
each text and count the ones two texts have in common.
Usage of the script:
[nadav@gringo articles]$ ./inner.sh sacbee1.txt sacbee2.txt
<sacbee1.txt,sacbee2.txt>= 0
[nadav@gringo articles]$ ./inner.sh test1.txt test2.txt
<test1.txt,test2.txt>= 3
From your C code you can get the list of topics through this call:
word = ots_word_in_list(Doc->ImpWords,i);
-Nadav
["inner.sh" (inner.sh)]
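#!/bin/bash
# Count the topics (keywords) that two text files share, as reported by OTS.
# Usage: ./inner.sh fileA fileB
KeyA=$(ots --about "$1" | cut -f2 -d'"' | sed -e 's/,/ /g')
KeyB=$(ots --about "$2" | cut -f2 -d'"' | sed -e 's/,/ /g')
C=0
# Compare every keyword of the first file with every keyword of the second.
for wordA in $KeyA; do
  for wordB in $KeyB; do
    if [ "$wordA" = "$wordB" ]; then
      C=$((C + 1))
    fi
  done
done
echo "<$1,$2>= $C"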
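
Since the goal is a matrix of inner products, the script above could be
extended to loop over every pair of files. The following is a minimal
sketch, not part of the original attachment: the function names
keywords_of, inner_product and inner_matrix are hypothetical, and it
assumes that `ots --about` prints its keyword list in the same quoted,
comma-separated form that inner.sh relies on.

```shell
#!/bin/bash
# Sketch only: keywords_of, inner_product and inner_matrix are
# hypothetical helper names, not part of OTS.

# Print the OTS keywords of one file as a space-separated list, using
# the same pipeline as inner.sh (output format assumed from that script).
keywords_of() {
  ots --about "$1" | cut -f2 -d'"' | sed -e 's/,/ /g'
}

# Count matching keyword pairs between two whitespace-separated lists,
# with the same nested-loop semantics as inner.sh.
inner_product() {
  local a b count=0
  for a in $1; do
    for b in $2; do
      if [ "$a" = "$b" ]; then
        count=$((count + 1))
      fi
    done
  done
  echo "$count"
}

# Print one row per file; column j of row i holds <file_i,file_j>.
inner_matrix() {
  local fa fb row
  for fa in "$@"; do
    row=""
    for fb in "$@"; do
      row="$row $(inner_product "$(keywords_of "$fa")" "$(keywords_of "$fb")")"
    done
    echo "$fa:$row"
  done
}
```

After sourcing this file, `inner_matrix test1.txt test2.txt` would print
one row per input file. Each pair runs ots again, so for many files it
is probably worth caching the keyword lists first.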