List:       abiword-dev
Subject:    NLP Inner Product Using OTS
From:       Nadav Rotem <nadavrotem () mail ! ru>
Date:       2003-11-25 13:48:13

The inner product of two texts is defined as the number of topics they
share. One of my professors is doing research in this field and needed a
matrix of the inner products of chunks of text. Here is a quick example,
in a Bash script, of how to use OTS to generate the list of topics for
each text and count the ones they have in common.

Usage of the script:

[nadav@gringo articles]$ ./inner.sh sacbee1.txt sacbee2.txt
<sacbee1.txt,sacbee2.txt>= 0

[nadav@gringo articles]$ ./inner.sh test1.txt test2.txt
<test1.txt,test2.txt>= 3
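
To make the counting concrete without needing OTS installed, here is a
minimal sketch of the same pairwise comparison on two hand-written topic
lists. The function name inner_product and the sample words are made up
for illustration; they are not part of OTS or of inner.sh.

```shell
#!/bin/bash
# Hypothetical helper: count matching pairs between two space-separated
# topic lists, using the same nested comparison as inner.sh.
# The body runs in a subshell, so its variables stay local.
inner_product() (
	count=0
	for a in $1; do
		for b in $2; do
			if [ "$a" = "$b" ]; then
				count=$((count + 1))
			fi
		done
	done
	echo "$count"
)

# Made-up topic lists, for illustration only.
inner_product "bank loan rate" "rate bank market"
```

The two lists share "bank" and "rate", so this prints 2. Because every
pair is compared, a topic repeated within one list would be counted once
per repetition.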
   
From your C code you can get the list of topics through this call:
word = ots_word_in_list(Doc->ImpWords,i);

-Nadav

["inner.sh" (inner.sh)]

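#!/bin/bash
# Count the topic keywords that two text files share, according to OTS.

# Take the text between the first pair of double quotes in the output of
# "ots --about" and turn its commas into spaces.
KeyA=$(ots --about "$1" | cut -f2 -d'"' | sed -e 's/,/ /g')
KeyB=$(ots --about "$2" | cut -f2 -d'"' | sed -e 's/,/ /g')

C=0
# Compare every topic of the first file with every topic of the second.
for wordA in $KeyA; do
	for wordB in $KeyB; do
		if [ "$wordA" = "$wordB" ]; then
			C=$((C + 1))
		fi
	done
done

echo "<$1,$2>= $C"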

