Re: [math-fun] Prime question

2 Nov 2007


      I'm amazed that Google is not using the "text behind image" feature of  
PDF files to handle this.  Modern OCR programs produce text behind the  
page image, which is then searchable and selectable for cut and paste.   
I wonder why this obviously useful idea is not used in their scanning  
and OCR.  I guess they want to be the only people who can search and  
index the pages.

On Nov 2, 2007, at 9:27 AM, Henry Baker wrote:
...
At 03:39 AM 11/2/2007, Joshua Zucker wrote:
...
http://mathworld.wolfram.com/CunninghamChain.html and
http://hjem.get2net.dk/jka/math/Cunningham_Chain_records.htm
might be a good enough starting point for learning about these things.
Enjoy,
--Joshua Zucker
On the upper right hand corner of this page, there is a link called  
"Download PDF 9.5M".  I clicked on it, and downloaded what appears to  
be the entire book of 308 pages.  You can also click on "View plain  
text", which will give you some idea of how the character recognition  
program is working (used for indexing).  Since algebraic equation  
recognition isn't doing so well yet, the equations get trashed.  It  
might be nice to be able to download both the pdf and the ascii text,  
so that you can search the book yourself, as most books have  
completely useless indices (the major exception being Knuth's).
http://books.google.com/books?id=aC0PAAAAIAAJ&printsec=frontcover&dq=intitle:quaternions+inauthor:tait&as_brr=0
Good luck!

Tom Knight