Overview of implementation
- Create ASP.NET web site
- Connect to Content Database for SharePoint
- Execute query
- Display data on page
Microsoft Visual Studio 2005 (you could conceptually use Visual Studio 2003, or even Notepad)
Implementation
Open Microsoft Visual Studio 2005 and create a new web site. Create a connection to your content database. If you don't know which database that is, look through your SharePoint databases until you find one that has a "Docs" table.
The table we are interested in is the Docs table. It holds the metadata and binary file content for all files in all document libraries, in both SPS and WSS.
The following query can be used to get the data. Add any file extensions that you want to index to the query. The document libraries also contain files that you don't even see, so I recommend explicitly specifying the file extensions you want to include in the indexing.
SELECT Docs.DirName + '/' + Docs.LeafName AS URL
FROM Docs
WHERE ((Docs.Type = 0) AND Docs.LeafName NOT LIKE 'template.%')
  AND ((Docs.LeafName LIKE '%.doc%')
    OR (Docs.LeafName LIKE '%.ppt%')
    OR (Docs.LeafName LIKE '%.xls%')
    OR (Docs.LeafName LIKE '%.pdf%')
    OR (Docs.LeafName LIKE '%.vsd%')
    OR (Docs.LeafName LIKE '%.txt%'))
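Since the list of extensions is the part you are most likely to change, it can help to generate the query text from a list instead of editing the SQL by hand. The following is only a sketch of that idea, written in Python for brevity. The helper name build_docs_query is hypothetical, and the check that restricts extensions to letters and digits is my own assumption, added so nothing can break out of the quoted LIKE pattern.

```python
# Hypothetical helper (not part of SharePoint or GSA): builds the Docs
# query shown above from a list of file extensions.
BASE_QUERY = (
    "SELECT Docs.DirName + '/' + Docs.LeafName AS URL "
    "FROM Docs "
    "WHERE ((Docs.Type = 0) AND Docs.LeafName NOT LIKE 'template.%')"
)


def build_docs_query(extensions):
    """Return the SQL text with one LIKE clause per extension."""
    cleaned = []
    for ext in extensions:
        # Accept "doc", ".doc" or " DOC " and normalise to "doc".
        ext = ext.strip().lstrip(".").lower()
        # Assumption: allow only letters and digits, so the value cannot
        # escape the quoted LIKE pattern.
        if not ext.isalnum():
            raise ValueError(f"unexpected extension: {ext!r}")
        cleaned.append(ext)
    if not cleaned:
        raise ValueError("at least one extension is required")
    likes = " OR ".join(f"(Docs.LeafName LIKE '%.{ext}%')" for ext in cleaned)
    return f"{BASE_QUERY} AND ({likes})"


if __name__ == "__main__":
    print(build_docs_query(["doc", "ppt", "xls", "pdf", "vsd", "txt"]))
```

Calling it with the six extensions above yields a query equivalent to the hand-written one. Adding a new file type is then a one-word change to the list.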
You will probably want to build the links in code so that the host name, http://hostnameHere/, can be prepended to each URL.
Bind the results to a GridView or some other control that has paging built in.
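The host name prefix can be sketched as follows, again in Python purely for illustration. The function name absolute_url is hypothetical. The percent-encoding step is my own assumption rather than something the query requires: document library paths often contain spaces, and an encoded link is safer to hand to a crawler.

```python
from urllib.parse import quote

# Placeholder host from the text; replace with your SharePoint host name.
HOST = "http://hostnameHere/"


def absolute_url(relative_url, host=HOST):
    """Prepend the host to a DirName/LeafName value returned by the query."""
    # Assumption: percent-encode spaces and other unsafe characters,
    # but leave the "/" path separators untouched.
    path = quote(relative_url.lstrip("/"), safe="/")
    return host.rstrip("/") + "/" + path


if __name__ == "__main__":
    print(absolute_url("sites/team/Shared Documents/plan.doc"))
```

The same two steps (trim the slashes, then join host and path) carry over directly to whatever language the page itself is written in.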
- Don't use the built-in pagers; provide your own links to all the pages in the GridView
If you choose option 1 and write your own pager, and you want every page to start out with equal weight in the GSA results, I strongly recommend that the pager also include direct links to all the pages. The reason is that if you only have a Next button, for example, GSA will see page 20 as roughly 20 hops away from the page you originally asked it to index. GSA will still include it in the index, but it gives pages more than about 10 hops away an extremely low page rank (basically zero). Each page from 0 to 10 gets a progressively smaller page rank, so pages 10 to 20, for example, have a near-zero page rank.
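To make the hop argument concrete, here is a small Python sketch of a pager that emits a direct link to every page, plus a helper that counts how many clicks a crawler needs from the first page. The page name docs.aspx, the page query parameter, and both function names are hypothetical. The hop count is only the simple model described above, not a statement of how GSA actually computes page rank.

```python
def pager_links(page_count, base_url):
    """Return one direct link per results page (page numbers start at 1)."""
    return [f"{base_url}?page={n}" for n in range(1, page_count + 1)]


def hops_from_first_page(page, direct_links):
    """Clicks needed to reach `page` when starting from page 1."""
    if page == 1:
        return 0
    # With direct links every page is a single click away; with only a
    # Next button the crawler has to walk through each page in turn.
    return 1 if direct_links else page - 1


if __name__ == "__main__":
    for link in pager_links(3, "http://hostnameHere/docs.aspx"):
        print(link)
    print(hops_from_first_page(20, direct_links=False))
    print(hops_from_first_page(20, direct_links=True))
```

With a Next-only pager the last pages end up deep in the crawl, while the full list of links keeps every page one click from the first.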