Crawl config.xml

From TeamWeaverWiki

The file crawl_config.xml in the \teamweaverIS-backend\WEB-INF\conf directory of your TeamWeaverIS backend lets you group the repositories to be crawled (cf. repo_config.xml) into "crawls", which can be triggered by the TeamWeaverIS backend.

You need to define at least one crawl in order to crawl data sources. Defining multiple crawls makes sense if you want to treat groups of repositories differently, e.g. one group that is crawled only once a week while another is crawled every night.

Example `crawl_config.xml`

<?xml version="1.0" encoding="UTF-8"?>
<crawl_config>
	<crawlInfo>
		<crawlId>0</crawlId>
		<crawlName>nightly crawl</crawlName>
		<updateFrequency>3600000</updateFrequency>
		<repositoryIds>
			<repositoryId>1</repositoryId>
			<repositoryId>2</repositoryId>
			<repositoryId>3</repositoryId>
		</repositoryIds>		
	</crawlInfo>
</crawl_config>

Documentation of parameters

crawl_config.xml contains one <crawlInfo> entry for each crawl that can be started from the command line (e.g. "crawl.bat 0").
The parameters inside the <crawlInfo> element are as follows:
- <crawlId>0</crawlId> - unique numeric id of the crawl, used on the command line to reference the crawl
- <crawlName>nightly crawl</crawlName> - human-readable description for logging/documentation purposes
- <updateFrequency>3600000</updateFrequency> - optional; this parameter is currently not used
- <repositoryIds><repositoryId>1</repositoryId><repositoryId>2</repositoryId><repositoryId>3</repositoryId></repositoryIds> - insert one <repositoryId> element for each repository to be included in that crawl (use the id defined in repo_config.xml for referencing)
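
To illustrate the grouping described above, here is a minimal sketch of a crawl_config.xml with two crawls. It assumes that several <crawlInfo> elements can simply be placed next to each other inside <crawl_config>. The crawl names and the repository ids 1 to 4 are hypothetical and must match ids defined in your own repo_config.xml. <updateFrequency> is left out because it is optional and currently not used.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<crawl_config>
	<!-- Crawl 0: repositories that should be crawled every night (ids are hypothetical) -->
	<crawlInfo>
		<crawlId>0</crawlId>
		<crawlName>nightly crawl</crawlName>
		<repositoryIds>
			<repositoryId>1</repositoryId>
			<repositoryId>2</repositoryId>
		</repositoryIds>
	</crawlInfo>
	<!-- Crawl 1: repositories that only need a weekly crawl (ids are hypothetical) -->
	<crawlInfo>
		<crawlId>1</crawlId>
		<crawlName>weekly crawl</crawlName>
		<repositoryIds>
			<repositoryId>3</repositoryId>
			<repositoryId>4</repositoryId>
		</repositoryIds>
	</crawlInfo>
</crawl_config>
```

With such a file, "crawl.bat 0" would reference the nightly crawl and "crawl.bat 1" the weekly one. Since the names are only descriptions, the actual timing would presumably have to come from whatever starts the command, for example an external scheduler.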
