 National Virtual Observatory

SkyQuery Limitations

SkyQuery lets users query individual astronomical catalogs and compare them by finding positional cross-matches, subject to any additional conditions or constraints the user defines on the data in the catalogs. The catalogs/databases available for use are listed on the left side of the Query Screen, under the title Nodes. We call them SkyNodes.

Queries between SkyNodes (including MyData) are always limited to a maximum of 1,000,000 rows. We apply this restriction so that Web access remains possible and big queries don't swamp the systems.

What does this 1,000,000-row limit mean?
- Single node queries will be limited to 1,000,000 rows.
- Cross-matches between query sets that contain more than 1,000,000 objects are likely to be incomplete.

Why?
The way SkyQuery works is as follows:
- First, SkyNodes are queried for the number of rows that meet the query constraints.
- Then a query plan is created so that the SkyNode with the fewest qualifying rows is executed first. It sends its results to the next node in size, which performs the first cross-match, and so on.
- If the first SkyNode has more than 1,000,000 objects meeting the WHERE condition, the first cut is applied here. This is likely to happen when the REGION constraint covers a big area with a lot of objects, and/or the other conditions in the WHERE clause are not very restrictive.
- An additional cut may happen when the cross-match is performed and the results are sent to the next node. During the cross-match, each object from the prior node is compared to the current catalog, looking for matches. If the prior node provided about 1,000,000 rows, then even when a one-to-one match is expected, the xmatch table might end up with more than 1,000,000 rows, depending on the catalog sigma and on how restrictive the confidence level in "XMATCH () < confidence_level" is. See http://www.skyquery.net/Sky/SkySite/help/algo.aspx
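The steps above can be sketched in a few lines of Python. This is only an illustrative model, not SkyQuery's actual implementation: the node names, the row counts, and the `matches_per_object` factor are all hypothetical, and the model assumes each cross-match simply multiplies the incoming rows by an average match rate.

```python
ROW_LIMIT = 1_000_000  # the row limit described on this page


def plan_order(counts):
    """Order SkyNodes from fewest to most rows meeting the query constraints."""
    return sorted(counts, key=counts.get)


def simulate(counts, matches_per_object):
    """Return (node, rows_produced, rows_passed_on) for each step of the plan.

    counts: hypothetical number of rows per node that meet the WHERE condition.
    matches_per_object: assumed average number of matches each incoming
        object finds in the next node (a simplification for illustration).
    """
    steps = []
    passed = None
    for node in plan_order(counts):
        if passed is None:
            # First node: all rows meeting the WHERE condition.
            produced = counts[node]
        else:
            # Later nodes: cross-match the rows received from the prior node.
            produced = round(passed * matches_per_object)
        # Anything beyond the limit is cut before being sent on.
        passed = min(produced, ROW_LIMIT)
        steps.append((node, produced, passed))
    return steps


if __name__ == "__main__":
    # Hypothetical counts; "NodeA" and "NodeB" are placeholder names.
    for node, produced, passed in simulate(
        {"NodeA": 1_200_000, "NodeB": 5_000_000}, matches_per_object=1.5
    ):
        print(f"{node}: produced {produced:,}, passed on {passed:,}")
```

In this hypothetical run both steps produce more rows than the limit allows, so rows are dropped twice: once at the first node and once after the cross-match.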

Conclusion
If you expect a big overlap between catalogs, use constraints (PARAMETER ranges or REGIONS) that keep the number of objects small. How small depends on how many objects you expect to cross-match per object.
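As a rough guide to "how small", the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate under the assumption that every input object finds the same average number of matches; the function name and the example match rates are hypothetical.

```python
ROW_LIMIT = 1_000_000  # the row limit described on this page


def max_input_rows(matches_per_object, limit=ROW_LIMIT):
    """Largest input size whose expected cross-match output fits under the limit.

    matches_per_object: assumed average number of matches per input object.
    The input itself is also subject to the limit, so the result never
    exceeds `limit`.
    """
    if matches_per_object <= 0:
        raise ValueError("matches_per_object must be positive")
    return min(limit, int(limit // matches_per_object))


if __name__ == "__main__":
    for rate in (0.5, 1, 2, 2.5):
        print(f"{rate} matches per object: keep input under {max_input_rows(rate):,} rows")
```

For example, if each object is expected to match about two objects in the other catalog, the constraints should keep the first result set to roughly 500,000 rows or fewer.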
We are already working on a parallel framework capable of doing full catalog-to-catalog cross-matches.
Thank you for your patience!


Sponsored all or in part by the National Science Foundation under Cooperative Agreement AST0122449 with The Johns Hopkins University.
Developed in collaboration with the International Virtual Observatory Alliance.
Contact the NVO Help Desk to report problems and suggestions.
Last Modified: Wednesday, July 09, 2008 at 11:58:55 AM, Revision 1.1.1.1