Data Discovery and Query with the Butler
Contact author(s): Alex Drlica-Wagner, Melissa Graham
Last verified to run: 2024-12-02
LSST Science Pipelines version: Weekly 2024_42
Container Size: medium
Targeted learning level: intermediate
Description: Learn how to discover data and apply query constraints with the Butler.
Skills: Use the Butler registry, dataIds, and spatial and temporal constraints.
LSST Data Products: calexps, deepCoadds, sources
Packages: lsst.daf.butler
Credit: Elements of this tutorial were originally developed by Alex Drlica-Wagner in the context of the LSST Stack Club.
Get Support: Find DP0-related documentation and resources at dp0.lsst.io. Questions are welcome as new topics in the Support - Data Preview 0 Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction
In the introductory Butler tutorial, we learned how to access DP0 data given a specific data identifier (dataId). In this tutorial, we will explore how to use the Butler to find available data sets that match different sets of criteria (for example, by performing spatial and temporal searches). As a reminder, full Butler documentation can be found in the documentation for lsst.daf.butler. For this notebook in particular, you might find this set of Frequently Asked Questions for the LSST Science Pipelines middleware to be useful.
1.1. Package Imports
Import general Python packages and several packages from the LSST Science Pipelines, including the Butler package and AFW Display, which can be used to display images.
More details and techniques regarding image display with afwDisplay can be found in the rubin-dp0 GitHub Organization's tutorial-notebooks repository.
# Generic python packages
import numpy as np
import pylab as plt
import astropy.time
# LSST Science Pipelines (Stack) packages
import lsst.daf.butler as dafButler
import lsst.afw.display as afwDisplay
afwDisplay.setDefaultBackend('matplotlib')
# Set a standard figure size to use
plt.rcParams['figure.figsize'] = (8.0, 8.0)
1.2. Create an instance of the Butler
Create an instance of the Butler pointing to the DP0.2 data by specifying the dp02 configuration and the 2.2i/runs/DP0.2 collection.
butler = dafButler.Butler('dp02', collections='2.2i/runs/DP0.2')
2. Explore the DP0 data repository
Butler repositories have both a database component and a file-like storage component. The database component can be accessed through the Butler registry, while file-like storage can be local (i.e., pointing to a directory on the local file system) or remote (i.e., pointing to cloud storage resources). DP0 uses Simple Storage Service (S3) buckets, which are public cloud storage resources that are similar to file folders. The S3 buckets store objects, which consist of data and its descriptive metadata.
2.1. The Butler registry
The database side of a data repository is called a registry.
The registry contains entries for all data products, and organizes them by collections, dataset types, and data IDs.
We can access a registry client directly as part of our Butler object:
registry = butler.registry
Optional: learn more about the registry by uncommenting the following line.
# help(registry)
2.1.1. Querying collections
Collections are lightweight groups of datasets such as the set of raw images for a particular instrument, self-consistent calibration datasets, and the outputs of a processing run.
For DP0.2, we use the 2.2i/runs/DP0.2 collection, which we specified when creating our instance of the Butler.
It is possible to access other collections, which can be queried with butler.collections.query.
More about collections can be found in the lsst.daf.butler documentation and in the middleware FAQ.
Risk reminder: for DP0 there are no read/write restrictions on the Butler repository.
The above risk means all users can see everything in the Butler, including intermediate processing steps, test runs, staff repositories, and other user repositories. This fact makes the butler.collections.query functionality less useful for data discovery than it will be in the future, due to the sheer number of Butler collections exposed to users.
Optional: print a giant list of every collection that exists.
# for c in sorted(butler.collections.query('*')):
#     print(c)
The only collection users need is the DP0.2 collection, which is named "2.2i/runs/DP0.2". Use butler.collections.query_info to learn more about the contents of that collection. Because it is a chained collection, the result includes a long list of the child collections that make up the DP0.2 parent collection.
butler.collections.query_info('2.2i/runs/DP0.2')
[CollectionInfo(name='2.2i/runs/DP0.2', type=<CollectionType.CHAINED: 3>, doc='', children=('2.2i/runs/DP0.2/w_2022_22/PREOPS-1209/20220601T202430Z', '2.2i/runs/DP0.2/w_2022_22/PREOPS-1209/20220602T165151Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step7/20220501T161443Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_1/20220512T201515Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_10/20220514T021220Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_11/20220514T051259Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_12/20220514T081014Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_13/20220514T110509Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_14/20220514T140812Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_15/20220514T171751Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_16/20220514T201929Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_17/20220514T232440Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_17/20220516T033516Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_18/20220515T023234Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_19/20220515T054618Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_2/20220513T015911Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_20/20220515T144944Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_3/20220513T050000Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_4/20220515T113417Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_5/20220513T110250Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_6/20220513T140400Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_7/20220513T170841Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_8/20220513T201047Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step6_9/20220513T231122Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_52/20220511T193037Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_51/20220511T172849Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_50/20220511T152602Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_49/20220511T224921Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_48/20220511T115326Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_47/20220511T093215Z', 
'2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_46/20220511T072551Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_45/20220511T051607Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_44/20220511T030333Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_43/20220511T010324Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_42/20220510T225220Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_41/20220510T204315Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_40/20220510T182122Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_39/20220510T165619Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_38/20220510T144157Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_37/20220510T122823Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_36/20220510T102009Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_35/20220510T082734Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_34/20220510T062419Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_30/20220511T210005Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_12/20220507T050423Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_1/20220512T001721Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_1/20220503T191629Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_10/20220506T225821Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_11/20220507T020003Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_13/20220507T224408Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_14/20220508T012319Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_15/20220508T042652Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_16/20220508T072602Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_17/20220508T103019Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_18/20220508T124926Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_19/20220508T155145Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_2/20220505T095134Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_2/20220505T200125Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_20/20220508T185905Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_21/20220508T221147Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_22/20220509T011312Z', 
'2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_23/20220509T041228Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_24/20220509T071757Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_25/20220509T102239Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_26/20220509T132805Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_27/20220509T155625Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_28/20220509T180251Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_29/20220509T200842Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_3/20220505T200225Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_31/20220509T234815Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_32/20220510T020028Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_33/20220510T041257Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_4/20220505T233000Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_5/20220506T040307Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_6/20220506T080709Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_7/20220506T121157Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_8/20220506T160546Z', '2.2i/runs/DP0.2/v23_0_2/PREOPS-905/step5_9/20220506T195012Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_38/20220429T022751Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_1/20220419T041129Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_10/20220429T200139Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_11/20220430T013535Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_12/20220430T070958Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_13/20220422T152623Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_14/20220423T111506Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_15/20220423T171322Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_16/20220423T220848Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_17/20220424T030657Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_18/20220424T081202Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_19/20220424T131819Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_2/20220419T082436Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_20/20220424T182451Z', 
'2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_21/20220425T033108Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_22/20220425T090248Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_23/20220425T143818Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_24/20220425T200112Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_25/20220426T012543Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_26/20220426T070107Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_27/20220426T123155Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_28/20220426T180134Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_29/20220426T233505Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_3/20220419T123328Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_30/20220427T050650Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_31/20220427T103906Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_32/20220427T161652Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_33/20220427T215556Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_34/20220428T033022Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_35/20220428T091545Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_36/20220428T145517Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_37/20220428T203907Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_39/20220429T081628Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_4/20220419T163046Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_40/20220429T135816Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_5/20220419T204203Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_6/20220420T005103Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_7/20220420T173758Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_8/20220420T233604Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step4_9/20220421T052720Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220418T184335Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220418T180525Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220418T172818Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220418T165935Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220418T165845Z', 
'2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_9/20220405T164242Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_9/20220330T170001Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_7/20220330T145650Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_5/20220329T232416Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220410T153911Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220410T130944Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_32/20220407T030512Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_20/20220406T145057Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_20/20220331T124128Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_19/20220331T110357Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_18/20220406T130306Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_18/20220331T084313Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_17/20220406T111701Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_17/20220331T063809Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_16/20220406T095810Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_16/20220331T050027Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_15/20220406T081059Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_15/20220331T025327Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_14/20220406T063431Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_14/20220331T004509Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_13/20220406T045526Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_13/20220330T224603Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_12/20220405T182718Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_12/20220330T204043Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_11/20220330T190852Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_27/20220322T202545Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_4/20220325T122210Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_29/20220325T223049Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_26/20220325T204503Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_25/20220325T190019Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_24/20220325T174013Z', 
'2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_23/20220325T155617Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_22/20220325T141557Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_2/20220325T104104Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_1/20220326T032806Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_8/20220316T205925Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_6/20220318T134046Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_27/20220316T171018Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_1/20220317T233937Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_6/20220316T192214Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_10/20220315T122041Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_8/20220309T160814Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_6/20220309T170243Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_4/20220315T051405Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_4/20220309T042312Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_31/20220314T212509Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_31/20220307T050825Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_30/20220314T194916Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_30/20220306T160359Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_3/20220315T012904Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_3/20220309T040429Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_29/20220314T204515Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_29/20220306T172459Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_28/20220314T185100Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_28/20220306T041600Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_27/20220306T050001Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_26/20220305T162808Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_25/20220305T164554Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_24/20220305T052326Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_23/20220305T044606Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_22/20220304T174139Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_21/20220314T170839Z', 
'2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_21/20220304T163449Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_2/20220308T164354Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_10/20220310T050847Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_1/20220308T153907Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_11/20220220T032646Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_12/20220218T172932Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_13/20220220T142359Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_13/20220302T190942Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_14/20220219T051120Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_15/20220221T015820Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_15/20220302T200340Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_16/20220219T164501Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_16/20220302T210859Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_17/20220220T035232Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_18/20220220T153612Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_19/20220221T031952Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_19/20220302T215536Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_20/20220225T211807Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_20/20220303T145311Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_5/20220218T163030Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_7/20220219T040206Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_9/20220219T154502Z', '2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_9/20220302T181001Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220124T155505Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220123T195452Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220123T001043Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220122T070559Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T234250Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T191925Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T160922Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T103211Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T081743Z', 
'2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T041501Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T031850Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220121T003836Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220120T161430Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220120T012920Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220112T143133Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220112T114604Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220112T084603Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220112T054912Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220112T010001Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T213932Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T175016Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T095828Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T083238Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T071223Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T063729Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T060953Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T054319Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T051546Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T044925Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220111T041856Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220110T231306Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220107T204134Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220107T163916Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220104T111833Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220104T085126Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220103T202705Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220104T062817Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220103T103400Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220103T010607Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220102T153518Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220102T060840Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220101T211007Z', 
'2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220101T102006Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20220101T014359Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211231T121620Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211231T053432Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211230T230823Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211230T164503Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211230T102720Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211230T042014Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211224T153913Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211224T102944Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211224T053402Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211223T204234Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211223T030157Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211222T182615Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211222T122903Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211222T050757Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211222T004329Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211221T170207Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211221T162728Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211221T021010Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211220T214212Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211220T180923Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211220T040027Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211220T000005Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211219T205411Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211219T150249Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211219T033213Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T214417Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T041605Z', '2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T002844Z', '2.2i/raw/DR6/WFD', '2.2i/calib/DM-30694', '2.2i/calib/gen2', '2.2i/calib/DM-30694/unbounded', 'skymaps', 'refcats/PREOPS-301', '2.2i/truth_summary'), parents=None, 
dataset_types=None)]
There is also a way to list only the collections whose names match a string or wildcard pattern.
Optional: print a list of collections with the word 'calib' in them.
# for c in sorted(butler.collections.query("*calib*")):
#     print(c)
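To make the wildcard matching concrete, the sketch below applies the same "*calib*" pattern to a handful of collection names copied from the query_info output above, using Python's standard fnmatch module. It only illustrates glob-style matching. That the Butler treats a simple pattern like this one the same way is an assumption, and the real query searches every collection in the repository rather than this short list.

```python
from fnmatch import fnmatchcase

# A few child collection names copied from the query_info output above.
collection_names = [
    "2.2i/raw/DR6/WFD",
    "2.2i/calib/DM-30694",
    "2.2i/calib/gen2",
    "2.2i/calib/DM-30694/unbounded",
    "skymaps",
    "refcats/PREOPS-301",
    "2.2i/truth_summary",
]

# In fnmatch, "*" matches any run of characters (including "/"), so
# "*calib*" keeps every name that contains the substring "calib".
calib_collections = sorted(
    name for name in collection_names if fnmatchcase(name, "*calib*")
)

for name in calib_collections:
    print(name)
```

Only the three names containing "calib" survive the filter; the raw, skymap, reference catalog, and truth collections are dropped.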
2.1.2. queryDatasetTypes
As shown in the introductory Butler notebook, useful DP0.2 datasetTypes for images include deepCoadd, calexp, and goodSeeingDiff_differenceExp, while useful datasetTypes for catalogs include sourceTable, objectTable, diaObjectTable_tract, etc. See the DP0.2 Data Products Definitions Document for more details about the DP0.2 data sets.
The queryDatasetTypes function allows users to explore available datasetTypes.
Notice: as described in the documentation page for queryDatasetTypes, this method will report all datasetTypes that have been registered with a data repository, even if there aren’t any datasets of that type actually present.
The queryDatasetTypes function is a useful tool when you know the name of the dataset type already, and want to see how it’s defined (i.e., what kind of dataId it accepts).
Optional: print a giant list of all the available dataset types.
# for dt in sorted(registry.queryDatasetTypes()):
#     print(dt)
You can provide a string or wildcard pattern to queryDatasetTypes to list a subset of datasetTypes. For example, the following cell lists datasetTypes with names ending in '_tract', which indicates that the dataset can be queried by tract.
for dt in sorted(registry.queryDatasetTypes('*_tract')):
    print(dt)
DatasetType('diaObjectTable_tract', {skymap, tract}, DataFrame)
DatasetType('diaSourceTable_tract', {skymap, tract}, DataFrame)
DatasetType('diff_matched_truth_summary_objectTable_tract', {skymap, tract}, DataFrame)
DatasetType('forcedSourceOnDiaObjectTable_tract', {skymap, tract}, DataFrame)
DatasetType('forcedSourceTable_tract', {skymap, tract}, DataFrame)
DatasetType('match_ref_truth_summary_objectTable_tract', {skymap, tract}, DataFrame)
DatasetType('match_target_truth_summary_objectTable_tract', {skymap, tract}, DataFrame)
DatasetType('matched_truth_summary_objectTable_tract', {skymap, tract}, DataFrame)
DatasetType('objectTable_tract', {skymap, tract}, DataFrame)
Optional: list all the dataset types associated with deepCoadds.
# for dt in sorted(registry.queryDatasetTypes('deepCoadd*')):
#     print(dt)
Note that queryDatasetTypes returns a generator, which is a special kind of iterator. It yields objects that can be looped over like a list, as shown in the code cells above. Generators are used because they produce their contents on demand instead of storing them all in memory, making them more suitable for really large data sets, like LSST.
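The lazy, single-pass behavior described above can be demonstrated with an ordinary Python generator. This is a generic sketch, not Butler code: dataset_type_names is a hypothetical function that yields three of the datasetType names mentioned earlier.

```python
from typing import Iterator


def dataset_type_names() -> Iterator[str]:
    """Hypothetical generator function that yields names one at a time."""
    for name in ("deepCoadd", "calexp", "sourceTable"):
        yield name


names = dataset_type_names()

# The first loop (here via sorted) consumes every item the generator yields.
first_pass = sorted(names)

# The generator is now exhausted, so a second pass finds nothing.
second_pass = sorted(names)

print(first_pass)
print(second_pass)
```

This is why the cells above wrap the query in sorted(...) or a for loop each time they need the results, and why a result that must be reused is usually copied into a list first.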
2.1.3. getDatasetType
If you want to retrieve a single datasetType rather than a generator, you can use getDatasetType. Below, we get the datasetType associated with deepCoadd.
dt_deepCoadd = registry.getDatasetType('deepCoadd')
print(dt_deepCoadd)
DatasetType('deepCoadd', {band, skymap, tract, patch}, ExposureF)
2.1.4. query_dimension_records
As described in the documentation for queryDimensionRecords, this method provides a way to inspect metadata tables.
Optional: print the different metadata elements that are available to be inspected.
# for a in butler.dimensions.getStaticElements():
#     print(a)
A call to query_dimension_records will return a set of fields, depending on the element.
Use the schema attribute of an element in butler.dimensions to print a list of the fields that would be returned for that element (in this case "detector").
print(butler.dimensions['detector'].schema)
detector:
  instrument: string
  id: int
  full_name: string
  name_in_raft: string
  raft: string
      A string name for a group of detectors with an instrument-
      dependent interpretation.
  purpose: string
      Role of the detector; typically one of "SCIENCE", "WAVEFRONT",
      or "GUIDE", though instruments may define additional values.
Use the butler.query_dimension_records method to return some detector metadata available for DP0 images (LSSTCam-imSim), for detectors 6, 7, and 8 only.
butler.query_dimension_records('detector', where="instrument='LSSTCam-imSim' "
                               "AND detector.id IN (6..8)")
[detector.RecordClass(instrument='LSSTCam-imSim', id=6, full_name='R01_S20', name_in_raft='S20', raft='R01', purpose='SCIENCE'),
 detector.RecordClass(instrument='LSSTCam-imSim', id=7, full_name='R01_S21', name_in_raft='S21', raft='R01', purpose='SCIENCE'),
 detector.RecordClass(instrument='LSSTCam-imSim', id=8, full_name='R01_S22', name_in_raft='S22', raft='R01', purpose='SCIENCE')]
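The IN (6..8) clause in the where string is a range expression which, as the three returned records show, includes both endpoints. The plain-Python sketch below mimics that selection on a small hand-copied list of detector records. The records list and the select_detectors helper are hypothetical stand-ins for illustration and are not part of the Butler API.

```python
# Hypothetical stand-in records: id and full_name values copied from the
# detector metadata printed in this tutorial.
records = [
    {"id": 0, "full_name": "R01_S00"},
    {"id": 6, "full_name": "R01_S20"},
    {"id": 7, "full_name": "R01_S21"},
    {"id": 8, "full_name": "R01_S22"},
]


def select_detectors(records, first, last):
    """Return the records whose id lies in the inclusive range first..last."""
    wanted = range(first, last + 1)  # +1 because Python ranges exclude the end
    return [rec for rec in records if rec["id"] in wanted]


selected = select_detectors(records, 6, 8)
print([rec["full_name"] for rec in selected])
```

Detector 0 is excluded and detectors 6, 7, and 8 are kept, matching the query result above.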
Optional: use the query_dimension_records method to return the exposure, visit, and detector metadata available for DP0 visit number 971990 and detector 0. (Use "[0]" to print only the first item from the list.)
for dim in ['exposure', 'visit', 'detector']:
    print(butler.query_dimension_records(dim, where='visit = 971990 and detector = 0', limit=-1)[0])
    print()
exposure: instrument: 'LSSTCam-imSim' id: 971990 physical_filter: 'z_sim_1.4' obs_id: '971990' exposure_time: 30.0 dark_time: 30.0 observation_type: 'science' observation_reason: 'imsim' day_obs: 20251201 seq_num: 0 group_name: '971990' group_id: 971990 target_name: 'UNKNOWN' science_program: '971990' tracking_ra: 70.37699524983329 tracking_dec: -37.17573628348882 sky_angle: 292.6518874660149 zenith_angle: 19.469782456426444 timespan: Timespan(begin=astropy.time.Time(2461012.0, -0.36678038888930553, scale='tai', format='jd'), end=astropy.time.Time(2461012.0, -0.36643317129629627, scale='tai', format='jd')) visit: instrument: 'LSSTCam-imSim' id: 971990 physical_filter: 'z_sim_1.4' visit_system: 1 name: '971990' day_obs: 20251201 exposure_time: 30.0 target_name: 'UNKNOWN' observation_reason: 'imsim' science_program: '971990' zenith_angle: 19.469782456426444 region: ConvexPolygon([UnitVector3d(0.23777267380834483, 0.7544426088009132, -0.6117846889353334), UnitVector3d(0.23561428564848888, 0.7624248774034458, -0.6026559671277576), UnitVector3d(0.23489129024598346, 0.7650539083853652, -0.599598698323182), UnitVector3d(0.24441246595191043, 0.7719782401906571, -0.5867811714425084), UnitVector3d(0.24856139513406264, 0.7716376940531965, -0.5854848434974846), UnitVector3d(0.2524556778516015, 0.7713013494543077, -0.5842605232689029), UnitVector3d(0.27578892562829005, 0.768993552725668, -0.5767056306012182), UnitVector3d(0.2796539793141923, 0.7685616339568179, -0.5754186881422636), UnitVector3d(0.2934704959089931, 0.7593653588952546, -0.580723100746656), UnitVector3d(0.29426926501388223, 0.7565200998125113, -0.5840230631129711), UnitVector3d(0.2949957827088376, 0.7538756416625922, -0.5870681434823516), UnitVector3d(0.2957168905600652, 0.7512201931774088, -0.5901014675460341), UnitVector3d(0.29993684521247804, 0.7349912179357472, -0.608133043372347), UnitVector3d(0.29039711533056606, 0.7280680544882611, -0.6209560559664141), UnitVector3d(0.28628631986818487, 0.7285155658703246, 
-0.6223385038232585), UnitVector3d(0.27465017049482865, 0.7297171630289703, -0.6261630345430116), UnitVector3d(0.2707646722601988, 0.7300937569840328, -0.6274150127856101), UnitVector3d(0.2590626697627845, 0.731155302238704, -0.6311089106830992), UnitVector3d(0.25516285417086804, 0.7314837244091129, -0.6323159643532253), UnitVector3d(0.24135746274622868, 0.7406734671751954, -0.627016259916262), UnitVector3d(0.2405974378501948, 0.7436273951325181, -0.6238037913545792), UnitVector3d(0.23989510383043633, 0.7463454425811203, -0.6208210849323466)]) timespan: Timespan(begin=astropy.time.Time(2461012.0, -0.36678038888930553, scale='tai', format='jd'), end=astropy.time.Time(2461012.0, -0.36643317129629627, scale='tai', format='jd')) detector: instrument: 'LSSTCam-imSim' id: 0 full_name: 'R01_S00' name_in_raft: 'S00' raft: 'R01' purpose: 'SCIENCE'
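As a quick consistency check on the exposure record above, the length of the timespan can be compared with the reported exposure_time. The sketch below assumes that the two numbers in each astropy.time.Time(...) representation are the two parts of a Julian Date whose sum is the full date, and it uses only the values printed above. It deliberately avoids astropy so that it runs with the standard library alone.

```python
SECONDS_PER_DAY = 86400.0

# Two-part Julian Dates (large part, small offset) copied from the printed
# timespan of the exposure record for visit 971990.
begin_jd = (2461012.0, -0.36678038888930553)
end_jd = (2461012.0, -0.36643317129629627)

# Subtract the large parts and the small parts separately to limit
# floating-point round-off, then convert from days to seconds.
duration_days = (end_jd[0] - begin_jd[0]) + (end_jd[1] - begin_jd[1])
duration_seconds = duration_days * SECONDS_PER_DAY

print(f"timespan duration: {duration_seconds:.2f} s")
```

The result agrees with the exposure_time of 30.0 seconds in the same record to within a millisecond.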
2.2. Use a dataId with query_datasets
The dataId is a dictionary-like identifier for a data product (more information can be found in the lsst.daf.butler documentation).
Each DatasetType (e.g., calexp, deepCoadd, objectTable) uses a different set of keys in its dataId, which are also called "dimensions".
Use the registry to get the DatasetType for a specific named dataset, in this case a calexp, and list its dimensions.
dt = registry.getDatasetType('calexp')
print("Name:", dt.name)
print("Dimensions:", dt.dimensions)
print("Storage Class:", dt.storageClass)
Name: calexp
Dimensions: {band, instrument, detector, physical_filter, visit_system, visit}
Storage Class: ExposureF
The dataId contains both implied and required keys.
For example, the value of band is implied by the visit, because a single visit refers to a single exposure at a single pointing in a single band.
In other tutorial notebooks, we have seen how to access a specific data product using a fully specified dataId. A query for a fully specified dataId should return one unique entry (however, see the FAQ entry about duplicate results from chained collections).
As described in the documentation page for query_datasets, this method returns datasetRefs, which can be passed directly to a call to butler.get() in order to retrieve the desired data (see next section).
datasetType = 'calexp'
dataId = {'visit': 192350, 'detector': 175}
datasetRefs = butler.query_datasets(datasetType, data_id=dataId)
for i, ref in enumerate(datasetRefs):
    print(ref.dataId)
    print("band:", ref.dataId['band'])
{instrument: 'LSSTCam-imSim', detector: 175, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1}
band: i
A dataId can be represented as a regular Python dict object, but when dataIds are returned from the Butler the DataCoordinate class is used instead.
The value of a single key, in this case band, can also be printed by specifying the key name.
Note that when we instantiated the Butler pointing to the DP0.2 collection (i.e., at the beginning of this notebook), we implicitly specified that we are interested in data associated with the LSSTCam-imSim
instrument, since that is the only instrument contained in the DP0.2 collection.
It is also possible to query for all data products that match a partially specified dataId.
For example, in the following cell we use a partially specified dataId to select all the calexp
data associated with visit=192350.
This search will return a separate entry for each CCD detector that was processed for this visit (this visit happens to be close to the edge of the simulated footprint, so only 187 detectors were processed).
We'll print information about a few of them.
(The following cell will fail and return an error if the query is requesting a DatasetRef
for data that does not exist.)
datasetType = 'calexp'
dataId = {'visit': 192350}
datasetRefs = butler.query_datasets(datasetType, data_id=dataId)
for i, ref in enumerate(datasetRefs):
    print(ref.dataId)
    if i > 5:
        print('...')
        break
print(f"Found {len(datasetRefs)} detectors")
{instrument: 'LSSTCam-imSim', detector: 0, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 1, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 2, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 3, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 4, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 5, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 6, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} ... Found 187 detectors
The LSST Science Camera has 189 science detectors, total, but this visit is very near the edge of the DC2 simulation region (only 300 square degrees of sky), so not all detectors are within the dataset.
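To make the difference between a fully and a partially specified dataId concrete, here is a minimal pure-Python sketch that does not use the Butler at all. The `records` list and the `match_data_id` helper are hypothetical stand-ins for the registry and for `query_datasets`; the real query is evaluated by the Butler registry, not by a Python loop.

```python
# Hypothetical stand-in for the registry: a few fully specified dataIds.
records = [
    {"instrument": "LSSTCam-imSim", "visit": 192350, "detector": 0, "band": "i"},
    {"instrument": "LSSTCam-imSim", "visit": 192350, "detector": 175, "band": "i"},
    {"instrument": "LSSTCam-imSim", "visit": 192351, "detector": 175, "band": "i"},
]


def match_data_id(records, partial):
    """Return the records whose values agree with every key in `partial`."""
    return [r for r in records
            if all(r.get(key) == value for key, value in partial.items())]


# A fully specified dataId selects exactly one record.
full = match_data_id(records, {"visit": 192350, "detector": 175})
# A partial dataId selects one record per detector of that visit.
partial = match_data_id(records, {"visit": 192350})
print(len(full), len(partial))
```

The fewer keys the dataId constrains, the more records match, which is why the visit-only query above returns one entry per processed detector.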
2.2.1 Limiting the number of results¶
You can limit the number of query results that are returned by passing a "limit=N" argument to butler.query_datasets. If you pass "limit=N", the query will return at most N results. If you instead add a minus sign in front of the number ("limit=-N"), the query will still return at most N results, but it will issue a warning if more results were available than were returned.
datasetRefs = butler.query_datasets(datasetType, data_id=dataId, limit=-2)
for i, ref in enumerate(datasetRefs):
    print(ref.dataId)
lsst.daf.butler._butler WARNING: More datasets are available than the requested limit of 2.
{instrument: 'LSSTCam-imSim', detector: 0, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 1, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1}
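The behaviour of positive and negative limits can be mimicked with a short pure-Python sketch. The `apply_limit` helper below is hypothetical and only models the behaviour described in this section; it is not how the Butler implements the limit.

```python
import warnings


def apply_limit(results, limit):
    """Hypothetical model of the `limit` behaviour described in this section."""
    n = abs(limit)
    # A negative limit asks for a warning when results had to be dropped.
    if limit < 0 and len(results) > n:
        warnings.warn(f"More datasets are available than the requested limit of {n}.")
    return results[:n]


detectors = list(range(187))         # stand-in for the 187 query results
quiet = apply_limit(detectors, 2)    # at most 2 results, no warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    loud = apply_limit(detectors, -2)  # the same 2 results, plus a warning
print(quiet, loud, len(caught))
```

A negative limit is a convenient default while exploring, since it makes a truncated result set visible instead of silent.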
2.3. Use butler.get() with a datasetRef¶
One of the beauties of the Butler is that there is no need to know exactly where the data live in order to access them. In previous notebooks we've seen how to pass a dataId to Butler.get to return an instance of the appropriate object. When you already have a datasetRef, it is faster and more efficient to pass the datasetRef to Butler.get. Use Butler.get to retrieve the calexps from the previous query, and then get the detector Id from the calexp's properties.
for i, ref in enumerate(datasetRefs):
    calexp = butler.get(ref)
    print(' calexp.detector.getId(): ', calexp.detector.getId())
calexp.detector.getId(): 0 calexp.detector.getId(): 1
Optional: display the calexps retrieved from the Butler.
# for i, ref in enumerate(datasetRefs):
#     calexp = butler.get(ref)
#     print(i, ' calexp.detector.getId(): ', calexp.detector.getId())
#     fig = plt.figure()
#     display = afwDisplay.Display(frame=fig)
#     display.scale('asinh', 'zscale')
#     display.mtv(calexp.image)
#     plt.show()
3. Query the DP0 data repository¶
As TAP is the recommended way to query the catalogs, the following basic, temporal, and spatial query examples are all for images (calexps).
3.1. Basic image queries¶
Our example above demonstrated a very simple use of query_datasets, but additional query terms can also be used, such as band and visit.
When a query term is an equality, it can be specified as a keyword argument like band='i'.
When a query term is an inequality, it can be specified with where.
More details on Butler queries can be found in the lsst.daf.butler documentation.
In the following cell, we query for a list of all calexps corresponding to i-band observations of a single detector over a range of visits.
datasetRefs = butler.query_datasets(dataset_type='calexp', band='i', detector=175,
where='visit > 192000 and visit < 193000')
for i, ref in enumerate(datasetRefs):
    print(ref.dataId)
{instrument: 'LSSTCam-imSim', detector: 175, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 175, visit: 192351, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 175, visit: 192353, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 175, visit: 192354, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 175, visit: 192355, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 175, visit: 192356, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 175, visit: 192357, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 175, visit: 192358, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1}
Optional: use the datasetRefs to retrieve and display the first two calexp images.
# for i, ref in enumerate(datasetRefs[:2]):
#     calexp = butler.get(ref)
#     fig = plt.figure()
#     display = afwDisplay.Display(frame=fig)
#     display.scale('asinh', 'zscale')
#     display.mtv(calexp.image)
#     plt.show()
3.1.1. Optional: retrieving temporal and spatial metadata¶
As a precursor to doing temporal and spatial queries below, we demonstrate the temporal and spatial metadata that can be retrieved for calexps via the Butler.
Retrieving only the metadata (calexp.visitInfo, calexp.bbox, or calexp.wcs) can be faster than retrieving the full calexp and then extracting the metadata from it.
Print the full visitInfo record for the first datasetRef returned by the previous query.
visitInfo = butler.get('calexp.visitInfo', dataId=datasetRefs[0].dataId)
print(visitInfo)
VisitInfo(exposureTime=30, darkTime=30, date=2022-09-15T09:07:20.140899826, UT1=nan, ERA=2.28122 rad, boresightRaDec=(53.6003547253, -32.7089000169), boresightAzAlt=(245.0397119707, +83.7529317472), boresightAirmass=1.00506, boresightRotAngle=3.47434 rad, rotType=1, observatory=-30.2446N, -70.7494E 2663, weather=Weather(nan, nan, 40), instrumentLabel='LSSTCam-imSim', id=192350, focusZ=nan, observationType='SKYEXP', scienceProgram='', observationReason='', object='', hasSimulatedContent=false)
Print information from the visitInfo for all query results: the date, exposure time, and the boresight (the pointing coordinates of the telescope).
for i, ref in enumerate(datasetRefs):
    visitInfo = butler.get('calexp.visitInfo', dataId=ref.dataId)
    print(i, visitInfo.date, visitInfo.exposureTime, visitInfo.boresightRaDec)
0 DateTime("2022-09-15T09:07:20.140899826", TAI) 30.0 (53.6003547253, -32.7089000169) 1 DateTime("2022-09-15T09:08:00.748900099", TAI) 30.0 (52.3636782613, -35.9958442960) 2 DateTime("2022-09-15T09:09:22.915099862", TAI) 30.0 (50.8415002597, -42.6478842974) 3 DateTime("2022-09-15T09:10:01.795099843", TAI) 30.0 (54.5372572359, -40.1834782290) 4 DateTime("2022-09-15T09:10:40.761799838", TAI) 30.0 (56.2862239385, -35.0230765514) 5 DateTime("2022-09-15T09:11:19.641799818", TAI) 30.0 (58.4357468320, -33.0205890574) 6 DateTime("2022-09-15T09:12:02.755099887", TAI) 30.0 (55.7954543284, -41.4556164216) 7 DateTime("2022-09-15T09:12:41.635099867", TAI) 30.0 (55.4352486275, -39.3319483539)
Print spatial information from the bounding box (bbox) and world coordinate system (wcs) metadata: the four corners of the image. Note that the XY coordinates of the image corners must be converted to RA, Dec using the wcs.pixelToSky method.
for i, ref in enumerate(datasetRefs):
    bbox = butler.get('calexp.bbox', dataId=ref.dataId)
    wcs = butler.get('calexp.wcs', dataId=ref.dataId)
    corners_xy = bbox.getCorners()
    tmp = ''
    for corn in corners_xy:
        radec = wcs.pixelToSky(corn.x, corn.y)
        tmp += f'({radec.getRa().asDegrees():.4f}, {radec.getDec().asDegrees():.4f}) '
    print(i, tmp)
0 (52.9633, -33.8997) (53.2206, -33.9742) (53.1326, -34.1836) (52.8748, -34.1089) 1 (52.1884, -37.2926) (52.4725, -37.2974) (52.4670, -37.5191) (52.1821, -37.5143) 2 (51.1828, -43.9284) (51.4842, -43.8655) (51.5706, -44.0783) (51.2683, -44.1414) 3 (55.0753, -41.4227) (55.3518, -41.3328) (55.4699, -41.5361) (55.1928, -41.6262) 4 (56.4921, -36.3167) (56.7660, -36.2679) (56.8258, -36.4844) (56.5513, -36.5333) 5 (58.6988, -34.3065) (58.9634, -34.2488) (59.0323, -34.4632) (58.7671, -34.5209) 6 (56.5957, -42.6167) (56.8548, -42.4955) (57.0168, -42.6825) (56.7573, -42.8040) 7 (56.0375, -40.5519) (56.3048, -40.4529) (56.4331, -40.6520) (56.1653, -40.7513)
Optional: view the help documentation for a datasetRef.
# help(ref)
Tip! In the case where metadata is desired for many images at once, you can avoid the time-consuming use of butler.get for individual dataIds by using the "with_dimension_records" keyword when retrieving datasetRefs.
datasetRefs = butler.query_datasets(dataset_type='calexp', band='i', detector=175,
where='visit > 192000 and visit < 193000', with_dimension_records=True)
for ref in datasetRefs:
    print(ref)
calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=8a953c03-21bd-4878-bfa6-94dbf628ea81) calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192351, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=86a9abef-fc5f-46f0-bc2a-e9d9939dc22f) calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192353, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=e314c313-9298-4f8e-8800-58617f945ee0) calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192354, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=39208c7f-62d2-4e2c-bf5c-f8e628f3e89b) calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192355, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=d907c683-0856-4f3a-ade8-96c454698a2a) calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192356, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=308754f0-4d48-4be8-8df9-27408fdab3c1) calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192357, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=8f31446c-8351-4264-a844-a693f7e54fbc) calexp@{instrument: 'LSSTCam-imSim', detector: 175, visit: 192358, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T144437Z id=9907f51e-74ec-4049-be12-0178bccfb7b9)
Print the id, exposure time, timespan, and bounding box for each datasetRef:
for i, ref in enumerate(datasetRefs):
    record = ref.dataId.records["visit"]
    print(i, record.id, record.exposure_time, record.timespan, record.region.getBoundingBox())
0 192350 30.0 [2022-09-15T09:07:05, 2022-09-15T09:07:35) Box([0.8933399435361895, 0.9772575716677872], [-0.6061470685727063, -0.5355792836436872]) 1 192351 30.0 [2022-09-15T09:07:45, 2022-09-15T09:08:15) Box([0.8746909495312298, 0.9522135881631125], [-0.6594148394681437, -0.5968504795232021]) 2 192353 30.0 [2022-09-15T09:09:07, 2022-09-15T09:09:37) Box([0.8404055595335118, 0.9351141173916367], [-0.7791241542029297, -0.7094892741294359]) 3 192354 30.0 [2022-09-15T09:09:46, 2022-09-15T09:10:16) Box([0.9052743096160711, 0.9988246313418908], [-0.7370553056323268, -0.6655943815542785]) 4 192355 30.0 [2022-09-15T09:10:25, 2022-09-15T09:10:55) Box([0.9409751100908125, 1.0244567661686226], [-0.6453842818240588, -0.5770618372743993]) 5 192356 30.0 [2022-09-15T09:11:04, 2022-09-15T09:11:34) Box([0.9788809681881151, 1.0614534217123552], [-0.6108848328050508, -0.5416874215742732]) 6 192357 30.0 [2022-09-15T09:11:47, 2022-09-15T09:12:17) Box([0.9256414487011633, 1.021934922099574], [-0.7596270666837058, -0.6874489672466871]) 7 192358 30.0 [2022-09-15T09:12:26, 2022-09-15T09:12:56) Box([0.9212122431756704, 1.0140976088181841], [-0.722388612802868, -0.6505484697455468])
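The two ranges in each printed Box appear to be a longitude (RA) range followed by a latitude (Dec) range, both in radians. This is an inference from comparing the numbers with the boresight coordinates printed earlier in degrees, not something the tutorial states. A minimal sketch, using only the standard library and the values from the first line of output, converts them to degrees and checks that the boresight of visit 192350 falls inside.

```python
import math

# Values copied from the first line of output above (visit 192350).
# Assumption: the Box is (RA range, Dec range) in radians.
ra_range_rad = (0.8933399435361895, 0.9772575716677872)
dec_range_rad = (-0.6061470685727063, -0.5355792836436872)

ra_range_deg = tuple(math.degrees(v) for v in ra_range_rad)
dec_range_deg = tuple(math.degrees(v) for v in dec_range_rad)

# Boresight of visit 192350 in degrees, from the visitInfo printed earlier.
boresight_ra, boresight_dec = 53.6003547253, -32.7089000169

print("RA range (deg): ", [round(v, 2) for v in ra_range_deg])
print("Dec range (deg):", [round(v, 2) for v in dec_range_deg])
print("Boresight inside box:",
      ra_range_deg[0] < boresight_ra < ra_range_deg[1]
      and dec_range_deg[0] < boresight_dec < dec_range_deg[1])
```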
3.2. Temporal image queries¶
The following examples show how to query for data sets that include a desired coordinate and observation date.
Since we only need to get the date and time that the exposure was taken, we can start by retrieving only the visitInfo associated with the calexp specified by our dataId (this will be faster than retrieving the full calexp).
dataId = {'visit': 192350, 'detector': 175}
visitInfo = butler.get('calexp.visitInfo', dataId=dataId)
print(visitInfo)
VisitInfo(exposureTime=30, darkTime=30, date=2022-09-15T09:07:20.140899826, UT1=nan, ERA=2.28122 rad, boresightRaDec=(53.6003547253, -32.7089000169), boresightAzAlt=(245.0397119707, +83.7529317472), boresightAirmass=1.00506, boresightRotAngle=3.47434 rad, rotType=1, observatory=-30.2446N, -70.7494E 2663, weather=Weather(nan, nan, 40), instrumentLabel='LSSTCam-imSim', id=192350, focusZ=nan, observationType='SKYEXP', scienceProgram='', observationReason='', object='', hasSimulatedContent=false)
To query for dimension records or datasets that overlap an arbitrary time range, we can use the bind argument to pass times through to where.
Using bind to define an alias for a variable saves us from having to string-format the times into the where expression.
Note that a dafButler.Timespan will accept a begin or end value that is equal to None if it is unbounded on that side.
Use bind and where, along with astropy.time, to query for calexps that were obtained within +/- 10 minutes of the calexp defined by the dataId above.
time = astropy.time.Time(visitInfo.date.toPython())
minute = astropy.time.TimeDelta(60, format="sec")
timespan = dafButler.Timespan(time - 10*minute, time + 10*minute)
print(time)
print(minute)
print(timespan)
2022-09-15 09:07:20.140900 60.0 [2022-09-15T08:57:57, 2022-09-15T09:17:57)
datasetRefs = butler.query_datasets("calexp",
where="visit.timespan OVERLAPS my_timespan",
bind={"my_timespan": timespan})
for i, ref in enumerate(datasetRefs):
    print(ref.dataId)
    if i > 6:
        print('...')
        break
print(f"Found {len(list(datasetRefs))} calexps")
{instrument: 'LSSTCam-imSim', detector: 0, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 0, visit: 192354, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 0, visit: 192355, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 0, visit: 192356, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 0, visit: 192357, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 0, visit: 192358, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 0, visit: 192359, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', detector: 1, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} ... Found 1636 calexps
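The OVERLAPS test used in the where expression can be illustrated with a pure-Python sketch. The printed timespans use the half-open notation "[begin, end)", and the `overlaps` helper below is a hypothetical model of an overlap test for such intervals, with None standing for an unbounded side. It ignores time scales (TAI versus UTC) and is not the Butler's implementation.

```python
from datetime import datetime, timedelta


def overlaps(a, b):
    """True if half-open intervals [a0, a1) and [b0, b1) share any instant.

    A bound of None means the interval is unbounded on that side.
    """
    a0, a1 = a
    b0, b1 = b
    a_starts_before_b_ends = a0 is None or b1 is None or a0 < b1
    b_starts_before_a_ends = b0 is None or a1 is None or b0 < a1
    return a_starts_before_b_ends and b_starts_before_a_ends


center = datetime(2022, 9, 15, 9, 7, 20)
window = (center - timedelta(minutes=10), center + timedelta(minutes=10))

# Visit timespan copied from the dimension-record output in Section 3.1.1.
visit_192350 = (datetime(2022, 9, 15, 9, 7, 5), datetime(2022, 9, 15, 9, 7, 35))
# Hypothetical interval that starts later and has no end.
far_later = (datetime(2022, 9, 15, 10, 0, 0), None)

print(overlaps(window, visit_192350), overlaps(window, far_later))
```

Because the intervals are half-open, two intervals that merely touch (one ends exactly where the other begins) do not count as overlapping in this model.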
How many unique visits were obtained within the DC2 area within +/- 10 minutes of the calexp defined above?
temp = []
for i, ref in enumerate(datasetRefs):
    temp.append(ref.dataId['visit'])
unique_visitIds = np.unique(np.sort(np.asarray(temp, dtype='int')))
print('Number of unique visits: ', len(unique_visitIds))
print('visitIds for the unique visits: ', unique_visitIds)
del temp
Number of unique visits: 13 visitIds for the unique visits: [192347 192348 192350 192351 192352 192353 192354 192355 192356 192357 192358 192359 192360]
At most about 34 visits can be executed in 20 minutes. We find only 13 because DC2 used the minion_1016 baseline observing strategy simulation, which simulates observations over the whole sky, whereas DC2 covers only 300 square degrees.
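The numbers quoted in this section can be combined with a little arithmetic. The sketch below uses only values that appear in the text and outputs above (1636 calexps, 13 unique visits, about 34 possible visits, 189 science detectors); the variable names are illustrative.

```python
n_calexps = 1636            # calexps found within +/- 10 minutes
n_visits = 13               # unique visits among those calexps
max_visits = 34             # approximate maximum quoted in the text
n_science_detectors = 189   # science detectors in the camera

# Average number of processed detectors per visit in this time window.
mean_detectors_per_visit = n_calexps / n_visits
# Fraction of the possible visits that landed on the DC2 area.
fraction_of_visits = n_visits / max_visits

print(f"Mean detectors per visit: {mean_detectors_per_visit:.1f} of {n_science_detectors}")
print(f"Fraction of possible visits in DC2: {fraction_of_visits:.2f}")
```

The average is well below 189 detectors per visit, which is consistent with several of these visits only partly overlapping the simulated area.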
dataId = {'visit': 192350, 'detector': 175}
visitInfo = butler.get('calexp.visitInfo', dataId=dataId)
print(visitInfo)
VisitInfo(exposureTime=30, darkTime=30, date=2022-09-15T09:07:20.140899826, UT1=nan, ERA=2.28122 rad, boresightRaDec=(53.6003547253, -32.7089000169), boresightAzAlt=(245.0397119707, +83.7529317472), boresightAirmass=1.00506, boresightRotAngle=3.47434 rad, rotType=1, observatory=-30.2446N, -70.7494E 2663, weather=Weather(nan, nan, 40), instrumentLabel='LSSTCam-imSim', id=192350, focusZ=nan, observationType='SKYEXP', scienceProgram='', observationReason='', object='', hasSimulatedContent=false)
If we query for deepCoadd datasets with a visit+detector dataId, we'll get just the deepCoadd objects that overlap that observation and have the same band (because a visit implies a band).
This is a very simple spatial query for data that overlaps other data.
refs = butler.query_datasets("deepCoadd", data_id=dataId, order_by='tract')
for ref in refs:
    print(ref)
deepCoadd@{band: 'i', skymap: 'DC2', tract: 4024, patch: 47} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_17/20220220T035232Z id=4ce0e04c-6ff2-47ec-8cab-f62c2ae3b629) deepCoadd@{band: 'i', skymap: 'DC2', tract: 4024, patch: 48} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_17/20220220T035232Z id=88d5831d-f06e-4509-90ef-f09368de7765) deepCoadd@{band: 'i', skymap: 'DC2', tract: 4225, patch: 2} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_19/20220221T031952Z id=b7d97586-7797-4766-8d95-b9d1a72cf0b8) deepCoadd@{band: 'i', skymap: 'DC2', tract: 4225, patch: 3} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_19/20220221T031952Z id=0dc74194-599c-45ee-affd-3ffb0f684061) deepCoadd@{band: 'i', skymap: 'DC2', tract: 4225, patch: 9} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_19/20220221T031952Z id=12260e6c-5ff6-4f8c-8e79-8d38fa178d36) deepCoadd@{band: 'i', skymap: 'DC2', tract: 4225, patch: 10} [sc=ExposureF] (run=2.2i/runs/DP0.2/v23_0_1/PREOPS-905/step3_19/20220221T031952Z id=55cca055-8f22-440f-8202-8ab153107393)
Use the corners of the calexp and of the overlapping deepCoadd patches to trace the image edges in a plot, and show the overlap.
In the plot below, the deepCoadd tract and patch (4024, 48) does not quite overlap with the original calexp.
That is because the visit and detector region from the calexp WCS and bounding box is not quite the same as the region in the database, which is based on the raw WCS with some padding.
calexp_wcs = butler.get('calexp.wcs', dataId=dataId)
calexp_bbox = butler.get('calexp.bbox', dataId=dataId)
calexp_corners_ra = []
calexp_corners_dec = []
for corn in calexp_bbox.getCorners():
    radec = calexp_wcs.pixelToSky(corn.x, corn.y)
    calexp_corners_ra.append(radec.getRa().asDegrees())
    calexp_corners_dec.append(radec.getDec().asDegrees())
calexp_corners_ra.append(calexp_corners_ra[0])
calexp_corners_dec.append(calexp_corners_dec[0])
fig = plt.figure(figsize=(6, 6))
plt.plot(calexp_corners_ra, calexp_corners_dec, ls='solid', color='grey', label='visit detector')
for r, ref in enumerate(set(butler.query_datasets("deepCoadd", data_id=dataId))):
    deepCoadd_dataId = ref.dataId
    str_tract_patch = '(' + str(ref.dataId['tract']) + ', ' + str(ref.dataId['patch'])+')'
    deepCoadd_wcs = butler.get('deepCoadd.wcs', dataId=deepCoadd_dataId)
    deepCoadd_bbox = butler.get('deepCoadd.bbox', dataId=deepCoadd_dataId)
    deepCoadd_corners_ra = []
    deepCoadd_corners_dec = []
    for corn in deepCoadd_bbox.getCorners():
        radec = deepCoadd_wcs.pixelToSky(corn.x, corn.y)
        deepCoadd_corners_ra.append(radec.getRa().asDegrees())
        deepCoadd_corners_dec.append(radec.getDec().asDegrees())
    deepCoadd_corners_ra.append(deepCoadd_corners_ra[0])
    deepCoadd_corners_dec.append(deepCoadd_corners_dec[0])
    plt.plot(deepCoadd_corners_ra, deepCoadd_corners_dec, ls='solid', lw=1, label=str_tract_patch)
plt.xlabel('RA')
plt.ylabel('Dec')
plt.legend(loc='upper left', ncol=3)
plt.show()
Figure 1: The bounding box of one detector from a single visit image (a calexp) is drawn in gray, and six of the nearest deepCoadd patches are drawn in colors (as in legend), all but one overlapping the calexp.
3.3.2. User-defined spatial constraints on images¶
Often one wants to know what images overlap a given point on the sky. Such spatial queries can be accomplished using the "region OVERLAPS POINT(ra, dec)" syntax (e.g., see this Butler queries documentation). Let us see how this works.
Specify the desired sky coordinate for the search. Below, the RA and Dec could be user-specified, but here we use the telescope boresight accessed from the calexp visitInfo retrieved above.
ra, dec = visitInfo.boresightRaDec
Specify a small timespan for the query:
small_timespan = dafButler.Timespan(time - minute, time + minute)
Pass the RA and Dec to the query_datasets command, using "visit_detector_region.region OVERLAPS POINT(ra, dec)" in the "where" clause of the query. Also apply the same timespan constraints as above.
datasetRefs = butler.query_datasets("calexp", where="visit.timespan OVERLAPS my_timespan AND \
visit_detector_region.region OVERLAPS POINT(ra, dec)",
bind={"my_timespan": small_timespan, "ra": ra.asDegrees(), "dec": dec.asDegrees()})
print(datasetRefs)
print(f"\nFound {len(datasetRefs)} calexps")
[DatasetRef(DatasetType('calexp', {band, instrument, detector, physical_filter, visit_system, visit}, ExposureF), {instrument: 'LSSTCam-imSim', detector: 94, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1}, run='2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T041605Z', id=1a773da5-9db5-4413-8457-c194bfe52b28)] Found 1 calexps
Thus, with the above query, we have uniquely recovered the visit for our desired temporal and spatial constraints.
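To build intuition for what "region OVERLAPS POINT(ra, dec)" is testing, the sketch below checks whether a point lies inside the four corners of one detector image. It is a flat-sky (planar) ray-casting approximation written for illustration only; the `point_in_polygon` helper is hypothetical, and the Butler's own test uses the padded spherical regions stored in the database. The corner values are those of visit 192350, detector 94, copied from the raCorners and decCorners printed in Section 4.1.

```python
def point_in_polygon(x, y, corners):
    """Ray-casting point-in-polygon test in a flat (planar) approximation."""
    inside = False
    n = len(corners)
    for k in range(n):
        x1, y1 = corners[k]
        x2, y2 = corners[(k + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x position where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Corners in degrees (RA, Dec) for visit 192350, detector 94.
ra_corners = [53.51650658575399, 53.77062575371195,
              53.685124787470684, 53.430466675033024]
dec_corners = [-32.566885411487135, -32.64026617626658,
               -32.85061564707703, -32.77706403154314]
corners = list(zip(ra_corners, dec_corners))

boresight = (53.6003547253, -32.7089000169)   # from the visitInfo above
far_point = (55.0, -32.7)                      # hypothetical point off the detector

print(point_in_polygon(*boresight, corners))
print(point_in_polygon(*far_point, corners))
```

The boresight falls inside the detector 94 outline, which matches the single calexp returned by the query above.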
3.3.3. A note about "dimensions"¶
You may have wondered how we knew to use "visit_detector_region.region" in the above query. Constraints provided in the "where" clause of butler.query_datasets can be based on any of the attributes associated with each dimension or element (elements can be thought of as joins or combinations of dimension traits).
See more about querying dimension records here.
First, list the available dimensions and elements:
print("Dimensions:\n", butler.dimensions.getStaticDimensions())
print("\nElements:\n", butler.dimensions.getStaticElements())
Dimensions: {band, healpix1, healpix2, healpix3, healpix4, healpix5, healpix6, healpix7, healpix8, healpix9, healpix10, healpix11, healpix12, healpix13, healpix14, healpix15, healpix16, healpix17, htm1, htm2, htm3, htm4, htm5, htm6, htm7, htm8, htm9, htm10, htm11, htm12, htm13, htm14, htm15, htm16, htm17, htm18, htm19, htm20, htm21, htm22, htm23, htm24, instrument, skymap, detector, physical_filter, subfilter, tract, visit_system, exposure, patch, visit} Elements: {band, healpix1, healpix2, healpix3, healpix4, healpix5, healpix6, healpix7, healpix8, healpix9, healpix10, healpix11, healpix12, healpix13, healpix14, healpix15, healpix16, healpix17, htm1, htm2, htm3, htm4, htm5, htm6, htm7, htm8, htm9, htm10, htm11, htm12, htm13, htm14, htm15, htm16, htm17, htm18, htm19, htm20, htm21, htm22, htm23, htm24, instrument, skymap, detector, physical_filter, subfilter, tract, visit_system, exposure, patch, visit, visit_definition, visit_detector_region}
Note that these are the same except for the additional two entries at the end of the "elements" list.
For an example dimension and element, use their "schema" attribute to see what fields are associated with them:
print(butler.dimensions["exposure"].schema)
print("\n", butler.dimensions.elements["visit_detector_region"].schema)
exposure: instrument: string id: int physical_filter: string obs_id: string exposure_time: float Duration of the exposure with shutter open (seconds). dark_time: float Duration of the exposure with shutter closed (seconds). observation_type: string The observation type of this exposure (e.g. dark, bias, science). observation_reason: string The reason this observation was taken. (e.g. science, filter scan, unknown). day_obs: int Day of observation as defined by the observatory (YYYYMMDD format). seq_num: int Counter for the observation within a larger sequence. Context of the sequence number is observatory specific. Can be a global counter or counter within day_obs. group_name: string String group identifier associated with this exposure by the acquisition system. group_id: int Integer group identifier associated with this exposure by the acquisition system. target_name: string Object of interest for this observation or survey field name. science_program: string Observing program (survey, proposal, engineering project) identifier. tracking_ra: float Tracking ICRS Right Ascension of boresight in degrees. Can be NULL for observations that are not on sky. tracking_dec: float Tracking ICRS Declination of boresight in degrees. Can be NULL for observations that are not on sky. sky_angle: float Angle of the instrument focal plane on the sky in degrees. Can be NULL for observations that are not on sky, or for observations where the sky angle changes during the observation. zenith_angle: float Angle in degrees from the zenith at the start of the exposure. timespan: timespan visit_detector_region: instrument: string detector: int visit: int region: region
These "fields" contain information that can be used in query constraints.
3.3.4. query_data_ids¶
The query_data_ids method is a less general approach that will return the combinations of dimensions that could be used to identify datasets. The documentation page for query_data_ids outlines when not to use it.
Use it to find the dataIds overlapping the small timespan defined above:
for i, data_id in enumerate(butler.query_data_ids("visit", where="visit.timespan OVERLAPS my_timespan",
                                                  bind={"my_timespan": small_timespan})):
    print(data_id)
{instrument: 'LSSTCam-imSim', visit: 192352, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', visit: 192351, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} {instrument: 'LSSTCam-imSim', visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1}
3.4. Catalog spatial and temporal queries¶
The recommended method for querying and retrieving catalog data is to use the TAP service, as demonstrated in other tutorials. However, it is also possible to query catalog data using the same spatial and temporal constraints as used above for images.
The Butler's spatial reasoning is designed to work well for regions the size of full data products, like detector- or patch-level images and catalogs, and it's a poor choice for smaller-scale searches.
The following search is a bit slow in part because query_datasets searches for all source datasets that overlap a larger region and then filters the results down to the specified region.
for i, src_ref in enumerate(butler.query_datasets("source", band="i",
where="visit.timespan OVERLAPS my_timespan AND \
visit_detector_region.region OVERLAPS POINT(ra, dec)",
bind={"my_timespan": small_timespan, "ra": ra.asDegrees(), "dec": dec.asDegrees()})):
    print(src_ref)
    sources = butler.get(src_ref)
    print('Number of sources: ', len(sources))
    if i > 2:
        print('...')
        break
source@{instrument: 'LSSTCam-imSim', detector: 94, visit: 192350, band: 'i', physical_filter: 'i_sim_1.4', visit_system: 1} [sc=DataFrame] (run=2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211218T041605Z id=bccc6d47-0712-483c-9f3f-ef4c8c0b1919) Number of sources: 2290
Show the contents of the last source table retrieved from the Butler. Notice that both the rows and the columns of the table are truncated.
sources
coord_ra | coord_dec | parent | calib_detected | calib_psf_candidate | calib_psf_used | calib_psf_reserved | deblend_nChild | deblend_deblendedAsPsf | deblend_psfCenter_x | ... | ext_photometryKron_KronFlux_apCorr | ext_photometryKron_KronFlux_apCorrErr | ext_photometryKron_KronFlux_flag_apCorr | base_ClassificationExtendedness_value | base_ClassificationExtendedness_flag | base_FootprintArea_value | calib_astrometry_used | calib_photometry_used | calib_photometry_reserved | ccdVisitId | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
id | |||||||||||||||||||||
103267170389065729 | 0.934055 | -0.568410 | 0 | False | False | False | False | 0 | False | NaN | ... | 1.040659 | 0.0 | False | NaN | True | 67 | False | False | False | 192350094 |
103267170389065730 | 0.934241 | -0.568464 | 0 | False | False | False | False | 0 | False | NaN | ... | 1.041038 | 0.0 | False | NaN | True | 138 | False | False | False | 192350094 |
103267170389065731 | 0.934969 | -0.568675 | 0 | False | False | False | False | 0 | False | NaN | ... | 1.042161 | 0.0 | False | NaN | True | 85 | False | False | False | 192350094 |
103267170389065732 | 0.935404 | -0.568800 | 0 | True | False | False | False | 0 | False | NaN | ... | 1.042557 | 0.0 | False | NaN | True | 286 | False | False | False | 192350094 |
103267170389065733 | 0.935944 | -0.568956 | 0 | False | False | False | False | 0 | False | NaN | ... | 1.042766 | 0.0 | False | NaN | True | 58 | False | False | False | 192350094 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
103267170389068014 | 0.936796 | -0.573223 | 103267170389067564 | False | False | False | False | 0 | True | 3876.0 | ... | 1.039705 | 0.0 | False | NaN | True | 547 | False | False | False | 192350094 |
103267170389068015 | 0.933325 | -0.572254 | 103267170389067576 | False | False | False | False | 0 | True | 707.0 | ... | 1.029616 | 0.0 | False | 1.0 | False | 385 | False | False | False | 192350094 |
103267170389068016 | 0.933342 | -0.572257 | 103267170389067576 | False | False | False | False | 0 | True | 722.0 | ... | 1.029702 | 0.0 | False | NaN | True | 345 | False | False | False | 192350094 |
103267170389068017 | 0.934576 | -0.572626 | 103267170389067578 | False | False | False | False | 0 | True | 1857.0 | ... | 1.034726 | 0.0 | False | 1.0 | False | 234 | False | False | False | 192350094 |
103267170389068018 | 0.934570 | -0.572630 | 103267170389067578 | False | False | False | False | 0 | True | 1854.0 | ... | 1.034697 | 0.0 | False | 1.0 | False | 234 | False | False | False | 192350094 |
2290 rows × 451 columns
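In the table above, every ccdVisitId value is 192350094, which looks like the visit number (192350) followed by a three-digit detector number (094). This is a pattern read off the output for this one table, not a documented guarantee, so treat the sketch below as an illustration of that observation only.

```python
# Value copied from the ccdVisitId column of the table above.
ccd_visit_id = 192350094

# Assumption: the last three digits hold the detector number.
visit, detector = divmod(ccd_visit_id, 1000)
print(visit, detector)
```

The result agrees with the dataId of the source table retrieved above (visit 192350, detector 94).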
4. Explore metadata for butler-retrieved data products¶
Data retrieved from the butler is enriched with metadata.
The following provides a cursory overview of how to explore this metadata for image and catalog data retrieved via the butler.
In some cases these options were already covered in the sections above, but have been gathered here for easy reference.
4.1. Image data¶
Retrieve a calexp for a given visit and detector.
calexp = butler.get('calexp', dataId={'visit': 192350, 'detector': 94})
Get the information (metadata) available for this calexp.
calexp_info = calexp.getInfo()
Option: uncomment the following cell, put the cursor after the period, and press the tab key.
A pop-up window will display the methods available for calexp_info.
# calexp_info.
Option: alternatively, print all options that would display in the pop-up window from the above cell.
# [m for m in dir(calexp_info) if not m.startswith('_')]
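The same list comprehension can be narrowed further, for example to show only the getter methods. The sketch below uses a hypothetical stand-in class (ExampleInfo is not part of the LSST Science Pipelines) so that it runs without a Butler; with a real calexp_info the pattern would be applied to that object instead.

```python
class ExampleInfo:
    """Hypothetical stand-in for an object such as calexp_info."""

    def __init__(self):
        # Private attribute: filtered out below because it starts with '_'.
        self._cache = {}

    def getVisitInfo(self):
        return "visit info"

    def getSummaryStats(self):
        return "summary stats"

    def hasWcs(self):
        return True


info = ExampleInfo()

# All public attribute names, as in the cell above.
public = [m for m in dir(info) if not m.startswith('_')]

# Only the names that look like getter methods.
getters = [m for m in public if m.startswith('get')]

print(public)
print(getters)
```

Filtering on a prefix such as 'get' is only a naming heuristic; it does not guarantee that the attribute is callable or takes no arguments.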
Obtain the visit information and summary statistics for this calexp.
visit_info = calexp_info.getVisitInfo()
summary_info = calexp_info.getSummaryStats()
Display the summary statistics for this calexp.
summary_info
ExposureSummaryStats(version=0, psfSigma=1.6224822824659335, psfArea=38.96138265025744, psfIxx=2.6227677300688836, psfIyy=2.6421747649857115, psfIxy=0.0049247354220286026, ra=53.60067870252808, dec=-32.708777698927484, pixelScale=nan, zenithDistance=6.089820821456726, expTime=nan, zeroPoint=31.850734490895796, skyBg=3510.3904351890087, skyNoise=70.74531194603915, meanVar=5077.996807775005, raCorners=[53.51650658575399, 53.77062575371195, 53.685124787470684, 53.430466675033024], decCorners=[-32.566885411487135, -32.64026617626658, -32.85061564707703, -32.77706403154314], astromOffsetMean=0.006518872808538737, astromOffsetStd=0.003270360827952619, nPsfStar=74, psfStarDeltaE1Median=-0.0004757175604770205, psfStarDeltaE2Median=-0.0028587643089524606, psfStarDeltaE1Scatter=0.009978372216148206, psfStarDeltaE2Scatter=0.011134220890580537, psfStarDeltaSizeMedian=0.0012709535429201724, psfStarDeltaSizeScatter=0.010947698252411391, psfStarScaledDeltaSizeScatter=0.004160744279008074, psfTraceRadiusDelta=nan, psfApFluxDelta=nan, psfApCorrSigmaScaledDelta=nan, maxDistToNearestPsf=nan, effTime=nan, effTimePsfSigmaScale=nan, effTimeSkyBgScale=nan, effTimeZeroPointScale=nan, magLim=nan)
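Many of the fields in the output above are nan, which makes the populated values hard to spot. The following is a minimal sketch of one way to list only the non-NaN fields. It assumes that summary_info behaves like a standard Python dataclass, which has not been verified here. ExampleSummaryStats and finite_fields are hypothetical names introduced for illustration, with a handful of values copied from the output above, so that the sketch runs without a Butler.

```python
import dataclasses
import math


@dataclasses.dataclass
class ExampleSummaryStats:
    """Hypothetical stand-in holding a few values copied from the output above."""
    psfSigma: float = 1.6224822824659335
    pixelScale: float = float('nan')
    expTime: float = float('nan')
    zeroPoint: float = 31.850734490895796
    skyBg: float = 3510.3904351890087
    nPsfStar: int = 74


def finite_fields(stats):
    """Return a dict of the dataclass fields whose values are not NaN."""
    result = {}
    for field in dataclasses.fields(stats):
        value = getattr(stats, field.name)
        # Skip scalar floats that are NaN; keep everything else unchanged.
        if isinstance(value, float) and math.isnan(value):
            continue
        result[field.name] = value
    return result


populated = finite_fields(ExampleSummaryStats())
for name, value in populated.items():
    print(f"{name}: {value}")
```

If the dataclass assumption holds, calling finite_fields(summary_info) would apply the same filter to the real object; list-valued fields such as raCorners would pass through untouched.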
Option: explore other aspects of the metadata, for example, detector information.
# [m for m in dir(calexp_info.getDetector()) if not m.startswith('_')]
4.2. Catalog data¶
Retrieve sources from the src table for a given visit and detector.
src_cat = butler.get('src', dataId={'visit': 192350, 'detector': 94})
Option: uncomment the following cell, put the cursor after the period, and press the tab key. A pop-up window will display the options available for the retrieved data product.
# src_cat.
Option: uncomment and execute the following cell in order to print the table retrieved (it will be truncated).
# src_cat
Get the names of the columns available in the schema for this data product.
columns = src_cat.schema.getNames()
Option: uncomment and execute the following cell in order to list all columns (it is a very long list).
# columns
List only the top-level prefix of each column name (the part before the first underscore). This shows the different groups of columns available.
src_cat.getSchema().getNames(topOnly=True)
{'base', 'calib', 'coord', 'deblend', 'detect', 'ext', 'id', 'parent', 'sky'}
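To see which columns fall under each top-level prefix, the full list of names can be grouped on the part before the first underscore. This is a sketch in plain Python that mimics the behavior described above; it is not the implementation of topOnly. The example_names list is a small illustrative sample, and with a real catalog it would be replaced by the columns variable defined earlier.

```python
from collections import defaultdict

# Illustrative sample of column names; replace with `columns` for a real catalog.
example_names = [
    'id',
    'coord_ra',
    'coord_dec',
    'parent',
    'base_PsfFlux_instFlux',
    'base_PsfFlux_instFluxErr',
    'calib_psf_used',
    'deblend_nChild',
]

# Group each full column name under its top-level prefix.
groups = defaultdict(list)
for name in example_names:
    prefix = name.split('_', 1)[0]
    groups[prefix].append(name)

# The set of prefixes is analogous to the topOnly output shown above.
top_level = set(groups)

for prefix in sorted(groups):
    print(f"{prefix}: {len(groups[prefix])} column(s)")
```

Grouping this way makes it easy to print, for instance, only the columns under 'base' without scrolling through the full list.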
To view the information for only a given element, use the find method.
src_cat.getSchema().find('id')
SchemaItem(key=Key<L>(offset=0, nElements=1), field=Field['L'](name="id", doc="unique ID"))