Image Queries with the TAP Service¶
Contact author: Melissa Graham
Last verified to run: 2024-07-28
LSST Science Piplines version: Weekly 2024_16
Container Size: medium
Targeted learning level: intermediate
Description: Retrieve and display images using the ObsTAP service.
Skills: Query for image data via the TAP service; retrieve image data with PyVO and Datalink; display and save images.
LSST Data Products: calexp, deepCoadd
Packages: lsst.rsp, lsst.afw, pyvo.dal.adhoc, urllib.request
Credit: Originally developed by Melissa Graham and the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.
Get Support: Find DP0-related documentation and resources at dp0.lsst.io. Questions are welcome as new topics in the Support - Data Preview 0 category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
The simulted images for Rubin's Data Preview 0 (DP0) are available via both the data butler and the TAP (Table Access Protocol) service.
Use of the butler to query and retrieve images is covered in other tutorials, and is generally the recommended way to access DP0 images while working within the Notebook Aspect of the Rubin Science Platform, especially if any image reprocessing is being done. However, image access via the TAP service is also possible. It is how images are accessed via the Portal and API Aspects of the Rubin Science Platform.
This tutorial demonstrates how to discover the existence of image
data with the TAP service (Section 2),
how to write ADQL queries to retrieve image metadata (Section 3),
and how to use the PyVO's Datalink package for image data retrieval (Section 4).
It also demonstrates how to display the full retrieved images with matplotlib
,
afwDisplay
, and Firefly
.
Techniques for image display and manipulation, and the creation of image cutouts,
are covered in other DP0.2 tutorials.
1.1. Known issue with the sky region (s_region
)¶
The sky region (s_region
) refers to image metadata that contains the region of sky covered by an image.
For the Rubin data, the region is provided as a four-corner polygon in International Celestial Reference System (ICRS) coordinates of right ascension (RA) and declination (Dec), in decimal degrees. This piece of metadata enables the common image query of "does this image contain this point".
The s_region
metadata is of type string, and is formatted like:
POLYGON ICRS 61.693243 -36.780375 61.702475 -37.034112 62.015405 -37.026409 62.005102 -36.772698
Known Issue: The s_region
for individual detectors of single-visit images (e.g., raw, calexp
images)
has some padding and is larger than the actual image boundary.
Due to this issue, queries that request all images containing a given coordinate will return images for which the coordinate is in the padding zone, and not in the image.
For processed visit images (e.g., calexp
s), the s_region
parameter is defined at the time when the visit is defined for observation.
It uses the camera geometry and the telescope boresight, and has some padding.
Think of it as the anticipated outer boundaries that will contain the sky region imaged by that specific detector,
and not based on the image WCS or astrometric solution.
Although the image metadata will evolve from what is available for DP0.2, and query by WCS boundaries should become
possible, users should be prepared for cases where the coordinate is within the s_region
but is not covered by image pixel data.
For example, detector defects, lost amplifiers, etc. would not affect the s_region
.
This issue is demonstrated in Section 4.1.
1.2. DAL Service Errors¶
This tutorial notebook runs a lot of different TAP queries.
Users might encounter the following error.
DALServiceError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Typically this is fixed by rerunning the cell with the TAP service fetch statement. If that doesn't fix it, try restarting the kernel and clearing the notebook, and re-executing the cells. (In the menu, click on "Kernel" and then "Restart Kernal and Clear Outputs of All Cells...").
1.3. Import packages¶
Import packages a few standard python packages, lsst packages for image display and data access, astropy packages for working with images, and a module from the pvyo.dal package for data access.
import os
import numpy as np
import matplotlib.pyplot as plt
from urllib.request import urlretrieve
import lsst.afw.display as afwDisplay
import lsst.geom as geom
from lsst.rsp import get_tap_service
from lsst.afw.image import ExposureF
import astropy.visualization as vis
from astropy.io import fits
from astropy.wcs import WCS
from astropy.coordinates import SkyCoord
from astropy import units as u
from pyvo.dal.adhoc import DatalinkResults
1.4. Define functions and paramters¶
Instantiate the RSP's TAP service.
service = get_tap_service("tap")
Define auth_session
in order to authorize the TAP service to
access LSST images via Datalink in Section 4.
auth_session = service._session
Define a point near the center of the Data Preview 0 area with coordintes Right Ascension (RA) = 62 degrees and Declination (Dec) = -37 degrees. Use several different formats for the different use cases throughout this tutorial
target_ra
,target_dec
: floats, to be used in calculationstarget_str_ra
,target_str_dec
: strings, to be inserted into query statementstargetPoint
: aSpherePoint
object from the LSST Science Pipelines'geom
packagetargetCoord
: an astropySkyCoord
class
target_ra = 62.0
target_dec = -37.0
target_str_ra = '62.0'
target_str_dec = '-37.0'
targetPoint = geom.SpherePoint(target_ra*geom.degrees,
target_dec*geom.degrees)
targetCoord = SkyCoord(ra=target_ra*u.degree,
dec=target_dec*u.degree,
frame='icrs')
Define a function sregion_to_vertices
to be used in Section 3.5.
def sregion_to_vertices(sregion: str):
"""Convert the s_region from the ObsCore table into two
arrays containing the x and y vertices, in order to plot
boxes using matplotlib.
Parameters
----------
str_polygon: `str`
String formatted with 10 space-separated items, e.g.,
the "s_region" column from the ObsCore table which has
the words "POLYGON ICRS" followed by 8 numbers:
"POLYGON ICRS # # # # # # # #".
Returns
-------
x_vertices: `np.ndarray`
The array of x-vertices.
y_vertices: `np.ndarray`
The array of y-vertices.
"""
temp = sregion.split(' ')
xvertices = []
yvertices = []
ix = 2
iy = 3
for c in range(4):
xvertices.append(float(temp[ix]))
yvertices.append(float(temp[iy]))
ix += 2
iy += 2
xvertices.append(xvertices[0])
yvertices.append(yvertices[0])
return xvertices, yvertices
query = "SELECT * FROM tap_schema.schemas"
results = service.search(query)
results.to_table().to_pandas()
description | schema_index | schema_name | utype | |
---|---|---|---|---|
0 | Data Preview 0.1 includes five tables based on... | 1 | dp01_dc2_catalogs | |
1 | Data Preview 0.2 contains the image and catalo... | 0 | dp02_dc2_catalogs | |
2 | ObsCore v1.1 attributes in ObsTAP realization | 2 | ivoa | |
3 | A TAP-standard-mandated schema to describe tab... | 100000 | tap_schema | |
4 | UWS Metadata | 120000 | uws |
The fact that catalogs are available with schema_name
= dp02_dc2_catalogs
is pretty clear in the above table.
Although it is not clear, DP0 images are stored in ObsTAP
, with schema_name
= ivoa
.
This will be clarified for future data releases.
del query, results
2.1. The IVOA schema¶
IVOA stands for International Virtual Observatory Alliance, and they work towards defining standards for astronomical data storage that enable wide data sharing. The LSST Science Pipelines use IVOA standards.
Retrieve all of the tables available in the ivoa
schema and display them as a pandas table.
query = "SELECT * " \
"FROM tap_schema.tables " \
"WHERE tap_schema.tables.schema_name = 'ivoa'"
results = service.search(query)
results.to_table().to_pandas()
description | schema_name | table_index | table_name | table_type | utype | |
---|---|---|---|---|---|---|
0 | Observation metadata in the ObsTAP relational ... | ivoa | 1 | ivoa.ObsCore | table |
There is only one ivoa
table available for DP0: ObsCore
.
del query, results
2.2. The ObsCore table¶
The LSST ObsCore
table is essentially a view into the images
stored in the LSST's data butler registry (e.g., DMTN-236).
Retrieve the names of the columns and their data types, descriptions and units, and display these properties as a pandas table.
The output of the following cell is a list of the image metadata
that is available in the ObsCore
table.
query = "SELECT column_name, datatype, description, unit " \
"FROM tap_schema.columns " \
"WHERE table_name = 'ivoa.ObsCore'"
results = service.search(query)
results.to_table().to_pandas()
column_name | datatype | description | unit | |
---|---|---|---|---|
0 | access_format | char | Content format of the dataset | |
1 | access_url | char | URL used to access dataset | |
2 | calib_level | int | Calibration level of the observation: in {0, 1... | |
3 | dataproduct_subtype | char | Data product specific type | |
4 | dataproduct_type | char | Data product (file content) primary type | |
5 | em_max | double | stop in spectral coordinates | m |
6 | em_min | double | start in spectral coordinates | m |
7 | em_res_power | double | Value of the resolving power along the spectra... | |
8 | em_xel | long | Number of elements along the spectral axis | |
9 | facility_name | char | The name of the facility, telescope, or space ... | |
10 | instrument_name | char | The name of the instrument used for the observ... | |
11 | lsst_band | char | Abstract filter band designation | |
12 | lsst_ccdvisitid | long | Identifier for visit+CCD; useful in JOINs | |
13 | lsst_detector | long | Identifier for CCD within the LSSTCam focal plane | |
14 | lsst_filter | char | Physical filter designation from the LSSTCam f... | |
15 | lsst_patch | long | Lower level of LSST coadd skymap hierarchy | |
16 | lsst_tract | long | Upper level of LSST coadd skymap hierarchy | |
17 | lsst_visit | long | Identifier for a specific LSSTCam pointing | |
18 | o_ucd | char | Nature of the observable axis | |
19 | obs_collection | char | Name of the data collection | |
20 | obs_id | char | Internal ID given by the ObsTAP service | |
21 | obs_publisher_did | char | ID for the Dataset given by the publisher | |
22 | obs_title | char | Brief description of dataset in free format | |
23 | pol_xel | long | Number of elements along the polarization axis | |
24 | s_dec | double | Central Spatial Position in ICRS; Declination | deg |
25 | s_fov | double | Estimated size of the covered region as the di... | deg |
26 | s_ra | double | Central Spatial Position in ICRS; Right ascension | deg |
27 | s_region | char | Sky region covered by the data product (expres... | |
28 | s_resolution | double | Spatial resolution of data as FWHM of PSF | arcsec |
29 | s_xel1 | long | Number of elements along the first coordinate ... | |
30 | s_xel2 | long | Number of elements along the second coordinate... | |
31 | t_exptime | double | Total exposure time | s |
32 | t_max | double | Stop time in MJD | d |
33 | t_min | double | Start time in MJD | d |
34 | t_resolution | double | Temporal resolution FWHM | s |
35 | t_xel | long | Number of elements along the time axis | |
36 | target_name | char | Object of interest |
del query, results
3. Image metadata¶
Explore the metadata for all images that overlap the target point defined in Section 1.4.
Now that the queries are for data (not just schemas), and might take longer, use asynchronous queries (i.e., use the job service as shown below).
The following query selects all columns from the ObsCore
table
for all images that contain the
ICRS
coordinate 62, -37 within the s_region
.
(In other words, where the statment that the s_region
contains
the point is True
or = 1
.)
query = "SELECT * FROM ivoa.ObsCore "\
"WHERE CONTAINS(POINT('ICRS', " + target_str_ra + \
", " + target_str_dec + "), s_region) = 1"
print(query)
SELECT * FROM ivoa.ObsCore WHERE CONTAINS(POINT('ICRS', 62.0, -37.0), s_region) = 1
Define the service job using the query, then run it.
job = service.submit_job(query)
job.run()
<pyvo.dal.tap.AsyncTAPJob at 0x7dfb3a269cd0>
Wait until the job is completed.
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED
If the job phase is "COMPLETED", move on to the cell below.
If the job phase is is "ERROR", uncomment the following cell and execute it to see more information about the error. The most common errors are ADQL syntax errors which can be fixed by editing the query and re-submitting the job.
# job.raise_if_error()
When the job phase is "COMPLETED", fetch (retrieve) the results, and print the number of rows returned. This is the number of images found that overlap the specified point.
results = job.fetch_result().to_table().to_pandas()
print(len(results))
1554
Option to view the large results table. Notice that the table is truncated both in rows and columns.
# results
3.1. Data product types¶
Print the unique values of the column dataproduct_type
(and the number of rows with that value).
column = 'dataproduct_type'
print('Unique values of: ', column)
values, counts = np.unique(results[column], return_counts=True)
for value, count in zip(values, counts):
print('%s (%i)' % (value, count))
del column, values, counts
Unique values of: dataproduct_type image (1554)
Above, notice that all rows have dataproduct_type
= "image".
This is as expected, because the ObsCore
table only holds images
3.2. Data product subtypes¶
Print the unique values of the column dataproduct_subtype
(and the number of rows with that value).
column = 'dataproduct_subtype'
print('Unique values of: ', column)
values, counts = np.unique(results[column], return_counts=True)
for value, count in zip(values, counts):
print('%s (%i)' % (value, count))
del column, values, counts
Unique values of: dataproduct_subtype lsst.calexp (510) lsst.deepCoadd_calexp (12) lsst.goodSeeingCoadd (12) lsst.goodSeeingDiff_differenceExp (510) lsst.raw (510)
Above, notice that there are five types of images.
The DP0.2 documentation contains more detailed information about the image subtypes which are summarized below.
calexp
: A reduced and calibrated image (exposure), with the background subtracted. One of the camera's 189 detectors from a processed visit image (PVI).deepCoadd_calexp
: An image created by combining (coadding)calexp
s to make one deeper image (also has had the background subtracted).goodSeeingCoadd
: An image created by combining thecalexp
s in the top 30% of best image quality (best "seeing"). This image is used as the template (reference) for difference image analysis.goodSeeingDiff_differenceExp
: A "difference image" created by subtracting thegoodSeeingCoadd
from thecalexp
.raw
: An unreduced, uncalibrated image (exposure).
Why are there 510 calexp
but only 12 deepCoadd
images that overlap coordinate 62, -37?
As the LSST makes many repeated visits to the same area of sky, there will be
many individual exposures (raw
, calexp
, and difference images) that overlap
a given coordinate.
In this case, 510 visits covered these coordinates.
The LSST Science Pipelines defines a sky map that divides the survey area into
"tracts" (large polygons) and "patches" (smaller squares about the size
of one detector).
Coadded images are stored as individual patches, and they all slightly
overlap at the edges and corners.
Any given point within the survey region will be contained by at least 1
patch, and thus by 6 deepCoadd
images (one per filter, ugrizy).
Thus, the fact that there are 12 overlapping deepCoadd
images means
that these coordinates are at the edge of a patch, overlapping with
another patch.
3.3. Calibration level¶
Print the unique values of the column calib_level
(and the number of rows with that value).
column = 'calib_level'
print('Unique values of: ', column)
values, counts = np.unique(results[column], return_counts=True)
for value, count in zip(values, counts):
print('%s (%i)' % (value, count))
del column, values, counts
Unique values of: calib_level 1 (510) 2 (510) 3 (534)
Above, notice that there are three calibration levels.
The term calib_level
is a bit of a misnomer, as it represents
the processing done to an image more than any calibrations applied.
The calexp
, deepCoadd
, and difference images have all
been photometrically and astrometrically calibrated.
The calib_level
numbers correspond to:
- Raw images (510).
- Calibrated exposures (510).
- Coadded images and difference images (510 + 12 + 12).
This becomes evident by printing the unqiue calib_level
values
for each of the dataproduct_subtype
values in turn, as done
in the following cell.
values, counts = np.unique(results['dataproduct_subtype'],
return_counts=True)
for value, count in zip(values, counts):
print('%s (%i)' % (value, count))
tx = np.where(results['dataproduct_subtype'] == value)[0]
values2, counts2 = np.unique(results['calib_level'][tx],
return_counts=True)
del tx
for value2, count2 in zip(values2, counts2):
print(' image type %s (%i)' % (value2, count2))
del values2, counts2
print(' ')
del values, counts
lsst.calexp (510) image type 2 (510) lsst.deepCoadd_calexp (12) image type 3 (12) lsst.goodSeeingCoadd (12) image type 3 (12) lsst.goodSeeingDiff_differenceExp (510) image type 3 (510) lsst.raw (510) image type 1 (510)
3.4. Band and filter¶
For calexp
s, print the unique values of lsst_band
(and the number of rows with that value).
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
values, counts = np.unique(results['lsst_band'][tx], return_counts=True)
for value, count in zip(values, counts):
print('lsst_band %s (%i)' % (value, count))
del tx, values, counts
lsst_band g (57) lsst_band i (128) lsst_band r (137) lsst_band u (34) lsst_band y (88) lsst_band z (66)
For calexp
s, print the unique values of lsst_filter
(and the number of rows with that value).
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
values, counts = np.unique(results['lsst_filter'][tx], return_counts=True)
for value, count in zip(values, counts):
print('lsst_filter %s (%i)' % (value, count))
del tx, values, counts
lsst_filter g_sim_1.4 (57) lsst_filter i_sim_1.4 (128) lsst_filter r_sim_1.4 (137) lsst_filter u_sim_1.4 (34) lsst_filter y_sim_1.4 (88) lsst_filter z_sim_1.4 (66)
Above, see that band and filter have slightly different formats but are, essentially, the six LSST filters: ugrizy.
Sort by the number of visits per filter and print the fraction of visits in that filter.
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
values, counts = np.unique(results['lsst_filter'][tx], return_counts=True)
ix = np.argsort(counts)
for value, count in zip(values[ix], counts[ix]):
print('lsst_filter %s (%4.2f)' % (value, float(count)/float(len(tx))))
del tx, values, counts, ix
lsst_filter u_sim_1.4 (0.07) lsst_filter g_sim_1.4 (0.11) lsst_filter z_sim_1.4 (0.13) lsst_filter y_sim_1.4 (0.17) lsst_filter i_sim_1.4 (0.25) lsst_filter r_sim_1.4 (0.27)
Above, the distribution of visits per filter clearly favors the r and i bands. The survey strategy used for DP0, and thus this exactly filter distribution, is not representative of the final strategy which will be used when operations start. However, the general trend of having more visits in the r and i bands, and spending the least amount of time in u-band, will persist.
For deepCoadd
s, print the unique values of lsst_band
(and the number of rows with that value).
tx = np.where(results['dataproduct_subtype'] == 'lsst.deepCoadd_calexp')[0]
values, counts = np.unique(results['lsst_band'][tx], return_counts=True)
for value, count in zip(values, counts):
print('lsst_band %s (%i)' % (value, count))
del tx, values, counts
lsst_band g (2) lsst_band i (2) lsst_band r (2) lsst_band u (2) lsst_band y (2) lsst_band z (2)
The above output of 2 deepCoadd
per filter overlapping the chosen coordinates
is as expected, as described in Section 3.2.
3.5. Visit identifiers¶
The visit identifiers apply to single-visit images only, such as raw
, calexp
, and difference images.
A "visit" is one observation with the camera, at a given time and sky pointing, in a given filter.
A "calexp" is the calibrated exposure from one of the LSST Science Camera's 189 detectors, for one visit.
First, print the number of calexp
s again.
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
print('number of calexps: ', len(tx))
number of calexps: 510
For one calexp
, print the lsst_visit
, lsst_detector
, and lsst_ccdvisitid
.
Notice that the ccdvisitid
is a combination of the visit id and the detector id.
The LSST Science Camera has 189 detectors.
The ccdvisitid
uniquely identifies a calexp
.
x = 0
print(results['lsst_visit'][tx[x]], results['lsst_detector'][tx[x]], results['lsst_ccdvisitid'][tx[x]])
del x
698740 112 698740112
Print the number of unique visits, and then number of unique ccdvisitid
s.
values, counts = np.unique(results['lsst_visit'][tx], return_counts=True)
print('number of unique visits: ', len(values))
del values, counts
number of unique visits: 443
values, counts = np.unique(results['lsst_ccdvisitid'][tx], return_counts=True)
print('number of unique ccdvisitid: ', len(values))
del values, counts
number of unique ccdvisitid: 510
As the ccdvisitid
uniquely identifies a given calexp
, the number of unique values of ccdvisitid
matches the number of calexp
s, as expected.
However, the number of unique visits returned is less.
This means that some of the calexp
s returned
by the query were obtained in the same visit.
It is not possible for a given point on the sky
to be contained by more than one detector from the same visit.
The underlying issue here is the same one as described in Section 1.1.
For visit images (e.g., calexp
s), the s_region
has padding around the
edges, and it overlaps with neighboring detectors.
Instances where the target coordinates are at the edge of a detector, and
within the padding of the neighboring detector's s_region
,
both calexp
s will be returned by the query in Section 4.1.1.
This issue is discussed further and visualized in Section 4.1.6, and might make more sense after working through that section.
del tx
3.6. Patch and tract¶
As described in Section 3.2, the LSST Science Pipelines defines a sky map that divides the survey area into "tracts" (large polygons) and "patches" (smaller squares about the size of one detector). Tracts are identified by a unique 4-digit integer (e.g., 3831). Patches are identified by a 2-digit number (e.g., 02), but it is not unique. All tracts would have a patch 01, 02, 03, and so on.
The concept of patch and tract applies only to the stacked images,
which are tesselated (joined) to make wide-area deepCoadds
.
In order to have a string that uniquely identifies a patch,
create a new column called tract_and_patch
that has a format like 3831_02
.
temp = []
for i in range(len(results)):
temp.append(str(results['lsst_tract'].loc[i]) + '_'
+ str(results['lsst_patch'].loc[i]))
results['tract_and_patch'] = np.asarray(temp, dtype='str')
del temp
For deepCoadd
s, print the unique values of tract_and_patch
(and the number of rows with that value).
tx = np.where(results['dataproduct_subtype'] == 'lsst.deepCoadd_calexp')[0]
values, counts = np.unique(results['tract_and_patch'][tx], return_counts=True)
for value, count in zip(values, counts):
print('tract and patch %s (%s)' % (value, count))
del tx, values, counts
tract and patch 3831_10 (6) tract and patch 3831_3 (6)
The above output of 2 unique patches overlapping the chosen coordinates, each with 6 images (one per filter), is as expected as described in Section 3.2.
3.7. Center and region¶
In the ObsCore
table, the image centers are held in the s_ra
and s_dec
columns.
The region covered by the image is held in the s_region
column,
which is a string that contains the corners of the image.
See Section 1.1 for a known issue with s_region
.
For the first image of subtype calexp
, print the central coordinates and their type,
and the coordinates of the corners from s_region
.
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
print(results['s_ra'][tx[0]], results['s_dec'][tx[0]])
print(type(results['s_ra'][tx[0]]), type(results['s_dec'][tx[0]]))
print(results['s_region'][tx[0]])
del tx
61.854056086259405 -36.90350302893365 <class 'numpy.float64'> <class 'numpy.float64'> POLYGON ICRS 61.693243 -36.780375 61.702475 -37.034112 62.015405 -37.026409 62.005102 -36.772698
Recall that in Section 1.4, a function was defined called sregion_to_vertices
,
which converts the s_region
into x-axis vertices and y-axis vertices,
to enable plotting image boundaries with matplotlib
.
Create a plot that draws the s_region
corners of the u-band calexp
s
and the deepCoadd
s which overlap coordinates 62, -37.
fig = plt.figure(figsize=(4, 4))
plt.plot(target_ra, target_dec, 'o', ms=5, mew=0, color='black')
tx = np.where((results['dataproduct_subtype'] == 'lsst.calexp')
& (results['lsst_band'] == 'u'))[0]
for x in tx:
xvals, yvals = sregion_to_vertices(results['s_region'][x])
plt.plot(xvals, yvals, lw=1, alpha=0.2, color='darkviolet')
del xvals, yvals
del tx
tx = np.where((results['dataproduct_subtype'] == 'lsst.deepCoadd_calexp')
& (results['lsst_band'] == 'u'))[0]
for x in tx:
xvals, yvals = sregion_to_vertices(results['s_region'][x])
plt.plot(xvals, yvals, lw=1.5, alpha=0.4, color='black')
center_ra = results['s_ra'][x]
center_dec = results['s_dec'][x]
plt.text(center_ra+0.07, center_dec, results['tract_and_patch'][x])
del xvals, yvals, center_ra, center_dec
del tx
plt.xlabel('Right Ascension [degrees]')
plt.gca().invert_xaxis()
plt.ylabel('Declination [degrees]')
plt.show()
Figure 1: Above, the u-band calexp
images's sky regions (s_region
; thin purple lines) and the
deepCoadd
sky regions (thicker black lines, labeled with tract_and_patch
at their
center) that overlap coordinate 62, -37 (black dot).
This plot uses the convention north-up, east-left (i.e., RA increases to the left).
The deepCoadd
images for all filters use the same sky map - in other words,
the boundaries of a tract and patch are independent of filter.
Although the s_region
for the u-band deepCoadd
was used to draw the boundaries
in the plot above, they would have been the same for any chosen filter.
Clean up by deleting the query, job, and results used above.
del query, job, results
4. Image pixel data¶
Retrive image data (headers and pixel data), for a calexp
(Section 4.1)
and a deepCoadd
(Section 4.2) and display them with a variety of
image display tools.
No demonstration is provided here for differences images because accessing
them is similar to accessing a calexp
; this is left as an exercise for
the user in Section 5.
No demonstration is provided here for raw images, as they have a multi-extension format which is more challenging, and covered in a separate tutorial.
4.1. Calexp¶
4.1.1. TAP query¶
query = "SELECT * FROM ivoa.ObsCore "\
"WHERE CONTAINS(POINT('ICRS', " + \
target_str_ra + ", " + target_str_dec + "), s_region) = 1 "\
"AND lsst_band = 'i' "\
"AND calib_level = 2"
print(query)
SELECT * FROM ivoa.ObsCore WHERE CONTAINS(POINT('ICRS', 62.0, -37.0), s_region) = 1 AND lsst_band = 'i' AND calib_level = 2
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
results = job.fetch_result()
print(len(results))
Job phase is COMPLETED 128
Option to view results as an astropy
table.
# results.to_table()
For this demonstration only one of the 128 images is needed,
so extract the first row into results_0
.
results_0 = results[0]
Print selected metadata:
- dataproduct subtype
- CCD visit identifier (
lsst_ccdvisitid
) - visit identifier and detector number
- central RA and Dec
- band (filter)
- exposure start time
t_min
(a modified julian date, MJD) s_region
(corner coordinates)- a unique
obs_id
in the TAP service
print(results_0['dataproduct_subtype'])
print(results_0['lsst_ccdvisitid'])
print(results_0['lsst_visit'], results_0['lsst_detector'])
print('%7.4f %8.4f' % (results_0['s_ra'], results_0['s_dec']))
print(results_0['lsst_band'])
print(results_0['t_min'])
print(results_0['s_region'])
print(results_0['obs_id'])
lsst.calexp 214437078 214437 78 61.9699 -36.8508 i 59870.20678561111 POLYGON ICRS 62.045708 -37.018191 62.177779 -36.787420 61.894356 -36.683272 61.761576 -36.913711 214437-R20_S20
4.1.2. Access via Datalink URL¶
Data is accessible via an access_url
and the format of the
data is stored in access_format
.
print(results_0['access_url'])
print(results_0['access_format'])
https://data.lsst.cloud/api/datalink/links?ID=butler%3A//dp02/ecb19e3f-7ecf-42c4-b936-d2cce91174a8 application/x-votable+xml;content=datalink
Above, the format indicates that the image is stored as a votable
which is accessible via Datalink.
Obtain the table dl_results
using Datalink and the auth_session
defined in Section 1.4.
dl_results = DatalinkResults.from_result_url(results_0['access_url'],
session=auth_session)
Option to view dl_results
as an astropy table.
# dl_results.to_table()
The dl_results
table holds a different access_url
that
provides direct access to the image - define it as image_url
and print it to see how it is different from the access_url
above.
image_url = dl_results['access_url'][0]
print(image_url)
https://storage.googleapis.com/butler-us-central1-panda-dev/dc2/2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211219T033213Z/calexp/20221017/i/i_sim_1.4/214437/calexp_LSSTCam-imSim_i_i_sim_1_4_214437_R20_S20_2_2i_runs_DP0_2_v23_0_0_rc5_PREOPS-905_20211219T033213Z.fits?AWSAccessKeyId=GOOG1ENMYOD3NEUADDZHXYRGOGPYONDFUNT3JMCBPNCIMQWIEXHOUYONSHRDQ&Signature=DXY3XR7Dr%2Bin7Ex1gy6MJ6qRKVA%3D&Expires=1722279570
4.1.3. Get header data with astropy¶
Image headers are another type of metadata, in addition to what was explored in Section 3.
Use the astropy.fits
package to open the image file using its URL
and print the hdulist
to see the available headers.
hdulist = fits.open(image_url)
for hdu in hdulist:
print(hdu.name)
IMAGE IMAGE MASK VARIANCE ARCHIVE_INDEX FilterLabel Detector TransformMap ExposureSummaryStats Detector SkyWcs PsfexPsf PsfexPsf PhotoCalib ChebyshevBoundedField ApCorrMap ChebyshevBoundedField
There are two headers named "IMAGE". The first contains information about the observation, such as the telescope, instrument, airmass, date, and exposure time.
Option to print the first IMAGE header of observation information.
# hdulist[0].header
The second IMAGE header contains information about the exposure, such as the dimensions, pixel scale, world coordinate system (WCS) transformations, and any scaling factors applied to the pixel fluxes.
Option to print the second IMAGE header of exposure information.
# hdulist[1].header
Define img_hdr
as the second IMAGE header with exposure information.
img_hdr = hdulist[1].header
4.1.4. Get image WCS from the header¶
Use the astropy WCS
package to define the WCS based on header data.
img_wcs = WCS(img_hdr)
Use the method footprint_contains
, which requires a SkyCoord
be passed, to find out if the image
contains the target coordinates, or if they were in the s_region
padding zone (see Section 1.1.).
print(img_wcs.footprint_contains(targetCoord))
False
Above, the False
return means that the target coordinates are inside the s_region
but outside the image boundaries.
Continue with this image to investigate further.
4.1.5. Get pixel data with astropy¶
Obtain the image_data
(pixel flux data) via the image_url
.
img_data = fits.getdata(image_url)
4.1.6. Display with matplotlib¶
Use matplotlib.imshow
to display the pixel data, using the WCS projection from the header to appropriately label the x (RA) and y (Dec) axes.
Define the greyscale limits (i.e., pixel flux values corresponding
to black and white) using the astropy.visualization
package's
ZScaleInterval
class.
zscale = vis.ZScaleInterval()
zlimits = zscale.get_limits(img_data)
print(zlimits)
(-138.25166, 269.47370252809003)
Plot the image pixel data, and project the WCS with grid lines.
fig, ax = plt.subplots(1, figsize=(5, 5))
plt.subplot(projection=img_wcs)
plt.imshow(img_data, cmap='gray',
vmin=zlimits[0], vmax=zlimits[1])
plt.axis('on')
plt.grid(color='grey', ls='solid')
ax.set_xticks([])
ax.set_yticks([])
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
Figure 2: Plotting the pixel data in X and Y, with grid lines of RA and Dec overplotted. However, the default labeling leaves the image's sky orientation too unclear.
Create a more informative plot, below.
Use the image WCS to convert the target coordinates and the
s_region
image boundaries from sky coordinates to
pixel coordinates, and overplot as an orange circle and
purple lines, respectively.
Draw 90"-long lines from the image center (from Section 4.1.1) in the north (N; yellow) and east (E; cyan) directions to provide both a sense of image scale and orientation.
fig, ax = plt.subplots(1, figsize=(5, 5))
plt.subplot(projection=img_wcs)
plt.imshow(img_data, cmap='gray',
vmin=zlimits[0], vmax=zlimits[1])
plt.axis('on')
plt.grid(color='grey', ls='solid')
targetPixels = img_wcs.world_to_pixel(targetCoord)
plt.plot(targetPixels[0], targetPixels[1], 'o', ms=10, mew=2,
color='None', mec='darkorange')
del targetPixels
xvals, yvals = sregion_to_vertices(results_0['s_region'])
xpix = []
ypix = []
for xval, yval in zip(xvals, yvals):
tempCoord = SkyCoord(ra=xval*u.degree, dec=yval*u.degree, frame='icrs')
tempPixels = img_wcs.world_to_pixel(tempCoord)
xpix.append(tempPixels[0])
ypix.append(tempPixels[1])
del tempCoord, tempPixels
plt.plot(xpix, ypix, lw=1, color='darkviolet')
del xvals, yvals, xpix, ypix
cra = results_0['s_ra'] * u.degree
cdec = results_0['s_dec'] * u.degree
offset = (90. / 3600.) * u.degree
cPix = img_wcs.world_to_pixel(SkyCoord(cra, cdec, frame='icrs'))
oraPix = img_wcs.world_to_pixel(SkyCoord(cra + offset, cdec, frame='icrs'))
odecPix = img_wcs.world_to_pixel(SkyCoord(cra, cdec + offset, frame='icrs'))
plt.plot([cPix[0], oraPix[0]], [cPix[1], oraPix[1]],
ls='solid', lw=1, color='cyan')
plt.plot([cPix[0], odecPix[0]], [cPix[1], odecPix[1]],
ls='solid', lw=1, color='yellow')
plt.text(oraPix[0], oraPix[1], 'E', color='cyan')
plt.text(odecPix[0], odecPix[1], 'N', color='yellow')
del cra, cdec, offset, cPix, oraPix, odecPix
ax.set_xticks([])
ax.set_yticks([])
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
Figure 3: The calexp
displayed in greyscale, with grey grid lines for RA (30s) and Dec (5') overplotted to demonstrate that individual exposures are obtained with camera rotation (as seen in Figure 1). The target coordinates (04h 08m 00s, -37d 00m 00s orange circle) are just off the edge of this calexp
, yet within the boundaries of the s_region
. In the center, 90"-long lines extend in the north (N; yellow) and east (E; cyan) directions.
Clean up.
del results_0, dl_results, image_url, hdulist
del img_hdr, img_wcs, img_data
del zscale, zlimits
4.1.7. Select a different image to save and display¶
For the first five images in results
, get the image WCS from the header and use
the footprint_contains
method to check if the target is within the image.
for r, result in enumerate(results):
if r < 5:
dl_results = DatalinkResults.from_result_url(result['access_url'],
session=auth_session)
image_url = dl_results['access_url'][0]
hdulist = fits.open(image_url)
img_hdr = hdulist[1].header
img_wcs = WCS(img_hdr)
print(r, img_wcs.footprint_contains(targetCoord))
del dl_results, image_url, hdulist, img_hdr, img_wcs
0 False 1 True 2 True 3 False 4 True
Above, the second image (with index r = 1
) does contain the target.
Get the image_url
for the second image in the results
list.
r = 1
result = results[r]
dl_results = DatalinkResults.from_result_url(result['access_url'],
session=auth_session)
image_url = dl_results['access_url'][0]
4.1.8. Save image locally¶
Use the urlretrieve
module to retrieve the image from
its URL and save it as a file in the home directory.
Print the filename where the image was saved.
filename = os.path.join(os.getenv("HOME"), 'dp02_02c_image_calexp.fits')
urlretrieve(image_url, filename)
print(filename)
/home/melissagraham/dp02_02c_image_calexp.fits
The image can be dowloaded by using the left-hand file browser and navigating to the home directory, right-clicking on the file, and selecting "Download".
If the left-hand file browser is not visible, in the main menu click "View" and then "File Browser".
4.1.9. Display with afwDisplay¶
If the image (with its header) has been saved locally, it
can be read in as an ExposureF
object.
calexp = ExposureF(filename)
The ExposureF
format is the same format that images come in from the data butler.
This format will work with the LSST Science Pipelines' afwDisplay
package.
Set afwDisplay
to use matplotlib
.
afwDisplay.setDefaultBackend('matplotlib')
Display the image using afwDisplay
, and center an orange circle on the target coordinates.
fig, ax = plt.subplots()
display = afwDisplay.Display(frame=fig)
display.scale('linear', 'zscale')
display.mtv(calexp.image)
calexp_wcs = calexp.getWcs()
targetPixels = calexp_wcs.skyToPixel(targetPoint)
display.dot('o', targetPixels.getX(), targetPixels.getY(),
size=100, ctype='orange')
del calexp_wcs, targetPixels
plt.xlabel('x pixels')
plt.ylabel('y pixels')
plt.show()
Figure 3: The calexp
displayed in grayscale, with different scaling compared to Figure 2. With afwDisplay
, a color bar is shown by default, and the axes are in pixel coordinates. A large orange circle centered on the target coordinates appears as an arc in the lower-right corner, because the target coordinates are just off the edge of the calexp
.
Clean up before moving on to the next subsection.
del query, job, results
del result, dl_results, image_url
del filename, calexp
query = "SELECT * FROM ivoa.ObsCore "\
"WHERE CONTAINS(POINT('ICRS', " + \
target_str_ra + ", " + target_str_dec + "), s_region) = 1 "\
"AND lsst_band = 'i' "\
"AND calib_level = 3 "\
"AND dataproduct_subtype = 'lsst.deepCoadd_calexp'"
print(query)
SELECT * FROM ivoa.ObsCore WHERE CONTAINS(POINT('ICRS', 62.0, -37.0), s_region) = 1 AND lsst_band = 'i' AND calib_level = 3 AND dataproduct_subtype = 'lsst.deepCoadd_calexp'
Submit the job to the TAP service and fetch the results.
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED
results = job.fetch_result()
print(len(results))
2
Use the first image in the results
table.
Print the patch and tract.
results_0 = results[0]
print(results_0['lsst_tract'])
print(results_0['lsst_patch'])
3831 3
Save the image as a fits file in the home directory.
dl_results = DatalinkResults.from_result_url(results_0['access_url'],
session=auth_session)
image_url = dl_results['access_url'][0]
filename = os.path.join(os.getenv("HOME"), 'dp02_02c_image_deepCoadd.fits')
urlretrieve(image_url, filename)
('/home/melissagraham/dp02_02c_image_deepCoadd.fits', <http.client.HTTPMessage at 0x7dfb14d97190>)
Load the deepCoadd
image as an ExposureF
.
deepCoadd = ExposureF(filename)
4.2.2. Display with Firefly¶
In Section 4.1, matplotlib
was used to display the image with
imshow()
and via afwDisplay
.
As another example of image display, reset the backend of
afwDisplay
to be "firefly", which is the RSP's interactive
image visualization tool
Executing the following cell will open a new tab for the Firefly interface.
When Firefly opens: click on its tab above, and drag it down towards the mid-right area of this interface for a side-by-side view.
afwDisplay.setDefaultBackend('firefly')
afw_display = afwDisplay.Display(frame=1)
Display the deepCoadd
in Firefly, and set the mask plane to fully transparent
so that only pixel data is visualized.
afw_display.mtv(deepCoadd)
afw_display.setMaskTransparency(100)
Mark the query coordinates on the image as an orange circle. It will appear in the upper right corner of the image in Firefly.
deepCoadd_wcs = deepCoadd.getWcs()
targetPixel = deepCoadd_wcs.skyToPixel(targetPoint)
afw_display.dot('o', targetPixel.getX(), targetPixel.getY(),
size=40, ctype='orange')
To finish, clean up.
del deepCoadd_wcs, targetPixel
del query, job, results
del results_0, dl_results, image_url
del deepCoadd, filename
5. Exercises for the learner¶
(1) Follow the process in Section 4.1 to retrieve and visualize a difference image.
(2) Use the ccdvisitid
to obtain all detected sources in a calexp
. Use the TAP service to retrieve them and plot them onto the calexp. (Alternatively, use the tract and patch to obtain all detect objects in a deepCoadd
.)
Hint for (2):
SELECT ccdVisitId,coord_dec,coord_ra
FROM dp02_dc2_catalogs.Source
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),CIRCLE('ICRS', 62, -37, 0.25))=1
AND (ccdVisitId = 214437078)```