Image Queries with the TAP Service¶

Rubin Observatory logo, a graphical representation of turning stars into data.
Contact author: Melissa Graham
Last verified to run: 2024-07-28
LSST Science Pipelines version: Weekly 2024_16
Container Size: medium
Targeted learning level: intermediate

Description: Retrieve and display images using the ObsTAP service.

Skills: Query for image data via the TAP service; retrieve image data with PyVO and Datalink; display and save images.

LSST Data Products: calexp, deepCoadd

Packages: lsst.rsp, lsst.afw, pyvo.dal.adhoc, urllib.request

Credit: Originally developed by Melissa Graham and the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.

Get Support: Find DP0-related documentation and resources at dp0.lsst.io. Questions are welcome as new topics in the Support - Data Preview 0 category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.

1. Introduction¶

The simulated images for Rubin's Data Preview 0 (DP0) are available via both the data butler and the TAP (Table Access Protocol) service.

Use of the butler to query and retrieve images is covered in other tutorials, and is generally the recommended way to access DP0 images while working within the Notebook Aspect of the Rubin Science Platform, especially if any image reprocessing is being done. However, image access via the TAP service is also possible. It is how images are accessed via the Portal and API Aspects of the Rubin Science Platform.

This tutorial demonstrates how to discover the existence of image data with the TAP service (Section 2), how to write ADQL queries to retrieve image metadata (Section 3), and how to use PyVO's Datalink package for image data retrieval (Section 4). It also demonstrates how to display the full retrieved images with matplotlib, afwDisplay, and Firefly. Techniques for image display and manipulation, and the creation of image cutouts, are covered in other DP0.2 tutorials.

1.1. Known issue with the sky region (s_region)¶

The sky region (s_region) refers to image metadata that contains the region of sky covered by an image.

For the Rubin data, the region is provided as a four-corner polygon in International Celestial Reference System (ICRS) coordinates of right ascension (RA) and declination (Dec), in decimal degrees. This piece of metadata enables the common image query of "does this image contain this point".

The s_region metadata is of type string, and is formatted like:

POLYGON ICRS 61.693243 -36.780375 61.702475 -37.034112 62.015405 -37.026409 62.005102 -36.772698
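As a reference sketch (not part of the original flow), a containment query of the kind described above could be written as an ADQL string like the following; the full version is built and run in Section 3, and the columns selected here are only illustrative.

# Illustrative ADQL string for a "does this image contain this point" query.
point_query = (
    "SELECT obs_id, dataproduct_subtype, s_region "
    "FROM ivoa.ObsCore "
    "WHERE CONTAINS(POINT('ICRS', 62.0, -37.0), s_region) = 1"
)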

Known Issue: The s_region for individual detectors of single-visit images (e.g., raw, calexp images) has some padding and is larger than the actual image boundary.

Due to this issue, queries that request all images containing a given coordinate will return images for which the coordinate is in the padding zone, and not in the image.

For processed visit images (e.g., calexps), the s_region parameter is defined at the time when the visit is defined for observation. It uses the camera geometry and the telescope boresight, and has some padding. Think of it as the anticipated outer boundary that will contain the sky region imaged by that specific detector; it is not based on the image's WCS or astrometric solution.

Although the image metadata will evolve from what is available for DP0.2, and query by WCS boundaries should become possible, users should be prepared for cases where the coordinate is within the s_region but is not covered by image pixel data. For example, detector defects, lost amplifiers, etc. would not affect the s_region.

This issue is demonstrated in Section 4.1.

1.2. DAL Service Errors¶

This tutorial notebook runs a lot of different TAP queries.

Users might encounter the following error.

DALServiceError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Typically this is fixed by rerunning the cell with the TAP service fetch statement. If that doesn't fix it, try restarting the kernel, clearing the notebook outputs, and re-executing the cells (in the menu, click on "Kernel" and then "Restart Kernel and Clear Outputs of All Cells...").
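If the error persists across reruns, a simple retry wrapper can also help. The following is a minimal sketch (not part of the original notebook), with an arbitrary number of retries and wait time.

import time
from pyvo.dal.exceptions import DALServiceError

def search_with_retries(service, query, max_tries=3, wait_seconds=5):
    """Run a TAP query, retrying a few times on DALServiceError."""
    for attempt in range(max_tries):
        try:
            return service.search(query)
        except DALServiceError as error:
            print('Attempt %i failed: %s' % (attempt + 1, error))
            time.sleep(wait_seconds)
    raise RuntimeError('Query failed after %i attempts.' % max_tries)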

1.3. Import packages¶

Import a few standard Python packages, LSST packages for image display and data access, astropy packages for working with images, and a module from the pyvo.dal package for data access.

In [1]:
import os
import numpy as np
import matplotlib.pyplot as plt
from urllib.request import urlretrieve

import lsst.afw.display as afwDisplay
import lsst.geom as geom
from lsst.rsp import get_tap_service
from lsst.afw.image import ExposureF

import astropy.visualization as vis
from astropy.io import fits
from astropy.wcs import WCS
from astropy.coordinates import SkyCoord
from astropy import units as u

from pyvo.dal.adhoc import DatalinkResults

1.4. Define functions and parameters¶

Instantiate the RSP's TAP service.

In [2]:
service = get_tap_service("tap")

Define auth_session in order to authorize the TAP service to access LSST images via Datalink in Section 4.

In [3]:
auth_session = service._session

Define a point near the center of the Data Preview 0 area with coordinates Right Ascension (RA) = 62 degrees and Declination (Dec) = -37 degrees. Define it in several different formats for the different use cases throughout this tutorial:

  • target_ra, target_dec : floats, to be used in calculations
  • target_str_ra, target_str_dec : strings, to be inserted into query statements
  • targetPoint : a SpherePoint object from the LSST Science Pipelines' geom package
  • targetCoord : an astropy SkyCoord object
In [4]:
target_ra = 62.0
target_dec = -37.0
target_str_ra = '62.0'
target_str_dec = '-37.0'
targetPoint = geom.SpherePoint(target_ra*geom.degrees,
                               target_dec*geom.degrees)
targetCoord = SkyCoord(ra=target_ra*u.degree,
                       dec=target_dec*u.degree,
                       frame='icrs')

Define a function sregion_to_vertices to be used in Section 3.5.

In [5]:
def sregion_to_vertices(sregion: str):
    """Convert the s_region from the ObsCore table into two
    arrays containing the x and y vertices, in order to plot
    boxes using matplotlib.

    Parameters
    ----------
    str_polygon: `str`
        String formatted with 10 space-separated items, e.g.,
        the "s_region" column from the ObsCore table which has
        the words "POLYGON ICRS" followed by 8 numbers:
        "POLYGON ICRS # # # # # # # #".

    Returns
    -------
    x_vertices: `np.ndarray`
        The array of x-vertices.
    y_vertices: `np.ndarray`
        The array of y-vertices.
    """
    temp = sregion.split(' ')
    xvertices = []
    yvertices = []
    ix = 2
    iy = 3
    for c in range(4):
        xvertices.append(float(temp[ix]))
        yvertices.append(float(temp[iy]))
        ix += 2
        iy += 2
    xvertices.append(xvertices[0])
    yvertices.append(yvertices[0])
    return xvertices, yvertices
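As a quick sanity check (not part of the original notebook), the function can be applied to the example s_region string from Section 1.1; it returns five x and five y values, with the first corner repeated to close the box.

# Parse the example s_region string from Section 1.1 into plottable vertices.
example_sregion = ('POLYGON ICRS 61.693243 -36.780375 61.702475 -37.034112 '
                   '62.015405 -37.026409 62.005102 -36.772698')
xverts, yverts = sregion_to_vertices(example_sregion)
print(xverts)
print(yverts)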

2. Explore the TAP schema for images¶

Retrieve all available schemas and display them as a pandas table.

In [6]:
query = "SELECT * FROM tap_schema.schemas"
results = service.search(query)
results.to_table().to_pandas()
Out[6]:
description schema_index schema_name utype
0 Data Preview 0.1 includes five tables based on... 1 dp01_dc2_catalogs
1 Data Preview 0.2 contains the image and catalo... 0 dp02_dc2_catalogs
2 ObsCore v1.1 attributes in ObsTAP realization 2 ivoa
3 A TAP-standard-mandated schema to describe tab... 100000 tap_schema
4 UWS Metadata 120000 uws

The above table makes it clear that catalogs are available with schema_name = dp02_dc2_catalogs.

It is less obvious that DP0 images are stored in ObsTAP, with schema_name = ivoa. This will be clarified for future data releases.

In [7]:
del query, results

2.1. The IVOA schema¶

IVOA stands for the International Virtual Observatory Alliance, which works towards defining standards for astronomical data storage that enable wide data sharing. The LSST Science Pipelines use IVOA standards.

Retrieve all of the tables available in the ivoa schema and display them as a pandas table.

In [8]:
query = "SELECT * " \
        "FROM tap_schema.tables " \
        "WHERE tap_schema.tables.schema_name = 'ivoa'"
results = service.search(query)
results.to_table().to_pandas()
Out[8]:
description schema_name table_index table_name table_type utype
0 Observation metadata in the ObsTAP relational ... ivoa 1 ivoa.ObsCore table

There is only one ivoa table available for DP0: ObsCore.

In [9]:
del query, results

2.2. The ObsCore table¶

The LSST ObsCore table is essentially a view into the images stored in the LSST's data butler registry (e.g., DMTN-236).

Retrieve the names of the columns and their data types, descriptions and units, and display these properties as a pandas table.

The output of the following cell is a list of the image metadata that is available in the ObsCore table.

In [10]:
query = "SELECT column_name, datatype, description, unit " \
        "FROM tap_schema.columns " \
        "WHERE table_name = 'ivoa.ObsCore'"
results = service.search(query)
results.to_table().to_pandas()
Out[10]:
column_name datatype description unit
0 access_format char Content format of the dataset
1 access_url char URL used to access dataset
2 calib_level int Calibration level of the observation: in {0, 1...
3 dataproduct_subtype char Data product specific type
4 dataproduct_type char Data product (file content) primary type
5 em_max double stop in spectral coordinates m
6 em_min double start in spectral coordinates m
7 em_res_power double Value of the resolving power along the spectra...
8 em_xel long Number of elements along the spectral axis
9 facility_name char The name of the facility, telescope, or space ...
10 instrument_name char The name of the instrument used for the observ...
11 lsst_band char Abstract filter band designation
12 lsst_ccdvisitid long Identifier for visit+CCD; useful in JOINs
13 lsst_detector long Identifier for CCD within the LSSTCam focal plane
14 lsst_filter char Physical filter designation from the LSSTCam f...
15 lsst_patch long Lower level of LSST coadd skymap hierarchy
16 lsst_tract long Upper level of LSST coadd skymap hierarchy
17 lsst_visit long Identifier for a specific LSSTCam pointing
18 o_ucd char Nature of the observable axis
19 obs_collection char Name of the data collection
20 obs_id char Internal ID given by the ObsTAP service
21 obs_publisher_did char ID for the Dataset given by the publisher
22 obs_title char Brief description of dataset in free format
23 pol_xel long Number of elements along the polarization axis
24 s_dec double Central Spatial Position in ICRS; Declination deg
25 s_fov double Estimated size of the covered region as the di... deg
26 s_ra double Central Spatial Position in ICRS; Right ascension deg
27 s_region char Sky region covered by the data product (expres...
28 s_resolution double Spatial resolution of data as FWHM of PSF arcsec
29 s_xel1 long Number of elements along the first coordinate ...
30 s_xel2 long Number of elements along the second coordinate...
31 t_exptime double Total exposure time s
32 t_max double Stop time in MJD d
33 t_min double Start time in MJD d
34 t_resolution double Temporal resolution FWHM s
35 t_xel long Number of elements along the time axis
36 target_name char Object of interest
In [11]:
del query, results

3. Image metadata¶

Explore the metadata for all images that overlap the target point defined in Section 1.4.

Now that the queries are for data (not just schemas), and might take longer, use asynchronous queries (i.e., use the job service as shown below).

The following query selects all columns from the ObsCore table for all images that contain the ICRS coordinate 62, -37 within the s_region (in other words, where the statement that the s_region contains the point evaluates to true, i.e., = 1).

In [12]:
query = "SELECT * FROM ivoa.ObsCore "\
        "WHERE CONTAINS(POINT('ICRS', " + target_str_ra + \
        ", " + target_str_dec + "), s_region) = 1"
print(query)
SELECT * FROM ivoa.ObsCore WHERE CONTAINS(POINT('ICRS', 62.0, -37.0), s_region) = 1

Define the service job using the query, then run it.

In [13]:
job = service.submit_job(query)
job.run()
Out[13]:
<pyvo.dal.tap.AsyncTAPJob at 0x7dfb3a269cd0>

Wait until the job is completed.

In [14]:
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED

If the job phase is "COMPLETED", move on to the cell below.

If the job phase is "ERROR", uncomment the following cell and execute it to see more information about the error. The most common errors are ADQL syntax errors, which can be fixed by editing the query and re-submitting the job.

In [15]:
# job.raise_if_error()

When the job phase is "COMPLETED", fetch (retrieve) the results, and print the number of rows returned. This is the number of images found that overlap the specified point.

In [16]:
results = job.fetch_result().to_table().to_pandas()
print(len(results))
1554

Option to view the large results table. Notice that the table is truncated both in rows and columns.

In [17]:
# results

3.1. Data product types¶

Print the unique values of the column dataproduct_type (and the number of rows with that value).

In [18]:
column = 'dataproduct_type'
print('Unique values of: ', column)
values, counts = np.unique(results[column], return_counts=True)
for value, count in zip(values, counts):
    print('%s (%i)' % (value, count))
del column, values, counts
Unique values of:  dataproduct_type
image (1554)

Above, notice that all rows have dataproduct_type = "image". This is as expected, because the ObsCore table only holds images.

3.2. Data product subtypes¶

Print the unique values of the column dataproduct_subtype (and the number of rows with that value).

In [19]:
column = 'dataproduct_subtype'
print('Unique values of: ', column)
values, counts = np.unique(results[column], return_counts=True)
for value, count in zip(values, counts):
    print('%s (%i)' % (value, count))
del column, values, counts
Unique values of:  dataproduct_subtype
lsst.calexp (510)
lsst.deepCoadd_calexp (12)
lsst.goodSeeingCoadd (12)
lsst.goodSeeingDiff_differenceExp (510)
lsst.raw (510)

Above, notice that there are five types of images.

The DP0.2 documentation contains more detailed information about the image subtypes, which are summarized below.

  1. calexp : A reduced and calibrated image (exposure), with the background subtracted; it covers one of the camera's 189 detectors of a processed visit image (PVI).

  2. deepCoadd_calexp : An image created by combining (coadding) calexps to make one deeper image (also has had the background subtracted).

  3. goodSeeingCoadd : An image created by combining the calexps in the top 30% of best image quality (best "seeing"). This image is used as the template (reference) for difference image analysis.

  4. goodSeeingDiff_differenceExp : A "difference image" created by subtracting the goodSeeingCoadd from the calexp.

  5. raw : An unreduced, uncalibrated image (exposure).

Why are there 510 calexp but only 12 deepCoadd images that overlap coordinate 62, -37?

As the LSST makes many repeated visits to the same area of sky, there will be many individual exposures (raw, calexp, and difference images) that overlap a given coordinate. In this case, 510 calexps overlap these coordinates.

The LSST Science Pipelines defines a sky map that divides the survey area into "tracts" (large polygons) and "patches" (smaller squares about the size of one detector). Coadded images are stored as individual patches, and they all slightly overlap at the edges and corners. Any given point within the survey region will be contained by at least 1 patch, and thus by 6 deepCoadd images (one per filter, ugrizy).

Thus, the fact that there are 12 overlapping deepCoadd images means that these coordinates are at the edge of a patch, overlapping with another patch.

3.3. Calibration level¶

Print the unique values of the column calib_level (and the number of rows with that value).

In [20]:
column = 'calib_level'
print('Unique values of: ', column)
values, counts = np.unique(results[column], return_counts=True)
for value, count in zip(values, counts):
    print('%s (%i)' % (value, count))
del column, values, counts
Unique values of:  calib_level
1 (510)
2 (510)
3 (534)

Above, notice that there are three calibration levels.

The term calib_level is a bit of a misnomer, as it represents the processing done to an image more than any calibrations applied. The calexp, deepCoadd, and difference images have all been photometrically and astrometrically calibrated.

The calib_level numbers correspond to:

  1. Raw images (510).
  2. Calibrated exposures (510).
  3. Coadded images and difference images (510 + 12 + 12).

This becomes evident by printing the unique calib_level values for each of the dataproduct_subtype values in turn, as done in the following cell.

In [21]:
values, counts = np.unique(results['dataproduct_subtype'],
                           return_counts=True)
for value, count in zip(values, counts):
    print('%s (%i)' % (value, count))
    tx = np.where(results['dataproduct_subtype'] == value)[0]
    values2, counts2 = np.unique(results['calib_level'][tx],
                                 return_counts=True)
    del tx
    for value2, count2 in zip(values2, counts2):
        print('  image type %s (%i)' % (value2, count2))
    del values2, counts2
    print(' ')
del values, counts
lsst.calexp (510)
  image type 2 (510)
 
lsst.deepCoadd_calexp (12)
  image type 3 (12)
 
lsst.goodSeeingCoadd (12)
  image type 3 (12)
 
lsst.goodSeeingDiff_differenceExp (510)
  image type 3 (510)
 
lsst.raw (510)
  image type 1 (510)
 

3.4. Band and filter¶

For calexps, print the unique values of lsst_band (and the number of rows with that value).

In [22]:
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
values, counts = np.unique(results['lsst_band'][tx], return_counts=True)
for value, count in zip(values, counts):
    print('lsst_band %s (%i)' % (value, count))
del tx, values, counts
lsst_band g (57)
lsst_band i (128)
lsst_band r (137)
lsst_band u (34)
lsst_band y (88)
lsst_band z (66)

For calexps, print the unique values of lsst_filter (and the number of rows with that value).

In [23]:
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
values, counts = np.unique(results['lsst_filter'][tx], return_counts=True)
for value, count in zip(values, counts):
    print('lsst_filter %s (%i)' % (value, count))
del tx, values, counts
lsst_filter g_sim_1.4 (57)
lsst_filter i_sim_1.4 (128)
lsst_filter r_sim_1.4 (137)
lsst_filter u_sim_1.4 (34)
lsst_filter y_sim_1.4 (88)
lsst_filter z_sim_1.4 (66)

Above, see that band and filter have slightly different formats but are, essentially, the six LSST filters: ugrizy.

Sort by the number of visits per filter and print the fraction of visits in that filter.

In [24]:
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
values, counts = np.unique(results['lsst_filter'][tx], return_counts=True)
ix = np.argsort(counts)
for value, count in zip(values[ix], counts[ix]):
    print('lsst_filter %s (%4.2f)' % (value, float(count)/float(len(tx))))
del tx, values, counts, ix
lsst_filter u_sim_1.4 (0.07)
lsst_filter g_sim_1.4 (0.11)
lsst_filter z_sim_1.4 (0.13)
lsst_filter y_sim_1.4 (0.17)
lsst_filter i_sim_1.4 (0.25)
lsst_filter r_sim_1.4 (0.27)

Above, the distribution of visits per filter clearly favors the r and i bands. The survey strategy used for DP0, and thus this exact filter distribution, is not representative of the final strategy that will be used when operations start. However, the general trend of having more visits in the r and i bands, and spending the least amount of time in the u band, will persist.

For deepCoadds, print the unique values of lsst_band (and the number of rows with that value).

In [25]:
tx = np.where(results['dataproduct_subtype'] == 'lsst.deepCoadd_calexp')[0]
values, counts = np.unique(results['lsst_band'][tx], return_counts=True)
for value, count in zip(values, counts):
    print('lsst_band %s (%i)' % (value, count))
del tx, values, counts
lsst_band g (2)
lsst_band i (2)
lsst_band r (2)
lsst_band u (2)
lsst_band y (2)
lsst_band z (2)

The above output of 2 deepCoadd per filter overlapping the chosen coordinates is as expected, as described in Section 3.2.

3.5. Visit identifiers¶

The visit identifiers apply to single-visit images only, such as raw, calexp, and difference images.

A "visit" is one observation with the camera, at a given time and sky pointing, in a given filter.

A "calexp" is the calibrated exposure from one of the LSST Science Camera's 189 detectors, for one visit.

First, print the number of calexps again.

In [26]:
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
print('number of calexps: ', len(tx))
number of calexps:  510

For one calexp, print the lsst_visit, lsst_detector, and lsst_ccdvisitid.

Notice that the ccdvisitid is a combination of the visit id and the detector id. The LSST Science Camera has 189 detectors. The ccdvisitid uniquely identifies a calexp.

In [27]:
x = 0
print(results['lsst_visit'][tx[x]], results['lsst_detector'][tx[x]], results['lsst_ccdvisitid'][tx[x]])
del x
698740 112 698740112
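Assuming the concatenation convention implied above, in which the ccdvisitid is the visit identifier followed by a three-digit detector identifier (an illustrative assumption, not an official parsing rule), the identifier can be decomposed arithmetically.

# Decompose the ccdvisitid of the first calexp into visit and detector.
x = 0
ccdvisitid = int(results['lsst_ccdvisitid'][tx[x]])
print('visit: ', ccdvisitid // 1000)
print('detector: ', ccdvisitid % 1000)
del x, ccdvisitid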

Print the number of unique visits, and then number of unique ccdvisitids.

In [28]:
values, counts = np.unique(results['lsst_visit'][tx], return_counts=True)
print('number of unique visits: ', len(values))
del values, counts
number of unique visits:  443
In [29]:
values, counts = np.unique(results['lsst_ccdvisitid'][tx], return_counts=True)
print('number of unique ccdvisitid: ', len(values))
del values, counts
number of unique ccdvisitid:  510

As the ccdvisitid uniquely identifies a given calexp, the number of unique values of ccdvisitid matches the number of calexps, as expected.

However, the number of unique visits returned is lower. This means that some of the calexps returned by the query were obtained during the same visit, even though it is not possible for a given point on the sky to fall on more than one detector in the same visit.

The underlying issue here is the same one described in Section 1.1. For visit images (e.g., calexps), the s_region has padding around the edges, and it overlaps with neighboring detectors. In cases where the target coordinates are at the edge of a detector and within the padding of the neighboring detector's s_region, both calexps will be returned by queries such as the one in Section 4.1.1.

This issue is discussed further and visualized in Section 4.1.6, and might make more sense after working through that section.
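As a quick check (a minimal sketch added here, not in the original flow), count how many visits contributed more than one calexp to these results.

# Count the visits that appear more than once among the returned calexps.
values, counts = np.unique(results['lsst_visit'][tx], return_counts=True)
print('visits contributing more than one calexp: ', np.sum(counts > 1))
del values, counts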

In [30]:
del tx

3.6. Patch and tract¶

As described in Section 3.2, the LSST Science Pipelines defines a sky map that divides the survey area into "tracts" (large polygons) and "patches" (smaller squares about the size of one detector). Tracts are identified by a unique 4-digit integer (e.g., 3831). Patches are identified by a smaller number (e.g., 3), but it is not unique: every tract has a patch 1, 2, 3, and so on.

The concept of patch and tract applies only to the stacked images, which are tessellated (joined) to make wide-area deepCoadds.

In order to have a string that uniquely identifies a patch, create a new column called tract_and_patch that has a format like 3831_3.

In [31]:
temp = []
for i in range(len(results)):
    temp.append(str(results['lsst_tract'].loc[i]) + '_'
                + str(results['lsst_patch'].loc[i]))
results['tract_and_patch'] = np.asarray(temp, dtype='str')
del temp

For deepCoadds, print the unique values of tract_and_patch (and the number of rows with that value).

In [32]:
tx = np.where(results['dataproduct_subtype'] == 'lsst.deepCoadd_calexp')[0]
values, counts = np.unique(results['tract_and_patch'][tx], return_counts=True)
for value, count in zip(values, counts):
    print('tract and patch %s (%s)' % (value, count))
del tx, values, counts
tract and patch 3831_10 (6)
tract and patch 3831_3 (6)

The above output of 2 unique patches overlapping the chosen coordinates, each with 6 images (one per filter), is as expected as described in Section 3.2.

3.7. Center and region¶

In the ObsCore table, the image centers are held in the s_ra and s_dec columns. The region covered by the image is held in the s_region column, which is a string that contains the corners of the image.

See Section 1.1 for a known issue with s_region.

For the first image of subtype calexp, print the central coordinates and their type, and the coordinates of the corners from s_region.

In [33]:
tx = np.where(results['dataproduct_subtype'] == 'lsst.calexp')[0]
print(results['s_ra'][tx[0]], results['s_dec'][tx[0]])
print(type(results['s_ra'][tx[0]]), type(results['s_dec'][tx[0]]))
print(results['s_region'][tx[0]])
del tx
61.854056086259405 -36.90350302893365
<class 'numpy.float64'> <class 'numpy.float64'>
POLYGON ICRS 61.693243 -36.780375 61.702475 -37.034112 62.015405 -37.026409 62.005102 -36.772698

Recall that in Section 1.4, a function was defined called sregion_to_vertices, which converts the s_region into x-axis vertices and y-axis vertices, to enable plotting image boundaries with matplotlib.

Create a plot that draws the s_region corners of the u-band calexps and the deepCoadds which overlap coordinates 62, -37.

In [34]:
fig = plt.figure(figsize=(4, 4))
plt.plot(target_ra, target_dec, 'o', ms=5, mew=0, color='black')

tx = np.where((results['dataproduct_subtype'] == 'lsst.calexp')
              & (results['lsst_band'] == 'u'))[0]
for x in tx:
    xvals, yvals = sregion_to_vertices(results['s_region'][x])
    plt.plot(xvals, yvals, lw=1, alpha=0.2, color='darkviolet')
    del xvals, yvals
del tx

tx = np.where((results['dataproduct_subtype'] == 'lsst.deepCoadd_calexp')
              & (results['lsst_band'] == 'u'))[0]
for x in tx:
    xvals, yvals = sregion_to_vertices(results['s_region'][x])
    plt.plot(xvals, yvals, lw=1.5, alpha=0.4, color='black')
    center_ra = results['s_ra'][x]
    center_dec = results['s_dec'][x]
    plt.text(center_ra+0.07, center_dec, results['tract_and_patch'][x])
    del xvals, yvals, center_ra, center_dec
del tx

plt.xlabel('Right Ascension [degrees]')
plt.gca().invert_xaxis()
plt.ylabel('Declination [degrees]')
plt.show()
Image

Figure 1: Above, the u-band calexp images' sky regions (s_region; thin purple lines) and the deepCoadd sky regions (thicker black lines, labeled with tract_and_patch at their center) that overlap coordinate 62, -37 (black dot). This plot uses the convention north-up, east-left (i.e., RA increases to the left).

The deepCoadd images for all filters use the same sky map - in other words, the boundaries of a tract and patch are independent of filter. Although the s_region for the u-band deepCoadd was used to draw the boundaries in the plot above, they would have been the same for any chosen filter.

Clean up by deleting the query, job, and results used above.

In [35]:
del query, job, results

4. Image pixel data¶

Retrieve image data (headers and pixel data) for a calexp (Section 4.1) and a deepCoadd (Section 4.2), and display them with a variety of image display tools.

No demonstration is provided here for difference images because accessing them is similar to accessing a calexp; this is left as an exercise for the user in Section 5.

No demonstration is provided here for raw images, as they have a multi-extension format which is more challenging, and covered in a separate tutorial.

4.1. Calexp¶

4.1.1. TAP query¶

In [36]:
query = "SELECT * FROM ivoa.ObsCore "\
        "WHERE CONTAINS(POINT('ICRS', " + \
        target_str_ra + ", " + target_str_dec + "), s_region) = 1 "\
        "AND lsst_band = 'i' "\
        "AND calib_level = 2"
print(query)
SELECT * FROM ivoa.ObsCore WHERE CONTAINS(POINT('ICRS', 62.0, -37.0), s_region) = 1 AND lsst_band = 'i' AND calib_level = 2
In [37]:
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
results = job.fetch_result()
print(len(results))
Job phase is COMPLETED
128

Option to view results as an astropy table.

In [38]:
# results.to_table()

For this demonstration only one of the 128 images is needed, so extract the first row into results_0.

In [39]:
results_0 = results[0]

Print selected metadata:

  • dataproduct subtype
  • CCD visit identifier (lsst_ccdvisitid)
  • visit identifier and detector number
  • central RA and Dec
  • band (filter)
  • exposure start time t_min (a modified julian date, MJD)
  • s_region (corner coordinates)
  • a unique obs_id in the TAP service
In [40]:
print(results_0['dataproduct_subtype'])
print(results_0['lsst_ccdvisitid'])
print(results_0['lsst_visit'], results_0['lsst_detector'])
print('%7.4f %8.4f' % (results_0['s_ra'], results_0['s_dec']))
print(results_0['lsst_band'])
print(results_0['t_min'])
print(results_0['s_region'])
print(results_0['obs_id'])
lsst.calexp
214437078
214437 78
61.9699 -36.8508
i
59870.20678561111
POLYGON ICRS 62.045708 -37.018191 62.177779 -36.787420 61.894356 -36.683272 61.761576 -36.913711
214437-R20_S20

4.1.2. Access via Datalink URL¶

Data is accessible via an access_url and the format of the data is stored in access_format.

In [41]:
print(results_0['access_url'])
print(results_0['access_format'])
https://data.lsst.cloud/api/datalink/links?ID=butler%3A//dp02/ecb19e3f-7ecf-42c4-b936-d2cce91174a8
application/x-votable+xml;content=datalink

Above, the format indicates that the access_url returns a Datalink VOTable, which in turn provides access to the image.

Obtain the table dl_results using Datalink and the auth_session defined in Section 1.4.

In [42]:
dl_results = DatalinkResults.from_result_url(results_0['access_url'],
                                             session=auth_session)

Option to view dl_results as an astropy table.

In [43]:
# dl_results.to_table()

The dl_results table holds a different access_url that provides direct access to the image - define it as image_url and print it to see how it is different from the access_url above.

In [44]:
image_url = dl_results['access_url'][0]
print(image_url)
https://storage.googleapis.com/butler-us-central1-panda-dev/dc2/2.2i/runs/DP0.2/v23_0_0_rc5/PREOPS-905/20211219T033213Z/calexp/20221017/i/i_sim_1.4/214437/calexp_LSSTCam-imSim_i_i_sim_1_4_214437_R20_S20_2_2i_runs_DP0_2_v23_0_0_rc5_PREOPS-905_20211219T033213Z.fits?AWSAccessKeyId=GOOG1ENMYOD3NEUADDZHXYRGOGPYONDFUNT3JMCBPNCIMQWIEXHOUYONSHRDQ&Signature=DXY3XR7Dr%2Bin7Ex1gy6MJ6qRKVA%3D&Expires=1722279570

4.1.3. Get header data with astropy¶

Image headers are another type of metadata, in addition to what was explored in Section 3.

Use the astropy.io.fits package to open the image file from its URL, and print the name of each HDU in the hdulist to see the available headers.

In [45]:
hdulist = fits.open(image_url)
for hdu in hdulist:
    print(hdu.name)
IMAGE
IMAGE
MASK
VARIANCE
ARCHIVE_INDEX
FilterLabel
Detector
TransformMap
ExposureSummaryStats
Detector
SkyWcs
PsfexPsf
PsfexPsf
PhotoCalib
ChebyshevBoundedField
ApCorrMap
ChebyshevBoundedField

There are two headers named "IMAGE". The first contains information about the observation, such as the telescope, instrument, airmass, date, and exposure time.

Option to print the first IMAGE header of observation information.

In [46]:
# hdulist[0].header

The second IMAGE header contains information about the exposure, such as the dimensions, pixel scale, world coordinate system (WCS) transformations, and any scaling factors applied to the pixel fluxes.

Option to print the second IMAGE header of exposure information.

In [47]:
# hdulist[1].header

Define img_hdr as the second IMAGE header with exposure information.

In [48]:
img_hdr = hdulist[1].header
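A few individual values can be read directly from this header. The keyword names used below are standard FITS and WCS keywords, and their presence in this particular file is an assumption; the Header.get method returns None rather than raising an error if a keyword is absent.

# Print a few standard FITS/WCS keywords from the exposure header (assumed present).
print(img_hdr.get('NAXIS1'), img_hdr.get('NAXIS2'))
print(img_hdr.get('CTYPE1'), img_hdr.get('CTYPE2'))
print(img_hdr.get('CRVAL1'), img_hdr.get('CRVAL2'))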

4.1.4. Get image WCS from the header¶

Use the astropy WCS package to define the WCS based on header data.

In [49]:
img_wcs = WCS(img_hdr)

Use the method footprint_contains, which requires a SkyCoord be passed, to find out if the image contains the target coordinates, or if they were in the s_region padding zone (see Section 1.1.).

In [50]:
print(img_wcs.footprint_contains(targetCoord))
False

Above, the False return means that the target coordinates are inside the s_region but outside the image boundaries.

Continue with this image to investigate further.
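One way to see this (a sketch, not part of the original notebook) is to compare the image's corner coordinates computed from the WCS with the padded s_region corners; astropy's WCS.calc_footprint uses the image dimensions read from the header.

# Corner (RA, Dec) coordinates computed from the image WCS.
print(img_wcs.calc_footprint())
# The padded s_region corners from the ObsCore metadata, for comparison.
print(results_0['s_region'])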

4.1.5. Get pixel data with astropy¶

Obtain the image pixel data, img_data, via the image_url.

In [51]:
img_data = fits.getdata(image_url)

4.1.6. Display with matplotlib¶

Use matplotlib.imshow to display the pixel data, using the WCS projection from the header to appropriately label the x (RA) and y (Dec) axes.

Define the greyscale limits (i.e., pixel flux values corresponding to black and white) using the astropy.visualization package's ZScaleInterval class.

In [52]:
zscale = vis.ZScaleInterval()
zlimits = zscale.get_limits(img_data)
print(zlimits)
(-138.25166, 269.47370252809003)

Plot the image pixel data, and project the WCS with grid lines.

In [53]:
fig, ax = plt.subplots(1, figsize=(5, 5))
plt.subplot(projection=img_wcs)
plt.imshow(img_data, cmap='gray',
           vmin=zlimits[0], vmax=zlimits[1])
plt.axis('on')
plt.grid(color='grey', ls='solid')
ax.set_xticks([])
ax.set_yticks([])
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
Image

Figure 2: The pixel data plotted in X and Y, with grid lines of RA and Dec overplotted. However, the default labeling leaves the image's sky orientation unclear.

Create a more informative plot, below.

Use the image WCS to convert the target coordinates and the s_region image boundaries from sky coordinates to pixel coordinates, and overplot as an orange circle and purple lines, respectively.

Draw 90"-long lines from the image center (from Section 4.1.1) in the north (N; yellow) and east (E; cyan) directions to provide both a sense of image scale and orientation.

In [54]:
fig, ax = plt.subplots(1, figsize=(5, 5))
plt.subplot(projection=img_wcs)
plt.imshow(img_data, cmap='gray',
           vmin=zlimits[0], vmax=zlimits[1])
plt.axis('on')
plt.grid(color='grey', ls='solid')

targetPixels = img_wcs.world_to_pixel(targetCoord)
plt.plot(targetPixels[0], targetPixels[1], 'o', ms=10, mew=2,
         color='None', mec='darkorange')
del targetPixels

xvals, yvals = sregion_to_vertices(results_0['s_region'])
xpix = []
ypix = []
for xval, yval in zip(xvals, yvals):
    tempCoord = SkyCoord(ra=xval*u.degree, dec=yval*u.degree, frame='icrs')
    tempPixels = img_wcs.world_to_pixel(tempCoord)
    xpix.append(tempPixels[0])
    ypix.append(tempPixels[1])
    del tempCoord, tempPixels
plt.plot(xpix, ypix, lw=1, color='darkviolet')
del xvals, yvals, xpix, ypix

cra = results_0['s_ra'] * u.degree
cdec = results_0['s_dec'] * u.degree
offset = (90. / 3600.) * u.degree
cPix = img_wcs.world_to_pixel(SkyCoord(cra, cdec, frame='icrs'))
oraPix = img_wcs.world_to_pixel(SkyCoord(cra + offset, cdec, frame='icrs'))
odecPix = img_wcs.world_to_pixel(SkyCoord(cra, cdec + offset, frame='icrs'))
plt.plot([cPix[0], oraPix[0]], [cPix[1], oraPix[1]],
         ls='solid', lw=1, color='cyan')
plt.plot([cPix[0], odecPix[0]], [cPix[1], odecPix[1]],
         ls='solid', lw=1, color='yellow')
plt.text(oraPix[0], oraPix[1], 'E', color='cyan')
plt.text(odecPix[0], odecPix[1], 'N', color='yellow')
del cra, cdec, offset, cPix, oraPix, odecPix

ax.set_xticks([])
ax.set_yticks([])
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
Image

Figure 3: The calexp displayed in greyscale, with grey grid lines for RA (30s) and Dec (5') overplotted to demonstrate that individual exposures are obtained with camera rotation (as seen in Figure 1). The target coordinates (04h 08m 00s, -37d 00m 00s orange circle) are just off the edge of this calexp, yet within the boundaries of the s_region. In the center, 90"-long lines extend in the north (N; yellow) and east (E; cyan) directions.

Clean up.

In [55]:
del results_0, dl_results, image_url, hdulist
del img_hdr, img_wcs, img_data
del zscale, zlimits

4.1.7. Select a different image to save and display¶

For the first five images in results, get the image WCS from the header and use the footprint_contains method to check if the target is within the image.

In [56]:
for r, result in enumerate(results):
    if r < 5:
        dl_results = DatalinkResults.from_result_url(result['access_url'],
                                                     session=auth_session)
        image_url = dl_results['access_url'][0]
        hdulist = fits.open(image_url)
        img_hdr = hdulist[1].header
        img_wcs = WCS(img_hdr)
        print(r, img_wcs.footprint_contains(targetCoord))
        del dl_results, image_url, hdulist, img_hdr, img_wcs
0 False
1 True
2 True
3 False
4 True

Above, the second image (with index r = 1) does contain the target.

Get the image_url for the second image in the results list.

In [57]:
r = 1
result = results[r]
dl_results = DatalinkResults.from_result_url(result['access_url'],
                                             session=auth_session)
image_url = dl_results['access_url'][0]

4.1.8. Save image locally¶

Use the urlretrieve function to retrieve the image from its URL and save it as a file in the home directory. Print the filename where the image was saved.

In [58]:
filename = os.path.join(os.getenv("HOME"), 'dp02_02c_image_calexp.fits')
urlretrieve(image_url, filename)
print(filename)
/home/melissagraham/dp02_02c_image_calexp.fits

The image can be downloaded by using the left-hand file browser to navigate to the home directory, right-clicking on the file, and selecting "Download".

If the left-hand file browser is not visible, in the main menu click "View" and then "File Browser".

4.1.9. Display with afwDisplay¶

If the image (with its header) has been saved locally, it can be read in as an ExposureF object.

In [59]:
calexp = ExposureF(filename)

The ExposureF format is the same format that images come in from the data butler. This format will work with the LSST Science Pipelines' afwDisplay package.
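A few basic properties of the ExposureF can be inspected directly; this is an illustrative sketch using Science Pipelines methods, and the printed values are not shown here.

# Pixel bounding box, pixel scale, and pixel array shape of the calexp.
print(calexp.getBBox())
print(calexp.getWcs().getPixelScale().asArcseconds(), 'arcsec/pixel')
print(calexp.image.array.shape)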

Set afwDisplay to use matplotlib.

In [60]:
afwDisplay.setDefaultBackend('matplotlib')

Display the image using afwDisplay, and center an orange circle on the target coordinates.

In [61]:
fig, ax = plt.subplots()
display = afwDisplay.Display(frame=fig)
display.scale('linear', 'zscale')
display.mtv(calexp.image)

calexp_wcs = calexp.getWcs()
targetPixels = calexp_wcs.skyToPixel(targetPoint)
display.dot('o', targetPixels.getX(), targetPixels.getY(),
            size=100, ctype='orange')
del calexp_wcs, targetPixels

plt.xlabel('x pixels')
plt.ylabel('y pixels')
plt.show()
Image

Figure 4: The calexp displayed in grayscale, with different scaling compared to Figure 2. With afwDisplay, a color bar is shown by default, and the axes are in pixel coordinates. A large orange circle centered on the target coordinates appears as an arc in the lower-right corner, because the target coordinates lie very near the edge of this calexp.

Clean up before moving on to the next subsection.

In [62]:
del query, job, results
del result, dl_results, image_url
del filename, calexp

4.2. deepCoadd¶

Follow the same process as in Section 4.1, but instead of a calexp obtain a deepCoadd image.

4.2.1. Query, retrieve, and save¶

Create the query for i-band deepCoadd images at 62, -37.

In [63]:
query = "SELECT * FROM ivoa.ObsCore "\
        "WHERE CONTAINS(POINT('ICRS', " + \
        target_str_ra + ", " + target_str_dec + "), s_region) = 1 "\
        "AND lsst_band = 'i' "\
        "AND calib_level = 3 "\
        "AND dataproduct_subtype = 'lsst.deepCoadd_calexp'"
print(query)
SELECT * FROM ivoa.ObsCore WHERE CONTAINS(POINT('ICRS', 62.0, -37.0), s_region) = 1 AND lsst_band = 'i' AND calib_level = 3 AND dataproduct_subtype = 'lsst.deepCoadd_calexp'

Submit the job to the TAP service and fetch the results.

In [64]:
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
Job phase is COMPLETED
In [65]:
results = job.fetch_result()
print(len(results))
2

Use the first image in the results table. Print its tract and patch.

In [66]:
results_0 = results[0]
print(results_0['lsst_tract'])
print(results_0['lsst_patch'])
3831
3

Save the image as a fits file in the home directory.

In [67]:
dl_results = DatalinkResults.from_result_url(results_0['access_url'],
                                             session=auth_session)
image_url = dl_results['access_url'][0]
filename = os.path.join(os.getenv("HOME"), 'dp02_02c_image_deepCoadd.fits')
urlretrieve(image_url, filename)
Out[67]:
('/home/melissagraham/dp02_02c_image_deepCoadd.fits',
 <http.client.HTTPMessage at 0x7dfb14d97190>)

Load the deepCoadd image as an ExposureF.

In [68]:
deepCoadd = ExposureF(filename)

4.2.2. Display with Firefly¶

In Section 4.1, matplotlib was used to display the image with imshow() and via afwDisplay.

As another example of image display, reset the backend of afwDisplay to "firefly", the RSP's interactive image visualization tool.

Executing the following cell will open a new tab for the Firefly interface.

When Firefly opens: click on its tab above, and drag it down towards the mid-right area of this interface for a side-by-side view.

In [69]:
afwDisplay.setDefaultBackend('firefly')
afw_display = afwDisplay.Display(frame=1)

Display the deepCoadd in Firefly, and set the mask plane to fully transparent so that only pixel data is visualized.

In [70]:
afw_display.mtv(deepCoadd)
afw_display.setMaskTransparency(100)

Mark the query coordinates on the image as an orange circle. It will appear in the upper right corner of the image in Firefly.

In [71]:
deepCoadd_wcs = deepCoadd.getWcs()
targetPixel = deepCoadd_wcs.skyToPixel(targetPoint)
afw_display.dot('o', targetPixel.getX(), targetPixel.getY(),
                size=40, ctype='orange')

To finish, clean up.

In [72]:
del deepCoadd_wcs, targetPixel
del query, job, results
del results_0, dl_results, image_url
del deepCoadd, filename

5. Exercises for the learner¶

(1) Follow the process in Section 4.1 to retrieve and visualize a difference image.

(2) Use the ccdvisitid to obtain all detected sources in a calexp. Use the TAP service to retrieve them and plot them onto the calexp. (Alternatively, use the tract and patch to obtain all detected objects in a deepCoadd.)

Hint for (2):

SELECT ccdVisitId,coord_dec,coord_ra
FROM dp02_dc2_catalogs.Source
WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec),CIRCLE('ICRS', 62, -37, 0.25))=1
AND (ccdVisitId = 214437078)