01. Introduction to DP03#

DP03_01_Introduction_to_DP03

Introduction to the DP0.3
Contact author(s): Bob Abel, Douglas Tucker, and Melissa Graham
Last verified to run: 2024-01-31
LSST Science Piplines version: Weekly 2024_04
Container size: medium
Targeted learning level: beginner

Description: An overview of the contents of the DP0.3 moving object catalogs.

Skills: Use the TAP service and ADQL to access the DP0.3 tables.

LSST Data Products: TAP dp03_catalogs.

Packages: lsst.rsp.get_tap_service

Credit: The DP0.3 data set was generated by members of the Rubin Solar System Pipelines and Commissioning teams, with help from the LSST Solar System Science Collaboration, in particular: Pedro Bernardinelli, Jake Kurlander, Joachim Moeyens, Samuel Cornwall, Ari Heinze, Steph Merritt, Lynne Jones, Siegfried Eggl, Meg Schwamb, and Mario Jurić.

Mario Jurić provided essential assistance in the initial stages of creating this notebook. The notebook authors would also like to acknowledge help from Leanne Guy, Pedro Bernardinelli, Sarah Greenstreet, Megan Schwamb, Brian Rogers, Niall McElroy, and Jake Vanderplass, among others.

Get Support: Find DP0.3-related documentation and resources at dp0-3.lsst.io. Questions are welcome as new topics in the Support - Data Preview 0 Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.

1. Introduction¶

This notebook demonstrates how to access the simulated Data Preview 0.3 (DP0.3) data set in the Rubin Science Platform.

For the DP0.3 simulation, only moving objects were simulated, and only catalogs were created (there are no images). The DP0.3 simulation is entirely independent of and separate from the DP0.2 simulation.

DP0.3 is a hybrid catalog that contains both real and simulated Solar System objects (asteroids, near-earth objects, Trojans, trans-Neptunian objects, comets, and even a simulated spaceship... but no major planets nor the Moon). See the DP0.3 documentation for more information about how the hybrid catalog was created.

In Rubin Operations, these tables would be constantly changing, updated every day with the results of the previous night's observations. However, for DP0.3, both a static 1-year catalog and a static 10-year catalog have been simulated.

Notice: To re-iterate, for DP0.3, there are tables available for both the 1-year and 10-year catalogs. For the remainder of this notebook, though, unless otherwise noted, we will just consider the tables for the 10-year catalog. Users are encouraged to explore the tables for the 1-year catalog on their own.

1.1. Package Imports¶

Import general python packages and the Rubin TAP service utilities.

In [1]:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colormaps
import pandas as pd
from lsst.rsp import get_tap_service

2. Create the Rubin SSO TAP Service client¶

The DP0.3 data sets are available via the Table Access Protocol (TAP) service, and can be accessed with ADQL (Astronomical Data Query Language) statements.

TAP provides standardized access to catalog data for discovery, search, and retrieval. Full documentation for TAP is provided by the International Virtual Observatory Alliance (IVOA). ADQL is similar to SQL (Structured Query Langage). The documentation for ADQL includes more information about syntax and keywords.

Get an instance of the SSO TAP service, and assert that it exists.

Notice: The DP0.3 TAP service is called ssotap (whereas tap is used for DP0.2).

In [2]:

service = get_tap_service("ssotap")
assert service is not None

2.1. What `ssotap` schemas are available?¶

Create an ADQL query to select all (*) available schemas, and use the TAP service to execute the query.

Use to_table to convert to an astropy table, which will display the results in a user-friendly way.

In [3]:

query = "SELECT * FROM tap_schema.schemas"
results = service.search(query).to_table()
results

Out[3]:

Table length=4

description	schema_index	schema_name	utype
str512	int32	str64	str512
Data Preview 0.3 (ten-year version): Contains the catalog products of a Solar System Science Collaboration simulation of the results of SSO analysis of the wide-fast-deep data from the full LSST ten-year dataset.	0	dp03_catalogs_10yr
Data Preview 0.3 (one-year version): Contains the catalog products of a Solar System Science Collaboration simulation of the results of SSO analysis of the wide-fast-deep data, based on analyzing only the first year of LSST observations.	1	dp03_catalogs_1yr
A TAP-standard-mandated schema to describe tablesets in a TAP 1.1 service	100000	tap_schema
UWS Metadata	120000	uws

2.2 What DP0.3 tables are available?¶

There are four tables for the 1-year and 10-year simulation: MPCORB, SSObject, SSSource, and DiaSource.

Find descriptions of the tables and their schema, plus information and advice about accessing and querying the DP0.3 tables (including which columns are currently unpopulated), in the DP0.3 data products definitions documentation .

This tutorial will explore the tables individually, with table joins to be demonstrated in future tutorials (see also this advice on DP0.3 table joins).

Select all available tables and display information about them.

In [4]:

query = "SELECT * FROM tap_schema.tables " \
        "WHERE tap_schema.tables.schema_name LIKE 'dp03_catalogs%' " \
        "ORDER BY table_index ASC"
results = service.search(query).to_table()
results

Out[4]:

Table length=8

description	schema_name	table_index	table_name	table_type	utype
str512	str512	int32	str64	str8	str512
LSST-computed per-object quantities. 1:1 relationship with MPCORB. Recomputed daily, upon MPCORB ingestion.	dp03_catalogs_1yr	1	dp03_catalogs_1yr.SSObject	table
LSST-computed per-object quantities. 1:1 relationship with MPCORB. Recomputed daily, upon MPCORB ingestion.	dp03_catalogs_10yr	1	dp03_catalogs_10yr.SSObject	table
The orbit catalog produced by the Minor Planet Center. Ingested daily. O(10M) rows by survey end. The columns are described at https://minorplanetcenter.net//iau/info/MPOrbitFormat.html	dp03_catalogs_1yr	2	dp03_catalogs_1yr.MPCORB	table
The orbit catalog produced by the Minor Planet Center. Ingested daily. O(10M) rows by survey end. The columns are described at https://minorplanetcenter.net//iau/info/MPOrbitFormat.html	dp03_catalogs_10yr	2	dp03_catalogs_10yr.MPCORB	table
Table to store 'difference image sources'; - sources detected at SNR >=5 on difference images.	dp03_catalogs_1yr	3	dp03_catalogs_1yr.DiaSource	table
Table to store 'difference image sources' - sources detected at SNR >=5 on difference images.	dp03_catalogs_10yr	3	dp03_catalogs_10yr.DiaSource	table
LSST-computed per-source quantities. 1:1 relationship with DIASource. Recomputed daily, upon MPCORB ingestion.	dp03_catalogs_1yr	4	dp03_catalogs_1yr.SSSource	table
LSST-computed per-source quantities. 1:1 relationship with DiaSource. Recomputed daily, upon MPCORB ingestion.	dp03_catalogs_10yr	4	dp03_catalogs_10yr.SSSource	table

3. The `MPCORB` table¶

During Rubin Operations, Solar System Processing will occur in the daytime, after a night of observing. It will link together the difference-image detections of moving objects and report discoveries to the Minor Planet Center (MPC), as well as compute derived properties (magnitudes, phase-curve fits, coordinates in various systems).

The MPC will calculate the orbital parameters and these results will be passed back to Rubin, and stored and made available to users as the MPCORB table (the other derived properties are stored in the other three tables explored below). The DP0.3 MPCORB table is a simulation of what this data product will be like after 10 years of LSST.

Notice: The MPC contains all reported moving objects in the Solar System, and is not limited to those detected by LSST. Thus, the MPCORB table will have more rows than the SSObject table.

Notice: For DP0.3, there was no fitting done by the MPC and the MPCORB table is the orbital elements used in the simulation (the MPCORB table is a truth table).

For more information about Rubin's plans for Solar System Processing, see Section 3.2.2 of the Data Products Definitions Document. Note that there remain differences between Table 4 of the DPDD, which contain the anticipated schema for the moving object tables, and the DP0.3 table schemas.

3.1. Size¶

Use the TAP service to count of the number of rows in the MPCORB table.

In [5]:

results = service.search("SELECT COUNT(*) FROM dp03_catalogs_10yr.MPCORB")
results.to_table()

Out[5]:

Table length=1

COUNT1
int64
14462388

There are 14.5 million rows in the MPCORB table.

3.2. Columns¶

Use the TAP service to query for the column information from MPCORB.

Print the results as a pandas table.

In [6]:

results = service.search("SELECT column_name, datatype, description, "
                         "       unit from TAP_SCHEMA.columns "
                         "WHERE table_name = 'dp03_catalogs_10yr.MPCORB'")
results.to_table().to_pandas()

Out[6]:

	column_name	datatype	description	unit
0	arc	float	MPCORB: Arc (days), for single-opposition objects	d
1	arcEnd	char	MPCORB: Year of last observation (for multi-op...
2	arcStart	char	MPCORB: Year of first observation (for multi-o...
3	computer	char	MPCORB: Computer name
4	e	double	MPCORB: Orbital eccentricity
5	epoch	double	MPCORB: Epoch (in MJD, .0 TT)	d
6	flags	int	MPCORB: 4-hexdigit flags. See https://minorpla...
7	fullDesignation	char	MPCORB: Readable designation
8	incl	double	MPCORB: Inclination to the ecliptic, J2000.0 (...	deg
9	lastIncludedObservation	float	MPCORB: Date of last observation included in o...	d
10	mpcDesignation	char	MPCORB: Number or provisional designation (in ...
11	mpcG	float	MPCORB: Slope parameter, G
12	mpcH	float	MPCORB: Absolute magnitude, H	mag
13	mpcNumber	int	MPC number (if the asteroid has been numbered;...
14	n	double	MPCORB: Mean daily motion (degrees per day)	deg/d
15	nobs	int	MPCORB: Number of observations
16	node	double	MPCORB: Longitude of the ascending node, J2000...	deg
17	nopp	int	MPCORB: Number of oppositions
18	peri	double	MPCORB: Argument of perihelion, J2000.0 (degrees)	deg
19	pertsLong	char	MPCORB: Precise indicator of perturbers (blank...
20	pertsShort	char	MPCORB: Coarse indicator of perturbers (blank ...
21	q	double	MPCORB: Perihelion distance (AU)	AU
22	reference	char	MPCORB: Reference
23	rms	float	MPCORB: r.m.s residual (")	arcsec
24	ssObjectId	long	LSST unique identifier (if observed by LSST)
25	tperi	double	MPCORB: MJD of pericentric passage	d
26	uncertaintyParameter	char	MPCORB: Uncertainty parameter, U

In some cases, the column descriptions cut off in the table above.

Execute the following to see, for example, the full description for the mpcDesignation column.

In [7]:

results['description'][10]

Out[7]:

'MPCORB: Number or provisional designation (in packed form)'

3.3. Retrieve a random subset¶

To retrieve a random subset of rows, make use of the fact that ssObjectId is a randomly assigned 64-bit long unsigned integer. Since ADQL interprets a 64-bit long unsigned integer as a 63-bit signed integer, these range from a very large negative integer value to a very large positive integer value. This will be fixed in the future so that all identifiers are positive numbers.

Notice: By using ssObjectId, the following query returns a random subset of MPCORB rows that are associated with a row in the SSObject table. In other words, this limits the query to only retrieve moving objects from the MPCORB table that have been detected by LSST.

First, figure out the full range of ssObjectId values by using the ADQL MIN and MAX functions.

In [8]:

results = service.search("SELECT min(ssObjectId), max(ssObjectId)"
                         "FROM dp03_catalogs_10yr.MPCORB")
results.to_table()

Out[8]:

Table length=1

min1	max2
int64	int64
-9223370383071521539	9223370875126069107

Define a search range for ssObjectId that would return no more than 1% of all objects in MPCORB. We will do this by estimating a new minimum ssObjectId that is 1% below the maximum ssObjectId for the full range of ssObjectId values.

Notice: Since the range of ssObjectId's (-9223370383071521539 --> +9223370875126069107) is much larger than the number of rows in the MPCORB table (14600302), we don't expect to get exactly 1% of the rows from MPCORB via this method, but we should get approximately 1%, as long as the ssObjectId values are distributed reasonably uniformly over their large range.

In [9]:

min_val = int(results[0].get('min1'))
max_val = int(results[0].get('max2'))
print('Full range: ', min_val, max_val)

min_val = int(max_val - 0.01*(max_val-min_val))
print('Search range: ', min_val, max_val)

Full range:  -9223370383071521539 9223370875126069107
Search range:  9038903462544093184 9223370875126069107

Execute the search, and retrieve all (*) columns from the MPCORB table.

Store the results in df as a pandas dataframe.

In [10]:

results = service.search("SELECT * FROM dp03_catalogs_10yr.MPCORB "
                         "WHERE ssObjectId BETWEEN 9038903462544093184 "
                         "AND 9223370875126069107")
df = results.to_table().to_pandas()

Display the resulting dataframe. Note that it is automatically truncated for both columns and rows (look for the "..." halfway down, and halfway across).

In [11]:

df

Out[11]:

	arc	arcEnd	arcStart	computer	e	epoch	flags	fullDesignation	incl	lastIncludedObservation	...	nopp	peri	pertsLong	pertsShort	q	reference	rms	ssObjectId	tperi	uncertaintyParameter
0	NaN				0.143340	60065.000000	0	2011 1062 T-3	6.230810	NaN	...	0	62.518599			1.975493		NaN	9154661686854389525	59969.017393
1	NaN				0.112557	60065.000000	0	2011 1093 T-2	6.322560	NaN	...	0	72.646946			2.061410		NaN	9059993893438746957	60310.434546
2	NaN				0.049899	60065.000000	0	2011 1184 T-2	4.306140	NaN	...	0	279.168196			2.426167		NaN	9170955388068648464	60191.630055
3	NaN				0.105047	60065.000000	0	2011 1186 T-3	16.188500	NaN	...	0	357.091672			2.757674		NaN	9077066921741880727	59093.291192
4	NaN				0.176216	60065.000000	0	2011 1464 T-2	4.495110	NaN	...	0	81.310304			2.121439		NaN	9053987333942294876	59808.773603
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
144467	NaN				0.919897	60255.126890	0	2011 LPCD750	106.184487	NaN	...	0	209.420137			6.642560		NaN	9133287149408159006	60251.935331
144468	NaN				0.527282	61444.848873	0	2011 LPCD823	11.480147	NaN	...	0	92.507010			16.379375		NaN	9140884185118476131	61444.842271
144469	NaN				0.536121	62656.471653	0	2011 LPCD889	117.420822	NaN	...	0	12.259819			15.988392		NaN	9189745973987900787	62656.437497
144470	NaN				0.602921	63093.436806	0	2011 LPCD940	44.898648	NaN	...	0	81.527019			15.572242		NaN	9181407122537090723	63092.423677
144471	NaN				0.655530	62223.336444	0	2011 LPCD941	177.067536	NaN	...	0	276.729975			12.396072		NaN	9211044985939642883	62223.322988

144472 rows × 27 columns

As we see, 144,472 rows were returned, which -- as expected -- is almost (but not exactly) 1% of the 14,462,388 rows in the MPCORB table.

Notice: There are several columns that currently contain NaN (not a number) values. For the simulated DP0.3 data these columns might be replaced in the near future, and for real data releases there will not be all-NaN columns. If desired, users can drop all-NaN columns with, e.g., df.dropna(axis=1, how='all', inplace=True). However the better practice is to understand the columns and retrieve only what you are going to use.

Optional: use the pandas dataframe info function to learn more about the values in the retrieved columns. Uncomment the following code cell (remove the # symbol) and execute the cell to see the info on df.

In [12]:

# df.info()

Optional: use the pandas fuction describe to display statistics for the numerical columns in the dataframe (count, mean, standard deviation, etc.).

In [13]:

# df.describe()

3.4. Plot histograms of selected columns¶

Wikipedia provides a decent beginner-level guide to orbital elements.

For a quick reference, distributions are shown below for five key orbital elements and the absolute $H$ magnitude (see Section 4.4 for a description of $H$).

In [14]:

fig, ax = plt.subplots(2, 3, figsize=(10, 6), sharey=False)
ax[0, 0].hist(df['e'], bins=100, log=True)
ax[0, 0].set_xlabel('Eccentricity')
ax[0, 0].set_ylabel('log(Number)')
ax[0, 1].hist(df['incl'], bins=100, log=True)
ax[0, 1].set_xlabel('Inclination (deg)')
ax[0, 1].set_ylabel('log(Number)')
ax[0, 2].hist(df['mpcH'], bins=100, log=True)
ax[0, 2].set_xlabel('Absolute Magnitude, H (mag)')
ax[0, 2].set_ylabel('log(Number)')
ax[1, 0].hist(df['node'], bins=50)
ax[1, 0].set_xlabel('Longitude of Ascending Node (deg)')
ax[1, 0].set_ylabel('Number')
ax[1, 0].set_ylim(0,3500)
ax[1, 1].hist(df['peri'], bins=50)
ax[1, 1].set_xlabel('Argument of Perihelion (deg)')
ax[1, 1].set_ylabel('Number')
ax[1, 1].set_ylim(0,3500)
ax[1, 2].hist(df['q'], bins=100, log=True)
ax[1, 2].set_xlabel('Perihelion Distance (au)')
ax[1, 2].set_ylabel('log(Number)')
fig.suptitle('Histograms for Key Orbital Elements')
fig.tight_layout()
plt.show()

4. The `SSObject` table¶

During Rubin Operations, Prompt Processing will occur during the night, detecting sources in difference images (DiaSources, see Section 6) and associating them into static-sky transients and variables (DiaObjects, not included in DP0.3).

The Solar System Processing which occurs in the daytime, after a night of observing, links together the DiaSources for moving objects into SSObjects. Whereas the MPCORB table contains the orbital elements for these moving objects, the SSObjects contains the Rubin-measured properties such as phase curve fits and absolute magnitudes.

Notice: no artifacts or spurious difference-image sources have been injected into the DP0.3 catalogs.

Notice: although there are columns for them, no u- or Y-band data were simulated for DP0.3. Unless noted, we will ignore u- or Y-band data for the remainder of this notebook.

4.1. Size¶

Use the ADQL count function to return the size.

In [15]:

results = service.search("SELECT COUNT(*) from dp03_catalogs_10yr.SSObject")
results.to_table().to_pandas()

Out[15]:

	COUNT1
0	4443479

The DP0.3 data set contains 4.4 million solar system objects detected by Rubin.

This is less than the 14.5 million objects in the MPCORB catalog. It is left as an exercise for the learner in Section 7 to determine the characteristics of those objects from the MPCORB table are missing from the SSObject table.

4.2. Columns¶

Option: display a list of column names, data types, descriptions, and units.

In [16]:

# results = service.search("SELECT column_name, datatype, description, "
#                          "unit from TAP_SCHEMA.columns "
#                          "WHERE table_name = 'dp03_catalogs_10yr.SSObject'")
# results.to_table().to_pandas()

4.3. Retrieve a random subset¶

Use essentially the same query as was used for the MPCORB table, above.

In [17]:

results = service.search("SELECT * FROM dp03_catalogs_10yr.ssObject "
                         "WHERE ssObjectId BETWEEN 9038903462544093184 "
                         "AND 9223370875126069107")
df = results.to_table().to_pandas()

In [18]:

df

Out[18]:

	arc	discoverySubmissionDate	firstObservationDate	flags	g_Chi2	g_G12	g_G12Err	g_H	g_H_gG12_Cov	g_HErr	...	y_H_yG12_Cov	y_HErr	y_Ndata	z_Chi2	z_G12	z_G12Err	z_H	z_H_zG12_Cov	z_HErr	z_Ndata
0	2225.923096	61634.00000	61640.42378	102	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	0	NaN	NaN	NaN	NaN	NaN	NaN	0
1	3148.112305	60308.00000	60311.20320	66	1.821222	0.538449	0.029092	16.201378	0.000087	0.003903	...	NaN	NaN	0	1.004799	0.570081	2.714864e-02	15.410084	6.308406e-05	4.074336e-03	68
2	3314.104736	60544.00000	60546.12723	66	1.481194	0.447420	0.102737	19.085361	0.001247	0.015581	...	NaN	NaN	0	0.496874	0.493693	1.690698e-01	18.293407	4.123830e-03	3.045361e-02	38
3	3436.664307	60362.00000	60369.30842	66	2.093236	0.617962	0.031888	17.350933	0.000167	0.006134	...	NaN	NaN	0	1.211593	0.586116	4.590447e-02	16.562313	2.647262e-04	7.582015e-03	79
4	3325.815674	60476.00000	60476.15480	66	2.019966	0.590387	0.057489	18.055870	0.000590	0.010950	...	NaN	NaN	0	0.913779	0.598814	6.471661e-02	17.270544	5.449418e-04	1.152790e-02	49
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
44606	2897.801514	60574.00000	60580.30096	98	0.261178	0.202494	0.468495	21.202513	0.042931	0.101005	...	NaN	NaN	0	NaN	NaN	NaN	NaN	NaN	NaN	0
44607	500.044525	62742.18036	62735.18036	2150	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	0	NaN	NaN	NaN	NaN	NaN	NaN	0
44608	3341.052979	60219.00000	60226.23987	102	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	0	NaN	NaN	NaN	NaN	NaN	NaN	0
44609	2105.805664	61488.00000	61493.23812	98	1.452305	0.541928	0.451856	20.889555	0.032904	0.080253	...	NaN	NaN	0	NaN	NaN	NaN	NaN	NaN	NaN	0
44610	1859.874756	61727.00000	61731.34561	70	NaN	NaN	NaN	NaN	NaN	NaN	...	NaN	NaN	0	0.859407	-767251.750000	9.258375e+11	36.474541	-1.229126e+18	1.327583e+06	6

44611 rows × 55 columns

There are 44611 rows.

Options: use the info or describe functions on the pandas dataframe to learn more about the retrieved results.

In [19]:

# df.info()

In [20]:

# df.describe()

4.4. Plot a color-color diagram¶

In the displayed dataframe above, it appears that for many SSObjects the phase-curve fits to derive absolute magnitudes were not successful, as the <band>_H (where <band> is a band u, g, r, i, z, or y) are NaN. (Recall in particular that no u- or y-band data were simulated for DP0.3.)

Before calculating and plotting the colors, drop all of the rows for which the phase-curve fits were not successful for g, r, i, and/or z bands.

In [21]:

df.dropna(subset=['g_H', 'r_H', 'i_H', 'z_H'], inplace=True)
df.reset_index(inplace=True)
print('Number of rows after dropping rows: %d' % len(df))

Number of rows after dropping rows: 30853

For Solar System objects, absolute magnitudes are defined to be for an object 1 au from the Sun and 1 au from the observer, and at a phase angle (the angle Sun-object-Earth) of 0 degrees.

Absolute magnitudes are derived by correcting for distance, fitting a function to the relationship between absolute magnitude and phase, and evaluating the function at a phase of 0 deg.

The process for fitting phase curves will be covered in another tutorial.

Calculate colors in the Rubin filters for the SSObjects that have absolute magnitudes.

In [22]:

df['gr'] = df['g_H'] - df['r_H']
df['ri'] = df['r_H'] - df['i_H']
df['iz'] = df['i_H'] - df['z_H']

Plot color-color diagrams as 2-dimensional histograms (heatmaps).

In [23]:

fig, ax = plt.subplots(1, 2, figsize=(8, 3))

h = ax[0].hist2d(df['gr'], df['ri'],
                 bins=(np.arange(-0.5, 1.5, 0.01),
                       np.arange(-0.5, 0.75, 0.01)),
                 norm='log')
ax[0].set_xlabel('g - r')
ax[0].set_ylabel('r - i')
ax[0].grid()
plt.colorbar(h[3])

h = ax[1].hist2d(df['ri'], df['iz'],
                 bins=(np.arange(-0.5, 0.75, 0.01),
                       np.arange(-0.75, 0.5, 0.01)),
                 norm='log')
ax[1].set_xlabel('r - i')
ax[1].set_ylabel('i - z')
ax[1].grid()
plt.colorbar(h[3])

fig.suptitle('Color-Color Plots for the SSObject Catalog')
fig.tight_layout()
plt.show()

There are two colors used in the simulation - but this is not the case for real Solar System objects. These plots will look very different in the future, when they are made with real Rubin data.

5. The `SSSource` table¶

As described above, Solar System Processing links together the DiaSources (detections in the individual difference images) from moving objects into SSObjects.

The SSSource table contains the 2-d (sky) coordinates and 3-d distances and velocities for every SSObject at the time of every LSST observation of that SSObject. The SSSource and DiaSource tables are 1:1, as they each contain data per observation, whereas SSObject contains data per object.

5.1. Size¶

It can take up to a minute to retrieve the size of the SSSource catalog.

In [24]:

results = service.search("SELECT COUNT(*) from dp03_catalogs_10yr.SSSource")
results.to_table().to_pandas()

Out[24]:

	COUNT1
0	653005444

This table contains over 653 million sources!

5.2. Columns¶

Option: print the column information for the SSSource table.

In [25]:

# results = service.search("SELECT column_name, datatype, description, "
#                          "unit from TAP_SCHEMA.columns "
#                          "WHERE table_name = 'dp03_catalogs_10yr.SSSource'")
# results.to_table().to_pandas()

5.3. Retrieve data for one `SSObject`¶

It is possible to obtain the SSSource data for a set of SSObjects. For example, to retrieve all SSSources for the SSObjects retrieved in Section 4, use a query such as:

SELECT * FROM dp03_catalogs_10yr.SSSource
WHERE ssObjectId BETWEEN 9038903462544093184 AND 9223370875126069107

However, the better way to demonstrate the data in the SSSource table is to look at just one SSObject, and the one with an ssObjectId = 6793512588511170680 is a fun choice.

Retrieve the heliocentric (sun-centered) and topocentric (observatory-centered) X and Y coordinates.

In [26]:

results = service.search("SELECT heliocentricX, heliocentricY, "
                         "topocentricX, topocentricY, ssObjectId "
                         "FROM dp03_catalogs_10yr.SSSource "
                         "WHERE ssObjectId = 6793512588511170680")
df_xy = results.to_table().to_pandas()
print('Retrieved ', len(df_xy), ' rows.')

Retrieved  319  rows.

Options to display the table in full or use the info or describe functions.

In [27]:

# df_xy

In [28]:

# df_xy.info()

In [29]:

# df_xy.describe()

5.3. Plot the locations of one `SSObject`¶

Plot the locations of the selected SSObject at the time of every LSST observation using the X and Y heliocentric (Sun-centered; orange star) and topocentric (observatory-centered; blue circle) coordinates. This can be considered a projection of the orbit into the plane of the Solar System.

Notice how the points are not regularly spaced. This is because there is one point per LSST observation of the object, and in some years it receives more or fewer observations.

Notice how the points appear in an ellipse around the Sun with heliocentric coordinates (left). This is because the selected object is in the main asteroid belt and close enough to the Sun to complete at least on orbit during the 10-year LSST survey. Had the selected object been in the outer Solar System, or were this tutorial using the 1-year data set, the plot below would show an arc instead of an ellipse.

The plot of the topocentric coordinates (right) does not appear elliptical because the motion of the Earth with respect to the object over the 10 years of the LSST is imprinted into the data. For the topocentric coordinates, the Earth's rotation also contributes, but it is a much smaller effect on the scale of these plots.

In [30]:

fig, ax = plt.subplots(1, 2, figsize=(8, 4))
ax[0].grid()
ax[0].plot(df_xy['heliocentricX'], df_xy['heliocentricY'],
           'o', ms=4, mew=0, color='black')
ax[0].plot(0, 0, '*', ms=15, color='darkorange')
ax[0].set_xlabel('heliocentric X (au)')
ax[0].set_ylabel('heliocentric Y (au)')
ax[0].set_title('Heliocentric Coordinates')
ax[1].grid()
ax[1].plot(df_xy['topocentricX'], df_xy['topocentricY'],
           'o', ms=4, mew=0, color='black')
ax[1].plot(0, 0, 'o', ms=15, color='dodgerblue')
ax[1].set_xlabel('topocentric X (au)')
ax[1].set_ylabel('topocentric Y (au)')
ax[1].set_title('Topocentric Coordinates')
fig.suptitle('XY Path of 6793512588511170680')
fig.tight_layout()
plt.show()

6. The `DiaSource` catalog¶

Last but definitely not least, the DiaSource table - which is actually the first to be generated by the Prompt Processing pipeline. This table will contain measurements for all sources detected with a signal-to-noise ratio of at least 5 in a difference image, including moving and static-sky sources.

However, for simulated DP0.3 the DiaSource table contains only moving objects, and no static sky time-domain objects (and no detector artifacts).

For DP0.3 the simulated DiaSource table contains only a subset of the columns that the real DiaSource table will have; see Table 1 of the Rubin Data Products Definitions Document. Furthermore, the DiaSource table contains a few extra truth columns, such as raTrue, decTrue, magTrue.

For DP0.3, no photometric variability due to, e.g., the rotation of non-spherical bodies, collisions, or outgassing events were simulated. All evolution in apparent magnitudes with time are due to the phase curve, which is explored in another tutorial.

6.1. Size¶

Option to retrieve the size of the DiaSource table (it is 1:1 with the SSSource table, so the same size).

In [31]:

# results = service.search("SELECT COUNT(*) from dp03_catalogs_10yr.DiaSource")
# results.to_table().to_pandas()

6.2. Columns¶

Option to print the column information.

In [32]:

# results = service.search("SELECT column_name, datatype, description, "
#                          "unit from TAP_SCHEMA.columns "
#                          "WHERE table_name = 'dp03_catalogs_10yr.DiaSource'")
# results.to_table().to_pandas()

6.3. Retrieve data for one `SSObject`¶

Similar to what we did for SSSource data in Section 5, it is possible to obtain the DiaSource data for a set of SSObjects. For example, to retrieve all DiaSources for the SSObjects retrieved in Section 4, use a query such as:

SELECT * FROM dp03_catalogs_10yr.DiaSource
WHERE ssObjectId BETWEEN 9038903462544093184 AND 9223370875126069107

However, as we saw for the SSSource data in Section 5, the better way to demonstrate the data in the DiaSource table is to look at just one SSObject. Here, as in Section 5, we will use the one with an ssObjectId = 6793512588511170680.

Retrieve the right ascension, declination, and the TAI at the exposure's midpoint (MJD) of the observation.

In [33]:

results = service.search("SELECT ra, dec, midpointMjdTai "
                         "FROM dp03_catalogs_10yr.DiaSource "
                         "WHERE ssObjectId = 6793512588511170680")
df = results.to_table().to_pandas()

df

Out[33]:

	ra	dec	midpointMjdTai
0	348.416159	-2.408505	62096.01616
1	351.997357	1.212909	62043.12454
2	53.506941	19.244016	63430.44136
3	182.188871	-1.424286	62709.97076
4	137.800325	7.798673	62560.10471
...	...	...	...
314	312.379754	-12.443726	63148.08742
315	313.660126	-12.585707	63160.04304
316	147.164625	4.901712	62510.30546
317	220.150693	-12.830205	61555.00272
318	17.293268	9.615766	60830.43848

319 rows × 3 columns

6.4. Plot time-domain data for one `SSObject`¶

Here, we plot the equatorial coordinates for all observations, colored by the MJD of the observation.

This plot shows the location of the object on the sky, as seen from Earth. Recall from the plots in Section 5 that this object is only about 2 au from the Sun, and so is at the inner edge of the Main Asteroid Belt. Parallax from the Earth's orbit contributes to the appearance of this plot.

In [34]:

cmap = colormaps['viridis']
fig, ax = plt.subplots(1, 1, figsize=(12, 3))
im = ax.scatter(df['ra'], df['dec'], c=df['midpointMjdTai'], cmap=cmap)
ax.set_xlabel('ra (deg)')
ax.set_ylabel('dec (deg)')
fig.suptitle("Path of one SSObject in equatorial coordinates over 10 years")
fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.85, 0.15, 0.02, 0.7])
fig.colorbar(im, cax=cbar_ax, label='TAI')
ax.grid(True)

Use of the magnitude and filter data retrieved from the DiaSource table is left as an exercise for the learner in Section 7.

7. Exercises for the learner¶

1. Calculate the semi-major axis for a subset of objects.

The orbital element of semi-major axis ($a$) is not pre-computed in the MPCORB table because it can be derived from the orbit’s ellipticiy ($e$) and perihelion distance ($q$) with $a = q/(1-e)$.

Copy the query from Section 3.3 and alter it to retrieve only $e$ and $q$ for a subset of objects. Then add a column to the results table for semi-major axis and plot eccentricity versus semi-major axis.

Hints:

Restrict the query to objects with eccentricities between 0 and 1.
df['a'] = df['q'] / (1.0 - df['e'])

2. How does an SSObject's magnitude change with time?

In Section 6.4, the magnitude and filter for all detections of the SSObject in the DiaSource table were not retrieved; alter the query to retrieve mag and band, and plot the magnitude as a function of time for the filter of your choice (recall: there will be no u- or y-band magnitudes). Note that the only simulated cause of photometric changes in DP0.3 objects is the distance from Earth and the phase curve. Correcting for distance, and the applications phase curves, are covered in another tutorial.