01. Introduction to DP03#
Introduction to the DP0.3
Contact author(s): Bob Abel, Douglas Tucker, and Melissa Graham
Last verified to run: 2024-01-31
LSST Science Piplines version: Weekly 2024_04
Container size: medium
Targeted learning level: beginner
Description: An overview of the contents of the DP0.3 moving object catalogs.
Skills: Use the TAP service and ADQL to access the DP0.3 tables.
LSST Data Products: TAP dp03_catalogs.
Packages: lsst.rsp.get_tap_service
Credit: The DP0.3 data set was generated by members of the Rubin Solar System Pipelines and Commissioning teams, with help from the LSST Solar System Science Collaboration, in particular: Pedro Bernardinelli, Jake Kurlander, Joachim Moeyens, Samuel Cornwall, Ari Heinze, Steph Merritt, Lynne Jones, Siegfried Eggl, Meg Schwamb, and Mario Jurić.
Mario Jurić provided essential assistance in the initial stages of creating this notebook. The notebook authors would also like to acknowledge help from Leanne Guy, Pedro Bernardinelli, Sarah Greenstreet, Megan Schwamb, Brian Rogers, Niall McElroy, and Jake Vanderplass, among others.
Get Support: Find DP0.3-related documentation and resources at dp0-3.lsst.io. Questions are welcome as new topics in the Support - Data Preview 0 Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
This notebook demonstrates how to access the simulated Data Preview 0.3 (DP0.3) data set in the Rubin Science Platform.
For the DP0.3 simulation, only moving objects were simulated, and only catalogs were created (there are no images). The DP0.3 simulation is entirely independent of and separate from the DP0.2 simulation.
DP0.3 is a hybrid catalog that contains both real and simulated Solar System objects (asteroids, near-earth objects, Trojans, trans-Neptunian objects, comets, and even a simulated spaceship... but no major planets nor the Moon). See the DP0.3 documentation for more information about how the hybrid catalog was created.
In Rubin Operations, these tables would be constantly changing, updated every day with the results of the previous night's observations. However, for DP0.3, both a static 1-year catalog and a static 10-year catalog have been simulated.
Notice: To re-iterate, for DP0.3, there are tables available for both the 1-year and 10-year catalogs. For the remainder of this notebook, though, unless otherwise noted, we will just consider the tables for the 10-year catalog. Users are encouraged to explore the tables for the 1-year catalog on their own.
1.1. Package Imports¶
Import general python packages and the Rubin TAP service utilities.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colormaps
import pandas as pd
from lsst.rsp import get_tap_service
2. Create the Rubin SSO TAP Service client¶
The DP0.3 data sets are available via the Table Access Protocol (TAP) service, and can be accessed with ADQL (Astronomical Data Query Language) statements.
TAP provides standardized access to catalog data for discovery, search, and retrieval. Full documentation for TAP is provided by the International Virtual Observatory Alliance (IVOA). ADQL is similar to SQL (Structured Query Langage). The documentation for ADQL includes more information about syntax and keywords.
Get an instance of the SSO TAP service, and assert that it exists.
Notice: The DP0.3 TAP service is called
ssotap
(whereastap
is used for DP0.2).
service = get_tap_service("ssotap")
assert service is not None
2.1. What ssotap
schemas are available?¶
Create an ADQL query to select all (*) available schemas, and use the TAP service to execute the query.
Use to_table
to convert to an astropy
table, which will display the results in a user-friendly way.
query = "SELECT * FROM tap_schema.schemas"
results = service.search(query).to_table()
results
description | schema_index | schema_name | utype |
---|---|---|---|
str512 | int32 | str64 | str512 |
Data Preview 0.3 (ten-year version): Contains the catalog products of a Solar System Science Collaboration simulation of the results of SSO analysis of the wide-fast-deep data from the full LSST ten-year dataset. | 0 | dp03_catalogs_10yr | |
Data Preview 0.3 (one-year version): Contains the catalog products of a Solar System Science Collaboration simulation of the results of SSO analysis of the wide-fast-deep data, based on analyzing only the first year of LSST observations. | 1 | dp03_catalogs_1yr | |
A TAP-standard-mandated schema to describe tablesets in a TAP 1.1 service | 100000 | tap_schema | |
UWS Metadata | 120000 | uws |
2.2 What DP0.3 tables are available?¶
There are four tables for the 1-year and 10-year simulation:
MPCORB
, SSObject
, SSSource
, and DiaSource
.
Find descriptions of the tables and their schema, plus information and advice about accessing and querying the DP0.3 tables (including which columns are currently unpopulated), in the DP0.3 data products definitions documentation .
This tutorial will explore the tables individually, with table joins to be demonstrated in future tutorials (see also this advice on DP0.3 table joins).
Select all available tables and display information about them.
query = "SELECT * FROM tap_schema.tables " \
"WHERE tap_schema.tables.schema_name LIKE 'dp03_catalogs%' " \
"ORDER BY table_index ASC"
results = service.search(query).to_table()
results
description | schema_name | table_index | table_name | table_type | utype |
---|---|---|---|---|---|
str512 | str512 | int32 | str64 | str8 | str512 |
LSST-computed per-object quantities. 1:1 relationship with MPCORB. Recomputed daily, upon MPCORB ingestion. | dp03_catalogs_1yr | 1 | dp03_catalogs_1yr.SSObject | table | |
LSST-computed per-object quantities. 1:1 relationship with MPCORB. Recomputed daily, upon MPCORB ingestion. | dp03_catalogs_10yr | 1 | dp03_catalogs_10yr.SSObject | table | |
The orbit catalog produced by the Minor Planet Center. Ingested daily. O(10M) rows by survey end. The columns are described at https://minorplanetcenter.net//iau/info/MPOrbitFormat.html | dp03_catalogs_1yr | 2 | dp03_catalogs_1yr.MPCORB | table | |
The orbit catalog produced by the Minor Planet Center. Ingested daily. O(10M) rows by survey end. The columns are described at https://minorplanetcenter.net//iau/info/MPOrbitFormat.html | dp03_catalogs_10yr | 2 | dp03_catalogs_10yr.MPCORB | table | |
Table to store 'difference image sources'; - sources detected at SNR >=5 on difference images. | dp03_catalogs_1yr | 3 | dp03_catalogs_1yr.DiaSource | table | |
Table to store 'difference image sources' - sources detected at SNR >=5 on difference images. | dp03_catalogs_10yr | 3 | dp03_catalogs_10yr.DiaSource | table | |
LSST-computed per-source quantities. 1:1 relationship with DIASource. Recomputed daily, upon MPCORB ingestion. | dp03_catalogs_1yr | 4 | dp03_catalogs_1yr.SSSource | table | |
LSST-computed per-source quantities. 1:1 relationship with DiaSource. Recomputed daily, upon MPCORB ingestion. | dp03_catalogs_10yr | 4 | dp03_catalogs_10yr.SSSource | table |
3. The MPCORB
table¶
During Rubin Operations, Solar System Processing will occur in the daytime, after a night of observing. It will link together the difference-image detections of moving objects and report discoveries to the Minor Planet Center (MPC), as well as compute derived properties (magnitudes, phase-curve fits, coordinates in various systems).
The MPC will calculate the orbital parameters and these results will be passed back to Rubin, and stored
and made available to users as the MPCORB
table
(the other derived properties are stored in the other three tables explored below).
The DP0.3 MPCORB
table is a simulation of what this data product will be like after 10 years of LSST.
Notice: The MPC contains all reported moving objects in the Solar System, and is not limited to those detected by LSST. Thus, the
MPCORB
table will have more rows than theSSObject
table.
Notice: For DP0.3, there was no fitting done by the MPC and the MPCORB table is the orbital elements used in the simulation (the
MPCORB
table is a truth table).
For more information about Rubin's plans for Solar System Processing, see Section 3.2.2 of the Data Products Definitions Document. Note that there remain differences between Table 4 of the DPDD, which contain the anticipated schema for the moving object tables, and the DP0.3 table schemas.
3.1. Size¶
Use the TAP service to count of the number of rows in the MPCORB
table.
results = service.search("SELECT COUNT(*) FROM dp03_catalogs_10yr.MPCORB")
results.to_table()
COUNT1 |
---|
int64 |
14462388 |
There are 14.5 million rows in the MPCORB
table.
3.2. Columns¶
Use the TAP service to query for the column information from MPCORB
.
Print the results as a pandas
table.
results = service.search("SELECT column_name, datatype, description, "
" unit from TAP_SCHEMA.columns "
"WHERE table_name = 'dp03_catalogs_10yr.MPCORB'")
results.to_table().to_pandas()
column_name | datatype | description | unit | |
---|---|---|---|---|
0 | arc | float | MPCORB: Arc (days), for single-opposition objects | d |
1 | arcEnd | char | MPCORB: Year of last observation (for multi-op... | |
2 | arcStart | char | MPCORB: Year of first observation (for multi-o... | |
3 | computer | char | MPCORB: Computer name | |
4 | e | double | MPCORB: Orbital eccentricity | |
5 | epoch | double | MPCORB: Epoch (in MJD, .0 TT) | d |
6 | flags | int | MPCORB: 4-hexdigit flags. See https://minorpla... | |
7 | fullDesignation | char | MPCORB: Readable designation | |
8 | incl | double | MPCORB: Inclination to the ecliptic, J2000.0 (... | deg |
9 | lastIncludedObservation | float | MPCORB: Date of last observation included in o... | d |
10 | mpcDesignation | char | MPCORB: Number or provisional designation (in ... | |
11 | mpcG | float | MPCORB: Slope parameter, G | |
12 | mpcH | float | MPCORB: Absolute magnitude, H | mag |
13 | mpcNumber | int | MPC number (if the asteroid has been numbered;... | |
14 | n | double | MPCORB: Mean daily motion (degrees per day) | deg/d |
15 | nobs | int | MPCORB: Number of observations | |
16 | node | double | MPCORB: Longitude of the ascending node, J2000... | deg |
17 | nopp | int | MPCORB: Number of oppositions | |
18 | peri | double | MPCORB: Argument of perihelion, J2000.0 (degrees) | deg |
19 | pertsLong | char | MPCORB: Precise indicator of perturbers (blank... | |
20 | pertsShort | char | MPCORB: Coarse indicator of perturbers (blank ... | |
21 | q | double | MPCORB: Perihelion distance (AU) | AU |
22 | reference | char | MPCORB: Reference | |
23 | rms | float | MPCORB: r.m.s residual (") | arcsec |
24 | ssObjectId | long | LSST unique identifier (if observed by LSST) | |
25 | tperi | double | MPCORB: MJD of pericentric passage | d |
26 | uncertaintyParameter | char | MPCORB: Uncertainty parameter, U |
In some cases, the column descriptions cut off in the table above.
Execute the following to see, for example, the full description for the mpcDesignation
column.
results['description'][10]
'MPCORB: Number or provisional designation (in packed form)'
3.3. Retrieve a random subset¶
To retrieve a random subset of rows, make use of the fact that ssObjectId
is a randomly assigned 64-bit long unsigned integer.
Since ADQL interprets a 64-bit long unsigned integer as a 63-bit signed integer,
these range from a very large negative integer value to a very large positive integer value.
This will be fixed in the future so that all identifiers are positive numbers.
Notice: By using
ssObjectId
, the following query returns a random subset ofMPCORB
rows that are associated with a row in theSSObject
table. In other words, this limits the query to only retrieve moving objects from theMPCORB
table that have been detected by LSST.
First, figure out the full range of ssObjectId
values by using the ADQL MIN
and MAX
functions.
results = service.search("SELECT min(ssObjectId), max(ssObjectId)"
"FROM dp03_catalogs_10yr.MPCORB")
results.to_table()
min1 | max2 |
---|---|
int64 | int64 |
-9223370383071521539 | 9223370875126069107 |
Define a search range for ssObjectId
that would return no more than 1% of all objects in MPCORB
. We will do this by estimating a new minimum ssObjectId
that is 1% below the maximum ssObjectId
for the full range of ssObjectId
values.
Notice: Since the range of
ssObjectId
's (-9223370383071521539 --> +9223370875126069107) is much larger than the number of rows in theMPCORB
table (14600302), we don't expect to get exactly 1% of the rows fromMPCORB
via this method, but we should get approximately 1%, as long as thessObjectId
values are distributed reasonably uniformly over their large range.
min_val = int(results[0].get('min1'))
max_val = int(results[0].get('max2'))
print('Full range: ', min_val, max_val)
min_val = int(max_val - 0.01*(max_val-min_val))
print('Search range: ', min_val, max_val)
Full range: -9223370383071521539 9223370875126069107 Search range: 9038903462544093184 9223370875126069107
Execute the search, and retrieve all (*) columns from the MPCORB
table.
Store the results in df
as a pandas
dataframe.
results = service.search("SELECT * FROM dp03_catalogs_10yr.MPCORB "
"WHERE ssObjectId BETWEEN 9038903462544093184 "
"AND 9223370875126069107")
df = results.to_table().to_pandas()
Display the resulting dataframe. Note that it is automatically truncated for both columns and rows (look for the "..." halfway down, and halfway across).
df
arc | arcEnd | arcStart | computer | e | epoch | flags | fullDesignation | incl | lastIncludedObservation | ... | nopp | peri | pertsLong | pertsShort | q | reference | rms | ssObjectId | tperi | uncertaintyParameter | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | 0.143340 | 60065.000000 | 0 | 2011 1062 T-3 | 6.230810 | NaN | ... | 0 | 62.518599 | 1.975493 | NaN | 9154661686854389525 | 59969.017393 | |||||||
1 | NaN | 0.112557 | 60065.000000 | 0 | 2011 1093 T-2 | 6.322560 | NaN | ... | 0 | 72.646946 | 2.061410 | NaN | 9059993893438746957 | 60310.434546 | |||||||
2 | NaN | 0.049899 | 60065.000000 | 0 | 2011 1184 T-2 | 4.306140 | NaN | ... | 0 | 279.168196 | 2.426167 | NaN | 9170955388068648464 | 60191.630055 | |||||||
3 | NaN | 0.105047 | 60065.000000 | 0 | 2011 1186 T-3 | 16.188500 | NaN | ... | 0 | 357.091672 | 2.757674 | NaN | 9077066921741880727 | 59093.291192 | |||||||
4 | NaN | 0.176216 | 60065.000000 | 0 | 2011 1464 T-2 | 4.495110 | NaN | ... | 0 | 81.310304 | 2.121439 | NaN | 9053987333942294876 | 59808.773603 | |||||||
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
144467 | NaN | 0.919897 | 60255.126890 | 0 | 2011 LPCD750 | 106.184487 | NaN | ... | 0 | 209.420137 | 6.642560 | NaN | 9133287149408159006 | 60251.935331 | |||||||
144468 | NaN | 0.527282 | 61444.848873 | 0 | 2011 LPCD823 | 11.480147 | NaN | ... | 0 | 92.507010 | 16.379375 | NaN | 9140884185118476131 | 61444.842271 | |||||||
144469 | NaN | 0.536121 | 62656.471653 | 0 | 2011 LPCD889 | 117.420822 | NaN | ... | 0 | 12.259819 | 15.988392 | NaN | 9189745973987900787 | 62656.437497 | |||||||
144470 | NaN | 0.602921 | 63093.436806 | 0 | 2011 LPCD940 | 44.898648 | NaN | ... | 0 | 81.527019 | 15.572242 | NaN | 9181407122537090723 | 63092.423677 | |||||||
144471 | NaN | 0.655530 | 62223.336444 | 0 | 2011 LPCD941 | 177.067536 | NaN | ... | 0 | 276.729975 | 12.396072 | NaN | 9211044985939642883 | 62223.322988 |
144472 rows × 27 columns
As we see, 144,472 rows were returned, which -- as expected -- is almost (but not exactly) 1% of the 14,462,388 rows in the MPCORB
table.
Notice: There are several columns that currently contain
NaN
(not a number) values. For the simulated DP0.3 data these columns might be replaced in the near future, and for real data releases there will not be all-NaN
columns. If desired, users can drop all-NaN
columns with, e.g.,df.dropna(axis=1, how='all', inplace=True)
. However the better practice is to understand the columns and retrieve only what you are going to use.
Optional: use the pandas
dataframe info
function to learn more about the values in the retrieved columns.
Uncomment the following code cell (remove the # symbol) and execute the cell to see the info on df
.
# df.info()
Optional: use the pandas
fuction describe
to display statistics for the
numerical columns in the dataframe (count, mean, standard deviation, etc.).
# df.describe()
3.4. Plot histograms of selected columns¶
Wikipedia provides a decent beginner-level guide to orbital elements.
For a quick reference, distributions are shown below for five key orbital elements and the absolute $H$ magnitude (see Section 4.4 for a description of $H$).
fig, ax = plt.subplots(2, 3, figsize=(10, 6), sharey=False)
ax[0, 0].hist(df['e'], bins=100, log=True)
ax[0, 0].set_xlabel('Eccentricity')
ax[0, 0].set_ylabel('log(Number)')
ax[0, 1].hist(df['incl'], bins=100, log=True)
ax[0, 1].set_xlabel('Inclination (deg)')
ax[0, 1].set_ylabel('log(Number)')
ax[0, 2].hist(df['mpcH'], bins=100, log=True)
ax[0, 2].set_xlabel('Absolute Magnitude, H (mag)')
ax[0, 2].set_ylabel('log(Number)')
ax[1, 0].hist(df['node'], bins=50)
ax[1, 0].set_xlabel('Longitude of Ascending Node (deg)')
ax[1, 0].set_ylabel('Number')
ax[1, 0].set_ylim(0,3500)
ax[1, 1].hist(df['peri'], bins=50)
ax[1, 1].set_xlabel('Argument of Perihelion (deg)')
ax[1, 1].set_ylabel('Number')
ax[1, 1].set_ylim(0,3500)
ax[1, 2].hist(df['q'], bins=100, log=True)
ax[1, 2].set_xlabel('Perihelion Distance (au)')
ax[1, 2].set_ylabel('log(Number)')
fig.suptitle('Histograms for Key Orbital Elements')
fig.tight_layout()
plt.show()
4. The SSObject
table¶
During Rubin Operations, Prompt Processing will occur during the night, detecting sources in
difference images (DiaSources
, see Section 6) and associating them into static-sky transients
and variables (DiaObjects
, not included in DP0.3).
The Solar System Processing which occurs in the daytime, after a night of observing,
links together the DiaSources
for moving objects into SSObjects
.
Whereas the MPCORB
table contains the orbital elements for these moving objects,
the SSObjects
contains the Rubin-measured properties such as phase curve fits and absolute magnitudes.
Notice: no artifacts or spurious difference-image sources have been injected into the DP0.3 catalogs.
Notice: although there are columns for them, no u- or Y-band data were simulated for DP0.3. Unless noted, we will ignore u- or Y-band data for the remainder of this notebook.
4.1. Size¶
Use the ADQL count function to return the size.
results = service.search("SELECT COUNT(*) from dp03_catalogs_10yr.SSObject")
results.to_table().to_pandas()
COUNT1 | |
---|---|
0 | 4443479 |
The DP0.3 data set contains 4.4 million solar system objects detected by Rubin.
This is less than the 14.5 million objects in the MPCORB
catalog.
It is left as an exercise for the learner in Section 7 to determine the characteristics of those
objects from the MPCORB
table are missing from the SSObject
table.
4.2. Columns¶
Option: display a list of column names, data types, descriptions, and units.
# results = service.search("SELECT column_name, datatype, description, "
# "unit from TAP_SCHEMA.columns "
# "WHERE table_name = 'dp03_catalogs_10yr.SSObject'")
# results.to_table().to_pandas()
4.3. Retrieve a random subset¶
Use essentially the same query as was used for the MPCORB
table, above.
results = service.search("SELECT * FROM dp03_catalogs_10yr.ssObject "
"WHERE ssObjectId BETWEEN 9038903462544093184 "
"AND 9223370875126069107")
df = results.to_table().to_pandas()
df
arc | discoverySubmissionDate | firstObservationDate | flags | g_Chi2 | g_G12 | g_G12Err | g_H | g_H_gG12_Cov | g_HErr | ... | y_H_yG12_Cov | y_HErr | y_Ndata | z_Chi2 | z_G12 | z_G12Err | z_H | z_H_zG12_Cov | z_HErr | z_Ndata | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2225.923096 | 61634.00000 | 61640.42378 | 102 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 |
1 | 3148.112305 | 60308.00000 | 60311.20320 | 66 | 1.821222 | 0.538449 | 0.029092 | 16.201378 | 0.000087 | 0.003903 | ... | NaN | NaN | 0 | 1.004799 | 0.570081 | 2.714864e-02 | 15.410084 | 6.308406e-05 | 4.074336e-03 | 68 |
2 | 3314.104736 | 60544.00000 | 60546.12723 | 66 | 1.481194 | 0.447420 | 0.102737 | 19.085361 | 0.001247 | 0.015581 | ... | NaN | NaN | 0 | 0.496874 | 0.493693 | 1.690698e-01 | 18.293407 | 4.123830e-03 | 3.045361e-02 | 38 |
3 | 3436.664307 | 60362.00000 | 60369.30842 | 66 | 2.093236 | 0.617962 | 0.031888 | 17.350933 | 0.000167 | 0.006134 | ... | NaN | NaN | 0 | 1.211593 | 0.586116 | 4.590447e-02 | 16.562313 | 2.647262e-04 | 7.582015e-03 | 79 |
4 | 3325.815674 | 60476.00000 | 60476.15480 | 66 | 2.019966 | 0.590387 | 0.057489 | 18.055870 | 0.000590 | 0.010950 | ... | NaN | NaN | 0 | 0.913779 | 0.598814 | 6.471661e-02 | 17.270544 | 5.449418e-04 | 1.152790e-02 | 49 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
44606 | 2897.801514 | 60574.00000 | 60580.30096 | 98 | 0.261178 | 0.202494 | 0.468495 | 21.202513 | 0.042931 | 0.101005 | ... | NaN | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 |
44607 | 500.044525 | 62742.18036 | 62735.18036 | 2150 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 |
44608 | 3341.052979 | 60219.00000 | 60226.23987 | 102 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 |
44609 | 2105.805664 | 61488.00000 | 61493.23812 | 98 | 1.452305 | 0.541928 | 0.451856 | 20.889555 | 0.032904 | 0.080253 | ... | NaN | NaN | 0 | NaN | NaN | NaN | NaN | NaN | NaN | 0 |
44610 | 1859.874756 | 61727.00000 | 61731.34561 | 70 | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | 0 | 0.859407 | -767251.750000 | 9.258375e+11 | 36.474541 | -1.229126e+18 | 1.327583e+06 | 6 |
44611 rows × 55 columns
There are 44611 rows.
Options: use the info
or describe
functions on the pandas
dataframe to learn more about the retrieved results.
# df.info()
# df.describe()
4.4. Plot a color-color diagram¶
In the displayed dataframe above, it appears that for many SSObjects
the phase-curve fits to derive
absolute magnitudes were not successful, as the <band>_H
(where <band>
is a band u, g, r, i, z, or y) are NaN
. (Recall in particular that no u- or y-band data were simulated for DP0.3.)
Before calculating and plotting the colors, drop all of the rows for which the phase-curve fits were not successful for g, r, i, and/or z bands.
df.dropna(subset=['g_H', 'r_H', 'i_H', 'z_H'], inplace=True)
df.reset_index(inplace=True)
print('Number of rows after dropping rows: %d' % len(df))
Number of rows after dropping rows: 30853
For Solar System objects, absolute magnitudes are defined to be for an object 1 au from the Sun and 1 au from the observer, and at a phase angle (the angle Sun-object-Earth) of 0 degrees.
Absolute magnitudes are derived by correcting for distance, fitting a function to the relationship between absolute magnitude and phase, and evaluating the function at a phase of 0 deg.
The process for fitting phase curves will be covered in another tutorial.
Calculate colors in the Rubin filters for the SSObjects
that have absolute magnitudes.
df['gr'] = df['g_H'] - df['r_H']
df['ri'] = df['r_H'] - df['i_H']
df['iz'] = df['i_H'] - df['z_H']
Plot color-color diagrams as 2-dimensional histograms (heatmaps).
fig, ax = plt.subplots(1, 2, figsize=(8, 3))
h = ax[0].hist2d(df['gr'], df['ri'],
bins=(np.arange(-0.5, 1.5, 0.01),
np.arange(-0.5, 0.75, 0.01)),
norm='log')
ax[0].set_xlabel('g - r')
ax[0].set_ylabel('r - i')
ax[0].grid()
plt.colorbar(h[3])
h = ax[1].hist2d(df['ri'], df['iz'],
bins=(np.arange(-0.5, 0.75, 0.01),
np.arange(-0.75, 0.5, 0.01)),
norm='log')
ax[1].set_xlabel('r - i')
ax[1].set_ylabel('i - z')
ax[1].grid()
plt.colorbar(h[3])
fig.suptitle('Color-Color Plots for the SSObject Catalog')
fig.tight_layout()
plt.show()
There are two colors used in the simulation - but this is not the case for real Solar System objects. These plots will look very different in the future, when they are made with real Rubin data.
5. The SSSource
table¶
As described above, Solar System Processing links together the DiaSources
(detections in the
individual difference images) from moving objects into SSObjects
.
The SSSource
table contains the 2-d (sky) coordinates and 3-d distances and velocities
for every SSObject
at the time of every LSST observation of that SSObject
.
The SSSource
and DiaSource
tables are 1:1, as they each contain data per observation,
whereas SSObject
contains data per object.
5.1. Size¶
It can take up to a minute to retrieve the size of the SSSource
catalog.
results = service.search("SELECT COUNT(*) from dp03_catalogs_10yr.SSSource")
results.to_table().to_pandas()
COUNT1 | |
---|---|
0 | 653005444 |
This table contains over 653 million sources!
5.2. Columns¶
Option: print the column information for the SSSource
table.
# results = service.search("SELECT column_name, datatype, description, "
# "unit from TAP_SCHEMA.columns "
# "WHERE table_name = 'dp03_catalogs_10yr.SSSource'")
# results.to_table().to_pandas()
5.3. Retrieve data for one SSObject
¶
It is possible to obtain the SSSource
data for a set of SSObjects
.
For example, to retrieve all SSSources
for the SSObjects
retrieved in Section 4,
use a query such as:
SELECT * FROM dp03_catalogs_10yr.SSSource
WHERE ssObjectId BETWEEN 9038903462544093184 AND 9223370875126069107
However, the better way to demonstrate the data in the SSSource
table is to look at just one SSObject
,
and the one with an ssObjectId
= 6793512588511170680
is a fun choice.
Retrieve the heliocentric (sun-centered) and topocentric (observatory-centered) X and Y coordinates.
results = service.search("SELECT heliocentricX, heliocentricY, "
"topocentricX, topocentricY, ssObjectId "
"FROM dp03_catalogs_10yr.SSSource "
"WHERE ssObjectId = 6793512588511170680")
df_xy = results.to_table().to_pandas()
print('Retrieved ', len(df_xy), ' rows.')
Retrieved 319 rows.
Options to display the table in full or use the info
or describe
functions.
# df_xy
# df_xy.info()
# df_xy.describe()
5.3. Plot the locations of one SSObject
¶
Plot the locations of the selected SSObject
at the time of every
LSST observation using the X and Y heliocentric (Sun-centered; orange star)
and topocentric (observatory-centered; blue circle) coordinates.
This can be considered a projection of the orbit into the plane of the Solar System.
Notice how the points are not regularly spaced. This is because there is one point per LSST observation of the object, and in some years it receives more or fewer observations.
Notice how the points appear in an ellipse around the Sun with heliocentric coordinates (left). This is because the selected object is in the main asteroid belt and close enough to the Sun to complete at least on orbit during the 10-year LSST survey. Had the selected object been in the outer Solar System, or were this tutorial using the 1-year data set, the plot below would show an arc instead of an ellipse.
The plot of the topocentric coordinates (right) does not appear elliptical because the motion of the Earth with respect to the object over the 10 years of the LSST is imprinted into the data. For the topocentric coordinates, the Earth's rotation also contributes, but it is a much smaller effect on the scale of these plots.
fig, ax = plt.subplots(1, 2, figsize=(8, 4))
ax[0].grid()
ax[0].plot(df_xy['heliocentricX'], df_xy['heliocentricY'],
'o', ms=4, mew=0, color='black')
ax[0].plot(0, 0, '*', ms=15, color='darkorange')
ax[0].set_xlabel('heliocentric X (au)')
ax[0].set_ylabel('heliocentric Y (au)')
ax[0].set_title('Heliocentric Coordinates')
ax[1].grid()
ax[1].plot(df_xy['topocentricX'], df_xy['topocentricY'],
'o', ms=4, mew=0, color='black')
ax[1].plot(0, 0, 'o', ms=15, color='dodgerblue')
ax[1].set_xlabel('topocentric X (au)')
ax[1].set_ylabel('topocentric Y (au)')
ax[1].set_title('Topocentric Coordinates')
fig.suptitle('XY Path of 6793512588511170680')
fig.tight_layout()
plt.show()
6. The DiaSource
catalog¶
Last but definitely not least, the DiaSource
table - which is actually the first to be generated
by the Prompt Processing pipeline.
This table will contain measurements for all sources detected with a signal-to-noise ratio of at least 5
in a difference image, including moving and static-sky sources.
However, for simulated DP0.3 the DiaSource
table contains only moving objects, and no static sky time-domain objects (and no detector artifacts).
For DP0.3 the simulated DiaSource
table contains only a subset of the columns that the
real DiaSource
table will have; see Table 1 of the Rubin
Data Products Definitions Document.
Furthermore, the DiaSource
table contains a few extra truth columns, such as raTrue
, decTrue
, magTrue
.
For DP0.3, no photometric variability due to, e.g., the rotation of non-spherical bodies, collisions, or outgassing events were simulated. All evolution in apparent magnitudes with time are due to the phase curve, which is explored in another tutorial.
6.1. Size¶
Option to retrieve the size of the DiaSource
table (it is 1:1 with the SSSource
table, so the same size).
# results = service.search("SELECT COUNT(*) from dp03_catalogs_10yr.DiaSource")
# results.to_table().to_pandas()
6.2. Columns¶
Option to print the column information.
# results = service.search("SELECT column_name, datatype, description, "
# "unit from TAP_SCHEMA.columns "
# "WHERE table_name = 'dp03_catalogs_10yr.DiaSource'")
# results.to_table().to_pandas()
6.3. Retrieve data for one SSObject
¶
Similar to what we did for SSSource
data in Section 5, it is possible to obtain the DiaSource
data for a set of SSObjects
. For example, to retrieve all DiaSources
for the SSObjects
retrieved in Section 4,
use a query such as:
SELECT * FROM dp03_catalogs_10yr.DiaSource
WHERE ssObjectId BETWEEN 9038903462544093184 AND 9223370875126069107
However, as we saw for the SSSource
data in Section 5, the better way to demonstrate the data in the DiaSource
table is to look at just one SSObject
.
Here, as in Section 5, we will use the one with an ssObjectId
= 6793512588511170680
.
Retrieve the right ascension, declination, and the TAI at the exposure's midpoint (MJD) of the observation.
results = service.search("SELECT ra, dec, midpointMjdTai "
"FROM dp03_catalogs_10yr.DiaSource "
"WHERE ssObjectId = 6793512588511170680")
df = results.to_table().to_pandas()
df
ra | dec | midpointMjdTai | |
---|---|---|---|
0 | 348.416159 | -2.408505 | 62096.01616 |
1 | 351.997357 | 1.212909 | 62043.12454 |
2 | 53.506941 | 19.244016 | 63430.44136 |
3 | 182.188871 | -1.424286 | 62709.97076 |
4 | 137.800325 | 7.798673 | 62560.10471 |
... | ... | ... | ... |
314 | 312.379754 | -12.443726 | 63148.08742 |
315 | 313.660126 | -12.585707 | 63160.04304 |
316 | 147.164625 | 4.901712 | 62510.30546 |
317 | 220.150693 | -12.830205 | 61555.00272 |
318 | 17.293268 | 9.615766 | 60830.43848 |
319 rows × 3 columns
6.4. Plot time-domain data for one SSObject
¶
Here, we plot the equatorial coordinates for all observations, colored by the MJD of the observation.
This plot shows the location of the object on the sky, as seen from Earth. Recall from the plots in Section 5 that this object is only about 2 au from the Sun, and so is at the inner edge of the Main Asteroid Belt. Parallax from the Earth's orbit contributes to the appearance of this plot.
cmap = colormaps['viridis']
fig, ax = plt.subplots(1, 1, figsize=(12, 3))
im = ax.scatter(df['ra'], df['dec'], c=df['midpointMjdTai'], cmap=cmap)
ax.set_xlabel('ra (deg)')
ax.set_ylabel('dec (deg)')
fig.suptitle("Path of one SSObject in equatorial coordinates over 10 years")
fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.85, 0.15, 0.02, 0.7])
fig.colorbar(im, cax=cbar_ax, label='TAI')
ax.grid(True)
Use of the magnitude and filter data retrieved from the DiaSource
table is left as
an exercise for the learner in Section 7.
7. Exercises for the learner¶
1. Calculate the semi-major axis for a subset of objects.
The orbital element of semi-major axis ($a$) is not pre-computed in the MPCORB
table
because it can be derived from the orbit’s ellipticiy ($e$) and perihelion distance ($q$)
with $a = q/(1-e)$.
Copy the query from Section 3.3 and alter it to retrieve only $e$ and $q$ for a subset of objects. Then add a column to the results table for semi-major axis and plot eccentricity versus semi-major axis.
Hints:
- Restrict the query to objects with eccentricities between 0 and 1.
df['a'] = df['q'] / (1.0 - df['e'])
2. How does an SSObject's magnitude change with time?
In Section 6.4, the magnitude and filter for all detections of the SSObject
in the
DiaSource
table were not retrieved; alter the query to retrieve mag
and band
,
and plot the magnitude as a function of time for the filter of your choice
(recall: there will be no u- or y-band magnitudes).
Note that the only simulated cause of photometric changes in DP0.3 objects is the distance from Earth and the phase curve.
Correcting for distance, and the applications phase curves, are covered in another tutorial.