SSC MSOS archive search keywords update #700

Merged
PaulHuwe merged 10 commits into spacetelescope:main from cjarnold:msos_20250910
Oct 2, 2025

Conversation

@cjarnold
Collaborator

This PR will be used to refine and discuss the SSC MSOS archive search keywords for SOC B20.

@ketozhang and @ejoliet to provide feedback on updates needed here.

Initial changes:

  • The keywords will arrive from SSC in a flat JSON object. There is no 'meta' object, so it has been removed here.
  • Per "Approach for validation of SSC metadata" (#693), we are working to allow SSC pipelines to validate their metadata JSON objects against the RAD schema. JSON does not support tagged items, so all tagged asdf/time occurrences now reference the asdf/time iso_time instead.
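
To illustrate the two changes above, here is a minimal sketch, not the actual SSC or SOC code. The manifest contents (file name and timestamp) are hypothetical, and the pattern is the iso_time regex as it appears in the resolved schema output later in this conversation. It shows that in a flat JSON object the keywords sit at the top level and a time is just a string, so validation reduces to a type check plus a pattern check.

```python
import json
import re

# iso_time pattern copied verbatim from the resolved schema output in this conversation.
ISO_TIME = re.compile(
    r"[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]"
    r"([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?"
)

# Hypothetical flat manifest: keywords are top-level keys, with no 'meta' wrapper.
raw = (
    '{"filename": "example_catalog.asdf", '
    '"file_date": "2025-09-10T20:42:00", '
    '"origin": "IPAC/SSC"}'
)
manifest = json.loads(raw)


def looks_like_iso_time(value):
    """Mimic a JSON Schema 'type: string' + 'pattern' check.

    JSON Schema applies 'pattern' as an unanchored search, hence re.search.
    """
    return isinstance(value, str) and ISO_TIME.search(value) is not None


# JSON has no tag mechanism, so the timestamp is parsed as a plain str.
print(type(manifest["file_date"]).__name__)
print(looks_like_iso_time(manifest["file_date"]))
# A non-string value is rejected by the type check alone.
print(looks_like_iso_time(20250910))
```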

Tasks

  • Update or add relevant rad tests.
  • Update relevant docstrings and / or docs/ page.
  • Does this PR change any schema files?
    • Schema changes were discussed at RAD Review Board meeting.
  • Does this PR change any API used downstream? (If not, label with no-changelog-entry-needed.)
News fragment change types:
  • changes/<PR#>.feature.rst: new feature
  • changes/<PR#>.bugfix.rst: fixes an issue
  • changes/<PR#>.doc.rst: documentation change
  • changes/<PR#>.removal.rst: deprecation or removal of public API
  • changes/<PR#>.misc.rst: infrastructure or miscellaneous change

@cjarnold cjarnold requested review from a team and WilliamJamieson as code owners September 10, 2025 20:42
@codecov

codecov Bot commented Sep 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.48%. Comparing base (f45f0d4) to head (35dd484).
⚠️ Report is 147 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #700   +/-   ##
=======================================
  Coverage   97.48%   97.48%           
=======================================
  Files           8        8           
  Lines         716      716           
=======================================
  Hits          698      698           
  Misses         18       18           


@ketozhang

ketozhang commented Sep 11, 2025

Without a way to view all the schemas merged together, it's a little hard for non-devs (and, I admit, devs too) to give feedback. One option is to merge these all into a single super schema; otherwise we will try to compose the object and use a validator.

I am betting the latter is faster to get started on. For that can we get the tag -> $ref refactoring in the main branch before we attempt to refine the schema fields?

@braingram
Collaborator

> I am betting the latter is faster to get started on. For that can we get the tag -> $ref refactoring in the main branch before we attempt to refine the schema fields?

Thanks for taking a look at this. The tag to $ref refactoring is only for SOC schemas. Based on the other comments and PRs, it sounds like manifests can never contain tags (since they are JSON files), so there's no benefit to using a tag validator in the SSC manifest schemas. Some $refs might be useful (like the $refs to the time schema added in this PR, which reuse the regex that checks that the times are ISO formatted), but that's independent of the SOC tag to $ref updates.

> Without a way to view all the schemas merged together, it's a little hard for non-devs (and, I admit, devs too) to give feedback. One option is to merge these all into a single super schema; otherwise we will try to compose the object and use a validator.

Testing this with actual manifests sounds great. I would also encourage testing with known incorrect manifests to make sure that the schemas are validating the expected fields (for example, change file_date to a number, show that the validation fails, etc).
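
As a rough illustration of that kind of negative test, here is a minimal sketch. It uses a tiny hand-rolled checker (the hypothetical `errors` function, covering only `type`, `maxLength`, `allOf`, `required`, and `properties`) as a stand-in for a real JSON Schema validator, a trimmed excerpt of the ssc_basic schema, and a made-up manifest. A real test would load the actual schema and use a full validator.

```python
import copy

# Only the JSON types needed for this excerpt.
_TYPES = {"string": str, "number": (int, float), "object": dict}


def errors(value, schema, path="manifest"):
    """Return a list of problems found; an empty list means the value passed."""
    expected = schema.get("type")
    if expected and (isinstance(value, bool) or not isinstance(value, _TYPES[expected])):
        return [f"{path}: expected {expected}"]
    found = []
    if isinstance(value, str) and len(value) > schema.get("maxLength", len(value)):
        found.append(f"{path}: longer than maxLength")
    for sub in schema.get("allOf", []):
        found += errors(value, sub, path)
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                found.append(f"{path}: missing required '{name}'")
        for name, sub in schema.get("properties", {}).items():
            if name in value:
                found += errors(value[name], sub, f"{path}.{name}")
    return found


# Trimmed excerpt of the ssc_basic schema (descriptions and patterns omitted).
schema = {
    "type": "object",
    "properties": {
        "filename": {"type": "string", "maxLength": 1024},
        "file_date": {"allOf": [{"type": "string"}]},
        "origin": {"type": "string", "maxLength": 15},
    },
    "required": ["filename", "file_date", "origin"],
}

# Hypothetical known-good manifest.
good = {
    "filename": "example_catalog.asdf",
    "file_date": "2025-09-10T20:42:00",
    "origin": "IPAC/SSC",
}

# Known-bad variants: file_date changed to a number, and origin removed.
bad_type = copy.deepcopy(good)
bad_type["file_date"] = 20250910
missing = {key: val for key, val in good.items() if key != "origin"}

for label, candidate in [("good", good), ("bad_type", bad_type), ("missing", missing)]:
    print(label, errors(candidate, schema))
```

Keeping the bad manifests as small mutations of a known-good one makes it clear which field each expected failure exercises.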

For "merge" do you mean resolve references for a given schema? If so, there is a resolve_references argument to asdf.schema.load_schema.

Running the following prints the wfi_microlensing_light_curve_catalog_level_4 schema with all references resolved:

from importlib.resources import files
import json

import asdf
import asdf_standard

# add ssc resources
resources = asdf.resource.DirectoryResourceMapping(
    files("rad") / "resources" / "schemas" / "SSC",
    "asdf://stsci.edu/datamodels/roman/schemas/SSC/",
    recursive=True
)

asdf.get_config().add_resource_mapping(resources)

print(
    json.dumps(
        asdf.schema.load_schema(
            "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/wfi_microlensing_light_curve_catalog_level_4-1.0.0",
            resolve_references=True,
        ),
        indent=2,
    )
)

Is this what you're looking for? If not, would you expand on what would be helpful?

Output:
{
  "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
  "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/wfi_microlensing_light_curve_catalog_level_4-1.0.0",
  "title": "GBTDS Level 4 Light Curve Catalog",
  "archive_meta": "Placeholder SOC file type",
  "type": "object",
  "allOf": [
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/ssc_basic-1.0.0",
      "title": "Basic keywords across all SSC file types",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "filename": {
          "title": "File Name",
          "description": "The filename as taken from the 'filename' property of the\ndata availability notification (https://github.com/spacetelescope/roman-soc-ssc-common/blob/main/schemas/data_availability.json#L7)\nSSC does not need to include this in the 'metadata' object of the data availability notification.\n",
          "type": "string",
          "maxLength": 1024,
          "archive_catalog": {
            "datatype": "nvarchar(1024)",
            "destination": [
              "CGIExposure.filename",
              "CGIMosaic.filename",
              "CGIAncillary.filename"
            ]
          }
        },
        "file_date": {
          "title": "File Creation Date (UTC)",
          "description": "The file creation date as taken from the 'fileCreationTimestamp' property of the\ndata availability notification (https://github.com/spacetelescope/roman-soc-ssc-common/blob/main/schemas/data_availability.json#L76)\nSSC does not need to include this in the 'metadata' object of the data availability notification.\n",
          "allOf": [
            {
              "type": "string",
              "pattern": "[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?"
            }
          ],
          "archive_catalog": {
            "datatype": "datetime2",
            "destination": [
              "CGIExposure.filedate",
              "CGIMosaic.filedate",
              "CGIAncillary.filedate"
            ]
          }
        },
        "origin": {
          "title": "Institution / Organization Name",
          "description": "Organization responsible for creating file (\"IPAC/SSC\"), the Science Support Center at Caltech/IPAC.\nSSC does not need to include this in the 'metadata' object of the data availability notification.\nThis will be added automatically by SOC ingest manifest software\n",
          "type": "string",
          "maxLength": 15,
          "archive_catalog": {
            "datatype": "nvarchar(15)",
            "destination": [
              "CGIExposure.origin",
              "CGIMosaic.origin",
              "CGIAncillary.origin"
            ]
          }
        }
      },
      "required": [
        "filename",
        "file_date",
        "origin"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_basic-1.0.0",
      "title": "MSOS basic keywords",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "model_type": {
          "title": "Data Model Type",
          "description": "The type of data model used for the file.  Example value 'MicrolensingEventCatalogSchema'",
          "type": "string",
          "maxLength": 50,
          "archive_catalog": {
            "datatype": "nvarchar(50)",
            "destination": [
              "MSOSCatalog.model_type"
            ]
          }
        },
        "product_type": {
          "title": "Product Type Descriptor",
          "description": "A descriptor for the type of data contained within the\nfile. This corresponds to the standard file suffixes for\narchival data products. Consult the documentation for the list\nof options.\n",
          "type": "string",
          "maxLength": 120,
          "archive_catalog": {
            "datatype": "nvarchar(120)",
            "destination": [
              "MSOSCatalog.product_type"
            ]
          }
        }
      },
      "required": [
        "model_type",
        "product_type"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_common-1.0.0",
      "title": "Common MSOS Keyword",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "INSTRUME": {
          "title": "Instrument",
          "description": "Instrument used",
          "type": "string",
          "maxLength": 8,
          "archive_catalog": {
            "datatype": "nvarchar(8)",
            "destination": [
              "WFIReferenceFrame.instrument_name",
              "MSOSCatalog.instrument_name"
            ]
          }
        },
        "DETECTOR": {
          "title": "Detector",
          "description": "The WFI detector name",
          "type": "string",
          "maxLength": 16,
          "archive_catalog": {
            "datatype": "nvarchar(16)",
            "destination": [
              "WFIReferenceFrame.detector",
              "MSOSCatalog.detector"
            ]
          }
        },
        "APERTURE": {
          "title": "Aperture",
          "description": "The WFI aperture name",
          "type": "string",
          "maxLength": 16,
          "archive_catalog": {
            "datatype": "nvarchar(16)",
            "destination": [
              "WFIReferenceFrame.aperture",
              "MSOSCatalog.aperture"
            ]
          }
        },
        "CALVER": {
          "title": "Calibration Version",
          "description": "The version of the calibration software used",
          "type": "string",
          "maxLength": 16,
          "archive_catalog": {
            "datatype": "nvarchar(16)",
            "destination": [
              "WFIReferenceFrame.detector",
              "MSOSCatalog.detector"
            ]
          }
        },
        "OPTICAL_ELEMENT": {
          "allOf": [
            {
              "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
              "id": "asdf://stsci.edu/datamodels/roman/schemas/wfi_optical_element-1.2.0",
              "title": "Optical Element",
              "description": "Name of the filter element used. See the RDox Optical Element page for more\ndetails on available optical elements and their properties.\n",
              "type": "string",
              "enum": [
                "F062",
                "F087",
                "F106",
                "F129",
                "F146",
                "F158",
                "F184",
                "F213",
                "GRISM",
                "PRISM",
                "DARK",
                "NOT_CONFIGURED"
              ],
              "maxLength": 20
            }
          ],
          "title": "Wide Field Instrument (WFI) Optical Element",
          "description": "Name of the optical element used to take the science\ndata.\n",
          "archive_catalog": {
            "datatype": "nvarchar(20)",
            "destination": [
              "WFIReferenceFrame.optical_element",
              "MSOSCatalog.optical_element"
            ]
          }
        }
      },
      "required": [
        "INSTRUME",
        "DETECTOR",
        "APERTURE",
        "CALVER",
        "OPTICAL_ELEMENT"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_exposure_catalog-1.0.0",
      "title": "Exposure Information For MSOS Catalogs",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "EXPTIME": {
          "title": "Exposure Time",
          "description": "Cadence of photometric measurements for light curve (seconds)",
          "type": "number",
          "archive_catalog": {
            "datatype": "float",
            "destination": [
              "MSOSCatalog.exposure_time"
            ]
          }
        },
        "EXPSTART": {
          "title": "Start Time",
          "description": "First timestamp in the light curve (UTC)",
          "allOf": [
            {
              "type": "string",
              "pattern": "[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?"
            }
          ],
          "archive_catalog": {
            "datatype": "datetime2",
            "destination": [
              "MSOSCatalog.start_time"
            ]
          }
        },
        "EXPEND": {
          "title": "End Time",
          "description": "Last timestamp in the light curve (UTC)",
          "allOf": [
            {
              "type": "string",
              "pattern": "[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?"
            }
          ],
          "archive_catalog": {
            "datatype": "datetime2",
            "destination": [
              "MSOSCatalog.end_time"
            ]
          }
        },
        "MJDSTART": {
          "title": "Start Time MJD",
          "description": "First timestamp in the light curve (MJD)",
          "type": "number",
          "archive_catalog": {
            "datatype": "float",
            "destination": [
              "MSOSCatalog.start_time_mjd"
            ]
          }
        },
        "MJDEND": {
          "title": "End Time MJD",
          "description": "Last timestamp in the light curve (MJD)",
          "type": "number",
          "archive_catalog": {
            "datatype": "float",
            "destination": [
              "MSOSCatalog.end_time_mjd"
            ]
          }
        }
      },
      "required": [
        "EXPTIME",
        "EXPSTART",
        "EXPEND",
        "MJDSTART",
        "MJDEND"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_observation-1.0.0",
      "title": "MSOS observation keywords",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "time_mean": {
          "title": "Mean Time of the Product",
          "description": "The Universal Coordinated Time (UTC) mean start time of the exposures used to create the product.\n",
          "allOf": [
            {
              "type": "string",
              "pattern": "[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?"
            }
          ],
          "archive_catalog": {
            "datatype": "datetime2",
            "destination": [
              "MSOSCatalog.time_mean"
            ]
          }
        },
        "max_exposure_time": {
          "title": "Maximum Exposure Time (s)",
          "description": "Maximum exposure time of all pixels in the product in units of seconds.\n",
          "type": "number",
          "archive_catalog": {
            "datatype": "float",
            "destination": [
              "MSOSCatalog.max_exposure_time"
            ]
          }
        },
        "mean_exposure_time": {
          "title": "Mean Exposure Time (s)",
          "description": "Mean of component image exposure times\n",
          "type": "number",
          "archive_catalog": {
            "datatype": "float",
            "destination": [
              "MSOSCatalog.mean_exposure_time"
            ]
          }
        }
      },
      "required": [
        "time_mean",
        "max_exposure_time",
        "mean_exposure_time"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_photometry-1.0.0",
      "title": "MSOS photometry keywords",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "conversion_megajanskys": {
          "title": "Zeropoint Flux (MJy/sr)",
          "description": "The flux density (in units of megaJanskys per\nsteradian; MJy/sr).\n",
          "type": "number",
          "archive_catalog": {
            "datatype": "float",
            "destination": [
              "MSOSCatalog.conversion_megajanskys"
            ]
          }
        },
        "conversion_megajanskys_uncertainty": {
          "title": "Zeropoint Flux Uncertainty (MJy/sr)",
          "description": "The uncertainty in the flux density (in units of\nmegaJanskys per steradian; MJy/sr)\n",
          "type": "number",
          "archive_catalog": {
            "datatype": "float",
            "destination": [
              "MSOSCatalog.conversion_megajanskys_uncertainty"
            ]
          }
        }
      },
      "required": [
        "conversion_megajanskys",
        "conversion_megajanskys_uncertainty"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_program-1.0.0",
      "title": "MSOS program keywords",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "PROGNUM": {
          "title": "Program number",
          "description": "ID number (integer) assigned to the proposal",
          "type": "string",
          "maxLength": 16,
          "archive_catalog": {
            "datatype": "nvarchar(16)",
            "destination": [
              "WFIReferenceFrame.program",
              "MSOSCatalog.program"
            ]
          }
        },
        "PINAME": {
          "title": "PI Last Name",
          "description": "Principal Investigator (PI) last name; uppercase (e.g., FORD, LEMMON)\n",
          "type": "string",
          "maxLength": 16,
          "archive_catalog": {
            "datatype": "nvarchar(16)",
            "destination": [
              "WFIReferenceFrame.pi_name",
              "MSOSCatalog.pi_name"
            ]
          }
        },
        "PROGTITLE": {
          "title": "Program Title",
          "description": "The title of the program",
          "type": "string",
          "maxLength": 16,
          "archive_catalog": {
            "datatype": "nvarchar(16)",
            "destination": [
              "WFIReferenceFrame.program_title",
              "MSOSCatalog.program_title"
            ]
          }
        },
        "CATEGORY": {
          "title": "Program Category",
          "description": "The submitted proposal category of the program. The\ncategories include calibration (CAL), core community survey\n(CCS), coronagraph technology demonstration (CGI), observatory\ncommissioning (COM), engineering (ENG), general astrophysics\nsurvey (GAS), general investigator (GI), and observing conducted\nat the direction of the National Aeronautics and Space\nAdministration (NASA).\n",
          "type": "string",
          "maxLength": 6,
          "archive_catalog": {
            "datatype": "nvarchar(6)",
            "destination": [
              "WFIReferenceFrame.category",
              "MSOSCatalog.category"
            ]
          }
        },
        "SUBCATEGORY": {
          "title": "Program Subcategory",
          "description": "The submitted proposal subcategory of the program. The\nsubcategories include calibration (CAL), coronagraph technology\ndemonstration (CGI), community research programs (CR),\ndiscretionary research programs (DR), Galactic Bulge Time Domain\nSurvey (GBTD), High-Latitude Time Domain Survey (HLTD),\nHigh-Latitude Wide-Area Survey (HLWA), observational research\nprogram (OR), Wide Field Instrument (WFI), and Wavefront Sensing\nand Control (WFSC). All subcategories belong to only a subset of\nthe meta.program.category values.\n",
          "type": "string",
          "maxLength": 15,
          "archive_catalog": {
            "datatype": "nvarchar(15)",
            "destination": [
              "WFIReferenceFrame.program_subcategory",
              "MSOSCatalog.program_subcategory"
            ]
          }
        },
        "SCIENCE_CATEGORY": {
          "title": "Science Category",
          "description": "The science category assigned during the Time\nAllocation Committee (TAC) review process.\n",
          "type": "string",
          "maxLength": 50,
          "archive_catalog": {
            "datatype": "nvarchar(50)",
            "destination": [
              "WFIReferenceFrame.science_category",
              "MSOSCatalog.science_category"
            ]
          }
        }
      },
      "required": [
        "PROGNUM",
        "PINAME",
        "PROGTITLE",
        "CATEGORY",
        "SUBCATEGORY",
        "SCIENCE_CATEGORY"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_reference_frame-1.0.0",
      "title": "MSOS Reference Frame keywords",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "FLIP": {
          "title": "Flip",
          "description": "Placeholder.  Example value '1'",
          "type": "string",
          "maxLength": 8,
          "archive_catalog": {
            "datatype": "nvarchar(8)",
            "destination": [
              "WFIReferenceFrame.flip"
            ]
          }
        },
        "FIELD": {
          "title": "Field",
          "description": "Placeholder.  Example value '5'",
          "type": "string",
          "maxLength": 128,
          "archive_catalog": {
            "datatype": "nvarchar(128)",
            "destination": [
              "WFIReferenceFrame.field"
            ]
          }
        },
        "SEQUENCE": {
          "title": "Sequence",
          "description": "Placeholder.  Example value '3'",
          "type": "string",
          "maxLength": 10,
          "archive_catalog": {
            "datatype": "nvarchar(10)",
            "destination": [
              "WFIReferenceFrame.sequence"
            ]
          }
        },
        "TILE_ID": {
          "title": "Tile Id",
          "description": "Placeholder.  Example value '35'",
          "type": "string",
          "maxLength": 10,
          "archive_catalog": {
            "datatype": "nvarchar(10)",
            "destination": [
              "WFIReferenceFrame.tile_id"
            ]
          }
        }
      },
      "required": [
        "FLIP",
        "FIELD",
        "SEQUENCE",
        "TILE_ID"
      ]
    },
    {
      "$schema": "asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0",
      "id": "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/keywords/msos_settings-1.0.0",
      "title": "MSOS setting keywords",
      "extName": "SSC",
      "type": "object",
      "properties": {
        "msos_photometry_version": {
          "title": "Msos Photometry Version",
          "description": "Placeholder.  Example value '3.3.0'",
          "type": "string",
          "maxLength": 128,
          "archive_catalog": {
            "datatype": "nvarchar(128)",
            "destination": [
              "MSOSCatalog.msos_photometry_version"
            ]
          }
        },
        "msos_events_version": {
          "title": "Msos Events Version",
          "description": "Placeholder.  Example value '3.3.0'",
          "type": "string",
          "maxLength": 128,
          "archive_catalog": {
            "datatype": "nvarchar(128)",
            "destination": [
              "MSOSCatalog.msos_events_version"
            ]
          }
        },
        "msos_detection_efficiency_version": {
          "title": "Msos Detection Efficiency Version",
          "description": "Placeholder.",
          "type": "string",
          "maxLength": 128,
          "archive_catalog": {
            "datatype": "nvarchar(128)",
            "destination": [
              "MSOSCatalog.msos_detection_efficiency_version"
            ]
          }
        },
        "sequence": {
          "title": "Sequence",
          "description": "Placeholder.",
          "type": "string",
          "maxLength": 128,
          "archive_catalog": {
            "datatype": "nvarchar(128)",
            "destination": [
              "MSOSCatalog.sequence"
            ]
          }
        },
        "season": {
          "title": "Season",
          "description": "Placeholder.  Example value '0'",
          "type": "string",
          "maxLength": 8,
          "archive_catalog": {
            "datatype": "nvarchar(8)",
            "destination": [
              "MSOSCatalog.season"
            ]
          }
        },
        "flip_flag": {
          "title": "Flip Flag",
          "description": "Placeholder.  Example value 'False'",
          "type": "string",
          "maxLength": 8,
          "archive_catalog": {
            "datatype": "nvarchar(8)",
            "destination": [
              "MSOSCatalog.flip_flag"
            ]
          }
        },
        "galactic_center_flag": {
          "title": "Galactic Center Flag",
          "description": "Placeholder.  Example value 'False'",
          "type": "string",
          "maxLength": 8,
          "archive_catalog": {
            "datatype": "nvarchar(8)",
            "destination": [
              "MSOSCatalog.galactic_center_flag"
            ]
          }
        }
      },
      "required": [
        "msos_photometry_version",
        "msos_events_version",
        "msos_detection_efficiency_version",
        "sequence",
        "season",
        "flip_flag",
        "galactic_center_flag"
      ]
    }
  ]
}

@cjarnold
Collaborator Author

> can we get the tag -> $ref refactoring in the main branch before we attempt to refine the schema fields?

I believe @ketozhang is referring to just the asdf/time iso_time changes I am making in this PR?
I am okay with merging this branch and starting a fresh one for the schema field updates.
@braingram @WilliamJamieson does it sound okay?

@WilliamJamieson
Collaborator

There are no tags referenced by the SSC schemas save some external tags.

@ketozhang

ketozhang commented Sep 12, 2025

@braingram

> For "merge" do you mean resolve references for a given schema? If so, there is a resolve_references argument to asdf.schema.load_schema.

Thank you for crafting this example! This is close, but we're looking for a more focused view of the list of properties in allOf. It did give me an idea: I took your output and filtered it down a bit further (the result is not a valid schema, but it still illustrates a summary):

cat merge.json | jq "{allOf: [.allOf[].properties | to_entries[] | { (.key): {title: .value.title, description: .value.description, type: .value.type}}]}" | yq -p json
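
For anyone without jq and yq, a roughly equivalent summary can be sketched in Python. This is only an illustration: `summarize` is a hypothetical helper, the `resolved` dict is a trimmed excerpt of the resolved schema rather than the full merge.json, and the result is a single dict keyed by property name (so a name repeated across subschemas would be overwritten) instead of a list of one-key objects. Unlike the jq filter, it also looks inside a property's own allOf for a type, since properties such as file_date declare their type there.

```python
import json


def summarize(schema):
    """Collect title, description, and type for each property under allOf."""
    summary = {}
    for sub in schema.get("allOf", []):
        for name, prop in sub.get("properties", {}).items():
            prop_type = prop.get("type")
            if prop_type is None:
                # Fall back to the first nested allOf entry that declares a type.
                prop_type = next(
                    (s["type"] for s in prop.get("allOf", []) if "type" in s),
                    None,
                )
            summary[name] = {
                "title": prop.get("title"),
                "description": prop.get("description"),
                "type": prop_type,
            }
    return summary


# Trimmed excerpt of the resolved schema (descriptions omitted for brevity).
resolved = {
    "allOf": [
        {
            "properties": {
                "filename": {"title": "File Name", "type": "string"},
                "file_date": {
                    "title": "File Creation Date (UTC)",
                    "allOf": [{"type": "string"}],
                },
            }
        },
        {"properties": {"EXPTIME": {"title": "Exposure Time", "type": "number"}}},
    ]
}

print(json.dumps(summarize(resolved), indent=2))
```

The jq and yq pipeline above produced the following.
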
Output

allOf:
  - filename:
      title: File Name
      description: |
        The filename as taken from the 'filename' property of the
        data availability notification (https://github.com/spacetelescope/roman-soc-ssc-common/blob/main/schemas/data_availability.json#L7)
        SSC does not need to include this in the 'metadata' object of the data availability notification.
      type: string
  - file_date:
      title: File Creation Date (UTC)
      description: |
        The file creation date as taken from the 'fileCreationTimestamp' property of the
        data availability notification (https://github.com/spacetelescope/roman-soc-ssc-common/blob/main/schemas/data_availability.json#L76)
        SSC does not need to include this in the 'metadata' object of the data availability notification.
      type: null
  - origin:
      title: Institution / Organization Name
      description: |
        Organization responsible for creating file ("IPAC/SSC"), the Science Support Center at Caltech/IPAC.
        SSC does not need to include this in the 'metadata' object of the data availability notification.
        This will be added automatically by SOC ingest manifest software
      type: string
  - model_type:
      title: Data Model Type
      description: The type of data model used for the file.  Example value 'MicrolensingEventCatalogSchema'
      type: string
  - product_type:
      title: Product Type Descriptor
      description: |
        A descriptor for the type of data contained within the
        file. This corresponds to the standard file suffixes for
        archival data products. Consult the documentation for the list
        of options.
      type: string
  - INSTRUME:
      title: Instrument
      description: Instrument used
      type: string
  - DETECTOR:
      title: Detector
      description: The WFI detector name
      type: string
  - APERTURE:
      title: Aperture
      description: The WFI aperture name
      type: string
  - CALVER:
      title: Calibration Version
      description: The version of the calibration software used
      type: string
  - OPTICAL_ELEMENT:
      title: Wide Field Instrument (WFI) Optical Element
      description: |
        Name of the optical element used to take the science
        data.
      type: null
  - EXPTIME:
      title: Exposure Time
      description: Cadence of photometric measurements for light curve (seconds)
      type: number
  - EXPSTART:
      title: Start Time
      description: First timestamp in the light curve (UTC)
      type: null
  - EXPEND:
      title: End Time
      description: Last timestamp in the light curve (UTC)
      type: null
  - MJDSTART:
      title: Start Time MJD
      description: First timestamp in the light curve (MJD)
      type: number
  - MJDEND:
      title: End Time MJD
      description: Last timestamp in the light curve (MJD)
      type: number
  - time_mean:
      title: Mean Time of the Product
      description: |
        The Universal Coordinated Time (UTC) mean start time of the exposures used to create the product.
      type: null
  - max_exposure_time:
      title: Maximum Exposure Time (s)
      description: |
        Maximum exposure time of all pixels in the product in units of seconds.
      type: number
  - mean_exposure_time:
      title: Mean Exposure Time (s)
      description: |
        Mean of component image exposure times
      type: number
  - conversion_megajanskys:
      title: Zeropoint Flux (MJy/sr)
      description: |
        The flux density (in units of megaJanskys per
        steradian; MJy/sr).
      type: number
  - conversion_megajanskys_uncertainty:
      title: Zeropoint Flux Uncertainty (MJy/sr)
      description: |
        The uncertainty in the flux density (in units of
        megaJanskys per steradian; MJy/sr)
      type: number
  - PROGNUM:
      title: Program number
      description: ID number (integer) assigned to the proposal
      type: string
  - PINAME:
      title: PI Last Name
      description: |
        Principal Investigator (PI) last name; uppercase (e.g., FORD, LEMMON)
      type: string
  - PROGTITLE:
      title: Program Title
      description: The title of the program
      type: string
  - CATEGORY:
      title: Program Category
      description: |
        The submitted proposal category of the program. The
        categories include calibration (CAL), core community survey
        (CCS), coronagraph technology demonstration (CGI), observatory
        commissioning (COM), engineering (ENG), general astrophysics
        survey (GAS), general investigator (GI), and observing conducted
        at the direction of the National Aeronautics and Space
        Administration (NASA).
      type: string
  - SUBCATEGORY:
      title: Program Subcategory
      description: |
        The submitted proposal subcategory of the program. The
        subcategories include calibration (CAL), coronagraph technology
        demonstration (CGI), community research programs (CR),
        discretionary research programs (DR), Galactic Bulge Time Domain
        Survey (GBTD), High-Latitude Time Domain Survey (HLTD),
        High-Latitude Wide-Area Survey (HLWA), observational research
        program (OR), Wide Field Instrument (WFI), and Wavefront Sensing
        and Control (WFSC). All subcategories belong to only a subset of
        the meta.program.category values.
      type: string
  - SCIENCE_CATEGORY:
      title: Science Category
      description: |
        The science category assigned during the Time
        Allocation Committee (TAC) review process.
      type: string
  - FLIP:
      title: Flip
      description: Placeholder.  Example value '1'
      type: string
  - FIELD:
      title: Field
      description: Placeholder.  Example value '5'
      type: string
  - SEQUENCE:
      title: Sequence
      description: Placeholder.  Example value '3'
      type: string
  - TILE_ID:
      title: Tile Id
      description: Placeholder.  Example value '35'
      type: string
  - msos_photometry_version:
      title: Msos Photometry Version
      description: Placeholder.  Example value '3.3.0'
      type: string
  - msos_events_version:
      title: Msos Events Version
      description: Placeholder.  Example value '3.3.0'
      type: string
  - msos_detection_efficiency_version:
      title: Msos Detection Efficiency Version
      description: Placeholder.
      type: string
  - sequence:
      title: Sequence
      description: Placeholder.
      type: string
  - season:
      title: Season
      description: Placeholder.  Example value '0'
      type: string
  - flip_flag:
      title: Flip Flag
      description: Placeholder.  Example value 'False'
      type: string
  - galactic_center_flag:
      title: Galactic Center Flag
      description: Placeholder.  Example value 'False'
      type: string

@ejoliet
Collaborator

ejoliet commented Sep 19, 2025

I believe Sebastiano and Etienne will look at the files for MSOS as well. As Keyo mentioned, it is not easy for non-developers to inspect them.

I know we also discussed using lowercase keywords instead, which is compatible with the database column naming practices that we want to take advantage of.

@ketozhang

Here's a list of all the keywords, with all of their references resolved and melted down into a single flat mapping per schema.

wfi_microlensing_object_catalog_level_4-1.0.0.melt.yaml
wfi_microlensing_event_catalog_level_4-1.0.0.melt.yaml
wfi_microlensing_variability_catalog_level_4-1.0.0.melt.yaml
wfi_microlensing_light_curve_catalog_level_4-1.0.0.melt.yaml
wfi_microlensing_reference_frame_level_3-1.0.0.melt.yaml

Schema melting code

from importlib.resources import files

import asdf
import asdf_standard
import yaml

# Register the SSC schemas shipped with rad as an asdf resource mapping.
resources = asdf.resource.DirectoryResourceMapping(
    files("rad") / "resources" / "schemas" / "SSC",
    "asdf://stsci.edu/datamodels/roman/schemas/SSC/",
    recursive=True,
)

asdf.get_config().add_resource_mapping(resources)

uris = [
    "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/wfi_microlensing_variability_catalog_level_4-1.0.0",
    "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/wfi_microlensing_event_catalog_level_4-1.0.0",
    "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/wfi_microlensing_object_catalog_level_4-1.0.0",
    "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/wfi_microlensing_reference_frame_level_3-1.0.0",
    "asdf://stsci.edu/datamodels/roman/schemas/SSC/MSOS/wfi_microlensing_light_curve_catalog_level_4-1.0.0",
]

for uri in uris:
    # Load the schema with every $ref resolved inline.
    schema = asdf.schema.load_schema(
        uri,
        resolve_references=True,
    )
    # Merge the properties of every allOf entry into one flat mapping.
    melted = {}
    for d in schema["allOf"]:
        melted.update(d.get("properties", {}))
    # Drop the archive-only metadata; pop() tolerates keywords that lack it.
    for k in melted:
        melted[k].pop("archive_catalog", None)
    with open(f"{uri.split('/')[-1]}.melt.yaml", "w") as f:
        f.write(yaml.dump(melted, indent=2))
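
For readers who cannot run the script, here is a minimal sketch of what "melting" does, using a toy schema dict rather than the real RAD schemas. The `melt` helper, the toy keyword definitions, and the `datatype: float` value are illustrative assumptions, not content from the repository.

```python
def melt(schema: dict) -> dict:
    """Merge the ``properties`` of every ``allOf`` entry into one flat dict."""
    melted = {}
    for part in schema.get("allOf", []):
        for name, definition in part.get("properties", {}).items():
            # Copy so the input schema is left untouched,
            # then drop the archive-only metadata if it is present.
            entry = dict(definition)
            entry.pop("archive_catalog", None)
            melted[name] = entry
    return melted


# Toy stand-in for a resolved schema (hypothetical values).
toy_schema = {
    "allOf": [
        {
            "properties": {
                "exptime": {
                    "title": "Exposure Time",
                    "type": "number",
                    "archive_catalog": {"datatype": "float"},
                }
            }
        },
        {"properties": {"sequence": {"title": "Sequence", "type": "string"}}},
    ]
}

melted = melt(toy_schema)
print(sorted(melted))
```

Each `allOf` entry contributes its keywords to one flat mapping, which is why the melted files are easier to diff than the original schemas with their references.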


@ketozhang ketozhang left a comment


I took a look at the L4 variability and L4 event catalogs. I am requesting that we remove the photometry-related keywords from these catalogs. Here is the melted diff:

diff

< APERTURE:
<   description: The WFI aperture name
<   maxLength: 16
<   title: Aperture
<   type: string
< CALVER:
<   description: The version of the calibration software used
<   maxLength: 16
<   title: Calibration Version
<   type: string
30,95d19
< DETECTOR:
<   description: The WFI detector name
<   maxLength: 16
<   title: Detector
<   type: string
< EXPEND:
<   allOf:
<   - pattern: '[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?'
<     type: string
<   description: Last timestamp in the light curve (UTC)
<   title: End Time
< EXPSTART:
<   allOf:
<   - pattern: '[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?'
<     type: string
<   description: First timestamp in the light curve (UTC)
<   title: Start Time
< EXPTIME:
<   description: Cadence of photometric measurements for light curve (seconds)
<   title: Exposure Time
<   type: number
< INSTRUME:
<   description: Instrument used
<   maxLength: 8
<   title: Instrument
<   type: string
< MJDEND:
<   description: Last timestamp in the light curve (MJD)
<   title: End Time MJD
<   type: number
< MJDSTART:
<   description: First timestamp in the light curve (MJD)
<   title: Start Time MJD
<   type: number
< OPTICAL_ELEMENT:
<   allOf:
<   - $schema: asdf://stsci.edu/datamodels/roman/schemas/rad_schema-1.0.0
<     description: 'Name of the filter element used. See the RDox Optical Element page
<       for more
< 
<       details on available optical elements and their properties.
< 
<       '
<     enum:
<     - F062
<     - F087
<     - F106
<     - F129
<     - F146
<     - F158
<     - F184
<     - F213
<     - GRISM
<     - PRISM
<     - DARK
<     - NOT_CONFIGURED
<     id: asdf://stsci.edu/datamodels/roman/schemas/wfi_optical_element-1.2.0
<     maxLength: 20
<     title: Optical Element
<     type: string
<   description: 'Name of the optical element used to take the science
< 
<     data.
< 
<     '
<   title: Wide Field Instrument (WFI) Optical Element
149,164d72
< conversion_megajanskys:
<   description: 'The flux density (in units of megaJanskys per
< 
<     steradian; MJy/sr).
< 
<     '
<   title: Zeropoint Flux (MJy/sr)
<   type: number
< conversion_megajanskys_uncertainty:
<   description: 'The uncertainty in the flux density (in units of
< 
<     megaJanskys per steradian; MJy/sr)
< 
<     '
<   title: Zeropoint Flux Uncertainty (MJy/sr)
<   type: number
201,212d108
< max_exposure_time:
<   description: 'Maximum exposure time of all pixels in the product in units of seconds.
< 
<     '
<   title: Maximum Exposure Time (s)
<   type: number
< mean_exposure_time:
<   description: 'Mean of component image exposure times
< 
<     '
<   title: Mean Exposure Time (s)
<   type: number
269,277d164
< time_mean:
<   allOf:
<   - pattern: '[0-9]{4}-(0[1-9])|(1[0-2])-(0[1-9])|([1-2][0-9])|(3[0-1])[T ]([0-1][0-9])|(2[0-4]):[0-5][0-9]:[0-5][0-9](.[0-9]+)?'
<     type: string
<   description: 'The Universal Coordinated Time (UTC) mean start time of the exposures
<     used to create the product.
< 
<     '
<   title: Mean Time of the Product

Comment thread latest/SSC/MSOS/wfi_microlensing_variability_catalog_level_4.yaml Outdated
Comment thread latest/SSC/MSOS/wfi_microlensing_event_catalog_level_4.yaml Outdated
@cjarnold
Collaborator Author

@ketozhang Can I proceed with lowercasing all keywords?

@ketozhang

@cjarnold Yes, thank you

Comment thread latest/SSC/MSOS/keywords/msos_reference_frame.yaml Outdated
Comment thread latest/SSC/MSOS/keywords/msos_reference_frame.yaml Outdated
Comment thread latest/SSC/MSOS/keywords/msos_program.yaml
Comment on lines +12 to +17
  description: Delivery sequence
  type: string
  maxLength: 10
  archive_catalog:
    datatype: nvarchar(10)
    destination: [MSOSCatalog.sequence, WFIReferenceFrame.sequence]
Collaborator Author


This is a string right now, but is the true data type an integer? If so, what is the max value?


Integer. The max value is 72 days / 8 days * 6 seasons = 54 (nominally). I don't know what tolerance MAST needs on their end, but I'd go for 100 or higher. On our end, we place no constraint beyond the default database integer (int32).
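
The arithmetic above can be checked with a short sketch. Reading 72 days as the season length and 8 days as the delivery interval is an assumption about what the numbers mean; the constant names are illustrative only.

```python
SEASON_LENGTH_DAYS = 72      # assumed meaning of "72 days"
DELIVERY_INTERVAL_DAYS = 8   # assumed meaning of "8 days"
N_SEASONS = 6

# Nominal number of deliveries over all seasons.
nominal_max_sequence = SEASON_LENGTH_DAYS // DELIVERY_INTERVAL_DAYS * N_SEASONS

PROPOSED_LIMIT = 100         # "100 or higher" from the comment
INT32_MAX = 2**31 - 1        # default database integer on the SSC side

print(nominal_max_sequence)
print(nominal_max_sequence < PROPOSED_LIMIT <= INT32_MAX)
```

A limit of 100 leaves roughly a factor of two of headroom over the nominal value while staying far inside int32.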

Comment thread latest/SSC/MSOS/keywords/msos_settings.yaml
@ketozhang

ketozhang commented Sep 29, 2025


@ketozhang ketozhang left a comment


Some feedback and a few change requests from the folks in MSOS Photometry.

Comment thread latest/SSC/MSOS/wfi_microlensing_object_catalog_level_4.yaml
Comment thread latest/SSC/MSOS/wfi_microlensing_light_curve_catalog_level_4.yaml Outdated
Comment thread latest/SSC/MSOS/wfi_microlensing_object_catalog_level_4.yaml
Comment thread latest/SSC/MSOS/keywords/msos_reference_frame.yaml Outdated
Comment thread latest/SSC/MSOS/wfi_microlensing_light_curve_catalog_level_4.yaml Outdated
Comment thread latest/SSC/MSOS/wfi_microlensing_variability_catalog_level_4.yaml
@ketozhang

ketozhang commented Sep 30, 2025

@cjarnold Should all astronomical timestamps be in MJD? We are fine with engineering ones like file_date to stay UTC, however there is a preference for MJD for astro-related fields. I will leave it up to how SOC/MAST end up storing and representing these in your search feature. We can deliver either.

https://github.com/cjarnold/rad/blob/253aef4efcf01db079b041f34df17c87b425c562/latest/SSC/MSOS/keywords/msos_observation.yaml#L10-L19

@cjarnold
Collaborator Author

@cjarnold Should all astronomical timestamps be in MJD? We are fine with engineering ones like file_date to stay UTC, however there is a preference for MJD for astro-related fields. I will leave it up to how SOC/MAST end up storing and representing these in your search feature. We can deliver either.

https://github.com/cjarnold/rad/blob/253aef4efcf01db079b041f34df17c87b425c562/latest/SSC/MSOS/keywords/msos_observation.yaml#L10-L19

Good question. I see a mixture of UTC and MJD in existing RAD schemas, so I am not sure. We will need some input from @WilliamJamieson, @jbrookens, and @scfleming here.

An overview of what I see in existing RAD.

UTC:

MJD:

So it looks like we only use MJD in the SourceCatalog/SegmentationMap?

In this PR, I think when @scfleming initially added the "SOC Proposed" fields, he added both UTC and MJD to msos_exposure:

  • expstart (UTC)
  • expend (UTC)
  • mjdstart (MJD)
  • mjdend (MJD)

In msos_observation we only have UTC:

  • time_mean (UTC)

@scfleming
Contributor

scfleming commented Sep 30, 2025

Our multi-mission search uses MJD for all timestamps; however, we recognize that most users prefer to use UTC calendar dates. To save on processing costs while supporting both our multi-mission database (MJD only) and our Roman-specific databases (can be either or both), the request is to include both in the YAML. That way the Roman Search form can easily support queries of either (there are common scenarios where you will be searching time ranges in either UTC or MJD), while also permitting efficient loading into our multi-mission database (which requires MJD).
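
As a rough illustration of delivering both representations, the sketch below derives the MJD fields from the UTC ones using only the Python standard library. The `utc_iso_to_mjd` helper and the example dates are hypothetical. It treats every day as 86400 s (`datetime` does not model leap seconds), so it is not a substitute for a proper time library where sub-second accuracy matters.

```python
from datetime import datetime, timezone

# MJD 0 corresponds to 1858-11-17T00:00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def utc_iso_to_mjd(timestamp: str) -> float:
    """Convert an ISO 8601 UTC timestamp string to Modified Julian Date."""
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is None:
        # Assume naive timestamps are already UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return (dt - MJD_EPOCH).total_seconds() / 86400.0


# Flat keyword record with illustrative (made-up) dates.
record = {"expstart": "2027-01-01T00:00:00", "expend": "2027-01-01T12:00:00"}
record["mjdstart"] = utc_iso_to_mjd(record["expstart"])
record["mjdend"] = utc_iso_to_mjd(record["expend"])
print(record)
```

Computing the MJD values once at delivery time matches the stated goal: the archive can load them directly without converting on ingest.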

Comment thread latest/SSC/MSOS/keywords/msos_observation.yaml
@scfleming
Contributor

@ketozhang @cjarnold can I confirm everyone is on the same page, that the YAML called "wfi_microlensing_object_catalog_level_4.yaml" corresponds to what is referred to in other places as "Fiducial Object Catalog" and the YAML called "wfi_microlensing_variability_catalog_level_4.yaml" corresponds to "Periodic Variable Catalog"?

And @ketozhang from the SSC perspective, is there a preference of wording for these? Should we start referring to them as "Object Catalog" and "Variability Catalog" or is it better to refer to them as "Fiducial Object Catalog" and "Periodic Variable Catalog"?

@ejoliet
Collaborator

ejoliet commented Sep 30, 2025

@scfleming no, I believe we need 2 more types/schemas, one for fiducial and one for periodic. The metadata search keywords should be very similar.
The draft ICD has those in place, but the latest ICD in TDMS does not.
They should be wfi_microlensing_object_fiducial_catalog_level_4.yaml and wfi_microlensing_object_periodic_catalog_level_4.yaml (or without 'object'?).
@cjarnold We should add those to the roman-soc-ssc-common notification availability schema as well.

@ketozhang

The keywords for the two Object catalog variants will be nearly identical, so we decided not to worry about it here. A future PR, once the ICD is set, will be simple to make (just duplicate and rename the current MSOS object catalog keyword schema).

@ketozhang

ketozhang commented Sep 30, 2025

the request is to include both in the YAML.

Essentially the delivery is responsible for including both. That's fine with us, just make sure the schemas here have two fields for each timestamp.

For the field names, I'm not seeing much consistency. One pattern prefixes the associated UTC field name with mjd. Another pattern adds an _mjd suffix. Does the latter seem to be more widely adopted by SOC?
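
One way to check that a schema carries two fields for each timestamp is sketched below. The pairs are the ones listed for msos_exposure earlier in this thread; the `missing_mjd_fields` helper and the toy property dict are hypothetical and do not reflect any existing RAD tooling.

```python
# UTC field -> MJD counterpart, as listed for msos_exposure in this thread.
TIMESTAMP_PAIRS = {"expstart": "mjdstart", "expend": "mjdend"}


def missing_mjd_fields(properties: dict, pairs: dict = TIMESTAMP_PAIRS) -> list:
    """Return MJD field names that are absent although their UTC partner is present."""
    return [
        mjd
        for utc, mjd in pairs.items()
        if utc in properties and mjd not in properties
    ]


# Toy properties block that is missing one MJD counterpart.
toy_properties = {
    "expstart": {"type": "string"},
    "expend": {"type": "string"},
    "mjdstart": {"type": "number"},
}

print(missing_mjd_fields(toy_properties))
```

Keeping the pairing in an explicit mapping sidesteps the prefix-versus-suffix question until a naming convention is agreed.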

@scfleming
Contributor

@scfleming no, i believe we need 2 more types/schema , one for fiducial one for periodic. metadata search keywords should be very similar. The draft ICD has those in place but not the latest ICD in TDMS. Should be wfi_microlensing_object_fiducial_catalog_level_4.yaml and wfi_microlensing_object_periodic_catalog_level_4.yaml (or without 'object'?). @cjarnold We should add those to roman-soc-ssc-common notification availability schema as well.

Ah, so there are going to be THREE object catalogs: object_fiducial, object_periodic, and a general object, along with a larger variable catalog, for a total of four? Would those be named/grouped so we could sort them all into a common file root (and therefore, from a user perspective, a single fileset from which they could choose to download any or all 4 of them)?

@ejoliet
Collaborator

ejoliet commented Sep 30, 2025

@scfleming no, i believe we need 2 more types/schema , one for fiducial one for periodic. metadata search keywords should be very similar. The draft ICD has those in place but not the latest ICD in TDMS. Should be wfi_microlensing_object_fiducial_catalog_level_4.yaml and wfi_microlensing_object_periodic_catalog_level_4.yaml (or without 'object'?). @cjarnold We should add those to roman-soc-ssc-common notification availability schema as well.

Ah, so there are going to be THREE object catalogs: object_fiducial, object_periodic, and a general object, along with a larger variable catalog, for a total of four? Would those be named/grouped so we could sort them all into a common file root (and therefore from a user perspective, a single fileset from which they could choose to download any or all 4 of them?

Sorry, my message was confusing. It’s the object catalog that is split into fiducial and periodic. So 2 object catalogs: object_fiducial and object_periodic.

@scfleming
Contributor

Oh OK, I do see them now in the draft ICD. So in total there are three catalogs then:

  • object_fiducial
  • object_periodic
  • variability

Thanks!

Comment thread latest/SSC/MSOS/keywords/msos_exposure.yaml

@ketozhang ketozhang left a comment


One reply, otherwise all else LGTM. Thank you all and especially @cjarnold for handling the changes.

Comment thread latest/SSC/MSOS/keywords/msos_exposure.yaml
Comment thread latest/SSC/MSOS/keywords/msos_reference_frame.yaml
Comment thread latest/SSC/MSOS/keywords/msos_common.yaml Outdated
Collaborator

@jbrookens jbrookens left a comment


I can't say for sure what the tables for the archive destinations will be yet, so consider these placeholders. Approving.

@cjarnold
Collaborator Author

cjarnold commented Oct 2, 2025

I have no more outstanding changes to make. If you approve, @PaulHuwe, I think we should merge this in now. We will come back with a new PR for any archive table name adjustments, as well as any other final tweaks needed by SSC.

@PaulHuwe
Collaborator

PaulHuwe commented Oct 2, 2025

@ejoliet Can you approve this PR so that I can merge?

@jsobeck
Contributor

jsobeck commented Oct 2, 2025

Hi @PaulHuwe. Please note that I did not see Emmanuel's name on the list of reviewers (though I see that @ketozhang already gave his approval). If needed, my name could be added as a reviewer.

Collaborator

@PaulHuwe PaulHuwe left a comment


LGTM

@PaulHuwe PaulHuwe merged commit bdfb97d into spacetelescope:main Oct 2, 2025
17 checks passed
@PaulHuwe
Collaborator

PaulHuwe commented Oct 2, 2025

Merged with @jsobeck's approval in place of @ejoliet's.

9 participants