Case Study - PeopleSoft HR to Campus Person Sync

This case study describes the modernization of a long-running PeopleSoft-to-PeopleSoft person sync at a large public university. A consolidated PeopleSoft HR system owned employee data for the entire system; three downstream PeopleSoft Campus Solutions databases needed employee records in order to assign instructors to classes, run advising, and drive identity provisioning. The legacy integration – a set of SQR jobs that moved fixed-width files between servers – had been running for more than twenty years. The replacement is a near-real-time, pull-model REST integration built on three web services hosted in the HR database and consumed by an event-driven subscription worker in each Campus database.

The lessons here apply to any PeopleSoft-to-PeopleSoft person or data sync, and to many legacy-modernization projects where flat files and FTP can be retired in favor of HTTP and JSON.

The Legacy Problem

The legacy “HR Loop” consisted of nightly SQR jobs that exported employee data to fixed-width files, transferred those files between application servers, and imported them back in with more SQR.

  flowchart LR
    HR["HR PeopleSoft\n(system of record)"] --> SQR1["Nightly SQR\nExport Job"]
    SQR1 --> FILE["Fixed-Width File\n(full dump every run)"]
    FILE --> FTP["FTP / File\nTransfer"]
    FTP --> SQR2A["SQR Import\n(Campus A)"]
    FTP --> SQR2B["SQR Import\n(Campus B)"]
    FTP --> SQR2C["SQR Import\n(Campus C)"]
    SQR2A --> CSA["Campus A\nPeopleSoft"]
    SQR2B --> CSB["Campus B\nPeopleSoft"]
    SQR2C --> CSC["Campus C\nPeopleSoft"]

It worked for a long time, but it accumulated problems that modern integrations are expected to solve:

  • Full-file exports every run. The HR export included the most-recent effective-dated row for every EMPL_RCD, so an employee with a long job history appeared multiple times in the same file – each row carrying a snapshot of the same bio/demo data. File sizes grew every year.
  • No change detection. The process ran on a schedule and re-processed everything in scope whether it had changed or not. A holiday outage meant stale Campus data until the next run.
  • No error isolation. One malformed row could silently corrupt downstream processing. There was no per-record status, no retry primitive, and no way to ask “what happened to EMPLID X last night?” without opening log files.
  • Impossible to re-sync one person. Functional users who needed a single record refreshed had to wait for the next full run, or ask a developer to hand-craft a one-off file.
  • Fragile infrastructure coupling. File paths were baked into SQR; an environment move meant editing code, not configuration.

The assessment that preceded the redesign concluded that incremental fixes would not address the structural issues. The team chose a complete rewrite built around RESTful APIs.

Key Decision: Pull, Not Push

The first architectural decision was whether HR should push changes to Campus or Campus should pull from HR. The team chose pull for several reasons:

  • Fewer systems need to know about each other. HR exposes a generic API and has no knowledge of which Campus databases (or other downstream consumers) exist. Campus drives its own schedule, its own retry logic, and its own error handling.
  • Built-in re-sync primitive. If Campus ever needs to reconstitute a person from HR, it calls the GET API. No special batch to run, no “re-extract” ticket to HR’s team.
  • Campus availability does not block HR. In a push design, HR has to buffer or retry when Campus is down. In a pull design, Campus simply catches up when it comes back online.
  • Change detection that tolerates legacy custom code. The delivered WORKFORCE_SYNC and PERSON_BASIC_SYNC async messages only fire when HR data is modified through components and Component Interfaces. In a system with decades of SQR customizations, those messages are not reliably published. A Campus-driven query against audit tables sidesteps that problem entirely.
  • RESTful APIs set a foundation for future consumers. The same Person GET service used by Campus will eventually be consumed by the Identity Management system and any other downstream consumer that needs HR data.

The Three HR Web Services

The HR side of the integration is three services. Nothing more.

  sequenceDiagram
    participant CS as Campus Subscription Worker
    participant HR as HR Web Services
    CS->>HR: (1) GET Person Change List\n(changed in last N minutes)
    HR-->>CS: List of EMPLIDs
    loop For each EMPLID
        CS->>HR: (2) GET Person (HR_EMPLID)
        HR-->>CS: Full person payload (JSON)
        CS->>CS: Search match and update\nCampus tables via CI
        CS->>HR: (3) POST Update External ID\n(Campus EMPLID)
        HR-->>CS: Acknowledged
    end
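The three-call pattern in the diagram above can be sketched as a plain polling loop. This is an illustrative sketch, not the actual Application Engine implementation: the function names (`get_change_list`, `get_person`, `apply_to_campus`, `post_external_id`) and the 15-minute window are assumptions made for the example.

```python
def sync_cycle(get_change_list, get_person, apply_to_campus, post_external_id,
               window_minutes=15):
    """One pull cycle: fetch the changed EMPLIDs, then sync each one.

    The four callables stand in for the three HR REST services plus the
    Campus-side search-match/CI update; names are illustrative.
    """
    synced = []
    for hr_emplid in get_change_list(window_minutes):   # (1) Person Change List
        payload = get_person(hr_emplid)                 # (2) Person GET
        campus_emplid = apply_to_campus(payload)        # search match + CI update
        post_external_id(hr_emplid, campus_emplid)      # (3) Update External ID
        synced.append((hr_emplid, campus_emplid))
    return synced
```

Because the loop owns its own schedule and error handling, it embodies the pull-model decision: HR never needs to know this loop exists.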

Person GET

A single-person REST service hosted in HR. The design supports multiple lookup keys from day one:

  • HR EMPLID (the native key)
  • Campus EMPLID (via the EXTERNAL_SYSTEM cross-reference)
  • IDM GUID (via the same cross-reference, reserved for a later phase)

The response is a flat JSON document that consolidates the full person: PERSON, NAMES (all mapped name types), ADDRESSES, PERS_NID, EMAIL_ADDRESSES, PERSONAL_PHONE, all EXTERNAL_SYSTEM rows, and every JOB record with its POSITION_DATA. Most external systems do not need history for bio/demo data, so those sections return only the current effective row; job records return every EMPL_RCD.

  flowchart LR
    HRKEY["HR EMPLID\n(native key)"] --> SVC["Person GET\nService"]
    CSKEY["Campus EMPLID"] --> XREF["PS_EXTERNAL_SYSTEM\nCross-Reference"]
    IDMKEY["IDM GUID\n(future phase)"] --> XREF
    XREF --> SVC
    SVC --> JSON["Flat JSON Payload\nPERSON, NAMES, ADDRESSES,\nPERS_NID, EMAIL, PHONE,\nEXTERNAL_SYSTEM,\nJOB + POSITION_DATA"]

Two decisions in this design are worth calling out. First, the service accepts more than one kind of lookup key. This costs almost nothing at implementation time and saves a significant amount of consumer-side gymnastics later – Campus passes the HR EMPLID it received from the change list, and a troubleshooting user on the Campus side passes the Campus EMPLID they already know. Second, the service is trusted-consumer only. It returns NID and employee data that should never be exposed to end users, and the security model reflects that: a dedicated operator ID, basic authentication, and firewall-level restrictions to known Campus servers.
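The multi-key lookup amounts to a dispatch against the cross-reference table. A minimal sketch, assuming a key-type discriminator and an in-memory stand-in for the PS_EXTERNAL_SYSTEM rows (the key-type names and the `xref` shape are illustrative, not the service's actual contract):

```python
def resolve_hr_emplid(key_type, key_value, xref):
    """Resolve any supported lookup key to the native HR EMPLID.

    xref maps (external_system, external_id) -> HR EMPLID, mirroring the
    PS_EXTERNAL_SYSTEM cross-reference. Returns None when no row exists.
    """
    if key_type == "HR_EMPLID":
        return key_value                       # native key, no lookup needed
    if key_type == "CAMPUS_EMPLID":
        return xref.get(("CAMPUS", key_value))
    if key_type == "IDM_GUID":                 # reserved for a later phase
        return xref.get(("IDM", key_value))
    raise ValueError(f"unsupported key type: {key_type}")
```

Once the HR EMPLID is resolved, the rest of the service is identical regardless of which key the consumer supplied, which is why supporting extra keys costs so little.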

Person Change List

The Change List service answers one question: which EMPLIDs have changed in HR in the last N minutes? The team built this service without writing a single line of integration code. They used the delivered query-as-a-REST-service pattern (see Query REST Service) and pointed it at a PeopleSoft Query that unions existing audit tables:

  • An Oracle trigger-populated audit on PS_ADDRESSES
  • A consolidated provisioning audit on PS_NAMES, PS_PERS_NID, and PS_EXTERNAL_SYSTEM
  • A delivered job audit on PS_JOB

The query takes a single MINUTES parameter and returns the distinct set of EMPLIDs whose audit rows are newer than SYSDATE - :MINUTES/1440.
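The SYSDATE - :MINUTES/1440 criterion translates directly into a time-window filter over audit rows. A sketch of the same logic in Python terms, with an in-memory list standing in for the unioned audit tables:

```python
from datetime import datetime, timedelta

def changed_emplids(audit_rows, minutes, now=None):
    """Distinct EMPLIDs whose audit timestamp is newer than now - minutes.

    audit_rows is an iterable of (emplid, timestamp) pairs; this is the
    Python equivalent of AUDIT_DTTM > SYSDATE - :MINUTES/1440.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(minutes=minutes)
    return sorted({emplid for emplid, stamp in audit_rows if stamp > cutoff})
```

Note the deduplication: a person touched in three audit tables in the same window still produces a single change-list entry, and therefore a single sync event.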

Using a query here has concrete benefits:

  • The change-detection SQL is editable without a code migration. If a new audit table needs to feed the list, add it to the UNION in Query Manager. No project, no promote.
  • The owner of the SQL does not have to be a developer. Functional or DBA team members can adjust the criteria.
  • It is trivially composable with ad-hoc syncs. The same query tool can be used to target specific EMPLIDs for re-sync.
  • It returns over-eager results safely. If the query sometimes returns an EMPLID whose changes are not actually relevant to Campus, the Campus worker simply calls Person GET, compares the payload to current Campus state, finds no differences, and closes the event. Over-matching is cheap; missing a record is expensive.

Update External ID

A small POST service that writes the Campus EMPLID back into HR’s EXTERNAL_SYSTEM table. Campus calls this after every successful sync so that both systems know each other’s identifiers. The service wraps an existing Component Interface to perform the update, so the normal PeopleSoft field-level security and audit behavior applies.

This service is what makes the integration bidirectional in an otherwise one-way design. HR owns the data, Campus owns its own EMPLID, and neither system has to know how the other generates IDs.

Campus-Side Processing

The Campus side of the integration is built around a custom event table, UM_HR_SYNC_EVENT, that tracks one row per sync attempt per EMPLID. A lightweight Application Engine queries the Change List API, inserts events into the table, and publishes each event as an async local-to-local message. A subscription worker picks up each message and does the real work.

  flowchart LR
    A["Change List Query\n(HR REST)"] --> B["App Engine\nEvent Creator"]
    B --> C["UM_HR_SYNC_EVENT\n(NEW → QUE)"]
    C --> D["Async Message\nLocal-to-Local"]
    D --> E["Subscription Worker"]
    E -->|"Person GET"| F["HR"]
    E --> G["Search Match\n& CI Updates"]
    E -->|"Update External ID"| F
    E --> C

Each event flows through a small status machine – NEW → QUE → COMP, with branches for ERR (processing failure), SUSP (search-match requires human review), CANC (superseded by a newer event for the same EMPLID), and SKIP (EMPLID has no configured business unit for this Campus database). The event table is deliberately separate from the Integration Broker message monitor: it gives the support team a business-level view, supports operational reports, and survives IB cleanup jobs.

  flowchart TD
    NEW["NEW\n(event created)"] --> QUE["QUE\n(claimed by worker)"]
    QUE --> COMP["COMP\n(synced successfully)"]
    QUE --> ERR["ERR\n(processing failure)"]
    QUE --> SUSP["SUSP\n(search-match ambiguous)"]
    QUE --> CANC["CANC\n(superseded by newer\nevent for same EMPLID)"]
    QUE --> SKIP["SKIP\n(no configured\nbusiness unit)"]
    ERR -. "re-queue as new event" .-> NEW
    SUSP -. "human resolves, re-queue as new event" .-> NEW
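The transitions in the diagram above can be captured in a small table, which is one way (a sketch, not the actual PeopleCode) to keep the worker honest about the status machine. Because retries always create a fresh event row, ERR and SUSP are terminal states for the row itself:

```python
# Allowed transitions for UM_HR_SYNC_EVENT rows. Re-queues after ERR or SUSP
# create a brand-new event, so those statuses are terminal for this row.
TRANSITIONS = {
    "NEW":  {"QUE"},
    "QUE":  {"COMP", "ERR", "SUSP", "CANC", "SKIP"},
    "COMP": set(), "ERR": set(), "SUSP": set(), "CANC": set(), "SKIP": set(),
}

def advance(event, new_status):
    """Move an event dict to new_status, enforcing the status machine."""
    if new_status not in TRANSITIONS[event["status"]]:
        raise ValueError(f"illegal transition {event['status']} -> {new_status}")
    event["status"] = new_status
    return event
```

Rejecting illegal transitions at one choke point means a support "re-queue" button and the subscription worker cannot drift into inconsistent states.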

The concurrent-processing mechanics of this pattern – partitioned queues, duplicate cancellation, retry via new events, housekeeping of old rows – are documented in detail in Async Services for Concurrent Processing. That case study is the reference implementation. This integration uses it largely unchanged.

Search-Match Suspend

When HR sends a person that Campus does not already have, the integration does not trust automatic matching. Instead, it runs a Search/Match against Campus, and if the match is ambiguous – any partial match on name, date of birth, or national ID – the event is moved to SUSP and no update is applied. An email notification fires to a configured distribution list.

  flowchart TD
    HR["HR Person Payload"] --> SM["Search/Match\non Campus"]
    SM --> D{"Match result?"}
    D -- "Exact unique match" --> LINK["Auto-link to existing\nCampus EMPLID"]
    D -- "No match" --> NEW["Create new\nCampus EMPLID"]
    D -- "Ambiguous (partial name /\nDOB / NID)" --> SUSP["SUSP\n+ email notification"]
    SUSP --> HUMAN["Support user reviews\ncandidates side-by-side"]
    HUMAN -- "Link existing" --> LINK
    HUMAN -- "Force new" --> NEW
    LINK --> COMP["COMP"]
    NEW --> COMP

A support user opens the event in the management page, sees the candidate Campus records side-by-side with the HR payload, and resolves the event by either:

  • Linking to an existing Campus EMPLID (the search-match hit was correct)
  • Forcing a new Campus EMPLID (the hits were false positives)

This is slower than auto-creating or auto-matching, and it is the right behavior for person data. A duplicate person record in Campus is painful to clean up later; a suspended event is a five-second decision for someone with the right context.
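The decision logic in the flow above reduces to a three-way classification. A hedged sketch, assuming name/date-of-birth/national-ID fields on both sides (the field names and the exact-match rule are illustrative; real Search/Match rule definitions are richer):

```python
def match_action(candidates, hr_person):
    """Classify a Search/Match result: LINK, CREATE, or SUSP.

    candidates is the list of Campus records returned by Search/Match.
    Only an exact, unique match is auto-linked; anything partial suspends.
    """
    key = ("name", "birthdate", "nid")
    exact = [c for c in candidates
             if tuple(c[f] for f in key) == tuple(hr_person[f] for f in key)]
    if len(exact) == 1 and len(candidates) == 1:
        return ("LINK", exact[0]["emplid"])    # exact unique match: auto-link
    if not candidates:
        return ("CREATE", None)                # no match: new Campus EMPLID
    return ("SUSP", None)                      # ambiguous: human review + email
```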

Periodic Re-Sync

Event-driven sync is not sufficient on its own. Audit triggers can be disabled during database maintenance. Bulk UPDATE statements from DBA scripts can bypass audit tables. A message can be lost to an Integration Broker outage that outlasts its retention window. Over time, Campus drifts away from HR in small, invisible ways.

The solution is a periodic full re-sync that runs alongside the event-driven sync. The same event-creation process that handles change-list-driven syncs also accepts an ad-hoc query. The ad-hoc query returns EMPLIDs that have not been synced in the last N days:

SELECT DISTINCT A.UMHR_EMPLID
FROM PS_UM_HR_BDL A
WHERE A.EMPL_STATUS = 'A'
AND NOT EXISTS (
    SELECT 'X' FROM PS_UM_HR_SYNC_LOG B
    WHERE B.EMPLID = A.UMHR_EMPLID
    AND B.SCC_ROW_ADD_DTTM > SYSDATE - 30
)

Scheduled monthly, this catches drift that the change-detection query missed. Events generated this way flow through the same subscription worker, perform the same comparison against HR, and close cleanly with no changes in the common case. The cost of running it is almost entirely in the (cheap) GET calls for records that have not actually changed.
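The NOT EXISTS shape of the query above has a direct set-difference reading: active people minus recently synced people. A sketch in Python terms, with in-memory stand-ins for the two tables:

```python
from datetime import datetime, timedelta

def stale_emplids(active_emplids, sync_log, days=30, now=None):
    """Active EMPLIDs with no sync-log row in the last `days` days.

    sync_log is an iterable of (emplid, sync_timestamp) pairs; this mirrors
    the NOT EXISTS subquery against the sync log.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=days)
    recently_synced = {e for e, stamp in sync_log if stamp > cutoff}
    return sorted(set(active_emplids) - recently_synced)
```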

Database-Keyed Configuration

Configuration lives in a table keyed by %dbname. Each row carries:

  • HR Integration Broker host and node name
  • HR basic-auth token for the REST services
  • The name of the Change List query
  • Business-unit filters (which business units this Campus database cares about)
  • Field-level mapping rules (which HR phone types map to which Campus phone types, and so on)
  • Notification email addresses

Keying on database name has one specific operational benefit: when a production database is refreshed down to a test environment, the production configuration row is simply not present in the refreshed database. The test environment’s row still points at the HR test IB endpoint. No post-refresh cleanup is required. No one has to remember to edit a URL. Test environments never accidentally call production.

This is the same pattern documented in the D2L case study, applied here to an internal integration.
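The refresh-safety property falls out of a simple lookup: every environment's row lives in the table, and each database only ever reads its own. A sketch with hypothetical database names and endpoints (CSTST, CSPRD, and the hosts below are invented for illustration):

```python
# One row per database name. Each environment selects its own row via
# %dbname, so a refreshed test database never reads production values.
CONFIG = {
    "CSTST": {"hr_ib_host": "hr-test.example.edu", "node": "PSFT_HR_TST"},
    "CSPRD": {"hr_ib_host": "hr-prod.example.edu", "node": "PSFT_HR_PRD"},
}

def config_for(dbname):
    """Look up integration config by database name; fail loudly if absent."""
    try:
        return CONFIG[dbname]
    except KeyError:
        raise LookupError(f"no sync configuration row for database {dbname}")
```

Failing loudly on a missing row is deliberate: a database with no configuration should refuse to sync rather than guess at an endpoint.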

Lessons Learned

  1. When you need a change-detection feed, expose a Query as a REST service. The delivered ExecuteQuery service is an enormous time-saver for this shape of problem. No application-package code, no message definitions, and the SQL stays editable by the team that owns the data model.

  2. Build an event table separate from the IB message monitor. The message monitor is an infrastructure tool. Support staff need a business-level view: event status, last error, history of sync attempts per EMPLID, and a button to re-queue. That view belongs in a table you own.

  3. Suspend on ambiguous matches, do not guess. Silent duplicate creation is the worst outcome for person integrations. A suspend-and-notify workflow costs a few minutes of human time per edge case and eliminates an entire category of data-quality problems.

  4. Build “re-sync one person” on day one. Every support call ends with it. If you have to hand-craft the workaround the first five times, you will build it eventually anyway. Build it first.

  5. Pair event-driven sync with periodic re-sync. Change detection will always miss edge cases. A cheap monthly re-sync that compares-and-does-nothing for 99% of records is the safety net.

  6. Key configuration on database name. It is the only configuration strategy that survives production-to-test refreshes without post-refresh cleanup.

  7. Design the API for more than one lookup key. Accepting both the source and the target system’s identifier costs very little at the API layer and saves significant complexity in every consumer and every support tool.

  8. Anonymize before you publish. Business unit codes, institution names, and external system codes are all implementation details that belong in configuration, not in code examples or case studies.


Author Info
Chris Malek

Chris Malek is a PeopleTools® Technical Consultant with over two decades of experience working on PeopleSoft enterprise software projects. He is available for consulting engagements.

Work with Chris