Dedicated dataset_version API endpoint
Pink Jay
Each month, when a new dataset version is released, we have a large list of users that we need to check against the person enrichment API. We maintain a list of previously searched users and build associations between emails and PDL IDs from our monthly bulk files (which do not include email). There is an N-day delay between a new dataset release and the arrival of our monthly bulk files. The monthly files tell us whether an associated email appears in the new version, which drastically reduces the number of credits we spend on the person enrichment API.
We currently treat the max dataset version across all our monthly data deliverables as the "expected" version. Because of the delivery delay (N days between dataset release and data delivery), there is a small window in which the API is already returning the next version while we still "expect" the max dataset version described above. Our concern is that credits are "wasted" during that window, because we tag every result that comes back with the "expected" dataset version, which would be wrong.
I have come up with a way to scrape the release docs to find the currently released version, compare it to our expected max dataset version, and hold off on any further API calls until the two line up. Having an API endpoint for this would be much more performant than relying on the documentation staying consistent enough for the scraping method to keep working.
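To make the comparison step concrete, here is a rough sketch of the gate we apply before spending credits. It is only an illustration: the function names are hypothetical, it assumes dataset versions are dotted numeric strings written with the same number of parts, and `released_version` stands in for whatever the scraper (or a future dedicated endpoint) reports.

```python
from typing import Iterable, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric string into a tuple that compares numerically.

    Assumption: versions contain only digits and dots.
    """
    return tuple(int(part) for part in version.strip().split("."))


def expected_max_version(delivered_versions: Iterable[str]) -> str:
    """Return the highest dataset version seen across our monthly deliverables."""
    return max(delivered_versions, key=parse_version)


def should_call_enrichment(released_version: str,
                           delivered_versions: Iterable[str]) -> bool:
    """Allow API calls only when the released version matches our expected max.

    released_version is a placeholder for the value obtained by scraping the
    release docs today; a dedicated endpoint would supply it instead.
    """
    expected = expected_max_version(delivered_versions)
    return parse_version(released_version) == parse_version(expected)
```

With this gate, calls stay paused during the window where the API has moved to the next version but our bulk files have not arrived, and resume once the new deliverable raises our expected max to match.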