Vertex AI
Important Capabilities
Capability | Status | Notes |
---|---|---|
Descriptions | ✅ | Extract descriptions for Vertex AI Registered Models and Model Versions |
Extract Tags | ✅ | Extract tags for Vertex AI Registered Model Stages |
Ingesting metadata from VertexAI requires using the Vertex AI module.
Prerequisites
Please refer to the Vertex AI documentation for basic information on Vertex AI.
Credentials to access to GCP
Please read the section to understand how to set up application default Credentials to GCP GCP docs.
Create a service account and assign roles
- Setup a ServiceAccount as per GCP docs and assign the previously created role to this service account.
- Download a service account JSON keyfile.
- Example credential file:
{
"type": "service_account",
"project_id": "project-id-1234567",
"private_key_id": "d0121d0000882411234e11166c6aaa23ed5d74e0",
"private_key": "-----BEGIN PRIVATE KEY-----\nMIIyourkey\n-----END PRIVATE KEY-----",
"client_email": "test@suppproject-id-1234567.iam.gserviceaccount.com",
"client_id": "113545814931671546333",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/test%suppproject-id-1234567.iam.gserviceaccount.com"
}
- To provide credentials to the source, you can either:
Set an environment variable:
$ export GOOGLE_APPLICATION_CREDENTIALS="/path/to/keyfile.json"
or
Set credential config in your source based on the credential json file. For example:
credential:
private_key_id: "d0121d0000882411234e11166c6aaa23ed5d74e0"
private_key: "-----BEGIN PRIVATE KEY-----\nMIIyourkey\n-----END PRIVATE KEY-----\n"
client_email: "test@suppproject-id-1234567.iam.gserviceaccount.com"
client_id: "123456678890"
CLI based Ingestion
Starter Recipe
Check out the following recipe to get started with ingestion! See below for full configuration options.
For general pointers on writing and running a recipe, see our main recipe guide.
source:
type: vertexai
config:
project_id: "acryl-poc"
region: "us-west2"
# Note that GOOGLE_APPLICATION_CREDENTIALS or credential section below is required for authentication.
# credential:
# private_key: '-----BEGIN PRIVATE KEY-----\\nprivate-key\\n-----END PRIVATE KEY-----\\n'
# private_key_id: "project_key_id"
# client_email: "client_email"
# client_id: "client_id"
sink:
type: "datahub-rest"
config:
server: "http://localhost:8080"
Config Details
- Options
- Schema
Note that a .
is used to denote nested fields in the YAML recipe.
Field | Description |
---|---|
project_id ✅ string | Project ID in Google Cloud Platform |
region ✅ string | Region of your project in Google Cloud Platform |
bucket_uri string | Bucket URI used in your project |
vertexai_url string | VertexUI URI |
env string | The environment that all assets produced by this connector belong to Default: PROD |
credential GCPCredential | GCP credential information |
credential.client_email ❓ string | Client email |
credential.client_id ❓ string | Client Id |
credential.private_key ❓ string | Private key in a form of '-----BEGIN PRIVATE KEY-----\nprivate-key\n-----END PRIVATE KEY-----\n' |
credential.private_key_id ❓ string | Private key id |
credential.auth_provider_x509_cert_url string | Auth provider x509 certificate url |
credential.auth_uri string | Authentication uri |
credential.client_x509_cert_url string | If not set it will be default to https://www.googleapis.com/robot/v1/metadata/x509/client_email |
credential.project_id string | Project id to set the credentials |
credential.token_uri string | Token uri Default: https://oauth2.googleapis.com/token |
credential.type string | Authentication type Default: service_account |
The JSONSchema for this configuration is inlined below.
{
"title": "VertexAIConfig",
"description": "Any source that produces dataset urns in a single environment should inherit this class",
"type": "object",
"properties": {
"env": {
"title": "Env",
"description": "The environment that all assets produced by this connector belong to",
"default": "PROD",
"type": "string"
},
"credential": {
"title": "Credential",
"description": "GCP credential information",
"allOf": [
{
"$ref": "#/definitions/GCPCredential"
}
]
},
"project_id": {
"title": "Project Id",
"description": "Project ID in Google Cloud Platform",
"type": "string"
},
"region": {
"title": "Region",
"description": "Region of your project in Google Cloud Platform",
"type": "string"
},
"bucket_uri": {
"title": "Bucket Uri",
"description": "Bucket URI used in your project",
"type": "string"
},
"vertexai_url": {
"title": "Vertexai Url",
"description": "VertexUI URI",
"default": "https://console.cloud.google.com/vertex-ai",
"type": "string"
}
},
"required": [
"project_id",
"region"
],
"additionalProperties": false,
"definitions": {
"GCPCredential": {
"title": "GCPCredential",
"type": "object",
"properties": {
"project_id": {
"title": "Project Id",
"description": "Project id to set the credentials",
"type": "string"
},
"private_key_id": {
"title": "Private Key Id",
"description": "Private key id",
"type": "string"
},
"private_key": {
"title": "Private Key",
"description": "Private key in a form of '-----BEGIN PRIVATE KEY-----\\nprivate-key\\n-----END PRIVATE KEY-----\\n'",
"type": "string"
},
"client_email": {
"title": "Client Email",
"description": "Client email",
"type": "string"
},
"client_id": {
"title": "Client Id",
"description": "Client Id",
"type": "string"
},
"auth_uri": {
"title": "Auth Uri",
"description": "Authentication uri",
"default": "https://accounts.google.com/o/oauth2/auth",
"type": "string"
},
"token_uri": {
"title": "Token Uri",
"description": "Token uri",
"default": "https://oauth2.googleapis.com/token",
"type": "string"
},
"auth_provider_x509_cert_url": {
"title": "Auth Provider X509 Cert Url",
"description": "Auth provider x509 certificate url",
"default": "https://www.googleapis.com/oauth2/v1/certs",
"type": "string"
},
"type": {
"title": "Type",
"description": "Authentication type",
"default": "service_account",
"type": "string"
},
"client_x509_cert_url": {
"title": "Client X509 Cert Url",
"description": "If not set it will be default to https://www.googleapis.com/robot/v1/metadata/x509/client_email",
"type": "string"
}
},
"required": [
"private_key_id",
"private_key",
"client_email",
"client_id"
],
"additionalProperties": false
}
}
}
Code Coordinates
- Class Name:
datahub.ingestion.source.vertexai.VertexAISource
- Browse on GitHub
Questions
If you've got any questions on configuring ingestion for Vertex AI, feel free to ping us on our Slack.